Rendering

Rendering is divided into five distinct passes. The passes are only responsible for rendering the scene, which does not include the UI. The UI pass is applied as a layer on the final window back buffer.

Compile & Cull

The compile and cull pass is in charge of compiling the information required to render the frame as well as culling anything not within the viewing frustum.

Currently this involves running through all spot and point lights in the scene and checking whether they're visible. Each visible light is hashed, and the hash is compared against its previous value to see whether the light has changed. If it has changed and is set to cast shadows, the world geometry is traversed to calculate the indices of the vertices inside the light's area of contribution (defined by a sphere). These indices are used when rendering the world geometry into the shadow map.
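The change check described above can be sketched as follows. This is a minimal illustration rather than the engine's code: the `PointLight` fields, `hashLight`, `LightCache` and `needsShadowRebuild` are hypothetical names, and FNV-1a is assumed as the hash only because it is short.

```cpp
#include <stddef.h>
#include <stdint.h>

// Hypothetical light description; the engine's real structure may differ.
struct PointLight {
    float position[3];
    float color[3];
    float radius;
    bool castShadows;
};

// FNV-1a over a byte range (assumed hash; any stable hash would do).
static uint32_t fnv1a(const void *data, size_t size, uint32_t hash = 2166136261u) {
    const unsigned char *bytes = static_cast<const unsigned char *>(data);
    for (size_t i = 0; i < size; i++) {
        hash ^= bytes[i];
        hash *= 16777619u;
    }
    return hash;
}

// Hash field by field so struct padding never leaks into the result.
static uint32_t hashLight(const PointLight &light) {
    uint32_t hash = fnv1a(light.position, sizeof light.position);
    hash = fnv1a(light.color, sizeof light.color, hash);
    hash = fnv1a(&light.radius, sizeof light.radius, hash);
    const unsigned char shadows = light.castShadows ? 1 : 0;
    return fnv1a(&shadows, sizeof shadows, hash);
}

// Per-light cache entry holding the hash from the previous frame.
struct LightCache {
    uint32_t lastHash = 0;
    bool valid = false;
};

// Returns true when the shadow data for this light must be rebuilt:
// the light changed since the last call and it casts shadows.
static bool needsShadowRebuild(const PointLight &light, LightCache &cache) {
    const uint32_t hash = hashLight(light);
    const bool changed = !cache.valid || hash != cache.lastHash;
    cache.lastHash = hash;
    cache.valid = true;
    return changed && light.castShadows;
}
```

Calling `needsShadowRebuild` once per visible light per frame means the expensive geometry traversal only runs on frames where the light actually changed.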

When the light is a point light, each triangle in the light's area of influence is checked to see which of the light's six view frustums (one per cubemap face) it falls within. The triangles are then written out to six separate lists of indices. This is done using a bit mask, since a triangle can lie in several frustums at once. The code for calculating this mask is as follows:

static uint8_t calcTriangleSideMask(const m::vec3 &p1,
                                    const m::vec3 &p2,
                                    const m::vec3 &p3,
                                    float bias)
{
    // p1, p2, p3 are in the cubemap's local coordinate system
    // bias = border/(size - border)
    uint8_t mask = 0x3F;
    float dp1 = p1.x + p1.y, dn1 = p1.x - p1.y, ap1 = m::abs(dp1), an1 = m::abs(dn1),
          dp2 = p2.x + p2.y, dn2 = p2.x - p2.y, ap2 = m::abs(dp2), an2 = m::abs(dn2),
          dp3 = p3.x + p3.y, dn3 = p3.x - p3.y, ap3 = m::abs(dp3), an3 = m::abs(dn3);
    if (ap1 > bias*an1 && ap2 > bias*an2 && ap3 > bias*an3)
        mask &= (3<<4)
            | (dp1 < 0 ? (1<<0)|(1<<2) : (2<<0)|(2<<2))
            | (dp2 < 0 ? (1<<0)|(1<<2) : (2<<0)|(2<<2))
            | (dp3 < 0 ? (1<<0)|(1<<2) : (2<<0)|(2<<2));
    if (an1 > bias*ap1 && an2 > bias*ap2 && an3 > bias*ap3)
        mask &= (3<<4)
            | (dn1 < 0 ? (1<<0)|(2<<2) : (2<<0)|(1<<2))
            | (dn2 < 0 ? (1<<0)|(2<<2) : (2<<0)|(1<<2))
            | (dn3 < 0 ? (1<<0)|(2<<2) : (2<<0)|(1<<2));

    dp1 = p1.y + p1.z, dn1 = p1.y - p1.z, ap1 = m::abs(dp1), an1 = m::abs(dn1),
    dp2 = p2.y + p2.z, dn2 = p2.y - p2.z, ap2 = m::abs(dp2), an2 = m::abs(dn2),
    dp3 = p3.y + p3.z, dn3 = p3.y - p3.z, ap3 = m::abs(dp3), an3 = m::abs(dn3);
    if (ap1 > bias*an1 && ap2 > bias*an2 && ap3 > bias*an3)
        mask &= (3<<0)
            | (dp1 < 0 ? (1<<2)|(1<<4) : (2<<2)|(2<<4))
            | (dp2 < 0 ? (1<<2)|(1<<4) : (2<<2)|(2<<4))
            | (dp3 < 0 ? (1<<2)|(1<<4) : (2<<2)|(2<<4));
    if (an1 > bias*ap1 && an2 > bias*ap2 && an3 > bias*ap3)
        mask &= (3<<0)
            | (dn1 < 0 ? (1<<2)|(2<<4) : (2<<2)|(1<<4))
            | (dn2 < 0 ? (1<<2)|(2<<4) : (2<<2)|(1<<4))
            | (dn3 < 0 ? (1<<2)|(2<<4) : (2<<2)|(1<<4));

    dp1 = p1.z + p1.x, dn1 = p1.z - p1.x, ap1 = m::abs(dp1), an1 = m::abs(dn1),
    dp2 = p2.z + p2.x, dn2 = p2.z - p2.x, ap2 = m::abs(dp2), an2 = m::abs(dn2),
    dp3 = p3.z + p3.x, dn3 = p3.z - p3.x, ap3 = m::abs(dp3), an3 = m::abs(dn3);
    if (ap1 > bias*an1 && ap2 > bias*an2 && ap3 > bias*an3)
        mask &= (3<<2)
            | (dp1 < 0 ? (1<<4)|(1<<0) : (2<<4)|(2<<0))
            | (dp2 < 0 ? (1<<4)|(1<<0) : (2<<4)|(2<<0))
            | (dp3 < 0 ? (1<<4)|(1<<0) : (2<<4)|(2<<0));
    if (an1 > bias*ap1 && an2 > bias*ap2 && an3 > bias*ap3)
        mask &= (3<<2)
            | (dn1 < 0 ? (1<<4)|(2<<0) : (2<<4)|(1<<0))
            | (dn2 < 0 ? (1<<4)|(2<<0) : (2<<4)|(1<<0))
            | (dn3 < 0 ? (1<<4)|(2<<0) : (2<<4)|(1<<0));

    return mask;
}
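
A sketch of how the returned mask might be consumed: each of the six low bits is assumed to select one cubemap face, and a triangle's indices are appended to the list of every face whose bit is set. `FaceIndexLists` and `appendTriangle` are hypothetical names, not taken from the engine, and the bit-to-face mapping is an assumption.

```cpp
#include <stdint.h>
#include <array>
#include <vector>

// One index list per cubemap face (hypothetical container).
using FaceIndexLists = std::array<std::vector<uint32_t>, 6>;

// Append the triangle (i0, i1, i2) to every face list selected by `mask`.
// Assumes bit n of the mask selects face n, matching a 6-bit mask such as 0x3F.
static void appendTriangle(FaceIndexLists &lists, uint8_t mask,
                           uint32_t i0, uint32_t i1, uint32_t i2) {
    for (int face = 0; face < 6; face++) {
        if (mask & (1 << face)) {
            lists[face].push_back(i0);
            lists[face].push_back(i1);
            lists[face].push_back(i2);
        }
    }
}
```

A triangle that straddles two faces ends up in both lists, which is why a mask is used instead of a single face index.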

To avoid recalculating the transforms required to render the shadow map every frame, this pass also calculates the world-view-projection matrix for the shadow map, and does so only when the light's properties have changed.

Geometry pass

The geometry pass renders into an off-screen buffer with two color attachments and a depth-stencil attachment. The color attachments store diffuse, normal and specular information.

All geometry, including map models and player models, is rendered during this pass. Ambient occlusion is also calculated during this pass, into a separate off-screen buffer. Ambient occlusion can be masked per element because the stencil test is used.

A lot of effort has gone into packing the geometry buffer as tightly as possible to reduce bandwidth. The layout is set up as follows:

              R8         G8         B8         A8
attachment1: [diffuse R][diffuse G][diffuse B][spec power]          (RGBA8)
attachment2: [normals R][normals G][normals B][spec intensity]      (RGBA8)
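
To illustrate how a unit normal and a specular term could share one RGBA8 texel as in attachment2, here is a small sketch. The remapping from [-1, 1] to [0, 255] is a common convention assumed here; the layout above does not say which encoding the engine uses, and `Texel`, `packNormalSpec` and `unpackNormalComponent` are hypothetical names.

```cpp
#include <stdint.h>
#include <cmath>

// One RGBA8 texel of the geometry buffer (hypothetical type).
struct Texel { uint8_t r, g, b, a; };

// Clamp to [0, 1], then scale to [0, 255] with rounding.
static uint8_t toUnorm8(float v) {
    v = v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
    return static_cast<uint8_t>(std::lround(v * 255.0f));
}

// Pack a unit normal into RGB and the specular intensity into A.
// Assumed encoding: n * 0.5 + 0.5 per component.
static Texel packNormalSpec(float nx, float ny, float nz, float specIntensity) {
    return Texel{ toUnorm8(nx * 0.5f + 0.5f),
                  toUnorm8(ny * 0.5f + 0.5f),
                  toUnorm8(nz * 0.5f + 0.5f),
                  toUnorm8(specIntensity) };
}

// Inverse of the assumed encoding for one normal component.
static float unpackNormalComponent(uint8_t c) {
    return (c / 255.0f) * 2.0f - 1.0f;
}
```

With 8 bits per component the round trip is lossy, so a lighting shader reading such a buffer would normally renormalize the decoded normal.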
                

Lighting pass

The lighting pass begins by outputting to a final composite off-screen buffer with blending enabled.

Point lights are rendered with a sphere of appropriate radius, taking care to switch the culling order depending on whether the observer is inside the volume.
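
The culling switch can be sketched as a small decision function. The names below (`CullFace`, `Vec3`, `chooseLightVolumeCull`) and the near-plane padding are assumptions for illustration; the text only states that the culling order depends on whether the observer is inside the volume.

```cpp
// Which faces of the light sphere to cull (hypothetical enum).
enum class CullFace { Back, Front };

struct Vec3 { float x, y, z; };

// Outside the sphere: cull back faces and draw the front of the volume.
// Inside the sphere: the front faces are behind the observer, so cull
// them and draw the back faces instead.
// `nearPlane` pads the radius so the switch happens before the near
// plane starts clipping the sphere (an assumed refinement).
static CullFace chooseLightVolumeCull(const Vec3 &eye, const Vec3 &lightPos,
                                      float radius, float nearPlane) {
    const float dx = eye.x - lightPos.x;
    const float dy = eye.y - lightPos.y;
    const float dz = eye.z - lightPos.z;
    const float distSq = dx * dx + dy * dy + dz * dz;
    const float reach = radius + nearPlane;
    return distSq < reach * reach ? CullFace::Front : CullFace::Back;
}
```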

The world-view-projection matrices calculated earlier are used to render depth from the light's perspective into a shadow map. Similarly, the indices calculated during the Compile & Cull pass are used here to reduce the amount of geometry rendered for the shadow map. Shadow-map rendering uses a polygon offset to apply a slope-dependent bias.

The shadow map is used as soon as it is available during the lighting pass. There is only one shadow map, which keeps the memory and bandwidth requirements low. No batching or caching of shadow maps is done.

To make use of fragment-operation throughput, an unusual percentage-closer filtering algorithm is used when sampling the shadow map. It spaces the four taps so that each one covers unique pixels, and weights each tap by the area it covers in the final average. The filter is listed here:

vec2 scale = 1.0f / textureSize(gShadowMap, 0);
shadowCoord.xy *= textureSize(gShadowMap, 0);
vec2 offset = fract(shadowCoord.xy - 0.5f);
shadowCoord.xy -= offset*0.5f;
vec4 size = vec4(offset + 1.0f, 2.0f - offset);
return (1.0f / 9.0f) * dot(size.zxzx*size.wwyy,
    vec4(texture(gShadowMap, vec3(scale * (shadowCoord.xy + vec2(-0.5f, -0.5f)), shadowCoord.z)),
         texture(gShadowMap, vec3(scale * (shadowCoord.xy + vec2(1.0f, -0.5f)), shadowCoord.z)),
         texture(gShadowMap, vec3(scale * (shadowCoord.xy + vec2(-0.5f, 1.0f)), shadowCoord.z)),
         texture(gShadowMap, vec3(scale * (shadowCoord.xy + vec2(1.0f, 1.0f)), shadowCoord.z))));

The same technique and filter are used for rendering the spot lights as well.
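
The `1.0f / 9.0f` factor in the filter can be checked on the CPU: the four tap weights are products of the `size` components and always sum to 9, whatever the sub-texel offset. The sketch below mirrors only the weight computation; `tapWeights` is a hypothetical helper, not part of the shader.

```cpp
#include <array>

// Mirrors size.zxzx * size.wwyy from the shader for a sub-texel
// offset (ox, oy), where size = (ox + 1, oy + 1, 2 - ox, 2 - oy).
static std::array<float, 4> tapWeights(float ox, float oy) {
    const float sx = ox + 1.0f, sy = oy + 1.0f; // size.x, size.y
    const float sz = 2.0f - ox, sw = 2.0f - oy; // size.z, size.w
    return {{ sz * sw, sx * sw, sz * sy, sx * sy }};
}
```

The sum factors as ((2 - ox) + (ox + 1)) * ((2 - oy) + (oy + 1)) = 3 * 3 = 9, so dividing by nine yields a properly normalized average of the four taps.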

Directional lighting is rendered in two passes of its own. The first pass applies direct lighting to all fragments with the stencil test disabled, lighting the scene. The second pass applies the ambient-occlusion results, but only to elements masked in the stencil buffer.

Forward pass

The forward pass is responsible for rendering everything that cannot easily be rendered deferred. This includes transparent items such as billboards and particles. The skybox is also rendered during this pass.

Composite pass

The composite pass is responsible for applying post-effects such as vignette, color grading and anti-aliasing before outputting to the window back buffer.