Thursday, January 19, 2012

Source Code

It seems like the publisher didn't manage to upload the source code to the book's website at

Please find a temporary solution at

The file is 425 MB and should be up within an hour. At the moment I am uploading a changed version that includes the demo for the second article, "Practical Binary Surface and Solid Voxelization with Direct3D 11" by Michael Schwarz, which was missing from the original version.

Monday, January 2, 2012

Extended Table of Contents

Here is the introduction to each section of the book:

Geometry Manipulation
The "Geometry Manipulation" section of the book focuses on the ability of graphics processing units (GPUs) to process and generate geometry in exciting ways.

The first chapter, "Vertex Shader Tessellation" by Holger Gruen, presents a method to implement tessellation using the vertex shader only. It requires DirectX 10 or above. In contrast to older techniques known as "Instanced Tessellation", this method does not require any data beyond the already available vertex data. It relies solely on the delivery of SV_VertexID and uses the original vertex and index buffers as input shader resources.
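
To make the role of SV_VertexID more concrete, here is a minimal CPU-side sketch in Python. It is not the chapter's implementation: the uniform subdivision layout, the tessellation factor n, and the function names decode_vertex_id and tessellated_position are assumptions made for illustration. The sketch only shows how a flat vertex ID can be decoded into an input triangle and barycentric coordinates, with the original vertex and index buffers read as plain arrays.

```python
def decode_vertex_id(vertex_id, n):
    """Map a flat vertex ID to (patch, barycentric) for uniform tessellation.

    Assumed layout: every input triangle (patch) is split into n*n
    sub-triangles and drawn as a non-indexed list of 3*n*n vertices.
    """
    verts_per_patch = 3 * n * n
    patch, local = divmod(vertex_id, verts_per_patch)
    sub_tri, corner = divmod(local, 3)

    # Find the strip (row) containing this sub-triangle.
    # Strip r holds 2*(n - r) - 1 sub-triangles.
    row = 0
    while sub_tri >= 2 * (n - row) - 1:
        sub_tri -= 2 * (n - row) - 1
        row += 1

    # Within a strip, triangles alternate: upward, downward, upward, ...
    col, is_down = divmod(sub_tri, 2)
    if not is_down:
        corners = [(col, row), (col + 1, row), (col, row + 1)]
    else:
        corners = [(col + 1, row), (col + 1, row + 1), (col, row + 1)]

    i, j = corners[corner]
    u, v = i / n, j / n
    return patch, (u, v, 1.0 - u - v)


def tessellated_position(vertex_id, n, positions, indices):
    """Fetch the patch corners from the original buffers and interpolate."""
    patch, (u, v, w) = decode_vertex_id(vertex_id, n)
    p0, p1, p2 = (positions[indices[3 * patch + k]] for k in range(3))
    return tuple(u * a + v * b + w * c for a, b, c in zip(p0, p1, p2))
```

In a real vertex shader the same decoding would run per vertex of a draw call with 3*n*n vertices per input triangle; a displacement or smoothing step would typically follow the interpolation.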

The next chapter, "Real-Time Deformable Terrain Rendering with DirectX 11" by Egor Yusov, describes a high-quality real-time terrain visualization of large data sets that dynamically controls the triangulation complexity. This terrain system combines efficient compression schemes for the height map with GPU-accelerated triangulation construction, while also supporting dynamic modifications.

In the article "Optimized Stadium Crowd Rendering", Alan Chambers describes the design and methods used to reproduce an 80,000-seat stadium in detail. This method was used in the game Rugby Challenge on Xbox 360, PS3 and PC. He reveals several tricks used to achieve colored "writing" in the stands, ambient occlusion that darkens the upper echelons, and variable crowd density that can be controlled live in-game.

Anti-aliasing methods that replace hardware Multi-Sample Anti-Aliasing (MSAA) with a software method running in the post-processing pipeline have been very popular since a Morphological Anti-Aliasing (MLAA) solution on the GPU was covered in GPU Pro 2. In his article "Geometric Anti-Aliasing Methods", Emil Persson covers two anti-aliasing methods that are driven by additional geometric data, either generated in the geometry shader or stored up front in a dedicated geometry buffer that might be part of the G-Buffer.

-- Wolfgang Engel

Rendering
The field of real-time rendering is constantly evolving and it can be challenging to keep up to date with the latest tricks and techniques. The goal of the rendering section is to introduce both beginner and seasoned graphics programmers to some of the latest advancements in real-time rendering. These techniques are all very practical and many can be found in the latest games on the market.

The first article in the rendering section is “Practical Elliptical Texture Filtering,” by Pavlos Mavridis and Georgios Papaioannou. This article presents a useful technique for achieving high quality, shader-based texture filtering on the GPU. The authors provide a reference implementation that can easily be integrated into an existing renderer.

Our next article is “An Approximation to the Chapman Grazing-Incidence Function for Atmospheric Scattering,” by Christian Schüler. This article describes an inexpensive approximation to atmospheric scattering and will be of particular interest to those interested in physically based, fully dynamic, virtual environments in which both visual realism and computational efficiency are of high importance.

The third article in the rendering section is “Volumetric Real-Time Water and Foam Rendering,” by Daniel Scherzer, Florian Bagar and Oliver Mattausch. This article presents a dynamic, multi-layered approach for rendering fluids and foam. This technique is presented in the context of a GPU-based fluid simulation but is compatible with other forms of fluid simulation as well.

The next article in this section is “CryENGINE 3,” by Tiago Sousa, Nick Kasyan, and Nicolas Schulz. This article covers some of the latest features of a production-proven, highly successful real-time rendering engine. The authors discuss many cutting-edge topics with an eye for efficiency and scalability. The techniques they cover include screen-space methods for reflections, character self-shadowing, efficient stereoscopic image-pair generation, and much more.

The last article, "Inexpensive Anti-Aliasing of Simple Objects" by Mikkel Gjol and Mark Gjol, explores the use of Discontinuity Edge Overdraw for anti-aliasing simple objects on mobile phones. The essence of this technique is to render a "smooth" line on top of aliased primitive edges to cover them.

The ideas and techniques presented in this section represent some of the latest developments in the realm of computer graphics. I would like to thank our authors for sharing their exciting new work with the graphics community and I hope that these ideas inspire readers to further extend the state-of-the-art in real-time rendering.

-- Christopher Oat

Global Illumination Effects
The global illumination section is a permanent part of this series of books, underlining the importance of realistic shading techniques and, more specifically, global illumination effects in real-time applications. The section contains three articles addressing very different phenomena, each using dedicated data structures and algorithms in order to achieve the desired performance. The techniques range from screen-space approximations to using splat-based representations and spatial index structures.

Ray-traced Approximate Reflections Using a Grid of Oriented Splats
Holger Gruen exploits the features of DX11-class hardware to render approximate ray-traced reflections in dynamic scene elements. His method creates a 3D grid containing a surface splat representation on the fly and then ray marches the grid to render reflections in real time. Holger also outlines further improvements for future hardware, e.g. using hierarchical grids.

Screen-space Bent Cones: A Practical Approach
Ambient occlusion computed in screen-space is a widely used approach to add realism to real-time rendered scenes at constant and low cost. Oliver Klehm and his co-authors describe a simple solution to computing bent normals as a by-product of screen-space ambient occlusion. This recovers some directional information of the otherwise fully decoupled occlusion and lighting computation. They further extend bent normals to bent cones, which not only store the average direction of incident light, but also the opening angle. When pre-convolving distant lighting, this results in physically more plausible lighting at the speed and simplicity of ambient occlusion.
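
As a rough illustration of why a bent normal comes almost for free with ambient occlusion, consider the following Python sketch. It is an assumption-laden simplification, not the authors' shader: the occlusion test results are passed in as a list of booleans, the function name bent_normal_and_occlusion is hypothetical, and the opening angle of the bent cone is deliberately left out.

```python
import math


def bent_normal_and_occlusion(sample_dirs, occluded):
    """Return (bent_normal, occlusion) from hemisphere samples.

    sample_dirs: unit direction vectors tested by the occlusion pass.
    occluded:    parallel list of booleans, True if the sample is blocked.
    occlusion is the fraction of blocked samples; the bent normal is the
    normalized average of the unblocked directions (None if undefined).
    """
    free = [d for d, occ in zip(sample_dirs, occluded) if not occ]
    occlusion = 1.0 - len(free) / len(sample_dirs)
    if not free:
        return None, occlusion

    # Average the unoccluded directions component by component.
    mean = [sum(c) / len(free) for c in zip(*free)]
    length = math.sqrt(sum(c * c for c in mean))
    if length == 0.0:
        return None, occlusion
    return tuple(c / length for c in mean), occlusion
```

The same loop that counts blocked samples also accumulates the unblocked directions, which is why the extra cost over plain ambient occlusion is small.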

Real-time Near-field Global Illumination based on a Voxel Model
Sinje Thiedemann and her colleagues describe a method for computing one-bounce (near-field) indirect illumination with occlusion in dynamic scenes. It is based on a fast, texture-atlas-based generation of scene voxelizations for visibility, and on reflective shadow maps (RSMs) to sample directly lit surfaces. The indirect illumination is computed by Monte Carlo integration, using the voxel representation to find the closest intersection for a secondary ray within a user-defined search radius (the near field). The indirect light is then obtained by projecting the intersection into the RSM.

-- Carsten Dachsbacher

Shadows
Section IV covers algorithms that are used to generate shadow data. Shadows are the dark companions of lights, and although both can exist on their own, they shouldn't exist without each other in games. Achieving good visual results when rendering shadows is considered one of the particularly difficult tasks for graphics programmers.

The first article "Efficient Online Visibility for Shadow Maps" by Oliver Mattausch, Jiri Bittner, Ari Silvennoinen, Daniel Scherzer and Michael Wimmer describes a solution to one of the biggest challenges for real-time shadows that cover large parts of a large gaming world. Rendering geometry into shadow maps in a game world with a large viewing distance might require rendering more geometry than is rendered for the main view. This chapter offers a solution that quickly detects and culls the geometry that does not contribute to the shadow in the final image. The main idea is to use camera-view visibility information to create a mask of potential shadow receivers in the light view, which restricts the area where shadow casters have to be rendered. The algorithm consists of the following four main steps:
1. Determine shadow receivers by rendering the scene from the camera as part of the depth pre-pass. The visible geometry is then stored in a bounding volume hierarchy that can later be used to render the shadow casters. An occlusion culling algorithm based on occlusion queries culls geometry that is not needed for the scene.
2. Create a shadow receiver mask from the point of view of the light to restrict shadow map updates to this mask.
3. Render the shadow casters into the shadow maps, using the mask for culling to reduce the number of rendered shadow casters.
4. Perform the shadow map comparison as commonly used in any other approach.
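
Steps 2 and 3 above can be sketched on the CPU as follows. This is only a toy model under stated assumptions: receivers are given as points already projected into normalized light space, casters are given as 2D bounding rectangles, and the names build_receiver_mask and caster_is_needed are invented for this example. The article itself works with rendered geometry and occlusion queries on the GPU.

```python
import math


def build_receiver_mask(receiver_points, size):
    """Mark the light-space texels covered by visible shadow receivers.

    receiver_points: (x, y) positions in light space, normalized to [0, 1).
    """
    mask = [[False] * size for _ in range(size)]
    for x, y in receiver_points:
        mask[math.floor(y * size)][math.floor(x * size)] = True
    return mask


def caster_is_needed(bounds, mask):
    """Return True if a caster's light-space bounds touch any receiver texel.

    bounds: (xmin, ymin, xmax, ymax) in normalized light space.
    """
    size = len(mask)
    xmin, ymin, xmax, ymax = bounds
    x0 = max(0, math.floor(xmin * size))
    x1 = min(size - 1, math.floor(xmax * size))
    y0 = max(0, math.floor(ymin * size))
    y1 = min(size - 1, math.floor(ymax * size))
    # Empty ranges (caster fully outside the map) yield False.
    return any(mask[y][x]
               for y in range(y0, y1 + 1)
               for x in range(x0, x1 + 1))
```

Casters for which caster_is_needed returns False cannot shadow anything visible from the camera, so they can be skipped when filling the shadow map.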

John White describes in his article "Depth Rejected Gobo Shadows" a technique to provide soft shadows using a simple texture sample. This approach extends the basic projected gobo texture idea by removing the incorrect projections on objects closer to the light source.

-- Wolfgang Engel

3D Engine Design
Welcome to the 3D Engine Design section of this edition of GPU Pro. The selection of articles you will find in here covers various aspects of engine design, such as quality and optimization, in addition to high-level architecture.

First, Pascal Gautron, Jean-Eudes Marvie and Gaël Sourimant present us with the article "Z3 Culling", in which the authors suggest a novel method to optimize depth testing over the Z-buffer algorithm. The new technique adds two “depth-buffers” to keep the early z-culling optimization even on objects drawn with states that prevent early z-culling (such as alpha-testing).

Next, Dzmitry Malyshau brings his experience of designing a quaternion-based 3D engine in his article "Quaternion-Based Rendering Pipeline". Drawing on a real-world 3D engine implementation, the article shows the benefits of using quaternions in place of transformation matrices in various steps of the rendering pipeline.
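
For readers unfamiliar with the idea, the following Python sketch shows what replacing a transformation matrix with a quaternion can look like. It is a generic illustration rather than the article's code: the (w, x, y, z) component order, the uniform scale factor, and the function names are assumptions of this example.

```python
def quat_mul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)


def quat_rotate(q, v):
    """Rotate vector v by unit quaternion q without building a matrix."""
    w, ux, uy, uz = q
    vx, vy, vz = v
    # t = 2 * cross(u, v), where u is the vector part of q
    tx = 2.0 * (uy * vz - uz * vy)
    ty = 2.0 * (uz * vx - ux * vz)
    tz = 2.0 * (ux * vy - uy * vx)
    # v' = v + w * t + cross(u, t)
    return (vx + w * tx + (uy * tz - uz * ty),
            vy + w * ty + (uz * tx - ux * tz),
            vz + w * tz + (ux * ty - uy * tx))


def transform_point(rotation, translation, scale, p):
    """Apply uniform scale, then rotation, then translation to point p."""
    rotated = quat_rotate(rotation, tuple(scale * c for c in p))
    return tuple(r + t for r, t in zip(rotated, translation))
```

A transform stored this way needs eight floats (quaternion, translation, uniform scale) instead of twelve or sixteen for a matrix, and two rotations are combined with quat_mul.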

In the article "Implementing a Directionally Adaptive Edge AA Filter using DirectX 11", Matthew Johnson improves upon the box anti-aliasing filter. His post-processing technique calculates a best-fit gradient line along the direction of candidate primitive edges and uses it to construct a filter that gives a better representation of edge information in the scene, and thus higher-quality anti-aliased edges.

Finally, Donal Revie describes the high-level architecture of a 3D engine in the article "Designing a Data-Driven Renderer". The design aims to bridge the gap between the logical simulation at the core of most game engines and the strictly ordered stream of commands required to render a frame through a graphics API. The solution focuses on providing a flexible data-driven foundation on which to build a rendering pipeline, making minimal assumptions about the exact rendering style used.

I would like to thank the authors who contributed to this section for their great work. I would also like to extend these thanks to my wife Suzan and my brother Homam for their wonderful support.

I hope you find these articles inspiring and enlightening to your rendering and engine development work.


-- Wessam Bahnassi

GPGPU
With the latest advances in computer graphics, the use of general compute APIs such as CUDA, OpenCL and DirectX 11 compute shaders has become mainstream. By allowing modern GPUs to go far beyond the standard processing of triangles and pixels, the power of the graphics processor is now open to domains reaching far beyond visualization or video games. The latest advances in GPU technologies now allow the implementation of various parallel algorithms such as AI or physics. With the parallel nature of the GPU, such algorithms can generally run orders of magnitude faster than their CPU counterparts.
This section covers articles that present techniques that go beyond the normal pixel and triangle scope of GPUs and take advantage of the parallelism of modern graphics processors to accomplish such tasks.

The first article, "Volumetric Transparency with Per-Pixel Fragment Lists" by László Szécsi, Pál Barta and Balázs Kovács, presents an efficient approach to rendering multiple layers of translucency by harnessing the power of compute shaders. By implementing a simple ray tracing approach in a compute shader, they can determine the appropriate color intensity for simple particles. The approach can then be taken further and extended to account for visual effects such as refraction and volumetric shadows.
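
The data structure behind per-pixel fragment lists can be sketched in a few lines of Python. This is a generic linked-list version with plain alpha blending, written under assumptions of this example (the FragmentLists class, its field layout, and back-to-front "over" compositing); it does not reproduce the article's volumetric ray tracing step.

```python
class FragmentLists:
    """Per-pixel linked lists of fragments, as commonly built on the GPU."""

    def __init__(self, width, height):
        self.width = width
        self.head = [-1] * (width * height)  # newest node per pixel, -1 = empty
        self.nodes = []                      # (depth, rgb, alpha, next_index)

    def add(self, x, y, depth, rgb, alpha):
        """Prepend a fragment to the list of pixel (x, y)."""
        pixel = y * self.width + x
        self.nodes.append((depth, rgb, alpha, self.head[pixel]))
        self.head[pixel] = len(self.nodes) - 1

    def resolve(self, x, y, background):
        """Sort the pixel's fragments by depth and blend far to near."""
        frags = []
        node = self.head[y * self.width + x]
        while node != -1:
            depth, rgb, alpha, node = self.nodes[node]
            frags.append((depth, rgb, alpha))

        color = background
        for _, rgb, alpha in sorted(frags, key=lambda f: f[0], reverse=True):
            color = tuple(alpha * s + (1.0 - alpha) * d
                          for s, d in zip(rgb, color))
        return color
```

Because sorting happens per pixel at resolve time, fragments can be submitted in any order, which is the main appeal of the approach for translucent geometry.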

In the second article, "Practical Binary Surface and Solid Voxelization with Direct3D 11" by Michael Schwarz, a new real-time voxelization technique is presented. This technique is efficient and tackles some of the problems, such as voxel holes, that occur in rasterization-based voxelization algorithms. The resulting voxels can then be used in a variety of techniques such as collision detection, ambient occlusion and even real-time global illumination.
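
"Binary" here means that each voxel is a single bit. The Python sketch below illustrates one common way to store and fill such a grid; it is an assumption of this example, not a description of the article's algorithm. Each (x, y) column is kept as one integer bitmask, surface voxels are set with a bitwise OR, and a solid interior is obtained by flipping all bits behind each boundary crossing with XOR. The class and method names are hypothetical.

```python
class BinaryVoxelGrid:
    """Voxel grid with one bit per voxel, stored as one bitmask per column."""

    def __init__(self, nx, ny, nz):
        self.nz = nz
        self.columns = [[0] * ny for _ in range(nx)]

    def set_surface(self, x, y, z):
        """Mark a surface voxel (an atomic OR on the GPU)."""
        self.columns[x][y] |= 1 << z

    def flip_from(self, x, y, z):
        """Flip every voxel with depth >= z in column (x, y).

        Applied once per boundary crossing of a closed surface, voxels
        behind an odd number of crossings end up set (inside).
        """
        full = (1 << self.nz) - 1
        self.columns[x][y] ^= full & ~((1 << z) - 1)

    def is_set(self, x, y, z):
        return ((self.columns[x][y] >> z) & 1) == 1
```

For a column that enters a closed object at depth 2 and leaves it at depth 6, calling flip_from with 2 and then 6 leaves exactly voxels 2 through 5 set, regardless of the order of the two calls.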

And finally, in "Interactive Ray Tracing Using the Compute Shader in DirectX 11" by Arturo García, Francisco Ávila, Sergio Murguía and Leo Reyes, a novel technique is presented to allow for real-time interactive ray tracing using a combination of the GPU and CPU processing power. This implementation properly handles glossy reflections as global illumination. An efficient bounding volume hierarchy is also offered to accelerate the discovery of ray intersections.
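
The basic operation a bounding volume hierarchy relies on is a cheap test of a ray against an axis-aligned box. The following Python sketch shows the widely used slab test; it is a generic illustration and makes no claim about how the article implements its traversal. The function name ray_hits_aabb and its parameters are assumptions of this example.

```python
import math


def ray_hits_aabb(origin, direction, box_min, box_max, t_max=math.inf):
    """Slab test: does the ray origin + t * direction, 0 <= t <= t_max,
    intersect the axis-aligned box [box_min, box_max]?"""
    t_near, t_far = 0.0, t_max
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # Ray is parallel to this slab: it must start inside it.
            if o < lo or o > hi:
                return False
            continue
        t0, t1 = (lo - o) / d, (hi - o) / d
        if t0 > t1:
            t0, t1 = t1, t0
        # Shrink the interval in which the ray is inside all slabs so far.
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return False
    return True
```

During traversal, a node's children are visited only if this test succeeds for the node's box, so whole subtrees of triangles are skipped with a handful of arithmetic operations.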

-- Sebastien St-Laurent

Table of Contents

I Geometry Manipulation
Wolfgang Engel, editor
  1. Vertex Shader Tessellation by Holger Gruen
  2. Real-Time Deformable Terrain Rendering with DirectX 11 by Egor Yusov
  3. Optimized Stadium Crowd Rendering by Alan Chambers
  4. Geometric Anti-Aliasing Methods by Emil Persson
II Rendering
Christopher Oat, editor
  1. Practical Elliptical Texture Filtering by Pavlos Mavridis and Georgios Papaioannou
  2. An Approximation to the Chapman Grazing-Incidence Function for Atmospheric Scattering by Christian Schüler
  3. Volumetric Real-Time Water and Foam Rendering by Daniel Scherzer, Florian Bagar and Oliver Mattausch
  4. CryENGINE 3 by Tiago Sousa, Nick Kasyan, and Nicolas Schulz
  5. Inexpensive Anti-Aliasing of Simple Objects by Mikkel Gjol and Mark Gjol
III Global Illumination Effects
Carsten Dachsbacher, editor
  1. Ray-traced Approximate Reflections Using a Grid of Oriented Splats by Holger Gruen
  2. Screen-space Bent Cones: A Practical Approach by Oliver Klehm, Tobias Ritschel, Elmar Eisemann, Hans-Peter Seidel
  3. Real-time Near-field Global Illumination based on a Voxel Model by Sinje Thiedemann, Niklas Henrich, Thorsten Grosch, Stefan Mueller
IV Shadows
Wolfgang Engel, editor
  1. Efficient Online Visibility for Shadow Maps by Oliver Mattausch, Jiri Bittner, Ari Silvennoinen, Daniel Scherzer and Michael Wimmer
  2. Depth Rejected Gobo Shadows by John White
V 3D Engine Design
Wessam Bahnassi, editor
  1. Z3 Culling by Pascal Gautron, Jean-Eudes Marvie and Gaël Sourimant
  2. Quaternion-Based Rendering Pipeline by Dzmitry Malyshau
  3. Implementing a Directionally Adaptive Edge AA Filter using DirectX 11 by Matthew Johnson
  4. Designing a Data-Driven Renderer by Donal Revie
VI GPGPU
Sebastien St-Laurent, editor
  1. Volumetric Transparency with Per-Pixel Fragment Lists by László Szécsi, Pál Barta and Balázs Kovács
  2. Practical Binary Surface and Solid Voxelization with Direct3D 11 by Michael Schwarz
  3. Interactive Ray Tracing Using the Compute Shader in DirectX 11 by Arturo García, Francisco Ávila, Sergio Murguía and Leo Reyes