Thursday, September 10, 2015

Temporal Reprojection and SAO

We've recently re-visited our AO solution with the goal of improving its performance on consoles. We currently use the Scalable Ambient Obscurance algorithm presented by Morgan McGuire. Out target was to bring down the cost of the entire effect to something between 1-1.5ms on Xbox One. To achieve this it was clear that we needed to reduce the number of taps we were taking for each AO sample. An important part of this work went towards improving the efficiency of the temporal reprojection of our AO buffer. I thought I'd share a few observations we've made along the way.

Distributing AO Samples

When reprojecting data the key to success is to make sure your samples are well distributed through time. The Halton sequence was made popular for reprojection methods after Brian Karis presented his High Quality Temporal Supersampling. It is a sequence that gives well distributed samples in space as well as in time.

If we add an offset to a single AO sample we can see that after 2π it repeats (as expected)

So we use 8 samples ranging from [0, 2π] distributed by using the first 8 terms of a base 3 Halton sequence: {1/3, 2/3, 1/9, 4/9, 7/9, 2/9, 5/9, 8/9} x 2π

And this is the result using 6 AO samples.

Notice that there is some banding that can appear. By adding a dithered offset to the current sample's radius we can remove this quite nicely. We use a 4x4 pattern based on the Bayer matrix.

And here are our samples distributed through time. If anyone has a better way of distributing these I would be very interested to know.

Reprojection function

When doing any temporal reprojection it is crucial to have a good reprojection function. I like to refer to this as a similarity function. Its purpose is to identify how likely it is that the reprojected samples correspond to the samples of the current pixel. When writing this function it's quite important to have a convenient way to visualize it. If the function gets complicated, it's a good idea to isolate the different terms of the function so that you can reason and debug them individually. The reprojection function we use is a combination of three main terms.

Disocclusion Term

This is a simple term which identifies depth differences and classifies previous pixels as disoccluded. We use the relative depth difference described by Huw Bowles in Iterative Image Warping.
depth_similarity = saturate(pow(prev_depth/current_depth, 4) + DEPTH_MIN_SIMILARITY);

Velocity Term

The second term is also very straight forward and consists simply of reducing the similarity for fast moving pixels. A moving pixel has less chance of reprojecting successfully than a still one.
velocity_similarity = saturate(velocity * VELOCITY_SCALAR);

Dangerous Samples Term

The idea is to identify if the AO samples we are gathering are touching moving objects. To do this efficiently, we encode a moving bit as part of the depth buffer info passed in to the SAO algorithm. If you use the mip chain described by the SAO paper, make sure that you forward that 'moving bit' to the lower levels of the mip chain. This idea was presented by Anton Michels during the Labs R&D: Rendering Techniques in Rise of the Tomb Raider presentation earlier this year. Since each AO sample will need to read the depth information we get the 'moving bit' read for free.
samples_similarity = saturate(num_moving_samples * MOVING_SAMPLES_SCALAR);
samples_similarity = lerp(samples_similarity, prev_samples_similarity, 0.9);
samples_similarity = min(samples_similarity, current_samples_similarity);

To try and invalidate samples associated with fast moving object, the Dangerous Samples term is accumulated through time. An idea described quite well by Oliver Mattausch in his TSSAO Gpu-Pro2 article which he called 'smooth invalidation'.

Here's the kind of ghosting we get without identifying 'dangerous' samples.

With this term as part of the reprojection function we can eliminate most of the ghosting that arises from the temporal reprojection:

Putting it all together

The final similarity term is calculated by combining all terms together.
similarity = depth_similarity * LOW_VELOCITY_SIMILARITY - velocity_similarity;
similarity = saturate(similarity - samples_similarity);

Well, that's it! Nothing too ground breaking but I thought I'd share. If anyone has ideas or suggestions on how to improve any of this please let us know!