Tuesday, February 14, 2017

Stingray Renderer Walkthrough #4: Sorting


This post focuses on the ordering of commands in the RenderContexts. I briefly touched on this subject in the last post, and if you’ve implemented a rendering engine before you’re probably not new to the problem. Basically we need a way to make sure our RenderJobPackages (draw calls) end up on the screen in the correct order, both from a visual point of view and from a performance point of view. Some concrete examples:

  1. Make sure g-buffers and shadow maps are rendered before any lighting happens.
  2. Make sure opaque geometry is rendered front to back to reduce overdraw.
  3. Make sure transparent geometry is rendered back to front for alpha blending to generate correct results.
  4. Make sure the sky dome is rendered after all opaque geometry but before any transparent geometry.
  5. All of the above but also strive to reduce state switches as much as possible.
  6. All of the above but depending on GPU architecture maybe shift some work around to better utilize the hardware.

There are many ways of tackling this problem, and it’s not uncommon for engines to use multiple sorting systems and spend quite a lot of frame time getting this right.

Personally I’m a big fan of explicit ordering with a single stable sort. What I mean by explicit ordering is that every command that gets recorded to a RenderContext already knows when it will be executed relative to other commands. For us this knowledge is in the form of a 64-bit sort_key. In the case where we get two commands with the exact same sort_key, we rely on the sort being stable to avoid introducing any kind of temporal instability in the final output.

The reasons I like this approach are many:

  1. It’s trivial to implement compared to various bucketing schemes and sorting of those buckets.
  2. We only need to visit renderable objects once per view (when calling their render() function), no additional pre-visits for sorting are needed.
  3. The sort is typically fast, and cost is isolated and easy to profile.
  4. Parallel rendering works out of the box: we can just take all the Command arrays of all the RenderContexts and merge them before sorting.
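
To make the last point concrete, here is a minimal sketch of merging and sorting. The Command and RenderContext structs below are simplified stand-ins I made up for illustration (the real types carry much more data), and std::stable_sort is used only to keep the example short:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins for the engine's types.
struct Command {
    uint64_t sort_key;  // absolute ordering of this command
    uint32_t payload;   // placeholder for the real command data
};

struct RenderContext {
    std::vector<Command> commands;  // recorded in submission order
};

// Concatenate the command arrays of all contexts, then stable sort by key.
// Commands with equal keys keep their relative (context, submission) order.
std::vector<Command> merge_and_sort(const std::vector<RenderContext> &contexts)
{
    std::vector<Command> merged;
    for (const RenderContext &rc : contexts)
        merged.insert(merged.end(), rc.commands.begin(), rc.commands.end());

    std::stable_sort(merged.begin(), merged.end(),
        [](const Command &a, const Command &b) { return a.sort_key < b.sort_key; });
    return merged;
}
```

As long as the contexts are handed over in the same order every frame, two commands with identical keys always come out in the same relative order, which is the property the stable sort is there to guarantee.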

To make this work each command needs to know its absolute sort_key. Let’s break down the sort_key we use when working with our data-driven rendering pipe in Stingray. (Note: if the user doesn’t care about playing nicely together with our system for data-driven rendering it is fine to completely ignore the bit allocation patterns described below and roll their own.)

sort_key breakdown

Most significant bit on the left, here are our bit ranges:

MSB [ 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ] LSB
      ^ ^       ^  ^                                   ^^                 ^
      | |       |  |                                   ||                 |- 3 bits - Shader System (Pass Immediate)
      | |       |  |                                   ||- 16 bits - Depth
      | |       |  |                                   |- 1 bit - Instance bit
      | |       |  |- 32 bits - User defined
      | |       |- 3 bits - Shader System (Pass Deferred)
      | - 7 bits - Layer System
      |- 2 bits - Unused
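
The same layout can be written down as shifts and masks. This is only a sketch: the field widths come from the diagram above, but every constant and helper name is made up for illustration and is not the engine's actual API.

```cpp
#include <cstdint>

// Field widths taken from the diagram above; all names are illustrative.
constexpr unsigned PASS_IMMEDIATE_BITS = 3;
constexpr unsigned DEPTH_BITS          = 16;
constexpr unsigned INSTANCE_BITS       = 1;
constexpr unsigned USER_BITS           = 32;
constexpr unsigned PASS_DEFERRED_BITS  = 3;
constexpr unsigned LAYER_BITS          = 7;
constexpr unsigned UNUSED_BITS         = 2;

// Position of each field's least significant bit, counted from the LSB of the key.
constexpr unsigned PASS_IMMEDIATE_SHIFT = 0;
constexpr unsigned DEPTH_SHIFT          = PASS_IMMEDIATE_SHIFT + PASS_IMMEDIATE_BITS; // 3
constexpr unsigned INSTANCE_SHIFT       = DEPTH_SHIFT + DEPTH_BITS;                   // 19
constexpr unsigned USER_SHIFT           = INSTANCE_SHIFT + INSTANCE_BITS;             // 20
constexpr unsigned PASS_DEFERRED_SHIFT  = USER_SHIFT + USER_BITS;                     // 52
constexpr unsigned LAYER_SHIFT          = PASS_DEFERRED_SHIFT + PASS_DEFERRED_BITS;   // 55

static_assert(LAYER_SHIFT + LAYER_BITS + UNUSED_BITS == 64, "fields must fill 64 bits");

// Mask for the lowest `bits` bits (bits must be < 64).
constexpr uint64_t low_mask(unsigned bits) { return (uint64_t(1) << bits) - 1; }

// Place `value` into a field, discarding any bits that do not fit.
constexpr uint64_t pack_field(uint64_t value, unsigned shift, unsigned bits)
{
    return (value & low_mask(bits)) << shift;
}

// Read a field back out of a key.
constexpr uint64_t unpack_field(uint64_t key, unsigned shift, unsigned bits)
{
    return (key >> shift) & low_mask(bits);
}
```

Because the most significant fields dominate an integer comparison, sorting the packed keys as plain 64-bit integers orders commands by layer first, then deferred pass, then user bits, and so on down to the immediate pass bits.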

2 bits - Unused

Nothing to see here, moving on… (Not really sure why these 2 bits are unused, I guess they weren’t at some point but for the moment they are always zero) :)

7 bits - Layer System

This 7-bit range is managed by the “Layer system”. The Layer system is responsible for controlling the overall scheduling of a frame and is set up in the render_config file. It’s a central part of the data-driven rendering architecture in Stingray. It allows you to configure what layers to expose to the shader system and in which order these layers should be drawn. We will look closer at the implementation of the layer system in a later post, but in the interest of clarifying how it interops with the sort_key here’s a small example:

default = [
  // sort_key = [ 00000000 10000000 00000000 00000000 00000000 00000000 00000000 00000000 ]
  { name="gbuffer" render_targets=["gbuffer0", "gbuffer1", "gbuffer2", "gbuffer3"]
     depth_stencil_target="depth_stencil_buffer" sort="FRONT_BACK" profiling_scope="gbuffer" }

  // sort_key = [ 00000001 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ]
  { name="decals" render_targets=["gbuffer0" "gbuffer1"] depth_stencil_target="depth_stencil_buffer"
     profiling_scope="decal" sort="EXPLICIT" }

  // sort_key = [ 00000001 10000000 00000000 00000000 00000000 00000000 00000000 00000000 ]
  { resource_generator="lighting" profiling_scope="lighting" }

  // sort_key = [ 00000010 00000000 00000000 00000000 00000000 00000000 00000000 00000000 ]
  { name="emissive" render_targets=["hdr0"] depth_stencil_target="depth_stencil_buffer"
    sort="FRONT_BACK" profiling_scope="emissive" }
]

Above we have three layers exposed to the shader system and one kick of a resource_generator called lighting (more about resource_generators in a later post). The layers are rendered in the order they are declared. This is handled by letting each new layer increment the 7-bit range belonging to the Layer System by 1 (as can be seen in the sort_key comments above).

The shader author dictates which layer(s) the shader renders into. When a RenderJobPackage is recorded to the RenderContext (as described in the last post), the correct layer sort_keys are looked up from the layer system and the result is bitwise ORed with the sort_key value passed as an argument to RenderContext::render().
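
As a rough sketch of that lookup-and-OR step: the helper names below are hypothetical, and the bit position of the layer field (bit 55) is derived from the sort_key comments in the example above rather than taken from engine source.

```cpp
#include <cstdint>

// Assumed from the layout: the layer field starts at bit 55 and is 7 bits wide.
constexpr unsigned LAYER_FIELD_SHIFT = 55;
constexpr uint64_t LAYER_FIELD_MAX   = (uint64_t(1) << 7) - 1;

// Hypothetical helper: the n-th declared entry (0-based) gets layer value n + 1,
// matching the sort_key comments in the render_config example.
constexpr uint64_t layer_sort_key(unsigned declaration_index)
{
    return (uint64_t(declaration_index + 1) & LAYER_FIELD_MAX) << LAYER_FIELD_SHIFT;
}

// Combine the layer's bits with the key the caller passed to render().
// This assumes the caller left the layer bits of user_key at zero.
constexpr uint64_t final_sort_key(uint64_t layer_key, uint64_t user_key)
{
    return layer_key | user_key;
}
```

With this scheme layer_sort_key(0) is the gbuffer key and layer_sort_key(3) the emissive key from the example, and anything ORed into the lower bits of a gbuffer command still sorts before every decal command.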

3 bits - Shader System (Pass Deferred)

The next 3 bits are controlled by the Shader System. These three bits encode the shader pass index within a layer. When I say shader in this context I refer to our ShaderTemplate::Context, which is basically a wrapper around multiple linked shaders rendering into one or many layers. (Nathan Reed recently blogged about “The Many Meanings of ‘Shader’”; in his analogy our ShaderTemplate is the same as an “Effect”.)

Since we can have a multi-pass shader rendering into the same layer, we need to encode the pass index into the sort_key. That is what this 3-bit range is used for.

32 bits - User defined

We then have 32 user-defined bits. These bits are primarily used by our “Resource Generator” system (I will be covering this system in the post about render_config & data-driven rendering later), but the user is free to use them any way they like and still maintain compatibility with the data-driven rendering system.

1 bit - Instance bit

This single bit also comes from the Shader System and is set if the shader implements support for “Instance Merging”. I will be covering this in a bit more detail in my next post about the RenderDevice, but essentially this bit allows us to scan through all commands and find ranges of commands that can potentially be merged into fewer draw calls.

16 bits - Depth

One of the arguments piped to RenderContext::render() is an unsigned normalized depth value (0.0-1.0). This value gets quantized into these 16 bits and is what drives the front-to-back vs back-to-front sorting of RenderJobPackages. If the sort criterion for the layer (see the layer example above) is set to back-to-front, we simply flip the bits in this range.
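
A minimal sketch of the quantize-and-flip idea follows. The function names, the clamping, and the rounding mode are my assumptions for illustration; only the 16-bit width, the position in the key, and the bit flip come from the description above.

```cpp
#include <cstdint>

// Assumed from the layout: depth occupies 16 bits starting at bit 3.
constexpr unsigned DEPTH_FIELD_SHIFT = 3;
constexpr uint64_t DEPTH_FIELD_MAX   = 0xFFFF;

// Quantize a normalized depth to 16 bits. Out-of-range input is clamped;
// the rounding mode is a guess, the post does not specify one.
inline uint64_t quantize_depth(float depth)
{
    if (!(depth > 0.0f)) return 0;              // also catches NaN
    if (depth >= 1.0f) return DEPTH_FIELD_MAX;
    return uint64_t(depth * float(DEPTH_FIELD_MAX) + 0.5f);
}

// Build the depth bits of a sort_key. For back-to-front layers the 16 bits
// are inverted so that larger depths sort first.
inline uint64_t depth_sort_bits(float depth, bool back_to_front)
{
    uint64_t d = quantize_depth(depth);
    if (back_to_front)
        d = ~d & DEPTH_FIELD_MAX;
    return d << DEPTH_FIELD_SHIFT;
}
```

Flipping the bits turns an ascending sort into a descending one for this field only, so the sort itself never needs to know which criterion a layer uses.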

3 bits - Shader System (Pass Immediate)

A shader can be configured to run in “Immediate Mode” instead of “Deferred Mode” (default). This forces passes in a multi-pass shader to run immediately after each other and is achieved by moving the pass index bits into the least significant bits of the sort_key. The concept is probably easiest to explain with an artificial example and some pseudo code:

Take a simple scene with a few instances of the same mesh, each mesh recording one RenderJobPackage to one or many RenderContexts, and all RenderJobPackages being rendered with the same multi-pass shader.

In “Deferred Mode” (i.e., pass indices encoded in the “Shader System (Pass Deferred)” range) you would get something like this:

foreach (pass in multi-pass-shader)
  foreach (render-job in render-job-packages)
    render (render-job, pass)

If the shader is configured to run in “Immediate Mode” you would instead get something like this:

foreach (render-job in render-job-packages)
  foreach (pass in multi-pass-shader)
    render (render-job, pass)

As you can probably imagine, the latter results in more shader / state switches but can sometimes be necessary to guarantee correctly rendered results. A typical example is when using multi-pass shaders that do alpha blending.
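
The two loop orders fall out of the sort purely from where the pass index sits in the key. Here is a small sketch that records every (job, pass) pair and sorts. The struct, the function, and the choice of giving job j a quantized depth of j + 1 are all made up for the example; only the bit positions come from the layout described earlier.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Bit positions assumed from the sort_key layout.
constexpr unsigned PASS_DEFERRED_POS  = 52;
constexpr unsigned DEPTH_POS          = 3;
constexpr unsigned PASS_IMMEDIATE_POS = 0;

struct PassCommand {
    uint64_t sort_key;
    int job;   // which render job this command belongs to
    int pass;  // which shader pass it runs
};

// Record every (job, pass) pair; job j gets quantized depth j + 1.
// All names here are illustrative, not engine API.
std::vector<PassCommand> record_and_sort(int jobs, int passes, bool immediate)
{
    const unsigned pass_pos = immediate ? PASS_IMMEDIATE_POS : PASS_DEFERRED_POS;
    std::vector<PassCommand> cmds;
    for (int j = 0; j < jobs; ++j) {
        for (int p = 0; p < passes; ++p) {
            uint64_t key = (uint64_t(j + 1) << DEPTH_POS) | (uint64_t(p & 7) << pass_pos);
            cmds.push_back({key, j, p});
        }
    }
    std::stable_sort(cmds.begin(), cmds.end(),
        [](const PassCommand &a, const PassCommand &b) { return a.sort_key < b.sort_key; });
    return cmds;
}
```

With two jobs and two passes, the deferred placement yields all of pass 0 followed by all of pass 1, while the immediate placement yields both passes of job 0 followed by both passes of job 1. Note that the sketch relies on the jobs having distinct depth values.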

Wrap up

The actual sort is implemented using a standard stable radix sort and happens immediately after the user has called RenderDevice::dispatch(), handing over any number of RenderContexts to the RenderDevice for translation into graphics API calls.
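
For readers who haven’t written one, a textbook stable radix sort over 64-bit keys can look roughly like this. It is a generic least-significant-digit variant with 8-bit digits, not the engine’s implementation; the SortEntry struct and the digit size are assumptions chosen to keep the sketch short.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

struct SortEntry {
    uint64_t sort_key;
    uint32_t command_index;  // hypothetical handle to the recorded command
};

// Stable least-significant-digit radix sort: 8 passes of 8 bits each.
// Every pass is a counting sort that preserves the order of equal digits,
// which makes the whole sort stable.
void radix_sort_by_key(std::vector<SortEntry> &entries)
{
    std::vector<SortEntry> scratch(entries.size());
    for (unsigned shift = 0; shift < 64; shift += 8) {
        std::array<std::size_t, 256> count{};
        for (const SortEntry &e : entries)
            ++count[(e.sort_key >> shift) & 0xFF];

        // Exclusive prefix sum turns counts into start offsets per digit.
        std::size_t offset = 0;
        for (std::size_t &c : count) {
            std::size_t n = c;
            c = offset;
            offset += n;
        }

        // Scatter in input order so equal digits keep their relative order.
        for (const SortEntry &e : entries)
            scratch[count[(e.sort_key >> shift) & 0xFF]++] = e;

        entries.swap(scratch);
    }
}
```

The cost is linear in the number of commands and independent of how the keys are distributed, which fits the earlier point about the sort being easy to isolate and profile.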

The next post will cover RenderDevice::dispatch() and give an overview of what a typical rendering back-end (RenderDevice) looks like in Stingray. Stay tuned.

