Overview
The RenderDevice is essentially our abstraction layer for platform-specific rendering APIs. It is implemented as an abstract base class that the various rendering back-ends (D3D11, D3D12, OGL, Metal, GNM, etc.) implement.
The RenderDevice has a bunch of helper functions for initializing/shutting down the graphics APIs, creating/destroying swap chains, etc. All of them are fairly straightforward, so I won’t cover them in this post. Instead I will focus on the two dispatch functions consuming RenderResourceContexts and RenderContexts:
class RenderDevice {
public:
    virtual void dispatch(uint32_t n_contexts, RenderResourceContext **rrc,
        uint32_t gpu_affinity_mask = RenderContext::GPU_DEFAULT) = 0;
    virtual void dispatch(uint32_t n_contexts, RenderContext **rc,
        uint32_t gpu_affinity_mask = RenderContext::GPU_DEFAULT) = 0;
};
Resource Management
As covered in the post about RenderResourceContexts, they provide a free-threaded interface for allocating and deallocating GPU resources. However, it is not until the user has called RenderDevice::dispatch(), handing over the RenderResourceContexts, that their representation gets created on the RenderDevice side.
All implementations of a RenderDevice have some form of resource management that deals with creating, updating and destroying the graphics API specific representations of resources. Typically we track the state of all the various types of resources in a single struct. Here’s a stripped down example from the DX12 RenderDevice implementation, called D3D12ResourceContext:
struct D3D12VertexBuffer
{
    D3D12_VERTEX_BUFFER_VIEW view;
    uint32_t allocation_index;
    int32_t size;
};
struct D3D12IndexBuffer
{
    D3D12_INDEX_BUFFER_VIEW view;
    uint32_t allocation_index;
    int32_t size;
};
struct D3D12ResourceContext
{
    Array<D3D12VertexBuffer> vertex_buffers;
    Array<uint32_t> unused_vertex_buffers;
    Array<D3D12IndexBuffer> index_buffers;
    Array<uint32_t> unused_index_buffers;
    // .. lots of other resources
    Array<uint32_t> resource_lut;
};
As you might remember, the linking between the engine representation and the RenderDevice representation is done using the RenderResource::render_resource_handle. It encodes both the type of the resource and a handle. The resource_lut is an indirection to go from the engine handle to a local index for a specific type (e.g. vertex_buffers or index_buffers in the sample above). We also track freed indices for each type (e.g. unused_vertex_buffers) to simplify recycling of slots.
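To make the indirection concrete, here is a minimal sketch of how a handle could be resolved through resource_lut and how freed slots could be recycled. The handle layout (type in the top 8 bits, engine-side id in the low 24), the helper names and the use of std::vector in place of Array are all assumptions for illustration, not the actual Stingray code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical handle layout: top 8 bits = resource type, low 24 bits = engine id.
enum ResourceType : uint32_t { VERTEX_BUFFER = 1, INDEX_BUFFER = 2 };
constexpr uint32_t TYPE_SHIFT = 24;
constexpr uint32_t ID_MASK = (1u << TYPE_SHIFT) - 1;
constexpr uint32_t INVALID_INDEX = 0xffffffffu;

inline uint32_t make_handle(ResourceType type, uint32_t id) { return (uint32_t(type) << TYPE_SHIFT) | (id & ID_MASK); }
inline uint32_t handle_type(uint32_t handle) { return handle >> TYPE_SHIFT; }
inline uint32_t handle_id(uint32_t handle) { return handle & ID_MASK; }

// Stand-in for D3D12VertexBuffer without the API specific view.
struct VertexBuffer { uint32_t allocation_index = 0; int32_t size = 0; };

struct ResourceContext {
    std::vector<VertexBuffer> vertex_buffers;
    std::vector<uint32_t> unused_vertex_buffers; // freed slots, ready for reuse
    std::vector<uint32_t> resource_lut;          // engine id -> index into typed array
};

// Allocates a slot (recycling a freed one if possible) and records it in the LUT.
uint32_t allocate_vertex_buffer(ResourceContext &rc, uint32_t handle, int32_t size)
{
    assert(handle_type(handle) == VERTEX_BUFFER);
    uint32_t index;
    if (!rc.unused_vertex_buffers.empty()) {
        index = rc.unused_vertex_buffers.back();
        rc.unused_vertex_buffers.pop_back();
    } else {
        index = uint32_t(rc.vertex_buffers.size());
        rc.vertex_buffers.emplace_back();
    }
    rc.vertex_buffers[index].size = size;
    const uint32_t id = handle_id(handle);
    if (id >= rc.resource_lut.size())
        rc.resource_lut.resize(id + 1, INVALID_INDEX);
    rc.resource_lut[id] = index;
    return index;
}

// Translates an engine handle into the device-side representation.
VertexBuffer &lookup_vertex_buffer(ResourceContext &rc, uint32_t handle)
{
    const uint32_t id = handle_id(handle);
    assert(id < rc.resource_lut.size() && rc.resource_lut[id] != INVALID_INDEX);
    return rc.vertex_buffers[rc.resource_lut[id]];
}

// Marks the slot as reusable and clears the LUT entry.
void free_vertex_buffer(ResourceContext &rc, uint32_t handle)
{
    const uint32_t id = handle_id(handle);
    assert(id < rc.resource_lut.size() && rc.resource_lut[id] != INVALID_INDEX);
    rc.unused_vertex_buffers.push_back(rc.resource_lut[id]);
    rc.resource_lut[id] = INVALID_INDEX;
}
```

Because only the dispatch of RenderResourceContexts touches these arrays, none of this needs locking in the sketch.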
The implementation of the dispatch function is fairly straightforward. We simply iterate over all the RenderResourceContexts and, for each context, iterate over its commands and either allocate or deallocate resources in the D3D12ResourceContext. It is important to note that this is a synchronous operation: nothing else is peeking or poking at the D3D12ResourceContext while the dispatch of RenderResourceContexts is happening, which makes our life a lot easier.
Unfortunately that isn’t the case when we dispatch RenderContexts, as in that case we want to go wide (i.e. fork the workload and process it using multiple worker threads) when translating the commands into API calls. While we don’t allow allocating and deallocating resources from the RenderContexts, we do allow updating them, which mutates the state of the RenderDevice representations (e.g. a D3D12VertexBuffer).
At the moment our solution for this isn’t very nice: we don’t allow asynchronous updates for anything other than DYNAMIC buffers. UPDATABLE buffers are always updated serially before we kick the worker threads, no matter what their sort_key is. All worker threads access resources through their own copy of something we call a ResourceAccessor, which is responsible for tracking the worker thread’s state of dynamic buffers (among other things). In the future I think we should probably generalize this and treat UPDATABLE buffers in a similar way.
(Note: this limitation doesn’t mean you can’t update an UPDATABLE buffer more than once per frame, it simply means you cannot update it more than once per dispatch).
Shaders
Resources in the D3D12ResourceContext are typically buffers. One exception that stands out is the RenderDevice representation of a “shader”. A “shader” on the RenderDevice side maps to a ShaderTemplate::Context on the engine side, or what I guess we could call a multi-pass shader. Here’s some pseudo code:
struct ShaderPass
{
    struct ShaderProgram
    {
        Array<uint8_t> bytecode;
        struct ConstantBufferBindInfo;
        struct ResourceBindInfo;
        struct SamplerBindInfo;
    };
    ShaderProgram vertex_shader;
    ShaderProgram domain_shader;
    ShaderProgram hull_shader;
    ShaderProgram geometry_shader;
    ShaderProgram pixel_shader;
    ShaderProgram compute_shader;
    struct RenderStates;
};
struct Shader
{
    Vector<ShaderPass> passes;
    enum SortMode { IMMEDIATE, DEFERRED };
    uint32_t sort_mode;
};
The pseudo code above is essentially the RenderDevice representation of a shader that we serialize to disk during data compilation. From that we can create all the necessary graphics API specific objects expressing an executable shader together with its various state blocks (Rasterizer, Depth Stencil, Blend, etc.).
As discussed in the last post the sort_key encodes the shader pass index. Using Shader::sort_mode, we know which bit range to extract from the sort_key as pass index, which we then use to look up the ShaderPass from Shader::passes. A ShaderPass contains one ShaderProgram per active shader stage and each ShaderProgram contains the byte code for the shader to compile as well as “bind info” for various resources that the shader wants as input.
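As a rough illustration of the lookup described above, the sketch below picks a bit range based on the sort mode and uses the extracted value as an index into the passes. The bit positions and widths in PASS_RANGE are made up for the example (the real sort_key layout is described in the previous post), and ShaderPass is reduced to a dummy id.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum SortMode : uint32_t { IMMEDIATE, DEFERRED };

struct BitRange { uint32_t shift; uint32_t bits; };

// Hypothetical bit ranges holding the pass index for each sort mode.
constexpr BitRange PASS_RANGE[2] = {
    { 0, 3 },  // IMMEDIATE: assumed to live in the lowest bits
    { 32, 3 }, // DEFERRED: assumed to live higher up in the key
};

inline uint32_t extract_bits(uint64_t key, BitRange range)
{
    return uint32_t((key >> range.shift) & ((uint64_t(1) << range.bits) - 1));
}

// Dummy stand-ins for the structs in the pseudo code above.
struct ShaderPass { uint32_t id; };
struct Shader {
    std::vector<ShaderPass> passes;
    uint32_t sort_mode = IMMEDIATE;
};

// Picks the bit range from the sort mode, then indexes into the passes.
const ShaderPass &lookup_pass(const Shader &shader, uint64_t sort_key)
{
    assert(shader.sort_mode <= DEFERRED);
    const uint32_t pass_index = extract_bits(sort_key, PASS_RANGE[shader.sort_mode]);
    assert(pass_index < shader.passes.size());
    return shader.passes[pass_index];
}
```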
We will look at this in a bit more detail in the post about “Shaders & Materials”, for now I just wanted to familiarize you with the concept.
Render Context translation
Let’s move on and look at the dispatch for translating RenderContexts into graphics API calls:
class RenderDevice {
public:
    virtual void dispatch(uint32_t n_contexts, RenderContext **rc,
        uint32_t gpu_affinity_mask = RenderContext::GPU_DEFAULT) = 0;
};
The first thing all RenderDevice implementations do when receiving a bunch of RenderContexts is to merge and sort their Commands. All implementations share the same code for doing this:
void prepare_command_list(RenderContext::Commands &output, unsigned n_contexts, RenderContext **contexts);
This function basically just takes the RenderContext::Commands from all RenderContexts and merges them into a new array, runs a stable radix sort, and returns the sorted commands in output. To avoid memory allocations the RenderDevice implementation owns the memory of the output buffer.
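Here is a minimal sketch of what such a merge-and-sort step could look like, assuming a 64-bit sort_key and using a plain LSD radix sort (which is stable by construction). The Command and RenderContext structs are simplified stand-ins, and the local scratch buffer is an assumption; a real implementation would presumably keep it around between frames to avoid the allocation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified stand-ins for the engine types.
struct Command { uint64_t sort_key; void *head; uint32_t command_flags; };
struct RenderContext { std::vector<Command> commands; };

void prepare_command_list(std::vector<Command> &output, unsigned n_contexts, RenderContext **contexts)
{
    assert(contexts != nullptr || n_contexts == 0);

    // Merge: append the commands of every context, preserving submission order.
    output.clear();
    for (unsigned i = 0; i != n_contexts; ++i)
        output.insert(output.end(), contexts[i]->commands.begin(), contexts[i]->commands.end());

    // Sort: LSD radix sort, 8 passes over 8 bits each. Every pass is a stable
    // counting sort, so commands with equal keys keep their relative order.
    std::vector<Command> scratch(output.size());
    for (unsigned shift = 0; shift != 64; shift += 8) {
        std::array<size_t, 257> offsets{};
        for (const Command &c : output)
            ++offsets[((c.sort_key >> shift) & 0xff) + 1];
        for (size_t b = 1; b != offsets.size(); ++b)
            offsets[b] += offsets[b - 1]; // offsets[d] = number of keys with digit < d
        for (const Command &c : output)
            scratch[offsets[(c.sort_key >> shift) & 0xff]++] = c;
        output.swap(scratch); // even number of passes, so the result ends up in output
    }
}
```

Stability matters here because commands that share a sort_key are expected to execute in the order they were recorded.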
Now we have all the commands nicely sorted based on their sort_key. Next step is to do the actual translation of the data referenced by the commands into graphics API calls. I will explain this process with the assumption that we are running on a graphics API that allows us to build graphics API command lists in parallel (e.g. DX12, GNM, Vulkan, Metal), as that feels most relevant in 2017.
Before we start figuring out our per-thread workloads for going wide, we have one more thing to do: “instance merging”.
Instance Merging
I’ve mentioned the idea behind instance merging before [1,2]. Basically we want to try to reduce the number of RenderJobPackages (i.e. draw calls) by identifying packages that are similar enough to be merged. In Stingray “similar enough” basically means that they must have identical inputs to the input assembler as well as identical resources bound to all shader stages; the only things that are allowed to differ are constant buffer variables. (Note: by today’s standards this can be considered a bit old school. New graphics APIs and hardware allow us to tackle this problem more aggressively using “bindless” concepts.)
The way it works is by filtering out ranges of RenderContext::Commands where the “instance bit” of the sort_key is set and all bits above the instance bit are identical. Then for each of those ranges we fork and go wide to analyze the actual RenderJobPackage data to see if the instance_hash and the shader are the same, and if so we know it’s safe to merge them.
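The filtering step can be sketched as a single linear scan over the sorted keys. The position of the instance bit (INSTANCE_BIT) is a made-up value for the example, and the function only finds candidate ranges; comparing instance_hash and shader within each range is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical position of the "instance bit" within the sort_key.
constexpr unsigned INSTANCE_BIT = 16;
constexpr uint64_t INSTANCE_MASK = uint64_t(1) << INSTANCE_BIT;
// All bits above the instance bit.
constexpr uint64_t UPPER_MASK = ~((INSTANCE_MASK << 1) - 1);

struct Range { size_t begin, end; }; // half-open: [begin, end)

// Returns ranges of two or more consecutive keys that have the instance bit
// set and identical bits above it. These are only candidates for merging.
std::vector<Range> find_instance_ranges(const std::vector<uint64_t> &sorted_keys)
{
    assert(std::is_sorted(sorted_keys.begin(), sorted_keys.end()));
    std::vector<Range> ranges;
    const size_t n = sorted_keys.size();
    size_t i = 0;
    while (i < n) {
        if (!(sorted_keys[i] & INSTANCE_MASK)) {
            ++i;
            continue;
        }
        const uint64_t upper = sorted_keys[i] & UPPER_MASK;
        size_t j = i + 1;
        while (j < n && (sorted_keys[j] & INSTANCE_MASK) && (sorted_keys[j] & UPPER_MASK) == upper)
            ++j;
        if (j - i > 1) // a single command has nothing to merge with
            ranges.push_back({ i, j });
        i = j;
    }
    return ranges;
}
```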
The actual merge is done by extracting the instance specific constants (these are tagged by the shader author) from the constant buffers and propagating them into a dynamic RawBuffer that gets bound as input to the vertex shader.
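A simplified sketch of the gathering step, under the assumption that the instance specific constants can be described as (offset, size) pairs into each package's constant buffer. InstanceVariable and gather_instance_data are hypothetical names, and a byte vector stands in for the dynamic RawBuffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one constant tagged as instance specific.
struct InstanceVariable { uint32_t offset; uint32_t size; };

// Copies the tagged constants of every merged package, back to back, into a
// single byte array that could then be uploaded as the instance buffer.
std::vector<uint8_t> gather_instance_data(
    const std::vector<std::vector<uint8_t>> &constant_buffers,
    const std::vector<InstanceVariable> &variables)
{
    std::vector<uint8_t> raw;
    for (const std::vector<uint8_t> &cb : constant_buffers) {
        for (const InstanceVariable &v : variables) {
            assert(size_t(v.offset) + v.size <= cb.size());
            raw.insert(raw.end(), cb.begin() + v.offset, cb.begin() + v.offset + v.size);
        }
    }
    return raw;
}
```

With this layout every instance occupies the same number of bytes, so the vertex shader could index the buffer by instance id times that stride.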
Depending on how the scene is constructed, instance merging can significantly reduce the number of draw calls needed to render the final scene. The instance merger in itself is not graphics API specific and is isolated in its own system; it just happens to be the responsibility of the RenderDevice to call it. The interface looks like this:
namespace instance_merger {
    struct ProcessMergedCommandsResult
    {
        uint32_t n_instances;
        uint32_t instanced_batches;
        uint32_t instance_buffer_size;
    };
    ProcessMergedCommandsResult process_merged_commands(Merger &instance_merger,
        RenderContext::Commands &merged_commands);
}
Pass in a reference to the sorted RenderContext::Commands in merged_commands and after the instance merger is done running you hopefully have fewer commands in the array. :)
You could argue that merging, sorting and instance merging should all happen before we enter the world of the RenderDevice. I wouldn’t argue against that.
Prepare workloads
The last step before we can start translating our commands into state / draw / dispatch calls is to split the workload into reasonable chunks and prepare the execution contexts for our worker threads.
Typically we just divide the number of RenderContext::Commands we have to process by the number of worker threads we have available. We don’t look at the types of the commands we will be processing or try to load balance based on them. The reasoning behind this is that we anticipate that draw calls will always represent the bulk of the commands and the rest of the commands can be considered unavoidable “noise”. We do, however, make sure that we don’t process fewer than x commands per worker thread, where x can differ a bit depending on platform but is usually ~128.
For each execution context we create a ResourceAccessor (described above) and make sure we have the correct state set up in terms of bound render targets and similar. To do this we are stuck with having to do a synchronous serial sweep over all the commands to find bigger state changing commands (such as RenderContext::set_render_target).
This is where the Command::command_flags bit-flag comes into play. Instead of having to jump around in memory to figure out what type of command the Command::head points to, we put some hints about the type in Command::command_flags, for example whether it is a “state command”. This way the serial sweep doesn’t become very costly even when dealing with a large number of commands. During this sweep we also deal with updating of UPDATABLE resources, and on newer graphics APIs we track fences (discussed in the post about Render Contexts).
The last thing we do is to set up the execution contexts by creating graphics API specific representations of command lists (e.g. ID3D12GraphicsCommandList in DX12).
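The splitting rule can be sketched in a few lines. The function and struct names are hypothetical, and the default minimum of 128 is taken from the "usually ~128" figure above; the sketch simply uses fewer chunks than workers when there aren't enough commands to go around.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

struct Workload { uint32_t begin, end; }; // half-open command range [begin, end)

// Splits n_commands evenly over at most n_workers chunks, without letting a
// chunk drop below min_per_worker commands (unless there is only one chunk).
std::vector<Workload> split_workloads(uint32_t n_commands, uint32_t n_workers,
    uint32_t min_per_worker = 128)
{
    assert(n_workers > 0 && min_per_worker > 0);
    std::vector<Workload> result;
    if (n_commands == 0)
        return result;

    const uint32_t max_chunks = std::max<uint32_t>(1, n_commands / min_per_worker);
    const uint32_t n_chunks = std::min<uint32_t>(n_workers, max_chunks);
    const uint32_t base = n_commands / n_chunks;
    const uint32_t remainder = n_commands % n_chunks;

    uint32_t begin = 0;
    for (uint32_t i = 0; i != n_chunks; ++i) {
        // The first `remainder` chunks take one extra command each.
        const uint32_t count = base + (i < remainder ? 1u : 0u);
        result.push_back({ begin, begin + count });
        begin += count;
    }
    return result;
}
```

For example, 1000 commands on 8 workers end up as 7 chunks of 142 or 143 commands, since an eighth chunk would push every chunk below 128.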
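A minimal sketch of such a sweep, assuming a single hypothetical STATE_COMMAND flag and treating every state command as a render target switch (the real sweep handles more command types, plus UPDATABLE resources and fences). Only flagged commands have their head dereferenced, which is the point of keeping the hint in command_flags.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flag value; the real command_flags layout is not shown in the post.
enum CommandFlags : uint32_t { STATE_COMMAND = 1u << 0 };

struct Command { uint64_t sort_key; const void *head; uint32_t command_flags; };
struct SetRenderTarget { uint32_t target; };

struct Workload {
    uint32_t begin, end;            // half-open command range
    uint32_t initial_render_target; // state the worker must start from
};

// Serial sweep: walks all commands once and records, for every workload, the
// render target that is bound when its first command is reached.
// Assumes workloads are sorted by `begin` and cover valid command indices.
void resolve_initial_state(const std::vector<Command> &commands, std::vector<Workload> &workloads)
{
    uint32_t current_target = 0; // assume 0 means the back buffer
    size_t w = 0;
    for (uint32_t i = 0; i != commands.size(); ++i) {
        while (w < workloads.size() && workloads[w].begin == i)
            workloads[w++].initial_render_target = current_target;
        if (commands[i].command_flags & STATE_COMMAND) {
            assert(commands[i].head != nullptr);
            current_target = static_cast<const SetRenderTarget *>(commands[i].head)->target;
        }
    }
}
```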
Translation
Once we get to this point, doing the actual translation is fairly straightforward. Within each worker thread we simply loop over its dedicated range of commands, fetch each command’s data from Command::head and generate any number of API specific commands necessary based on the type of command.
For a RenderJobPackage representing a draw call it involves:
- Look up the correct shader pass and, unless already bound, bind all active shader stages
- Look up the state blocks (Rasterizer, Depth stencil, Blending, etc.) from the shader and bind them unless already bound
- Look up and bind the resources for each shader stage using the RenderResource::render_resource_handle translated through the D3D12ResourceAccessor
- Set up the input assembler by looping over the RenderResource::render_resource_handles pointed to by the RenderJobPackage::resource_offset and translated through the D3D12ResourceAccessor
- Bind and potentially update constant buffers
- Issue the draw call
The execution contexts also hold most-recently-used caches to avoid unnecessary binds of resources/shaders/states etc.
Note: In DX12 we also track where resource barriers are needed during this stage. After all worker threads are done we might also end up having to inject further resource barriers between the command lists generated by the worker threads. We have ideas on how to improve on this by doing at least parts of this tracking when building the RenderContexts, but haven’t gotten around to looking into it yet.
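In its simplest form such a cache just remembers what was last bound to each slot and skips the API call when nothing changed. The sketch below is a deliberately reduced version with hypothetical names; it only counts the binds that would actually be issued.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Per execution context cache of the resource last bound to each slot.
struct BindCache {
    static constexpr uint32_t N_SLOTS = 16;        // assumed slot count
    static constexpr uint32_t UNBOUND = 0xffffffffu;

    std::array<uint32_t, N_SLOTS> bound;
    uint32_t n_binds = 0; // number of binds that reached the "API"

    BindCache() { bound.fill(UNBOUND); }

    // Returns true if an API bind was needed, false if it was filtered out.
    bool bind(uint32_t slot, uint32_t resource)
    {
        assert(slot < N_SLOTS);
        if (bound[slot] == resource)
            return false; // already bound, skip the redundant call
        bound[slot] = resource;
        ++n_binds;        // a real implementation would issue the API call here
        return true;
    }
};
```

Since every worker thread records into its own command list, each execution context needs its own cache; state does not carry over between command lists.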
Execute
When the translation is done we pass the resulting command lists to the correct queues for execution.
Note: In DX12 this is a bit more complicated as we have to interleave signaling / waiting on fences between command list executions (ExecuteCommandLists).
Next up
I’ve deliberately not gone into too much detail in this post, to make it a bit easier to digest. I think I’ve managed to cover the overall design of a RenderDevice though, enough to make it easier for people diving into the code for the first time.
With this post we’ve reached the halfway point of this series and have covered the “low-level” aspects of the Stingray rendering architecture. As of the next post we will start looking at more high-level stuff, starting with the RenderInterface, which is the main interface for other threads to talk with the renderer.