Published by Milton Knight. Modified over 9 years ago.
1
Maths & Technologies for Games
Graphics Optimisation - Batching
CO3303 Week 5
2
Today’s Lecture
1. Rendering Process in Practice
2. State Changes
   – Improving the Process
3. Batches & Batching
   – Performance Characteristics
4. Adding Flexibility
3
Rendering Process in Practice
Consider the process of rendering primitives in a naïve graphics engine
For each model:
1. Select a vertex / pixel shader
2. Get view / projection matrices from camera
3. Get world matrix from model
4. Set matrices in vertex / pixel shaders
5. Set other constants in shaders, e.g. light positions / colours, material colours
6. Select textures
7. Set the vertex / index buffer for the model
8. Set render states (e.g. alpha blending)
9. Render primitives
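The cost of the nine steps above can be made visible with a small counting sketch. Everything here is hypothetical: `CountingDevice`, `Model` and `render_naive` are illustration names, not part of any real graphics API, and the "device" only counts calls rather than rendering anything.

```python
from dataclasses import dataclass


@dataclass
class CountingDevice:
    """Hypothetical stand-in for a graphics device: it only counts calls."""
    state_changes: int = 0
    draw_calls: int = 0

    def set_state(self, name, value):
        # Every call is counted, even if the value did not change.
        self.state_changes += 1

    def draw(self, triangle_count):
        self.draw_calls += 1


@dataclass
class Model:
    mesh: str           # name of the mesh this model uses
    world_matrix: str   # placeholder for a real 4x4 matrix
    triangles: int = 100


def render_naive(device, models, camera):
    """Set every piece of state again for every model (steps 1 to 9)."""
    for model in models:
        device.set_state("shader", "basic")                 # step 1
        view, proj = camera                                 # step 2
        world = model.world_matrix                          # step 3
        device.set_state("matrices", (world, view, proj))   # step 4
        device.set_state("constants", "lights+material")    # step 5
        device.set_state("textures", model.mesh + ".png")   # step 6
        device.set_state("buffers", model.mesh)             # step 7
        device.set_state("render_states", "opaque")         # step 8
        device.draw(model.triangles)                        # step 9


device = CountingDevice()
models = [Model("tree", f"world{i}") for i in range(50)]
render_naive(device, models, camera=("view", "proj"))
print("state changes:", device.state_changes, "draw calls:", device.draw_calls)
```

In this sketch the number of state changes grows as six per model, even though all 50 models share one mesh and most of the state is identical between them.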
4
State Changes
Graphics devices are state-based:
– Set a collection of states
– Render a batch of primitives
States include:
– Render states – e.g. alpha blending, culling
– Shader state – shader selection, constants
– Texture state – texture selection, filtering modes
– Data state – vertex / index buffer selection
State changes are expensive
– Some more expensive than others
The process described above contains many state changes for every model – very inefficient
5
Improving the Process
Need to reduce the number of state changes during rendering
– The process above had many redundant state changes
Some global state only needs to be set once
– E.g. camera matrices
Some states are shared by all models of a given mesh
– E.g. vertex / index buffers
– Set once & render all the mesh’s models together
Only a few states must be set per-model
– E.g. world matrix
6
Improving the Process
Example process with fewer state changes:
1. Get view / projection matrices from camera
2. Set camera matrices in every shader / constant buffer
3. Set other global shader constants, e.g. light positions
For each mesh type
   1. Set render states (e.g. alpha blending)
   2. Select vertex / pixel shader
   3. Set mesh constants, e.g. material colour
   4. Select mesh textures
   5. Set vertex / index buffer used by mesh
   For each model using this mesh
      1. Get world matrix from model and set in shader
      2. Render primitives
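The revised process above might be sketched as follows. As before, `CountingDevice`, `Model` and `render_grouped` are hypothetical names and the device only counts calls. One simplifying assumption: the camera matrices are set with a single call, whereas the slide sets them in every shader / constant buffer.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CountingDevice:
    """Hypothetical stand-in for a graphics device: it only counts calls."""
    state_changes: int = 0
    draw_calls: int = 0

    def set_state(self, name, value):
        self.state_changes += 1

    def draw(self, triangle_count):
        self.draw_calls += 1


@dataclass
class Model:
    mesh: str           # name of the mesh this model uses
    world_matrix: str   # placeholder for a real 4x4 matrix
    triangles: int = 100


def render_grouped(device, models, camera):
    # Global state: set once per frame.
    device.set_state("camera_matrices", camera)
    device.set_state("global_constants", "light positions")

    # Group the models by the mesh they use.
    by_mesh = defaultdict(list)
    for model in models:
        by_mesh[model.mesh].append(model)

    for mesh, mesh_models in by_mesh.items():
        # Per-mesh state: set once per mesh type.
        device.set_state("render_states", "opaque")
        device.set_state("shader", "basic")
        device.set_state("mesh_constants", mesh + ".material")
        device.set_state("textures", mesh + ".png")
        device.set_state("buffers", mesh)
        for model in mesh_models:
            # Per-model state: only the world matrix.
            device.set_state("world_matrix", model.world_matrix)
            device.draw(model.triangles)


device = CountingDevice()
models = ([Model("tree", f"tree{i}") for i in range(30)]
          + [Model("rock", f"rock{i}") for i in range(20)])
render_grouped(device, models, camera=("view", "proj"))
print("state changes:", device.state_changes, "draw calls:", device.draw_calls)
```

The draw-call count is unchanged (one per model); only the state changes around the draws have been reduced, to two global, five per mesh type and one per model.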
7
Limitations of Revised Approach
Above process suffers from inflexibility:
– Assumes same render state / shader / textures for the whole of each mesh
   I.e. assumes each mesh is made of one material
   Meshes often consist of several different materials, e.g. car made of metal, glass and rubber
– Assumes same lights for everything in scene
   Not true in large environment
Assumes that rendering is done in one pass:
– List of models traversed – rendered as they are found
– Causes tension between state change efficiency and engine flexibility
8
Sub-Meshes - Multiple Materials
First consider splitting each mesh into sub-meshes, each using a single material
– Allows meshes with multiple materials
– A material determines render state, shader, textures etc.
Give each material a name (or a UID?)
– Artists create models using these named materials
– Direct mapping from artwork to engine operation (shader / render state etc.)
– We previously handled such material state manually
– This is a huge improvement to the asset production process
9
“Material Buckets”
To improve state change performance we can decouple rendering from models:
– Traverse list of models, but don’t render immediately
– Instead distribute their sub-meshes to “material buckets”
   Each bucket represents a single material
   Receives sub-meshes using that material
   Store model state in bucket with each sub-mesh: world matrix, incident lights, etc.
– In a second pass, render contents of each bucket in turn
– Large reduction in state changes
The first pass in this process is an example of a “pre-render” stage, as discussed in Games Dev 2
10
Material Bucket Example
For each model
   1. Distribute sub-meshes into material buckets, include model state such as world matrix, incident lights
For each material bucket
   1. Set render states for material (e.g. alpha blending)
   2. Select vertex / pixel shader & constants for material
   3. Get view / projection matrices from camera & set in shader
   4. Select textures for material
   5. Set the vertex / index buffer used by the bucket (assuming shared buffer for all sub-meshes with same material)
   For each sub-mesh in the bucket
      1. Set associated model state / shaders (world matrix, etc.)
      2. Render primitives
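A minimal sketch of the two passes above, under the same assumptions as the slide (one shared vertex / index buffer per material). All names (`CountingDevice`, `SubMesh`, `Model`, `fill_buckets`, `render_buckets`) are hypothetical, and the device only counts calls.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class CountingDevice:
    """Hypothetical stand-in for a graphics device: it only counts calls."""
    state_changes: int = 0
    draw_calls: int = 0

    def set_state(self, name, value):
        self.state_changes += 1

    def draw(self, triangle_count):
        self.draw_calls += 1


@dataclass
class SubMesh:
    material: str   # material name, as assigned by the artist
    triangles: int


@dataclass
class Model:
    name: str
    world_matrix: str   # placeholder for a real 4x4 matrix
    sub_meshes: list


def fill_buckets(models):
    """Pass 1 (pre-render): distribute sub-meshes into material buckets."""
    buckets = defaultdict(list)
    for model in models:
        for sub in model.sub_meshes:
            # Store the model state needed later alongside the sub-mesh.
            buckets[sub.material].append((sub, model.world_matrix))
    return buckets


def render_buckets(device, buckets, camera):
    """Pass 2: render each bucket, setting material state once per bucket."""
    for material, entries in buckets.items():
        device.set_state("render_states", material)
        device.set_state("shader_and_constants", material)
        device.set_state("camera_matrices", camera)
        device.set_state("textures", material)
        device.set_state("buffers", material)   # assumed shared per material
        for sub, world_matrix in entries:
            device.set_state("world_matrix", world_matrix)
            device.draw(sub.triangles)


cars = [Model(f"car{i}", f"world{i}",
              [SubMesh("metal", 500), SubMesh("glass", 100), SubMesh("rubber", 200)])
        for i in range(10)]
buckets = fill_buckets(cars)
device = CountingDevice()
render_buckets(device, buckets, camera=("view", "proj"))
print(len(buckets), "buckets,", device.state_changes, "state changes,",
      device.draw_calls, "draw calls")
```

Because the buckets are filled before anything is drawn, the order in which models are traversed no longer dictates the order of state changes.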
11
Further Improvements
This process may still not suit every need
– Efficient mesh splitting is tricky to organise
   Shared vertex / index buffers
   Changing material types at runtime problematic
– Still must change model state per sub-mesh
   Might be large state changes (many bones, lights)
– DirectX10 / 11 techniques can help
Different processes should be considered
– To meet the needs of the software
– To suit current hardware needs
   I.e. which state changes are currently expensive?
12
Batches of Primitives
At the core of each rendering process is the same “Render Primitives” step
– Pass some model triangles to the GPU to render
   Usually a triangle list or triangle strip
– In DirectX this is the innermost Draw call
The triangles rendered in each call are a batch
– We can also call this sending a batch or batching
Sending each batch carries an overhead
– Better to aim for a larger batch size
– And send fewer batches
13
Batches / Batching
A batch is typically all the triangles using the same material for a single model
Increasing the batch size can be achieved by:
– Using a minimum of materials per model:
   Combine textures into one rather than use several (atlasing)
   Using more complex shaders that can handle multiple effects
– Instancing – multiple models in one batch (see later)
In summary, to optimise graphics performance:
– Maximise the batch size / reduce the number of batches per frame (similar goals)
– Use a minimum of state changes
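The effect of these two techniques on the number of batches can be estimated with a rough counting sketch. `count_batches` and the scene contents are hypothetical, and two idealising assumptions are made: atlasing collapses each model to exactly one material, and instancing merges every model that shares the same mesh and material into a single batch.

```python
def count_batches(models, instancing=False, atlased=False):
    """Count batches for a scene given as (mesh, materials) pairs.

    Without instancing there is one batch per material per model.
    With instancing (idealised) there is one batch per distinct
    (mesh, material) pair, however many models use it.
    """
    keys = []
    for mesh, materials in models:
        if atlased:
            # Idealised atlasing: all textures merged, one material per model.
            materials = ("atlas",)
        for material in materials:
            keys.append((mesh, material))
    return len(set(keys)) if instancing else len(keys)


scene = ([("tree", ("bark", "leaves"))] * 100
         + [("car", ("metal", "glass", "rubber"))] * 20)

plain = count_batches(scene)
with_atlas = count_batches(scene, atlased=True)
with_instancing = count_batches(scene, instancing=True)
with_both = count_batches(scene, instancing=True, atlased=True)
print(plain, with_atlas, with_instancing, with_both)
```

The triangle count of the scene is the same in all four cases; only the number of batches, and therefore the average batch size, changes.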
14
Batch Performance - Detail
Measure performance of batch sizes:
– Render and time a high polygon scene
– Each time use a different batch size
15
Batch Performance - Detail
Number of batches that can be rendered in a frame depends on CPU speed, not GPU speed
Drawing 30,000 single triangle batches might be the best a system can achieve at 60fps
– CPU fully loaded sending batches
– GPU almost idle – it could render millions of triangles
Little performance decrease in drawing 30,000 models of 1,000 triangles each
– CPU still fully loaded
– Now GPU working at full load rendering the triangles
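The two cases above can be reproduced with a simple cost model. This is a sketch under stated assumptions, not a measurement: the CPU and GPU are assumed to work fully in parallel, so the frame time is whichever of the two takes longer; the per-batch CPU cost is derived from the slide's figure of 30,000 batches at 60fps; and the GPU throughput is assumed to be exactly enough for 30,000 batches of 1,000 triangles in one frame. The names are hypothetical.

```python
FRAME_SECONDS = 1.0 / 60.0

# Assumed from the slide: 30,000 batches saturate the CPU at 60fps.
CPU_SECONDS_PER_BATCH = FRAME_SECONDS / 30_000

# Assumed: the GPU is just saturated by 30,000 x 1,000 triangles per frame.
GPU_TRIANGLES_PER_SECOND = 30_000 * 1_000 / FRAME_SECONDS


def frame_cost(batches, triangles_per_batch):
    """Return (cpu_seconds, gpu_seconds, frame_seconds) for one frame."""
    cpu = batches * CPU_SECONDS_PER_BATCH
    gpu = batches * triangles_per_batch / GPU_TRIANGLES_PER_SECOND
    # CPU and GPU overlap, so the slower of the two sets the frame time.
    return cpu, gpu, max(cpu, gpu)


small = frame_cost(30_000, 1)       # single-triangle batches
large = frame_cost(30_000, 1_000)   # 1,000-triangle batches

for label, (cpu, gpu, frame) in (("1 triangle", small), ("1,000 triangles", large)):
    print(f"{label}: cpu {cpu * 1000:.2f} ms, gpu {gpu * 1000:.2f} ms, "
          f"frame {frame * 1000:.2f} ms")
```

In this model the frame time is the same in both cases because the CPU cost depends only on the number of batches; the extra triangles are absorbed by GPU capacity that was otherwise idle.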