High Performance in Broad Reach Games
Chas. Boyd, Principal Program Manager, Microsoft Corporation
The more polished the game, the larger the revenue opportunity. This talk covers techniques to give your app a professional appeal using modern features for a fluid experience. Learn how to expand your audience by leveraging capabilities across the entire range of Windows 8 PCs, including languages, tools, and APIs for high-performance graphics.
Agenda
- Windows 8 hardware diversity
- A unified 3D API to access the power of the GPU
- Designing for the broadest reach
- Optimizing performance
- Tile-based rendering optimizations
- Recommendations
3 Presentations Today: Step by Step through Game Development
- How to set up the game
- How to code it
- How to optimize it <- You are here
Windows 8 PC Hardware Diversity
GPU Hardware Evolution
Year  Version     Defining Feature
1996  DirectX3    Hardware rasterization
1997  DirectX5    2 Shading options to choose from
1998  DirectX6    Multi-texture operations
1999  DirectX7    Vertex Processing in hardware
2000  DirectX8    Programmable Shaders: Vertex and Pixel
2001  DirectX8.1  Longer shaders
2002  DirectX9    High Level Shading Language, 32 instr
2003  DirectX9c   1000s of instructions per shader
2006  DirectX10   Geometry shader, Consistent shader models
2009  DirectX11   Compute Shader, Tessellation
DirectX: Before and After [comparison images]
DirectX 9 Hardware
- Vertex shaders
- Pixel shaders
- 8 Textures
- 4 Render Targets
- Cube maps
- Volume textures
- Anisotropic filtering
- Antialiasing
- HDR rendering
- Texture compression
DirectX 10 Hardware
DirectX 9 features: Vertex shaders, Pixel shaders, 8 Textures, 4 Render Targets, Cube maps, Volume textures, Anisotropic filtering, Antialiasing, HDR rendering, Texture compression
Added in DirectX 10: Geometry shaders, Stream out, 128 Textures per shader, 8 Render Targets, Integers in shaders, Vertex textures, Shader sampling, Constant buffers, Alpha-to-coverage, Basic DirectCompute, Async resource creation
DirectX 11 Hardware
DirectX 9 features: Vertex shaders, Pixel shaders, 8 Textures, 4 Render Targets, Cube maps, Volume textures, Anisotropic filtering, Antialiasing, HDR rendering, Texture compression
DirectX 10 features: Geometry shaders, Stream out, 128 Textures per shader, 8 Render Targets, Integers in shaders, Vertex textures, Shader sampling, Constant buffers, Alpha-to-coverage, Basic DirectCompute, Async resource creation
Added in DirectX 11: Full DirectCompute, Random access writes, Tessellation shaders, New compression formats, Shader linkage
Feature Levels
- Direct3D 11 provides a uniform interface to access hardware capabilities
- Feature Levels map to hardware capabilities:
  Feature_Level_9    DirectX 9 Hardware
  Feature_Level_10   DirectX 10 Hardware
  Feature_Level_11   DirectX 11 Hardware
Select Feature Levels to Support
D3D_FEATURE_LEVEL featureLevels[] =
{
    D3D_FEATURE_LEVEL_11_1,
    D3D_FEATURE_LEVEL_11_0,
    D3D_FEATURE_LEVEL_10_1,
    D3D_FEATURE_LEVEL_10_0,
    D3D_FEATURE_LEVEL_9_3,
    D3D_FEATURE_LEVEL_9_1
};
UINT creationFlags = D3D11_CREATE_DEVICE_BGRA_SUPPORT;
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Create the Device and Context
ComPtr<ID3D11Device> device;
ComPtr<ID3D11DeviceContext> context;
D3D11CreateDevice(
    nullptr,                    // use the default adapter
    D3D_DRIVER_TYPE_HARDWARE,
    nullptr,                    // no software rasterizer module (hardware driver)
    creationFlags,              // defined above
    featureLevels,              // what app will support
    ARRAYSIZE(featureLevels),
    D3D11_SDK_VERSION,          // should always be D3D11_SDK_VERSION
    &device,                    // created device
    &m_featureLevel,            // feature level of the device
    &context                    // corresponding immediate context
    );
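The feature level array is tried in order, and the device is created at the first level the hardware supports, which is why the list above runs from highest to lowest. The sketch below models only that selection rule in portable C++ so it builds without the Windows SDK. The FeatureLevel enum and pickFeatureLevel function are hypothetical stand-ins for illustration, not Direct3D types or calls.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for D3D_FEATURE_LEVEL (assumption: only the ordering
// from least to most capable matters for this sketch).
enum class FeatureLevel { Level9_1 = 1, Level9_3, Level10_0, Level10_1, Level11_0, Level11_1 };

// Walks the requested levels in array order and reports the first one the
// simulated hardware can satisfy. Returns false if none is supported.
bool pickFeatureLevel(const FeatureLevel* requested, std::size_t count,
                      FeatureLevel hardwareMax, FeatureLevel* chosen)
{
    assert(requested != nullptr && chosen != nullptr);
    for (std::size_t i = 0; i < count; ++i)
    {
        if (requested[i] <= hardwareMax)
        {
            *chosen = requested[i];
            return true;
        }
    }
    return false;
}
```

With the six levels from the slide and simulated DirectX 10 class hardware, the function settles on Level10_0. An app that lists only Level11_0 gets no device at all on that hardware, which is the case the broad-reach strategy tries to avoid.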
Development Strategy
- Develop on DirectX 11 hardware
- Target Feature_Level_9 and scale up
- Include calibration code in game to dynamically configure for current hardware
- Be aware of Feature Level differences
- Test by restricting Feature Level
- Test on multiple PCs
DirectX Control Panel
Chapter 1/4 Balancing Act
Features vs Performance
Opposing Directions
- Prune features and image quality to meet performance goals on low-end hardware
- Add features and image quality to differentiate your app on high-end hardware
Development Strategy
- Develop on DirectX 11 hardware
- Target Feature_Level_9 and scale up
- Include calibration code in game to dynamically configure for current hardware
- Adjust to maintain performance
- Be aware of Feature Level differences
- Test by restricting Feature Level
- Test on multiple PCs
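One possible shape for the calibration code mentioned above is a small controller that watches frame time and steps a quality level down or up. This is a sketch under assumptions, not the method used in the talk: the QualityController type, the smoothing factor, and the 10% / 30% thresholds are all hypothetical values chosen for illustration.

```cpp
#include <cassert>

// Hypothetical runtime calibration helper. Level 0 is the lowest quality,
// maxLevel the highest; the game maps each level to its own settings.
struct QualityController
{
    int level;
    int maxLevel;
    double targetMs;    // frame budget, e.g. 16.7 ms for 60 Hz
    double smoothedMs;  // exponentially smoothed frame time

    QualityController(int maxLevelIn, double targetMsIn)
        : level(maxLevelIn), maxLevel(maxLevelIn),
          targetMs(targetMsIn), smoothedMs(targetMsIn)
    {
        assert(maxLevelIn >= 0 && targetMsIn > 0.0);
    }

    // Feed one measured frame time; returns the (possibly updated) level.
    int update(double frameMs)
    {
        smoothedMs = 0.9 * smoothedMs + 0.1 * frameMs;
        if (smoothedMs > targetMs * 1.10 && level > 0)
        {
            --level;                // over budget: drop quality
            smoothedMs = targetMs;  // restart the average after a change
        }
        else if (smoothedMs < targetMs * 0.70 && level < maxLevel)
        {
            ++level;                // clear headroom: raise quality
            smoothedMs = targetMs;
        }
        return level;
    }
};
```

The gap between the two thresholds is deliberate: without it the controller would tend to flip between two adjacent levels every few frames.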
Differentiation
- Increase visual quality
- Higher resolution textures
- Use bump maps
Texture Quality Control
- Balance visual quality with performance
- Scale back on size via mipmap levels
- Use block-compressed texture formats
- Loader code skips mip levels: 1024 x 1024, 512 x 512, 256 x 256
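The mip-skipping idea above can be sketched as a small helper that decides how many top mip levels a loader should skip so the largest level it uploads fits a size cap. The MipSelection struct, the selectMips function, and the maxDimension parameter are hypothetical names for illustration; a real loader would also need to offset into the file data for the skipped levels.

```cpp
#include <cassert>
#include <cstdint>

// Result of the selection: which mip to start from and its dimensions.
struct MipSelection
{
    std::uint32_t firstMip;   // index of the first mip level to load
    std::uint32_t width;      // width of that level
    std::uint32_t height;     // height of that level
    std::uint32_t mipCount;   // number of levels remaining from firstMip down
};

// Skips top-level mips until both dimensions fit within maxDimension,
// always keeping at least one level.
MipSelection selectMips(std::uint32_t width, std::uint32_t height,
                        std::uint32_t mipCount, std::uint32_t maxDimension)
{
    assert(width > 0 && height > 0 && mipCount > 0 && maxDimension > 0);
    std::uint32_t first = 0;
    while (first + 1 < mipCount && (width > maxDimension || height > maxDimension))
    {
        width = (width > 1) ? width / 2 : 1;
        height = (height > 1) ? height / 2 : 1;
        ++first;
    }
    return { first, width, height, mipCount - first };
}
```

For a 1024 x 1024 texture with a full chain of 11 levels and a cap of 256, the helper skips two levels and starts at 256 x 256, matching the three sizes listed on the slide.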
Optimization Techniques
- Tune anisotropic filter quality
  - Simple scalar value
- MultiSampling AntiAliasing (MSAA)
  - Reduce sample count to maintain frame rate
- Render to a lower resolution and scale up for final image
  - For best image quality, do not scale 2D text
- Geometry
  - Feature_Level_11: use tessellation for more polygon count control
  - Consider lower-resolution (lower vertex count) meshes
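For the "render to a lower resolution and scale up" technique, one way to pick the off-screen render size is sketched below. It assumes the scene is fill-rate bound, so GPU time is roughly proportional to pixel count; that proportionality, the scaledRenderSize name, and the 0.5 minimum scale are assumptions for illustration. 2D text would still be drawn at native resolution afterwards, as the slide recommends.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct RenderSize { int width; int height; };

// Picks a 3D render target size so the estimated GPU time fits the budget.
// Assumption: cost scales with pixel count, so the per-axis scale is the
// square root of the time ratio.
RenderSize scaledRenderSize(int nativeWidth, int nativeHeight,
                            double gpuMs, double budgetMs, double minScale = 0.5)
{
    assert(nativeWidth > 0 && nativeHeight > 0 && gpuMs > 0.0 && budgetMs > 0.0);
    double scale = std::sqrt(budgetMs / gpuMs);
    scale = std::max(minScale, std::min(1.0, scale));  // never upscale, never go below minScale

    RenderSize size;
    // Round down to even dimensions, with a floor of 2 pixels.
    size.width = std::max(2, static_cast<int>(nativeWidth * scale) & ~1);
    size.height = std::max(2, static_cast<int>(nativeHeight * scale) & ~1);
    return size;
}
```

A frame that already fits the budget keeps the native size, while one that takes four times the budget is rendered at half resolution on each axis.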
Basic Performance Checks
- Be sure you are not setting state and loading data that doesn’t change every frame
- Set a breakpoint in the render code
  - We’ve seen some apps drawing twice per frame
- Make sure that the game loop routines are still in the right order
- Check to see if you are CPU bound or GPU bound
  - If GPU bound, check to see if you are fill-rate or vertex bound
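The last two checks above can be expressed as a small triage helper once per-frame CPU and GPU timings are available. This is a heuristic sketch: the function names, the 10% tolerance, and the 30% drop threshold are hypothetical, and it assumes the fill-rate test is done by re-timing the same scene at a reduced render resolution.

```cpp
#include <cassert>

enum class Bottleneck { CpuBound, GpuBound, Balanced };
enum class GpuLimit { FillRateBound, VertexBound };

// Compares CPU and GPU time for the same frame. Within the tolerance the
// two are treated as balanced.
Bottleneck classifyFrame(double cpuMs, double gpuMs, double tolerance = 0.10)
{
    assert(cpuMs >= 0.0 && gpuMs >= 0.0 && tolerance >= 0.0);
    if (cpuMs > gpuMs * (1.0 + tolerance)) { return Bottleneck::CpuBound; }
    if (gpuMs > cpuMs * (1.0 + tolerance)) { return Bottleneck::GpuBound; }
    return Bottleneck::Balanced;
}

// Heuristic: if shrinking the render target makes the GPU much faster, the
// cost is per-pixel (fill rate); if the time barely moves, it is more likely
// per-vertex work.
GpuLimit classifyGpuLimit(double gpuMsFullRes, double gpuMsReducedRes,
                          double minDropFraction = 0.30)
{
    assert(gpuMsFullRes > 0.0 && gpuMsReducedRes >= 0.0);
    const double drop = (gpuMsFullRes - gpuMsReducedRes) / gpuMsFullRes;
    return (drop >= minDropFraction) ? GpuLimit::FillRateBound : GpuLimit::VertexBound;
}
```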
Use PIX
- PIX is a profiler mode for Visual Studio 11
- You can use it to step through DirectX API calls
- It can display:
  - event history
  - pixel history
  - the DX rendering pipeline
  - the DX object table
  - call stacks
Chapter 2/4 More Optimizations
More Optimizations
- Pure performance helps low-end hardware
- But if image quality is not impacted, high-end hardware benefits too
- Minimum Precision is a new capability in Windows 8 HLSL
Minimum Precision
- Reduce the number of bits of precision in shader calculations
- Hints to the graphics driver where optimizations can be done
- Specifies minimum rather than actual precision
- Types: min16float, min12int, min16int
Minimum Precision HLSL Code Sample (before: full precision)
static const float brightThreshold = 0.5f;
Texture2D sourceTexture : register(t0);

float4 DownScale3x3BrightPass(QuadVertexShaderOutput input) : SV_TARGET
{
    float3 brightColor = 0;
    // Gather 16 adjacent pixels (each bilinear sample reads a 2x2 region)
    brightColor  = sourceTexture.Sample(linearSampler, input.tex, int2(-1,-1)).rgb;
    brightColor += sourceTexture.Sample(linearSampler, input.tex, int2( 1,-1)).rgb;
    brightColor += sourceTexture.Sample(linearSampler, input.tex, int2(-1, 1)).rgb;
    brightColor += sourceTexture.Sample(linearSampler, input.tex, int2( 1, 1)).rgb;
    brightColor /= 4.0f;
    // Brightness thresholding
    brightColor = max(0, brightColor - brightThreshold);
    return float4(brightColor, 1.0f);
}
Minimum Precision HLSL Code Sample (after: minimum precision)
static const min16float brightThreshold = (min16float)0.5;
Texture2D<min16float4> sourceTexture : register(t0);

float4 DownScale3x3BrightPass(QuadVertexShaderOutput input) : SV_TARGET
{
    min16float3 brightColor = 0;
    // Gather 16 adjacent pixels (each bilinear sample reads a 2x2 region)
    brightColor  = sourceTexture.Sample(linearSampler, input.tex, int2(-1,-1)).rgb;
    brightColor += sourceTexture.Sample(linearSampler, input.tex, int2( 1,-1)).rgb;
    brightColor += sourceTexture.Sample(linearSampler, input.tex, int2(-1, 1)).rgb;
    brightColor += sourceTexture.Sample(linearSampler, input.tex, int2( 1, 1)).rgb;
    brightColor /= (min16float)4.0;
    // Brightness thresholding
    brightColor = max(0, brightColor - brightThreshold);
    return float4(brightColor, 1.0f);
}
Chapter 3/4 Tile-Based Rendering
Typical Rendering
- Command stream sent to GPU
- Command Stream: CommandBuffer1 (Command1, Command2, Command3, Command4, Command5, Command6), CommandBuffer2, CommandBuffer3
Tile-Based Rendering
- Send Command Stream to GPU
- Buffered Commands: CommandBuffer1 (Command1, Command2, Command3, Command4, Command5, Command6), CommandBuffer2, CommandBuffer3
- Execute Command Stream for Tile 1
- Execute Command Stream for Tile 2
- Execute Command Stream for Tile 3
- Execute Command Stream for Tile 4
- Execute Command Stream for Tile 5
- Execute Command Stream for Tile 6
- Display the final image
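The difference between the two flows can be modeled with a toy software renderer. In this sketch the only command is a solid rectangle fill, which is a deliberate simplification; FillRect, renderImmediate, and renderTiled are hypothetical names, and a real tile-based GPU bins geometry per tile rather than replaying every command. The point it illustrates is that the whole command stream has to be buffered before any tile can be finished, which is why mid-scene flushes are costly on such hardware.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One buffered command: fill the half-open rectangle [x0,x1) x [y0,y1).
struct FillRect { int x0, y0, x1, y1; int color; };

typedef std::vector<int> Image;  // row-major, width * height pixels

// Executes one command, clipped to the region [cx0,cx1) x [cy0,cy1).
static void fillClipped(Image& image, int width, const FillRect& cmd,
                        int cx0, int cy0, int cx1, int cy1)
{
    const int x0 = std::max(cmd.x0, cx0), y0 = std::max(cmd.y0, cy0);
    const int x1 = std::min(cmd.x1, cx1), y1 = std::min(cmd.y1, cy1);
    for (int y = y0; y < y1; ++y)
        for (int x = x0; x < x1; ++x)
            image[y * width + x] = cmd.color;
}

// Typical rendering: each command runs once over the whole frame.
Image renderImmediate(int width, int height, const std::vector<FillRect>& commands)
{
    assert(width > 0 && height > 0);
    Image image(width * height, 0);
    for (const FillRect& cmd : commands)
        fillClipped(image, width, cmd, 0, 0, width, height);
    return image;
}

// Tile-based rendering: the buffered stream is replayed once per tile.
Image renderTiled(int width, int height, int tileSize, const std::vector<FillRect>& commands)
{
    assert(width > 0 && height > 0 && tileSize > 0);
    Image image(width * height, 0);
    for (int ty = 0; ty < height; ty += tileSize)
        for (int tx = 0; tx < width; tx += tileSize)
            for (const FillRect& cmd : commands)
                fillClipped(image, width, cmd, tx, ty,
                            std::min(tx + tileSize, width), std::min(ty + tileSize, height));
    return image;
}
```

Both functions produce the same image for the same command list; only the order of work differs.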
Tile-based Rendering Strategies
- Avoid mid-scene flushes
- Avoid swapping back and forth between RenderTargets
- Use scissors when updating small portions of a RenderTarget
- Use DISCARD and NO_OVERWRITE when possible
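As an illustration of the second strategy, the sketch below counts how often a frame binds a different render target and shows how grouping passes by target reduces that count. The Pass struct and both functions are hypothetical, and the grouping assumes the passes have no ordering dependencies between targets, which a real renderer would have to verify before reordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One unit of rendering work and the render target it draws into.
struct Pass { int renderTarget; int drawId; };

// Counts render target binds: the first pass plus every change of target.
int countTargetBinds(const std::vector<Pass>& passes)
{
    int binds = 0;
    for (std::size_t i = 0; i < passes.size(); ++i)
        if (i == 0 || passes[i].renderTarget != passes[i - 1].renderTarget)
            ++binds;
    return binds;
}

// Groups passes by render target while keeping the original order within
// each target. Assumption: passes on different targets are independent.
std::vector<Pass> groupByRenderTarget(std::vector<Pass> passes)
{
    const int before = countTargetBinds(passes);
    std::stable_sort(passes.begin(), passes.end(),
                     [](const Pass& a, const Pass& b) { return a.renderTarget < b.renderTarget; });
    assert(countTargetBinds(passes) <= before);  // grouping never adds binds
    (void)before;  // silences the unused-variable warning when asserts are disabled
    return passes;
}
```

A frame that alternates between two targets four times needs four binds as submitted and two after grouping.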
Tile-based Rendering Optimizations
GPUs with a tile-based rendering architecture can get a performance boost from a special call that discards the render target contents after presenting:
m_swapChain->Present(1, 0);                // present the image on the display
ComPtr<ID3D11View> view;
m_renderTargetView.As(&view);              // get the view on the RT
m_d3dContext->DiscardView(view.Get());     // discard the view contents
Chapter 4/4 Recommendations
Conclusion
- Windows 8
  - Runs on the broadest graphics hardware diversity ever
  - Designed for graphics hardware acceleration
- Direct3D 11 is the 3D API to access the power of the GPU
- You can get great graphics performance leveraging the GPU AND hit the broadest markets
Strategy Recap for Indie Devs
- Design for Feature_Level_9
- Adjust at runtime for actual Feature Level
- Use advanced features to differentiate when available
- Dynamically calibrate for smooth performance
- Use new Direct3D 11 features to better utilize hardware
  - Minimum precision
  - Tile-based rendering optimizations
Time to Act Biggest opportunity. Ever. Windows 8 Consumer Preview is now available. Check out the store. Go build great games. http://dev.windows.com
Q & A
© 2012 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.