DX11 TECHNIQUES IN HK2207
Takahiro Harada, AMD
AMD's Favorite Effects, 28th February 2011

HK2207
- Demo for Radeon HD 6970
- Based in Hong Kong, 2207
- Not just a single technique
- Cinematic with practical effects
- Physics effects
  - Bullet CPU physics
  - CS rigid body
- Procedural adaptive tessellation
- Lighting effects
  - Deferred rendering
  - Post effects
Live Connection
CS Rigid Body Simulation
CS Rigid Body
- For visual effect
- Simulation using CS
  - CS 5.0 has all the functionality needed to implement the simulation
- Key features of CS
  - Group shared memory
    - Tree traversal
    - Narrow phase (NP)
  - Atomics
    - Collision
  - Random write
Particle Representation
- Approximate shapes with particles
  - Arbitrary convex mesh input
  - Scan conversion
- Integration
  - One thread per rigid body
- Collision
  - One thread per particle
- Collision with mesh
  - Conversion to particles
  - Collide against triangles
Reference: GPU Gems 3, Real-Time Rigid Body Simulation on GPUs
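The scan-conversion step above can be illustrated with a small CPU-side sketch. It assumes the convex shape is given as a set of half-spaces (n·p <= d) and that particle centers sit on a regular grid; the function name, the plane format, and the spacing are illustrative choices, not details taken from the demo.

```python
import itertools

def voxelize_convex(planes, lo, hi, spacing):
    """Return grid-aligned particle centers that lie inside a convex shape.

    planes:  list of ((nx, ny, nz), d); p is inside when n.p <= d for every plane.
    lo, hi:  corners of the bounding box that is scanned.
    spacing: distance between particle centers (the particle diameter).
    """
    def axis(a, b):
        count = int((b - a) / spacing + 1e-9)
        return [a + (i + 0.5) * spacing for i in range(count)]

    particles = []
    for p in itertools.product(axis(lo[0], hi[0]), axis(lo[1], hi[1]), axis(lo[2], hi[2])):
        if all(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] <= d for n, d in planes):
            particles.append(p)
    return particles

# Unit cube [0,1]^3 described by six half-spaces (hypothetical test shape).
CUBE = [((1, 0, 0), 1), ((-1, 0, 0), 0), ((0, 1, 0), 1),
        ((0, -1, 0), 0), ((0, 0, 1), 1), ((0, 0, -1), 0)]
cube_particles = voxelize_convex(CUBE, (0, 0, 0), (1, 1, 1), 0.25)
```

Once a body is a set of particles, integration can run per body and collision per particle, which is the thread mapping the slide describes.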
Mesh Collision (BVH)
- BVH used for broad-phase collision detection
- Contains static scene triangles
- Node: 4 children, 4 volumes
- Pack a few triangles in a leaf
  - Traversal efficiency
- Separate data to another buffer
  - TriData: v0, v1, v2
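A possible data layout for the structure described above, written as a CPU-side sketch. The class names other than TriData, the negative-index encoding for leaf references, and the helper functions are assumptions made for illustration; the slide only states that a node has 4 children with 4 volumes, that a leaf packs a few triangles, and that triangle data lives in a separate buffer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Aabb = Tuple[Vec3, Vec3]  # (min corner, max corner)

@dataclass
class TriData:            # stored in its own buffer, apart from the nodes
    v0: Vec3
    v1: Vec3
    v2: Vec3

@dataclass
class Leaf:               # a leaf references a short run of triangles
    first_tri: int
    num_tris: int

@dataclass
class Node:               # up to 4 children, each with its own volume
    child_bounds: List[Aabb]
    child_refs: List[int]  # >= 0: node index, < 0: encoded leaf index

def leaf_ref(leaf_index: int) -> int:
    """Encode a leaf index as a negative child reference (assumed encoding)."""
    return -(leaf_index + 1)

def bounds_of(tris: List[TriData]) -> Aabb:
    pts = [v for t in tris for v in (t.v0, t.v1, t.v2)]
    lo = tuple(min(p[i] for p in pts) for i in range(3))
    hi = tuple(max(p[i] for p in pts) for i in range(3))
    return lo, hi

@dataclass
class Bvh:
    nodes: List[Node] = field(default_factory=list)
    leaves: List[Leaf] = field(default_factory=list)
    tri_data: List[TriData] = field(default_factory=list)

    def add_leaf(self, tris: List[TriData]) -> Tuple[int, Aabb]:
        """Append triangles contiguously and return (leaf index, leaf bounds)."""
        self.leaves.append(Leaf(len(self.tri_data), len(tris)))
        self.tri_data.extend(tris)
        return len(self.leaves) - 1, bounds_of(tris)

    def add_node(self, children: List[Tuple[int, Aabb]]) -> int:
        """Add a node from up to 4 (child reference, bounds) pairs."""
        assert len(children) <= 4
        self.nodes.append(Node([b for _, b in children], [r for r, _ in children]))
        return len(self.nodes) - 1
```

Keeping the vertices out of the nodes means traversal only touches bounds and references, and triangle data is fetched later for the leaves that were actually hit.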
Mesh Collision (BVH): Tree Traversal
- Traversal stack located in Thread Group Shared Memory (TGSM)
- Traversal and narrow phase (NP) are separated to keep efficiency high on the GPU
  - Less divergence
  - Reduced local resource usage
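The traversal above can be sketched on the CPU as a loop over an explicit, fixed-capacity stack, which stands in for the stack kept in TGSM. The node encoding, the max_stack limit, and the function names are assumptions for illustration. In keeping with the split between traversal and narrow phase, the function only records which leaves were hit and does no triangle tests.

```python
def overlaps(a, b):
    """AABB overlap test; each box is (min corner, max corner)."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))

def traverse(nodes, root, query, max_stack=32):
    """Collect indices of leaves whose bounds overlap the query box.

    nodes[i] is a list of up to 4 (child_bounds, (kind, index)) entries,
    where kind is 'node' or 'leaf' (hypothetical encoding).
    """
    stack = [root]
    hit_leaves = []
    while stack:
        for bounds, (kind, index) in nodes[stack.pop()]:
            if not overlaps(bounds, query):
                continue
            if kind == 'leaf':
                hit_leaves.append(index)      # narrow phase runs later
            else:
                if len(stack) >= max_stack:   # a shared-memory stack is finite
                    raise OverflowError("traversal stack is full")
                stack.append(index)
    return hit_leaves

# Tiny two-level tree: the root holds one inner node and one leaf.
EXAMPLE_NODES = [
    [(((0, 0, 0), (1, 1, 1)), ('node', 1)), (((2, 0, 0), (3, 1, 1)), ('leaf', 2))],
    [(((0, 0, 0), (0.5, 1, 1)), ('leaf', 0)), (((0.5, 0, 0), (1, 1, 1)), ('leaf', 1))],
]
```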
Narrow Phase
- Output from tree collision
  - HitData: list of triangle indices per body
  - Sparse
- 1 body x 1 leaf collision == n particles x m tris
  - Cache relevant triangles in TGSM
  - Reduce memory traffic
- Use 1 thread group (TG) for a body
[Figure: HitData buffer, with a start offset and a count n for each of Body0 to Body5]
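One plausible reading of the start/n diagram, given here strictly as an assumption: each body stores an offset and a count into one flat index list, and most counts are zero because most bodies touch nothing. The function names are hypothetical.

```python
def build_hit_data(hits_per_body):
    """Flatten per-body hit lists into (starts, counts, indices)."""
    starts, counts, indices = [], [], []
    for hits in hits_per_body:
        starts.append(len(indices))
        counts.append(len(hits))
        indices.extend(hits)
    return starts, counts, indices

def indices_for_body(body, starts, counts, indices):
    """What the thread group assigned to one body would read."""
    begin = starts[body]
    return indices[begin:begin + counts[body]]

starts, counts, indices = build_hit_data([[2, 3], [], [4, 8, 9]])
```

With one thread group launched per body, a group whose count is zero has nothing to do after reading its entry, which is the cost the later prepass is meant to remove.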
Narrow Phase: 1 Thread Group

    void NP()
    {
        Bring64ParticlesIntoGPRs();          // 1 thread : 1 particle
        if( LOCAL_IDX == 0 )
            LoadAllCollisionInfo();          // controller thread only
        BARRIER;
        forAllLeaves(;;)
            forAllTriangles(;;j+=TG_SIZE)
            {
                // all threads read 64 tris in parallel
                fillTriangle( ldsVtx, ldsAabb, LOCAL_IDX );
                // 64 collisions in parallel
                for(k<TG_SIZE;k++)
                    if( overlaps(ldsAabb[k]) )
                        collide( pData, ldsVtx[k] );
            }
    }

- 1 thread : 1 particle
- Use 1 thread as a controller of the SIMD
  - Read HitData -> LeafData
  - Share LeafData (TGSM)
- All the threads are used to read 64 tris in parallel
- 64 collisions in parallel
  - AABB overlap test
  - 1 triangle vs 64 particles collision
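The batching pattern in this kernel can be mimicked on the CPU to make the data flow explicit. This is a sketch under stated assumptions: particles are spheres of one radius, the overlap test is a conservative sphere-versus-expanded-AABB check, and the actual particle-triangle response (collide on the slide) is replaced by recording the candidate pair. All function names are illustrative.

```python
TG_SIZE = 64  # threads per group; one thread per particle, as on the slide

def tri_aabb(tri):
    """Bounding box of a triangle given as three (x, y, z) vertices."""
    lo = tuple(min(v[i] for v in tri) for i in range(3))
    hi = tuple(max(v[i] for v in tri) for i in range(3))
    return lo, hi

def sphere_overlaps_aabb(center, radius, aabb):
    """Conservative test: is the center inside the box grown by the radius?"""
    lo, hi = aabb
    return all(lo[i] - radius <= center[i] <= hi[i] + radius for i in range(3))

def narrow_phase(particles, radius, triangles, tg_size=TG_SIZE):
    """Return (particle index, triangle index) pairs that pass the AABB test."""
    pairs = []
    for base in range(0, len(triangles), tg_size):
        # "fillTriangle": each thread loads one triangle of the batch
        # into the shared arrays.
        lds_vtx = triangles[base:base + tg_size]
        lds_aabb = [tri_aabb(t) for t in lds_vtx]
        # After a barrier, every thread (particle) walks the cached batch.
        for p_idx, center in enumerate(particles):
            for k, aabb in enumerate(lds_aabb):
                if sphere_overlaps_aabb(center, radius, aabb):
                    pairs.append((p_idx, base + k))  # stand-in for collide()
    return pairs
```

Each triangle is read from memory once per group and then reused by all particles of the body, which is where the reduction in memory traffic comes from.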
Inefficiencies
- Hit data buffer is sparse
  - We launch too many TGs
  - A TG with 0 hits returns after a memory access
- Controller sections
  - Only the controller is working
  - 63 threads are idle
- Redundant overlap test (Particle-Tri)
  - Body-Tri test is enough
- Leaf is not completely filled
  - Several leaves are colliding
  - Can issue more memory requests
Introduce Prepass
Problems:
- Hit data buffer is sparse
- Controller sections
- Redundant overlap test (Particle-Tri)
- Leaf is not completely filled
Prepass:
- Use an append buffer
  - One body per thread
- Use 64 threads to read
  - Less single-thread work
- Do Body-Tri test
- Pack triangle data
  - LeafA(4), LeafB(4) -> 8
- Reduce local resource usage
  - Better HW occupancy
Pre Narrow Phase
- Use 1 thread for a body
  - Read HitData -> LeafData -> Triangle
  - Body-Triangle AABB test
    - 64 Particle-Triangle collisions
  - Store colliding triangle indices
- If any collide
  - Write to append buffer
  - Write triangle index to contiguous memory
- Sorting by number of hits reduces divergence
  - Local sort
[Figure: threads appending entries to the append buffer]
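The prepass above can be sketched on the CPU as a compaction step. A Python list stands in for the append buffer, the record format (body, start, count) is an assumption, and the sort here is global for brevity, whereas the slide mentions a local sort. Function and parameter names are illustrative.

```python
def overlaps(a, b):
    """AABB overlap test; each box is (min corner, max corner)."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(3))

def pre_narrow_phase(body_aabbs, hit_leaves, leaf_tris, tri_aabbs):
    """Compact the sparse hit data into work items for the narrow phase.

    body_aabbs: one AABB per body.
    hit_leaves: per body, the leaf indices reported by tree traversal.
    leaf_tris:  per leaf, the triangle indices it contains.
    tri_aabbs:  one AABB per triangle.
    Returns (append_buffer, tri_indices); append_buffer holds
    (body, start, count) records pointing into tri_indices.
    """
    append_buffer = []  # stands in for a GPU append buffer
    tri_indices = []    # colliding triangle indices, stored contiguously
    for body, aabb in enumerate(body_aabbs):       # one "thread" per body
        hits = [t for leaf in hit_leaves[body] for t in leaf_tris[leaf]
                if overlaps(aabb, tri_aabbs[t])]   # Body-Triangle AABB test
        if hits:                                   # bodies with no hit emit nothing
            append_buffer.append((body, len(tri_indices), len(hits)))
            tri_indices.extend(hits)
    # Similar hit counts next to each other means similar loop lengths.
    append_buffer.sort(key=lambda record: record[2], reverse=True)
    return append_buffer, tri_indices
```

Because triangles from different leaves end up in one contiguous run per body, the narrow phase can later load full batches instead of partially filled leaves.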
Improved Narrow Phase

Original:

    void NP()
    {
        Bring64ParticlesIntoGPRs();
        if( LOCAL_IDX == 0 )
            LoadAllCollisionInfo();
        BARRIER;
        forAllLeaves(;;)
            forAllTriangles(;;j+=TG_SIZE)
            {
                fillTriangle( ldsVtx, ldsAabb, LOCAL_IDX );
                for(k<TG_SIZE;k++)
                    if( overlaps(ldsAabb[k]) )
                        collide( pData, ldsVtx[k] );
            }
    }

Improved:

    void NP()
    {
        Bring64ParticlesIntoGPRs();
        if( LOCAL_IDX == 0 )
            LoadNumHits();                   // only the hit count is loaded
        BARRIER;
        for(i<ldsHitTriData.m_n;i+=WG_SIZE)  // one loop over packed triangles
        {
            fillTriangle( ldsVtx[LOCAL_IDX], i+LOCAL_IDX );
            for(j<WG_SIZE;j++)
                collide( pData, ldsVtx[j] ); // no per-particle AABB test
        }
    }
Result
MAKING IT LOOK PRETTY …
Procedural Adaptive Tessellation
- Add surface detail using DX11 tessellation
- Hull shader
  - Calculate tessellation factor using depth
- Tessellator
- Domain shader
  - Interpolate vertex position, normal
  - Displacement factor using 3D Perlin noise
    - Evaluate in local space
  - Displacement vector
  - Displace
- Pixel shader
  - Normal is gradient
Cracks
- Different tessellation factor on edge
  - Objects are small enough
  - Sample depth at the center
- Discontinuous displacement vector
  - Normal is not continuous
  - Use convexity of geometry
  - Interpolate normal and vector from center
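A small numeric illustration of the second crack source and its fix, under the assumption that "vector from center" means the direction from the object's center to the vertex. A vertex on a cube edge belongs to two faces; displacing it along each face normal gives two different positions, while the radial direction gives one. Function names are illustrative.

```python
import math

def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def displace_along(pos, direction, amount):
    return tuple(p + d * amount for p, d in zip(pos, direction))

def radial_direction(pos, center):
    """Direction from the object's center to the vertex."""
    return normalize(tuple(p - c for p, c in zip(pos, center)))

center = (0.0, 0.0, 0.0)
edge_vertex = (1.0, 1.0, 0.0)  # shared by the +X and +Y faces of a cube

# Per-face normals disagree at the shared vertex, so the surface splits.
via_face_x = displace_along(edge_vertex, (1.0, 0.0, 0.0), 0.1)
via_face_y = displace_along(edge_vertex, (0.0, 1.0, 0.0), 0.1)

# The radial direction depends only on the position, so both faces agree.
via_radial = displace_along(edge_vertex, radial_direction(edge_vertex, center), 0.1)
```

The radial direction is a usable substitute for the normal only because the geometry is convex, which is the property the slide relies on.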
Other Techniques Used
- Deferred shading
- Depth of field
- Emissive materials
- Lens ghosting and flare
- Aerial perspective
- Reflections
- Tone mapping
- LUT color correction
Color
Light
Emissive etc.
DOF
End
Questions?
Acknowledgement: Jay McKee, Jason Yang, Justin Hensley, Lee Howes, Ali Saif, David Hoff, Abe Wiley, Dan Roeger