Download presentation
Presentation is loading. Please wait.
Published byAugustine Harrison Modified over 9 years ago
1
1 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Optimizing DirectX* Graphic Applications using Software Vertex Processing Ronen Zohar/Kim Pallister Intel Corporation
2
2 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Agenda Do I need SW vertex processing? The PSGP Using SW vertex processing for maximum performance: memory, batching and render-states SW vertex processing and DirectX*’s 8.0 new features
3
3 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Do I need SW vertex processing? Your publisher wants: –Eye-candy graphics, using all the latest 3D features –Lower the “minimum system requirements” –and many more Problem: older systems does not support all the eye- candy features Solution1: Disable features for low-end systems Solution2: Use SW vertex processing (at least for the features that you can) and keep some features
4
4 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Inside DirectX* Graphics Driver Application API Front-end Communication to the driver (DDI) SW Vertex processing (PSGP) DirectX run-time HW vertex processing path
5
5 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 PSGP – Processor Specific Geometry Pipeline Part of DirectX graphics responsible for the SW vertex processing algorithms, optimized for the client’s processor DirectX’s 8.0 PSGP is optimized for: –Intel® Pentium® III processor –Intel Pentium 4 processor
6
6 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 The PSGP VB Map stream to registers Execute vertex shader code Vertex shader path TransformationLightingTex Gen Fixed function path Format data to output- FVF Internal temporary VB’s Clipper IB To driver
7
7 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 PSGP Principles Use SIMD to process multiple vertices in each iteration –Vertical processing –Data is swizzled on the fly Prefetch input streams to hide memory latency Write output to temporary VB’s based on XYZRHW FVF code –In system memory if need to read back transformed vertices –In driver memory if no read-back is required –More on this later…
8
8 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Input Stream Memory Allocation Create SW processed primitives in system memory (using the D3DUSAGE_SOFTWAREPROCESSING usage create flags). If the same VB is processed both in SW and HW –Try to avoid it –If you must - create multiple copies, one in system and one in driver memory If the primitive is never clipped, use the D3DUSAGE_DONOTCLIP usage flag
9
9 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Primitive Batching Batch all the SW processed primitives together SW processes the entire VB range that you submit, if multiple primitives are using the same VB – squeeze the vertices range As with HW, bigger primitives are always better (the PSGP have long setup)
10
10 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Primitive Batching (Cont) The PSGP is batching the processed vertices before sending them to HW (to reduce HW’s VB changes) Primitives are batched as long as their output FVF is equal: –XYZ | NORMAL | TEX1 and XYZ | DIFFUSE | TEX1 have the same output FVF ( XYZRHW | DIFFUSE | TEX1 ) –In SW mode, changing the VB FVF does not mean a slowdown (unlike HW)
11
11 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Clipping Render-state When clipping is enabled, the PSGP –Stores its output to system memory buffer As it need to read vertices in order to clip –Driver need to copy it across the AGP When clipping disabled writes to driver allocated buffer –No Copy here! –Calculates clip flags (out-codes) for each vertex more execution cycles per vertex –Clips Minimize the amount of clipping Use bounding boxes/spheres on your objects Don’t forget to take the guard-band into account
12
12 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Clipping Render-state (Cont) Pseudo-code to minimize clipping –If (BB is outside screen) Don’t render primitive –Elseif (BB is inside guard-band) Render with clipping off –Else Render with clipping on Typical game scene should have <10% of primitives clipped –Biggest problem is front plane clipping
13
13 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Performance Render-states Specular – very expensive LocalViewer – smaller performance impact than HW, but still costs more NormalizeNormals – extra work for the PSGP, use only when needed Fog – written as “specular alpha”, can change PSGP’s output FVF
14
14 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 DirectX* 8.0 Graphics New Features Point sprites Tweening Indexed vertex blending/ Indexed palette skinning Vertex Shaders
15
15 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Point Sprites PSGP writes in native FVF format If HW does not support –Each point is expanded to quad, using the point size calculated –The quad list is submitted to the driver Very slow solution if no HW support for point sprites, try to avoid it
16
16 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Tweening Tween the position and normal before transformation (in SIMD) After tweening continuous the “standard” PSGP flow Costs very few cycles –But, for tweening and transformation only a vertex shader would run faster –Try to compare your exact scenario to a vertex shader
17
17 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Indexed Skinning Transforms all vertices to matrix0 space –Using scalar code, with lookup for the needed matrix Than continuous the normal PSGP flow DirectX* 7 style skinning is supported by some HW and may run faster, but requires multiple models and DrawPrimitive calls
18
18 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Vertex shaders At vertex shader creation –The shader code is compiled to equivalent IA32 code –Using all possible assembly optimizations and instructions available on client’s CPU to achieve fastest code At vertex shader execution –Calling the generated code SW vertex shaders have excellent performance
19
19 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 SW Vs. HW Vertex Shaders Calculates more than one vertex in a single iteration –Based on the processor SIMD width Not every shader instruction is 1 clock –But, the CPU runs with much higher frequency than today’s 3D graphics chips
20
20 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 SW Vs. HW Vertex Shaders (Cont) Simple compilation sample: –Mul r0.xyz,v0,c0 Movapsxmm0,[v0.x] Mulpsxmm0,[c0.x] Movapsxmm1,[v0.y] Mulpsxmm1,[c0.y] Movapsxmm2,[v0.z] Mulpsxmm2,[c0.z] Movaps[r0.x],xmm0 Movaps[r0.y],xmm1 Movaps[r0.z],xmm2
21
21 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 SW Vs. HW Vertex Shaders (cont) Data that you write, is data that the CPU have to calculate –Write only needed data (using the vertex shader write mask) –Use the swizzle modifiers, and don’t duplicate written data Vertex shader instructions are blended to achieve maximum performance –But, keeping dependency chains squeezed will help the compiler in physical register assignments
22
22 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Performance Tips for SW Vertex Shaders m?x? macros have better performance than the un-expanded macros Try to minimize the use of the address register –Due to the parallelism of the SW vertex shader –Sort the VB by values used in the address register
23
23 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Performance Tips for SW Vertex Shaders lit, expp and logp are big cycle consumers –Use the worse accuracy (i.e. expp.x) when possible –Use either.x or.z (but not both) –exp and log are worse than expp, logp Don’t implicitly saturate color values –it is done automatically
24
24 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Optimized Vertex Shader dp4 oPos.x, v0, c2 dp4 oPos.y, v0, c3 dp4 oPos.z, v0, c4 dp4 oPos.w, v0, c5 add r1, c6,-v0 dp3 r2, r1, r1 rsq r2, r2 mov oT0, v2 mul r1,r1,r2 dp3 r3, v1, r1 max r3,r3,c8 add r3, r3, c7 min oD0,r3,c9 m4x4 oPos, v0, c[2] add r1.xyz, c6,-v0 dp3 r2.w, r1, r1 rsq r2.w, r2.w mul r1.xyz,r1,r2.w dp3 r2.w, v1, r1 max r2.w,r2.w,c8 add oD0.xyz, r2.w, c7 mov oT0, v2
25
25 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Questions?? Ronen.Zohar@intel.com Kim.Pallister@intel.com Intel, Pentium and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Copyright © 2001 Intel Corp.
26
26 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Backup
27
27 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Tweening + transformation vertex shader Mul r0.xyz,v0,c0.x // c0.x – α Mad r0.xyz,v1,c0.y,r0 // c0.y – (1- α) M4x4oPos,r0,c1 MovoD[0].xyz,v2 MovoT[0].xy,v3
28
28 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Not Equal Address Value Const register file (x4) 1.0f 2.0f 3.0f 1212 Address register (x4) Need to re-arrange a combination register for the SIMD instruction to use (costs ~20 cycles) 1.0f2.0f1.0f2.0f Instruction argument
29
29 Copyright © 2001 Intel Corporation. * Other names and brands may be claimed as the property of others. Meltdown 2001 Equal Address Value Const register file (x4) 1.0f 2.0f 3.0f 2222 Address register (x4) Accessing directly the x4 constant register file. No penalty for “re-arranging” vertices Address accessing mode is selected when storing address value Instruction argument
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.