DirectX 8 and GeForce3 Christian Schär & Sacha Saxer
Overview
  DirectX
  Direct3D 8
  Vertex Shader
  Pixel Shader
  Hardware Support
  Conclusions
  Future
DirectX Components
  DirectDraw     2D Graphics API
  Direct3D       3D Graphics API
  DirectSound    Sound API
  DirectMusic    Music API
  DirectShow     Multimedia Streams API
  DirectInput    Input API
  DirectPlay     Network API
Direct3D
  Hardware Abstraction Layer
  Classic Rendering Pipeline
  Vertex Shader
  Pixel Shader
  D3DX
Hardware Abstraction Layer
Classic Rendering Pipeline
  CreateVertexBuffer();
  SetVertexShader();
  SetStreamSource();
  DrawPrimitive();
Vertex Shader
  Small assembly language program
  Replaces the Transformation and Lighting Engine
  Responsible for World and View transformations
  Is executed once per vertex
  Has no neighborhood information
  Prepares data for the Pixel Shader
Modified Rendering Pipeline
Vertex Shader Architecture
  16 Input registers (r/o)
  96 Constant registers (r/o)
  12 Temp registers (r/w)
  1 Address register (w/o)
  Output registers
  Max. 128 instructions
Vertex Shader Assembly
  Input:
    vn      Vertex
    c[n]    Constants
    an      Address
    rn      Temp
  Output:
    oPos    Position
    oTn     Texture
    oDn     Color
    oFog/oPts
Vertex Shader Instructions
  mov r, s0            ; copy
  add r, s0, s1        ; sum
  sub r, s0, s1        ; difference
  mul r, s0, s1        ; multiply
  mad r, s0, s1, s2    ; multiply-add
  rcp r, s0.w          ; reciprocal
  rsq r, s0.w          ; reciprocal sqrt
Vertex Shader Instructions
  dp3 r, s0, s1        ; 3D dot product
  dp4 r, s0, s1        ; 4D dot product
  min r, s0, s1        ; minimum, per component
  max r, s0, s1        ; maximum, per component
  slt r, s0, s1        ; 1.0 if less than
  sge r, s0, s1        ; 1.0 if greater/equal
Vertex Shader Instructions
  expp r, s0.w         ; 2^x, partial precision
  logp r, s0.w         ; log2(x), partial precision
  lit r, s0            ; lighting fn
  dst r, s0, s1        ; distance fn
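To check what a shader computes, the arithmetic instructions listed on the instruction slides can be emulated on the CPU. The Python sketch below models a register as a 4-component tuple. The function names mirror the mnemonics but are otherwise our own, and hardware details such as precision or the result of rcp on 0 are deliberately not modeled.

```python
import math

# Registers are modeled as 4-component tuples (x, y, z, w).
# This is an illustrative emulation, not the hardware behavior in every corner case.

def mad(s0, s1, s2):
    """mad r, s0, s1, s2: per-component s0 * s1 + s2."""
    return tuple(a * b + c for a, b, c in zip(s0, s1, s2))

def dp3(s0, s1):
    """dp3 r, s0, s1: 3D dot product, replicated to all components."""
    d = sum(a * b for a, b in zip(s0[:3], s1[:3]))
    return (d, d, d, d)

def dp4(s0, s1):
    """dp4 r, s0, s1: 4D dot product, replicated to all components."""
    d = sum(a * b for a, b in zip(s0, s1))
    return (d, d, d, d)

def rcp(s0_w):
    """rcp r, s0.w: reciprocal of one scalar (nonzero input assumed)."""
    return (1.0 / s0_w,) * 4

def rsq(s0_w):
    """rsq r, s0.w: reciprocal square root (positive input assumed)."""
    return (1.0 / math.sqrt(s0_w),) * 4

def slt(s0, s1):
    """slt r, s0, s1: 1.0 where s0 < s1, else 0.0, per component."""
    return tuple(1.0 if a < b else 0.0 for a, b in zip(s0, s1))

def sge(s0, s1):
    """sge r, s0, s1: 1.0 where s0 >= s1, else 0.0, per component."""
    return tuple(1.0 if a >= b else 0.0 for a, b in zip(s0, s1))
```

Because slt and sge return 0.0 or 1.0 per component, their results can be multiplied into other values to select between alternatives without branching.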
Register modifier
  Component Modifier              Description
  r.{x}{y}{z}{w}                  Destination mask
  r.[xyzw][xyzw][xyzw][xyzw]      Source swizzle
  -r                              Source negation
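The three register modifiers can be illustrated with a small Python sketch. The helper names (swizzle, negate, write_masked) are hypothetical; only the single-component form (r.x, replicated to all four components) and the full four-component form are handled here.

```python
# Component names of a 4-component register and their indices.
_INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}

def swizzle(reg, pattern):
    """Source swizzle: r.wzyx reorders components, r.x replicates x to all four."""
    if len(pattern) == 1:
        pattern = pattern * 4
    if len(pattern) != 4:
        raise ValueError("this sketch supports 1 or 4 swizzle components only")
    return tuple(reg[_INDEX[c]] for c in pattern)

def negate(reg):
    """Source negation: -r."""
    return tuple(-c for c in reg)

def write_masked(dest, value, mask):
    """Destination mask: only the components named in mask are overwritten."""
    out = list(dest)
    for c in mask:
        out[_INDEX[c]] = value[_INDEX[c]]
    return tuple(out)
```

For example, `write_masked(r1, dp3_result, "x")` corresponds to an instruction that writes `r1.x` and leaves the other three components of r1 untouched.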
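Sample Vertex Shader
  vs.1.1
  m4x4 r0, v0, c[CV_WORLD_0]         ; position to world space
  m4x4 oPos, r0, c[CV_VIEWPROJ_0]    ; world space to clip space
  m3x3 r1, v3, c[CV_WORLD_0]         ; normal to world space
  dp3 r1.x, r1, c[CV_LIGHT]          ; N . L
  max r1.x, r1.x, c[CV_ZERO].x       ; clamp negative values to 0
  mul r1, r1.x, c[CV_DIFUSE]         ; scale by diffuse color
  add r1, r1, c[CV_AMBIENT]          ; add ambient color
  min r1, r1, c[CV_ONE].x            ; clamp to 1
  mov oD0.x, r1.x                    ; write the same intensity
  mov oD0.y, r1.x                    ; to all three color channels
  mov oD0.z, r1.x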
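The sample shader is a per-vertex diffuse lighting computation, which can be reproduced on the CPU as a rough check. The Python sketch below assumes that each matrix is stored as rows in consecutive constant registers (so m4x4 and m3x3 are one dot product per row), that v3 holds the normal, and that the light constant is a unit vector pointing toward the light. The function and parameter names are ours.

```python
def dot(a, b, n):
    """Dot product over the first n components."""
    return sum(a[i] * b[i] for i in range(n))

def m4x4(v, rows):
    """m4x4: one dp4 per matrix row (rows stored in consecutive constants)."""
    return tuple(dot(v, row, 4) for row in rows[:4])

def m3x3(v, rows):
    """m3x3: one dp3 per row, using the first three rows only."""
    return tuple(dot(v, row, 3) for row in rows[:3])

def shade_vertex(position, normal, world, view_proj, light, diffuse, ambient):
    """Emulates the sample shader; returns (oPos, oD0.xyz)."""
    r0 = m4x4(position, world)            # m4x4 r0, v0, c[CV_WORLD_0]
    o_pos = m4x4(r0, view_proj)           # m4x4 oPos, r0, c[CV_VIEWPROJ_0]
    n = m3x3(normal, world)               # m3x3 r1, v3, c[CV_WORLD_0]
    n_dot_l = max(dot(n, light, 3), 0.0)  # dp3 + max
    # mul, add, min: scale by diffuse, add ambient, clamp to 1
    r1 = tuple(min(n_dot_l * d + a, 1.0) for d, a in zip(diffuse, ambient))
    # The shader copies only r1.x into oD0.x, oD0.y and oD0.z (a gray value).
    return o_pos, (r1[0], r1[0], r1[0])
```

Note that only the x component of the lit color reaches the output, so the green and blue entries of the diffuse and ambient constants have no effect in this particular shader.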
Pixel Shader
  Small assembly language program
  Replaces the Texturing and Lighting Engine
  Is executed once per pixel
  Has no neighborhood information
  Does further calculations on the Vertex Shader's output
  The output is the color of the pixel
Modified Rendering Pipeline
Pixel Shader Architecture
  [Diagram; stage labels: TexAddrOp 0, TexAddrOp 1, TexAddrOp 2, TexAddrOp 3, Triangle Rasterizer, 8 Texture Blend Ops, Specular / Fog, Computed Alpha Blending, Dx8 Pixel Shaders]
Pixel Shader Assembly
  Input:
    vn    Vertex color
    tn    Texture
    cn    Constants
  Output:
    rn    Temp
    r0    Output color
Pixel Shader Instructions
  mov r, s0            ; copy
  add r, s0, s1        ; sum
  sub r, s0, s1        ; difference
  mul r, s0, s1        ; multiply
  mad r, s0, s1, s2    ; multiply-add
Pixel Shader Instructions
  dp3 r, s0, s1          ; 3D dot product
  lrp r, s0, s1, s2      ; lin. interp. blend
  cnd r, r0.a, s1, s2    ; r = r0.a > 0.5 ? s1 : s2

  sub r0, v0, v1_bias
  cnd r0, r0.a, c0, c1   ; r = v0 > v1 ? c0 : c1
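The comparison trick on this slide (sub with a _bias source, followed by cnd) is easier to follow when written out. The Python sketch below emulates lrp and cnd and then the two-instruction comparison. Function names are ours, register clamping is ignored, and the comparison looks only at the alpha channel, as cnd does.

```python
def lrp(s0, s1, s2):
    """lrp r, s0, s1, s2: per-component s0 * s1 + (1 - s0) * s2."""
    return tuple(a * b + (1.0 - a) * c for a, b, c in zip(s0, s1, s2))

def cnd(r0_a, s1, s2):
    """cnd r, r0.a, s1, s2: s1 if r0.a > 0.5, else s2."""
    return s1 if r0_a > 0.5 else s2

def select_if_greater(v0, v1, c0, c1):
    """Emulates: sub r0, v0, v1_bias / cnd r0, r0.a, c0, c1.

    v1_bias is v1 - 0.5, so r0 = v0 - v1 + 0.5, and r0.a > 0.5
    holds exactly when v0.a > v1.a.
    """
    r0 = tuple(a - (b - 0.5) for a, b in zip(v0, v1))
    return cnd(r0[3], c0, c1)
```

The bias shifts the threshold of cnd from 0.5 to 0, which is what turns a fixed test against 0.5 into a comparison between two registers.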
Texture Instructions
  tex t                    ; normal sample
  texbem t, s              ; bumped env. mapping
  texbeml t, s             ; … with luminance
  texcoord t               ; use tex. coords. as color
  texkill t                ; discard pixel if any coord < 0
  texm3x2pad t, s          ; matrix multiplications
  texm3x2tex t, s          ;
  texm3x3pad t, s          ;
  texm3x3tex t, s          ;
  texm3x3spec t, s0, s1    ; … +refl. +env. map.
  texm3x3vspec t, s        ;
  texreg2ar t, s           ; use s.ar as coords.
  texreg2gb t, s           ; use s.gb as coords.
Sample Texture Instructions
  tex t                    ; put texture col. into t

  texm3x2pad t0, s         ; 3x2 matrix multiplication
  texm3x2tex t1, s

  tex t0                   ; get normal vector from t0
  texm3x3pad t1, t0        ; eye-ray vector from c0
  texm3x3pad t2, t0        ; cube env. texture from t3
  texm3x3spec t3, t0, c0   ; do env. mapping
  mov r0, t3               ; output color
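The 3x2 matrix multiplication pair can be read as two dot products that produce a new texture coordinate (u, v), which is then used for the lookup. The Python sketch below illustrates that reading. The matrix rows stand for the texture coordinate sets of the two stages, the nearest-neighbor sampler is a simplification of our own, and all names are hypothetical.

```python
def dot3(a, b):
    """3D dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def texm3x2(row0, row1, s_rgb):
    """texm3x2pad computes u, texm3x2tex computes v and performs the lookup.

    row0 and row1 are the texture coordinates of the two stages,
    s_rgb is the color previously fetched into the source register.
    """
    u = dot3(row0, s_rgb)  # texm3x2pad
    v = dot3(row1, s_rgb)  # texm3x2tex (coordinate part)
    return u, v

def sample_nearest(texture, u, v):
    """Simplified nearest-neighbor lookup; (u, v) are clamped to the texture."""
    height, width = len(texture), len(texture[0])
    x = min(max(int(u * width), 0), width - 1)
    y = min(max(int(v * height), 0), height - 1)
    return texture[y][x]
```

A lookup then reads `sample_nearest(texture, *texm3x2(row0, row1, s_rgb))`; the 3x3 variants follow the same pattern with a third dot product.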
Modifiers
  Component Modifier    Description
  r.{a} {rgb}           Source/Destination mask
  1-r                   Invert
  -r                    Negate
  r_bias                -0.5
  r_bx2                 -0.5, *2
Modifiers
  Instruction Modifier  Description
  _x2                   multiply result by 2
  _x4                   multiply result by 4
  _d2                   divide result by 2
  _sat                  clamp result to [0, 1]
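The source and instruction modifiers are simple per-component arithmetic, sketched below in Python. The function names are ours and registers are again 4-component tuples. A common use of _bx2 is to expand a normal stored as a color in [0, 1] back to the range [-1, 1].

```python
def invert(r):
    """1-r: invert each component."""
    return tuple(1.0 - c for c in r)

def bias(r):
    """r_bias: subtract 0.5 from each component."""
    return tuple(c - 0.5 for c in r)

def bx2(r):
    """r_bx2: subtract 0.5, then multiply by 2 (maps [0, 1] to [-1, 1])."""
    return tuple((c - 0.5) * 2.0 for c in r)

# Scale factors of the instruction modifiers _x2, _x4 and _d2.
_SCALE = {"_x2": 2.0, "_x4": 4.0, "_d2": 0.5}

def scale(result, modifier):
    """Apply _x2, _x4 or _d2 to an instruction result."""
    return tuple(c * _SCALE[modifier] for c in result)

def sat(result):
    """_sat: clamp each component of the result to [0, 1]."""
    return tuple(min(max(c, 0.0), 1.0) for c in result)
```

Modifiers can be combined, for example `sat(scale(result, "_x2"))` for an instruction that doubles its result and then clamps it.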
Sample Pixel Shader
  ps.1.1
  tex t1
  mov r0, t1       ; use texture color
  mov r0.a, v0     ; get diffuse lighting from
                   ; vertex interpolation
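As a rough CPU equivalent, the sample pixel shader takes its color from the texture and overwrites only the alpha channel with the interpolated vertex value. The Python sketch below assumes colors are (r, g, b, a) tuples; the function name is ours.

```python
def sample_pixel_shader(t1, v0):
    """Emulates: tex t1 / mov r0, t1 / mov r0.a, v0.

    t1 is the sampled texture color, v0 the interpolated vertex color.
    """
    r0 = tuple(t1)                       # mov r0, t1: use texture color
    r0 = (r0[0], r0[1], r0[2], v0[3])    # mov r0.a, v0: .a mask writes alpha only
    return r0
```

The destination mask `.a` is what keeps the second mov from replacing the texture color written by the first.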
D3DX Mesh Optimizing
  adjacency required
  sort by attribute
  compact
  progressive meshes
D3DX Progressive Meshes
  different levels of detail (LOD)
  half edge collapse
  cloning by sharing of vertex buffers
  streamable save method
  OptimizeBaseLOD
  TrimByVertices/TrimByFaces
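A half edge collapse removes one vertex by merging it into a neighbor, which is how a progressive mesh steps down to a coarser level of detail. The Python sketch below shows the operation on an indexed triangle list. It is a generic illustration, not D3DX's implementation; in particular, how D3DX chooses which edge to collapse is not covered here.

```python
def half_edge_collapse(faces, u, v):
    """Collapse vertex u into vertex v.

    faces is a list of (i, j, k) vertex index triples. Every reference
    to u is redirected to v, and triangles that become degenerate
    (two identical indices) are dropped.
    """
    collapsed = []
    for face in faces:
        new_face = tuple(v if index == u else index for index in face)
        if len(set(new_face)) == 3:
            collapsed.append(new_face)
    return collapsed
```

Because the surviving vertex v keeps its position, no new vertex data is created, which fits with levels of detail sharing one vertex buffer.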
D3DX Skinned Meshes
  vertex data, bone data with vertex indices
  support for .X files
  export filters for Maya, 3D Studio Max available
  up to 4 indices per vertex
  up to 12 indices per face
  up to 256 matrices in the bone palette
  ConvertToBlendedMesh() reduces to these constraints
  ConvertToIndexBlendedMesh() same, but fewer subsets
  Uses GeForce's restricted skinning support by rendering a prefix in hardware.
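The limit of 4 indices per vertex refers to blending a vertex between several bone transforms. The Python sketch below shows the usual weighted blend for one vertex. Bone matrices are given as three rows of four values (rotation plus translation), and the palette layout and function names are assumptions made for illustration.

```python
def transform(matrix, p):
    """Apply a 3x4 matrix (three rows of four values) to a 3D point."""
    x, y, z = p
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in matrix)

def skin_vertex(position, bone_indices, weights, palette):
    """Blend a vertex position between up to 4 bones.

    bone_indices select matrices from the palette, and weights are
    expected to sum to 1.
    """
    if len(bone_indices) > 4 or len(bone_indices) != len(weights):
        raise ValueError("expected up to 4 bone indices with matching weights")
    out = [0.0, 0.0, 0.0]
    for index, weight in zip(bone_indices, weights):
        p = transform(palette[index], position)
        for k in range(3):
            out[k] += weight * p[k]
    return tuple(out)
```

With an indexed palette, neighboring vertices can reference different bones within one draw call, which is consistent with the index-blended mesh needing fewer subsets.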
Conclusions
  very flexible tool
  hardware accelerated
  low level interface (assembly)
  vertex shaders theoretically applicable to point-sampled geometry
  rendering of shadows needs lots of work
What's next?
  subdivision surfaces in hardware (TruForm technology by ATI)
  displacement maps
References
  MSDN (Microsoft Developer Network)
  NVIDIA Whitepapers
    – Introduction to Vertex Shaders
    – Introduction to Pixel Shaders