1
Status – Week 272 Victor Moya
2
Vertex Shader
- VS 2.0+ (NV30) based Vertex Shader model.
- Multithreaded?? Implemented with an FP array (3DLabs P10).
- Dynamic branching.
- No texture/vertex buffer load.
- No vertex kill.
3
Vertex Shader
4
Shader Model
- Mono/Multithreaded Shader based on the NV30 instruction set.
- A Shader is a stream processor:
  - Input Stream => Input Register Bank
    - 16 registers in a Vertex Shader
    - 12 registers in a Pixel Shader
  - Output Stream => Output Register Bank
    - ~16 registers in a Vertex Shader
    - ~4 registers in a Pixel Shader
  - Constant Memory/Register Bank
    - Up to 256 in a Vertex Shader
5
Shader Model
- Instruction Cache/Memory
  - Up to 256 in Vertex Shader
  - 1024 in Pixel Shader
  - Shared between different processors (?)
- Temporary and Auxiliary Registers
  - 16 (Vertex Shader), 32/64 (Pixel Shader)
  - Address Registers
  - Condition Code Register
  - Boolean Register
  - Loop counters
  - etc.
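The resource limits quoted on the two Shader Model slides can be gathered into one configuration record. The following is a minimal C++ sketch, not the simulator's actual code: the names ShaderLimits, kVertexLimits, kPixelLimits and bankBytes are hypothetical, the approximate figures (~16, ~4) are taken at face value, 32 is assumed where the slides say 32/64, and the pixel-shader constant count (not stated in the slides) is left at 0.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical record of the per-shader resource limits quoted in the slides.
struct ShaderLimits {
    std::uint32_t inputRegs;     // input register bank size
    std::uint32_t outputRegs;    // output register bank size (approximate in the slides)
    std::uint32_t constantRegs;  // constant memory entries (0 = not stated in the slides)
    std::uint32_t instructions;  // instruction memory entries
    std::uint32_t tempRegs;      // temporary registers
};

// Vertex shader figures as listed in the slides.
constexpr ShaderLimits kVertexLimits{16, 16, 256, 256, 16};

// Pixel shader figures; the slides give 32/64 temporaries, 32 is assumed here.
constexpr ShaderLimits kPixelLimits{12, 4, 0, 1024, 32};

// Assumption: every register is a 4-component float vector ("4D register"),
// so a bank of `regs` registers occupies regs * 4 * sizeof(float) bytes.
constexpr std::uint32_t bankBytes(std::uint32_t regs) {
    return regs * 4u * static_cast<std::uint32_t>(sizeof(float));
}
```

A box that allocates its register banks could read its sizes from one of these records instead of hard-coding them, which keeps the open questions in the slides (for example 32 versus 64 temporaries) confined to a single place.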
6
Shader Model
- Multithreaded:
  - numThreads: number of streams that the shader can store. Includes idle and loading/unloading threads. Structures affected: input and output register banks.
  - numActiveThreads: number of active (in execution) threads. Structures affected: temporary and auxiliary registers, and the PC table (in the Simulator Box).
  - Constant/Parameter Memory and Instruction Cache/Memory are shared between all the threads. They are also shared between different Shaders (but this isn’t provided with the current model).
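One way to read the two thread counts is as multipliers on different storage structures. The sketch below, in C++, illustrates that reading under assumptions: the type ThreadedShaderState and its member names are hypothetical, registers are assumed to be 4-component float vectors, and auxiliary registers are folded into the temporary bank for brevity.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Vec4 = std::array<float, 4>;  // assumed register type

// Hypothetical storage layout for a multithreaded shader.
struct ThreadedShaderState {
    std::vector<Vec4> input;       // one input bank per stored thread
    std::vector<Vec4> output;      // one output bank per stored thread
    std::vector<Vec4> temp;        // one temporary bank per active thread
    std::vector<std::size_t> pc;   // PC table: one entry per active thread
    std::vector<Vec4> constants;   // shared by all threads

    ThreadedShaderState(std::size_t numThreads, std::size_t numActiveThreads,
                        std::size_t inRegs, std::size_t outRegs,
                        std::size_t tempRegs, std::size_t constRegs)
        : input(numThreads * inRegs),
          output(numThreads * outRegs),
          temp(numActiveThreads * tempRegs),
          pc(numActiveThreads),
          constants(constRegs) {
        // Active threads are a subset of the threads the shader can store.
        assert(numActiveThreads <= numThreads);
    }
};
```

With 8 stored threads and 2 active ones, only the input and output banks grow eightfold, while the temporaries and the PC table are sized for 2 threads and the constant memory is allocated once.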
7
Test Model
- Four boxes:
  - Loader: gets commands (input stream, new programs and parameters) from a file.
  - Fetch: fetches instructions from a Shader program memory.
  - Decode/Execute: decodes and executes instructions, taking dependencies into account.
  - Writer: receives the output stream and writes it to a file.
8
Test Model
- Wires:
  - Command: sends commands read from the input file to the Fetch box. Latency varies with the kind of command and the data size.
    - New Shader Program: loads new instructions.
    - New Shader Parameters: loads new parameters into constant memory.
    - New Input: sends a new input (Vertex Input: 16 4D registers).
  - Sync: for synchronization between Loader and Fetch (with the dynamic branch model, execution of a Shader Program depends on the Shader Output). Latency 1.
9
Test Model
- Wires:
  - Instruction: Fetch sends new instructions to Decode/Execute. The EXIT instruction marks the end of a Shader Program (Decode/Execute then sends the Output to the Writer). Latency 1.
  - NewPC: Fetch receives control flow changes from Decode/Execute. Latency 1.
  - Execute: drives the execution latency of each instruction. Variable latency (1 – 5?).
  - Output: Decode/Execute sends the Shader Program result for the current output to the logger box (Writer). Latency constant but greater than 1 (4 or 5?).
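The fixed-latency wires described above (Sync, Instruction, NewPC, Output) can be modelled as a delay queue: a value written at cycle c becomes readable at cycle c + latency. This is a minimal C++17 sketch under that assumption; the class name Wire and its write/read methods are hypothetical and are not taken from the simulator.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <optional>
#include <utility>

// Hypothetical fixed-latency wire between two boxes.
template <typename T>
class Wire {
public:
    explicit Wire(std::uint64_t latency) : latency_(latency) {}

    // The producer box writes a value during `cycle`.
    void write(std::uint64_t cycle, T value) {
        inFlight_.emplace_back(cycle + latency_, std::move(value));
    }

    // The consumer box polls during `cycle`; returns the oldest value whose
    // arrival cycle has been reached, or nothing if none has arrived yet.
    std::optional<T> read(std::uint64_t cycle) {
        if (inFlight_.empty() || inFlight_.front().first > cycle) {
            return std::nullopt;
        }
        T value = std::move(inFlight_.front().second);
        inFlight_.pop_front();
        return value;
    }

private:
    std::uint64_t latency_;
    std::deque<std::pair<std::uint64_t, T>> inFlight_;  // (arrival cycle, value)
};
```

A NewPC wire would be declared as Wire<std::size_t> with latency 1. The Command and Execute wires have variable latency, so they would need a per-write latency argument and an arrival-ordered container instead of a plain FIFO; that variant is not shown here.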
10
Test Model
- Instruction Set:
  - Encoding in 128 bits. See file.
- Emulation:
  - Separate library: ShaderEmulator.
11
ShaderEmulator
- Performs the functional emulation of the shader:
  - Instruction (static) management and execution.
  - Keeps the shader state.
- Implementation:
  - Support for different MODELS?: VS1, VS2, PS1, PS2.
  - How to implement models? Different classes? Switch/case?
  - Where to keep structures related to control flow? Ex: stack, PC table.
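The two options raised for implementing models (switch/case versus different classes) can be compared side by side. The C++ sketch below is illustrative only: the enum values follow the model names on the slide, while isVertexModel, ModelBehaviour, VertexBehaviour, PixelBehaviour and makeBehaviour are hypothetical names, and the single property shown (vertex versus pixel) stands in for whatever behaviour actually differs between models.

```cpp
#include <cassert>
#include <memory>

enum class ShaderModel { VS1, VS2, PS1, PS2 };

// Option A: one implementation that branches on the model wherever
// behaviour differs.
inline bool isVertexModel(ShaderModel model) {
    switch (model) {
        case ShaderModel::VS1:
        case ShaderModel::VS2:
            return true;
        case ShaderModel::PS1:
        case ShaderModel::PS2:
            return false;
    }
    return false;  // not reached for valid enum values
}

// Option B: one class per model family behind a common interface.
class ModelBehaviour {
public:
    virtual ~ModelBehaviour() = default;
    virtual bool isVertexModel() const = 0;
};

class VertexBehaviour : public ModelBehaviour {
public:
    bool isVertexModel() const override { return true; }
};

class PixelBehaviour : public ModelBehaviour {
public:
    bool isVertexModel() const override { return false; }
};

// Factory that selects the class once, at construction time.
inline std::unique_ptr<ModelBehaviour> makeBehaviour(ShaderModel model) {
    if (isVertexModel(model)) {
        return std::make_unique<VertexBehaviour>();
    }
    return std::make_unique<PixelBehaviour>();
}
```

Option A keeps everything in one class at the cost of a branch at every model-dependent point; Option B pays one virtual call but localizes each model's rules, which may matter if instructions that are invalid in one model have to be rejected.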
12
ShaderEmulator
- Interface:
  - ShaderEmulator(numThreads, numActiveThreads, shaderModel)
  - LoadShaderProgram(code)
  - ResetShaderState(numThread)
  - ReadShaderState(numThread, data)
  - LoadShaderState(numThread, data)
  - ExecuteShaderInstruction(numThread, PC)
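The interface listed above could translate into a class skeleton along the following lines. This is a hedged C++ sketch: the method names come from the slide, but every parameter type, the 16-register per-thread state (kStateRegs), the Vec4 and EncodedInstruction aliases and the two accessors are assumptions, and ExecuteShaderInstruction only validates its arguments because the instruction set itself is defined elsewhere.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

enum class ShaderModel { VS1, VS2, PS1, PS2 };
using Vec4 = std::array<float, 4>;                        // assumed register type
using EncodedInstruction = std::array<std::uint32_t, 4>;  // 128-bit encoding

class ShaderEmulator {
public:
    ShaderEmulator(std::size_t numThreads, std::size_t numActiveThreads,
                   ShaderModel shaderModel)
        : numActiveThreads_(numActiveThreads),
          model_(shaderModel),
          state_(numThreads, std::vector<Vec4>(kStateRegs)) {
        assert(numActiveThreads <= numThreads);
    }

    void LoadShaderProgram(const std::vector<EncodedInstruction>& code) {
        program_ = code;
    }

    // Clears the state of one thread (zeroes every register).
    void ResetShaderState(std::size_t numThread) {
        state_.at(numThread).assign(kStateRegs, Vec4{});
    }

    // Copies the state of one thread out to the caller.
    void ReadShaderState(std::size_t numThread, std::vector<Vec4>& data) const {
        data = state_.at(numThread);
    }

    // Replaces the state of one thread with caller-provided data.
    void LoadShaderState(std::size_t numThread, const std::vector<Vec4>& data) {
        state_.at(numThread) = data;
    }

    // Placeholder: checks that the thread and PC exist (throws
    // std::out_of_range otherwise). Decoding and execution are omitted.
    void ExecuteShaderInstruction(std::size_t numThread, std::size_t pc) {
        (void)state_.at(numThread);
        (void)program_.at(pc);
    }

    ShaderModel model() const { return model_; }
    std::size_t numActiveThreads() const { return numActiveThreads_; }

private:
    static constexpr std::size_t kStateRegs = 16;  // hypothetical state size
    std::size_t numActiveThreads_;
    ShaderModel model_;
    std::vector<std::vector<Vec4>> state_;  // one register set per thread
    std::vector<EncodedInstruction> program_;
};
```

Passing the thread index to every state call keeps the emulator free of any notion of a "current" thread, so the simulator boxes remain the only place where scheduling decisions are made.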
13
ShaderInstruction
- Decoded shader instruction.
- What to do with shader models? Invalid instructions in different models.
- Interface:
  - ShaderInstruction(code)
  - Different functions/attributes to get decoded information from the instruction (input registers, output registers, mask, swizzle, condition codes, etc.).
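Two of the decoded attributes listed above, swizzle and mask, determine how operands are read and results are written. The C++ sketch below shows the usual meaning of both: a swizzle selects which source component feeds each lane, and a write mask selects which destination components are updated. The names applySwizzle and writeMasked, the Swizzle type and the bit order of the mask (bit 0 = x) are assumptions, not the encoding used by the simulator.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<float, 4>;            // components x, y, z, w
using Swizzle = std::array<std::uint8_t, 4>;  // each entry in 0..3 picks a source component

// Builds the operand actually seen by the instruction: lane i receives
// source component sw[i]. For example {3, 3, 0, 1} corresponds to ".wwxy".
inline Vec4 applySwizzle(const Vec4& src, const Swizzle& sw) {
    Vec4 result{};
    for (std::size_t i = 0; i < 4; ++i) {
        assert(sw[i] < 4);
        result[i] = src[sw[i]];
    }
    return result;
}

// Writes only the components enabled in `mask` (assumed: bit i enables
// component i); the other destination components keep their old values.
inline void writeMasked(Vec4& dst, const Vec4& value, std::uint8_t mask) {
    for (std::size_t i = 0; i < 4; ++i) {
        if (mask & (1u << i)) {
            dst[i] = value[i];
        }
    }
}
```

Exposing swizzle and mask as plain decoded fields lets Decode/Execute apply them uniformly to every instruction instead of handling them per opcode.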
14
ShaderExecInstruction
- Stores an instance of an instruction that is being executed.
- Carries information about the execution:
  - ShaderInstruction: decoded instruction.
  - PC: instruction memory address.
  - state: decode/execution/writeback/locked/…
  - result: result of the instruction.
  - startCycle: cycle in which the instruction was fetched.
  - Other statistics?
15
ShaderExecInstruction
- Implementation:
  - Avoid dynamic creation of objects.
  - Static pool.
  - Created at fetch, destroyed at decode/execute (writeback).
  - Can it be managed by the ShaderExecInstruction class itself? (static).
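The static pool idea can be sketched as a fixed array of slots that are handed out at fetch and returned at writeback, so no object is allocated while the simulation runs. The C++ code below is one possible shape, under assumptions: the fields mirror those listed for ShaderExecInstruction (the decoded ShaderInstruction reference is omitted), while the Free state, the ExecInstructionPool class and its acquire/release methods are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<float, 4>;  // assumed result type

// Free is an extra, hypothetical state that marks an unused pool slot.
enum class ExecState { Free, Decode, Execution, Writeback, Locked };

struct ShaderExecInstruction {
    std::size_t pc = 0;                 // instruction memory address
    ExecState state = ExecState::Free;  // pipeline state of this instance
    Vec4 result{};                      // result of the instruction
    std::uint64_t startCycle = 0;       // cycle in which it was fetched
};

// Fixed-capacity pool: all storage lives inside the object.
template <std::size_t N>
class ExecInstructionPool {
public:
    // Called at fetch. Returns nullptr when every slot is in use, which the
    // caller can treat as a stall.
    ShaderExecInstruction* acquire(std::size_t pc, std::uint64_t cycle) {
        for (auto& slot : slots_) {
            if (slot.state == ExecState::Free) {
                slot = ShaderExecInstruction{pc, ExecState::Decode, Vec4{}, cycle};
                return &slot;
            }
        }
        return nullptr;
    }

    // Called at writeback. The slot becomes reusable; nothing is deallocated.
    void release(ShaderExecInstruction* inst) {
        assert(inst != nullptr);
        inst->state = ExecState::Free;
    }

private:
    std::array<ShaderExecInstruction, N> slots_{};
};
```

The linear scan in acquire is adequate for a pool sized to the number of in-flight instructions; a free list would avoid it if the pool grew large. Making the pool a static member of ShaderExecInstruction, as the slide suggests, would only change where slots_ lives.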
16
Test Model
17
Code Management
- Directory structure:
  - /emu (or /emulator): functional emulation classes and functions.
  - /sim (or /simulator): simulation classes and functions.
  - /support: support functions (IO, Types, etc.).