Status – Week 245. Victor Moya.

Summary
Streamer.
Research credits (Creditos investigación).

Streamer
[Block diagram. Labels: Streamer Fetch, Output Cache, Streamer Loader, Streamer Commit, Shader, Rasterizer, Command Processor, Streamer, Memory Controller.]

Streamer

Streamer
Reserve IRQ, OFIFO and OM at Streamer Fetch.
Allocate OM at Streamer Output Cache.
Allocate IRQ at Streamer Loader.
Allocate OFIFO at Streamer Commit.
Deallocation messages go to Streamer Fetch (from Streamer Loader and Streamer Commit) and to Streamer Output Cache (from Streamer Commit).
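
The reserve/deallocate scheme on this slide resembles credit-based flow control, and it can be sketched roughly as follows. This is only an illustration, not the simulator's code: the class names, method names and queue sizes are hypothetical, and the sketch keeps all three counters in one object even though the slide addresses the OM deallocation message to Streamer Output Cache rather than to Streamer Fetch.

```python
class CreditCounter:
    """Counts free entries of one structure (IRQ, OFIFO or OM)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.free = capacity

    def reserve(self) -> bool:
        """Take one entry; return False if none is free."""
        if self.free == 0:
            return False
        self.free -= 1
        return True

    def release(self) -> None:
        """Give one entry back (effect of a deallocation message)."""
        if self.free >= self.capacity:
            raise RuntimeError("release without matching reserve")
        self.free += 1


class StreamerFetchModel:
    """Hypothetical model: a new index is issued only if an IRQ entry,
    an OFIFO entry and an OM entry can all be reserved."""

    def __init__(self, irq_size: int, ofifo_size: int, om_size: int) -> None:
        self.irq = CreditCounter(irq_size)
        self.ofifo = CreditCounter(ofifo_size)
        self.om = CreditCounter(om_size)

    def try_issue(self) -> bool:
        # Check first so that a failed attempt reserves nothing.
        if min(self.irq.free, self.ofifo.free, self.om.free) == 0:
            return False
        self.irq.reserve()
        self.ofifo.reserve()
        self.om.reserve()
        return True

    def on_loader_dealloc(self) -> None:
        # Streamer Loader freed an IRQ entry.
        self.irq.release()

    def on_commit_dealloc(self) -> None:
        # Streamer Commit freed an OFIFO entry.
        self.ofifo.release()

    def on_om_dealloc(self) -> None:
        # Assumption: in the slides this message goes to Streamer Output
        # Cache; how Fetch learns about it is not described.
        self.om.release()
```

With an IRQ of one entry, for example, a second `try_issue()` fails until `on_loader_dealloc()` has been called.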

Streamer
Non-indexed mode:
  Go through the output cache:
    Adds 1 cycle to the latency.
    Useless.
    Now required because OM is allocated at Streamer Output Cache.
  Send new index from Streamer Fetch to Streamer Loader and Streamer Commit.
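
One way to see why the output cache does no useful caching in non-indexed mode, yet still has to be traversed, is the toy model below. It assumes (this is not stated on the slide) that the cache maps a vertex index to an OM line, that a hit reuses the line, and that a miss allocates a new one; `OutputCacheModel` and its fields are hypothetical names, and OM is treated as unbounded for simplicity.

```python
class OutputCacheModel:
    """Toy output cache: maps an index to the OM line holding its output."""

    def __init__(self) -> None:
        self.lines = {}      # index -> OM line (assumed mapping)
        self.next_line = 0   # next OM line to hand out (unbounded here)
        self.hits = 0
        self.misses = 0

    def lookup(self, index: int) -> int:
        """Return the OM line for index, allocating one on a miss."""
        if index in self.lines:
            self.hits += 1
            return self.lines[index]
        self.misses += 1
        line = self.next_line
        self.next_line += 1
        self.lines[index] = line
        return line


# Non-indexed mode: indices are a running counter, so none repeats.
non_indexed = OutputCacheModel()
for i in range(6):
    non_indexed.lookup(i)

# Indexed mode: repeated indices can hit and share an OM line.
indexed = OutputCacheModel()
for i in [0, 1, 2, 2, 1, 3]:
    indexed.lookup(i)
```

In the non-indexed run every lookup misses, so the extra cycle buys nothing, but each miss is still where the OM line gets allocated.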

Streamer
Streamer Loader pipeline:
1 – insert new index into the IRQ.
2 – read index from the IRQ.
3 – fetch first attribute for the index input.
4 – fetch second attribute for the index input.
…
n – send input to the shader.
n + 1 – free IRQ entry (message to Streamer Fetch).
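
The Loader stages above can be laid out per cycle with a small helper. This is a sketch under two assumptions that the slide does not state: exactly one attribute is fetched per cycle, and there are no memory stalls. The function name `loader_schedule` is hypothetical.

```python
from typing import List, Tuple


def loader_schedule(num_attributes: int) -> List[Tuple[int, str]]:
    """Return (stage number, action) pairs for one index in Streamer Loader."""
    steps = [
        (1, "insert new index into IRQ"),
        (2, "read index from IRQ"),
    ]
    # Assumption: attribute k is fetched in stage 2 + k, one per stage.
    for k in range(1, num_attributes + 1):
        steps.append((2 + k, f"fetch attribute {k}"))
    send_stage = 3 + num_attributes  # stage "n" on the slide
    steps.append((send_stage, "send input to shader"))
    steps.append((send_stage + 1, "free IRQ entry (message to Streamer Fetch)"))
    return steps


for stage, action in loader_schedule(2):
    print(stage, action)
```

Under these assumptions an input with two attributes reaches the shader in stage 5 and frees its IRQ entry in stage 6.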

Streamer
Streamer Commit pipeline:
1 – insert index in the OFIFO.
2 – read index in the OFIFO if calculate bit enabled. Access the last use table.
3 – send output to the rasterizer.
4 – free OFIFO entry (message to Streamer Fetch). Free output memory entry if last use table says so (message to Streamer Output Cache).
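
The role of the last use table in the Commit stages can be illustrated with the sketch below. It assumes that the table records, for each output memory line, the last OFIFO entry that refers to it; the slides do not spell this out. The table is precomputed over the whole sequence here, whereas a real pipeline would presumably update it as entries arrive. The calculate bit is not modelled, and `commit_messages` and the message tuples are hypothetical.

```python
from typing import Dict, List, Tuple


def commit_messages(om_lines: List[int]) -> List[Tuple[str, str, int]]:
    """om_lines[i] is the OM line used by the i-th OFIFO entry.

    Returns (destination, message, value) tuples in commit order.
    """
    # Assumed last use table: OM line -> position of its last OFIFO entry.
    last_use: Dict[int, int] = {}
    for pos, line in enumerate(om_lines):
        last_use[line] = pos

    messages: List[Tuple[str, str, int]] = []
    for pos, line in enumerate(om_lines):
        # Stage 3: send the output to the rasterizer.
        messages.append(("Rasterizer", "output", line))
        # Stage 4: the OFIFO entry is always freed.
        messages.append(("StreamerFetch", "free_ofifo", pos))
        # Stage 4: the OM line is freed only on its last use.
        if last_use[line] == pos:
            messages.append(("StreamerOutputCache", "free_om", line))
    return messages


msgs = commit_messages([0, 1, 0])
```

For the sequence `[0, 1, 0]`, OM line 1 is freed when the second entry commits, and OM line 0 only when the third entry commits, because the first entry is not its last use.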