Computer Architecture Lecture 3 – Part 2, 15th May, 2006. Abhinav Agarwal, Veeramani V.


1 Computer Architecture Lecture 3 – Part 2, 15th May, 2006. Abhinav Agarwal, Veeramani V.

2 Preamble
   Last quiz
   Webpage – lecture slides, quiz solutions, Verilog lab
   Optional lectures
   Limited projects

3 Outline
   Simple pipeline – hazards and solutions
   Out-of-order execution
   Register renaming
   In-order commit

4 Quick recap – Pipelining
   Source: http://cse.stanford.edu/class/sophomore-college/projects-00/risc/pipelining/

5 Control Hazard
   Branch delay slot
       bnz r1, label
       add r1, r2, r3
       label: sub r1, r2, r3
   This saves one stall cycle. Fetching on the negative clock edge saves another.
   [Figure: pipeline timing for "bez r1, label" through IF, ID, EX, MEM, WB, with a bubble before the IF of the target instruction; the following instructions pass through IF, ID/RF, EX, MEM, WB.]

6 Branch Prediction
   With deeper pipelines, static compiler techniques such as the branch delay slot no longer work.
   Instead, dynamically remember the last targets of each branch and decide on the basis of its history.
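The idea of remembering a branch's last target and deciding from its history can be sketched as follows. This is a minimal illustration, not the scheme of any particular processor: the class name BranchPredictor, the use of a 2-bit saturating counter per branch, and the example addresses are all assumptions that do not come from the slides.

```python
class BranchPredictor:
    """Hypothetical per-branch predictor: 2-bit counter plus last-seen target."""

    def __init__(self):
        self.counters = {}  # branch PC -> counter in 0..3 (>= 2 means "predict taken")
        self.targets = {}   # branch PC -> last target address seen for this branch

    def predict(self, pc):
        """Return the predicted target, or None to predict fall-through."""
        if self.counters.get(pc, 1) >= 2 and pc in self.targets:
            return self.targets[pc]
        return None

    def update(self, pc, taken, target=None):
        """Train on the real outcome once the branch has resolved."""
        c = self.counters.get(pc, 1)  # assumed initial state: weakly not-taken
        self.counters[pc] = min(c + 1, 3) if taken else max(c - 1, 0)
        if taken:
            self.targets[pc] = target


bp = BranchPredictor()
first = bp.predict(0x40)            # no history yet: predict fall-through
for _ in range(3):
    bp.update(0x40, taken=True, target=0x80)
trained = bp.predict(0x40)          # history says taken: predict 0x80
bp.update(0x40, taken=False)        # one not-taken outcome ...
after_one_miss = bp.predict(0x40)   # ... does not flip the prediction
```

The two-bit counter is one common design choice: a single surprising outcome (such as a loop exit) does not immediately overturn a well-established prediction.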

7 Data Hazards
   RAW hazard – Read after Write
       add r1, r2, r3
       store r1, 0(r4)
   WAW hazard – Write after Write
       div r1, r3, r4
       …
       add r1, r10, r5
   WAR hazard – Write after Read
       Generally not relevant in simple pipelines
   [Figure: two consecutive instructions, each passing through IF, ID/RF, EX, MEM, WB.]
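The three hazard types can be told apart mechanically by comparing the registers an earlier instruction writes and reads with those of a later one. The sketch below is illustrative only: the Instr tuple and the hazards function are hypothetical names, and it assumes the first operand of add and div is the destination while store only reads its registers.

```python
from collections import namedtuple

# dest: register written (None if the instruction writes no register)
# srcs: registers read
Instr = namedtuple("Instr", ["text", "dest", "srcs"])


def hazards(first, second):
    """Return the hazard types that `second` has with the earlier `first`."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")  # second reads what first writes
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")  # both write the same register
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")  # second overwrites what first still has to read
    return found


add = Instr("add r1, r2, r3", "r1", ("r2", "r3"))
store = Instr("store r1, 0(r4)", None, ("r1", "r4"))
div = Instr("div r1, r3, r4", "r1", ("r3", "r4"))
add2 = Instr("add r1, r10, r5", "r1", ("r10", "r5"))

raw_example = hazards(add, store)  # {"RAW"}
waw_example = hazards(div, add2)   # {"WAW"}
```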

8 Remedies
   Bypass values (data forwarding)
       RAW hazards are tackled this way.
       Not all RAW hazards can be solved by forwarding, e.g. the load delay:
           load r1, 0(r2)
           add r3, r1, r4
   Solutions:
       Software – compiler techniques
       Hardware – out-of-order execution
   [Figure: two consecutive instructions, each passing through IF, ID/RF, EX, MEM, WB.]
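Why forwarding fixes most RAW hazards but not the load delay can be shown with a small stall calculator. This is a sketch under stated assumptions, not a model of a specific machine: it assumes a classic five-stage pipeline (IF, ID/RF, EX, MEM, WB), a consumer that immediately follows its producer, and a register file that is written in the first half of a cycle and read in the second half. The function name stall_cycles and the "load"/"alu" labels are hypothetical.

```python
def stall_cycles(producer_op, producer_dest, consumer_srcs, forwarding=True):
    """Bubbles needed when the consumer immediately follows the producer."""
    if producer_dest is None or producer_dest not in consumer_srcs:
        return 0  # no RAW dependence between the two instructions
    if not forwarding:
        # Assumption: the consumer's register read must wait for the
        # producer's WB stage (write in first half-cycle, read in second).
        return 2
    # With forwarding, an ALU result is ready at the end of EX, in time for
    # the next instruction's EX. A loaded value only exists after MEM, one
    # cycle too late, so one bubble remains.
    return 1 if producer_op == "load" else 0


# load r1, 0(r2) followed by add r3, r1, r4
load_use = stall_cycles("load", "r1", ("r1", "r4"))  # 1 bubble remains
# add r1, r2, r3 followed by store r1, 0(r4)
alu_use = stall_cycles("alu", "r1", ("r1", "r4"))    # 0 with forwarding
```

The remaining bubble after a load is what a compiler can fill by scheduling an independent instruction there, or what out-of-order hardware can hide by executing other ready instructions.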

9 Out-of-Order Execution
   Source: EV8 DEC Alpha Processor, (c) Intel

10 Register Renaming
   Original                 Renamed
   lw   r4, 0(r1)           lw   p2, 0(p7)
   addi r2, r4, 0x20
   and  r3, r4, r1
   xor  r4, r2, r4
   sub  r2, r4, r3

   Register Map
   Logical Register   Physical Register
   R1                 P7
   R2
   R3
   R4                 P2

11 Register Renaming
   Original                 Renamed
   lw   r4, 0(r1)           lw   p2, 0(p7)
   addi r2, r4, 0x20        addi p1, p2, 0x20
   and  r3, r4, r1
   xor  r4, r2, r4
   sub  r2, r4, r3

   Register Map
   Logical Register   Physical Register
   R1                 P7
   R2                 P1
   R3
   R4                 P2

12 Register Renaming
   Original                 Renamed
   lw   r4, 0(r1)           lw   p2, 0(p7)
   addi r2, r4, 0x20        addi p1, p2, 0x20
   and  r3, r4, r1          and  p3, p2, p7
   xor  r4, r2, r4
   sub  r2, r4, r3

   Register Map
   Logical Register   Physical Register
   R1                 P7
   R2                 P1
   R3                 P3
   R4                 P2

13 Register Renaming
   Original                 Renamed
   lw   r4, 0(r1)           lw   p2, 0(p7)
   addi r2, r4, 0x20        addi p1, p2, 0x20
   and  r3, r4, r1          and  p3, p2, p7
   xor  r4, r2, r4          xor  p5, p1, p2
   sub  r2, r4, r3          sub  p6, p5, p3

   WAW hazards eliminated.
   Useful for newer processors, which have a larger number of physical registers.

   Register Map
   Logical Register   Physical Register
   R1                 P7
   R2                 P6
   R3                 P3
   R4                 P5
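The renaming walk-through on these slides can be reproduced with a short sketch: each instruction reads its sources through the current register map, then takes a fresh physical register for its destination. The function name rename and the (dest, srcs) encoding are hypothetical. The free-list order p2, p1, p3, p5, p6 is an assumption chosen so that the result matches the slides, and only the initial mapping R1 -> P7 is taken from them.

```python
def rename(program, initial_map, free_list):
    """Rename a list of (dest, srcs) instructions.

    Returns the renamed (dest, srcs) list and the final register map.
    """
    reg_map = dict(initial_map)
    free = list(free_list)
    renamed = []
    for dest, srcs in program:
        # Sources are looked up before the destination is remapped, so
        # "xor r4, r2, r4" still reads the old physical copy of r4.
        new_srcs = tuple(reg_map[s] for s in srcs)
        new_dest = free.pop(0)  # allocate a fresh physical register
        reg_map[dest] = new_dest
        renamed.append((new_dest, new_srcs))
    return renamed, reg_map


program = [
    ("r4", ("r1",)),       # lw   r4, 0(r1)
    ("r2", ("r4",)),       # addi r2, r4, 0x20
    ("r3", ("r4", "r1")),  # and  r3, r4, r1
    ("r4", ("r2", "r4")),  # xor  r4, r2, r4
    ("r2", ("r4", "r3")),  # sub  r2, r4, r3
]
renamed, final_map = rename(program, {"r1": "p7"},
                            ["p2", "p1", "p3", "p5", "p6"])
# renamed[3] is ("p5", ("p1", "p2")), i.e. xor p5, p1, p2
# final_map is {"r1": "p7", "r4": "p5", "r2": "p6", "r3": "p3"}
```

Because every write gets its own physical register, the two writes to r4 (lw and xor) and the two writes to r2 (addi and sub) no longer target the same storage, which is why the WAW hazards disappear.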

14 In-order Retirement
   After execution, each instruction is queued in a table.
   This table ensures that the initial program order is maintained.
   Instructions are allowed to become permanent only when they reach the top of the re-order table.
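In-order retirement can be sketched with a queue whose oldest entry sits at the top. This is a minimal illustration with hypothetical names (ReorderBuffer, dispatch, complete, commit). It assumes that entries are allocated in program order when instructions are dispatched and marked done when they finish executing, which is how re-order buffers are commonly described.

```python
from collections import deque


class ReorderBuffer:
    """Hypothetical re-order table: commits only from the oldest entry."""

    def __init__(self):
        self.entries = deque()  # leftmost entry is the oldest (the "top")

    def dispatch(self, name):
        """Allocate an entry in program order."""
        entry = {"name": name, "done": False}
        self.entries.append(entry)
        return entry

    def complete(self, entry):
        """Mark an instruction as executed, possibly out of order."""
        entry["done"] = True

    def commit(self):
        """Retire finished instructions from the top, stopping at the first unfinished one."""
        committed = []
        while self.entries and self.entries[0]["done"]:
            committed.append(self.entries.popleft()["name"])
        return committed


rob = ReorderBuffer()
lw, addi, and_ = (rob.dispatch(n) for n in ("lw", "addi", "and"))
rob.complete(and_)
first = rob.commit()   # []: "and" has finished, but "lw" blocks the top
rob.complete(lw)
second = rob.commit()  # ["lw"]
rob.complete(addi)
third = rob.commit()   # ["addi", "and"]
```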

15 Remedies to Structural Hazards
   Simplest solution: increase resources and functional units (silicon allows us to do this).
   Another solution: pipeline the functional units.
   Pipelining is not always possible or feasible.

16 Superscalar execution!
   Execute more than one instruction every cycle.
   Make better use of the functional units.
   Fetch and commit more instructions every cycle.

17 Memory Organization in Processors
   Caches inside the chip
   Faster – ‘closer’ SRAM cells
   They contain recently used data.
   They contain data in ‘blocks’.

18 Rationale behind caches
   Principle of spatial locality
   Principle of temporal locality
   Replacement policy (LRU, LFU, etc.)
   Principle of inclusivity
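How blocks, locality and an LRU replacement policy interact can be seen in a toy cache model. This is only a sketch: the class name LRUCache, the fully associative organisation, the block size of 16 and the address trace are all assumptions chosen for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical fully associative cache that tracks block numbers only."""

    def __init__(self, num_blocks, block_size):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.blocks = OrderedDict()  # block number -> True, least recently used first

    def access(self, address):
        """Return True on a hit; on a miss, load the block and return False."""
        block = address // self.block_size
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            return True
        if len(self.blocks) >= self.num_blocks:
            self.blocks.popitem(last=False)  # evict the least recently used block
        self.blocks[block] = True
        return False


cache = LRUCache(num_blocks=2, block_size=16)
trace = [0, 4, 0, 32, 64, 0]
results = [cache.access(a) for a in trace]
# results == [False, True, True, False, False, False]
```

Address 4 hits because it lies in the block already fetched for address 0 (spatial locality), and the repeated access to 0 hits because the block is still resident (temporal locality). The final access to 0 misses because loading the block for address 64 evicted block 0, the least recently used one at that point.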

19 References
   http://en.wikipedia.org/wiki/Hazard_(computer_architecture)
   http://www.csee.umbc.edu/~plusquel/611/slides/chap3_3.html

