Computer Architecture Lecture 3 – Part 2, 15th May, 2006. Abhinav Agarwal, Veeramani V.


Preamble: last quiz; webpage (lecture slides, quiz solutions, Verilog lab); optional lectures; limited projects.

Outline: simple pipeline (hazards and solutions); out-of-order execution; register renaming; in-order commit.

Quick recap – Pipelining source:

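Control Hazard
Branch delay slot:
        bnz r1, label
        add r1, r2, r3
    label:
        sub r1, r2, r3
The delay slot saves one stall cycle. Fetching on the negative clock edge saves another.
[Pipeline diagram: bez r1, label passes through IF, ID, EX, MEM, WB; a bubble follows the branch, then the target instruction is fetched (IF). The other instructions pass through IF, ID/RF, EX, MEM, WB.]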

Branch Prediction. With deeper pipelines, static compiler techniques such as the branch delay slot would not work. Instead, the hardware dynamically remembers the last targets of each branch and makes its decision on the basis of that history.
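The slide does not say which prediction scheme is meant, so the following is only a minimal sketch of one common option: a 2-bit saturating counter per branch plus a table of last-seen targets. The class name `BranchPredictor`, the helper `accuracy`, the initial counter value, and the trace format are all assumptions made for illustration.

```python
class BranchPredictor:
    """Per-branch 2-bit saturating counter plus last-target table (illustrative)."""

    def __init__(self):
        self.counters = {}  # branch PC -> counter in 0..3; >= 2 means "predict taken"
        self.targets = {}   # branch PC -> target seen the last time it was taken

    def predict(self, pc):
        # Unknown branches start at 1 (weakly not taken); this is an assumption.
        taken = self.counters.get(pc, 1) >= 2
        target = self.targets.get(pc) if taken else None
        return taken, target

    def update(self, pc, taken, target=None):
        # Move the counter one step towards the actual outcome, saturating at 0 and 3.
        c = self.counters.get(pc, 1)
        self.counters[pc] = min(c + 1, 3) if taken else max(c - 1, 0)
        if taken:
            self.targets[pc] = target


def accuracy(predictor, trace):
    """trace: list of (pc, taken, target) tuples in execution order."""
    correct = 0
    for pc, taken, target in trace:
        guess, _ = predictor.predict(pc)
        correct += (guess == taken)
        predictor.update(pc, taken, target)
    return correct / len(trace)


# A loop branch at a hypothetical PC 0x10: taken 9 times, then falls through once.
loop_trace = [(0x10, True, 0x40)] * 9 + [(0x10, False, None)]
print(accuracy(BranchPredictor(), loop_trace))  # 0.8
```

Only the first iteration and the loop exit are mispredicted here, which is the behaviour the history-based approach is meant to deliver.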

Data Hazards
RAW hazard – Read after Write
    add r1, r2, r3
    store r1, 0(r4)
WAW hazard – Write after Write
    div r1, r3, r4
    …
    add r1, r10, r5
WAR hazard – Write after Read
    Generally not relevant in simple pipelines
[Pipeline diagram: two instructions, each passing through IF, ID/RF, EX, MEM, WB.]
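The three hazard types can be told apart by comparing the destination and source registers of an earlier and a later instruction. The sketch below does exactly that; the function name `classify_hazards` and the `(dest, sources)` tuple encoding are assumptions for illustration, and the WAR pair at the end is a made-up example since the slide gives none.

```python
def classify_hazards(first, second):
    """Return the set of register hazards between two instructions.

    Each instruction is (dest, sources); `first` precedes `second` in
    program order. dest is None for instructions that write no register.
    """
    d1, s1 = first
    d2, s2 = second
    hazards = set()
    if d1 is not None and d1 in s2:
        hazards.add("RAW")  # second reads what first writes
    if d1 is not None and d1 == d2:
        hazards.add("WAW")  # both write the same register
    if d2 is not None and d2 in s1:
        hazards.add("WAR")  # second overwrites what first reads
    return hazards


# add r1, r2, r3 ; store r1, 0(r4)
print(classify_hazards(("r1", ["r2", "r3"]), (None, ["r1", "r4"])))   # {'RAW'}
# div r1, r3, r4 ; add r1, r10, r5
print(classify_hazards(("r1", ["r3", "r4"]), ("r1", ["r10", "r5"])))  # {'WAW'}
# Hypothetical pair: add r1, r2, r3 ; sub r2, r4, r5
print(classify_hazards(("r1", ["r2", "r3"]), ("r2", ["r4", "r5"])))   # {'WAR'}
```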

Remedies
Bypass values (data forwarding): RAW hazards are tackled this way.
Not all RAW hazards can be solved by forwarding. E.g. the load delay:
    load r1, 0(r2)
    add r3, r1, r4
Solutions:
    Software – compiler techniques
    Hardware – out-of-order execution
[Pipeline diagram: two instructions, each passing through IF, ID/RF, EX, MEM, WB.]
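To make the load delay concrete, here is a rough sketch that counts bubbles in a straight-line sequence. It assumes a classic five-stage pipeline with full forwarding, in which the only remaining RAW stall is one cycle when an instruction uses a load result immediately after the load. The function name `count_stalls`, the tuple encoding, and the independent `sub` instruction are hypothetical.

```python
def count_stalls(program):
    """Count load-use bubbles, assuming a 5-stage pipeline with full forwarding.

    program: list of (op, dest, sources) in program order.
    """
    stalls = 0
    for prev, curr in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, curr_srcs = curr
        # A loaded value is only available after MEM, so a dependent
        # instruction directly behind the load must wait one cycle.
        if prev_op == "load" and prev_dest in curr_srcs:
            stalls += 1
    return stalls


original = [
    ("load", "r1", ["r2"]),       # load r1, 0(r2)
    ("add", "r3", ["r1", "r4"]),  # add r3, r1, r4
    ("sub", "r5", ["r6", "r7"]),  # hypothetical independent instruction
]
# Software remedy: the compiler moves the independent sub into the load delay.
rescheduled = [original[0], original[2], original[1]]

print(count_stalls(original))     # 1
print(count_stalls(rescheduled))  # 0
```

The rescheduled order is the compiler technique from the slide in miniature; out-of-order hardware reaches a similar result at run time.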

Out of Order Execution source: EV8 DEC Alpha Processor, (c) Intel

Register Renaming
    Original               Renamed
    lw   r4, 0(r1)         lw   p2, 0(p7)
    addi r2, r4, 0x20
    and  r3, r4, r1
    xor  r4, r2, r4
    sub  r2, r4, r3
Register Map
    Logical Register    Physical Register
    R1                  P7
    R2
    R3
    R4                  P2

Register Renaming
    Original               Renamed
    lw   r4, 0(r1)         lw   p2, 0(p7)
    addi r2, r4, 0x20      addi p1, p2, 0x20
    and  r3, r4, r1
    xor  r4, r2, r4
    sub  r2, r4, r3
Register Map
    Logical Register    Physical Register
    R1                  P7
    R2                  P1
    R3
    R4                  P2

Register Renaming
    Original               Renamed
    lw   r4, 0(r1)         lw   p2, 0(p7)
    addi r2, r4, 0x20      addi p1, p2, 0x20
    and  r3, r4, r1        and  p3, p2, p7
    xor  r4, r2, r4
    sub  r2, r4, r3
Register Map
    Logical Register    Physical Register
    R1                  P7
    R2                  P1
    R3                  P3
    R4                  P2

Register Renaming
    Original               Renamed
    lw   r4, 0(r1)         lw   p2, 0(p7)
    addi r2, r4, 0x20      addi p1, p2, 0x20
    and  r3, r4, r1        and  p3, p2, p7
    xor  r4, r2, r4        xor  p5, p1, p2
    sub  r2, r4, r3        sub  p6, p5, p3
WAW hazards eliminated. Useful for newer processors, which have a larger number of physical registers.
Register Map
    Logical Register    Physical Register
    R1                  P7
    R2                  P6
    R3                  P3
    R4                  P5
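The renaming walk-through above can be reproduced with a small sketch: every source is looked up in the current register map, and every destination receives a fresh physical register. The function name `rename`, the tuple encoding, and the free-list order (chosen to match the physical registers used on the slides) are assumptions; real hardware also recycles physical registers, which is not modelled here.

```python
from collections import deque


def rename(program, initial_map, free_regs):
    """Rename logical registers to physical ones, one instruction at a time.

    program: list of (op, dest, operands); operands that are not in the
    register map (such as immediates) pass through unchanged.
    """
    reg_map = dict(initial_map)
    free = deque(free_regs)
    renamed = []
    for op, dest, operands in program:
        # Read the sources first, so "xor r4, r2, r4" sees the old r4.
        new_operands = [reg_map.get(x, x) for x in operands]
        new_dest = free.popleft()  # fresh physical register for every write
        reg_map[dest] = new_dest
        renamed.append((op, new_dest, new_operands))
    return renamed, reg_map


program = [
    ("lw", "r4", ["r1"]),            # lw r4, 0(r1); the offset is omitted here
    ("addi", "r2", ["r4", "0x20"]),
    ("and", "r3", ["r4", "r1"]),
    ("xor", "r4", ["r2", "r4"]),
    ("sub", "r2", ["r4", "r3"]),
]
renamed, final_map = rename(program, {"r1": "p7"}, ["p2", "p1", "p3", "p5", "p6"])
for instruction in renamed:
    print(instruction)
print(final_map)  # {'r1': 'p7', 'r4': 'p5', 'r2': 'p6', 'r3': 'p3'}
```

Because the two writes to r4 (and the two writes to r2) now target different physical registers, the later write no longer has to wait for the earlier one.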

In-order Retirement. After execution, each instruction is queued in a table. This table ensures that the original program order is maintained: an instruction's results are allowed to become permanent only when it reaches the top of the reorder table.
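A minimal sketch of such a reorder table follows. It assumes entries are allocated in program order when instructions are issued, are marked done when execution finishes (possibly out of order), and are committed only from the head. The class name `ReorderBuffer` and its method names are hypothetical.

```python
from collections import deque


class ReorderBuffer:
    """Toy reorder table: out-of-order completion, in-order commit."""

    def __init__(self):
        self.entries = deque()  # program order, oldest instruction at the left

    def issue(self, name):
        self.entries.append({"name": name, "done": False})

    def complete(self, name):
        # Execution may finish in any order; just mark the entry as done.
        for entry in self.entries:
            if entry["name"] == name:
                entry["done"] = True
                return
        raise KeyError(name)

    def commit(self):
        # Retire from the head only, stopping at the first unfinished entry.
        committed = []
        while self.entries and self.entries[0]["done"]:
            committed.append(self.entries.popleft()["name"])
        return committed


rob = ReorderBuffer()
for name in ["i1", "i2", "i3"]:
    rob.issue(name)

rob.complete("i3")
print(rob.commit())  # [] because i1 at the head has not finished
rob.complete("i1")
print(rob.commit())  # ['i1']
rob.complete("i2")
print(rob.commit())  # ['i2', 'i3']
```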

Remedies to Structural Hazards. Simplest solution: increase resources, i.e. add functional units (silicon allows us to do this). Another solution: pipeline the functional units, although pipelining is not always possible or feasible.

Superscalar execution! Execute more than one instruction every cycle. Make better use of the functional units. Fetch and commit more instructions every cycle.

Memory Organization in Processors. Caches inside the chip: faster, ‘closer’ SRAM cells. They contain recently used data. They contain data in ‘blocks’.

Rationale behind caches: principle of spatial locality; principle of temporal locality; replacement policy (LRU, LFU, etc.); principle of inclusivity.
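As a rough illustration of how blocks and LRU replacement exploit locality, the sketch below models a tiny fully associative cache. The class name `LRUCache`, the sizes, and the address sequence are assumptions; real caches are usually set-associative and track considerably more state.

```python
from collections import OrderedDict


class LRUCache:
    """Toy fully associative cache with LRU replacement (illustrative only)."""

    def __init__(self, num_blocks, block_size):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.blocks = OrderedDict()  # block number -> True, least recent first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block = address // self.block_size  # neighbouring addresses share a block
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.blocks) >= self.num_blocks:
            self.blocks.popitem(last=False)  # evict the least recently used block
        self.blocks[block] = True
        return False


cache = LRUCache(num_blocks=2, block_size=4)
for address in range(8):  # sequential walk: spatial locality
    cache.access(address)
print(cache.hits, cache.misses)  # 6 2

cache.access(8)  # miss, evicts block 0 (least recently used)
cache.access(4)  # hit, block 1 is still resident: temporal locality
cache.access(0)  # miss, block 0 was evicted
print(cache.hits, cache.misses)  # 7 4
```

The sequential walk misses only once per block, which is the payoff of fetching data in blocks.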
