Tutorial #13: Solving MIPS Exam Problems (course 234262). © Dima Elenbogen 2010, Technion.

Presentation transcript:


LWA Ri, Rj

Encoding of LWA Ri, Rj

Encoding of LWA Ri, Rj (continued)

Execution flow of LWA Ri, Rj

Implementation of LWA Ri, Rj

Why is the wait cycle needed? Tpd ≈ Tcycle

LWC Rn, const
Note that this instruction occupies 2 words.

Encoding of LWC Rn, const
Instruction format: OP (6 bits) | Rs (5 bits) | Rt (5 bits) | IM (16 bits)
Word at address α: OP | ... | Rn | 1
Word at address α+4: const (32 bits)
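The two-word layout above can be sketched in Python. This is only an illustration: the opcode value `OPCODE_LWC` and the choice of 0 for the unspecified Rs field are assumptions, since the slides give neither, and `encode_lwc` is a hypothetical helper name.

```python
# Hypothetical opcode; the slides do not state the real value.
OPCODE_LWC = 0b111111


def encode_lwc(rn: int, const: int, opcode: int = OPCODE_LWC) -> tuple[int, int]:
    """Return the two 32-bit words of `LWC Rn, const`.

    Word 1 (address alpha):     OP | Rs | Rt = Rn | IM = 1
    Word 2 (address alpha + 4): the 32-bit constant
    """
    if not 0 <= rn < 32:
        raise ValueError("register number must fit in 5 bits")
    if not 0 <= opcode < 64:
        raise ValueError("opcode must fit in 6 bits")
    rs = 0  # assumption: Rs is a don't-care field, filled with 0 here
    imm = 1  # IM = 1, as in the encoding on the slide
    first_word = (opcode << 26) | (rs << 21) | (rn << 16) | imm
    second_word = const & 0xFFFFFFFF  # keep the low 32 bits (two's complement)
    return first_word, second_word


if __name__ == "__main__":
    w1, w2 = encode_lwc(5, 0x12345678)
    print(f"{w1:#010x} {w2:#010x}")
```

With IM = 1, the decode-stage computation BPC ← PC + SX(Imm)*4 yields α + 8, which is the address of the instruction that follows the constant.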

Implementation idea for LWC Rn, const
[Figure: multi-cycle MIPS datapath with IR, register file, sign extension, ALU, ALUout, PC, BPC, memory (ADDR, WDATA, MEMout), multiplexers M1-M6 and controllers C1, C3]
Encoding: OP | ... | Rn | 1, followed by the const word.
Why did we encode Rn in the Rt field specifically?

Original decode stage (© Yohai Devir 2007, Dima Elenbogen 2010, Technion - IIT)
[Figure: decode-stage portion of the datapath: IR, register file, sign extension, M5, M6, ALU, ALUout, PC, BPC, controller C1]
- C1 is decoding IR[OP]
- Rs is being read
- Rt is being read
- BPC ← PC + SX(Imm)*4

Addition to the decode stage
[Figure: decode-stage portion of the datapath, extended with the memory read path: PC, M1, M2, ADDR, MEM, MEMout]
- C1 is decoding IR[OP] ...
- BPC ← PC + SX(Imm)*4   // BPC ← α + 8
- MEMout ← MEM(PC)   // read the const value from memory
The change does not affect other instructions.
Encoding: OP | ... | Rn | 1

Stage unique to LWC Rn, const
[Figure: multi-cycle MIPS datapath, with the paths used by this stage: memory output to the register file, BPC to PC]
WB:
Rt ← Mem(PC)   // Rt ← MEM[α + 4]
PC ← BPC   // PC ← α + 8
Encoding: OP | ... | Rn | 1

Implementation of LWC Rn, const
Encoding: OP (6 bits) | Rs (5 bits) | Rt (5 bits) | IM (16 bits) = OP | ... | Rn | 1, followed by const (32 bits).
Fetch:
IR ← Mem(PC)
PC ← PC + 4   // PC ← α + 4
Decode:
C1 is decoding IR[OP] ...
BPC ← PC + SX(Imm)*4   // BPC ← α + 8
MEMout ← MEM(PC)   // read the const value (the additional operation in decode)
WB:
Rt ← Mem(PC)   // Rt ← MEM[α + 4]
PC ← BPC   // PC ← α + 8
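The three cycles listed above can be traced with a small Python model. It is a sketch under stated assumptions, not the slides' hardware: memory is modelled as a dictionary keyed by byte address, the instruction word `0xFC050001` uses a made-up opcode, and `run_lwc` and `sign_extend16` are hypothetical names.

```python
def sign_extend16(value: int) -> int:
    """Sign-extend a 16-bit field to a Python int (the SX() operation)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def run_lwc(mem: dict[int, int], regs: list[int], pc: int) -> int:
    """Execute one LWC located at address `pc`; return the new PC."""
    # Fetch: IR <- Mem(PC); PC <- PC + 4
    ir = mem[pc]
    pc = pc + 4                          # PC = alpha + 4

    # Decode: BPC <- PC + SX(Imm)*4; MEMout <- MEM(PC)
    imm = sign_extend16(ir & 0xFFFF)     # IM field, expected to be 1
    bpc = pc + imm * 4                   # BPC = alpha + 8 when IM = 1
    memout = mem[pc]                     # the constant stored at alpha + 4

    # WB: Rt <- Mem(PC); PC <- BPC
    rt = (ir >> 16) & 0x1F               # Rn is encoded in the Rt field
    regs[rt] = memout
    return bpc


if __name__ == "__main__":
    alpha = 0x100
    memory = {alpha: 0xFC050001, alpha + 4: 0xDEADBEEF}  # hypothetical opcode
    registers = [0] * 32
    new_pc = run_lwc(memory, registers, alpha)
    print(hex(new_pc), hex(registers[5]))
```

The model reuses the value latched in MEMout during decode for the write-back, which is one reading of the slide; the point is that PC skips over the constant word and lands on α + 8.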

Alternative unique stage for LWC Rn, const
WB:
Rt ← Mem(PC)   // Rt ← MEM[α + 4]
PC ← PC + 4   // PC ← α + 8
Encoding: OP | ... | Rn | ...
[Figure: multi-cycle MIPS datapath with IR, register file, sign extension, ALU, ALUout, PC, BPC, memory (ADDR, WDATA, MEMout), multiplexers M1-M6 and controllers C1, C3]

ADDMEM Ri, Rj, Imm

Encoding of ADDMEM Ri, Rj, Imm
Instruction format: OP (6 bits) | Rs (5 bits) | Rt (5 bits) | IM (16 bits)
Encoding: OP | Ri | Rj | Imm
[Figure: multi-cycle MIPS datapath with IR, register file, sign extension, ALU, ALUout, PC, BPC, memory (ADDR, WDATA, MEMout), multiplexers M1-M6 and controllers C1, C3]

Easy solution for ADDMEM Ri, Rj, Imm
[Figure: multi-cycle MIPS datapath modified for the easy solution]

Cost of the easy solution for ADDMEM Ri, Rj, Imm
[Figure: multi-cycle MIPS datapath modified for the easy solution]
Easy, but expensive. Its cost is 2*32+32 = 96 NIS. It requires 4 cycles.

Medium solution for ADDMEM Ri, Rj, Imm
[Figure: multi-cycle MIPS datapath with a multiplexer selecting a constant 0 in front of the N REG1 input]
Medium cost: 2*32+5 = 69 NIS.
The instruction encoding contains no constant 0, so a multiplexer is required in front of REG1.
The solution requires more than 4 cycles.
The address to be written is kept in PC. The previous PC value must of course be restored once the memory update completes; BPC is used for that.

The cheapest solution for ADDMEM Ri, Rj, Imm
Encoding: OP | Rs | Rt | IM = OP | Ri | Rj | Imm

The cheapest solution for ADDMEM Ri, Rj, Imm (continued)

The cheapest solution for ADDMEM Ri, Rj, Imm (continued)
Annotations on the cycle table:
- New register-read cycles.
- A register-read cycle after the update.
- New register-read cycles.
- Note the writes performed in parallel in the next-to-last cycle.
- Cycles 3, 13 and 14: the PC value is backed up in BPC and restored from there.

Pipelined MIPS
Pipeline stages: IF, ID, EX, MEM, WB.
The main problem of the pipelined MIPS is data hazards. If an instruction updates Rk, the new value becomes available only 3 instructions later. The compiler or programmer should arrange the code to minimize data hazards. When they are unavoidable, 2 solutions are possible:
- The compiler or programmer can deliberately insert NOP instructions.
- (In reality) If the processor has a data hazard detection unit, the unit detects the hazards and delays the instructions that read the register.
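The NOP-insertion approach can be sketched as a small Python pass. It assumes the rule stated on the slide, that a value written by one instruction can first be read 3 instructions later; the `Instr` record and the `insert_nops` function are hypothetical names introduced for this illustration, and special handling of register $0 is left out.

```python
from typing import NamedTuple, Optional


class Instr(NamedTuple):
    text: str                 # assembly text, for display only
    dest: Optional[str]       # register written, or None
    srcs: tuple               # registers read


NOP = Instr("nop", None, ())
HAZARD_DISTANCE = 3  # a written value is readable 3 instructions later


def insert_nops(program, distance=HAZARD_DISTANCE):
    """Pad `program` with NOPs so every reader is at least `distance`
    slots after the most recent writer of each register it reads."""
    out = []
    last_write = {}  # register name -> index in `out` of its latest writer
    for instr in program:
        # Earliest slot at which all source registers are up to date.
        earliest = max(
            (last_write[r] + distance for r in instr.srcs if r in last_write),
            default=0,
        )
        while len(out) < earliest:
            out.append(NOP)
        if instr.dest is not None:
            last_write[instr.dest] = len(out)
        out.append(instr)
    return out


if __name__ == "__main__":
    program = [
        Instr("add $2,$3,$4", "$2", ("$3", "$4")),
        Instr("add $5,$2,$6", "$5", ("$2", "$6")),  # reads $2 right away
    ]
    for instr in insert_nops(program):
        print(instr.text)
    # prints: add $2,$3,$4 / nop / nop / add $5,$2,$6 (one per line)
```

If an independent instruction already sits between the writer and the reader, the pass inserts fewer NOPs, which is why reordering the code reduces the cost of the hazards.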

SWI Rj Rk

Encoding of SWI Rj Rk
Instruction format: OP (6 bits) | Rs (5 bits) | Rt (5 bits) | IM (16 bits)
Encoding: OP | Rj | Rk | 4

SWI Rj Rk (continued)
Encoding: OP | Rs | Rt | IM = OP | Rj | Rk | 4

Implementation of SWI Rj Rk

Answer for SWI Rj Rk