Published by Damian Mathews. Modified over 9 years ago.
1
EEL5708/Bölöni Lec 3.1 Fall 2004
EEL 5708 High Performance Computer Architecture, Fall 2004
Lecture 3
Review: Instruction Sets
Lotzi Bölöni, Sept 1, 2004
2
Acknowledgements
All the lecture slides were adapted from the slides of David Patterson (1998, 2001) and David E. Culler (2001), Copyright 1998-2002, University of California Berkeley
3
Review: Instruction sets
4
The Instruction Set: a Critical Interface
[Figure: the instruction set drawn as the interface between software and hardware]
5
Levels of Representation
High Level Language Program:
    temp = v[k];
    v[k] = v[k+1];
    v[k+1] = temp;
  (Compiler)
Assembly Language Program:
    lw $15, 0($2)
    lw $16, 4($2)
    sw $16, 0($2)
    sw $15, 4($2)
  (Assembler)
Machine Language Program:
    0000 1001 1100 0110 1010 1111 0101 1000
    1010 1111 0101 1000 0000 1001 1100 0110
    1100 0110 1010 1111 0101 1000 0000 1001
    0101 1000 0000 1001 1100 0110 1010 1111
  (Machine Interpretation)
Control Signal Specification:
    ALUOP[0:3] <= InstReg[9:11] & MASK
6
Instruction Set Architecture
"... the attributes of a [computing] system as seen by the programmer, i.e. the conceptual structure and functional behavior, as distinct from the organization of the data flows and controls, the logic design, and the physical implementation." (Amdahl, Blaauw, and Brooks, 1964)
-- Organization of Programmable Storage
-- Data Types & Data Structures: Encodings & Representations
-- Instruction Formats
-- Instruction (or Operation Code) Set
-- Modes of Addressing and Accessing Data Items and Instructions
-- Exceptional Conditions
7
Review: MIPS R3000 (core)
Programmable storage:
  - 2^32 x bytes
  - 31 x 32-bit GPRs (R0 = 0)
  - 32 x 32-bit FP regs (paired DP)
  - HI, LO, PC
Data types? Format? Addressing modes?
Arithmetic/logical:
  Add, AddU, Sub, SubU, And, Or, Xor, Nor, SLT, SLTU,
  AddI, AddIU, SLTI, SLTIU, AndI, OrI, XorI, LUI,
  SLL, SRL, SRA, SLLV, SRLV, SRAV
Memory access:
  LB, LBU, LH, LHU, LW, LWL, LWR,
  SB, SH, SW, SWL, SWR
Control:
  J, JAL, JR, JALR,
  BEq, BNE, BLEZ, BGTZ, BLTZ, BGEZ, BLTZAL, BGEZAL
32-bit instructions on word boundary
[Figure: registers r0 (always 0), r1 ... r31, plus PC, lo, hi]
8
Review: Basic ISA Classes
Accumulator:
  1 address      add A          acc <- acc + mem[A]
  1+x address    addx A         acc <- acc + mem[A + x]
Stack:
  0 address      add            tos <- tos + next
General Purpose Register:
  2 address      add A B        EA(A) <- EA(A) + EA(B)
  3 address      add A B C      EA(A) <- EA(B) + EA(C)
Load/Store:
  3 address      add Ra Rb Rc   Ra <- Rb + Rc
                 load Ra Rb     Ra <- mem[Rb]
                 store Ra Rb    mem[Rb] <- Ra
9
Instruction Formats
[Figure: variable, fixed, and hybrid instruction format layouts]
- Addressing modes
  – each operand requires address specifier => variable format
- Code size => variable length instructions
- Performance => fixed length instructions
  – simple decoding, predictable operations
- With load/store instruction arch, only one memory address and few addressing modes => simple format, address mode given by opcode
10
MIPS Addressing Modes & Formats
- Simple addressing modes
- All instructions 32 bits wide
  Register (direct):  op | rs | rt | rd       operand is a register
  Immediate:          op | rs | rt | immed    operand is the immed field
  Base+index:         op | rs | rt | immed    memory address = register + immed
  PC-relative:        op | rs | rt | immed    memory address = PC + immed
- Register Indirect?
11
Execution Cycle
- Instruction Fetch: Obtain instruction from program storage
- Instruction Decode: Determine required actions and instruction size
- Operand Fetch: Locate and obtain operand data
- Execute: Compute result value or status
- Result Store: Deposit results in storage for later use
- Next Instruction: Determine successor instruction
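The execution cycle above can be sketched as a small interpreter loop. This is only an illustration under stated assumptions: the tuple-based instruction encoding, the `li` (load immediate) opcode, and the names `run` and `swap_program` are hypothetical and are not taken from the slides or from the real MIPS encoding.

```python
def run(program, memory, num_regs=4):
    """Run a toy load/store program; returns (registers, memory)."""
    regs = [0] * num_regs
    pc = 0
    while pc < len(program):
        instr = program[pc]              # Instruction Fetch
        op, *args = instr                # Instruction Decode
        next_pc = pc + 1                 # default successor instruction
        if op == "li":                   # hypothetical: Ra <- constant
            ra, value = args
            regs[ra] = value
        elif op == "load":               # Ra <- mem[Rb]
            ra, rb = args
            regs[ra] = memory[regs[rb]]  # Operand Fetch, then Result Store
        elif op == "store":              # mem[Rb] <- Ra
            ra, rb = args
            memory[regs[rb]] = regs[ra]
        elif op == "add":                # Ra <- Rb + Rc
            ra, rb, rc = args
            regs[ra] = regs[rb] + regs[rc]   # Execute
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc = next_pc                     # Next Instruction
    return regs, memory


# Swap two adjacent memory words, mirroring the v[k] / v[k+1] swap
# from the "Levels of Representation" slide.
swap_program = [
    ("li", 0, 0),      # r0 <- address 0
    ("li", 1, 1),      # r1 <- address 1
    ("load", 2, 0),    # r2 <- mem[r0]
    ("load", 3, 1),    # r3 <- mem[r1]
    ("store", 3, 0),   # mem[r0] <- r3
    ("store", 2, 1),   # mem[r1] <- r2
    ("halt",),
]
regs, memory = run(swap_program, [10, 20])
print(memory)
```

A real pipeline overlaps these phases; the loop here runs them strictly one instruction at a time to keep the correspondence with the six steps visible.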
12
Review: Measuring performance
13
Which is faster?
- Time to run the task (ExTime)
  – Execution time, response time, latency
- Tasks per day, hour, week, sec, ns ... (Performance)
  – Throughput, bandwidth
  Plane              Speed      DC to Paris   Passengers   Throughput (pmph)
  Boeing 747         610 mph    6.5 hours     470          286,700
  BAC/Sud Concorde   1350 mph   3 hours       132          178,200
(pmph = passengers x mph)
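The throughput column can be reproduced from the other columns as passengers times speed. The sketch below uses the figures from the slide; the dictionary layout and the function name `throughput_pmph` are illustrative choices, not part of the lecture.

```python
# Figures taken from the slide; the data layout is an assumption.
planes = {
    "Boeing 747": {"speed_mph": 610, "passengers": 470, "trip_hours": 6.5},
    "Concorde": {"speed_mph": 1350, "passengers": 132, "trip_hours": 3.0},
}


def throughput_pmph(plane):
    """Passenger-miles per hour: passengers carried times cruising speed."""
    return plane["speed_mph"] * plane["passengers"]


for name, plane in planes.items():
    print(name, plane["trip_hours"], "hours,", throughput_pmph(plane), "pmph")
```

The Concorde wins on latency (3 hours versus 6.5), while the 747 wins on throughput, which is why the slide separates the two metrics.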
14
Definitions
- Performance is in units of things per sec
  – bigger is better
- If we are primarily concerned with response time
  – performance(x) = 1 / execution_time(x)
- "X is n times faster than Y" means
  n = Performance(X) / Performance(Y) = Execution_time(Y) / Execution_time(X)
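As a quick check of the definition, the sketch below computes n both from execution times and from performance values, using the DC to Paris flight times from the previous slide. The function names are hypothetical.

```python
def performance(execution_time):
    """Performance as the reciprocal of execution time (response-time view)."""
    return 1.0 / execution_time


def times_faster(exec_time_x, exec_time_y):
    """Return n such that X is n times faster than Y."""
    return exec_time_y / exec_time_x


concorde_hours = 3.0   # X
boeing_hours = 6.5     # Y

n_from_times = times_faster(concorde_hours, boeing_hours)
n_from_perf = performance(concorde_hours) / performance(boeing_hours)
print(round(n_from_times, 2), round(n_from_perf, 2))
```

Both routes give the same ratio, a little over 2, because dividing performances is the same as dividing execution times in the opposite order.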
15
Computer Performance
CPU time = Seconds / Program = (Instructions / Program) x (Cycles / Instruction) x (Seconds / Cycle)
The three factors are the instruction count, the CPI, and the cycle time.
                 Inst Count   CPI   Clock Rate
  Program            X
  Compiler           X        (X)
  Inst. Set          X         X
  Organization                 X        X
  Technology                            X
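The CPU time equation translates directly into a three-factor product. In the sketch below the function name and the sample workload (one billion instructions, CPI of 1.5, 500 MHz clock) are made-up illustrative values, not numbers from the lecture.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """CPU time = instructions x cycles per instruction x seconds per cycle."""
    cycle_time = 1.0 / clock_rate_hz   # seconds per cycle
    return instruction_count * cpi * cycle_time


# Hypothetical workload, chosen only to exercise the formula.
example_time = cpu_time_seconds(1_000_000_000, 1.5, 500e6)
print(f"{example_time:.2f} s")
```

Halving any one factor halves the CPU time, but the table shows the factors are rarely independent: a change in the instruction set, for example, tends to move both instruction count and CPI.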
16
Cycles Per Instruction (Throughput)
"Average Cycles per Instruction":
  CPI = (CPU Time * Clock Rate) / Instruction Count = Cycles / Instruction Count
"Instruction Frequency":
  CPI = sum over i of CPI(i) * F(i), where F(i) = I(i) / Instruction Count
17
Example: Calculating CPI bottom up
Base Machine (Reg / Reg), typical mix of instruction types in program:
  Op       Freq   Cycles   CPI(i)   (% Time)
  ALU      50%    1        .5       (33%)
  Load     20%    2        .4       (27%)
  Store    10%    2        .2       (13%)
  Branch   20%    2        .4       (27%)
  Total                    1.5
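The bottom-up calculation weights each instruction class's cycle count by its frequency and sums the results. The sketch below uses the mix from the slide; the names `mix`, `cpi_from_mix`, and `time_shares` are illustrative.

```python
# (frequency, cycles) per instruction class, as given on the slide.
mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 2),
    "Store": (0.10, 2),
    "Branch": (0.20, 2),
}


def cpi_from_mix(mix):
    """Overall CPI as the frequency-weighted sum of per-class cycles."""
    return sum(freq * cycles for freq, cycles in mix.values())


def time_shares(mix):
    """Fraction of execution time spent in each instruction class."""
    total = cpi_from_mix(mix)
    return {op: freq * cycles / total for op, (freq, cycles) in mix.items()}


print(round(cpi_from_mix(mix), 2))
for op, share in time_shares(mix).items():
    print(op, f"{share:.0%}")
```

Note that loads are only 20% of the instructions but about 27% of the time, since the time share depends on CPI(i) rather than on frequency alone.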
18
Example: Branch Stall Impact
- Assume CPI = 1.0 ignoring branches (ideal)
- Assume the solution was stalling for 3 cycles
- If 30% of instructions are branches, stall 3 cycles on 30%
  Op       Freq   Cycles   CPI(i)   (% Time)
  Other    70%    1        .7       (37%)
  Branch   30%    4        1.2      (63%)
- => new CPI = 1.9
- New machine is 1/1.9 = 0.53 times as fast (i.e. slow!)
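The same result follows from adding the average stall penalty to the ideal CPI. This is a minimal sketch with the slide's numbers; the function name `cpi_with_stalls` is hypothetical, and it assumes instruction count and clock rate are unchanged so that relative speed is just the ratio of CPIs.

```python
def cpi_with_stalls(base_cpi, branch_freq, stall_cycles):
    """Ideal CPI plus the stall cycles paid, on average, per instruction."""
    return base_cpi + branch_freq * stall_cycles


base_cpi = 1.0
new_cpi = cpi_with_stalls(base_cpi, branch_freq=0.30, stall_cycles=3)

# With instruction count and clock rate held fixed, speed scales as 1 / CPI.
relative_speed = base_cpi / new_cpi
print(round(new_cpi, 2), round(relative_speed, 3))
```

The stalls nearly double the CPI, so the stalled machine runs at roughly half the speed of the ideal one.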