Lecture 4 Sept 9 Goals: Amdahl’s law Chapter 2 MIPS assembly language instruction formats translating c into MIPS - examples.

Slides:



Advertisements
Similar presentations
1 Lecture 3: MIPS Instruction Set Today’s topic:  More MIPS instructions  Procedure call/return Reminder: Assignment 1 is on the class web-page (due.
Advertisements

Goal: Write Programs in Assembly
Review of the MIPS Instruction Set Architecture. RISC Instruction Set Basics All operations on data apply to data in registers and typically change the.
Instruction Set-Intro
Lecture 5: MIPS Instruction Set
1 ECE462/562 ISA and Datapath Review Ali Akoglu. 2 Instruction Set Architecture A very important abstraction –interface between hardware and low-level.
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /15/2013 Lecture 11: MIPS-Conditional Instructions Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER.
1 ECE369 ECE369 Chapter 2. 2 ECE369 Instruction Set Architecture A very important abstraction –interface between hardware and low-level software –standardizes.
ECE 15B Computer Organization Spring 2010 Dmitri Strukov Lecture 5: Data Transfer Instructions / Control Flow Instructions Partially adapted from Computer.
CS1104 – Computer Organization PART 2: Computer Architecture Lecture 5 MIPS ISA & Assembly Language Programming.
CS3350B Computer Architecture Winter 2015 Lecture 4
Systems Architecture Lecture 5: MIPS Instruction Set
Chapter 2 Instructions: Language of the Computer
Chapter 2 Instructions: Language of the Computer Part III.
ECE 15B Computer Organization Spring 2010 Dmitri Strukov Lecture 4: Arithmetic / Data Transfer Instructions Partially adapted from Computer Organization.
S. Barua – CPSC 440 CHAPTER 2 INSTRUCTIONS: LANGUAGE OF THE COMPUTER Goals – To get familiar with.
1 Lecture 2: MIPS Instruction Set Today’s topic:  MIPS instructions Reminder: sign up for the mailing list cs3810 Reminder: set up your CADE accounts.
Lecture 5 Sept 14 Goals: Chapter 2 continued MIPS assembly language instruction formats translating c into MIPS - examples.
ECE 15B Computer Organization Spring 2010 Dmitri Strukov Lecture 6: Logic/Shift Instructions Partially adapted from Computer Organization and Design, 4.
COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Hao Ji.
ISA-2 CSCE430/830 MIPS: Case Study of Instruction Set Architecture CSCE430/830 Computer Architecture Instructor: Hong Jiang Courtesy of Prof. Yifeng Zhu.
C OMPUTER O RGANIZATION AND D ESIGN The Hardware/Software Interface 5 th Edition Chapter 2 Instructions: Language of the Computer.
CMPT 334 Computer Organization
Lecture Objectives: 1)Define the terms least significant bit and most significant bit. 2)Explain how unsigned integer numbers are represented in memory.
CS224 Fall 2011 Chap 2a Computer Organization CS224 Fall 2011 Chapter 2 a With thanks to M.J. Irwin, D. Patterson, and J. Hennessy for some lecture slide.
April 23, 2001Systems Architecture I1 Systems Architecture I (CS ) Lecture 9: Assemblers, Linkers, and Loaders * Jeremy R. Johnson Mon. April 23,
Lecture 4: MIPS Instruction Set
IFT 201: Unit 1 Lecture 1.3: Processor Architecture-3
Chapter 2 CSF 2009 The MIPS Assembly Language: Introduction to Binary System.
Computer Architecture (CS 207 D) Instruction Set Architecture ISA.
CPS3340 COMPUTER ARCHITECTURE Fall Semester, /08/2013 Lecture 10: MIPS Instructor: Ashraf Yaseen DEPARTMENT OF MATH & COMPUTER SCIENCE CENTRAL STATE.
CHAPTER 6 Instruction Set Architecture 12/7/
Chapter 2 Decision-Making Instructions (Instructions: Language of the Computer Part V)
Chapter 2 CSF 2009 The MIPS Assembly Language. Stored Program Computers Instructions represented in binary, just like data Instructions and data stored.
Computer Organization CS224 Fall 2012 Lessons 7 and 8.
EET 4250 Instruction Representation & Formats Acknowledgements: Some slides and lecture notes for this course adapted from Prof. Mary Jane Penn.
Chapter 2 — Instructions: Language of the Computer — 1 Memory Operands Main memory used for composite data – Arrays, structures, dynamic data To apply.
Chapter 2 — Instructions: Language of the Computer — 1 Conditional Operations Branch to a labeled instruction if a condition is true – Otherwise, continue.
DR. SIMING LIU SPRING 2016 COMPUTER SCIENCE AND ENGINEERING UNIVERSITY OF NEVADA, RENO Session 9 Binary Representation and Logical Operations.
DR. SIMING LIU SPRING 2016 COMPUTER SCIENCE AND ENGINEERING UNIVERSITY OF NEVADA, RENO Session 7, 8 Instruction Set Architecture.
CDA 3101 Spring 2016 Introduction to Computer Organization
DR. SIMING LIU SPRING 2016 COMPUTER SCIENCE AND ENGINEERING UNIVERSITY OF NEVADA, RENO Session 11 Conditional Operations.
CS Computer Organization Numbers and Instructions Dr. Stephen P. Carl.
Chapter 2 Instructions: Language of the Computer.
COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Yaohang Li.
CSCI-365 Computer Organization Lecture Note: Some slides and/or pictures in the following are adapted from: Computer Organization and Design, Patterson.
Computer Architecture & Operations I
Computer Architecture & Operations I
Computer Architecture & Operations I
MIPS Assembly.
COMPUTER ARCHITECTURE & OPERATIONS I
Morgan Kaufmann Publishers
Lecture 4: MIPS Instruction Set
The University of Adelaide, School of Computer Science
Computer Architecture (CS 207 D) Instruction Set Architecture ISA
The University of Adelaide, School of Computer Science
Instructions - Type and Format
Lecture 4: MIPS Instruction Set
Systems Architecture I (CS ) Lecture 5: MIPS Instruction Set*
CSCI206 - Computer Organization & Programming
Systems Architecture Lecture 5: MIPS Instruction Set
Computer Architecture & Operations I
The University of Adelaide, School of Computer Science
ECE232: Hardware Organization and Design
Chapter 2 Instructions: Language of the Computer
The University of Adelaide, School of Computer Science
Computer Instructions
Computer Architecture
UCSD ECE 111 Prof. Farinaz Koushanfar Fall 2018
Systems Architecture I (CS ) Lecture 5: MIPS Instruction Set*
Presentation transcript:

Lecture 4 Sept 9 Goals: Amdahl’s law Chapter 2 MIPS assembly language instruction formats translating c into MIPS - examples

Amdahl’s Law Amdahl’s law: speedup achieved if a fraction f of a task is unaffected and the remaining 1 – f part runs p times as fast. s =  min(p, 1/f) 1 f + (1 – f)/p f = fraction unaffected p = speedup of the rest

Example Amdahl’s Law in design A processor spends 30% of its time on flp addition, 25% on flp mult, and 10% on flp division. Evaluate the following enhancements, each costing the same to implement: a.Redesign of the flp adder to make it twice as fast. b.Redesign of the flp multiplier to make it three times as fast. c.Redesign the flp divider to make it 10 times as fast.

Example Amdahl’s Law in design A processor spends 30% of its time on flp addition, 25% on flp mult, and 10% on flp division. Evaluate the following enhancements, each costing the same to implement: a.Redesign of the flp adder to make it twice as fast. b.Redesign of the flp multiplier to make it three times as fast. c.Redesign the flp divider to make it 10 times as fast. Solution a.Adder redesign speedup = 1 / [ / 2] = 1.18 b.Multiplier redesign speedup = 1 / [ / 3] = 1.20 c.Divider redesign speedup = 1 / [ / 10] = 1.10 What if both the adder and the multiplier are redesigned?

Generalized Amdahl’s Law Original running time of a program = 1 = f 1 + f f k New running time after the fraction f i is speeded up by a factor p i f 1 f 2 f k p 1 p 2 p k Speedup formula 1 S = f 1 f 2 f k p 1 p 2 p k If a particular fraction is slowed down rather than speeded up, use s j f j instead of f j / p j, where s j > 1 is the slowdown factor

Amdahl’s Law – limit to improvement Improving an aspect of a computer and expecting a proportional improvement in overall performance §1.8 Fallacies and Pitfalls Can’t be done! Example: multiply accounts for 80s/100s How much improvement in multiply performance to get 5× overall? Corollary: make the common case fast

Pitfall: MIPS as a Performance Metric MIPS: Millions of Instructions Per Second Doesn’t account for Differences in ISAs between computers Differences in complexity between instructions CPI varies between programs on a given CPU

Reporting Computer Performance Measured or estimated execution times for three programs. Time on machine X Time on machine Y Speedup of Y over X Program A Program B Program C All 3 programs Analogy: If a car is driven to a city 100 km away at 100 km/hr and returns at 50 km/hr, the average speed is not ( ) / 2 but is obtained from the fact that it travels 200 km in 3 hours.

Measured or estimated execution times for three programs. Time on machine X Time on machine Y Speedup of Y over X Program A Program B Program C Geometric mean does not yield a measure of overall speedup, but provides an indicator that at least moves in the right direction Comparing the Overall Performance Speedup of X over Y Arithmetic mean Geometric mean

Effect of Instruction Mix on Performance Consider two applications DC and RS and two machines M 1 and M 2 : ClassData Comp. Reactor Sim. M 1 ’s CPIM 2 ’s CPI A: Ld/Str 25% 32% B: Integer 32% 17% C: Sh/Logic 16% 2% D: Float 0% 34% E: Branch 19% 9% F: Other 8% 6% Find the effective CPI for the two applications on both machines.

Effect of Instruction Mix on Performance Consider two applications DC and RS and two machines M 1 and M 2 : ClassData Comp. Reactor Sim. M 1 ’s CPIM 2 ’s CPI A: Ld/Str 25% 32% B: Integer 32% 17% C: Sh/Logic 16% 2% D: Float 0% 34% E: Branch 19% 9% F: Other 8% 6% Find the effective CPI for the two applications on both machines. Solution CPI of DC on M 1 : 0.25       2.0 = 2.31 DC on M 2 : 2.54 RS on M 1 : 3.94 RS on M 2 : 2.89

Figure 3.10 Trends in processor performance and DRAM memory chip capacity (Moore’s law). Performance Trends and Obsolescence “Can I call you back? We just bought a new computer and we’re trying to set it up before it’s obsolete.”

Trend in computational performance per watt of power used in general- purpose processors and DSPs. Performance is Important, But It Isn’t Everything

Concluding Remarks Cost/performance is improving Due to underlying technology development Hierarchical layers of abstraction In both hardware and software Instruction set architecture The hardware/software interface Execution time: the best performance measure Power is a limiting factor Use parallelism to improve performance §1.9 Concluding Remarks

Chapter 2 Instructions: Language of the Computer MIPS instruction set instruction encoding converting c into MIPS programs recursive programs MIPS implementation and testing SPIM simulator

Instruction Set Collection of instructions of a computer Different computers have different instruction sets But with many aspects in common Early computers had very simple instruction sets Simplified implementation Many modern computers also have simple instruction sets §2.1 Introduction

The MIPS Instruction Set Used as the example throughout the book Stanford MIPS commercialized by MIPS Technologies ( Large share of embedded core market Applications in consumer electronics, network/storage equipment, cameras, printers, …

Just as first RISC processors were coming to market (around1986), Computer chronicles dedicated one of its shows to RISC. A link to this clip is: # David Patterson (one of the authors of the text) is among the people interviewed.

Arithmetic Operations Add and subtract, three operands Two sources and one destination add a, b, c # a gets b + c All arithmetic operations have this form Design Principle 1: Simplicity favors regularity Regularity makes implementation simpler Simplicity enables higher performance at lower cost §2.2 Operations of the Computer Hardware

Arithmetic Example C code: f = (g + h) - (i + j); Compiled MIPS code: add t0, g, h # temp t0 = g + h add t1, i, j # temp t1 = i + j sub f, t0, t1 # f = t0 - t1

Register Operands Arithmetic instructions use register operands MIPS has a 32 × 32-bit register file Use for frequently accessed data Numbered 0 to bit data called a “word” Assembler names $t0, $t1, …, $t9 for temporary values $s0, $s1, …, $s7 for saved variables Design Principle 2: Smaller is faster §2.3 Operands of the Computer Hardware

Register Operand Example C code: f = (g + h) - (i + j); f, …, j in $s0, …, $s4 Compiled MIPS code: add $t0, $s1, $s2 add $t1, $s3, $s4 sub $s0, $t0, $t1

Memory Operands Main memory used for composite data Arrays, structures, dynamic data To apply arithmetic operations Load values from memory into registers Store result from register to memory Memory is byte addressed Each address identifies an 8-bit byte Words are aligned in memory Address must be a multiple of 4 MIPS is Big Endian Most-significant byte at least address of a word c.f. Little Endian: least-significant byte at least address

Memory Operand Example 1 C code: g = h + A[8]; g in $s1, h in $s2, base address of A in $s3 Compiled MIPS code: Index 8 requires offset of 32 4 bytes per word lw $t0, 32($s3) # load word add $s1, $s2, $t0 offset base register

Memory Operand Example 2 C code: A[12] = h + A[8]; h in $s2, base address of A in $s3 Compiled MIPS code: Index 8 requires offset of 32 lw $t0, 32($s3) # load word add $t0, $s2, $t0 sw $t0, 48($s3) # store word

Registers vs. Memory Registers are faster to access than memory Operating on memory data requires loads and stores More instructions to be executed Compiler must use registers for variables as much as possible Only spill to memory for less frequently used variables Register optimization is important!

Immediate Operands Constant data specified in an instruction addi $s3, $s3, 4 No subtract immediate instruction Just use a negative constant addi $s2, $s1, -1 Design Principle 3: Make the common case fast Small constants are common Immediate operand avoids a load instruction

The Constant Zero MIPS register 0 ($zero) is the constant 0 Cannot be overwritten Useful for common operations E.g., move between registers add $t2, $s1, $zero

Unsigned Binary Integers Given an n-bit number Range: 0 to 2 n – 1 Example = 0 + … + 1× ×2 2 +1×2 1 +1×2 0 = 0 + … = Using 32 bits 0 to 4,294,967,295 §2.4 Signed and Unsigned Numbers

Twos-Complement Signed Integers Given an n-bit number Range: –2 n – 1 to +2 n – 1 – 1 Example = –1× × … + 1×2 2 +0×2 1 +0×2 0 = –2,147,483, ,147,483,644 = –4 10 Using 32 bits –2,147,483,648 to +2,147,483,647

Twos-Complement Signed Integers Bit 31 is sign bit 1 for negative numbers 0 for non-negative numbers –(–2 n – 1 ) can’t be represented Non-negative numbers have the same unsigned and 2s-complement representation Some specific numbers 0: … 0000 –1: … 1111 Most-negative: … 0000 Most-positive: … 1111

Signed Negation Complement and add 1 Complement means 1 → 0, 0 → 1 Example: negate = … –2 = … = …

Sign Extension Representing a number using more bits Preserve the numeric value In MIPS instruction set addi : extend immediate value lb, lh : extend loaded byte/halfword beq, bne : extend the displacement Replicate the sign bit to the left c.f. unsigned values: extend with 0s Examples: 8-bit to 16-bit +2: => –2: =>

Representing Instructions Instructions are encoded in binary Called machine code MIPS instructions Encoded as 32-bit instruction words Small number of formats encoding operation code (opcode), register numbers, … Regularity! Register numbers $t0 – $t7 are reg’s 8 – 15 $t8 – $t9 are reg’s 24 – 25 $s0 – $s7 are reg’s 16 – 23 §2.5 Representing Instructions in the Computer

MIPS R-format Instructions Instruction fields op: operation code (opcode) rs: first source register number rt: second source register number rd: destination register number shamt: shift amount (00000 for now) funct: function code (extends opcode) oprsrtrdshamtfunct 6 bits 5 bits

R-format Example add $t0, $s1, $s2 special$s1$s2$t00add = oprsrtrdshamtfunct 6 bits 5 bits

Hexadecimal Base 16 Compact representation of bit strings 4 bits per hex digit c d a1010e b1011f1111 Example: eca

MIPS I-format Instructions Immediate arithmetic and load/store instructions rt: destination or source register number Constant: –2 15 to – 1 Address: offset added to base address in rs Design Principle 4: Good design demands good compromises Different formats complicate decoding, but allow 32-bit instructions uniformly Keep formats as similar as possible oprsrtconstant or address 6 bits5 bits 16 bits

Logical Operations Instructions for bitwise manipulation OperationCJavaMIPS Shift left<< sll Shift right>>>>> srl Bitwise AND&& and, andi Bitwise OR|| or, ori Bitwise NOT~~ nor Useful for extracting and inserting groups of bits in a word §2.6 Logical Operations

Shift Operations shamt: how many positions to shift Shift left logical Shift left and fill with 0 bits sll by i bits multiplies by 2 i Shift right logical Shift right and fill with 0 bits srl by i bits divides by 2 i (unsigned only) oprsrtrdshamtfunct 6 bits 5 bits

AND Operations Useful to mask bits in a word Select some bits, clear others to 0 and $t0, $t1, $t $t2 $t $t0

OR Operations Useful to include bits in a word Set some bits to 1, leave others unchanged or $t0, $t1, $t $t2 $t $t0

NOT Operations Useful to invert bits in a word Change 0 to 1, and 1 to 0 MIPS has 3-operand NOR instruction a NOR b == NOT ( a OR b ) nor $t0, $t1, $zero $t $t0 Register 0: always read as zero

Conditional Operations Branch to a labeled instruction if a condition is true Otherwise, continue sequentially beq rs, rt, L1 if (rs == rt) branch to instruction labeled L1; bne rs, rt, L1 if (rs != rt) branch to instruction labeled L1; j L1 unconditional jump to instruction labeled L1 §2.7 Instructions for Making Decisions

Compiling If Statements C code: if (i==j) f = g+h; else f = g-h; f, g, … in $s0, $s1, … Compiled MIPS code: bne $s3, $s4, Else add $s0, $s1, $s2 j Exit Else: sub $s0, $s1, $s2 Exit: … Assembler calculates addresses

Compiling Loop Statements C code: while (save[i] == k) i += 1; i in $s3, k in $s5, address of save in $s6 Compiled MIPS code: Loop: sll $t1, $s3, 2 add $t1, $t1, $s6 lw $t0, 0($t1) bne $t0, $s5, Exit addi $s3, $s3, 1 j Loop Exit: …

More Conditional Operations Set result to 1 if a condition is true Otherwise, set to 0 slt rd, rs, rt if (rs < rt) rd = 1; else rd = 0; slti rt, rs, constant if (rs < constant) rt = 1; else rt = 0; Use in combination with beq, bne slt $t0, $s1, $s2 # if ($s1 < $s2) bne $t0, $zero, L # branch to L

Branch Instruction Design Why not blt, bge, etc? Hardware for <, ≥, … slower than =, ≠ Combining with branch involves more work per instruction, requiring a slower clock All instructions penalized! beq and bne are the common case This is a good design compromise

Signed vs. Unsigned Signed comparison: slt, slti Unsigned comparison: sltu, sltui Example $s0 = $s1 = slt $t0, $s0, $s1 # signed –1 < +1  $t0 = 1 sltu $t0, $s0, $s1 # unsigned +4,294,967,295 > +1  $t0 = 0