Download presentation
Presentation is loading. Please wait.
1
ECE 15B Computer Organization Spring 2010 Dmitri Strukov Lecture 3: Arithmetic Instructions Partially adapted from Computer Organization and Design, 4 th edition, Patterson and Hennessy, and classes taught by Patterson at Berkeley, Ryan Kastner at UCSB and Mary Jane Irwin at Penn State
2
Announcement TA office hours for Vivek were moved to Tuesday 11:00 am – 12:00 am Basics of logic design is in Appendix C (P+H) SPIM and reading status ECE 15B Spring 2010
3
Agenda Key concepts from last lecture & several new ones C operators and operands Variables in Assembly: Registers Addition and Subtraction in Assembly ECE 15B Spring 2010
4
Key Concepts from Last Lecture Synchronous circuits Clocking & Pipelining & Timing Diagram CPU simplified diagram ECE 15B Spring 2010
5
CPU Clocking Operation of digital hardware governed by a constant-rate clock Clock (cycles) Data transfer and computation Update state Clock period Clock period: duration of a clock cycle e.g., 250ps = 0.25ns = 250×10 –12 s Clock frequency (rate): cycles per second e.g., 4.0GHz = 4000MHz = 4.0×10 9 Hz ECE 15B Spring 2010
6
CPU Overview ECE 15B Spring 2010
7
… with muxes Can’t just join wires together Use multiplexers ECE 15B Spring 2010
8
… with muxes ECE 15B Spring 2010
9
Below the Program lw $t0, 0($2) lw $t1, 4($2) sw $t1, 0($2) sw $t0, 4($2) High Level Language Program (e.g., C) Assembly Language Program (e.g.,MIPS) Machine Language Program (MIPS) Hardware Architecture Description (e.g., block diagrams) Compiler Assembler Machine Interpretation temp = v[k]; v[k] = v[k+1]; v[k+1] = temp; 0000 1001 1100 0110 1010 1111 0101 1000 1010 1111 0101 1000 0000 1001 1100 0110 1100 0110 1010 1111 0101 1000 0000 1001 0101 1000 0000 1001 1100 0110 1010 1111 Logic Circuit Description (Circuit Schematic Diagrams) Architecture Implementation ECE 15B Spring 2010
10
Assembly Language ECE 15B Spring 2010 Basic job of a CPU: execute lots of instructions Instructions are the primitive operations that the CPU may execute Different CPUs implement different sets of instructions Instruction Set Architecture (ISA) is a set of instructions a particular CPU implements Examples: Intel 80x86 (Pentium 4), IBM/Motorola Power PC (Macintosh), MIPS, Intel IA64, ARM
11
Instruction Set Architectures ECE 15B Spring 2010 Early trend was to add more and more instructions to new CPU to do elaborate operations VAX architecture had an instruction to multiply polynomials RISC philosophy (Cocke IBM, Patterson, Hennessy, 1980s) RISC = Reduced Instruction Set Computing Keep the instruction set small and simple which makes it easier to build fast hardware Let software (compiler) do complicated operations by composing simpler ones
12
How to Access Performance of Instruction Set Architecture? ECE 15B Spring 2010
13
SPEC CPU Benchmark Programs used to measure performance – Supposedly typical of actual workload Standard Performance Evaluation Corp (SPEC) – Develops benchmarks for CPU, I/O, Web, … SPEC CPU2006 – Elapsed time to execute a selection of programs Negligible I/O, so focuses on CPU performance – Normalize relative to reference machine – Summarize as geometric mean of performance ratios CINT2006 (integer) and CFP2006 (floating-point)
14
CINT2006 for Opteron X4 2356 ECE 15B Spring 2010 NameDescriptionIC×10 9 CPITc (ns)Exec timeRef timeSPECratio perlInterpreted string processing2,1180.750.406379,77715.3 bzip2Block-sorting compression2,3890.850.408179,65011.8 gccGNU C Compiler1,0501.720.47248,05011.1 mcfCombinatorial optimization33610.000.401,3459,1206.8 goGo game (AI)1,6581.090.4072110,49014.6 hmmerSearch gene sequence2,7830.800.408909,33010.5 sjengChess game (AI)2,1760.960.483712,10014.5 libquantumQuantum computer simulation1,6231.610.401,04720,72019.8 h264avcVideo compression3,1020.800.4099322,13022.3 omnetppDiscrete event simulation5872.940.406906,2509.1 astarGames/path finding1,0821.790.407737,0209.1 xalancbmkXML parsing1,0582.700.401,1436,9006.0 Geometric mean11.7
15
Review: Technology Trends Uniprocessor Performance (SPECint) VAX : 1.25x/year 1978 to 1986 RISC + x86: 1.52x/year 1986 to 2002 RISC + x86: 1.20x/year 2002 to present 1.25x/year 1.52x/year 1.20x/year Performance (vs. VAX-11/780) 3X “Sea change” in chip design: multiple “cores” or processors per chip ECE 15B Spring 2010
16
MIPS Architecture MIPS – Semiconductor company that built one of the first commercial RISC architectures We will study the MIPS architecture in detail in this class Why MIPS instead of Intel 80x86 – MIPS is simple and elegant. Don’t want to get bogged down in gritty details – MIPS is widely used in embedded apps – There are more embedded computers than PCs ECE 15B Spring 2010
17
Assembly Variables: Registers Unlike HLL like C or Java, assembly cannot use variables – Why not? Keep hardware simple Assembly Operands are registers – Limited number of special locations built directly into the hardware – Operations can only be performed on these – Benefit: Since registers file is small, it is very fast ECE 15B Spring 2010
18
Assembly Variables: Registers Drawback: – Registers are in a hardware, there are a predetermined number of them Solution: – MIPS code must be very carefully put together to efficiently use registers 32 registers in MIPS – Smaller is faster Each MIPS register is 32 bits wide – Groups of 32 bits called a word in MIPS ECE 15B Spring 2010
19
Assembly Variables Registers are numbered from 0 to 31 Each Register can be referred to by number or name Number references – $0, $1, $2, …., $30, $31 ECE 15B Spring 2010
20
Assembly Variables: Registers By convention, each register also has a name to make it easier to code For now: $16 - $23 $s0 - $s7 (correspond to C variables) $8 - $15 $t0 - $t7 (correspond to temporary variables) Will explain other 16 register names later In general, use names to make your code more readable ECE 15B Spring 2010
21
C, Java Variables vs. Registers In C (and most High Level Languages) variables declared first and given a type – Example: int fahr, celcius; char a, b, c, d, e; – Each variable can only represent a value of the type it was declared as (cannot mic and match int and char variables) In assembly Language the registers have no type – Operation determines how register contents are treated ECE 15B Spring 2010
22
Comments in Assembly Another way to make your code more readable: comments Hash (#) is used for MIPS comments – Anything from hash mark to end of line is a comment and will be ignored – Note: different from C, comments have format /* comment */, so they can span many lines ECE 15B Spring 2010
23
Assembly Instructions In assembly language, each statement (called an instruction), executes exactly one of a short list of simple commands Unlike in C (and most other high level languages), each line of assembly code contains at most one instruction Instructions are related to operations (=,+,-, *,/) in C or Java ECE 15B Spring 2010
24
MIPS Syntax Instruction Syntax: [Label:] Op-code [oper. 1], [oper. 2], [oper.3], [#comment] (0) (1) (2) (3) (4) (5) – Where 1) operation name 2,3,4) operands 5) comments 0) label field is optional, will discuss later – For arithmetic and logic instruction 2) operand getting result (“destination”) 3) 1 st operand for operation (“source 1”) 4) 2 nd operand for operation (source 2” Syntax is rigid – 1 operator, 3 operands – Why? Keep hardware simple via regularity ECE 15B Spring 2010
25
Addition and Subtraction of Integers Addition in assembly – Example: add $s0, $s1, $s2 (in MIPS) Equivalent to: a = b + c (in C) Where MIPS registers $s0, $s1, $s2 are associated with C variables a, b, c Subtraction in Assembly – Example Sub $s3, $s4, S5(in MIPS) Equivalent to: d = e - f (in C) Where MIPS registers $s3, $s4, $s5 are associated with C variables d, e, f ECE 15B Spring 2010
26
Addition and Subtraction of Integers The following C statement in MIPS? a= b + c+ d - a Break into multiple instructions add $t0, $s1, $s2#temp = b + c add $t0, $t0, $s3# temp = temp + d sub $s0, $t0, $s4# a = temp – e – Notes: A single line of C may break up into several lines of MIPS Everything after the hash mark on each line is ignored (i.e. comments) ECE 15B Spring 2010
27
Addition and Subtraction of Integers How do we do this? f = (g + h) – (i + j) Use intermediate temporary registers add $t0, $s1, $s2#temp = g + h add $t1, $s3, $s4#temp = I + j sub $s0, $t0, $t1#f = (g+h)-(i+j) ECE 15B Spring 2010
28
Immediates Immediates are numerical constants They appear often in code, so there are special instructions for them Add immediate: addi $s0, $s1, 10# f= g + 10 (in C) – Where MIPS registers $s0 and $s1 are associated with C variables f and g – Syntax similar to add instruction, except that last argument is a number instead of register ECE 15B Spring 2010
29
Immediates There is no Subtract Immediate in MIPS: Why? – Remove redundant operations, i.e. if operation can be decomposed to into simpler ones exclude it from the set of instructions addi …, -X is equivalent to subi …, X so no subi Example addi $so, $s1, -10# f = g – 10 – where MIPS registers $s0 and $s1 are associated with C variables f and g ECE 15B Spring 2010
30
Register Zero One particular immediate, the number zero (o) appears very often in code – So define register zero ($0 or $zero) to always have the value 0 Example add $s0, S1, $zero # f = g Where MIPS registers $s0 and $s1 are associated with C variables f, g, Defined in hardware, so an instruction addi $zero, $zero, 5 will not do anything! ECE 15B Spring 2010
31
Additional Notes: CPU Time Performance improved by – Reducing number of clock cycles – Increasing clock rate – Hardware designer must often trade off clock rate against cycle count ECE 15B Spring 2010
32
Additional Notes: CPU Time Example Computer A: 2GHz clock, 10s CPU time Designing Computer B – Aim for 6s CPU time – Can do faster clock, but causes 1.2 × clock cycles How fast must Computer B clock be? ECE 15B Spring 2010
33
Additional Notes: Instruction Count and CPI Instruction Count for a program – Determined by program, ISA and compiler Average cycles per instruction – Determined by CPU hardware – If different instructions have different CPI Average CPI affected by instruction mix ECE 15B Spring 2010
34
Additional Notes: CPI Example Computer A: Cycle Time = 250ps, CPI = 2.0 Computer B: Cycle Time = 500ps, CPI = 1.2 Same ISA Which is faster, and by how much? A is faster… …by this much ECE 15B Spring 2010
35
Additional Notes: CPI in More Detail If different instruction classes take different numbers of cycles Weighted average CPI Relative frequency ECE 15B Spring 2010
36
Additional Notes: Pipelining Analogy Pipelined laundry: overlapping execution – Parallelism improves performance Four loads: Speedup = 8/3.5 = 2.3 Non-stop: Speedup = 2n/0.5n + 1.5 ≈ 4 = number of stages ECE 15B Spring 2010
37
Additional Notes: Pipeline Performance Single-cycle (T c = 800ps) Pipelined (T c = 200ps) ECE 15B Spring 2010
38
Conclusions In MIPS assembly language – Register replace C variables – One instruction (simple operation) per line – Simpler is faster ECE 15B Spring 2010
39
Review Instructions so far: add, addi, sub Registers so far C variables: $s0 - $s7 Temporary variables: $t0 - $t9 Zero: $zero ECE 15B Spring 2010
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.