Computer Architecture CSCE 350 Rabi Mahapatra Spring 2017
Course Contents
Course Contents Organization of a computer Assembly language Design of a computer Verilog Future architectures
How does the course fit into the curriculum? CPSC 4xx compiler OS CPSC 483 Computer Sys Design CPSC 350 Computer Architecture ELEN 220 Intro to Digital Design ELEN 248 Intro to DGTL Sym Design
Grading Two exams 55% Assignments and quizzes 10% Lab & Projects 35%
Course Information Labs MIPS (Assembly Programming), Verilog (HDL) Projects Project 1: MIPS Projects 2 & 3: Verilog (Datapath Design) TA & Peer Teacher: TBA
Course Information [contd…] Course Webpage http://courses.cs.tamu.edu/rabi/csce350/index.htmlhttp://people.tamu.edu/~d.dharanidhar/CSCE350_Spring2017_Fellow/SyllabusSpring2017.htm
Computer Architecture Some people define computer architecture as the combination of the instruction set architecture (what the executable can see of the hardware, the functional interface) and the machine organization (how the hardware implements the instruction set architecture)
What is Computer Architecture ? Applications Instruction set architecture Machine organization Operating System Compiler Firmware Instr. Set Proc. I/O system Datapath & Control Digital Design Circuit Design Layout Many levels of abstraction
Factors affecting ISA ??? Technology Cleverness Applications History Programming Languages Applications Computer Architecture Cleverness Operating Systems History
ISA: Critical Interface software instruction set hardware Examples: 80x86 50,000,000 vs. MIPS 5500,000 ???
The Big Picture Since 1946 all computers have had 5 components!!! Processor Input Control Memory Datapath Output Since 1946 all computers have had 5 components!!!
Technology Trends Processor logic capacity: about 30% per year clock rate: about 20% per year Memory DRAM capacity: about 60% per year (4x every 3 years) Memory speed: about 10% per year Cost per bit: improves about 25% per year Disk capacity: about 60% per year Total use of data: 100% per 9 months! Network Bandwidth Bandwidth increasing more than 100% per year!
Technology Trends Microprocessor Logic Density DRAM chip capacity DRAM Year Size 1980 64 Kb 1983 256 Kb 1986 1 Mb 1989 4 Mb 1992 16 Mb 1996 64 Mb 1999 256 Mb 2002 1 Gb In ~1985 the single-chip processor (32-bit) and the single-board computer emerged In the 2002+ timeframe, these may well look like mainframes compared single-chip computer (maybe 2 chips)
Technology Trends Smaller feature sizes – higher speed, density ECE/CS 752; copyright J. E. Smith, 2002 (Univ. of Wisconsin)
Technology Trends Number of transistors doubles every 18 months (amended to 24 months) ECE/CS 752; copyright J. E. Smith, 2002 (Univ. of Wisconsin)
Levels of Representation High Level Language Program temp = v[k]; v[k] = v[k+1]; v[k+1] = temp; Compiler Assembly Language Program lw $15, 0($2) lw $16, 4($2) sw $16, 0($2) sw $15, 4($2) Assembler Machine Language Program 0000 1001 1100 0110 1010 1111 0101 1000 1010 1111 0101 1000 0000 1001 1100 0110 1100 0110 1010 1111 0101 1000 0000 1001 0101 1000 0000 1001 1100 0110 1010 1111 Machine Interpretation Control Signal Specification ALUOP[0:3] <= InstReg[9:11] & MASK
Execution Cycle Instruction Fetch Decode Operand Execute Result Store Next Obtain instruction from program storage Determine required actions and instruction size Locate and obtain operand data Compute result value or status Deposit results in storage for later use Determine successor instruction
What next? How does better technology help to improve performance? How can we quantify the gain? The MIPS architecture MIPS instructions First contact with assembly language programming
The Role of Performance
Performance Response time: time between start and finish of the task (aka execution time) Throughput: total amount of work done in a given time
Question Suppose that we replace the processor in a computer by a faster model Does this improve the response time? How about the throughput?
Question Suppose we add an additional processor to a system that uses separate processors for separate tasks. Does this improve the response time? Does this improve the throughput?
CPU Performance CPU Execution time for a program = CPU Clock Cycles for a program X Clock Cycle time = CPU Clock Cycles for a program / Clock rate
CPU Performance Example A program runs in 10 secs on computer A with a 4 GHz clock. We try to build computer B, that runs this program in 6 secs where computer B requires 1.2 times as many clock cycles as computer A for this program. What clock rate should we target?
CPU Performance CPU clock cycles = instruction count X CPI CPU time = instruction count X CPI X Clock cycle time inst count Cycle time CPI
Cycles Per Instruction (Throughput) “Average Cycles per Instruction” CPI = (CPU Time * Clock Rate) / Instruction Count = Cycles / Instruction Count “Instruction Frequency”
Example: Calculating CPI Base Machine (Reg / Reg) Op Freq Cycles CPI(i) (% Time) ALU 50% 1 .5 (33%) Load 20% 2 .4 (27%) Store 10% 2 .2 (13%) Branch 20% 2 .4 (27%) 1.5 Typical Mix of instruction types in program
" X is n times faster than Y" means Performance (Absolute) Performance Relative Performance " X is n times faster than Y" means n =
MIPS as a performance measure MIPS = Instruction count / (Execution time X 10^6)
Amdahl’s Law The execution time after making an improvement to the system is given by Exec time after improvement = I/A + E I = execution time affected by improvement A = amount of improvement E = execution time unaffected
Amdahl’s Law Suppose that program runs 100 seconds on a machine and multiplication instructions take 80% of the total time. How much do I have to improve the speed of multiplication if I want my program to run 5 times faster? 20 seconds = 80 seconds/n + 20 seconds => it is impossible!