Chapter 1: An Introduction to Processor Design (Pusan National University, Department of Computer Engineering)


1 Chapter 1: An Introduction to Processor Design
  Pusan National University, Department of Computer Engineering

2 1.1 Processor Architecture & Organization
  2015-10-11, PNU Computer Eng.
  - All modern general-purpose computers employ the "stored-program concept"
    - IAS computer by von Neumann at the Institute for Advanced Study, Princeton (1946)
    - First implemented in the "Baby" machine at the Univ. of Manchester, England (1948)
  - [Figure 1.1] The state in a stored-program digital computer

3 1.1 Processor Architecture & Organization
  - 50 years of development:
    - Processor performance has risen and cost has fallen => cost-effective computers (the principles of operation have not changed much)
  - Most of the improvement comes from:
    - Advances in electronics technology
      - Vacuum tubes -> transistors -> ICs -> VLSI
    - New insights:
      - Virtual memory (early 1960s)
      - Cache memory
      - Pipelining
      - RISC

4 1.2 Abstraction in Hardware Design
  - Transistors (the elementary component)
    - Logically act as inverters
  - Logic gates
    - CMOS NAND gate (using 4 transistors)
      - If A = B = Vdd, output = Vss
      - If either A or B (or both) = Vss, output = Vdd
      - => output = not(A.B)
    - Transistor circuit, logic symbol, truth table
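The NAND behavior on this slide can be checked with a small model. This is a minimal sketch, assuming the usual CMOS arrangement (two n-type transistors in series pulling down, two p-type transistors in parallel pulling up), which the slide does not spell out; the names `cmos_nand` and `truth_table` are illustrative only.

```python
VDD, VSS = 1, 0  # supply rails modelled as logic 1 and logic 0


def cmos_nand(a, b):
    """Model a 2-input CMOS NAND gate as its two transistor networks."""
    # Assumed pull-down network: two n-type transistors in series,
    # which conduct only when both inputs are at Vdd.
    pull_down = (a == VDD) and (b == VDD)
    # Assumed pull-up network: two p-type transistors in parallel,
    # which conduct when either input is at Vss.
    pull_up = (a == VSS) or (b == VSS)
    # Exactly one network should conduct for any valid input pair.
    assert pull_up != pull_down
    return VSS if pull_down else VDD


def truth_table():
    """Return (A, B, output) rows for all four input combinations."""
    return [(a, b, cmos_nand(a, b)) for a in (VSS, VDD) for b in (VSS, VDD)]


if __name__ == "__main__":
    for a, b, out in truth_table():
        print(a, b, out)
```

The output is Vss only in the row where A = B = Vdd, matching output = not(A.B).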

5 1.2 Abstraction in Hardware Design
  - The gate abstraction
    - Simplifies the design of circuits with a great number of transistors
    - Removes the need to know that the gate is built from transistors
    - At the function level, the design is free from the implementation technology
      - E.g. field-effect transistors, bipolar transistors, etc.
      - However, performance differences exist
  - Levels of abstraction
    - Transistors
    - Gates, memory cells
    - Adders, MUXes, decoders, registers
    - ALUs, shifters, memory blocks
    - Processors, peripherals, memories
    - ICs
    - PCBs
    - PCs, controllers, mobile phones

6 1.3 MU0 – a simple processor
  - A simple processor can be built from a few basic components
    - PC (program counter)
    - ACC (accumulator)
    - ALU (arithmetic-logic unit)
    - IR (instruction register)
    - Instruction decoder, control logic
  - The MU0 instruction set
    - A 16-bit machine with a 12-bit address space (4K x 2 bytes = 8 Kbytes of memory)
    - Instructions: 16 bits long (opcode: 4 bits, address field: 12 bits)
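The instruction format above can be illustrated by splitting a 16-bit word into its two fields. This is a sketch under one assumption not stated on the slide: the 4-bit opcode occupies the most significant bits and the 12-bit address the least significant bits. The helper names `decode` and `encode` are hypothetical.

```python
OPCODE_BITS = 4
ADDRESS_BITS = 12
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1   # 0x0FFF

MEMORY_WORDS = 1 << ADDRESS_BITS         # 4K addressable 16-bit words
MEMORY_BYTES = MEMORY_WORDS * 2          # 2 bytes per word = 8 Kbytes


def decode(instruction):
    """Split a 16-bit instruction word into (opcode, address)."""
    if not 0 <= instruction <= 0xFFFF:
        raise ValueError("MU0 instructions are 16 bits wide")
    # Assumption: opcode in the top 4 bits, address in the low 12 bits.
    return instruction >> ADDRESS_BITS, instruction & ADDRESS_MASK


def encode(opcode, address):
    """Pack (opcode, address) back into a 16-bit instruction word."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode must fit in 4 bits")
    if not 0 <= address <= ADDRESS_MASK:
        raise ValueError("address must fit in 12 bits")
    return (opcode << ADDRESS_BITS) | address
```

For example, `decode(0x2123)` separates the word into opcode 2 and address 0x123, and `MEMORY_BYTES` reproduces the 8 Kbyte figure from the slide.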

7 1.3 MU0 – a simple processor
  - [Table 1.1] The MU0 instruction set
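Table 1.1 itself is not reproduced in this transcript. The sketch below assumes the eight instructions and opcode values under which MU0 is usually described (LDA, STO, ADD, SUB, JMP, JGE, JNE, STP as opcodes 0 to 7); treat that mapping, the reset value PC = 0, and the helper names `load` and `run` as assumptions rather than as the slide's content.

```python
# Assumed opcode assignments (not shown in this transcript).
LDA, STO, ADD, SUB, JMP, JGE, JNE, STP = range(8)


def load(program, size=4096):
    """Return a 4K-word memory image with `program` placed at address 0."""
    if len(program) > size:
        raise ValueError("program does not fit in memory")
    return list(program) + [0] * (size - len(program))


def run(mem, max_steps=10_000):
    """Execute from address 0 until STP; return the final accumulator value."""
    pc, acc = 0, 0                      # assumption: both are zero after reset
    for _ in range(max_steps):
        ir = mem[pc]                    # fetch
        pc = (pc + 1) & 0xFFF           # PC wraps within the 12-bit address space
        op, s = ir >> 12, ir & 0xFFF    # decode: 4-bit opcode, 12-bit address
        if op == LDA:
            acc = mem[s]
        elif op == STO:
            mem[s] = acc
        elif op == ADD:
            acc = (acc + mem[s]) & 0xFFFF
        elif op == SUB:
            acc = (acc - mem[s]) & 0xFFFF
        elif op == JMP:
            pc = s
        elif op == JGE:
            if acc < 0x8000:            # sign bit clear: ACC >= 0 in two's complement
                pc = s
        elif op == JNE:
            if acc != 0:
                pc = s
        elif op == STP:
            return acc
        else:
            raise ValueError(f"undefined opcode {op}")
    raise RuntimeError("program did not reach STP")


if __name__ == "__main__":
    # LDA 5; ADD 6; STO 7; STP; (unused); data 20; data 22; result slot
    memory = load([0x0005, 0x2006, 0x1007, 0x7000, 0, 20, 22, 0])
    print(run(memory), memory[7])
```

The example program adds the words at addresses 5 and 6 and stores the sum at address 7, using only the accumulator as working state.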

8 1.3 MU0 – a simple processor
  - Datapath
    - A register transfer level (RTL) design style based on registers, MUXes, and so on
    - [Figure 1.5] MU0 datapath example

9 RTL design
  - [Figure 1.6] MU0 register transfer level organization
  - Control signals:
    - Enables on all of the registers
    - Function select lines to the ALU
    - Select control lines for the two MUXes
    - Control for a tri-state driver to send the ACC value to memory
    - MEMrq (memory request)
    - RnW (read/write control line)

10 1.4 Instruction set design
  - To build a high-performance processor (beyond the MU0 instruction set), instruction set design is important
  - 4-address instructions (the most general form)
    - Ex) add d, s1, s2, next_i ; d := s1 + s2
  - 3-address instructions
    - Make the address of the next instruction implicit using the PC (except for branches)
    - Ex) add d, s1, s2 ; d := s1 + s2

11 1.4 Instruction set design
  - 2-address instructions
    - Make the destination register the same as one of the source registers
    - Ex) add d, s1 ; d := d + s1
  - 1-address instructions
    - The accumulator (AC) is used as the destination
    - Ex) add s1 ; AC := AC + s1
  - 0-address instructions (using a stack)
    - Ex) add ; tos := tos + next on stack
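To compare the formats, the same computation d := s1 + s2 can be written for each style. The toy interpreters below are a sketch only: the `mov`, `load`, `store`, `push` and `pop` mnemonics are hypothetical helpers (the slides show only `add`), and memory is modelled as a Python dict keyed by operand name.

```python
def run_three_address(program, mem):
    """add d, s1, s2  ;  d := s1 + s2"""
    for op, d, s1, s2 in program:
        if op != "add":
            raise ValueError(op)
        mem[d] = mem[s1] + mem[s2]
    return mem


def run_two_address(program, mem):
    """add d, s1  ;  d := d + s1 (the destination doubles as a source)"""
    for op, d, s in program:
        if op == "mov":        # hypothetical copy instruction
            mem[d] = mem[s]
        elif op == "add":
            mem[d] = mem[d] + mem[s]
        else:
            raise ValueError(op)
    return mem


def run_one_address(program, mem):
    """add s1  ;  AC := AC + s1 (implicit accumulator)"""
    ac = 0
    for op, s in program:
        if op == "load":       # hypothetical: AC := mem[s]
            ac = mem[s]
        elif op == "add":
            ac = ac + mem[s]
        elif op == "store":    # hypothetical: mem[s] := AC
            mem[s] = ac
        else:
            raise ValueError(op)
    return mem


def run_zero_address(program, mem):
    """add  ;  tos := tos + next on stack (operands are implicit)"""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":       # hypothetical: push mem[operand]
            stack.append(mem[instr[1]])
        elif op == "add":
            tos, nxt = stack.pop(), stack.pop()
            stack.append(tos + nxt)
        elif op == "pop":      # hypothetical: mem[operand] := popped value
            mem[instr[1]] = stack.pop()
        else:
            raise ValueError(op)
    return mem


def fresh_memory():
    return {"s1": 2, "s2": 3, "d": 0}


THREE = [("add", "d", "s1", "s2")]
TWO = [("mov", "d", "s1"), ("add", "d", "s2")]
ONE = [("load", "s1"), ("add", "s2"), ("store", "d")]
ZERO = [("push", "s1"), ("push", "s2"), ("add",), ("pop", "d")]
```

Fewer addresses per instruction means shorter instructions but more of them: one instruction in the 3-address form becomes two, three and four instructions in the 2-, 1- and 0-address forms of this example.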

12 1.4 Instruction set design
  - Addressing modes
    - Immediate addressing: the instruction contains the data itself
    - Absolute addressing: the instruction contains the full address of the data
    - Indirect addressing: the instruction contains the address of a location that contains the address of the data
    - Register addressing: the data is in a register
    - Register indirect addressing
    - Index addressing
    - Stack addressing
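The modes above differ only in how many lookups separate the instruction field from the data. The sketch below makes that explicit. The slide gives no definitions for register indirect or index addressing, so the ones used here (address held in a register; base address plus index register) are common textbook definitions adopted as assumptions, and stack addressing is left out. `fetch_operand` is a hypothetical name.

```python
def fetch_operand(mode, field, regs, mem):
    """Return the operand selected by `field` under the given addressing mode.

    regs and mem are lists indexed by register number and memory address.
    """
    if mode == "immediate":
        return field                      # the field is the data
    if mode == "absolute":
        return mem[field]                 # the field is the data's address
    if mode == "indirect":
        return mem[mem[field]]            # the field addresses a pointer to the data
    if mode == "register":
        return regs[field]                # the field names the register holding the data
    if mode == "register_indirect":
        return mem[regs[field]]           # assumed: the register holds the data's address
    if mode == "indexed":
        base, index_reg = field           # assumed: base address + index register
        return mem[base + regs[index_reg]]
    raise ValueError(f"unknown addressing mode: {mode}")


if __name__ == "__main__":
    regs = [2, 1]
    mem = [7, 3, 50, 60, 70, 80]
    for mode, field in [("immediate", 5), ("absolute", 2), ("indirect", 1),
                        ("register", 0), ("register_indirect", 0),
                        ("indexed", (3, 1))]:
        print(mode, fetch_operand(mode, field, regs, mem))
```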

13 1.4 Instruction set design
  - Control flow instructions
    - Branch, jump
    - Conditional branch
    - Subroutine calls & returns
    - System calls
      - Branch to an operating system routine
    - Exceptions
      - Error handling

14 1.5 Processor design trade-offs
  - CISC vs RISC
    - CISC
      - Aims to reduce the semantic gap between high-level languages and machine instructions
      - Single instructions carry out complex sequences of operations
      - Intended to make the compiler's job easy
    - RISC
      - ARM's middle name comes from RISC (Advanced RISC Machines)
      - Reducing the semantic gap is not the right way to make an efficient computer
  - [Table 1.3] Typical dynamic instruction usage

15 1.5 Processor design trade-offs
  - Data movement between registers and memory: almost half of executed instructions
  - Control flow such as branches & procedure calls: almost a quarter
  - Arithmetic operations: only 15%
    - Complex arithmetic instructions therefore do not help much
  - The most important techniques for making processors go faster: pipelining and cache memory

16 1.5 Processor design trade-offs
  - Pipelines: the stages of instruction execution
    - Fetch: fetch the instruction from memory
    - Decode: decode the instruction
    - REG: get operands from the register bank
    - ALU: combine the operands to form the result or a memory address
    - MEM: access memory for an operand, if necessary
    - RES: write the result back to the register bank
  - [Figure 1.13] Pipelined instruction execution
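The benefit of overlapping these six stages can be tabulated. The following is a minimal sketch of an ideal pipeline, assuming every stage takes exactly one clock cycle, one instruction enters per cycle, and no hazards occur; the function names are illustrative.

```python
STAGES = ["fetch", "dec", "reg", "ALU", "mem", "res"]


def pipeline_schedule(n_instructions, stages=STAGES):
    """Map each cycle (from 1) to the (instruction, stage) pairs active in it."""
    schedule = {}
    for i in range(n_instructions):
        for s, stage in enumerate(stages):
            # Instruction i (0-based) enters stage s in cycle i + s + 1.
            schedule.setdefault(i + s + 1, []).append((i + 1, stage))
    return schedule


def total_cycles(n_instructions, depth=len(STAGES), pipelined=True):
    """Cycles to complete n instructions with or without overlap."""
    if n_instructions == 0:
        return 0
    if pipelined:
        # Fill the pipeline once, then complete one instruction per cycle.
        return depth + n_instructions - 1
    return depth * n_instructions


if __name__ == "__main__":
    for cycle, active in sorted(pipeline_schedule(3).items()):
        print(cycle, active)
    print(total_cycles(3), total_cycles(3, pipelined=False))
```

Under these assumptions three instructions finish in 8 cycles instead of 18, and for long instruction sequences the throughput approaches one instruction per cycle.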

17 1.5 Processor design trade-offs
  - Pipeline hazards
    - Read-after-write hazard (data hazard)
      - The result of one instruction is used as an operand by the next instruction => the second instruction must stall until the result is available
    - [Figure 1.14] Read-after-write pipeline hazard
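A read-after-write stall can be counted with a simple timing model. This sketch assumes the six-stage pipeline from the previous slide, no forwarding paths, and that an operand can only be read in a cycle strictly after the producing instruction's RES stage; real designs differ on all three points, so the exact stall counts are illustrative. `count_stalls` is a hypothetical helper.

```python
REG_STAGE = 2  # stage index at which operands are read (fetch = 0)
RES_STAGE = 5  # stage index at which the result is written back


def count_stalls(program):
    """Count stall cycles for a list of (dest, src1, src2, ...) instructions."""
    ready = {}   # register name -> first cycle in which its new value is readable
    issue = 0    # cycle in which the current instruction is fetched
    stalls = 0
    for dest, *sources in program:
        reg_cycle = issue + REG_STAGE
        needed = max((ready.get(src, 0) for src in sources), default=0)
        delay = max(0, needed - reg_cycle)
        stalls += delay
        # In-order pipeline: a stall shifts this instruction and all later ones.
        issue += delay
        ready[dest] = issue + RES_STAGE + 1
        issue += 1
    return stalls


if __name__ == "__main__":
    dependent = [("r1", "r2", "r3"), ("r4", "r1", "r5")]
    independent = [("r1", "r2", "r3"), ("r4", "r5", "r6")]
    print(count_stalls(dependent), count_stalls(independent))
```

Placing an unrelated instruction between the producer and the consumer reduces the stall by one cycle in this model, which is why instruction scheduling can hide part of the hazard.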

18 1.5 Processor design trade-offs
  - Branch hazard
    - Solutions:
      - Compute the branch target earlier (if possible)
      - Compute the target speculatively
      - Delayed branch
    - [Figure 1.15] Pipelined branch behavior
  - Pipeline efficiency
    - The deeper the pipeline, the worse the hazard problems get, which favors the RISC approach
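The effect of pipeline depth on branch cost can be estimated with a one-line model. This is a rough sketch: it assumes a base rate of one cycle per instruction, that every control-flow instruction pays the full penalty, and penalty values chosen purely for illustration. The 0.25 branch fraction comes from the "almost a quarter" figure on the dynamic instruction usage slide; `effective_cpi` is a hypothetical name.

```python
def effective_cpi(branch_fraction, branch_penalty_cycles, base_cpi=1.0):
    """Average cycles per instruction when each branch wastes a fixed penalty."""
    if not 0.0 <= branch_fraction <= 1.0:
        raise ValueError("branch_fraction must be between 0 and 1")
    if branch_penalty_cycles < 0:
        raise ValueError("branch_penalty_cycles must be non-negative")
    return base_cpi + branch_fraction * branch_penalty_cycles


if __name__ == "__main__":
    # Hypothetical penalties: a deeper pipeline discards more fetched instructions.
    for penalty in (1, 2, 4):
        print(penalty, effective_cpi(0.25, penalty))
```

Doubling the penalty from 2 to 4 cycles raises the estimate from 1.5 to 2.0 cycles per instruction, which is the sense in which deeper pipelines make the branch problem worse.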

19 1.6 RISC
  - In 1980, Patterson: the RISC I project
  - RISC I architecture
    - Fixed (32-bit) instruction size with few formats
    - Load-store architecture:
      - Instructions that process data operate only on registers
      - Separate instructions access memory
    - A large register bank (thirty-two 32-bit registers) to allow the load-store architecture to operate efficiently
  - RISC I organization
    - Hard-wired instruction decode logic
    - Pipelined execution
    - Single-cycle execution
  - RISC I advantages
    - A smaller die size
    - A shorter development time
    - Higher performance (controversial)

