Chapter 3 Instruction Set Architecture Advanced Computer Architecture COE 501.

Slides:



Advertisements
Similar presentations
Chapter 2 Instruction Set Principles. Computer Architecture’s Changing Definition 1950s to 1960s: Computer Architecture Course = Computer Arithmetic 1970s.
Advertisements

Instruction Set Design
CEG3420 Lec2.1 ©UCB Fall 1997 ISA Review CEG3420 Computer Design Lecture 2.
ISA Issues; Performance Considerations. Testing / System Verilog: ECE385.
1 Lecture 3: Instruction Set Architecture ISA types, register usage, memory addressing, endian and alignment, quantitative evaluation.
EECC551 - Shaaban #1 Lec # 2 Fall Instruction Set Architecture (ISA) “... the attributes of a [computing] system as seen by the programmer,
RISC / CISC Architecture By: Ramtin Raji Kermani Ramtin Raji Kermani Rayan Arasteh Rayan Arasteh An Introduction to Professor: Mr. Khayami Mr. Khayami.
INSTRUCTION SET ARCHITECTURES
1 ECE462/562 ISA and Datapath Review Ali Akoglu. 2 Instruction Set Architecture A very important abstraction –interface between hardware and low-level.
COMP381 by M. Hamdi 1 Instruction Set Architectures.
COMP3221: Microprocessors and Embedded Systems Lecture 2: Instruction Set Architecture (ISA) Lecturer: Hui Wu Session.
State Machines Timing Computer Bus Computer Performance Instruction Set Architectures RISC / CISC Machines.
Unit -II CPU Organization By- Mr. S. S. Hire. CPU organization.
Implementation of a Stored Program Computer
Operand Addressing and Instruction Representation
Linked Lists in MIPS Let’s see how singly linked lists are implemented in MIPS on MP2, we have a special type of doubly linked list Each node consists.
IT253: Computer Organization Lecture 4: Instruction Set Architecture Tonga Institute of Higher Education.
Instruction Set Architecture
CMP 301A Computer Architecture 1 Lecture 4. 2 Outline zISA Introduction zISA Classes yStack yAccumulator yRegister memory yRegister register/load store.
Dr Mohamed Menacer College of Computer Science and Engineering Taibah University CS-334: Computer.
Digital System Design 1 Lecture 3 Instruction Set Architecture Pradondet Nilagupta Fall 2000 (original notes from Prof. Mike Schulte)
Implementation of a Stored Program Computer ITCS 3181 Logic and Computer Systems 2014 B. Wilkinson Slides2.ppt Modification date: Oct 16,
Instruction Set Architecture Basics. Our Progress Done with levels 0 and 1 Seen multiple examples of level 2 Ready for ISA general principles.
Classifying GPR Machines TypeNumber of Operands Memory Operands Examples Register- Register 30 SPARC, MIPS, etc. Register- Memory 21 Intel 80x86, Motorola.
Chapter 5 A Closer Look at Instruction Set Architectures.
Chapter 8 CPU and Memory: Design, Implementation, and Enhancement The Architecture of Computer Hardware and Systems Software: An Information Technology.
Chapter Six Sun SPARC Architecture. SPARC Processor The name SPARC stands for Scalable Processor Architecture SPARC architecture follows the RISC design.
Instruction Set Architecture The portion of the machine visible to the programmer Issues: Internal storage model Addressing modes Operations Operands Encoding.
Computer Architecture EKT 422
COMPUTER ORGANIZATION AND ASSEMBLY LANGUAGE Lecture 19 & 20 Instruction Formats PDP-8,PDP-10,PDP-11 & VAX Course Instructor: Engr. Aisha Danish.
Oct. 25, 2000Systems Architecture I1 Systems Architecture I (CS ) Lecture 9: Alternative Instruction Sets * Jeremy R. Johnson Wed. Oct. 25, 2000.
M. Mateen Yaqoob The University of Lahore Spring 2014.
Operand Addressing And Instruction Representation Cs355-Chapter 6.
CS 211: Computer Architecture Lecture 2 Instructor: Morris Lancaster.
Chapter 10 Instruction Sets: Characteristics and Functions Felipe Navarro Luis Gomez Collin Brown.
Lecture 5 A Closer Look at Instruction Set Architectures Lecture Duration: 2 Hours.
DR. SIMING LIU SPRING 2016 COMPUTER SCIENCE AND ENGINEERING UNIVERSITY OF NEVADA, RENO Session 7, 8 Instruction Set Architecture.
Instruction Sets: Characteristics and Functions  Software and Hardware interface Machine Instruction Characteristics Types of Operands Types of Operations.
INSTRUCTION SET PRINCIPLES. Computer Architecture’s Changing Definition  1950s to 1960s: Computer Architecture Course = Computer Arithmetic  1970s to.
Instruction Set Architectures. Our Progress Done with levels 0 and 1 Seen multiple examples of level 2 Ready for ISA general principles.
Computer Architecture & Operations I
Computer Architecture & Operations I
A Closer Look at Instruction Set Architectures
A Closer Look at Instruction Set Architectures
Instruction Set Architectures: History and Issues
Advanced Computer Architecture 5MD00 / 5Z032 Instruction Set Design
Instruction Set Architectures
MIPS Assembly.
CS170 Computer Organization and Architecture I
Computer Organization and ASSEMBLY LANGUAGE
Instruction Set Principles
Chapter 2 Instruction Set Principles
ECEG-3202 Computer Architecture and Organization
Chapter 9 Instruction Sets: Characteristics and Functions
A Closer Look at Instruction Set Architectures Chapter 5
Computer Architecture
ECEG-3202 Computer Architecture and Organization
What is Computer Architecture?
Dr Hao Zheng Computer Sci. & Eng. U of South Florida
COMS 361 Computer Organization
What is Computer Architecture?
What is Computer Architecture?
CPU Structure CPU must:
Lecture 4: Instruction Set Design/Pipelining
CSE378 Introduction to Machine Organization
Chapter 10 Instruction Sets: Characteristics and Functions
Presentation transcript:

Chapter 3 Instruction Set Architecture Advanced Computer Architecture COE 501

Hot Topics in Computer Architecture 1950s and 1960s: –Computer Arithmetic 1970 and 1980s: –Instruction Set Design –ISA Appropriate for Compilers 1990s: –Design of CPU –Design of memory system –Instruction Set Extensions 2000s: –Computer Arithmetic –Design of I/O system –Parallelism

Instruction Set Architecture “Instruction set architecture is the structure of a computer that a machine language programmer must understand to write a correct (timing independent) program for that machine.” –Source: IBM in 1964 when introducing the IBM 360 architecture, which eliminated 7 different IBM instruction sets. The instruction set architecture is also the machine description that a hardware designer must understand to design a correct implementation of the computer.

Instruction Set Architecture instruction set High level language code : C, C++, Java, Fortan, hardware The instruction set architecture serves as the interface between software and hardware. It provides the mechanism by which the software tells the hardware what should be done. Assembly language code: architecture specific statements Machine language code: architecture specific bit patterns software compiler assembler

ISA Metrics Orthogonality –No special registers, few special cases, all operand modes available with any data type or instruction type Completeness –Support for a wide range of operations and target applications Regularity –No overloading for the meanings of instruction fields Streamlined –Resource needs easily determined Ease of compilation (or assembly language programming) Ease of implementation

Instruction Set Design Issues Instruction set design issues include: –Where are operands stored? »registers, memory, stack, accumulator –How many explicit operands are there? »0, 1, 2, or 3 –How is the operand location specified? »register, immediate, indirect,... –What type & size of operands are supported? »byte, int, float, double, string, vector... –What operations are supported? »add, sub, mul, move, compare...

Evolution of Instruction Sets Single Accumulator (EDSAC 1950) Accumulator + Index Registers (Manchester Mark I, IBM 700 series 1953) Separation of Programming Model from Implementation High-level Language BasedConcept of a Family (B )(IBM ) General Purpose Register Machines Complex Instruction SetsLoad/Store Architecture RISC (Vax, Intel ) (CDC 6600, Cray ) (Mips,Sparc,88000,IBM RS6000, )

Classifying ISAs Accumulator (before 1960): 1 addressadd Aacc  acc + mem[A] Stack (1960s to 1970s): 0 addressaddtos  tos + next Memory-Memory (1970s to 1980s): 2 addressadd A, Bmem[A]  mem[A] + mem[B] 3 addressadd A, B, C mem[A]  mem[B] + mem[C] Register-Memory (1970s to present): 2 addressadd R1, AR1  R1 + mem[A] load R1, AR1  mem[A] Register-Register (Load/Store) (1960s to present): 3 addressadd R1, R2, R3R1  R2 + R3 load R1, R2R1  mem[R2] store R1, R2mem[R1]  R2

Accumulator Architectures Instruction set: add A, sub A, mult A, div A,... load A, store A Example: A*B - (A+C*B) load B mul C add A store D load A mul B sub D BB*CA+B*CA A*Bresult

Accumulators: Pros and Cons Pros –Very low hardware requirements –Easy to design and understand Cons –Accumulator becomes the bottleneck –Little ability for parallelism or pipelining –High memory traffic

Stack Architectures Instruction set: add, sub, mult, div,... push A, pop A Example: A*B - (A+C*B) push A push B mul push A push C push B mul add sub AB A A*B A A C A ACBB*CA+B*Cresult

Stacks: Pros and Cons Pros –Good code density (implicite top of stack) –Low hardware requirements –Easy to write a simpler compiler for stack architectures Cons –Stack becomes the bottleneck –Little ability for parallelism or pipelining –Data is not always at the top of stack when need, so additional instructions like TOP and SWAP are needed –Difficult to write an optimizing compiler for stack architectures

Memory-Memory Architectures Instruction set: (3 operands)add A, B, Csub A, B, C mul A, B, C (2 operands)add A, Bsub A, B mul A, B Example: A*B - (A+C*B) –3 operands 2 operands mul D, A, Bmov D, A mul E, C, Bmul D, B add E, A, Emov E, C sub E, D, Emul E, B add E, A sub E, D

Memory-Memory: Pros and Cons Pros –Requires fewer instructions (especially if 3 operands) –Easy to write compilers for (especially if 3 operands) Cons –Very high memory traffic (especially if 3 operands) –Variable number of clocks per instruction –With two operands, more data movements are required

Register-Memory Architectures Instruction set: add R1, A sub R1, A mul R1, B load R1, Astore R1, A Example: A*B - (A+C*B) load R1, A mul R1, B/*A*B*/ store R1, D load R2, C mul R2, B/*C*B*/ add R2, A/*A + CB*/ sub R2, D/*AB - (A + C*B)*/

Memory-Register: Pros and Cons Pros –Some data can be accessed without loading first –Instruction format easy to encode –Good code density Cons –Operands are not equivalent (poor orthogonal) –Variable number of clocks per instruction –May limit number of registers

Load-Store Architectures Instruction set: add R1, R2, R3 sub R1, R2, R3 mul R1, R2, R3 load R1, &Astore R1, &Amove R1, R2 Example: A*B - (A+C*B) load R1, &A load R2, &B load R3, &C mul R7, R3, R2/*C*B */ add R8, R7, R1 /* A + C*B */ mul R9, R1, R2/* A*B */ sub R10, R9, R8/*A*B - (A+C*B) */

Load-Store: Pros and Cons Pros –Simple, fixed length instruction encodings –Instructions take similar number of cycles –Relatively easy to pipeline and make superscalar Cons –Higher instruction count –Not all instructions need three operands –Dependent on good compiler

Registers: Advantages and Disadvantages Advantages –Faster than cache or main memory (no addressing mode or tags) –Deterministic (no misses) –Can replicate (multiple read ports) –Short identifier (typically 3 to 8 bits) –Reduce memory traffic Disadvantages –Need to save and restore on procedure calls and context switch –Can’t take the address of a register (for pointers) –Fixed size (can’t store strings or structures efficiently) –Compiler must manage –Limited number

Big Endian Addressing With Big Endian addressing, the byte binary address x... x00 is in the most significant position (big end) of a 32 bit word (IBM, Motorolla, Sun, HP).

Little Endian Addressing With Little Endian addressing, the byte binary address x... x00 is in the least significant position (little end) of a 32 bit word (DEC, Intel). Programmers/protocols should be careful when transferring binary data between Big Endian and Little Endian machines

Operand Alignment An access to an operand of size s bytes at byte address A is said to be aligned if A mod s = 0

Unrestricted Alignment If the architecture does not restrict memory accesses to be aligned then –Software is simple –Hardware must detect misalignment and make two memory accesses –Expensive logic to perform detection –Can slow down all references –Sometimes required for backwards compatibility

Restricted Alignment If the architecture restricts memory accesses to be aligned then –Software must guarantee alignment –Hardware detects misalignment access and traps –No extra time is spent when data is aligned Since we want to make the common case fast, having restricted alignment is often a better choice, unless compatibility is an issue.

Types of Addressing Modes (VAX) Addressing ModeExampleAction 1.Register directAdd R4, R3R4 <- R4 + R3 2.Immediate Add R4, #3R4 <- R DisplacementAdd R4, 100(R1)R4 <- R4 + M[100 + R1] 4.Register indirect Add R4, (R1)R4 <- R4 + M[R1] 5.IndexedAdd R4, (R1 + R2)R4 <- R4 + M[R1 + R2] 6.Direct Add R4, (1000)R4 <- R4 + M[1000] 7.Memory IndirectAdd <- R4 + M[M[R3]] 8.AutoincrementAdd R4, (R2)+R4 <- R4 + M[R2] R2 <- R2 + d 9.AutodecrementAdd R4, (R2)-R4 <- R4 + M[R2] R2 <- R2 - d 10. ScaledAdd R4, 100(R2)[R3]R4 <- R4 + M[100 + R2 + R3*d] Studies by [Clark and Emer] indicate that modes 1-4 account for 93% of all operands on the VAX.

Types of Operations Arithmetic and Logic:AND, ADD Data Transfer:MOVE, LOAD, STORE ControlBRANCH, JUMP, CALL SystemOS CALL, VM Floating PointADDF, MULF, DIVF DecimalADDD, CONVERT StringMOVE, COMPARE Graphics(DE)COMPRESS

80x86 Instruction Frequency

Relative Frequency of Control Instructions Design hardware to handle branches quickly, since these occur most frequently

Frequency of Operand Sizes on 32-bit Load-Store Machines For floating-point want good performance for 64 bit operands. For integer operations want good performance for 32 bit operands. Recent architectures also support 64-bit integers

Instruction Encoding Variable –Instruction length varies based on opcode and address specifiers –For example, VAX instructions vary between 1 and 53 bytes, while x86 instruction vary between 1 and 17 bytes. –Good code density, but difficult to decode and pipeline Fixed –Only a single size for all instructions –For example, DLX, MIPS, Power PC, Sparc all have 32 bit instructions –Not as good code density, but easier to decode and pipeline Hybrid –Have multiple format lengths specified by the opcode –For example, IBM 360/370 –Compromise between code density and ease of decode

Compilers and ISA Compiler Goals –All correct programs compile correctly –Most compiled programs execute quickly –Most programs compile quickly –Achieve small code size –Provide debugging support Multiple Source Compilers –Same compiler can compiler different languages Multiple Target Compilers –Same compiler can generate code for different machines

Compilers Phases Compilers use phases to manage complexity –Front end »Convert language to intermediate form –High level optimizer »Procedure inlining and loop transformations –Global optimizer »Global and local optimization, plus register allocation –Code generator (and assembler) »Dependency elimination, instruction selection, pipeline scheduling

Designing ISA to Improve Compilation Provide enough general purpose registers to ease register allocation ( more than 16). Provide regular instruction sets by keeping the operations, data types, and addressing modes orthogonal. Provide primitive constructs rather than trying to map to a high-level language. Simplify trade-offs among alternatives. Allow compilers to help make the common case fast.