CS 136 1 CS136, Advanced Architecture Instruction Set Architecture.

Slides:



Advertisements
Similar presentations
Instruction Set Design
Advertisements

1 Lecture 3: MIPS Instruction Set Today’s topic:  More MIPS instructions  Procedure call/return Reminder: Assignment 1 is on the class web-page (due.
Review of the MIPS Instruction Set Architecture. RISC Instruction Set Basics All operations on data apply to data in registers and typically change the.
Chapter 3 Instruction Set Architecture Advanced Computer Architecture COE 501.
CEG3420 Lec2.1 ©UCB Fall 1997 ISA Review CEG3420 Computer Design Lecture 2.
ISA Issues; Performance Considerations. Testing / System Verilog: ECE385.
1 Lecture 3: Instruction Set Architecture ISA types, register usage, memory addressing, endian and alignment, quantitative evaluation.
Lecture 3: Instruction Set Principles Kai Bu
EECC551 - Shaaban #1 Lec # 2 Fall Instruction Set Architecture (ISA) “... the attributes of a [computing] system as seen by the programmer,
RISC / CISC Architecture By: Ramtin Raji Kermani Ramtin Raji Kermani Rayan Arasteh Rayan Arasteh An Introduction to Professor: Mr. Khayami Mr. Khayami.
INSTRUCTION SET ARCHITECTURES
1 ECE462/562 ISA and Datapath Review Ali Akoglu. 2 Instruction Set Architecture A very important abstraction –interface between hardware and low-level.
10/9: Lecture Topics Starting a Program Exercise 3.2 from H+P Review of Assembly Language RISC vs. CISC.
ELEN 468 Advanced Logic Design
Computer Organization and Architecture
Microprocessors General Features To be Examined For Each Chip Jan 24 th, 2002.
PART 4: (2/2) Central Processing Unit (CPU) Basics CHAPTER 13: REDUCED INSTRUCTION SET COMPUTERS (RISC) 1.
COMP3221: Microprocessors and Embedded Systems Lecture 2: Instruction Set Architecture (ISA) Lecturer: Hui Wu Session.
S. Barua – CPSC 440 CHAPTER 2 INSTRUCTIONS: LANGUAGE OF THE COMPUTER Goals – To get familiar with.
Lecture 5 Sept 14 Goals: Chapter 2 continued MIPS assembly language instruction formats translating c into MIPS - examples.
Microprocessors Introduction to RISC Mar 19th, 2002.
What is an instruction set?
(6.1) Central Processing Unit Architecture  Architecture overview  Machine organization – von Neumann  Speeding up CPU operations – multiple registers.
Reduced Instruction Set Computers (RISC) Computer Organization and Architecture.
Chapter 5: ISAs In MARIE, we had simple instructions –4 bit op code followed by either 12 bit address for load, store, add, subt, jump 2 bit condition.
Operand Addressing and Instruction Representation
Instruction Set Architecture
Machine Instruction Characteristics
Dr Mohamed Menacer College of Computer Science and Engineering Taibah University CS-334: Computer.
Classifying GPR Machines TypeNumber of Operands Memory Operands Examples Register- Register 30 SPARC, MIPS, etc. Register- Memory 21 Intel 80x86, Motorola.
Computer architecture Lecture 11: Reduced Instruction Set Computers Piotr Bilski.
Chapter 5 A Closer Look at Instruction Set Architectures.
1 Instruction Set Architecture (ISA) Alexander Titov 10/20/2012.
Instruction Set Architecture The portion of the machine visible to the programmer Issues: Internal storage model Addressing modes Operations Operands Encoding.
Computer Architecture and Organization
Computer Architecture EKT 422
COMPUTER ORGANIZATION AND ASSEMBLY LANGUAGE Lecture 19 & 20 Instruction Formats PDP-8,PDP-10,PDP-11 & VAX Course Instructor: Engr. Aisha Danish.
Crosscutting Issues: The Rôle of Compilers Architects must be aware of current compiler technology Compiler Architecture.
ECEG-3202 Computer Architecture and Organization Chapter 7 Reduced Instruction Set Computers.
Reduced Instruction Set Computers. Major Advances in Computers(1) The family concept —IBM System/ —DEC PDP-8 —Separates architecture from implementation.
Lecture 04: Instruction Set Principles Kai Bu
Chapter 10 Instruction Sets: Characteristics and Functions Felipe Navarro Luis Gomez Collin Brown.
Processor Structure and Function Chapter8:. CPU Structure  CPU must:  Fetch instructions –Read instruction from memory  Interpret instructions –Instruction.
Lecture 5 A Closer Look at Instruction Set Architectures Lecture Duration: 2 Hours.
DR. SIMING LIU SPRING 2016 COMPUTER SCIENCE AND ENGINEERING UNIVERSITY OF NEVADA, RENO Session 7, 8 Instruction Set Architecture.
MIPS Assembly.
Central Processing Unit Architecture
A Closer Look at Instruction Set Architectures
ELEN 468 Advanced Logic Design
A Closer Look at Instruction Set Architectures
A Closer Look at Instruction Set Architectures: Expanding Opcodes
The University of Adelaide, School of Computer Science
Instructions - Type and Format
Lecture 4: MIPS Instruction Set
The University of Adelaide, School of Computer Science
ECEG-3202 Computer Architecture and Organization
Chapter 9 Instruction Sets: Characteristics and Functions
A Closer Look at Instruction Set Architectures Chapter 5
Computer Instructions
Computer Architecture
ECEG-3202 Computer Architecture and Organization
Evolution of ISA’s ISA’s have changed over computer “generations”.
UCSD ECE 111 Prof. Farinaz Koushanfar Fall 2018
Introduction to Microprocessor Programming
Evolution of ISA’s ISA’s have changed over computer “generations”.
CPU Structure CPU must:
Lecture 4: Instruction Set Design/Pipelining
Presentation transcript:

CS CS136, Advanced Architecture Instruction Set Architecture

CS Types of ISAs Stack –Implicit operands (top of stack) –Heavy memory traffic –Limited ability to access operands at will –Obsolete Accumulator –Implicit register operand (“accumulator”) –One memory operand –Insufficient temporaries –Obsolete General-purpose register –Multiple registers –Several variations

CS GPR Architectures Memory-memory –CISC idea –Usually allows any operand to be in register as well Register-memory –Example: x86 –Can do one operand in register, one in memory, or 2 in regs Register-register –Only design used in modern machines –Lots of registers ⇒ fast flexible operand access –Simplicity of hardware –Compiler has full flexibility in register usage

CS Five Ways to Do C = A + B STACK PUSH A PUSH B ADD POP C ACCUM LOAD A ADD B STORE C MEM-MEM ADD C,A,B REG-MEM LOAD R1,A ADD R1,B STORE R1,C REG-REG LOAD R1,A LOAD R2,B ADD R3,R1,R2 STORE R3,C

CS Memory Addressing Originally just word addressing 8-bit bytes and byte addressing introduced on IBM 360 series Brief experiments with bit addressing (bad idea) Unaligned accesses not worth supporting Some machines byte-address but only load/store a word at a time –Turned out to be bad design decision –Too many programs do string processing 1 character at a time –May need to revisit in future (32-bit characters?) Modern RISC designs allow short load/store, but not short arithmetic

CS Endian-ness The word is “Endian”, not “Indian” Reference to Gulliver’s Travels Little-Endian invented by Digital Equipment on the PDP-11 –Mathematically more elegant –Horrible for humans –“It seemed like a good idea at the time” –Should be banished from the face of the Earth Some machines can switch endianness with a control bit –This idea is even stupider than the original

CS Addressing Modes How can an instruction reference memory? Early days: absolute address in instruction –Led to instruction modification –Improvement: “Indirection” picked up absolute location, used it as final address Minimum necessary today: follow pointer in register –Clumsy if only option Fanciest conceivable: *(R1+S*R2+constant), with either or both of R1 and R2 autoincremented or autodecremented as side effect, either before or after instruction –No machine went quite this far –But VAX came close

CS Addressing Modes (cont’d) What’s actually useful? Need to follow pointers: can restrict to registers –ADD R1,(R2) –Better: LOAD R1,R2 (like MIPS) Frequent stack access ⇒ register + constant useful Immediates needed for built-in constants Access to globals ⇒ absolute memory addresses –(We’ll see that that’s painful) PC-relative modes –Used to be needed for data; not in modern systems –Still needed for calls and branches Absolute addresses no longer needed for branches –Can always emulate with PC-relative, since PC known –Still available on some architectures

CS Operand Types and Sizes Type usually implies size Integers can safely be widened to word size –Shrink again when stored –Takes advantage of two’s-complement representation Single-precision FP gives different results than double-precisions ⇒ Necessary to support both widths –Some FPUs can do two SP operations in parallel Older machines allowed “packed” decimal (2 digits per byte) –x86 supports with DAA (Decimal Add Adjust) instruction –Still useful in business world, though dying 32 bits standard these days, 64 bits coming –128 some day?

CS Operations Provided Only one instruction truly needed: SJ –Subtract A from B, giving C; if result is < 0, jump to D –It’s Turing-complete! Practical machines need a bit more at minimum: –Arithmetic and logical (add, multiply, divide?, and, or, …) –Data movement (load/store, move between registers) –Control (conditional/unconditional branch, procedure call and return, trap to OS) –System control (return from interrupt, manage VM, set unprivileged mode, access I/O devices) Other builtins can be useful: –Basic floating point »Bad x86 design idea: sin, sqrt, etc.! –Decimal –String –Vector, graphics

CS Control Flow Addressing modes are important –PC-relative means code can run at any virtual address –Useful for dynamically linked (shared) libraries Pointer-following jump needed for returns –Also useful for switch statements, function pointers, virtual functions, and shared libraries How to specify condition for conditional branches? –Condition code as side effect of every instruction »Boils down to extra register »Spurious dependencies in pipeline –Condition register explicitly set by comparison –Compare as part of branch »Adds delay slots in pipeline

CS Encodings Variable-length instructions –Highly efficient (few wasted bits) –Allows complex specifications (e.g., x86 addressing modes) –Usually means misaligned instruction fetch –Greatly complicates fetch/decode units Fixed-length instructions –May limit number of registers –Usually very few instruction formats –Wastes space but gains speed (e.g., only aligned fetches) –Limits width of immediate operands

CS The Fight for Bits How wide should instruction be? –Wider ⇒ can encode more registers, more options –Wider ⇒ bigger programs, more memory bandwidth –Bigger programs ⇒ fewer cache hits Things you need to encode: –Operation code (16 to 1000 instructions) –Operands (at least one, normally two or three) –Immediate operands –Memory offsets –Branch targets –Branch conditions –Conditional operations (e.g., conditional load, add)

CS Two or Three Operands? In favor of three: –Smaller code size –No clobbered operands ⇒ fewer copies or reloads –Setting R0 to zero allows fewer operations supported in ALU In favor of two: –Can address more registers

CS How to Decide All These Questions? Slide rules at 50 paces? Analysis wars –Look at existing designs, existing programs –“Recompile” programs for hypothetical architecture »Analyze size of resulting program »Run through simulator to see how it performs –Impractical approach »Writing compiler back ends is expensive »Simulators are slow –instead, make projections based on existing object code

CS Example of Bad DEC VAX had three “auto” addressing modes: autopostincrement, autopredecrement, and indirect autopostincrement What happened to indirect autopredecrement? –Analyzed output of BLISS compiler on many programs –Language didn’t provide way to express autopredecrement –Concluded it wasn’t necessary –Very different result if had analyzed C! *--p1 = a[--i];

CS Example of Difficult Analysis: imm16 How big should an immediate be? Easy analysis: examine existing code –Calculate frequency of various widths –Analyze tradeoff of using those bits for other purposes Problem: analyzed architecture affects frequency of different widths –E.g., Alpha has only 16 bits, so you’ll never see over 16! –Alternative: look for multi-instruction sequences that effectively use more than 16 bits »Hard to find (compiler pipeline scheduling) »Compiler will stand on head, use sneaky tricks to avoid generating extra instructions –Need for wider constants depends on architecture »E.g., MIPS needs them when jumping to shared libraries

CS

CS Interaction with Compilers Nearly all modern code generated by compilers Architect must make compiler’s job easier –Lots of registers –Orthogonal instruction set –Few side effects –Instructions and addressing modes matched to language constructs »But NOT attempt to implement them in detail! »Primitives are better than “solutions” even when solutions are correct –Good support for stack, globals, and pointers –Support for both compile-time and run-time binding –Don’t ask compiler to predict dynamic information (e.g., branch targets) –Don’t provide features language can’t express »Example pro and con: vector architectures

CS The MIPS64 Architecture Extension of MIPS32 Data path widened to 64 bits –Still 32-bit instructions –Still only 32 registers Most instructions have “D” as prefix to indicate 64-bit version

CS MIPS Instruction Formats Opcode 6 rs 5 rt 5 rd 5 shamt 5 funct 6 R-Type Instruction I-Type Instruction Opcode 6 rs 5 rt 516 Immediate J-Type Instruction Opcode 626 Offset inserted into PC

CS I-Type Instructions Encodes loads, stores (all widths), immediate ALU ops Also conditional branches (rt unused) Opcode 6 rs 5 rt 516 Immediate

CS R-Type Instructions Opcode 6 rs 5 rt 5 rd 5 shamt 5 funct 6 Register-register ALU operations –“funct” encodes the ALU operation: add, sub, etc. –Opcode chooses operands, special registers, sizes, etc. –Conditional moves Handles special registers, floating point, …

CS J-Type Instructions Opcode 626 Offset inserted into PC Jump, jump and link Trap, return from exception

CS MIPS Control Flow Unconditional jump substitutes low bits of PC –NOT addition! –Exceptionally bad on 64-bit architecture, where 36 bits unchanged No built-in stack –Subroutine call stores return in register –Callee must save on stack if necessary –Reduces overall cycle time –Ultra-efficient for leaf functions Conditional branches only test against zero –Complex tests (e.g., <) store Z/NZ result in a register –We’ve seen how this improves the pipeline Conditional moves can eliminate many branches –Feature of many modern architectures

CS MIPS Floating Point Floating point was originally coprocessor ⇒ Separate FP registers –Special instructions to move to/from integer registers MIPS64 (but not 32) has paired single operations –Two SP numbers pass through DP ALU simultaneously MIPS64 also has multiply-add in one instruction –Useful in signal processing (multimedia)

CS Fallacies and Pitfalls PITFALL: Instruction designed to support feature in some language –Examples: PDP-11/45 MARK, VAX CALLS, IBM 360 ED/EDMK –Why is this bad? »Easy to get wrong (PDP-11 MARK instruction) »Easy to make inefficient (VAX CALLS) »Languages evolve, hardware doesn‘t

CS Fallacies and Pitfalls (2) FALLACY: Typical programs exist –We wish! PITFALL: Ignoring the compiler –Design better code size, based on bad compiler –Good compiler can blow your idea out of the water FALLACY: Flawed architectures can’t succeed –Ummm, x86? –Every architecture has drawbacks FALLACY: You (YOU!) can design a flawless architecture –Always tradeoffs –Always something new to learn

CS Summary Instruction encoding is important Don’t forget to provide what the compiler needs –This is NOT what you think the compiler needs! Addresses will only get wider Data will only get wider –Including characters Cleverness to improve bandwidth (e.g., MADD) RISC is here to stay