“Today, less than $500 will purchase a personal computer that has more performance, more main memory, and more disk storage than a computer bought in 1985 for 1 million dollars.” section 1.1

Figure 1.1 Growth in processor performance since the mid-1980s section 1.1

Why such rapid growth?
- Innovation in computer design: reduced instruction set computers (RISC), exploitation of instruction-level parallelism, use of caches
- Advances in the technology used to build computers: integrated circuit technology supported mass production of microprocessors (allowed a CPU on one chip)
- Elimination of assembly language programming, together with vendor-independent operating systems (Unix, Linux), made it less risky for vendors to introduce a new architecture
section 1.1

But the era of huge performance gains is over
- Annual processor performance improvement has dropped from about 50% (since the mid-1980s) to about 20% currently
- Little instruction-level parallelism left to exploit
- Memory latency is almost unchanged
section 1.1

Where is the industry headed?
- Thread-level parallelism (TLP): a thread is a separate process with its own instructions and data, typically part of a parallel program
- Data-level parallelism (DLP): the same operation applied to multiple pieces of data simultaneously
- Note: unlike instruction-level parallelism, which is exploited without programmer intervention, TLP and DLP require the programmer to write parallel code
section 1.1

The changing face of computing
- 1960s: large mainframes costing millions of dollars, run by multiple operators; used for data processing and scientific computing
- 1970s: minicomputers, time-sharing, terminals
section 1.2

The changing face of computing (continued)
- 1980s: desktop computing via microprocessors; servers providing file storage and access, larger memory, and more computing power
- 1990s (and beyond!): the Internet and World Wide Web (need for powerful servers), handheld computing devices (PDAs), digital consumer electronics (video games, DVD players, TiVo, satellite receivers)
section 1.2

Three computing markets: desktop computing
- $500 PCs to $5,000 workstations
- Market driven to optimize price-performance (how much bang for the buck)
- Consumers want high performance at low price
section 1.2

Three computing markets: servers
- Provide file storage and access, and enough memory and computing power to support many simultaneous users
- Emergence of the World Wide Web increased their popularity
- Key requirements: availability, scalability, efficient throughput
section 1.2

Three computing markets: embedded computers
- Computers lodged in other devices: microwaves, washing machines, printers, network switches, PDAs, ATMs, etc.
- Very wide range of processing power and cost: from 8-bit processors costing less than a dime to high-end game-system processors costing hundreds of dollars
- Real-time performance requirement: an absolute maximum execution time
- Need to minimize memory (to contain cost) and power
section 1.2

What about supercomputers?
- A supercomputer is a machine with significant floating-point performance, designed for specific scientific applications; it costs millions of dollars
- Clusters of desktop computers have largely overtaken conventional supercomputers because they are cheaper and more scalable
section 1.2

Instruction Set Architecture (ISA)
- The portion of the computer visible to the programmer or compiler writer
- What is visible: instruction types and formats, general-purpose registers, addressing modes, memory addressing
section 1.3

Classifying ISAs
- Classified by the location of operands
- Five possibilities: stack, accumulator, memory-memory, register-memory, and register-register architectures
section 1.3

Stack architecture
- Operands are implicitly on top of the stack
- Code sequence for C = A + B:
  Push A
  Push B
  Add
  Pop C
- Each of these instructions modifies the TOS register (TOS = address of the operand on top of the stack) in addition to the obvious calculation
- Example: the JVM
section 1.3
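The stack-machine sequence above can be sketched as a tiny interpreter (a minimal sketch; the named memory cells and operand values are illustrative, not any real ISA):

```python
# Minimal stack-machine interpreter for the sequence: Push A, Push B, Add, Pop C
memory = {"A": 2, "B": 3, "C": 0}  # illustrative named memory cells
stack = []

def push(addr):
    # Push the value at memory[addr] onto the evaluation stack
    stack.append(memory[addr])

def add():
    # Pop the two implicit operands, push their sum back onto the stack
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)

def pop(addr):
    # Pop the top of the stack into memory[addr]
    memory[addr] = stack.pop()

# C = A + B
push("A"); push("B"); add(); pop("C")
print(memory["C"])  # 5
```

Note how no instruction names an operand location except the loads and stores — the Add finds both inputs implicitly on the stack, which is exactly what makes stack code compact.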

Accumulator architecture
- One operand is implicitly the accumulator
- Code sequence for C = A + B:
  Load A
  Add B
  Store C
- Note: the accumulator holds either an input operand or the result
section 1.3

Memory-memory architecture
- All operands are explicit and reside in memory
- Code sequence for C = A + B:
  Add C, A, B
- Not found in any modern machine
section 1.3

Register-memory architecture
- Operands are explicit: either a register or a memory location
- Any instruction can access memory
- Code sequence for C = A + B:
  Load R1, A
  Add R3, R1, B
  Store R3, C
- Example: the 80x86 (x86)
section 1.3

Register-register architecture
- Only load and store instructions can access memory; also called a load-store architecture
- Code sequence for C = A + B:
  Load R1, A
  Load R2, B
  Add R3, R1, R2
  Store R3, C
- Example: MIPS
section 1.3

Machines designed since 1980
- Nearly all are load-store architectures. Why?
- How many registers? Newer machines tend to have more registers than older ones: as instruction-level parallelism increased, so did the number of registers needed
section 1.3

Memory Addressing
- How is a memory address interpreted? Almost always as a byte address
- Bytes, halfwords, words, and often doublewords can be accessed
- How are bytes ordered within a word?
  - Little endian: the low-order byte has the lowest address
  - Big endian: the high-order byte has the lowest address
section 1.3

Little-endian machine
- It makes sense that the low-order byte has the lowest address
- But strings appear in reverse order in memory dumps and registers
(byte-layout figure omitted)
section 1.3

Big-endian machine
- The low-order byte is in the rightmost position of the word, but has the highest address
(byte-layout figure omitted)
section 1.3
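The two byte orders can be observed directly with Python's struct module, which packs the same 32-bit value either way:

```python
import struct

value = 0x01020304  # a 32-bit value whose four bytes are easy to tell apart

little = struct.pack("<I", value)  # little endian: low-order byte at the lowest address
big    = struct.pack(">I", value)  # big endian: high-order byte at the lowest address

print(little.hex())  # 04030201
print(big.hex())     # 01020304
```

Reading the hex dumps left to right (lowest address first) shows exactly the reversal described above: the little-endian dump starts with the low-order byte 0x04, the big-endian dump with the high-order byte 0x01.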

Alignment
- In most computers, accesses to objects larger than a byte must be aligned
- An access to an object of size s at address A is aligned if A mod s = 0
- Why this restriction? Memory is typically aligned on a word or doubleword boundary; unaligned accesses require multiple memory accesses and are thus slower
section 1.3
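The alignment rule A mod s = 0 is a one-line check (the addresses below are illustrative):

```python
def is_aligned(address, size):
    """An access of `size` bytes at `address` is aligned if address mod size == 0."""
    return address % size == 0

print(is_aligned(0x1000, 4))  # True:  a 4-byte word on a word boundary
print(is_aligned(0x1002, 4))  # False: this word access straddles a word boundary
```

The second access would need two word-sized memory fetches (the bytes at 0x1000-0x1003 and 0x1004-0x1007) plus merging, which is why unaligned accesses are slower or forbidden outright.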

Addressing Modes
- The mode indicates where an operand can be found (a register, part of the instruction, or a memory location)
- Effective address calculation: uses the address field of an instruction to determine a memory location
section 1.3

80x86 addressing modes
- Register
- Immediate
- Absolute
- Indirect
- Base plus displacement
- Indexed (two forms)
- Scaled indexed (four forms)
section 1.3

MIPS addressing modes
- Register
- Immediate
- Displacement
section 1.3
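The effective-address arithmetic behind two of the modes above can be sketched numerically (register names and contents are illustrative):

```python
# Illustrative register file contents
regs = {"r1": 0x1000, "r2": 0x0008}

def displacement(base_reg, disp):
    # Displacement mode, e.g. a MIPS load of the form  lw r3, 16(r1):
    # effective address = contents of base register + constant displacement
    return regs[base_reg] + disp

def scaled_indexed(base_reg, index_reg, scale, disp):
    # x86-style scaled-indexed mode, e.g.  [r1 + r2*4 + 16]:
    # effective address = base + index * scale + displacement
    return regs[base_reg] + regs[index_reg] * scale + disp

print(hex(displacement("r1", 16)))             # 0x1010
print(hex(scaled_indexed("r1", "r2", 4, 16)))  # 0x1030
```

The richer x86 mode folds an array-indexing computation (base + index * element size) into a single instruction, at the cost of a more complex encoding; MIPS instead keeps only the simple base-plus-displacement form.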

Type and Size of Operands
- How is the type of an operand designated?
  - As part of the operand (an old technique)
  - As part of the opcode (e.g., add versus addf)
section 1.3

Typical types for desktops/servers
- Character (8-bit ASCII or 16-bit Unicode)
- Halfword (16 bits)
- Integer (32-bit word, almost always two's complement)
- Single-precision floating point (1 word, almost always the IEEE 754 standard)
- Double-precision floating point (2 words, IEEE 754)
- The 80x86 also supports extended double precision (80 bits)
section 1.3

More types for desktops/servers
- Packed decimal (binary-coded decimal): four bits encode a single decimal digit, two digits per byte
- Used in business applications to get exact decimal values (0.1 in decimal is a repeating binary fraction)
section 1.3
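Packed-decimal encoding, and the binary-fraction problem it avoids, can both be demonstrated in a few lines (the packing helper is an illustrative sketch, not any machine's exact format):

```python
def pack_bcd(digits):
    """Pack a string of decimal digits two per byte (high nibble first),
    as in packed-decimal / BCD storage."""
    if len(digits) % 2:
        digits = "0" + digits  # pad to an even number of digits
    return bytes((int(hi) << 4) | int(lo)
                 for hi, lo in zip(digits[::2], digits[1::2]))

print(pack_bcd("1234").hex())  # '1234': each decimal digit is one 4-bit nibble
print(0.1 + 0.2 == 0.3)        # False: 0.1 has no finite binary representation
```

Because each decimal digit is stored exactly, a BCD sum of monetary amounts never picks up the rounding error visible in the binary floating-point comparison on the last line.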

Operations in the Instruction Set
Categories:
- Arithmetic and logical
- Data transfer
- Control
- System
- Floating point
- Decimal (packed decimal)
- String
- Graphics (pixel and vertex operations)
section 1.3

Instructions for Control Flow
- No consistent terminology: branch, jump, transfer
- This book: jump = unconditional; branch = conditional
- Four types: conditional branches, jumps, procedure calls, procedure returns
section 1.3

Addressing Modes for Control Flow Instructions
- The target is usually explicitly specified (major exception: procedure returns)
- The most common way to specify the target is as a displacement from the PC (PC-relative)
- PC-relative mode works only if the target can be calculated at compile time
- Register-indirect jump: the target address is in a register (calculated at run time)
section 1.3
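PC-relative target arithmetic can be sketched numerically (byte-addressed, illustrative values; real ISAs differ in whether the displacement counts bytes or instructions, and in which PC value it is added to):

```python
def branch_target(pc, displacement):
    # Target of a PC-relative branch: the current PC plus a signed displacement.
    # A signed displacement lets one encoding reach both forward and backward.
    return pc + displacement

print(hex(branch_target(0x4000, 0x40)))   # forward branch:  0x4040
print(hex(branch_target(0x4000, -0x10)))  # backward branch: 0x3ff0
```

Because the displacement is small and position-independent, PC-relative branches keep code compact and relocatable; only targets unknown at compile time need the register-indirect form.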

Why indirect jumps?
- Case/switch statements
- Virtual functions (dynamic binding of a call to a function)
- Function pointers: functions passed as arguments
- Dynamically shared libraries: a library loaded and linked at run time
section 1.3
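All four uses share one mechanism: the jump target lives in data chosen at run time. A switch statement compiled to a jump table can be sketched like this (the handler functions are hypothetical):

```python
# A dict of functions stands in for a compiler-generated jump table:
# the case value indexes the table, and the call through the fetched
# entry is the indirect jump.
def case_a(): return "handled A"
def case_b(): return "handled B"

jump_table = {0: case_a, 1: case_b}

def switch(selector):
    target = jump_table[selector]  # fetch the target address from memory
    return target()                # indirect jump to it

print(switch(0))  # handled A
print(switch(1))  # handled B
```

A virtual-function call works the same way, except the "table" is the object's vtable and the selector is the method slot; in every case the hardware just needs a jump whose target comes from a register.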

80x86 branches
- Conditional branches test condition-code bits set by a previous arithmetic instruction
- Direct jumps:
  jmp .L1 (unconditional)
  je .L1 (conditional, based on the value of ZF)
- Indirect jumps:
  jmp *%eax (target is in %eax)
  jmp *(%eax) (target is in memory)
- call sub pushes the return address onto the stack
section 1.3

MIPS branches
- Conditional branches test the contents of a register:
  beqz r3, target (branch if r3 equals zero)
- The return address is saved in a register:
  jal sub (return address automatically saved in r31)
  jalr r3 (target is in r3; return address saved in r31)
- Unconditional jumps:
  j target (target encoded in the instruction)
  jr r3 (target is in r3)
section 1.3

Encoding an instruction set: competing forces
- Desire to support as many registers and addressing modes as possible
- Desire to keep instruction size, and thus program size, as small as possible
- Desire for instructions to be encoded in a way that is easy to pipeline
section 1.3

Encoding an instruction set
- Fixed encoding (MIPS):
  - Addressing mode implied by the opcode
  - All instructions are the same size
  - Easier to decode; easier to pipeline instruction execution
- Variable encoding (80x86):
  - Addressing mode is explicit in the operand
  - Instructions have different lengths
  - All addressing modes work with all operands
  - Programs are smaller (important when memory was at a premium)
  - Individual instructions vary in size and in the amount of work they do
section 1.3
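Fixed encoding means every field sits at a known bit position, which is what makes decoding simple. A sketch of packing the fields of a MIPS R-type instruction (6-bit opcode, three 5-bit register fields, 5-bit shift amount, 6-bit function code, per the standard MIPS32 format):

```python
def encode_r_type(opcode, rs, rt, rd, shamt, funct):
    """Pack the six fixed-position fields of a MIPS R-type instruction
    into one 32-bit word: opcode|rs|rt|rd|shamt|funct."""
    return ((opcode << 26) | (rs << 21) | (rt << 16) |
            (rd << 11) | (shamt << 6) | funct)

# add $3, $1, $2  ->  opcode 0, funct 0x20 in the standard MIPS32 encoding
word = encode_r_type(0, rs=1, rt=2, rd=3, shamt=0, funct=0x20)
print(f"{word:08x}")  # 00221820
```

A decoder can extract any field with a single shift and mask, without first examining the opcode to learn the instruction's length, which is exactly the property variable-length x86 encodings give up in exchange for density.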

Implementation technologies
- Integrated circuit logic technology: the number of transistors on a chip increases about 40-55% per year
- Semiconductor DRAM (primary memory): density increases about 40% per year
- Magnetic disks: density recently improving by more than 30% per year
- Network technology: recently improving quickly because of interest in the World Wide Web
section 1.4

Performance trends
- Bandwidth (throughput): total amount of work done in a given time (e.g., megabytes per second for a disk transfer)
- Latency (response time): time between the start and completion of an event (e.g., milliseconds for a disk access)
- Figure 1.8 shows that bandwidth improves much more rapidly than latency
section 1.4

Figure 1.8

Impact of transistors and wires
- Feature size: the size of a transistor or wire in either the x or y dimension
- As transistors decrease in size:
  - The power needed for correct transistor operation decreases
  - Chip density increases
  - Wire delay plays a greater role in chip performance
- The high rate of improvement in transistor density drove the rapid advance from 4-bit to 64-bit processors
section 1.4