
Slide 1 Scalar Processor Design
Phenomenal advances in its brief lifetime of 30+ years: X2/18mo over 30 yr => multi-GFLOPS processors, inspiring and facilitating major innovations in:
–Embedded microcontrollers (300 million sold in 2000)
–Personal computers (150 million sold in 2000)
–Advanced workstations
–Handheld and mobile devices
–Application and file servers (4 million sold in 2000)
–Web servers for the Internet
–Low-cost supercomputers
–Large-scale computing clusters
Well over one billion microprocessors shipped per year.
The amazing decades of the evolution of microprocessors (one column per decade):
Transistor count      2K-100K     100K-1M    1M-100M        100M-2B
Clock frequency       0.1-3 MHz   3-30 MHz   30 MHz-1 GHz   1 GHz-15 GHz
Instructions/cycle    0.1 IPC     IPC        IPC            IPC

Slide 2 Scalar Processor Design
Past (Milestones):
–First electronic computer, ENIAC, in 1946: 18,000 vacuum tubes, 3,000 cubic feet, twenty 2-foot-long 10-digit registers, 5 KIPS (5,000 additions per second);
–First microprocessor (a CPU on a single IC chip), the Intel 4004, in 1971: 2,300 transistors, 60 KIPS, $200;
–Virtual elimination of assembly language programming reduced the need for object-code compatibility;
–The creation of standardized, vendor-independent operating systems, such as UNIX and its clone, Linux, lowered the cost and risk of bringing out a new architecture;
–RISC instruction set architecture paved the way for drastic design innovations that focused on two critical performance techniques: instruction-level parallelism and the use of caches.

Slide 3 Scalar Processor Design
Present (State of the art):
–Microprocessors approaching/surpassing 10 GFLOPS;
–A high-end microprocessor today outperforms a supercomputer (> $10 million) of ten years ago;
–While technology advancement contributes a sustained annual growth of 35%, innovative computer design accounts for another 25% annual growth rate => a factor of 15 in performance gains (fig. 1.1);
–Three different computing markets (fig. 1.3):
»Desktop computing: driven by price-performance (systems costing from a few hundred to over 10K);
»Servers: availability driven (as distinguished from reliability), providing sustained high performance;
»Embedded computers: the fastest growing portion of the computer market, driven by real-time performance and the need to minimize memory and power, as well as ASIC

Slide 4 Scalar Processor Design
Present (State of the art):
–The Task of the Computer Designer:
»Instruction set architecture (the traditional view of what computer architecture is): the boundary between software and hardware;
»Organization: the high-level aspects of a computer’s design, such as the memory system, the bus structure, and the internal design of the CPU, based on a given instruction set architecture;
»Hardware: the specifics of a machine, including the detailed logic design and the packaging technology of the machine.
Future (Technology Trends):
–A truly successful instruction set architecture (ISA) should last for decades; however, it takes a computer architect’s acute observation and knowledge of rapidly changing technology for the ISA to survive and cope with such changes:

Slide 5 Scalar Processor Design
»IC logic technology: transistor count on a chip grows at a 55% annual rate (35% density growth rate plus die size growth), while device speed scales more slowly;
»Semiconductor DRAM: density grows at 60% annually while cycle time improves very slowly (decreasing by one-third in ten years). Bandwidth per chip increases twice as fast as latency decreases;
»Magnetic disk technology: density has increased at a 100% annual rate since 1990, while access time improves by about a third every ten years; and
»Network technology: both latency and bandwidth have been improving, with more focus on bandwidth of late; the increasing importance of networking has led to faster improvement in performance than before. Internet bandwidth doubles every year in the U.S. => challenge & opportunity for the computer designer;
»Scaling of transistor performance: while transistor density increases quadratically with a linear decrease in feature size, transistor performance increases only roughly linearly with the decrease in feature size => challenge & opportunity for the computer designer!
»Wires and power in IC: propagation delay and power needs?
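The scaling rule in the transistor-performance bullet can be checked with a little arithmetic. The sketch below is only an illustration: the function name and the feature sizes (0.25 and 0.18 micron) are assumptions chosen for the example, not values from the slides.

```python
def scaling_gains(old_feature, new_feature):
    """Return (density_gain, performance_gain) for a feature-size shrink.

    Assumes the idealized rule from the slide: density grows with the
    square of the linear shrink factor, performance grows linearly.
    """
    if old_feature <= 0 or new_feature <= 0:
        raise ValueError("feature sizes must be positive")
    shrink = old_feature / new_feature   # linear shrink factor
    return shrink ** 2, shrink           # quadratic density, linear speed

# Hypothetical shrink from 0.25 micron to 0.18 micron.
density_gain, performance_gain = scaling_gains(0.25, 0.18)
print(f"density x{density_gain:.2f}, performance x{performance_gain:.2f}")
```

The gap between the two factors is the designer's opportunity: each shrink yields more transistors than the speed gain alone can exploit.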

Slide 6 Instruction Set Architecture
ISA should reflect application characteristics:
–Desktop computing is compute-intensive, thus focusing on features favoring integer and FP ops;
–Server computing is data-intensive, focusing on integers and character strings (yet FP ops are still standard in servers);
–Embedded computing is time-sensitive and memory and power conscious, thus focusing on code density, real-time behavior, and media data streams.
ISA has been defined as a contract between the software and hardware, or between the program and the machine, thus facilitating independent development of programs and machines.

Slide 7 Instruction Set Architecture
Taxonomy of ISA:
–Stack: both operands are implicit on the top of the stack, a data structure in which items are accessed in a last-in, first-out fashion.
–Accumulator: one operand is implicit in the accumulator, a special-purpose register.
–General Purpose Register: all operands are explicit in specified registers or memory locations. Depending on where operands are specified and stored, there are three different ISA groups:
»Register-Memory: one operand in a register and one in memory. Examples: IBM 360/370, Intel 80x86 family, Motorola 68000;
»Memory-Memory: both operands are in memory. Example: VAX.
»Register-Register (load & store): all operands, except for those in load and store instructions, are in registers. Examples: SPARC (Sun Microsystems), MIPS, Precision Architecture (HP), PowerPC (IBM), Alpha (DEC).

Slide 8 Instruction Set Architecture
Taxonomy of ISA: Examples
[Figure: operand locations for each ISA class, showing the top of stack (TOS), accumulator, register set, ALU, and memory.]
Code sequences for C = A+B:
(a) Stack: Push A; Push B; Add; Pop C
(b) Accumulator: Load A; Add B; Store C
(c) Register-Memory: Load R1,A; Add R3,R1,B; Store R3,C
(d) Reg-Reg/Load-Store: Load R1,A; Load R2,B; Add R3,R1,R2; Store R3,C
(e) Memory-Memory: Add C,A,B
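To make the difference between implicit and explicit operands concrete, here is a minimal sketch of toy interpreters for three of the ISA classes, each running its code sequence for C = A+B. The interpreters, the tuple encoding of instructions, and the dictionary used as memory are all assumptions made for illustration; they do not model any real machine.

```python
def run_stack(program, memory):
    """Stack ISA: Add takes both operands implicitly from the top of stack."""
    stack = []
    for op, *args in program:
        if op == "Push":
            stack.append(memory[args[0]])
        elif op == "Add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "Pop":
            memory[args[0]] = stack.pop()
    return memory

def run_accumulator(program, memory):
    """Accumulator ISA: one operand is implicitly the accumulator."""
    acc = 0
    for op, addr in program:
        if op == "Load":
            acc = memory[addr]
        elif op == "Add":
            acc += memory[addr]
        elif op == "Store":
            memory[addr] = acc
    return memory

def run_load_store(program, memory):
    """Register-register ISA: only Load and Store touch memory."""
    regs = {}
    for op, *args in program:
        if op == "Load":
            regs[args[0]] = memory[args[1]]
        elif op == "Add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "Store":
            memory[args[1]] = regs[args[0]]
    return memory

STACK_CODE = [("Push", "A"), ("Push", "B"), ("Add",), ("Pop", "C")]
ACC_CODE = [("Load", "A"), ("Add", "B"), ("Store", "C")]
LOAD_STORE_CODE = [("Load", "R1", "A"), ("Load", "R2", "B"),
                   ("Add", "R3", "R1", "R2"), ("Store", "R3", "C")]

for name, run, code in [("stack", run_stack, STACK_CODE),
                        ("accumulator", run_accumulator, ACC_CODE),
                        ("load-store", run_load_store, LOAD_STORE_CODE)]:
    result = run(code, {"A": 2, "B": 3})
    print(f"{name}: {len(code)} instructions, C = {result['C']}")
```

All three reach the same result; what differs is the instruction count and how many operands each instruction has to name.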

Slide 9 Dynamic and Static Interface
Inherent in each ISA’s definition is an associated definition of an interface, called the dynamic and static interface (DSI), that separates what is done statically at compile time from what is done dynamically at run time.
–A key issue in the design of an ISA is the placement of the DSI.
[Figure: the DSI as the architecture boundary between the program (software) and the machine (hardware). Above the DSI: “static”, exposed to software, compiler complexity. Below the DSI: “dynamic”, hidden in hardware, hardware complexity. Alternative placements DSI-1, DSI-2, and DSI-3 between the HLL program and the hardware are labeled DEL, ~CISC, ~VLIW, and ~RISC.]

Slide 10 Processor Performance
Performance Equation:
–1/Performance = Time/Program = (Instructions/Program) x (Cycles/Instruction) x (Time/Cycle) = IC x CPI x CCT
Where: IC = instruction count; CPI = cycles per instruction; CCT = clock cycle time
Performance Optimization: reducing one or more of the three factors (IC, CPI, CCT)
–IC: ISA dependent (CISC vs RISC); compiler dependent; dynamic elimination of redundant computation (computation reuse);
–CPI: ISA and instruction complexity dependent; microarchitecture dependent (pipelining and speculative execution of instructions);
–CCT: depends on the microarchitecture, pipelining, clock frequency, and pipeline depth; changes here can also affect CPI.
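The performance equation is easy to explore numerically. The sketch below evaluates Time/Program = IC x CPI x CCT for two made-up design points; the function name and all numbers are assumptions for illustration, not measurements of any real CISC or RISC machine.

```python
def execution_time(ic, cpi, cct):
    """Time/Program = IC x CPI x CCT (cct in seconds per cycle)."""
    return ic * cpi * cct

# Hypothetical design A: fewer, more complex instructions.
time_a = execution_time(ic=1_000_000, cpi=4.0, cct=1e-9)
# Hypothetical design B: more, simpler instructions at the same clock.
time_b = execution_time(ic=1_500_000, cpi=1.5, cct=1e-9)

# Performance is the reciprocal of time, so speedup is a ratio of times.
speedup_b_over_a = time_a / time_b
print(f"A: {time_a * 1e3:.2f} ms, B: {time_b * 1e3:.2f} ms, "
      f"speedup of B over A: {speedup_b_over_a:.2f}")
```

The example shows why the three factors cannot be tuned in isolation: design B executes 50% more instructions, yet finishes sooner because its CPI falls by more than its IC rises.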

Slide 11 Processor Performance
Performance Evaluation: functional (ISA) and performance (CPI)
–Trace-driven simulation
–Execution-driven simulation
[Figure: trace-driven timing simulation, in which traces generated by physical execution with software instrumentation, physical execution with hardware instrumentation, or a functional simulator go to trace storage and then to a cycle-based performance simulator; and execution-driven timing simulation, in which a functional simulator (instruction interpretation) and a cycle-based performance simulator are coupled through an execution trace and checkpoint and control.]
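The split between functional and timing simulation can be sketched in a few lines. In this toy version a functional simulator interprets a load-store program and records a trace of executed opcodes, and a separate timing model consumes that trace to estimate cycles and CPI. The program encoding, the function names, and the per-opcode latencies are all assumptions for illustration; a real cycle-based simulator models pipeline state rather than summing fixed latencies.

```python
def functional_sim(program, memory):
    """Interpret the program for correctness only and record an opcode trace."""
    regs, trace = {}, []
    for op, *args in program:
        if op == "Load":
            regs[args[0]] = memory[args[1]]
        elif op == "Add":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "Store":
            memory[args[1]] = regs[args[0]]
        trace.append(op)
    return trace

def timing_sim(trace, latency):
    """Consume a stored trace and return (total cycles, average CPI)."""
    cycles = sum(latency[op] for op in trace)
    return cycles, cycles / len(trace)

# Hypothetical program and latencies (in cycles), chosen only for the example.
PROGRAM = [("Load", "R1", "A"), ("Load", "R2", "B"),
           ("Add", "R3", "R1", "R2"), ("Store", "R3", "C")]
LATENCY = {"Load": 3, "Add": 1, "Store": 2}

memory = {"A": 2, "B": 3}
trace = functional_sim(PROGRAM, memory)   # trace generation
cycles, cpi = timing_sim(trace, LATENCY)  # trace-driven timing simulation
print(f"C = {memory['C']}, cycles = {cycles}, CPI = {cpi:.2f}")
```

Because the trace is stored, the same trace can be replayed against different latency assumptions without re-running the program, which is the main attraction of the trace-driven approach.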

Slide 12 Processor Performance
Performance Evaluation: Amdahl’s Law
–Idealized pipeline execution profile
–Realistic pipeline execution profile
–Refined speedup equation
[Figure: pipeline execution profiles, with axes running from 1 to N (pipeline depth) and regions labeled g, 1-g, and pipeline stall.]
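The slide's own equation did not survive in the transcript, so the sketch below falls back on the standard form of Amdahl's law applied to an N-stage pipeline. It assumes that g is the fraction of time the pipeline is full (running at N times the unpipelined rate) and that 1-g is the fraction spent stalled at the unpipelined rate; this reading of the labels g, 1-g, and N is an assumption, as is the function name.

```python
def pipeline_speedup(g, n):
    """Amdahl's law for an idealized pipeline profile.

    g: fraction of time the pipeline is full (assumed meaning of g).
    n: pipeline depth N.
    Speedup = 1 / ((1 - g) + g / n)
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("g must be between 0 and 1")
    if n < 1:
        raise ValueError("pipeline depth must be at least 1")
    return 1.0 / ((1.0 - g) + g / n)

# Sweep g for a hypothetical 6-stage pipeline.
for g in (0.5, 0.9, 0.99, 1.0):
    print(f"g = {g:.2f}: speedup = {pipeline_speedup(g, 6):.2f}")
```

The sweep shows how sensitive a deep pipeline is to stalls: the speedup only approaches N when g is very close to 1.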
