Architectures and instruction sets
© 2000 Morgan Kaufman, Overheads for Computers as Components
- Computer architecture taxonomy.
- Assembly language.

von Neumann architecture
- Memory holds data and instructions.
- Central processing unit (CPU) fetches instructions from memory.
  - Separate CPU and memory distinguishes the programmable computer.
- CPU registers help out: program counter (PC), instruction register (IR), general-purpose registers, etc.
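To make the fetch-execute cycle concrete, here is a minimal C sketch of a von Neumann machine, with one memory holding both instructions and data. The instruction encoding, register count, and memory size are invented for illustration; they are not from the text.

#include <stdio.h>
#include <stdint.h>

/* Hypothetical encoding: 8-bit opcode, then three 8-bit register fields. */
enum { OP_HALT = 0, OP_ADD = 1 };

#define NREGS 8
static uint32_t mem[256];    /* one memory holds code and data */
static uint32_t regs[NREGS];

int main(void) {
    /* program: mem[0] = ADD r5,r1,r3 ; mem[1] = HALT */
    mem[0] = (OP_ADD << 24) | (5 << 16) | (1 << 8) | 3;
    mem[1] = (OP_HALT << 24);
    regs[1] = 2; regs[3] = 40;

    uint32_t pc = 0;
    for (;;) {
        uint32_t ir = mem[pc++];   /* fetch: PC addresses memory, IR receives the word */
        uint32_t op = ir >> 24;    /* decode */
        uint32_t rd = (ir >> 16) & 0xff, ra = (ir >> 8) & 0xff, rb = ir & 0xff;
        if (op == OP_HALT) break;
        if (op == OP_ADD) regs[rd] = regs[ra] + regs[rb];   /* execute */
    }
    printf("r5 = %u\n", regs[5]);  /* prints r5 = 42 */
    return 0;
}

Because code and data share mem[], the program could overwrite its own instructions; that is the self-modifying-code possibility the Harvard organization below gives up.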

CPU + memory
[Figure: CPU, with PC holding 200 and IR holding "ADD r5,r1,r3", connected to memory by address and data lines; memory location 200 contains the instruction ADD r5,r1,r3.]

Harvard architecture
[Figure: CPU with PC, connected by separate address and data lines to a data memory and a program memory.]

von Neumann vs. Harvard
- Harvard can't use self-modifying code.
- Harvard allows two simultaneous memory fetches.
- Most DSPs use Harvard architecture for streaming data:
  - greater memory bandwidth;
  - more predictable bandwidth.

RISC vs. CISC
- Complex instruction set computer (CISC):
  - many addressing modes;
  - many operations.
- Reduced instruction set computer (RISC):
  - load/store architecture;
  - pipelinable instructions.

Instruction set characteristics
- Fixed vs. variable length.
- Addressing modes.
- Number of operands.
- Types of operands.

Programming model
- Programming model: the registers visible to the programmer.
- Some registers are not visible (e.g., the IR).

Multiple implementations
- Successful architectures have several implementations:
  - varying clock speeds;
  - different bus widths;
  - different cache sizes;
  - etc.

Assembly language
- One-to-one with instructions (more or less).
- Basic features:
  - One instruction per line.
  - Labels provide names for addresses (usually in the first column).
  - Instructions often start in later columns.
  - Columns run to the end of the line.

ARM assembly language example

label1  ADR r4,c
        LDR r0,[r4]    ; a comment
        ADR r4,d
        LDR r1,[r4]
        SUB r0,r0,r1   ; comment

(The slide's callouts mark label1 as a label, the text after each semicolon as a comment, and the first operand of SUB, r0, as the destination.)
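The slide does not show the source that compiles to this sequence; a plausible C fragment, assuming c and d are global ints (the function name here is invented), is:

int c, d;

int diff(void) {
    /* The ADR/LDR pairs load c and d from memory; SUB leaves the result
       in r0, which is where the ARM calling convention returns an int. */
    return c - d;
}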

Pseudo-ops
- Some assembler directives don't correspond directly to instructions:
  - Define the current address.
  - Reserve storage.
  - Constants.

Pipelining
- Execute several instructions simultaneously but at different stages.
- Simple three-stage pipe:
[Figure: fetch, decode, and execute stages in sequence, connected to memory.]
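One way to picture the overlap is to step a model of the pipeline one clock at a time. The following C sketch (invented for illustration) shifts instruction names through fetch, decode, and execute slots; once the pipe fills, one instruction completes per clock.

#include <stdio.h>

/* Simulate a 3-stage pipeline: each clock, every instruction advances one stage. */
int main(void) {
    const char *prog[] = { "i1", "i2", "i3", "i4" };
    const int n = 4, stages = 3;
    const char *stage[3] = { 0, 0, 0 };   /* [0]=fetch, [1]=decode, [2]=execute */

    for (int clk = 0; clk < n + stages - 1; clk++) {
        stage[2] = stage[1];              /* advance the pipe back-to-front */
        stage[1] = stage[0];
        stage[0] = (clk < n) ? prog[clk] : 0;
        printf("clk %d: fetch=%-3s decode=%-3s execute=%-3s\n", clk,
               stage[0] ? stage[0] : "-", stage[1] ? stage[1] : "-",
               stage[2] ? stage[2] : "-");
    }
    return 0;
}

With n instructions and k stages, the run takes n + k - 1 clocks rather than n * k, which is where the pipeline's throughput gain comes from.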

Pipeline complications
- May not always be able to predict the next instruction:
  - Conditional branch.
- Causes a bubble in the pipeline:
[Figure: pipeline diagram in which a JNZ's execute stage delays the fetch/decode/execute of the two following instructions, leaving a bubble.]

Superscalar
- A RISC pipeline executes one instruction per clock cycle (usually).
- Superscalar machines execute multiple instructions per clock cycle.
  - Faster execution.
  - More variability in execution times.
  - More expensive CPU.

Simple superscalar
- Execute a floating-point and an integer instruction at the same time.
  - They use different registers.
  - Floating-point operations use their own hardware unit.
- Must wait for completion when the floating-point and integer units communicate.

Costs
- Good news: parallelism can be found at run time.
- Bad news: this causes variations in execution time.
- Requires a lot of hardware.
  - n² instruction-unit hardware for n-instruction parallelism.

Finding parallelism
- Independent operations can be performed in parallel:

ADD r0, r0, r1
ADD r2, r2, r3
ADD r6, r4, r0

[Figure: dataflow graph of the three additions.]

Pipeline hazards
- Two operations that require the same resource cannot be executed in parallel:

x = a + b;
a = d + e;
y = a - f;

[Figure: dataflow graph of the three statements; the variable a links them.]
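For data hazards like the one above, the conflict can be detected mechanically by comparing each statement's written variable against what its neighbors read and write. A small C sketch (the representation is invented for illustration):

#include <stdio.h>
#include <string.h>

/* A statement summarized by the variable it writes and the variables it reads. */
struct stmt { const char *def; const char *use[2]; };

/* Two statements conflict if either one's definition is read or written by the other. */
static int conflict(const struct stmt *a, const struct stmt *b) {
    for (int i = 0; i < 2; i++) {
        if (b->use[i] && strcmp(a->def, b->use[i]) == 0) return 1;  /* read-after-write */
        if (a->use[i] && strcmp(b->def, a->use[i]) == 0) return 1;  /* write-after-read */
    }
    return strcmp(a->def, b->def) == 0;                             /* write-after-write */
}

int main(void) {
    struct stmt s1 = { "x", { "a", "b" } };  /* x = a + b; */
    struct stmt s2 = { "a", { "d", "e" } };  /* a = d + e; */
    struct stmt s3 = { "y", { "a", "f" } };  /* y = a - f; */
    printf("s1 vs s2: %s\n", conflict(&s1, &s2) ? "conflict" : "independent");
    printf("s2 vs s3: %s\n", conflict(&s2, &s3) ? "conflict" : "independent");
    printf("s1 vs s3: %s\n", conflict(&s1, &s3) ? "conflict" : "independent");
    return 0;
}

It reports that the first two statements conflict (the second overwrites a while the first still needs it), as do the last two (the third reads the a the second produces), while x = a + b; and y = a - f; are independent.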

Scoreboarding
- The scoreboard keeps track of which instructions use which resources:

            Reg file   ALU   FP
  instr1       X        X
  instr2       X
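A scoreboard row can be modeled as a bit-vector of resources; two instructions may issue in the same cycle only if their vectors don't overlap. A minimal C sketch of that check (the resource list and the single-ported register file are assumptions for illustration, not the full scoreboard algorithm):

#include <stdio.h>

enum { REG_FILE, ALU, FP, NRES };

/* row[r] != 0 means the instruction needs resource r this cycle */
static int conflicts(const int a[NRES], const int b[NRES]) {
    for (int r = 0; r < NRES; r++)
        if (a[r] && b[r]) return 1;   /* both need the same unit */
    return 0;
}

int main(void) {
    int instr1[NRES] = { 1, 1, 0 };   /* uses register file and ALU */
    int instr2[NRES] = { 1, 0, 0 };   /* uses register file only */
    printf("can dual-issue: %s\n", conflicts(instr1, instr2) ? "no" : "yes");
    return 0;
}

Here the two instructions clash on the register file; a machine with enough register-file ports would simply drop that column from the scoreboard.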

Order of execution
- In-order:
  - Machine stops issuing instructions when the next instruction can't be dispatched.
- Out-of-order:
  - Machine will change the order of instructions to keep dispatching.
  - Substantially faster but also more complex.

VLIW architectures
- Very long instruction word (VLIW) processing provides significant parallelism.
- Relies on the compiler to identify parallelism.

What is VLIW?
- Parallel function units with a shared register file:
[Figure: several function units connected to one register file, fed by instruction decode and memory.]

VLIW cluster
- Organized into clusters to accommodate available register bandwidth:
[Figure: the machine organized as a series of clusters.]

VLIW and compilers
- VLIW requires considerably more sophisticated compiler technology than traditional architectures: the compiler must be able to extract parallelism to keep the instruction words full.
- Many VLIWs have good compiler support.

Static scheduling
[Figure: expressions on one side and the VLIW instructions they are statically packed into on the other; an empty slot is filled with a nop.]
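The packing the figure describes can be sketched as a greedy list scheduler: place each operation in the earliest instruction word where its inputs are ready and a slot is free, and fill leftover slots with nops. The two-slot word width and one-cycle latency here are assumptions for illustration.

#include <stdio.h>

#define SLOTS 2   /* operations per VLIW instruction word (assumed) */
#define WORDS 8

struct op { const char *name; int src1, src2; };  /* sources: index of producing op, or -1 */

int main(void) {
    /* e = a + b; f = c + d; g = e + f;  -- g depends on both earlier results */
    struct op ops[] = {
        { "e=a+b", -1, -1 },
        { "f=c+d", -1, -1 },
        { "g=e+f",  0,  1 },
    };
    int n = 3, ready[3], used[WORDS] = { 0 };
    const char *word[WORDS][SLOTS] = { { 0 } };

    for (int i = 0; i < n; i++) {
        int earliest = 0;   /* first word in which all inputs are available */
        if (ops[i].src1 >= 0 && ready[ops[i].src1] > earliest) earliest = ready[ops[i].src1];
        if (ops[i].src2 >= 0 && ready[ops[i].src2] > earliest) earliest = ready[ops[i].src2];
        while (used[earliest] == SLOTS) earliest++;   /* find a word with a free slot */
        word[earliest][used[earliest]++] = ops[i].name;
        ready[i] = earliest + 1;                      /* result usable in the next word */
    }
    for (int w = 0; w < WORDS && (used[w] || w == 0); w++)
        printf("word %d: [%s | %s]\n", w,
               word[w][0] ? word[w][0] : "nop",
               word[w][1] ? word[w][1] : "nop");
    return 0;
}

For this input it prints one full word and one word padded with a nop, since g must wait for both earlier results.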

Trace scheduling
- Rank paths in order of frequency.
- Schedule paths in order of frequency.
[Figure: control-flow graph with a conditional (1), two alternative blocks (2, 3), a loop head (4), and a loop body (5).]

EPIC
- EPIC = explicitly parallel instruction computing.
- Used in the Intel/HP Merced (IA-64) machine.
- Incorporates several features to allow the machine to find and exploit increased parallelism.

IA-64 instruction format
- Instructions are bundled with a tag to indicate which instructions can be executed in parallel:
[Figure: a 128-bit bundle holding a tag and three instructions.]

Memory system
- The CPU fetches data and instructions from a memory hierarchy:
  CPU → L1 cache → L2 cache → main memory

Memory hierarchy complications
- Program behavior is much more state-dependent.
  - Depends on how earlier execution left the cache.
- Execution time is less predictable.
  - Memory access times can vary by a factor of 100.
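The variability is easy to observe from C. This sketch (array size arbitrary) times the same amount of work done with a cache-friendly and a cache-hostile access pattern; the measured gap is entirely cache state, since the instruction counts are identical.

#include <stdio.h>
#include <time.h>

#define N 1024
static int m[N][N];

static double secs(void) { return (double)clock() / CLOCKS_PER_SEC; }

int main(void) {
    double t0 = secs();
    for (int i = 0; i < N; i++)           /* row-major: walks memory sequentially, mostly cache hits */
        for (int j = 0; j < N; j++)
            m[i][j]++;
    double t1 = secs();
    for (int j = 0; j < N; j++)           /* column-major: strides N*sizeof(int) bytes, many misses */
        for (int i = 0; i < N; i++)
            m[i][j]++;
    double t2 = secs();
    printf("row-major: %.3fs  column-major: %.3fs\n", t1 - t0, t2 - t1);
    return 0;
}

The exact ratio depends on the machine's caches, which is precisely the point: the same code can run at very different speeds depending on how it uses the hierarchy.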