Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.

Traditional processor design

Ways to speed up execution
To increase processor speed:
- Increase the functionality of the processor: add more complex instructions (CISC - Complex Instruction Set Computers)
- These need more cycles to execute each instruction
Chapter 2 assumes 2 cycles:
- fetch instruction
- execute instruction
How many are really needed:
- decode operands
- fetch operand registers
- decode instruction
- perform operation
- decode resulting address
- move data to store register
- store result
=> 8 cycles per instruction, not 2

Alternative - RISC
RISC - Reduced Instruction Set Computer
Have simple instructions, each of which executes in one cycle.
Speed: one cycle for each operation, but more operations.
For example, A=B+C:
CISC: 3 instructions
  Load Register 1, B
  Add Register 1, C
  Store Register 1, A
RISC: 10 instructions
  Address of B in read register
  Read B
  Move data Register 1
  Address of C in read register
  Read C
  Move data Register 2
  Add Register 1, Register 2, Register 3
  Move Register 3 to write register
  Address of A in write register
  Write A to memory
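As a rough illustration of the three-instruction CISC sequence for A=B+C, the sketch below steps through Load, Add, and Store on a toy machine. The dictionaries `memory` and `registers`, the helper `run`, and the sample values of B and C are all made up for this example and do not model any real instruction set.

```python
# Toy machine: the names and sample values here are assumptions for the demo.
memory = {"A": 0, "B": 5, "C": 7}
registers = {1: 0}

def run(program):
    """Execute (opcode, register, variable) triples one after another."""
    for opcode, reg, name in program:
        if opcode == "load":       # Load Register reg, name
            registers[reg] = memory[name]
        elif opcode == "add":      # Add Register reg, name
            registers[reg] += memory[name]
        elif opcode == "store":    # Store Register reg, name
            memory[name] = registers[reg]
        else:
            raise ValueError(f"unknown opcode: {opcode}")

# A = B + C as the three CISC instructions listed above.
run([("load", 1, "B"), ("add", 1, "C"), ("store", 1, "A")])
print(memory["A"])
```

With the sample values chosen here, A ends up holding 12. The RISC version would split each of these three steps into the smaller register and memory moves listed above.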

Aspects of RISC design
- Single cycle instructions
- Large control memory - often more than 100 registers.
- Fast procedure invocation - activation record invocation part of hardware.
- Put activation records totally within registers.

Implications
The speeds of a RISC processor and a CISC processor cannot be compared directly:
- CISC - perhaps 8-10 cycles per instruction
- RISC - 1 cycle per instruction
CISC can do some of these operations in parallel.
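To make the comparison concrete, here is a rough sketch using only figures from the slides: the A=B+C example (3 CISC instructions, 10 RISC instructions) and the quoted cycles per instruction. The function name `total_cycles` is invented for illustration, and the model assumes no overlap between operations.

```python
def total_cycles(instruction_count, cycles_per_instruction):
    """Cycles for a straight-line sequence with no overlap (illustrative model)."""
    return instruction_count * cycles_per_instruction

# Figures taken from the A=B+C example and the cycle counts quoted above.
cisc_low = total_cycles(3, 8)     # 3 CISC instructions at 8 cycles each
cisc_high = total_cycles(3, 10)   # 3 CISC instructions at 10 cycles each
risc = total_cycles(10, 1)        # 10 RISC instructions at 1 cycle each

print(f"CISC: {cisc_low}-{cisc_high} cycles, RISC: {risc} cycles")
```

Under this simple model the RISC version runs more instructions but needs fewer cycles (10 against 24 to 30), which is why neither instruction counts nor cycles per instruction alone give a fair comparison.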

Pipeline architecture
CISC design:
1. Retrieve instruction from main memory.
2. Decode operation field, source data, and destination data.
3. Get source data for operation.
4. Perform operation.
Pipeline design: while Instruction 1 is executing,
- Instruction 2 is retrieving source data
- Instruction 3 is being decoded
- Instruction 4 is being retrieved from memory.
Four instructions at once, with an instruction completion each cycle.
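The overlap described above can be sketched as a small timing table. This is a minimal model of an ideal four-stage pipeline with no stalls; the stage labels paraphrase the four steps listed above, and the function names are hypothetical.

```python
STAGES = ["fetch", "decode", "get operands", "execute"]  # the four steps above

def pipeline_schedule(n_instructions, stages=STAGES):
    """Return {cycle: [(instruction, stage), ...]} for an ideal pipeline."""
    schedule = {}
    for instr in range(1, n_instructions + 1):
        for offset, stage in enumerate(stages):
            cycle = instr + offset   # instruction i enters the pipeline at cycle i
            schedule.setdefault(cycle, []).append((instr, stage))
    return schedule

def total_cycles_pipelined(n_instructions, n_stages=len(STAGES)):
    """Fill the pipeline once, then complete one instruction per cycle."""
    return n_stages + (n_instructions - 1)

schedule = pipeline_schedule(6)
for cycle in sorted(schedule):
    busy = ", ".join(f"I{i}:{s}" for i, s in schedule[cycle])
    print(f"cycle {cycle}: {busy}")
print("pipelined:", total_cycles_pipelined(6), "cycles; unpipelined:", 6 * len(STAGES))
```

In cycle 4 the table shows Instruction 1 executing while Instructions 2, 3, and 4 occupy the earlier stages, matching the description above. Six instructions finish in 9 cycles instead of 24.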

Impact on language design
With a standard CISC design, the statement E=A+B+C+D will have the postfix EAB+C+D+= and will execute as follows:
1. Add A to B, giving sum.
2. Add C to sum.
3. Add D to sum.
4. Store sum in E.
But Instruction 2 cannot retrieve sum (the result of adding A to B) until the previous instruction stores that result. This causes the processor to wait a cycle on Instruction 2 until Instruction 1 completes its execution.
A more intelligent translator would develop the postfix EAB+CD++=, which would allow A+B to be computed in parallel with C+D.
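One way to see the difference between the two postfix forms is to measure the longest chain of additions that must wait on one another. The sketch below is an illustration only: `addition_chain_depth` is a made-up helper that assumes single-letter variables and just the `+` and `=` operators used above.

```python
def addition_chain_depth(postfix):
    """Length of the longest chain of dependent '+' operations.

    Single letters are variables; '+' adds the top two stack entries;
    '=' stores the top entry into the variable below it.
    """
    stack = []     # each entry is the depth of a value: 0 for a plain variable
    deepest = 0
    for symbol in postfix:
        if symbol == "+":
            right, left = stack.pop(), stack.pop()
            stack.append(max(left, right) + 1)   # must wait for both inputs
            deepest = max(deepest, stack[-1])
        elif symbol == "=":
            stack.pop()   # value being stored
            stack.pop()   # destination variable
        else:
            stack.append(0)
    return deepest

print(addition_chain_depth("EAB+C+D+="))   # left-to-right form
print(addition_chain_depth("EAB+CD++="))   # regrouped form
```

The left-to-right form gives a depth of 3, since every addition waits on the one before it. The regrouped form gives 2, because A+B and C+D do not depend on each other.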

Further optimization
E=A+B+C+D
J=F+G+H+I
has the postfix AB+FG+CD+HI+(1)(3)+(2)(4)+E(5)=J(6)= [where the numbers indicate the operation number within that expression]. In this case, each statement executes with no interference from the other, and the processor executes at the full speed of the pipeline.
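The effect of interleaving can be checked with a deliberately simplified stall model. The assumption, labeled as such in the code, is that an operation waits one cycle only when it needs the result of the operation issued immediately before it. The name `count_stalls` and the dependency lists are illustrative encodings of the orderings discussed above.

```python
def count_stalls(dependencies):
    """Count one-cycle stalls under a simplified model (an assumption, not a real CPU):
    an operation stalls only if it reads the result of the operation just before it.

    dependencies[i] lists the 1-based numbers of the earlier operations
    that operation i+1 reads from.
    """
    stalls = 0
    for position, needs in enumerate(dependencies, start=1):
        if position - 1 in needs:
            stalls += 1
    return stalls

# E=A+B+C+D translated left to right: add, add, add, store.
sequential = [(), (1,), (2,), (3,)]

# Both statements interleaved: AB+ FG+ CD+ HI+ (1)(3)+ (2)(4)+, then the two stores.
interleaved = [(), (), (), (), (1, 3), (2, 4), (5,), (6,)]

print(count_stalls(sequential))
print(count_stalls(interleaved))
```

Under this model the left-to-right translation stalls three times, while the interleaved ordering never needs the result of the operation directly ahead of it and so stalls zero times.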

Conditionals
A = B + C; if D then E = 1 else E = 2
Consider the above program. A pipeline architecture may even start to execute (E=1) before it has evaluated whether D is true. Options could be:
- If the branch is taken, wait for the pipeline to empty. This slows the machine down considerably at each branch.
- Simply execute the pipeline as is. The compiler has to make sure that if the branch is taken, there is nothing in the pipeline that can be affected. This puts a great burden on the compiler writer.
Paradoxically, the above program can be compiled and run more efficiently as if the following had been written:
if D (A=B+C) then E = 1
[Explain this strange behavior.]
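The cost of the first option, emptying the pipeline at each taken branch, can be estimated with a back-of-the-envelope model. The sketch assumes a four-stage pipeline and charges `n_stages - 1` idle cycles per taken branch. Both the penalty and the function name `cycles_with_flush` are assumptions made for illustration, not figures from the slides.

```python
def cycles_with_flush(n_instructions, n_taken_branches, n_stages=4):
    """Cycle count when the pipeline is drained at every taken branch.

    Assumption for illustration: draining costs n_stages - 1 idle cycles,
    the time needed to refill the pipeline after the branch.
    """
    ideal = n_stages + (n_instructions - 1)          # no branches at all
    penalty = n_taken_branches * (n_stages - 1)      # refill cost per taken branch
    return ideal + penalty

for branches in (0, 10, 25):
    print(branches, "taken branches:", cycles_with_flush(100, branches), "cycles")
```

For 100 instructions this gives 103 cycles with no taken branches, 133 with 10, and 178 with 25, which suggests why a compiler that can keep useful work in the pipeline across a branch pays off.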

Summary
- New processor designs are putting more emphasis on good language processor designs.
- Effective translation models are needed to allow languages to take advantage of faster hardware.
- What's worse, the simple translation strategies of the past may even fail now.

Multiprocessor system architecture
Impact:
- Multiple processors independently executing
- Need for more synchronization
- Cache coherence: Data in local cache may not be up to date.

Tightly coupled systems
The architectures on the previous slide are examples of tightly coupled systems:
- All processors have access to all memory
- Semaphores can be used to coordinate communication among processors
- Quick (nanosecond) access to all data
Later we look at networks of machines (loosely coupled machines):
- Processors have access to only local memory
- Semaphores cannot be used
- Relatively slow (millisecond) access to some data
- Model necessary for networks like the Internet
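As a small illustration of semaphore-based coordination over shared memory, the sketch below uses Python threads as stand-ins for processors. That substitution is an assumption made for the example, and the names `shared_total`, `guard`, and `worker` are hypothetical. A binary semaphore ensures that only one worker updates the shared value at a time.

```python
import threading

shared_total = 0                    # memory visible to every worker
guard = threading.Semaphore(1)      # binary semaphore protecting shared_total

def worker(increments):
    """Add to the shared total, entering the critical section one worker at a time."""
    global shared_total
    for _ in range(increments):
        with guard:                 # acquire, update, release
            shared_total += 1

# Four threads play the role of four processors sharing one memory.
threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(shared_total)
```

Because every update happens while the semaphore is held, the final total is 40000. This works only because all workers can reach the same memory and the same semaphore, which is the situation in a tightly coupled system and not in a network of machines.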