AKT211 – CAO 06 – More on Advanced Processing Techniques
Ghifar
Parahyangan Catholic University
Oct 17, 2011


Outline
• Pipeline
• CISC vs RISC
• Superscalar

INSTRUCTION PIPELINE

Pipeline
• Problems with the single-cycle design
  – The slowest instruction pulls down the clock frequency
  – Resource utilization is poor
  – Some instructions cannot be implemented in a single cycle
• The organization needs to change: use a pipeline

Pipeline
• Similar to an assembly line in a manufacturing plant
  – Products at various stages can be worked on simultaneously

Two-Stage Instruction Pipeline
Any problem? The fetch stage has to wait if:
1. T(exec) > T(fetch)
2. There is a branch instruction

Six-Stage Instruction Pipeline
Instruction processing is decomposed into:
• Fetch Instruction (FI)
• Decode Instruction (DI)
• Calculate Operands (CO)
• Fetch Operands (FO)
• Execute Instruction (EI)
• Write Operands (WO)

Six-Stage Instruction Pipeline
• Assumes that there are:
  – no memory conflicts
  – no branches
  – no interrupts
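The ideal timing diagram implied by these assumptions can be sketched in a few lines of Python. This is only an illustration: the function names `timing_diagram` and `format_diagram` are made up for this example, and every stage is assumed to take exactly one time unit with no stalls.

```python
# Stage names as listed on the six-stage pipeline slide.
STAGES = ["FI", "DI", "CO", "FO", "EI", "WO"]


def timing_diagram(n_instructions, stages=STAGES):
    """Return one row per instruction of the ideal (hazard-free) pipeline.

    Each row has one cell per time unit; a cell holds the stage the
    instruction occupies in that time unit, or "" if it is not in the pipeline.
    Assumption: every stage takes exactly one time unit and nothing stalls.
    """
    k = len(stages)
    total_time = k + n_instructions - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * total_time
        for s, name in enumerate(stages):
            row[i + s] = name  # instruction i reaches stage s at time unit i + s
        rows.append(row)
    return rows


def format_diagram(rows):
    """Render the rows as a plain-text table with time units as columns."""
    header = " " * 6 + " ".join(f"{t:>2}" for t in range(1, len(rows[0]) + 1))
    lines = [header]
    for i, row in enumerate(rows, start=1):
        lines.append(f"I{i:<4} " + " ".join(f"{cell:>2}" for cell in row))
    return "\n".join(lines)


if __name__ == "__main__":
    print(format_diagram(timing_diagram(9)))
```

With nine instructions and six stages the diagram spans 6 + 9 - 1 = 14 time units, and from time unit 6 onward one instruction completes per time unit.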

Six-Stage Instruction Pipeline
With branches:
• Penalty: there are time units during which no instructions complete

Six-Stage Instruction Pipeline
Modified algorithm:

Pipeline Performance
• The cycle time of an instruction pipeline:

Pipeline Performance
• Let T[k,n] be the total time required for a pipeline with k stages to execute n instructions (total execution time):
• Pipeline speedup:
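The formulas themselves did not survive in this transcript. A sketch of the usual textbook form, matching the T[k,n] notation above, is given here under stated assumptions: \tau_i is the delay of stage i, \tau_m is the largest stage delay, d is the latch delay needed to move data between stages, and every stage takes one cycle of length \tau. These symbols are assumptions of this sketch, not taken from the slides.

```latex
\begin{align*}
\tau    &= \max_{1 \le i \le k}[\tau_i] + d = \tau_m + d \\
T_{k,n} &= [k + (n - 1)]\,\tau \\
S_k     &= \frac{T_{1,n}}{T_{k,n}}
         = \frac{n k \tau}{[k + (n - 1)]\,\tau}
         = \frac{n k}{k + (n - 1)}
\end{align*}
```

The first instruction needs k cycles to fill the pipeline and each of the remaining n - 1 instructions completes one cycle later, which gives the k + (n - 1) term. As n grows, the speedup S_k approaches k.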

Speedup Factors with Instruction Pipeline
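The kind of numbers behind such a speedup plot can be reproduced with a short Python sketch. It assumes the ideal model in which a k-stage pipeline needs k + (n - 1) cycles for n instructions and a non-pipelined processor needs n * k cycles; the function names and the chosen values of k and n are illustrative only.

```python
def pipeline_time(k, n, tau=1.0):
    """Total time for a k-stage pipeline to execute n instructions (ideal case)."""
    return (k + (n - 1)) * tau


def speedup(k, n):
    """Speedup of a k-stage pipeline over a non-pipelined processor.

    Assumption: without pipelining each instruction takes k cycles,
    so n instructions take n * k cycles.
    """
    return (n * k) / (k + (n - 1))


if __name__ == "__main__":
    for k in (3, 6, 9, 12):
        row = ", ".join(f"n={n}: {speedup(k, n):.2f}" for n in (1, 8, 64, 1024))
        print(f"k={k:>2}  {row}")
```

For a single instruction the speedup is 1 (no overlap is possible), and for long instruction streams it approaches the number of stages k.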

Pipeline Hazards
• Occur when the pipeline, or some portion of the pipeline, must stall/idle because conditions do not permit continued execution
• 3 types of hazards:
  1. Resource Hazards
  2. Data Hazards
  3. Control Hazards

Resource Hazards
• Occur when two (or more) instructions that are already in the pipeline need the same resource
• Sometimes referred to as structural hazards

Data Hazards
• Occur when there is a conflict in the access of an operand location
  – ADD EAX, EBX /* EAX = EAX + EBX */
  – SUB ECX, EAX /* ECX = ECX - EAX */
• 3 types of data hazards:
  – Read after write (RAW)
  – Write after read (WAR)
  – Write after write (WAW)
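The three hazard types can be told apart by comparing the registers that two instructions read and write. The Python sketch below does this for the ADD/SUB pair above. The `Instr` record and `data_hazards` function are hypothetical names for this example, the read/write sets are filled in by hand, and the x86 flags register is deliberately ignored to keep the example small.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    text: str
    reads: frozenset   # registers the instruction reads
    writes: frozenset  # registers the instruction writes


def data_hazards(first, second):
    """Classify the hazards that `second` has with respect to an earlier `first`."""
    hazards = []
    if first.writes & second.reads:
        hazards.append("RAW")  # second reads what first writes
    if first.reads & second.writes:
        hazards.append("WAR")  # second overwrites what first still has to read
    if first.writes & second.writes:
        hazards.append("WAW")  # both write the same location
    return hazards


# ADD EAX, EBX reads EAX and EBX and writes EAX (flags ignored by assumption).
add = Instr("ADD EAX, EBX", frozenset({"EAX", "EBX"}), frozenset({"EAX"}))
# SUB ECX, EAX reads ECX and EAX and writes ECX.
sub = Instr("SUB ECX, EAX", frozenset({"ECX", "EAX"}), frozenset({"ECX"}))

if __name__ == "__main__":
    print(add.text, "->", sub.text, ":", data_hazards(add, sub))
```

For the pair on the slide the only hazard is RAW: SUB needs the value of EAX that ADD produces. Swapping the two instructions would turn it into a WAR hazard.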

Control Hazards
• Also known as branch hazards
• Occur when the pipeline makes the wrong decision on a branch prediction and therefore brings instructions into the pipeline that must subsequently be discarded

CISC vs RISC

What is CISC?
• CISC is an acronym for Complex Instruction Set Computer. CISC chips are easy to program and make efficient use of memory. Since the earliest machines were programmed in assembly language and memory was slow and expensive, the CISC philosophy made sense, and was commonly implemented in such large computers as the PDP-11 and the DECsystem 10 and 20 machines.
• Most common microprocessor designs, such as the Intel 80x86 and Motorola 68K series, followed the CISC philosophy.
• CISC was developed to make compiler development simpler. It shifts most of the burden of generating machine instructions to the processor. For example, instead of having to make a compiler write long sequences of machine instructions to calculate a square root, a CISC processor would have a built-in ability to do this.

CISC Attributes
The design constraints that led to the development of CISC (small amounts of slow memory and the fact that most early machines were programmed in assembly language) give CISC instruction sets some common characteristics:
• A 2-operand format, where instructions have a source and a destination.
• Register to register, register to memory, and memory to register commands.
• Multiple addressing modes for memory, including specialized modes for indexing through arrays.
• Variable-length instructions, where the length often varies according to the addressing mode.
• Instructions which require multiple clock cycles to execute.
E.g. the Pentium is considered a modern CISC processor.

CISC Hw. Architecture
Most CISC hardware architectures have several characteristics in common:
• Complex instruction-decoding logic, driven by the need for a single instruction to support multiple addressing modes.
• A small number of general purpose registers. This is the direct result of having instructions which can operate directly on memory and the limited amount of chip space not dedicated to instruction decoding, execution, and microcode storage.
• Several special purpose registers. Many CISC designs set aside special registers for the stack pointer, interrupt handling, and so on. This can simplify the hardware design somewhat, at the expense of making the instruction set more complex.

At the time of their initial development, CISC machines used available technologies to optimize computer performance.
• Microprogramming is as easy as assembly language to implement, and much less expensive than hardwiring a control unit.
• The ease of microcoding new instructions allowed designers to make CISC machines upwardly compatible: a new computer could run the same programs as earlier computers because the new computer would contain a superset of the instructions of the earlier computers.
• As each instruction became more capable, fewer instructions could be used to implement a given task. This made more efficient use of the relatively slow main memory.
• Because microprogram instruction sets can be written to match the constructs of high-level languages, the compiler does not have to be as complicated.

CISC Disadvantages
Designers soon realized that the CISC philosophy had its own problems, including:
• Earlier generations of a processor family were generally contained as a subset in every new version, so the instruction set and chip hardware became more complex with each generation of computers.
• So that as many instructions as possible could be stored in memory with the least possible wasted space, individual instructions could be of almost any length. This means that different instructions take different amounts of clock time to execute, slowing down the overall performance of the machine.
• Many specialized instructions aren't used frequently enough to justify their existence: approximately 20% of the available instructions are used in a typical program.
• CISC instructions typically set the condition codes as a side effect of the instruction. Not only does setting the condition codes take time, but programmers have to remember to examine the condition code bits before a subsequent instruction changes them.

What is RISC?
• RISC, or Reduced Instruction Set Computer, is a type of microprocessor architecture that utilizes a small, highly optimized set of instructions, rather than the more specialized set of instructions often found in other types of architectures.
• History: The first RISC projects came from IBM, Stanford, and UC Berkeley in the late 70s and early 80s. The IBM 801, Stanford MIPS, and Berkeley RISC 1 and 2 were all designed with a similar philosophy which has become known as RISC. Certain design features have been characteristic of most RISC processors:
  – one-cycle execution time: RISC processors have a CPI (clocks per instruction) of one. This is due to the optimization of each instruction on the CPU and a technique called pipelining;
  – pipelining: a technique that allows for simultaneous execution of parts, or stages, of instructions to process instructions more efficiently;
  – large number of registers: the RISC design philosophy generally incorporates a larger number of registers to avoid large numbers of interactions with memory.

RISC Attributes
The main characteristics of CISC microprocessors are:
• Extensive instructions.
• Complex and efficient machine instructions.
• Micro-encoding of the machine instructions.
• Extensive addressing capabilities for memory operations.
• Relatively few registers.
In comparison, RISC processors are more or less the opposite of the above:
• Reduced instruction set.
• Less complex, simple instructions.
• Hardwired control unit and machine instructions.
• Few addressing schemes for memory operands, with only two basic instructions, LOAD and STORE.
• Many symmetric registers which are organized into a register file.

RISC Disadvantages
• There is still considerable controversy among experts about the ultimate value of RISC architectures. Its proponents argue that RISC machines are both cheaper and faster, and are therefore the machines of the future.
• However, by making the hardware simpler, RISC architectures put a greater burden on the software. Is this worth the trouble, given that conventional microprocessors are becoming increasingly fast and cheap anyway?

CISC versus RISC
CISC | RISC
Emphasis on hardware | Emphasis on software
Includes multi-clock complex instructions | Single-clock, reduced instructions only
Memory-to-memory: "LOAD" and "STORE" incorporated in instructions | Register-to-register: "LOAD" and "STORE" are independent instructions
Small code sizes, high cycles per instruction | Low cycles per instruction, large code sizes
Transistors used for storing complex instructions | Spends more transistors on memory registers
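The memory-to-memory versus load/store rows of the comparison can be illustrated with a toy interpreter in Python. Everything in this sketch is hypothetical: the opcodes MULT, LOAD, PROD and STORE, the register names A and B, and the memory labels x and y are invented for illustration and do not belong to any real instruction set.

```python
def run(program, memory):
    """Execute a list of (opcode, operand, operand) tuples on a toy machine."""
    regs = {}
    for op, *args in program:
        if op == "LOAD":      # LOAD reg, addr: memory -> register
            reg, addr = args
            regs[reg] = memory[addr]
        elif op == "STORE":   # STORE addr, reg: register -> memory
            addr, reg = args
            memory[addr] = regs[reg]
        elif op == "PROD":    # PROD dst, src: register-to-register multiply
            dst, src = args
            regs[dst] = regs[dst] * regs[src]
        elif op == "MULT":    # MULT dst_addr, src_addr: memory-to-memory multiply
            dst, src = args
            memory[dst] = memory[dst] * memory[src]
        else:
            raise ValueError(f"unknown opcode {op!r}")
    return memory


# CISC style: one complex instruction that loads, multiplies and stores.
cisc_program = [("MULT", "x", "y")]

# RISC style: the same work as four simple instructions.
risc_program = [
    ("LOAD", "A", "x"),
    ("LOAD", "B", "y"),
    ("PROD", "A", "B"),
    ("STORE", "x", "A"),
]

if __name__ == "__main__":
    print(run(cisc_program, {"x": 6, "y": 7}))
    print(run(risc_program, {"x": 6, "y": 7}))
```

Both programs leave the same result in memory. The CISC version is shorter (smaller code size), while each RISC instruction does so little that it can plausibly finish in one clock and fit a pipeline stage.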

Summation
• As memory speed increased, and high-level languages displaced assembly language, the major reasons for CISC began to disappear, and computer designers began to look at ways computer performance could be optimized beyond just making faster hardware.
• One of their key realizations was that a sequence of simple instructions produces the same results as a sequence of complex instructions, but can be implemented with a simpler (and faster) hardware design (assuming that memory can keep up). RISC (Reduced Instruction Set Computer) processors were the result.
• CISC and RISC implementations are becoming more and more alike. Many of today's RISC chips support as many instructions as yesterday's CISC chips. And today's CISC chips use many techniques formerly associated with RISC chips.

Modern Day Advancement
CISC and RISC Convergence
State of the art processor technology has changed significantly since RISC chips were first introduced in the early '80s. Because a number of advancements are used by both RISC and CISC processors, the lines between the two architectures have begun to blur. In fact, the two architectures almost seem to have adopted the strategies of the other. Because processor speeds have increased, CISC chips are now able to execute more than one instruction within a single clock. This also allows CISC chips to make use of pipelining. With other technological improvements, it is now possible to fit many more transistors on a single chip.

Modern Day Advancement (continued)
This gives RISC processors enough space to incorporate more complicated, CISC-like commands. RISC chips also make use of more complicated hardware, using extra function units for superscalar execution. All of these factors have led some groups to argue that we are now in a "post-RISC" era, in which the two styles have become so similar that distinguishing between them is no longer relevant. However, it should be noted that RISC chips still retain some important traits. RISC chips strictly utilize uniform, single-cycle instructions. They also retain the register-to-register, load/store architecture. And despite their extended instruction sets, RISC chips still have a large number of general purpose registers.

SUPERSCALAR

What is Superscalar?
• Refers to a machine that is designed to improve the execution performance of scalar instructions
• A superscalar processor is one in which multiple independent instruction pipelines are used, exploiting what is known as instruction-level parallelism
• Equally applicable to RISC & CISC
• In practice usually RISC

General Superscalar Organization

Fetching Two Instructions per Cycle

Superpipelined
• Many pipeline stages need less than half a clock cycle
• Doubling the internal clock speed gets two tasks done per external clock cycle
• Superscalar allows parallel fetch and execute

Superscalar vs Superpipelined

Limitations
• Instruction-level parallelism
• Compiler-based optimisation
• Hardware techniques
• Limited by
  – True data dependency
  – Procedural dependency
  – Resource conflicts
  – Output dependency
  – Antidependency

True Data Dependency
• ADD r1, r2 (r1 := r1+r2;)
• MOVE r3, r1 (r3 := r1;)
• Can fetch and decode the second instruction in parallel with the first
• Can NOT execute the second instruction until the first is finished

Procedural Dependency
• Cannot execute instructions after a branch in parallel with instructions before a branch
• Also, if instruction length is not fixed, instructions have to be decoded to find out how many fetches are needed
• This prevents simultaneous fetches

Resource Conflict
• Two or more instructions requiring access to the same resource at the same time
  – e.g. two arithmetic instructions
• Can duplicate resources
  – e.g. have two arithmetic units

Dependencies

Antidependency
• WAR (write-after-read) dependency
  – R3:=R3 + R5; (I1)
  – R4:=R3 + 1; (I2)
  – R3:=R5 + 1; (I3)
  – R7:=R3 + R4; (I4)
• I3 cannot complete before I2 starts, as I2 needs a value in R3 and I3 changes R3

Register Renaming
• Antidependencies occur because register contents may not reflect the correct ordering from the program
• May result in a pipeline stall
• Registers allocated dynamically
  – i.e. registers are not specifically named

Register Renaming example
• R3b:=R3a + R5a (I1)
• R4b:=R3b + 1 (I2)
• R3c:=R5a + 1 (I3)
• R7b:=R3c + R4b (I4)
• Without subscript refers to the logical register in the instruction
• With subscript is the hardware register allocated
• Note R3a, R3b, R3c
• Disadvantage: need more registers!
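The renaming in this example can be reproduced mechanically. The Python sketch below is a simplified illustration, not how real hardware is built: the `rename` function is a made-up name, instructions are given as (destination, expression) pairs, every logical register is assumed to start in hardware copy "a", and each write simply allocates the next letter.

```python
import re
from string import ascii_lowercase


def rename(instructions):
    """Rename logical registers so that every write gets a fresh hardware copy."""
    version = {}  # logical register -> index of its current hardware copy

    def current(reg):
        # A register that was never written is in its initial copy "a".
        return f"{reg}{ascii_lowercase[version.setdefault(reg, 0)]}"

    renamed = []
    for dest, expr in instructions:
        # Sources are renamed first, so they see the copies live before this write.
        new_expr = re.sub(r"R\d+", lambda m: current(m.group()), expr)
        # The destination then gets a new copy, which removes WAR and WAW conflicts.
        version[dest] = version.get(dest, 0) + 1
        renamed.append(f"{dest}{ascii_lowercase[version[dest]]} := {new_expr}")
    return renamed


# I1..I4 from the antidependency slide, as (destination, expression) pairs.
program = [
    ("R3", "R3 + R5"),
    ("R4", "R3 + 1"),
    ("R3", "R5 + 1"),
    ("R7", "R3 + R4"),
]

if __name__ == "__main__":
    for line in rename(program):
        print(line)
```

After renaming, I3 writes R3c while I2 reads R3b, so I3 no longer has to wait for I2. Only the true data dependencies (I1 to I2, I2 and I3 to I4) remain.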

Superscalar Execution

Superscalar Execution Example - With Register Renaming for WAR and WAW dependencies.

Conclusion
• Superscalar execution allows faster CPU throughput than would otherwise be possible at the same clock rate.
• All general-purpose CPUs developed since about 1998 are superscalar.
• The major problem of executing multiple instructions in a scalar program is the handling of data dependencies. If data dependencies are not effectively handled, it is difficult to achieve an execution rate of more than one instruction per clock cycle.

Comparison of processors

Any Questions?

THANK YOU