Advanced Topic: Alternative Architectures Chapter 9 Objectives


Chapter 9 Objectives:
- Learn the properties that often distinguish RISC from CISC architectures.
- Understand how multiprocessor architectures are classified.
- Appreciate the factors that create complexity in multiprocessor systems.
- Become familiar with the ways in which some architectures transcend the traditional von Neumann paradigm.

Introduction
We have so far studied only the simplest models of computer systems: classical single-processor von Neumann systems. This chapter presents a number of different approaches to computer organization and architecture. Some of these approaches are in place in today's commercial systems. Others may form the basis for the computers of the future.

Typical Architecture Already Explored
- The ALU carries out the logic operations (comparisons) and arithmetic operations (adding, etc.).
- Each memory location has a unique address, and each location can store a word.
- The Control Unit (CU) monitors and controls the execution of all instructions and the transfer of all information.
- The CU extracts instructions from memory, decodes them, makes sure data is in the right place at the right time, tells the ALU which registers to use, services interrupts, and turns on the correct circuitry in the ALU for operation execution.
Dr. Clincy Lecture

ARCHITECTURE APPROACHES
Complex Instruction Set Computer (CISC)
- Large number of instructions, of variable length.
- Instructions have complex layouts.
- A single instruction can be complex and perform multiple operations.
- ISSUE: a small subset of CISC instructions slows the system down.
- EXAMPLE: Intel x86 architectures.
Reduced Instruction Set Computer (RISC)
- Because of CISC's issue, designers returned to a less complicated architecture and hardwired a small but complete instruction set.
- Hardwired instructions are much faster.
- The compiler is responsible for producing efficient code for the Instruction Set Architecture (ISA).
- Simple instructions that execute faster; each instruction performs only one operation.
- All instructions are the SAME size.
- EXAMPLE: the MIPS family of CPUs (Microprocessor without Interlocked Pipeline Stages). Modern Pentium-class chips blur the line: externally CISC, they translate x86 instructions into RISC-like micro-operations internally.

RISC vs CISC Analogy
ENGLISH: "Today is your birthday" (19 symbols)
CHINESE: 今天是你的生日 (7 symbols)

RISC vs CISC Machines
The underlying philosophy of RISC machines is that a system is better able to manage program execution when the program consists of only a few different instructions that are the same length and require the same number of clock cycles to decode and execute. RISC systems access memory only with explicit load and store instructions. In CISC systems, many different kinds of instructions access memory, making instruction length variable and fetch-decode-execute time unpredictable.

RISC vs CISC Machines
The difference between CISC and RISC becomes evident through the basic computer performance equation:

    time/program = (instructions/program) × (cycles/instruction) × (time/cycle)

RISC systems shorten execution time by reducing the clock cycles per instruction. CISC systems improve performance by reducing the number of instructions per program.
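The tradeoff in this equation can be sketched numerically. The workload figures below are hypothetical, chosen only to illustrate the typical pattern (CISC: fewer instructions but more cycles each; RISC: more instructions but close to one cycle each):

```python
def cpu_time(instructions, cpi, clock_ghz):
    """Basic performance equation: time = instructions x CPI x seconds/cycle."""
    cycle_time = 1.0 / (clock_ghz * 1e9)  # seconds per clock cycle
    return instructions * cpi * cycle_time

# Hypothetical program: the CISC encoding needs fewer instructions,
# the RISC encoding needs more instructions but far fewer cycles each.
cisc_time = cpu_time(instructions=1_000_000, cpi=6.0, clock_ghz=1.0)
risc_time = cpu_time(instructions=2_500_000, cpi=1.2, clock_ghz=1.0)
print(cisc_time, risc_time)  # the RISC version wins despite running 2.5x the instructions
```

With these illustrative numbers, 2.5 times as many instructions at one-fifth the CPI still halves the execution time.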

RISC vs CISC Machines
The simple instruction set of RISC machines enables control units to be hardwired for maximum speed. The more complex and variable-length instruction set of CISC machines requires microcode-based control units that interpret instructions as they are fetched from memory; this translation takes time. With fixed-length instructions, RISC lends itself to pipelining and speculative execution, since instruction boundaries are predictable.

RISC vs CISC Machines
Consider the program fragments:

CISC:
    mov ax, 10
    mov bx, 5
    mul bx, ax

RISC:
    mov ax, 0
    mov bx, 10
    mov cx, 5
    Begin: add ax, bx
           loop Begin

The total clock cycles for the CISC version might be:
    (2 movs × 1 cycle) + (1 mul × 30 cycles) = 32 cycles
while the clock cycles for the RISC version are:
    (3 movs × 1 cycle) + (5 adds × 1 cycle) + (5 loops × 1 cycle) = 13 cycles
Because the RISC clock cycle is also typically shorter, RISC gives much faster execution speeds.
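The cycle totals above can be tallied mechanically. A small sketch, using the per-instruction costs assumed on the slide (mov/add/loop at 1 cycle, mul at 30 cycles):

```python
# Assumed cycle costs from the slide: simple ops take 1 cycle, mul takes 30.
CYCLES = {"mov": 1, "add": 1, "loop": 1, "mul": 30}

def total_cycles(trace):
    """Sum the cycle cost of a dynamic instruction trace (a list of opcodes)."""
    return sum(CYCLES[op] for op in trace)

# CISC version: two movs, then one 30-cycle multiply.
cisc_trace = ["mov", "mov", "mul"]

# RISC version: three movs, then the add/loop pair executes 5 times (cx = 5).
risc_trace = ["mov"] * 3 + ["add", "loop"] * 5

print(total_cycles(cisc_trace))  # 32
print(total_cycles(risc_trace))  # 13
```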

RISC Machines
By reducing instruction complexity, simpler chips are needed; as a result, transistors heavily used by CISC can be used in more innovative ways: pipelines, cache, and registers.

CISC procedure calls and parameter passing involve considerable effort and resources:
- Saving a return address
- Preserving register values
- Passing parameters by pushing them on a stack or using registers
- Branching to the subroutine
- Executing the subroutine
- On return to the calling program, saving any modified parameter values
- Restoring the previous register values

Freeing registers from these tasks gives RISC machines hundreds of registers with which to create register environments. Because of their load-store ISAs, RISC architectures require a large number of CPU registers. These registers provide fast access to data during sequential program execution. They can also be employed to reduce the overhead typically caused by passing parameters to subprograms: instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers.

RISC Registers
Instead of pulling parameters off of a stack, the subprogram is directed to use a subset of registers. Each subset of registers is called a register window set.
- With the registers divided into sets, when a program is executing in a particular environment, only one register set is in use (visible).
- If the program changes to a different environment (i.e., a procedure is called), a different register set becomes visible.
- This approach is faster and much less complex than reusing the same registers.
For parameter passing with register window sets, the windows must overlap. Overlap is accomplished by dividing each register window set into distinct partitions:
- Global registers: common to all windows
- Local registers: local to the current window
- Input registers: overlap with the preceding window's output registers
- Output registers: overlap with the next window's input registers

RISC Registers
Typical real RISC architectures include 16 register sets (windows) of 32 registers each. The CPU is restricted to operating in only one window at a time, so at any snapshot in time only 32 registers are visible: 8 global registers (R0-R7), 8 input registers (R8-R15), 8 local registers (R16-R23), and 8 output registers (R24-R31). The current window pointer (CWP) points to the active register window.

This is how registers are overlapped in a RISC system: when procedure one calls procedure two, any parameters needing to be passed are placed into the output register set of procedure one. Once procedure two begins to execute, the output registers of P1 become the input registers of P2.
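The overlap can be modeled with simple index arithmetic. This is a minimal sketch, assuming the layout above (8 globals shared by all windows, then 8 in / 8 local / 8 out per window) and a window step of 16 physical registers, so one window's output registers alias the next window's inputs:

```python
GLOBALS = 8       # R0-R7 are common to every window
STEP = 16         # each call advances 16 physical registers, so the
                  # OUT partition of window w aliases the IN partition of w+1

def phys(window, reg):
    """Map architectural register R0-R31, seen from a window, to a physical index."""
    if reg < GLOBALS:
        return reg                           # globals: same physical register everywhere
    return GLOBALS + window * STEP + (reg - GLOBALS)

# Globals are shared: R3 is the same physical register in every window.
assert phys(0, 3) == phys(7, 3)

# The output registers (R24-R31) of window 0 are the input
# registers (R8-R15) of window 1 -- parameters pass with no copying.
for i in range(8):
    assert phys(0, 24 + i) == phys(1, 8 + i)
```

This is why a call can pass parameters for free: the caller writes its outputs, the CWP advances by one window, and the callee simply reads the same physical registers as its inputs.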

RISC vs CISC

RISC:
- Multiple register sets.
- Three operands per instruction.
- Parameter passing through register windows.
- Single-cycle instructions.
- Hardwired control.
- Highly pipelined.

CISC:
- Single register set.
- One or two register operands per instruction.
- Parameter passing through memory.
- Multiple-cycle instructions.
- Microprogrammed control.
- Less pipelined.

Continued...

RISC vs CISC

CISC:
- Many complex instructions.
- Variable-length instructions.
- Complexity in microcode.
- Many instructions can access memory.
- Many addressing modes.

RISC:
- Simple instructions, few in number.
- Fixed-length instructions.
- Complexity in compiler.
- Only LOAD/STORE instructions access memory.
- Few addressing modes.

RISC vs CISC Machines Today
It is becoming increasingly difficult to distinguish RISC architectures from CISC architectures: some RISC systems provide more extravagant instruction sets than some CISC systems, and some systems combine both approaches. With the rise of embedded systems and mobile computing, the terms RISC and CISC have lost much of their significance. The RISC vs CISC debate started when chip area and processor design complexity were the issues; now energy and power are the issues. The two top competitors today are ARM and Intel: Intel focuses on performance and ARM focuses on efficiency.

Flynn's Taxonomy
Many attempts have been made to come up with a way to categorize computer architectures. Flynn's Taxonomy has been the most enduring of these, despite having some limitations. Flynn's Taxonomy takes into consideration the number of processors and the number of data streams that flow into the processor. A machine can have one or many processors that operate on one or many data streams.
In Flynn's Taxonomy, the architecture is driven by the instructions' characteristics: all processor activities are determined by a sequence of program code, and the program code acts on the data. In data-driven (dataflow) architectures, by contrast, the sequence of processor events is based on the characteristics of the data, not of the instructions.

Flynn's Taxonomy
[Diagram: the four categories formed by single vs. multiple instruction streams and single vs. multiple data streams.]

Flynn's Taxonomy
The four combinations of instruction and data streams are described by Flynn as:
- SISD: single instruction stream, single data stream. These are classic uniprocessor systems.
- SIMD: single instruction stream, multiple data streams. Executes a single instruction using multiple computations at the same time (in parallel), exploiting data-level parallelism; all processing elements execute the same instruction simultaneously.
- MIMD: multiple instruction streams, multiple data streams. These are today's parallel architectures: multiple processors function independently and asynchronously, executing different instructions on different data.
- MISD: multiple instruction streams operating on a single data stream. Many processors perform different operations on the same data stream.
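The SISD/SIMD distinction can be shown in miniature: an SISD-style loop applies one instruction to one data element per step, while SIMD applies the same operation across all lanes in one conceptual step. A toy sketch (plain Python lists standing in for vector hardware):

```python
data_a = [1, 2, 3, 4]
data_b = [10, 20, 30, 40]

# SISD style: one instruction stream, one data element per step.
sisd_result = []
for a, b in zip(data_a, data_b):
    sisd_result.append(a + b)

# SIMD style: the *same* add is applied to every lane in one conceptual step.
def simd_add(xs, ys):
    """One 'instruction' operating on multiple data lanes at once."""
    return [x + y for x, y in zip(xs, ys)]

simd_result = simd_add(data_a, data_b)
assert sisd_result == simd_result == [11, 22, 33, 44]
```

Both styles compute the same answer; the difference is how many data elements one instruction touches per step.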

Flynn's Taxonomy
Flynn's Taxonomy falls short in a number of ways:
- First, there appears to be no need for MISD machines.
- Second, it assumes parallelism is homogeneous, which ignores the contribution of specialized processors.
- Third, it provides no straightforward way to distinguish architectures within the MIMD category: it doesn't take into consideration how the processors are connected or how they interface with memory.
One idea is to divide MIMD systems into those that share memory and those that don't (distributed memory), as well as by whether the interconnections are bus-based or switch-based.

Flynn's Taxonomy - MIMD
Symmetric multiprocessors (SMP) and massively parallel processors (MPP) are MIMD architectures that differ in how they use memory: SMP systems share the same memory, MPP systems do not. An easy way to distinguish SMP from MPP:
- MPP → many processors + distributed memory + communication via network
- SMP → fewer processors + shared memory + communication via memory

Flynn's Taxonomy - MIMD
Other examples of MIMD architectures are found in distributed computing, where processing takes place collaboratively among networked computers.
- A network of workstations (NOW) uses otherwise idle systems to solve a problem.
- A collection of workstations (COW) is a NOW where one workstation coordinates the actions of the others.
- A dedicated cluster parallel computer (DCPC) is a group of workstations brought together to solve a specific problem.
- A pile of PCs (POPC) is a cluster of (usually) heterogeneous systems that form a dedicated parallel system.
Another name for these approaches is "cluster computing."

Flynn's Taxonomy - Recent Expansion: SPMD
Flynn's Taxonomy has been expanded to include SPMD (single program, multiple data) architectures. Recall that SIMD executes a single instruction using multiple computations at the same time (data parallelism). In SPMD, multiple independent processors execute different tasks of the same program at the same time (task parallelism); supercomputers use this approach. Each SPMD processor has its own data set and program memory. Different nodes can execute different instructions within the same program using instructions similar to:
    If myNodeNum = 1 do this, else do that
With this, different nodes execute different instructions within the same program.
Yet another idea missing from Flynn's Taxonomy is whether the architecture is instruction driven or data driven.
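The "If myNodeNum = 1 do this, else do that" idea can be sketched as follows. Threads stand in for nodes here, and the per-node tasks are illustrative; the point is that every node runs the same program but branches on its node number:

```python
import threading

results = {}

def program(my_node_num, data):
    """The single program every node runs; behavior branches on the node id."""
    if my_node_num == 0:
        results[my_node_num] = sum(data)     # node 0: sum its own data set
    else:
        results[my_node_num] = max(data)     # other nodes: a different task

# Each node gets its own data set, but all nodes run the same program (SPMD).
node_data = {0: [1, 2, 3], 1: [7, 4, 5], 2: [9, 0, 2]}
threads = [threading.Thread(target=program, args=(n, d))
           for n, d in node_data.items()]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)  # node 0 summed its data; nodes 1 and 2 took the max of theirs
```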