+ CS 325: CS Hardware and Software Organization and Architecture Computer Evolution and Performance 2.

Presentation transcript:

+ CS 325: CS Hardware and Software Organization and Architecture Computer Evolution and Performance 2

+ Outline Von Neumann Architecture Processor Hierarchy Registers ALU Processor Categories Processor Performance Amdahl’s Law Computer Benchmarks

+ Von Neumann Architecture Characteristic of most modern processors. Central idea is Stored Program. Three basic components: Processor Memory I/O Facilities
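The stored-program idea can be sketched with a toy machine in which instructions and data sit in the same memory. This is a minimal illustration only: the four opcodes (`LOAD`, `ADD`, `STORE`, `HALT`), the single accumulator, and the `run` function are hypothetical and do not come from the slides.

```python
# Toy stored-program machine (hypothetical instruction set, for illustration).
# Instructions and data live in the SAME memory list.

def run(memory):
    """Fetch, decode, and execute instructions until HALT; return memory."""
    acc = 0  # single accumulator register
    pc = 0   # program counter: address of the next instruction
    while True:
        op, arg = memory[pc]  # fetch the instruction the PC points at
        pc += 1
        if op == "LOAD":      # acc <- memory[arg]
            acc = memory[arg]
        elif op == "ADD":     # acc <- acc + memory[arg]
            acc += memory[arg]
        elif op == "STORE":   # memory[arg] <- acc
            memory[arg] = acc
        elif op == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {op}")


program = [
    ("LOAD", 4),   # address 0
    ("ADD", 5),    # address 1
    ("STORE", 6),  # address 2
    ("HALT", 0),   # address 3
    2,             # address 4: data
    3,             # address 5: data
    0,             # address 6: result goes here
]

result = run(list(program))
print(result[6])
```

Because the program is just memory contents, loading a different list changes what the machine does without changing the machine itself.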

+ Illustration of Von Neumann Architecture

+ Processor Digital Device. Performs computation involving multiple steps. Building blocks used to form computer system.

+ Hierarchical Structure and Computational Engines Most computer architectures follow a hierarchical approach. Subparts of a large, central processor are sophisticated enough to meet our definition of a processor. Some engineers use the term computational engine for a sub-piece that is less powerful than the main processor.

+ Illustration of Processor Hierarchy

+ Major Components of a Conventional Processor Controller Computational Engine (ALU) Local Data Storage Internal Interconnections External Interface

+ Illustration of a Conventional Processor

+ Parts of a Conventional Processor Controller Overall responsibility for execution Moves through sequence of steps Coordinates other units Computational Engine Operates as directed by controller Typically provides arithmetic and Boolean operations (ALU) Performs one operation at a time

+ Parts of a Conventional Processor Local Data Storage Holds data values for operations Must be loaded before operation can be performed Typically implemented with registers Internal Interconnections Allows transfer of values among units of the processor Sometimes called data path

+ Parts of a Conventional Processor External Interface Handles communication between processor and rest of computer system Provides connections to external memory as well as external I/O devices

+ Another Illustration of Processor

+ Parts of a Conventional Processor ALU Status Flags: Neg, Zero, Carry, Overflow Shifter: Left → multiplication by 2; Right → division by 2 Complementer: Logical NOT
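The flags, shifter, and complementer can be sketched in software. The 8-bit width, the function names, and the dictionary of flags below are assumptions made for illustration; real hardware computes these with gates, not Python.

```python
# Sketch of ALU status flags, shifter, and complementer at an assumed 8-bit width.
WIDTH = 8
MASK = (1 << WIDTH) - 1   # 0xFF: keeps a value within 8 bits
SIGN = 1 << (WIDTH - 1)   # 0x80: the sign bit in two's complement


def add_with_flags(a, b):
    """Add two 8-bit values; return the 8-bit result and the four flags."""
    a &= MASK
    b &= MASK
    raw = a + b
    result = raw & MASK
    flags = {
        "neg": bool(result & SIGN),   # sign bit of the result is set
        "zero": result == 0,
        "carry": raw > MASK,          # unsigned result did not fit in 8 bits
        # Signed overflow: operands share a sign that the result does not.
        "overflow": bool(~(a ^ b) & (a ^ result) & SIGN),
    }
    return result, flags


def shift_left(x):
    """Left shift by one bit: multiplication by 2 (modulo 2**WIDTH)."""
    return (x << 1) & MASK


def shift_right(x):
    """Logical right shift by one bit: unsigned division by 2, rounded down."""
    return (x & MASK) >> 1


def complement(x):
    """Logical NOT of every bit."""
    return ~x & MASK


print(add_with_flags(0x7F, 0x01))
print(shift_left(5), shift_right(10), hex(complement(0x0F)))
```

Adding 0x7F and 0x01 sets Neg and Overflow but not Carry, which shows that Carry tracks unsigned range and Overflow tracks signed range.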

+ Example Register Organizations

+ Processor Registers Motorola CPU - MC68000 8 32-bit general purpose registers (D0 – D7) 8 32-bit address registers (A0 – A7) 1 32-bit program counter 1 16-bit status register

+ Processor Registers Intel 8086 – 16-bit General Purpose: AX – Accumulator: Multiply, Divide, I/O BX – Base: Pointer to base address (data) CX – Count: Counter for loops, shifts DX – Data: Multiply, Divide, I/O Pointer and Index: SP – Stack Pointer: pointer to top of stack BP – Base Pointer: pointer to base address (stack) SI – Source Index: source string/index pointer DI – Destination Index: destination string/index pointer Segment Registers: CS – Code Segment DS – Data Segment SS – Stack Segment ES – Extra Segment Program Status: IP – Instruction Pointer (the 8086 name for the program counter) FLAGS – Status Register

+ Processor Registers Intel – Pentium 2 Similar to 8086, but register width doubled to 32-bit

+ Arithmetic Logic Unit (ALU) Main computational engine in conventional processor. Complex unit that can perform a variety of tasks: Integer arithmetic (add, subtract, multiply, divide) Shift (left, right, circular) Boolean (AND, OR, NOT, XOR) Typically CPU “bit size” refers to ALU and register size 32-bit CPU → 32-bit ALU and registers 64-bit CPU → 64-bit ALU and registers
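The effect of a fixed ALU width on circular shifts and on arithmetic can be sketched as follows. The function names and the default 32-bit width are assumptions for illustration; Python integers are unbounded, so the width is imposed with a mask.

```python
# Sketch: circular shift (rotate) and wrap-around addition at a fixed width.

def rotate_left(value, count, width=32):
    """Circular left shift: bits pushed out on the left re-enter on the right."""
    mask = (1 << width) - 1
    count %= width
    value &= mask
    return ((value << count) | (value >> (width - count))) & mask


def rotate_right(value, count, width=32):
    """Circular right shift: bits pushed out on the right re-enter on the left."""
    return rotate_left(value, width - (count % width), width)


def add_fixed_width(a, b, width=32):
    """Addition as a fixed-width ALU sees it: the result wraps modulo 2**width."""
    return (a + b) & ((1 << width) - 1)


print(hex(rotate_left(0x80000001, 1)))
print(hex(rotate_right(0x00000003, 1)))
print(add_fixed_width(0xFFFFFFFF, 1))
print(add_fixed_width(0xFFFFFFFF, 1, width=64))
```

The same addition wraps to 0 at 32 bits but not at 64 bits, which is one practical meaning of a CPU's "bit size".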

+ Processor Categories and Roles Many possible roles for individual processors in: Coprocessors Microcontrollers Microsequencers Embedded system processors General purpose processors

+ Coprocessor Operates in conjunction with and under the control of another processor. Special purpose processor Performs a single task Operates at high speed Example: Math Coprocessor Used for floating point mathematical operations

+ Microcontroller Programmable device Dedicated to control of a physical system Examples: ECU for an automobile engine; roadway intersection traffic lights

+ Microsequencer Similar to microcontroller Controls coprocessors and other engines within a large processor Example: Move operands to floating point unit Invoke an operation (divide) Move result back to memory

+ Embedded System Processor Operates sophisticated electronic device Usually more powerful than microcontroller Example: Controlling a DVD player, including commands from a remote control

+ General Purpose Processor Most powerful type of processor Completely programmable Full functionality Example: CPU in personal computer/laptop (CISC x86 architecture) CPU in smartphone/tablet (RISC ARM architecture)

+ Processor Performance

+ Clock and Instruction Rate Clock Cycle Time interval in which all basic circuits (steps) inside a processor must complete Time at which gates are clocked (gate-signal propagation) Clock Rate 1 / clock cycle time (GHz – billion cycles per second) Instruction Rate Measure of how many instructions are executed per unit of time MIPS – million instructions per second Varies since some instructions take more time (more clock cycles) than others Shift left instruction vs. fetch from memory instruction
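The relation between clock rate, cycle time, and MIPS can be sketched numerically. The function names and the sample figures (a 2 GHz clock, 4 cycles per instruction on average) are made up for illustration and are not taken from the slides.

```python
# Sketch: clock cycle time and instruction rate from a clock rate.

def cycle_time_seconds(clock_rate_hz):
    """Clock cycle time is the reciprocal of the clock rate."""
    return 1.0 / clock_rate_hz


def mips(clock_rate_hz, avg_cycles_per_instruction):
    """Million instructions per second for a given average cycle count."""
    return clock_rate_hz / (avg_cycles_per_instruction * 1e6)


clock = 2e9  # assumed 2 GHz clock
print(cycle_time_seconds(clock))  # seconds per cycle
print(mips(clock, 4))             # mostly slow instructions
print(mips(clock, 1))             # mostly single-cycle instructions
```

The same clock yields a different MIPS figure depending on the instruction mix, which is why MIPS alone is a weak basis for comparing processors.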

+ Basic Performance Equation Define: N = Number of instructions executed in the program S = Average number of cycles for instructions in the program R = Clock rate T = Program execution time T = (N * S) / R
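A minimal sketch of the basic performance equation, T = (N * S) / R, using the slide's symbols as parameter names. The sample values are invented for illustration.

```python
# Sketch of the basic performance equation T = (N * S) / R.

def execution_time(n, s, r):
    """Return program execution time in seconds.

    n: number of instructions executed
    s: average number of clock cycles per instruction
    r: clock rate in cycles per second (Hz)
    """
    if r <= 0:
        raise ValueError("clock rate must be positive")
    return (n * s) / r


# Assumed workload: one billion instructions, 2 cycles each, 2 GHz clock.
t = execution_time(n=1_000_000_000, s=2, r=2_000_000_000)
print(t)
```

N * S is the total cycle count; dividing by cycles per second leaves seconds.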

+ Improve Performance To improve performance: Decrease N and/or S Increase R Parameters are not independent: Increasing R may increase S as well N is primarily controlled by compiler Processors with large R may not have the best performance Due to larger S Making logic circuits faster/smaller is a definite win Increases R while S and N remain unchanged
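The dependence between R and S can be made explicit with the basic performance equation T = (N * S) / R. The subscripts "old" and "new" and the sample percentages below are assumptions introduced for this worked example. With N unchanged:

```latex
\[
\frac{T_{\text{new}}}{T_{\text{old}}}
  = \frac{N \, S_{\text{new}} / R_{\text{new}}}{N \, S_{\text{old}} / R_{\text{old}}}
  = \frac{S_{\text{new}}}{S_{\text{old}}} \cdot \frac{R_{\text{old}}}{R_{\text{new}}}
\]

% Example: clock rate raised by 50%, average cycles per instruction up by 60%.
\[
\frac{T_{\text{new}}}{T_{\text{old}}} = \frac{1.6}{1.5} \approx 1.07
\]
```

A higher clock rate helps only when R grows by a larger factor than S; in the example the program runs about 7% slower despite the faster clock.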

+ Amdahl’s Law Potential speed up of program using multiple processors. Concluded that: Code needs to be parallelizable Speed up is bounded, giving diminishing returns for more processors Task dependent Servers gain by maintaining multiple connections on multiple processors Databases can be split into parallel tasks
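The bound and the diminishing returns can be sketched numerically. This assumes the usual multiprocessor form of Amdahl's Law, in which a fraction f of the run time parallelizes perfectly across n processors; the function names and the value f = 0.9 are illustrative choices.

```python
# Sketch: Amdahl's Law for n processors, assuming a fraction f of the
# execution time is perfectly parallelizable and the rest is serial.

def parallel_speedup(f, n):
    """Overall speedup when the parallel fraction f runs on n processors."""
    return 1.0 / ((1.0 - f) + f / n)


def speedup_limit(f):
    """Upper bound on speedup as n grows without limit (requires f < 1)."""
    return 1.0 / (1.0 - f)


f = 0.9  # assumed: 90% of the program parallelizes
for n in (1, 2, 10, 100, 1000):
    print(n, round(parallel_speedup(f, n), 3))
print("limit:", round(speedup_limit(f), 3))
```

Each tenfold increase in processors buys less than the previous one, and no processor count exceeds the limit set by the serial fraction.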

+ Amdahl’s Law Most important principle in computer design: Make the common case fast Optimize for the normal case Enhancement: any change/modification in the design of a component Speedup: how much faster a task will execute using an enhanced component versus using the original component. Speedup = (performance with enhanced component) / (performance with original component)

+ Amdahl’s Law The enhanced feature may not be used all the time. Let the fraction of the computation time when the enhanced feature is used be F. Let the speedup when the enhanced feature is used be Se. Now the execution time with the enhancement is: Ex_new = Ex_old * (1 – F) + Ex_old * (F / Se) This gives the overall speedup (So) as: So = Ex_old / Ex_new = 1 / ((1 – F) + (F / Se))
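The overall speedup formula, So = 1 / ((1 – F) + (F / Se)), translates directly into a small function. The function name and argument checks are illustrative additions; the sample inputs are an enhancement that is 10 times faster and usable 40% of the time.

```python
# Sketch of the overall speedup So = 1 / ((1 - F) + F / Se).

def overall_speedup(f, se):
    """Return the overall speedup So.

    f:  fraction of the original execution time in which the enhancement applies
    se: speedup of the enhanced portion alone
    """
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if se <= 0:
        raise ValueError("se must be positive")
    return 1.0 / ((1.0 - f) + f / se)


so = overall_speedup(f=0.4, se=10)
print(round(so, 2))
```

Even with Se = 10, the overall gain stays modest because the unenhanced 60% of the run time is untouched.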

+ Amdahl’s Law – Example 1 Suppose that we are considering an enhancement that runs 10 times faster than the original component but is usable only 40% of the time. What is the overall speedup gained by incorporating the enhancement? Se = 10 F = 40 / 100 = 0.4 So = 1 / ((1 – F) + (F / Se)) = 1 / (0.6 + (0.4 / 10)) = 1 / 0.64 = 1.56

+ Amdahl’s Law – Example 2 Suppose that we hired a guru programmer who made 70% of our program run 15x faster than the original program. What is the speedup of the enhanced program? Se = 15 F = 70 / 100 = 0.7 So = 1 / ((1 – F) + (F / Se)) = 1 / (0.3 + (0.7 / 15)) = 1 / 0.3467 = 2.88

+ Amdahl’s Law – Example 3 Suppose that we hired two students to enhance our WKU web server performance. The first student increased the performance of the server by 12% for 85% of the time. The second student increased the performance of the server by 2x for 25% of the time. Which student produced the highest overall speedup? Student 1: Se = 1.12 F = 85 / 100 = 0.85 So = 1 / ((1 – F) + (F / Se)) = 1 / (0.15 + (0.85 / 1.12)) = 1 / 0.909 = 1.10 Student 2: Se = 2 F = 25 / 100 = 0.25 So = 1 / ((1 – F) + (F / Se)) = 1 / (0.75 + (0.25 / 2)) = 1 / 0.875 = 1.14 Student 2 produced the higher overall speedup.

+ Benchmarks LINPACK (Scientific Computing) Speed in solving linear system of equations (matrix multiplications)

+ Top 10 Supercomputers

+ Top 500 Performance Development

+ Benchmarks - LINPACK Current fastest supercomputer: Tianhe-2 (MilkyWay-2) 3.12 million cores at 2.2GHz 33.86 Pflops/sec = 33,860,000,000,000,000 Floating point operations/sec Current High End Desktop: Intel i7 “Haswell” 4770K 4 cores at 3.5GHz 177 Gflops/sec = 177,000,000,000 Floating point operations/sec Current Google Android Smartphone: Google Nexus Ghz ARM RISC Architecture 393 Mflops/sec = 393,000,000 Floating point operations/sec
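The three LINPACK figures are easier to compare once converted to a common unit. The sketch below uses only the operations-per-second values quoted on the slide; the dictionary, function, and variable names are illustrative.

```python
# Sketch: convert the quoted LINPACK figures to plain flops and compare them.

PREFIX = {"M": 1e6, "G": 1e9, "P": 1e15}  # mega, giga, peta


def to_flops(value, prefix):
    """Convert a prefixed figure (e.g. 177 Gflops) to floating point ops/sec."""
    return value * PREFIX[prefix]


supercomputer = to_flops(33.86, "P")  # Tianhe-2 figure from the slide
desktop = to_flops(177, "G")          # desktop figure from the slide
smartphone = to_flops(393, "M")       # smartphone figure from the slide

print(f"supercomputer / desktop: {supercomputer / desktop:,.0f}x")
print(f"desktop / smartphone:    {desktop / smartphone:,.0f}x")
```

By these figures the supercomputer delivers roughly 190,000 times the desktop's LINPACK rate, and the desktop roughly 450 times the smartphone's.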