COMPUTER ARCHITECTURE & OPERATIONS I Instructor: Hao Ji
Administrivia Class Web Page Syllabus Class Policy Class Notes Posted before class Read class notes before class Assignments Posted after class Pay attention to the due dates Blackboard Posting grades Sending out emails to class
Administrivia Instructional Addresses Instructor: Hao Ji Office phone: x7742 Office location: E&CS 2127A Office hours: M, W: 1:00 PM – 4:15 PM by appointment
Administrivia Grading Policy (4+) Assignments 40% Late Assignment Policy 0~24 hrs: -5% 24~48 hrs: -10% >48 hrs: grade = 0 (1+) Midterms: 30% (1) final: 25% (5+) quizzes: 5% Announced in the last class before quiz
Administrivia Textbook Computer Organization and Design: The Hardware/Software Interface, 5th Edition, by Patterson and Hennessy, Morgan Kaufmann Publishers, 2014 Same textbook in CS270
Honor Code All assignments, unless explicitly specified, are to be completed on your own ODU Honor Council Evidence of cheating, plagiarism, or unauthorized collaboration will result in a 0 grade for quiz/assignment/exam May have further consequences
How to get help? Ask questions in class (or after class) Attend office hours Email me Make sure that you put “CS170” in your subject line Send it from your ODU email account so that it does not end up in my spam folder State clearly what you need in your email
How to Get an A in this Class Attendance Attend class regularly and on time Ask questions Work on in-class exercises and labs Notes Read over class notes before class Review class notes after class Homework Get started as early as possible Contact me if you encounter problems
CS170 will cover Chapters 1, 2, 3 Appendix B
What you will learn What is a Computer?
What you will learn Representing numbers in computers Binary, Octal, Hexadecimal Positive, Negative Floating Point Numbers Designing Computer Logic Computer Hardware Components
What You Will Learn How programs are translated into the machine language And how the hardware executes them The hardware/software interface What determines program performance And how it can be improved How hardware designers improve performance What is parallel processing
What You Will Learn Understanding Performance Algorithm Determines number of operations executed Programming language, compiler, architecture Determine number of machine instructions executed per operation Processor and memory system Determine how fast instructions are executed I/O system (including OS) Determines how fast I/O operations are executed
Topics Overview of Computer Architectures Classes of computers Components of a computer Input Output Processing Programming languages High-level language Hardware language Performance Definition Measure Power wall
Topics (cont.) Basics of Logic Design Gates Truth Tables Logic Equations Combinational Logic Hardware Description Language ALU Clocks Memory Elements Flip-Flops, Latches, and Registers SRAM and DRAM Timing Methodologies Programmable Devices
Topics (cont.) Instructions of the Computer Operations and Operands of the Computer Hardware Logical Instruction Decision Making Instructions Representation of numbers Instruction representations Communication Addressing Synchronization Parallelism
Topics (cont.) Arithmetic Addition and Subtraction Multiplication Division Floating Point Parallelism
Importance of This Course Prerequisite for CS270 You must get a C or better to pass Foundation for advanced courses Operating Systems Programming Language Compiler Design Networking Parallel Programming Algorithm I/O Management
Homework 0: Who Are We? Tell us about yourself, Name/Year/Major Something interesting about yourself Expectation in this class
Computer What is a Computer?
Computer What is a Computer? “A computer is a general-purpose device that can be programmed to carry out a set of arithmetic or logical operations automatically” -- Wikipedia.
Computer Evolution Moore’s Law The number of transistors that can be placed inexpensively on an integrated circuit doubles approximately every two years Chip performance doubles every two years So do CPU speed Memory capacity Number of sensors Number of pixels in digital cameras
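As a rough illustration of what "doubles approximately every two years" implies, the sketch below projects a transistor count forward under an idealized exponential model. The starting count of one million and the function name `projected_transistors` are assumptions chosen for the example, not figures from the slides.

```python
def projected_transistors(initial: float, years: float,
                          doubling_period: float = 2.0) -> float:
    """Project a transistor count assuming it doubles every `doubling_period` years."""
    return initial * 2 ** (years / doubling_period)


# Hypothetical starting point: 1,000,000 transistors.
# After 10 years there have been 5 doublings, so the count grows 32-fold.
print(projected_transistors(1_000_000, 10))  # 32000000.0
```

Real chips only follow this curve approximately; the model is useful for seeing how quickly repeated doubling compounds.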
Moore’s Law
The Computer Revolution Progress in computer technology Underpinned by Moore’s Law Makes novel applications feasible Computers in automobiles Cell phones Human genome project Computational biology/chemistry/physics World Wide Web Search Engines Computers are pervasive §1.1 Introduction
Classes of Computers Desktop computers General purpose, variety of software Subject to cost/performance tradeoff
Classes of Computers Server computers Network based High capacity, performance, reliability Range Small file servers Supercomputers
Poor Man’s Super Computer What is a Cluster? “Collection of interconnected stand-alone computers working together as a single, integrated computing resource” Cluster consists of Nodes Network OS Cluster middleware Standard components Avoiding expensive proprietary components
Classes of Computers Embedded computers Hidden as components of systems Examples Computer in your car Processor in your cell phone Stringent power/performance/cost constraints
The Processor Market
Decimal Representation Example 5489 = 5×10^3 + 4×10^2 + 8×10^1 + 9×10^0
Binary Representation Only 0s and 1s Example 100100110b = 1×2^8 + 0×2^7 + 0×2^6 + 1×2^5 + 0×2^4 + 0×2^3 + 1×2^2 + 1×2^1 + 0×2^0
Decimal to Binary Number 294
Divide by 2: result 147, remainder 0
Divide by 2: result 73, remainder 1
Divide by 2: result 36, remainder 1
Divide by 2: result 18, remainder 0
Divide by 2: result 9, remainder 0
Divide by 2: result 4, remainder 1
Divide by 2: result 2, remainder 0
Divide by 2: result 1, remainder 0
Divide by 2: result 0, remainder 1
Answer: 100100110b (remainders read from last to first)
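The repeated-division procedure can be sketched in a few lines of Python. This is a minimal illustration; the function name `to_binary` is an assumption made for the example.

```python
def to_binary(n: int) -> str:
    """Convert a non-negative integer to a binary string by repeated division by 2."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient becomes the next value to divide
        remainders.append(str(remainder))
    # The first remainder produced is the LSB, so reverse to put the MSB first.
    return "".join(reversed(remainders))


print(to_binary(294))  # 100100110
```

The result can be cross-checked against Python's built-in `bin(294)`, which returns `'0b100100110'`.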
Significant Bits Most Significant Bit (MSB) Least Significant Bit (LSB)
Octal Representation 294 = 100100110b = 446 (octal) Binary to Octal: group the bits in threes from the right: 100 100 110 → 4 4 6
Hexadecimal Representation 296 = 100101000b = 128 (hexadecimal) Binary to Hexadecimal: group the bits in fours from the right: 1 0010 1000 → 1 2 8
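Binary-to-octal and binary-to-hexadecimal conversion both work by grouping bits from the right: three bits per octal digit, four bits per hexadecimal digit. The sketch below applies that grouping to the bit patterns of 294 and 296; the helper name `regroup` is an assumption made for the example.

```python
def regroup(bits: str, group_size: int) -> str:
    """Convert a binary string to octal (group_size=3) or hexadecimal (group_size=4)."""
    digits = "0123456789ABCDEF"
    # Left-pad with zeros so the length is a multiple of the group size.
    padding = (-len(bits)) % group_size
    bits = "0" * padding + bits
    out = []
    for start in range(0, len(bits), group_size):
        group = bits[start:start + group_size]
        out.append(digits[int(group, 2)])  # value of this group as a single digit
    return "".join(out)


print(regroup("100100110", 3))  # 446  (294 in octal)
print(regroup("100100110", 4))  # 126  (294 in hexadecimal)
print(regroup("100101000", 4))  # 128  (296 in hexadecimal)
```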
Binary to Decimal 100100110b = 1×2^8 + 0×2^7 + 0×2^6 + 1×2^5 + 0×2^4 + 0×2^3 + 1×2^2 + 1×2^1 + 0×2^0 = 294
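The positional sum can be written directly as a loop over the bits, starting from the least significant bit. This is a minimal sketch; the function name `to_decimal` is an assumption made for the example.

```python
def to_decimal(bits: str) -> int:
    """Evaluate a binary string as the sum of bit * 2**position."""
    total = 0
    # reversed() makes position 0 correspond to the least significant bit.
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        total += int(bit) * 2 ** position
    return total


print(to_decimal("100100110"))  # 294
```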
Decimal terms and Binary Terms
Summary Syllabus Moore’s Law Classes of Computers Decimal, Binary, Octal, Hexadecimal Representations Conversion btw. Different Representations
Time for a Break (10 mins)
Review Last Session Syllabus Moore’s Law Classes of Computers Decimal, Binary, Octal, Hexadecimal Representations This Session Program and Computer Compiler, Assembler, and Linker Components of a Computer
Understanding Computer Performance The performance of a Program depends on Algorithm Determines number of operations executed Programming language, compiler, architecture Determine number of machine instructions executed per operation Processor and memory system Determine how fast instructions are executed I/O system (including OS) Determines how fast I/O operations are executed
Below Your Program Application software Written in high-level language System software Compiler: translates High Level Language code to machine code Operating System: service code Handling input/output Managing memory and storage Scheduling tasks & sharing resources Hardware Processor, memory, I/O controllers §1.2 Below Your Program
Levels of Program Code High-level language Level of abstraction closer to problem domain Provides for productivity and portability Assembly language Symbolic representation of instructions Hardware representation Binary digits (bits) Encoded instructions and data
Compiler Function of Compiler Convert programs in high-level language to programs in assembly language
Example: C Compiler C program Assembly Program
Assembler Translates assembly language into binary instructions Assembly Language Use symbols instead of 0’s and 1’s More readable
Binary Instructions MIPS binary code for summing the squares of the integers from 0 to 100
Linker Separate Compilation Allows a program to be split into pieces that are stored in different files Each file contains a logically related collection of subroutines and data structures that form a module Can be compiled separately Can be reused Linker Merge Modules together
Functions of a Linker
Tasks of a Linker Search the program libraries to find library routines used by the program Determine the memory locations that code from each module will occupy and relocates its instructions by adjusting absolute references Resolves references among modules Matching references
Relationship Among Compiler, Assembler, and Linker
Example: gcc compiler Compile a simple program gcc -v test.c
Components of a Computer Same components for all kinds of computer Desktop, server, embedded Input/output includes User-interface devices Display, keyboard, mouse Storage devices Hard disk, CD/DVD, flash Network adapters For communicating with other computers §1.3 Under the Covers The BIG Picture
Anatomy of a Computer Output device Input device Network cable
Anatomy of a Mouse Optical mouse LED illuminates desktop Small low-res camera Basic image processor Looks for x, y movement Buttons & wheel Supersedes roller-ball mechanical mouse
Through the Looking Glass LCD screen: picture elements (pixels) Mirrors content of frame buffer memory
Opening the Box
Inside the Processor (CPU) Datapath: performs operations on data Control: sequences datapath, memory,... Cache memory Small fast SRAM memory for immediate access to data
Inside the Processor AMD Barcelona: 4 processor cores
Abstractions Abstraction helps us deal with complexity Hide lower-level detail Instruction set architecture (ISA) The hardware/software interface Application binary interface The ISA plus system software interface Implementation The details underlying an interface The BIG Picture
A Safe Place for Data Volatile main memory Loses instructions and data when power off Non-volatile secondary memory Magnetic disk Flash memory Optical disk (CDROM, DVD)
Networks Communication and resource sharing Local area network (LAN): Ethernet Within a building Wide area network (WAN): the Internet Wireless network: WiFi, Bluetooth
Technology Trends Electronics technology continues to evolve Increased capacity and performance Reduced cost DRAM capacity
Summary Performance of a Computer Compiler Assembler Linker Components of a Computer
Time for a Break (10 mins)
Review Last Session Program and Computer Compiler, Assembler, and Linker Components of a Computer This Session Definition of Computer Performance Measure of Computer Performance
An Analogy Which airplane has the best performance? §1.4 Performance
Answer That depends on … If performance means “the least time to transfer 1 passenger from one place to another” Concorde “the least time to transfer 450 passengers from one place to another” Boeing 747 Performance can be defined in different ways
Response Time and Throughput Response time (AKA Execution Time) Total time required for a computer to complete a task Measured in units of time Throughput (AKA Bandwidth) Total work done per unit time e.g., tasks/transactions/… per hour
Response Time and Throughput Assuming each task in a computer is a serial task. How are response time and throughput affected by Replacing with a faster processor? Reduce response time Increase throughput Adding more processors? Increase throughput Same response time We’ll focus on response time for now…
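The two answers can be checked with a deliberately simple model. It assumes identical serial tasks, enough queued work to keep every processor busy, and no scheduling overhead; the function names and the 10-second task length are assumptions made for the example.

```python
def response_time(task_seconds: float, speedup: float = 1.0) -> float:
    """Time to finish one serial task on a processor that is `speedup` times faster."""
    return task_seconds / speedup


def throughput(task_seconds: float, processors: int = 1,
               speedup: float = 1.0) -> float:
    """Tasks completed per second when every processor always has a task to run."""
    return processors * speedup / task_seconds


# Baseline: one processor, 10 s per task.
print(response_time(10.0), throughput(10.0))
# Faster processor (2x): response time halves, throughput doubles.
print(response_time(10.0, speedup=2.0), throughput(10.0, speedup=2.0))
# Two processors at the original speed: same response time, double throughput.
print(response_time(10.0), throughput(10.0, processors=2))
```

A serial task cannot be split across processors, which is why adding processors leaves its response time unchanged in this model.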
Performance and Execution Time Performance = 1 / Execution Time A shorter execution time means higher performance
Relative Performance “X is n times faster than Y” means Performance X / Performance Y = Execution Time Y / Execution Time X = n Example: time taken to run a program 10s on A, 15s on B Execution Time B / Execution Time A = 15s / 10s = 1.5 So A is 1.5 times faster than B
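The relative-performance calculation reduces to a ratio of execution times. The sketch below reproduces the A-versus-B example; the function names are assumptions made for the example.

```python
def performance(execution_time_s: float) -> float:
    """Performance defined as the reciprocal of execution time."""
    return 1.0 / execution_time_s


def times_faster(time_x_s: float, time_y_s: float) -> float:
    """How many times faster X is than Y: Execution Time Y / Execution Time X."""
    return time_y_s / time_x_s


# Program takes 10 s on A and 15 s on B.
print(times_faster(10.0, 15.0))  # 1.5
```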
Measuring Execution Time Elapsed (Wallclock) time Total response time, including all aspects Processing, I/O, OS overhead, idle time Determines system performance CPU time Time spent processing a given job Discounts I/O time, other jobs’ shares Comprises user CPU time and system CPU time User CPU time: CPU time spent in a program itself System CPU time: CPU time spent in the OS performing task on behalf of the program Different programs are affected differently by CPU and system performance
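The difference between elapsed time and CPU time can be observed from inside a program. The sketch below uses Python's standard `time.perf_counter()` (wall-clock) and `time.process_time()` (user plus system CPU time of the current process); the helper `measure` and the workload `busy_then_idle` are assumptions made for the example.

```python
import time


def measure(func, *args):
    """Run func and return (result, wall-clock seconds, CPU seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu_elapsed = time.process_time() - cpu_start
    wall_elapsed = time.perf_counter() - wall_start
    return result, wall_elapsed, cpu_elapsed


def busy_then_idle():
    """Do some computation, then sleep to imitate waiting on I/O."""
    total = sum(i * i for i in range(200_000))
    time.sleep(0.2)  # idle time: counts toward elapsed time, not CPU time
    return total


_, wall, cpu = measure(busy_then_idle)
print(f"elapsed: {wall:.3f} s, CPU: {cpu:.3f} s")
```

The elapsed time includes the sleep while the CPU time does not, which mirrors how I/O waits and idle time inflate wall-clock time without consuming processor time.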
CPU Clocking Operation of digital hardware governed by a constant-rate clock Clock (cycles) Data transfer and computation Update state Clock period Clock period: duration of a clock cycle e.g., 250ps = 0.25ns = 250×10^-12 s Clock frequency (rate): cycles per second e.g., 4.0GHz = 4000MHz = 4.0×10^9 Hz
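Clock period and clock rate are reciprocals of each other, so the two examples on this slide describe the same clock. A minimal sketch of the conversion, with function names assumed for the example:

```python
def clock_period_s(rate_hz: float) -> float:
    """Clock period in seconds for a given clock rate in hertz."""
    return 1.0 / rate_hz


def clock_rate_hz(period_s: float) -> float:
    """Clock rate in hertz for a given clock period in seconds."""
    return 1.0 / period_s


period = clock_period_s(4.0e9)   # 4.0 GHz clock
rate = clock_rate_hz(250e-12)    # 250 ps period
print(f"{period * 1e12:.0f} ps, {rate / 1e9:.1f} GHz")  # 250 ps, 4.0 GHz
```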
CPU Time CPU Time = CPU Clock Cycles × Clock Cycle Time = CPU Clock Cycles / Clock Rate
Performance Improvement Performance improved by either Increasing clock rate => Shorter clock period => More but shorter instructions => More clock cycles Reducing number of clock cycles => Longer clock period => Fewer but longer instructions => Reducing clock rate Hardware designer must often trade off clock rate against cycle count
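CPU Time Example A Program on Computer A: 2GHz clock, 10s CPU time Designing Computer B Aim for 6s CPU time Can do faster clock, but causes 1.2 × clock cycles How fast must Computer B clock be? Clock Cycles A = CPU Time A × Clock Rate A = 10s × 2GHz = 20×10^9 Clock Rate B = 1.2 × 20×10^9 / 6s = 4GHz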
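The Computer B question can be worked through using CPU Time = Clock Cycles / Clock Rate. The sketch below follows that relationship step by step; the function names are assumptions made for the example.

```python
def clock_cycles(cpu_time_s: float, clock_rate_hz: float) -> float:
    """Clock cycles consumed: CPU time multiplied by clock rate."""
    return cpu_time_s * clock_rate_hz


def required_clock_rate(cycles: float, target_time_s: float) -> float:
    """Clock rate needed to run `cycles` cycles in `target_time_s` seconds."""
    return cycles / target_time_s


cycles_a = clock_cycles(10.0, 2.0e9)          # Computer A: 10 s at 2 GHz
cycles_b = 1.2 * cycles_a                     # B needs 1.2x as many cycles
rate_b = required_clock_rate(cycles_b, 6.0)   # B must finish in 6 s
print(f"{rate_b / 1e9:.1f} GHz")  # 4.0 GHz
```

Computer B needs double A's clock rate even though the target time is only 40% shorter, because the faster design also costs 20% more cycles.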
Instruction Set Architecture Instruction Set Architecture (ISA) An abstract interface between the hardware and the lowest-level software that encompasses all the information necessary to write a machine language program that will run correctly Repertoire of instructions Registers Memory access I/O
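Clock Cycles per Instruction (CPI) Average number of clock cycles per instruction for a program Clock Cycles = Instruction Count × CPI CPU Time = Instruction Count × CPI × Clock Cycle Time = Instruction Count × CPI / Clock Rate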
Instruction Count and CPI Instruction Count (IC) for a program Determined by program, ISA and compiler Average cycles per instruction Determined by CPU hardware If different instructions have different CPI Average CPI affected by instruction mix
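CPI Example Computer A: Cycle Time = 250ps, CPI = 2.0 Computer B: Cycle Time = 500ps, CPI = 1.2 Same ISA Which is faster, and by how much? CPU Time A = IC × 2.0 × 250ps = IC × 500ps CPU Time B = IC × 1.2 × 500ps = IC × 600ps A is faster, by CPU Time B / CPU Time A = 600ps / 500ps = 1.2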
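Because both computers implement the same ISA, the same program has the same instruction count on each, and that count cancels out of the comparison. The sketch below uses CPU Time = Instruction Count × CPI × Cycle Time; the instruction count of one million is an arbitrary placeholder and the function name is an assumption made for the example.

```python
def cpu_time_s(instruction_count: float, cpi: float, cycle_time_s: float) -> float:
    """CPU time = instruction count x cycles per instruction x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


IC = 1_000_000  # arbitrary; any value gives the same ratio
time_a = cpu_time_s(IC, 2.0, 250e-12)  # Computer A: CPI 2.0, 250 ps cycle
time_b = cpu_time_s(IC, 1.2, 500e-12)  # Computer B: CPI 1.2, 500 ps cycle
print(f"A is {time_b / time_a:.2f} times faster than B")  # A is 1.20 times faster than B
```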
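CPI in More Detail If different instruction classes take different numbers of cycles Clock Cycles = sum over classes i of (CPI_i × IC_i) Weighted average CPI = Clock Cycles / IC = sum over classes i of (CPI_i × IC_i / IC) IC_i / IC is the relative frequency of class i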
CPI Example Alternative compiled code sequences using instructions in classes A, B, C
Class               A  B  C
CPI for class       1  2  3
IC in sequence 1    2  1  2
IC in sequence 2    4  1  1
Sequence 1: IC = 5 Clock Cycles = 2×1 + 1×2 + 2×3 = 10 Avg. CPI = 10/5 = 2.0
Sequence 2: IC = 6 Clock Cycles = 4×1 + 1×2 + 1×3 = 9 Avg. CPI = 9/6 = 1.5
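The weighted-average CPI for the two code sequences can be recomputed with a short script. The dictionary layout and function names below are assumptions made for the example; the per-class CPI values and instruction counts are the ones given on the slide.

```python
# Cycles per instruction for each instruction class.
CPI_BY_CLASS = {"A": 1, "B": 2, "C": 3}


def total_cycles(counts: dict) -> int:
    """Sum of CPI x instruction count over all classes."""
    return sum(CPI_BY_CLASS[cls] * n for cls, n in counts.items())


def average_cpi(counts: dict) -> float:
    """Weighted average CPI: total clock cycles divided by total instruction count."""
    return total_cycles(counts) / sum(counts.values())


sequence_1 = {"A": 2, "B": 1, "C": 2}
sequence_2 = {"A": 4, "B": 1, "C": 1}

print(total_cycles(sequence_1), average_cpi(sequence_1))  # 10 2.0
print(total_cycles(sequence_2), average_cpi(sequence_2))  # 9 1.5
```

Sequence 2 executes more instructions but takes fewer clock cycles, so instruction count alone does not predict which sequence runs faster.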
Summary Response Time and Throughput Performance Measure CPI (Cycles per Instruction) IC (Instruction Count) Performance Definition
What I want you to do Review Chapter 1 Prepare for your first Quiz Next Class (Wednesday) Power Wall Basics of Logic Design Integrated Circuits