1
1 COMP 206: Computer Architecture and Implementation Montek Singh Wed., Aug 31, 2005 Lecture 1
2
2 Outline
  Course Information
    Logistics
    Grading
    Syllabus
    Course Overview
  Technology Trends
    Moore’s Law
    The CPU-Memory Gap
3
3 Course Information (1)
  Time and Place
    MW 2:00-3:15pm, Sitterson Hall 011
  Instructor
    Montek Singh
    montek@cs.unc.edu (not singh@cs!)
    SN 245, 962-1832
    Office hours: MW 3:15-4:15pm, and by appointment
  Teaching Assistant
    Maybe?
  Course Web Page
    http://www.cs.unc.edu/~montek
4
4 Course Information (2)
  Prerequisites
    COMP 120 and digital logic (PHYS 102), or equivalent
    I assume you know the following topics:
      CPU: ALU, control unit, registers, buses, memory management
      Control Unit: register transfer language, implementation, hardwired and microprogrammed control
      Memory: address space, memory capacity
      I/O: CPU-controlled (polling, interrupt), autonomous (DMA)
    Representative books (available in Brauer Library):
      Baron & Higbie: Computer Architecture. Addison Wesley, 1992
      Kuck: The Structure of Computers and Computations (Vol. 1). Wiley, 1978
      Stallings: Computer Organization and Architecture: Designing for Performance (4th edition). Prentice Hall, 1996
      Patterson & Hennessy: Computer Organization and Design: The Hardware/Software Interface (2nd edition). Morgan Kaufmann Publishers, 1997
5
5 Course Information (3)
  Textbook
    Hennessy & Patterson: Computer Architecture: A Quantitative Approach (3rd edition), Morgan Kaufmann Publishers, 2002
      available in the university bookstore
      also from: www.amazon.com, www.bn.com, …
6
6 Course Information (4)
  Textbook (contd.)
    We will cover the following material:
      Chapter 1 (Fundamentals of Computer Design)
      Chapter 2 (Instruction Set Principles and Examples)
      Appendix A (Pipelining: Basic and Intermediate Concepts)
      Chapters 3 & 4 (Instruction-Level Parallelism)
      Chapter 5 (Memory-Hierarchy Design)
      Chapter 7 (Storage Systems)
      Chapters 6 & 8 (Multiprocessors, Interconnection Networks): selected topics, time permitting
  Additional readings/papers will be handed out in class
    mostly on case studies
7
7 Course Information (5)
  Grading
    25-30% homework assignments (5 or 6)
    20-25% midterm exam
    15% small project
      no system building, no extensive programming
    35% final exam
  Assignments are due at the beginning of class on the due date
    Late assignments: penalty = 20%/day
  Honor Code is in effect for all homework/exams/projects
    You are encouraged to discuss ideas/concepts with others
    Work handed in must be your own
8
8 What is in COMP 206 for me?
  Understand modern computer architecture so you can:
    Write better programs
      Understand the performance implications of algorithms, data structures, and programming language choices
    Write better compilers
      Modern computers need better optimizing compilers and better programming languages
    Write better operating systems
      Need to re-evaluate the current assumptions and tradeoffs
      Example: gigabit networks
    Design better computer architectures
      There are still many challenges left
      Example: the CPU-memory gap
    Satisfy the Distribution Requirement
9
9 Computer Architecture Is …
  “…the structure of a computer that a machine language programmer must understand to write a correct (timing independent) program for that machine.”
  Amdahl, Blaauw, and Brooks, “Architecture of the IBM System/360”, IBM Journal of Research and Development, 1964
10
10 COMP 206 Course Focus
  Understanding the design techniques, machine structures, technology factors, and evaluation methods that will determine the form of computers in the 21st century
  [Figure labels: Computer Architecture (Instruction Set Design, Organization, Hardware); Technology; Programming Languages; Operating Systems; History; Applications; Interface Design; Measurement & Evaluation; Parallelism]
11
11 Computer Architecture Topics
  Instruction Set Architecture
    Addressing, Protection, Exception Handling
  Pipelining and Instruction Level Parallelism
    Pipelining, Hazard Resolution, Superscalar, Reordering, Prediction, Speculation
  Memory Hierarchy
    L1 Cache, L2 Cache, DRAM
    Coherence, Bandwidth, Latency
    Interleaving, Bus protocols
  Input/Output and Storage
    Disks, Tape
    RAID
    Emerging Technologies
  VLSI
12
12 Computer Engineering Methodology
  Evaluate Existing Systems for Bottlenecks
  Simulate New Designs and Organizations
  Implement Next Generation System
  [Figure labels for inputs to this cycle: Technology Trends, Benchmarks, Workloads, Implementation Complexity]
13
13 Underlying Technologies
  Year  Logic               Storage             Prog. Lang.        O/S
  54    Tubes               core (8 ms)
  58    Transistor (10µs)                       Fortran
  60                                            Algol, Cobol       Batch
  64    Hybrid (1µs)        thin film (200ns)   Lisp, APL, Basic
  66    IC (100ns)                              PL1, Simula, C
  67                                                               Multiprog.
  71    LSI (10ns)          1k DRAM             O.O.               V.M.
  73    (8-bit µP)
  75    (16-bit µP)         4k DRAM
  78    VLSI (10ns)         16k DRAM                               Networks
  80                        64k DRAM
  84    (32-bit µP)         256k DRAM           ADA
  87    ULSI                1M DRAM
  89    GaAs                4M DRAM             C++
  92    (64-bit µP)         16M DRAM            Fortran90
  [Figure labels: Generational, Evolutionary, Parallelism]
14
14 Predictions for the Early 2000s
  Technology
    Very large dynamic RAM: 256 Mbits to 1 Gb and beyond
    Large fast static RAM: 16 MB, 5 ns
  Complete systems on a chip
    100+ million transistors
  Parallelism
    Superscalar, Superpipelined, Vector, Multiprocessors?
    Processor Arrays?
  Special-Purpose Architectures?
    GPUs, mp3 players, nanocomputers …
  Reconfigurable Computers?
    Wearable computers
15
15 Predictions for the Early 2000s (2)
  Low Power
    50% of PCs portable now (?)
    Hand held communicators
    Performance per watt, battery life
    Transmeta
    Asynchronous (clockless) design
  Parallel I/O
    Many applications I/O limited, not computation
    Computation scaling, but memory, I/O bandwidth not keeping pace
  Multimedia
    New interface technologies
    Video, speech, handwriting, virtual reality, …
16
16 Diversion: Clocked Digital Design
  Most current digital systems are synchronous:
    Clock: a global signal that paces operation of all components
  Benefit of clocking: enables discrete-time representation
    all components operate exactly once per clock tick
    component outputs need to be ready by next clock tick
      allows “glitchy” or incorrect outputs between clock ticks
  [Figure label: clock]
17
17 Microelectronics Trends
  Current and Future Trends: Significant Challenges
    Large-Scale “Systems-on-a-Chip” (SoC)
      100 million to 1 billion transistors/chip
    Very High Speeds
      multiple gigahertz clock rates
    Explosive Growth in Consumer Electronics
      demand for ever-increasing functionality …
      … with very low power consumption (limited battery life)
    Higher Portability/Modularity/Reusability
      “plug ’n play” components, robust interfaces
18
18 Alternative Paradigm: Asynchronous Design
  Digital design with no centralized clock
  Synchronization using local “handshaking”
  Asynchronous Benefits:
    Higher Performance: not limited by slowest component
    Lower Power: zero clock power; inactive parts consume little power
    Reduced Electromagnetic Noise: no clock spikes [e.g., Philips pagers]
    Greater Modularity: variable-speed interfaces; reusable components
  [Figure: Synchronous System (Centralized Control) with a clock, versus Asynchronous System (Distributed Control) with handshaking interfaces]
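The "not limited by slowest component" benefit can be illustrated with a toy timing model. This is only a sketch under stated assumptions: the per-operation delays and the handshake overhead below are hypothetical values chosen for illustration, not figures from the lecture, and the model ignores pipelining and real circuit effects.

```python
# Hypothetical delays (in ns) of operations executed one after another.
# These values are illustrative assumptions, not data from the slides.
DELAYS_NS = [2.0, 3.0, 2.5, 8.0, 2.5]
HANDSHAKE_OVERHEAD_NS = 0.5  # assumed cost of local handshaking per operation


def synchronous_time(delays):
    """Every operation takes one clock period, set by the slowest operation."""
    clock_period = max(delays)
    return len(delays) * clock_period


def asynchronous_time(delays, overhead):
    """Each operation finishes as soon as it is done, plus handshake overhead."""
    return sum(delay + overhead for delay in delays)


sync_ns = synchronous_time(DELAYS_NS)
async_ns = asynchronous_time(DELAYS_NS, HANDSHAKE_OVERHEAD_NS)
print(f"synchronous:  {sync_ns} ns")
print(f"asynchronous: {async_ns} ns")
```

In this model the synchronous total is driven entirely by the single 8 ns operation, while the asynchronous total tracks the actual delays. If the handshake overhead were large relative to the spread in delays, the advantage would shrink or disappear.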
19
19 Tech Trends: Moore’s Law
  4004 (’71): 2,250 transistors
  8086 (’78): 29,000 transistors
  486™ DX (’89): 1,180,000 transistors
  Pentium 4 (’00): 42,000,000 transistors
  CMOS improvements:
    Die size: 2x every 3 yrs
    Line width: halves every 7 yrs
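As a quick check on these numbers, the transistor counts can be turned into an implied doubling time. The sketch below assumes steady exponential growth between two data points; the helper name `doubling_time_years` is introduced here for illustration only.

```python
import math

# Transistor counts quoted on the slide: name -> (year, transistors)
CHIPS = {
    "4004": (1971, 2_250),
    "8086": (1978, 29_000),
    "486 DX": (1989, 1_180_000),
    "Pentium 4": (2000, 42_000_000),
}


def doubling_time_years(first, last):
    """Years per doubling, assuming steady exponential growth between two points."""
    (year0, count0), (year1, count1) = first, last
    return (year1 - year0) / math.log2(count1 / count0)


overall = doubling_time_years(CHIPS["4004"], CHIPS["Pentium 4"])
print(f"4004 -> Pentium 4: one doubling every {overall:.2f} years")
```

Over the 29 years from the 4004 to the Pentium 4, the count doubles roughly every two years, which is consistent with the usual statement of Moore's Law.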
20
20 Tech Trends: Memory Capacity
  Capacity and cycle time of a single DRAM chip
  Year   Size     Cycle time
  1980   64 Kb    250 ns
  1983   256 Kb   220 ns
  1986   1 Mb     190 ns
  1989   4 Mb     165 ns
  1992   16 Mb    145 ns
  1995   64 Mb    100 ns
  2002   512 Mb   60 ns
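The DRAM figures above show capacity growing far faster than cycle time improves. A minimal sketch of that comparison follows, assuming steady exponential change between 1980 and 1995 and assuming that Kb and Mb mean 2**10 and 2**20 bits; the helper `factor_per_period` is a name introduced here for illustration.

```python
# (year, capacity in bits, cycle time in ns), copied from the slide.
# Assumption: Kb means 2**10 bits and Mb means 2**20 bits.
DRAM = [
    (1980, 64 * 2**10, 250),
    (1983, 256 * 2**10, 220),
    (1986, 1 * 2**20, 190),
    (1989, 4 * 2**20, 165),
    (1992, 16 * 2**20, 145),
    (1995, 64 * 2**20, 100),
    (2002, 512 * 2**20, 60),
]


def factor_per_period(start_value, end_value, elapsed_years, period_years):
    """Growth factor per `period_years`, assuming steady exponential change."""
    return (end_value / start_value) ** (period_years / elapsed_years)


first, last = DRAM[0], DRAM[-2]  # the 1980 and 1995 rows
elapsed = last[0] - first[0]

capacity_per_3_years = factor_per_period(first[1], last[1], elapsed, 3)
# Cycle time shrinks over time, so the speedup is old time / new time.
speedup_per_10_years = factor_per_period(last[2], first[2], elapsed, 10)

print(f"capacity:   {capacity_per_3_years:.2f}x per 3 years")
print(f"cycle time: {speedup_per_10_years:.2f}x faster per 10 years")
```

Between 1980 and 1995 capacity quadruples every three years, while cycle time improves by less than a factor of two per decade.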
21
21 Technology Trends (Summary)
           Capacity         Speed
  Logic    2x in 2 years    2x in 3 years
  DRAM     4x in 3 years    1.4x in 10 years
  Disk     2x in 3 years    1.4x in 10 years
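The summary rates are easier to compare once they are converted to a common time base. The sketch below annualizes each "k times in n years" entry and estimates how far logic speed pulls ahead of DRAM speed over a decade, assuming the rates stay constant; the names `annual_rate` and `factor_after` are introduced here for illustration.

```python
# (factor, years) pairs taken from the summary table above.
TRENDS = {
    "Logic": {"capacity": (2, 2), "speed": (2, 3)},
    "DRAM": {"capacity": (4, 3), "speed": (1.4, 10)},
    "Disk": {"capacity": (2, 3), "speed": (1.4, 10)},
}


def annual_rate(factor, years):
    """Compound annual growth rate implied by `factor` over `years`."""
    return factor ** (1 / years) - 1


def factor_after(factor, years, horizon_years):
    """Total growth after `horizon_years`, assuming the rate stays constant."""
    return factor ** (horizon_years / years)


for name, row in TRENDS.items():
    capacity = annual_rate(*row["capacity"])
    speed = annual_rate(*row["speed"])
    print(f"{name:5s} capacity {capacity:6.1%}/yr   speed {speed:6.1%}/yr")

# Speed gap between logic and DRAM that opens up over ten years.
gap_10_years = (factor_after(*TRENDS["Logic"]["speed"], 10)
                / factor_after(*TRENDS["DRAM"]["speed"], 10))
print(f"logic/DRAM speed gap after 10 years: {gap_10_years:.1f}x")
```

Under these rates, logic speed grows by about a quarter per year while DRAM speed grows by only a few percent per year, which is one way to quantify the CPU-memory gap named in the outline.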
22
22 Processor Performance Trends
23
23 Processor Perspective
24
24 Measurement and Evaluation
  Architecture is an iterative process:
    Searching the space of possible designs
    At all levels of computer systems
  [Figure labels: Design, Analysis, Creativity, Cost / Performance Analysis, Good Ideas, Mediocre Ideas, Bad Ideas]
25
25 Measurement Tools
  Die Area, Power, Speed Estimation Tools
  Benchmarks, Traces, Mixes
  Simulation (many levels)
    ISA, RT, Gate, Circuit
  Queuing Theory
  Rules of Thumb
  Fundamental Laws
26
26 The Bottom Line: Performance (and Cost)
  Plane        DC to Paris   Speed      Passengers   Throughput (pmph)
  Boeing 747   6.5 hours     610 mph    470          286,700
  Concorde     3 hours       1350 mph   132          178,200
  Time to run the task (Execution Time)
    Execution time, response time, latency
  Tasks per day, hour, week, sec, ns … (Performance)
    Throughput, bandwidth
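The throughput column can be reproduced as passengers multiplied by speed (passenger-miles per hour). The short sketch below recomputes it and contrasts the two notions of "faster"; the dictionary layout and function name are illustrative choices, not part of the lecture.

```python
# Figures from the slide's airplane comparison.
PLANES = {
    "Boeing 747": {"speed_mph": 610, "passengers": 470, "dc_to_paris_hours": 6.5},
    "Concorde": {"speed_mph": 1350, "passengers": 132, "dc_to_paris_hours": 3.0},
}


def throughput_pmph(plane):
    """Passenger-miles per hour: passengers carried times cruising speed."""
    return plane["passengers"] * plane["speed_mph"]


b747, concorde = PLANES["Boeing 747"], PLANES["Concorde"]

# Latency view: how much sooner does one passenger arrive?
latency_ratio = b747["dc_to_paris_hours"] / concorde["dc_to_paris_hours"]
# Throughput view: how much more total work gets done per hour?
throughput_ratio = throughput_pmph(b747) / throughput_pmph(concorde)

print(f"Concorde is {latency_ratio:.2f}x faster per trip (execution time)")
print(f"747 has {throughput_ratio:.2f}x the throughput (passenger-mph)")
```

The Concorde wins on execution time and the 747 wins on throughput, so "which is faster" depends on which metric the task cares about.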