Slide 1 Motivations and Introduction Phenomenal growth in the computer industry/technology: performance roughly doubling every 18 months over the past 20 years, yielding today's multi-GFLOPS processors, largely due to –Micro-electronics technology –Computer design innovations We have come a long way in the short span of 56 years since the 1st general-purpose electronic computer in 1946:
Slide 2 Motivations and Introduction Past (Milestones): –First electronic computer ENIAC in 1946: 18,000 vacuum tubes, 3,000 cubic feet, 20 ten-digit registers (each 2 feet long), 5 KIPS (thousands of additions per second); –First microprocessor (a CPU on a single IC chip) Intel 4004 in 1971: 2,300 transistors, 60 KIPS, $200; –Virtual elimination of assembly language programming reduced the need for object-code compatibility; –The creation of standardized, vendor-independent operating systems, such as UNIX and its clone, Linux, lowered the cost and risk of bringing out a new architecture; –RISC instruction set architectures paved the way for drastic design innovations that focused on two critical performance techniques: instruction-level parallelism and the use of caches
Slide 3 Motivations and Introduction Present (State of the art): –Microprocessors approaching/surpassing 10 GFLOPS; –A high-end microprocessor today delivers the performance of a supercomputer that cost $10 million ten years ago; –While technology advancement contributes a sustained annual growth of 35%, innovative computer design accounts for another 25% annual growth rate, compounding to roughly a factor of 15 in performance gains!
Slide 4 Motivations and Introduction Present (State of the art): –Three different computing markets (fig. 1.3): »Desktop Computing – driven by price-performance (systems ranging from a few hundred dollars to over $10K); »Servers – driven by availability (as distinguished from reliability), providing sustained high performance (fig. 1.2); »Embedded Computers – the fastest growing portion of the computer market, driven by real-time performance, with the need to minimize memory and power, often alongside ASICs
Slide 5 Motivations and Introduction Present (State of the art): –The Task of the Computer Designer (Fig. 1.4): »Instruction Set Architecture (the traditional view of what Computer Architecture is), the boundary between software and hardware; »Organization, the high-level aspects of a computer's design, such as the memory system, the bus structure, and the internal design of the CPU, based on a given instruction set architecture; »Hardware, the specifics of a machine, including the detailed logic design and the packaging technology of the machine. Future (Technology Trends): –A truly successful instruction set architecture (ISA) should last for decades; however, it takes a computer architect's acute observation of, and knowledge about, the rapidly changing technology for the ISA to survive and cope with such changes:
Slide 6 Motivations and Introduction Future (Technology Trends): »IC logic technology: transistor count on a chip grows at a 55% annual rate (about 35% from density growth plus growth in die size) while device speed scales more slowly; »Semiconductor DRAM: density grows at 60% annually while cycle time improves very slowly (decreasing by about one-third in ten years); bandwidth per chip increases twice as fast as latency decreases; »Magnetic disk technology: density has increased at a 100% annual rate since 1990 while access time improves by about a third every ten years; and »Network technology: both latency and bandwidth have been improving, with more focus on bandwidth of late; the increasing importance of networking has led to faster improvement in performance than before (Internet bandwidth doubles every year in the U.S.), a challenge and an opportunity for the computer designer »Scaling of transistor performance: while transistor density increases quadratically with a linear decrease in feature size, transistor performance increases only roughly linearly with decreasing feature size, again a challenge and an opportunity for the computer designer (see the sketch below)! »Wires and power in ICs: wire propagation delay improves little as feature size shrinks, and delivering and dissipating power become first-order design constraints
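To make the feature-size scaling bullet concrete, here is a minimal sketch of the usual first-order relations, assuming feature size λ (these proportionalities are the standard rough model, not figures taken from the slides):

```latex
% First-order scaling with feature size \lambda (rough model, for illustration only)
\[
  \text{Transistor density} \;\propto\; \frac{1}{\lambda^{2}}
  \qquad \text{(quadratic growth as } \lambda \text{ shrinks)}
\]
\[
  \text{Transistor switching speed} \;\propto\; \frac{1}{\lambda}
  \qquad \text{(roughly linear improvement)}
\]
```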
Slide 7 Motivations and Introduction Cost, Price and Their Trends: –Understanding the cost and pricing structure of the industry and market is key to cost-sensitive design of computers; –The Learning Curve: manufacturing costs decrease over time (Fig. 1.5 & 1.6), best measured by the change in yield; tracking yield helps project costs over a product's life
Slide 8 Motivations and Introduction Cost, Price and Their Trends: –Cost of an IC (Fig. 1.8):
Slide 9 Motivations and Introduction Cost, Price and Their Trends: –Cost of an IC: the die yield model has been obtained empirically, where α corresponds inversely to the number of masking levels (a measure of manufacturing complexity). For today's multilevel metal CMOS processes, α is estimated at 4.0 (see the formulas below)
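For reference, a sketch of the standard IC cost and die-yield relations that Fig. 1.8 summarizes, reconstructed from the usual textbook formulation (wafer diameter, wafer cost, and defect density are free parameters here, not values given on the slides):

```latex
\[
  \text{Cost of IC} =
  \frac{\text{Cost of die} + \text{Cost of testing die} + \text{Cost of packaging and final test}}
       {\text{Final test yield}}
\]
\[
  \text{Cost of die} =
  \frac{\text{Cost of wafer}}{\text{Dies per wafer} \times \text{Die yield}}
\]
\[
  \text{Dies per wafer} =
  \frac{\pi \times (\text{Wafer diameter}/2)^{2}}{\text{Die area}}
  \;-\;
  \frac{\pi \times \text{Wafer diameter}}{\sqrt{2 \times \text{Die area}}}
\]
\[
  \text{Die yield} = \text{Wafer yield} \times
  \left(1 + \frac{\text{Defects per unit area} \times \text{Die area}}{\alpha}\right)^{-\alpha},
  \qquad \alpha \approx 4.0
\]
```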
Slide 10 Motivations and Introduction Distribution of Cost in a System; Cost vs. Price (Fig. 1.10)
Slide 11 Motivations and Introduction Cost vs. Price (Fig. 1.10): –Component cost (CC): the original cost from a designer's point of view; –Direct cost (DC, 20% of CC): the cost of making a product (labor, scrap, warranty, etc.), not including service and maintenance; –Gross margin (GM, 33% of CC+DC): indirect cost overhead: R&D, marketing, sales, manufacturing equipment maintenance, building rental, cost of financing, pretax profits, and taxes; Average selling price (ASP) = CC + DC + GM –Average discount (AD, 33% of ASP): volume discounts given by manufacturers; List price = ASP + AD (a worked example follows below)
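A minimal sketch of how these percentages compose into a list price, following the relations quoted on the slide; the $1,000 component cost and the function name are illustrative assumptions, not figures from the text:

```python
# Sketch: cost-to-price build-up using the percentages quoted on the slide.
# The $1,000 component cost is a made-up illustration, not a figure from the text.

def price_breakdown(component_cost: float) -> dict:
    direct_cost = 0.20 * component_cost                    # DC = 20% of CC
    gross_margin = 0.33 * (component_cost + direct_cost)   # GM = 33% of (CC + DC)
    asp = component_cost + direct_cost + gross_margin      # ASP = CC + DC + GM
    avg_discount = 0.33 * asp                              # AD = 33% of ASP
    return {
        "component cost (CC)": component_cost,
        "direct cost (DC)": direct_cost,
        "gross margin (GM)": gross_margin,
        "average selling price (ASP)": asp,
        "average discount (AD)": avg_discount,
        "list price (ASP + AD)": asp + avg_discount,
    }

if __name__ == "__main__":
    for item, value in price_breakdown(1000.0).items():
        print(f"{item:>28}: ${value:,.2f}")
```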
Slide 12 Performances & Quantitative Principles “X is n times faster than Y” Performance (throughput) is inversely proportional to execution time (see the relations below). Definition of time: –wall-clock time: response time or elapsed time; –CPU time: the accumulated time during which the CPU is computing: »user CPU time »system CPU time –An example from UNIX: 90.7u 12.9s 2:39 65% »90.7u: user CPU time (seconds) »12.9s: system CPU time »2:39 (159 seconds): elapsed time »65%: the fraction of elapsed time spent as CPU time
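A short restatement of the relations the slide refers to, using the standard definitions; the arithmetic simply reuses the UNIX example above:

```latex
\[
  \text{``X is $n$ times faster than Y''}: \quad
  n = \frac{\text{Execution time}_{Y}}{\text{Execution time}_{X}}
    = \frac{\text{Performance}_{X}}{\text{Performance}_{Y}}
\]
\[
  \text{CPU share of elapsed time in the UNIX example:} \quad
  \frac{90.7 + 12.9}{159} \approx 0.65 = 65\%
\]
```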
Slide 13 Performances & Quantitative Principles Workload Representations (in decreasing order of accuracy): –Real applications: most accurate, but inflexible and poorly portable –Modified/scripted applications: scripts are used to exercise (or highlight) certain features and to enhance portability –Kernels: extracted from real programs, good for isolating the performance of individual features of a machine –Toy benchmarks: simple and run on almost all computers, good for beginning programming assignments –Synthetic benchmarks: artificially created to match an “average” execution profile; they do not reward optimizations of behaviors that occur in real programs but are absent from the benchmarks, and vice versa, and thus can be misleading
Slide 14 Performances & Quantitative Principles Benchmark Suites: collections of kernels, real programs, and benchmarks, lessening the weakness of any one benchmark by the presence of the others (fig. 1.11) –Desktop Benchmark Suites: »CPU-intensive benchmarks: SPEC (Standard Performance Evaluation Corporation): SPEC89, SPEC92, SPEC95, SPEC2000 (11 integer CINT2000 & 14 floating-point CFP2000 programs, fig. 1.12): real programs modified for portability and to highlight CPU performance »Graphics-intensive benchmarks: SPECviewperf for systems supporting the OpenGL graphics library, SPECapc for applications that make intensive use of graphics –Server Benchmark Suites: »CPU-throughput benchmarks: SPECrate, built by running multiple copies of the SPEC CPU2000 programs »I/O-intensive benchmarks: SPECSFS for file servers, SPECWeb for web servers »Transaction-processing (TP) benchmarks: TPC (Transaction Processing Council): TPC-A (1985), TPC-C (complex query environment), TPC-H (ad-hoc decision support), TPC-R (business decision support), TPC-W (web-oriented) –Embedded Benchmarks: EEMBC (pronounced “embassy”, fig. 1.13)