1
Lecture 2: Intro to Computer Architecture
Michael B. Greenwald Computer Architecture CIS 501 Fall 1999
2
General Information
Class: TR 1:30-3, in LRSM Auditorium
Recitation: T 10: in Moore 225
Instructor: Professor Michael Greenwald
  Office: Moore (GRW), room
  Office hours: R 10:30-12 noon or by appt.
TA: Sotiris Ioannidis
  Office: Moore, room 102e
  Office hours: TR 5-6PM or by appt.
Secretary: Christine Metz
  Office: Moore, room 556
3
Outline
Review
Quantitative principles of computer design
  Amdahl’s law
  CPU performance equation
Quantitative measurements
  Costs
  Performance
4
Typos in HW 3c. New version on web page. D = defects/
Defects per layer
5
Technology Trends: Microprocessor Capacity
“Graduation Window”
Transistors per chip:
  Alpha 21264: 15 million
  Pentium Pro: 5.5 million
  PowerPC 620: 6.9 million
  Alpha 21164: 9.3 million
  Sparc Ultra: 5.2 million
Moore’s Law
CMOS improvements:
  Die size: 2X every 3 yrs
  Line width: halve / 7 yrs
6
Trends in application demands
Programs' memory demands grow by a multiplicative factor each year (1/2 to 1 address bit per year).
Available disk space (or net bandwidth) is always consumed.
User I/O bandwidth grows: tty -> crt -> bitmap -> video -> ?virtual reality?
Processing power: it is cheapest to produce one version of a program. Optimize for mid-range: slow on low-end, fast on high-end.
Are these demands growing because of increased capabilities or increased appetites?
7
The Quantitative Approach
8
Measurement and Evaluation Quantitative Approach
Architecture is an iterative process:
  Searching the space of possible designs
  At all levels of computer systems
Figure: Creativity feeds Cost / Performance Analysis, which sorts ideas into Good Ideas, Mediocre Ideas, and Bad Ideas.
9
Measurement and Evaluation Quantitative Approach
Not a guarantee of good ideas, just a way to discard bad ideas.
Figure: Creativity feeds Cost / Performance Analysis, which sorts ideas into Good Ideas, Mediocre Ideas, and Bad Ideas.
10
Computer Engineering Methodology
Technology Trends
11
Computer Engineering Methodology
Evaluate Existing Systems for Bottlenecks (Benchmarks)
Technology Trends
Where to start: bottlenecks in existing systems
12
Computer Engineering Methodology
Evaluate Existing Systems for Bottlenecks (Benchmarks)
Technology Trends
Simulate New Designs and Organizations (Workloads)
13
Computer Engineering Methodology
Evaluate Existing Systems for Bottlenecks (Benchmarks)
Technology Trends
Simulate New Designs and Organizations (Workloads)
Implement Next Generation System (Implementation Complexity)
Implementation complexity:
  How hard to build
  Importance of simplicity (wearing a seat belt); avoiding a personal disaster
  Theory vs. practice
14
Measurement Tools Benchmarks, Traces, Mixes
Hardware: Cost, delay, area, power estimation
Simulation (many levels): ISA, RT, Gate, Circuit
Queuing Theory
Rules of Thumb
Fundamental “Laws”/Principles
Figure: cycle of Measure, Experiment, Analyze, Design
15
Measurement Tools
Benchmarks, Traces, Mixes
Hardware: Cost, delay, area, power estimation
Simulation (many levels): ISA, RT, Gate, Circuit
Queuing Theory
Rules of Thumb
Fundamental “Laws”/Principles
Figure: cycle of Measure, Experiment, Analyze, Design
All produce “measures”: what do measures mean? How do they compare?
16
The Bottom Line: Performance (and Cost)
Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
Boeing 747         6.5 hours     610 mph    470          286,700
BAC/Sud Concorde   3 hours       1350 mph   132          178,200

Fastest for 1 person? Which takes less time to transport 470 passengers?
Time to run the task (ExTime): execution time, response time, latency
Tasks per day, hour, week, sec, ns … (Performance): throughput, bandwidth
17
The Bottom Line: Performance (and Cost)
Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
Boeing 747         6.5 hours     610 mph    470          286,700
BAC/Sud Concorde   3 hours       1350 mph   132          178,200

Fastest for 1 person? Which takes less time to transport 470 passengers?
Which is better?
18
The Bottom Line: Performance (and Cost)
Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
Boeing 747         6.5 hours     610 mph    470          286,700
BAC/Sud Concorde   3 hours       1350 mph   132          178,200

Fastest for 1 person? Which takes less time to transport 470 passengers?
Which is better? It depends on whether you are trying to win a race from DC to Paris or trying to move the most people.
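The two questions on this slide can be checked with a few lines of arithmetic. This is a minimal sketch using the slide's figures; the scheduling model (one aircraft flying back-to-back one-way trips, with no turnaround time or return leg) and the helper names are illustrative assumptions, not part of the lecture.

```python
import math

# Figures from the slide: one-way trip time (hours), speed (mph), seats.
PLANES = {
    "Boeing 747": {"hours": 6.5, "mph": 610, "seats": 470},
    "Concorde": {"hours": 3.0, "mph": 1350, "seats": 132},
}


def throughput_pmph(plane):
    """Passenger-miles per hour: seats times speed."""
    return plane["seats"] * plane["mph"]


def hours_to_move(plane, people):
    """Hours for one aircraft to deliver `people` passengers.

    Assumption (not from the slide): trips run back to back, and the
    empty return leg and turnaround time are ignored.
    """
    trips = math.ceil(people / plane["seats"])
    return trips * plane["hours"]


if __name__ == "__main__":
    for name, plane in PLANES.items():
        print(name,
              "| 1 person:", hours_to_move(plane, 1), "h",
              "| 470 people:", hours_to_move(plane, 470), "h",
              "| throughput:", throughput_pmph(plane), "pmph")
```

Under these assumptions the Concorde wins on latency for a single passenger (3 hours vs. 6.5), while the 747 wins for 470 passengers (one trip of 6.5 hours vs. four Concorde trips).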
19
The Bottom Line: Performance (and Cost)
Plane              DC to Paris   Speed      Passengers   Throughput (pmph)
Boeing 747         6.5 hours     610 mph    470          286,700
BAC/Sud Concorde   3 hours       1350 mph   132          178,200

Fastest for 1 person? Which takes less time to transport 470 passengers?
Even if you are trying to move the most people, performance is useless without understanding cost. Otherwise, why not just fly two Concordes at once, doubling throughput?
…, $160M in ’98
20
Costs Performance metrics are mostly useless without understanding costs.
21
Integrated Circuits Costs
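$$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$
$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
Figure labels: Wafer, Die, Defect
Smaller dies are cheaper, and reduce cost per defect.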
22
Integrated Circuits Costs
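$$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$
$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
Figure label: Defect
Smaller dies are cheaper, and reduce cost per defect.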
23
IC Cost parameters
$\alpha$ = number of masking levels (a measure of manufacturing complexity); was typically 3.0, growing
Wafer yield = fraction of wafers that are not completely bad; typically close to 100%
Defects per unit area = 0.6 to 1.2 per cm$^2$; drops with learning curve
Die cost goes roughly with $(\text{die area})^4$
24
Integrated Circuits Costs
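$$\text{IC cost} = \frac{\text{Die cost} + \text{Testing cost} + \text{Packaging cost}}{\text{Final test yield}}$$
$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
$$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer diam}/2)^2}{\text{Die area}} - \frac{\pi \times \text{Wafer diam}}{\sqrt{2 \times \text{Die area}}} - \text{Test dies}$$
$$\text{Die yield} = \text{Wafer yield} \times \left\{ 1 + \frac{\text{Defects per unit area} \times \text{Die area}}{\alpha} \right\}^{-\alpha}$$
Die cost goes roughly with $(\text{die area})^4$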
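The die-cost formulas on this slide translate directly into code. The sketch below assumes the standard textbook yield model with exponent parameter alpha (the extracted slide text is garbled at that point, so the exact form is an assumption), and the wafer cost, wafer diameter, and defect density in the demo loop are hypothetical inputs chosen only for illustration.

```python
import math


def dies_per_wafer(wafer_diam_cm, die_area_cm2, test_dies=0):
    """Gross dies per wafer: wafer area over die area, minus an edge-loss
    term for partial dies along the circumference, minus test dies."""
    wafer_area = math.pi * (wafer_diam_cm / 2) ** 2
    edge_loss = math.pi * wafer_diam_cm / math.sqrt(2 * die_area_cm2)
    return wafer_area / die_area_cm2 - edge_loss - test_dies


def die_yield(defects_per_cm2, die_area_cm2, alpha=3.0, wafer_yield=1.0):
    """Fraction of dies that work, under the assumed alpha yield model."""
    return wafer_yield * (1 + defects_per_cm2 * die_area_cm2 / alpha) ** (-alpha)


def die_cost(wafer_cost, wafer_diam_cm, die_area_cm2, defects_per_cm2,
             alpha=3.0, wafer_yield=1.0, test_dies=0):
    """Wafer cost spread over the good dies on the wafer."""
    good_dies = (dies_per_wafer(wafer_diam_cm, die_area_cm2, test_dies)
                 * die_yield(defects_per_cm2, die_area_cm2, alpha, wafer_yield))
    return wafer_cost / good_dies


if __name__ == "__main__":
    # Hypothetical inputs: $1000 wafer, 20 cm diameter, 1.0 defects/cm^2.
    for area in (0.5, 1.0, 2.0):
        print(f"die area {area} cm^2 -> die cost ${die_cost(1000, 20, area, 1.0):.2f}")
```

Running the loop shows die cost rising much faster than linearly in die area, which is the point of the slide's closing remark.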
25
Integrated Circuits Costs
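$$\text{Die cost} = \frac{\text{Wafer cost}}{\text{Dies per wafer} \times \text{Die yield}}$$
$$\text{Dies per wafer} = \frac{\pi \times (\text{Wafer diam}/2)^2}{\text{Die area}} - \frac{\pi \times \text{Wafer diam}}{\sqrt{2 \times \text{Die area}}} - \text{Test dies}$$
$$\text{Die yield} = \text{Wafer yield} \times \left\{ 1 + \frac{\text{Defects per unit area} \times \text{Die area}}{\alpha} \right\}^{-\alpha}$$
Substituting, with wafer yield taken as 100% and test dies neglected:
$$\text{Die cost} = \text{Wafer cost} \times \frac{\left\{ 1 + \dfrac{\text{Defects per unit area} \times \text{Die area}}{\alpha} \right\}^{\alpha}}{\dfrac{\pi \times (\text{Wafer diam}/2)^2}{\text{Die area}} - \dfrac{\pi \times \text{Wafer diam}}{\sqrt{2 \times \text{Die area}}}}$$
Die cost goes roughly with $(\text{die area})^4$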
26
IC Cost parameters
Defects per unit area = 0.6 to 1.2 per cm$^2$
Technologies that can fix defects (e.g. lasers à la Lincoln Labs (MIT)) reduce effective defects per unit area and increase yield. However, one needs to understand their costs, which differ from the formula.
Still: die cost goes roughly with $(\text{die area})^{\alpha+1}$
27
Real World Examples (circa ’93)
Columns on the slide: Chip, Metal layers, Line width, Wafer cost, Defects/cm2, Area (mm2), Dies/wafer, Yield, Die cost. Only the chip names and die costs are legible in this transcript.

Chip          Die cost
386DX         $4
486DX2        $12
PowerPC 601   $53
HP PA         $73
DEC Alpha     $149
SuperSPARC    $272
Pentium       $417

From "Estimating IC Manufacturing Costs," by Linley Gwennap, Microprocessor Report, August 2, 1993, p. 15
28
Other Costs
$$\text{Die test cost} = \frac{\text{Test jig cost} \times \text{Ave. test time}}{\text{Die yield}}$$
Packaging cost: depends on pins, heat dissipation

Chip          Die cost   Package type   Package cost   Test & Assembly   Total
386DX         $4         QFP            $1             $4                $9
486DX2        $12        PGA            $11            $12               $35
PowerPC 601   $53        QFP            $3             $21               $77
HP PA         $73        PGA            $35            $16               $124
DEC Alpha     $149       PGA            $30            $23               $202
SuperSPARC    $272       PGA            $20            $34               $326
Pentium       $417       PGA            $19            $37               $473

(The package pin counts are not legible in this transcript; die costs are those of the Real World Examples table.)
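As a quick consistency check on the table, the sketch below assumes the Total column is the plain sum of die, package, and test-and-assembly cost (the figures are consistent with that reading, but the slide does not state it). Die costs are taken from the Real World Examples slide; the function names are illustrative.

```python
# (die cost, package cost, test & assembly cost, quoted total), in dollars.
# Die costs come from the "Real World Examples" slide, the rest from this one.
CHIPS = {
    "386DX": (4, 1, 4, 9),
    "486DX2": (12, 11, 12, 35),
    "PowerPC 601": (53, 3, 21, 77),
    "HP PA": (73, 35, 16, 124),
    "DEC Alpha": (149, 30, 23, 202),
    "SuperSPARC": (272, 20, 34, 326),
    "Pentium": (417, 19, 37, 473),
}


def total_cost(die, package, test_and_assembly):
    """Assumed model: manufacturing cost is the sum of the three parts."""
    return die + package + test_and_assembly


def die_share(die, total):
    """Fraction of the total manufacturing cost spent on the die itself."""
    return die / total


if __name__ == "__main__":
    for chip, (die, package, test, quoted) in CHIPS.items():
        computed = total_cost(die, package, test)
        print(f"{chip:12s} computed ${computed:3d}, quoted ${quoted:3d}, "
              f"die share {die_share(die, quoted):.0%}")
```

The die's share of the total grows with die size: it is under half for the 386DX and close to nine tenths for the Pentium.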
29
Cost/Performance What is Relationship of Cost to Price?
Component Costs
Direct Costs (add 25% to 40%): recurring costs: labor, purchasing, scrap, warranty
Gross Margin (add 82% to 186%): nonrecurring costs: R&D, marketing, sales, equipment maintenance, rental, financing cost, pretax profits, taxes
Average Discount to get List Price (add 33% to 66%): volume discounts and/or retailer markup
Figure (shares of List Price): Average Discount 25% to 40%; Gross Margin 34% to 39%; Direct Cost 6% to 8%; Component Cost 15% to 33%. Avg. Selling Price is the List Price less the Average Discount.
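The markup chain can be followed step by step. This sketch assumes each "add X%" applies to the running total from the previous step; that reading is an assumption, although it reproduces the 15% to 33% component share quoted for the figure. Function and constant names are illustrative.

```python
def list_price(component_cost, direct_markup, gross_margin_markup, discount_markup):
    """Build list price from component cost by compounding the three markups."""
    direct_cost_total = component_cost * (1 + direct_markup)
    avg_selling_price = direct_cost_total * (1 + gross_margin_markup)
    return avg_selling_price * (1 + discount_markup)


def component_share(direct_markup, gross_margin_markup, discount_markup):
    """Component cost as a fraction of list price."""
    return 1.0 / list_price(1.0, direct_markup, gross_margin_markup, discount_markup)


# Low and high ends of the ranges quoted on the slide.
LOW_MARKUPS = (0.25, 0.82, 0.33)
HIGH_MARKUPS = (0.40, 1.86, 0.66)

if __name__ == "__main__":
    for label, markups in (("low markups", LOW_MARKUPS), ("high markups", HIGH_MARKUPS)):
        print(f"{label}: $100 of components -> list price "
              f"${list_price(100, *markups):.0f}, "
              f"component share {component_share(*markups):.0%}")
```

With the low-end markups the list price is about three times the component cost; with the high-end markups it is more than six times.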
30
Cost/Performance What is Relationship of Cost to Price?
Component Costs
Direct Costs (add 25% to 40%): recurring costs: labor, purchasing, scrap, warranty
Gross Margin (add 82% to 186%): nonrecurring costs: R&D, marketing, sales, equipment maintenance, rental, financing cost, pretax profits, taxes
Average Discount to get List Price (add 33% to 66%): volume discounts and/or retailer markup
Figure labels: List Price, Average Discount, Avg. Selling Price, Discretion, Gross Margin, Direct Cost, Component Cost
31
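Chip Prices (August 1993)
Assume purchase of 10,000 units. Columns on the slide: Chip, Area (mm2), Mfg. cost, Price, Multiplier, Comment. The price and multiplier values are not legible in this transcript.

Chip          Area (mm2)   Mfg. cost   Comment
386DX         43           $9          Intense Competition
486DX2        81           $35         No Competition
PowerPC 601   -            $77
DEC Alpha     234          $202        Recoup R&D?
Pentium       296          $473        Early in shipments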
32
Summary: Price vs. Cost
33
Cost/Price/Profit How is R&D funded?
R&D is 4% to 12% of income and contributes to gross margin (it is an indirect cost).
Two views:
  Only 4% of income on R&D!
  Investment: every $1 spent on R&D should lead to $8 to $25 in sales!
34
PERFORMANCE
36
Performance Terminology
Time versus Performance: duration vs. rate.
  Time: response time = execution time
  Rate: throughput
Reciprocals: there is both a time and a performance measure for any performance metric.
“Improve performance”: time decreases, performance increases
For computer systems the key performance metric is total execution time
37
Meaning of “Execution Time” (a.k.a. Response time)
Wall-clock time, response time, elapsed time: latency (including idle time) vs. CPU time: non-idle
System vs. user time: both elapsed and CPU
System performance: elapsed time on unloaded system (includes OS + idle time)
CPU performance: user CPU time on unloaded system
38
Terminology What do we mean when we compare two measures and say that “X is n times faster than Y”?
39
The Bottom Line: Performance (and Cost)
"X is n times faster than Y" means
$$\frac{\text{ExTime}(Y)}{\text{ExTime}(X)} = \frac{\text{Performance}(X)}{\text{Performance}(Y)} = n$$
Speed of Concorde vs. Boeing 747: 1350 / 610 = 2.2X
Throughput of Boeing 747 vs. Concorde: 286,700 / 178,200 = 1.6X
40
The Bottom Line: Performance (and Cost)
"X is n times faster than Y" means
$$\frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{286{,}700}{178{,}200} = 1.60$$
Speed of Concorde vs. Boeing 747: 1350 / 610 = 2.2X
Throughput of Boeing 747 vs. Concorde: 286,700 / 178,200 = 1.6X
41
The Bottom Line: Performance (and Cost)
"X is n times faster than Y" means
$$\frac{\text{Performance}(X)}{\text{Performance}(Y)} = \frac{286{,}700}{178{,}200} = 1.60$$
Speed of Concorde vs. Boeing 747: 1350 / 610 = 2.2X
Throughput of Boeing 747 vs. Concorde: 286,700 / 178,200 = 1.6X
Note: use natural or meaningful units. Hours per passenger-mile is slightly weirder than passenger-miles per hour.
42
Measurement Tools Benchmarks, Traces, Mixes
Hardware: Cost, delay, area, power estimation
Simulation (many levels): ISA, RT, Gate, Circuit
Queuing Theory
Rules of Thumb
Fundamental “Laws”/Principles
Figure: cycle of Measure, Experiment, Analyze, Design
ENGINEERING: Convert this to that
43
Fundamental Principle of Computer Design
Make the common case fast
In every trade-off, favor the frequent case over the infrequent case.
But how do we quantify this? At what point is the cost to the infrequent case large enough to offset speedups to the frequent case?
44
Fundamental Principle of Computer Design
Make the common case fast
In every trade-off, favor the frequent case over the infrequent case.
But how do we quantify this? At what point is the cost to the infrequent case large enough to offset speedups to the frequent case?
Amdahl’s Law quantifies this principle
45
Amdahl's Law Speedup due to enhancement E:
$$\text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E}$$
Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected.
47
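Amdahl’s Law
$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times \left[ (1 - \text{Fraction}_{\text{enhanced}}) + \frac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}} \right]$$
$$\text{Speedup}_{\text{overall}} = \frac{\text{ExTime}_{\text{old}}}{\text{ExTime}_{\text{new}}} = \frac{1}{(1 - \text{Fraction}_{\text{enhanced}}) + \dfrac{\text{Fraction}_{\text{enhanced}}}{\text{Speedup}_{\text{enhanced}}}}$$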
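Amdahl's Law is short enough to write as a function. This is a minimal sketch of the formula on this slide; the function names, the input checks, and the 50% fraction used in the demo loop are illustrative assumptions. It also shows the bound that follows from the formula: as the enhanced part gets infinitely fast, overall speedup approaches 1 / (1 - fraction).

```python
import math


def amdahl_speedup(fraction_enhanced, speedup_enhanced):
    """Overall speedup when `fraction_enhanced` of the original execution
    time is accelerated by a factor of `speedup_enhanced`."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    # New execution time relative to an old execution time of 1.
    relative_time = (1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced
    return 1.0 / relative_time


def speedup_limit(fraction_enhanced):
    """Upper bound on overall speedup as speedup_enhanced grows without limit."""
    if fraction_enhanced >= 1.0:
        return math.inf
    return 1.0 / (1.0 - fraction_enhanced)


if __name__ == "__main__":
    # Hypothetical case: half of the execution time can be enhanced.
    for s in (2, 10, 100, 1_000_000):
        print(f"S = {s:>9}: overall speedup {amdahl_speedup(0.5, s):.4f}")
    print("limit:", speedup_limit(0.5))
```

However large the factor S becomes, the untouched half of the task keeps the overall speedup below 2.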
48
Amdahl’s Law: Example
Floating point instructions improved to run 2X; but only 10% of actual instructions are FP
$$\text{ExTime}_{\text{new}} = \;?$$
$$\text{Speedup}_{\text{overall}} = \;?$$
49
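Amdahl’s Law: Example
Floating point instructions improved to run 2X; but only 10% of actual instructions are FP
$$\text{ExTime}_{\text{new}} = \text{ExTime}_{\text{old}} \times (0.9 + 0.1/2) = 0.95 \times \text{ExTime}_{\text{old}}$$
$$\text{Speedup}_{\text{overall}} = \frac{1}{0.95} = 1.053$$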
50
Amdahl’s Law: Example
Suppose fetching a page from a web cache is 1000 times faster than getting the page over the net, but hit rate on cache is only 30%
$$\text{ExTime}_{\text{new}} = \;?$$
$$\text{Speedup}_{\text{overall}} = \;?$$
51
Amdahl’s Law: Example
Suppose fetching a page from a web cache is 1000 times faster than getting the page over the net, but hit rate on cache is only 30%
$$\text{ExTime}_{\text{WCache}} = \text{ExTime}_{\text{old}} \times (0.7 + 0.3/1000) = \text{ExTime}_{\text{old}} \times 0.7003$$
$$\text{Speedup}_{\text{overall}} = \frac{1}{0.7003} = 1.428$$
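The web cache example can also be turned around: given a target overall speedup, what hit rate would the cache need? The sketch below solves Amdahl's Law for the fraction; the function name and the 2X target are illustrative assumptions, not part of the lecture.

```python
def required_fraction(target_speedup, speedup_enhanced):
    """Fraction of the task that must be enhanced to reach `target_speedup`.

    Solving 1 / ((1 - F) + F / S) = T for F gives
    F = (1 - 1/T) / (1 - 1/S).
    """
    if speedup_enhanced <= 1:
        raise ValueError("speedup_enhanced must be greater than 1")
    if target_speedup < 1:
        raise ValueError("target_speedup must be at least 1")
    fraction = (1 - 1 / target_speedup) / (1 - 1 / speedup_enhanced)
    if fraction > 1:
        raise ValueError("target_speedup exceeds speedup_enhanced")
    return fraction


if __name__ == "__main__":
    # Hit rate needed for a 2X overall speedup with a 1000X faster cache.
    print(f"hit rate for 2X: {required_fraction(2, 1000):.4f}")
    # Sanity check against the slide: a speedup of 1/0.7003 needs a 30% hit rate.
    print(f"hit rate for 1.428X: {required_fraction(1 / 0.7003, 1000):.4f}")
```

Even with a cache that is 1000 times faster than the network, just over half of all requests must hit it before page fetches get twice as fast overall.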
52
Amdahl’s Law: Example
Just because something seems quantifiable doesn’t mean it is meaningful.
Quality of class = .5 student effort + .25 quality of instructor + .25 value of material
$$\text{Metric}_{\text{WProf+}} = \text{Metric}_{\text{old}} \times (0.75 + 0.25/10^6) \approx \text{Metric}_{\text{old}} \times 0.75$$
$$\text{Speedup} = \frac{1}{0.75} = 1.333$$
So even if I were a million times better as a professor, the class would only be 1.333 times as good.