Published by Connor Hughes. Modified over 11 years ago.
1
On the Energy Efficiency of Computation. Mihai Budiu, CMU CS. CALCM Seminar, Feb 17, 2004. Note: this version fixes some errors in the ASH performance graphs shown
2
2 Presentation Setup: main( ) { signal(SIGINT, welcome); while (slides( ) && time( )) { talk( ); } }
3
3 Why Do We Care? Toasted CPU: about 2 sec after removing the cooler. (Tom's Hardware Guide)
4
4 Power and Power Density. Data from Fred Pollack, Intel, MICRO 32. Assuming constant die size, no power management.
5
5 Power Density Distribution. Chip surface. Data from Fred Pollack, Intel, MICRO 32.
6
6 Outline: Introduction; Power and Energy Efficiency (data from Bob Brodersen, Berkeley wireless group); Synchronous Hardware Efficiency; Asynchronous Hardware Efficiency; ASH Efficiency; Conclusions
7
7 Energy Efficiency Metric. How much computing can we do with a finite energy source?
8
8 Some Arithmetic
9
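9 Energy and Power Efficiency. The energy efficiency metric for energy-constrained applications (OP/nJ, operations per unit of energy in joules) is the same quantity as the power efficiency metric that captures thermal considerations when maximizing throughput (MOPS/mW, operation rate per unit of power in watts): OP/nJ = MOPS/mW.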
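The equality of the two units can be checked by dividing both the operation rate and the power by time. A minimal sketch of that conversion follows; the function name `mops_per_mw_to_ops_per_nj` is illustrative and does not come from the talk.

```python
def mops_per_mw_to_ops_per_nj(mops_per_mw: float) -> float:
    """Convert an efficiency in MOPS/mW into OP/nJ.

    1 MOPS = 1e6 operations per second and 1 mW = 1e-3 joules per second,
    so MOPS/mW = 1e9 operations per joule = 1 operation per nanojoule.
    """
    ops_per_joule = mops_per_mw * 1e6 / 1e-3  # (OP/s) / (J/s) = OP/J
    return ops_per_joule * 1e-9               # OP/J -> OP/nJ


# The conversion factor is 1: the numeric value is unchanged.
print(mops_per_mw_to_ops_per_nj(7.0))
```

Because the factor is exactly one, the two metrics can be used interchangeably on the efficiency plots.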
10
10 ISSCC Chips (0.18 µm to 0.25 µm). Chip categories: Microprocessors, Dedicated, DSPs.
#   Year  Description
1   1997  S/390
2   2000  PPC (SOI)
3   1999  G5
4   2000  G6
5   2000  Alpha
6   1998  P6
7   1998  Alpha
8   1999  PPC
9   1998  StrongArm
10  2000  Comm
11  1998  Graphics
12  1998  Multimedia
13  2000  Multimedia
14  2002  Mpg decoder
15  1998  Multimedia
16  2001  Encryption Processor
17  2000  Hearing Aid Processor
18  2000  FIR for Disk Read Head
19  1998  MPEG Encoder
20  2002  802.11a Baseband
11
11 Energy Efficiency (MOPS/mW or OP/nJ) 3 orders of magnitude!
12
12 Outline: Introduction; Power and Energy Efficiency; Synchronous Hardware Efficiency; Asynchronous Hardware Efficiency; ASH Efficiency; Conclusions
13
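13 Explaining the Difference. Operations per second: $\mathrm{MOPS} = f_{clk} \times N_{op}$, where $N_{op}$ is the number of operations per clock. Power: $P_{chip} = A_{chip} \times C_{sw} \times V_{dd}^2 \times f_{clk}$, where $C_{sw}$ is the normalized switched capacitance. Efficiency: $\mathrm{MOPS}/P_{chip} = (f_{clk} \times N_{op}) / (A_{chip} \times C_{sw} \times V_{dd}^2 \times f_{clk}) = 1/(A_{op} \times C_{sw} \times V_{dd}^2)$, where $A_{op} = A_{chip}/N_{op}$ is the chip area per operation.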
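The key step on this slide is that the clock frequency cancels, leaving an efficiency that depends only on area per operation, switched capacitance, and supply voltage. The sketch below checks this numerically; the function names and the parameter values are placeholders in arbitrary consistent units, not data from the talk.

```python
import math


def throughput(f_clk: float, n_op: float) -> float:
    """Operations per second: f_clk * N_op."""
    return f_clk * n_op


def chip_power(a_chip: float, c_sw: float, v_dd: float, f_clk: float) -> float:
    """Chip power: A_chip * C_sw * V_dd^2 * f_clk."""
    return a_chip * c_sw * v_dd ** 2 * f_clk


def efficiency_from_ratio(a_chip, n_op, c_sw, v_dd, f_clk):
    """Efficiency computed directly as throughput divided by power."""
    return throughput(f_clk, n_op) / chip_power(a_chip, c_sw, v_dd, f_clk)


def efficiency_closed_form(a_chip, n_op, c_sw, v_dd):
    """Efficiency as 1 / (A_op * C_sw * V_dd^2), with A_op = A_chip / N_op."""
    a_op = a_chip / n_op
    return 1.0 / (a_op * c_sw * v_dd ** 2)


# Placeholder values (assumptions chosen only to exercise the formulas).
params = dict(a_chip=84.0, n_op=2, c_sw=0.5, v_dd=1.8)
closed = efficiency_closed_form(**params)
for f_clk in (25e6, 450e6):
    ratio = efficiency_from_ratio(f_clk=f_clk, **params)
    # The clock frequency cancels, so both forms agree at any f_clk.
    assert math.isclose(ratio, closed)
    print(f_clk, ratio, closed)
```

Under this model, raising the clock frequency buys throughput but not efficiency; only a smaller area per operation, a lower switched capacitance, or a lower supply voltage improves MOPS per unit of power.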
14
14 Supply Voltage, $V_{dd}$. $\mathrm{MOPS}/P_{chip} = 1/(A_{op} \times C_{sw} \times V_{dd}^2)$
15
15 Normalized Switched Capacitance, $C_{sw}$. $\mathrm{MOPS}/P_{chip} = 1/(A_{op} \times C_{sw} \times V_{dd}^2)$. 3x
16
16 Area per operation, $A_{op}$. $A_{op} = A_{chip}/N_{op}$. $\mathrm{MOPS}/P_{chip} = 1/(A_{op} \times C_{sw} \times V_{dd}^2)$. AHA!
17
17 Focusing In: PPC, NEC DSP, 802.11a
18
18 µP: MOPS/mW = 0.13. Useful arithmetic. $N_{op} = 2$ (two ways); $f_{clock} = 450$ MHz $\Rightarrow$ 900 MIPS; $A_{op} = A_{chip}/2 = 42\,\mathrm{mm}^2$; Power = 7 W.
19
19 DSP: MOPS/mW = 7. 4 processors $\times$ 4 ops each, so $N_{op} = 16$; $f_{clock} = 50$ MHz $\Rightarrow$ 800 MOPS; $A_{op} = A_{chip}/16 = 5.3\,\mathrm{mm}^2$; Power = 110 mW.
20
20 Dedicated Design: MOPS/mW = 200. $N_{op} = 96$; $f_{clock} = 25$ MHz $\Rightarrow$ 2400 MOPS; $A_{op} = 5.4\,\mathrm{mm}^2/96 = .15\,\mathrm{mm}^2$; Power = 12 mW. Complex MAC = 8 ops. Fully parallel mapping of adaptive correlator algorithm.
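The MOPS/mW figures quoted for the three designs on slides 18 to 20 follow from the operations per clock, clock frequency, and power listed on each slide. The sketch below reproduces that arithmetic; the dictionary layout and the helper name `mops_per_mw` are illustrative, and the microprocessor's 900 MIPS is treated as 900 MOPS for the comparison.

```python
# Figures taken from slides 18-20; power is converted to milliwatts.
designs = {
    "microprocessor": {"n_op": 2, "f_clk_mhz": 450, "power_mw": 7000},
    "DSP": {"n_op": 16, "f_clk_mhz": 50, "power_mw": 110},
    "dedicated": {"n_op": 96, "f_clk_mhz": 25, "power_mw": 12},
}


def mops_per_mw(design: dict) -> float:
    """Efficiency = (N_op * f_clk in MHz) / (power in mW)."""
    mops = design["n_op"] * design["f_clk_mhz"]
    return mops / design["power_mw"]


for name, design in designs.items():
    print(f"{name}: {mops_per_mw(design):.2f} MOPS/mW")
# microprocessor: 0.13 MOPS/mW
# DSP: 7.27 MOPS/mW
# dedicated: 200.00 MOPS/mW
```

The three results span roughly three orders of magnitude, matching the spread shown on the energy efficiency plot.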
21
21 Memory is More Power-Efficient. Hint: use on-chip caches.
22
22 Energy Distribution in µP. Useful (includes local clock)
23
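23 Efficiency and Performance. $V_{dd} \Downarrow \;\rightarrow\; f_{clock} \Downarrow$, MOPS $\Downarrow$, Power $\Downarrow$, MOPS/mW $\Uparrow$: lowering the supply voltage reduces clock rate, throughput, and power, but improves efficiency. Better metric: Energy $\times$ delay, which is roughly independent of $V_{dd}$.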
24
24 Efficiency and Technology. [Plot: MOPS/mW (0.001 to 1000, log scale) versus feature size (2 µm down to 0.07 µm) for hardwired designs, DSPs, and microprocessors. Source: T. Claasen, ISSCC 1999.]
25
25 How Low Can You Go? The energy required to compute is ZERO if the computation is quasistatic and no information is destroyed (reversible): Ops/nJ $\rightarrow \infty$. (Rolf Landauer)
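For contrast with the reversible limit, Landauer's principle puts the minimum energy for destroying one bit of information at $kT \ln 2$. The sketch below evaluates that bound; the 300 K temperature and the function names are assumptions made for illustration, and equating one operation with one erased bit is a simplification.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant (exact SI value)


def landauer_limit_joules(temperature_k: float) -> float:
    """Minimum energy to erase one bit at the given temperature: k*T*ln(2)."""
    return BOLTZMANN_J_PER_K * temperature_k * math.log(2)


def max_bit_erasures_per_nj(temperature_k: float) -> float:
    """Upper bound on irreversible bit erasures per nanojoule."""
    return 1e-9 / landauer_limit_joules(temperature_k)


room_temperature_k = 300.0  # assumed operating temperature
print(f"{landauer_limit_joules(room_temperature_k):.3e} J per erased bit")
print(f"{max_bit_erasures_per_nj(room_temperature_k):.3e} erasures per nJ")
```

At room temperature the bound is on the order of 10^11 erasures per nJ, many orders of magnitude above the 0.13 to 200 ops/nJ of the chips discussed earlier.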
26
26 Outline: Introduction; Power and Energy Efficiency; Synchronous Hardware Efficiency; Asynchronous Hardware Efficiency; ASH Efficiency; Conclusions
27
27 Lutonium Performance. Asynchronous microcontroller, designed and implemented at Caltech (Alain Martin). 0.18 µm technology; 1.8 V supply, 0.4 V/0.5 V $V_{th}$; 200 MIPS; 1.8 ops/nJ; DSP-like.
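The throughput and efficiency figures on this slide together imply a power draw, since MIPS divided by MOPS/mW gives milliwatts. The sketch below does that division; it assumes both figures refer to the same operating point and that one instruction counts as one operation, neither of which the slide states. The function name `implied_power_mw` is illustrative.

```python
def implied_power_mw(mips: float, ops_per_nj: float) -> float:
    """Power in mW implied by a throughput (MIPS) and an efficiency (ops/nJ).

    Uses ops/nJ = MOPS/mW, so power [mW] = throughput [MOPS] / efficiency.
    """
    return mips / ops_per_nj


# Slide figures: 200 MIPS at 1.8 ops/nJ (assumed to be the same operating point).
print(f"{implied_power_mw(200.0, 1.8):.1f} mW")
```

Under those assumptions the implied power is a little over 100 mW, which is in the same range as the DSP example earlier in the talk.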
28
28 Efficiency and Supply Voltage
29
29 Async Processor Breakdown useful
30
30 Outline: Introduction; Power and Energy Efficiency; Synchronous Hardware Efficiency; Asynchronous Hardware Efficiency; ASH Efficiency; Conclusions
31
31 Application-Specific Hardware. C code → Compiler for Application-Specific Hardware → asynchronous circuits, memory.
32
32 Tool-Flow: C → CASH core → Verilog back-end → Synopsys, Cadence P/R → ASIC. 180 nm std. cell library, 2 V (~1999 technology). Mediabench kernels (1 hot function/benchmark). Memory.
33
33 Caveat. Memory; we model this part accurately; optimistic speed model, no power accounting.
34
34 ASH Performance
35
35 ASH vs 600MHz CPU
36
36 ASH Area minimal RISC core
37
37 Normalized Area many C macros
38
38 ASH Energy Efficiency
39
39 All Together Now
40
40 Conclusions: Performance comes at a price. Energy efficiency is expressed in ops/nJ or MOPS/mW. Dedicated hardware is more power-efficient than microprocessors. ASH efficiency is competitive with dedicated hardware.