1
Parameterized Systems-on-a-Chip
Frank Vahid
Tony Givargis, Roman Lysecky, Leslie Tauro, Susan Cotterell
Department of Computer Science and Engineering, University of California, Riverside
The Dalton Project
Supported by: NSF, NEC, DAC scholarship
2
2 Outline
Introduction
Parameterized Systems-on-a-Chip
Exploring Parameter Configurations
Conclusions
3
3 Introduction: Advent of system-on-a-chip
[Figure: a board with separate ICs (microprocessor, memory, peripheral, FPGA) contrasted with a single IC containing a microprocessor core (aka “IP”) and peripheral cores.]
4
4 System-on-a-chip (SOC)
5
5 The Productivity Gap [ITRS99]
6
6 Programmable Platforms (ITRS99)
Pre-fabricated IC, synthesizable HDL, or both
– “reference designs” (VLSI), “silicon platforms” (Philips), “fig chips” (Vahid/Givargis99)
[Figure: programmable platform with microprocessor, cache, memory, DMA, bridge, FPGA, and peripheral cores on a system bus and a peripheral bus.]
7
7 Targeted to Embedded Systems
May drive future architecture design [Patterson98]
Varied power/performance/size constraints
– Programmable platforms must adapt
8
8 Adapting platforms to constraints
One solution: architectural parameters
[Figure: the programmable platform (microprocessor, cache, memory, DMA, bridge, FPGA, peripherals, system bus, peripheral bus) shown with Application1 and Application2, each a main() containing a while (…) loop and each paired with a cache.]
9
9 Related work
Pleiades project [Rabaey97]
VLSI’s Velocity
Philips’ Y-Chart approach
[Figure labels: Architecture, Applications, Mapping, Analysis, Numbers; Microprocessor + FPGA; Microcontrollers; Our focus.]
10
10 Outline
Introduction
Parameterized Systems-on-a-Chip
Exploring Parameter Configurations
Conclusions
11
11 Basic parameters -- cache
[Figure: programmable platform (microprocessor, cache, memory, DMA, bridge, FPGA, peripheral, system bus, peripheral bus), with the cache called out.]
12
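12 Basic parameters -- cache
Parameters: associativity, cache size, line size
[Figure: cache lookup. The address is split into tag, index, and offset fields; each way holds V, T, and D entries; comparators (==) drive a mux that selects the data.]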
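As a rough illustration of how the three cache parameters determine the tag/index/offset split on this slide, here is a minimal Python sketch. The function name `cache_address_fields`, the 32-bit address width, and the reading of cache and line sizes as bytes are assumptions made for illustration, not details from the talk.

```python
def _log2_exact(value, name):
    """Return log2(value); value must be a positive power of two."""
    if value < 1 or value & (value - 1):
        raise ValueError(f"{name} must be a positive power of two, got {value}")
    return value.bit_length() - 1


def cache_address_fields(size_bytes, line_bytes, associativity, address_bits=32):
    """Return (tag_bits, index_bits, offset_bits) for a set-associative cache.

    address_bits=32 is an assumed default, not a value from the slides.
    """
    if size_bytes % (line_bytes * associativity):
        raise ValueError("size must be a multiple of line size x associativity")
    num_sets = size_bytes // (line_bytes * associativity)
    offset_bits = _log2_exact(line_bytes, "line_bytes")  # byte within a line
    index_bits = _log2_exact(num_sets, "num_sets")       # which set to search
    tag_bits = address_bits - index_bits - offset_bits   # compared in each way
    return tag_bits, index_bits, offset_bits


# 8 KB cache, 16-byte lines, 4-way: 128 sets
print(cache_address_fields(8 * 1024, 16, 4))  # (21, 7, 4)
```

Raising associativity at a fixed size shrinks the index and widens the tag, which is one way the three parameters interact.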
13
13 Basic parameters -- bus
[Figure: programmable platform (microprocessor, cache, memory, DMA, bridge, FPGA, peripheral), showing the system bus and peripheral bus.]
14
14 Basic parameters -- bus
Change bus width [Givargis98]
[Figure: a bus between mux/demux pairs, drawn at two widths with capacitances C1 and C2, where C1 > C2.]
15
15 Basic parameters -- bus
Encode data to reduce switching (Bus Invert) [Stan95]
[Figure: an encoder/decoder pair at each end of the bus, with an added invert_ctrl line.]
Binary encoding: 01001011 then 10010110, Hamming distance = 6
Bus-invert encoding: 01001011 (invert_ctrl = 0) then 01101001 (invert_ctrl = 1), Hamming distance = 3
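The bus-invert idea can be sketched in a few lines of Python. This is a minimal sketch, not the encoder from [Stan95] or from the Dalton cores: it reads the slide's example as the 8-bit words 01001011 and 10010110 (an assumption that agrees with the stated Hamming distances), assumes the bus and invert line start at zero, and picks whichever form toggles fewer wires, counting the invert_ctrl line. The names `bus_invert_encode`, `bus_invert_decode`, and `transitions` are hypothetical.

```python
def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


def bus_invert_encode(words, width):
    """Return a list of (bus_value, invert_ctrl) pairs, one per input word."""
    mask = (1 << width) - 1
    prev_value, prev_inv = 0, 0  # assumed initial bus state
    encoded = []
    for word in words:
        inverted = ~word & mask
        # Wire toggles for each option, including the invert_ctrl line itself.
        plain_cost = hamming(prev_value, word) + (prev_inv != 0)
        inverted_cost = hamming(prev_value, inverted) + (prev_inv != 1)
        if inverted_cost < plain_cost:
            prev_value, prev_inv = inverted, 1
        else:
            prev_value, prev_inv = word, 0
        encoded.append((prev_value, prev_inv))
    return encoded


def bus_invert_decode(encoded, width):
    """Recover the original words by undoing the inversion where flagged."""
    mask = (1 << width) - 1
    return [value ^ mask if inv else value for value, inv in encoded]


def transitions(a, b):
    """Wire toggles between two consecutive (bus_value, invert_ctrl) transfers."""
    return hamming(a[0], b[0]) + (a[1] != b[1])


words = [0b01001011, 0b10010110]
enc = bus_invert_encode(words, width=8)
print(hamming(words[0], words[1]))  # 6 toggles with plain binary encoding
print(transitions(enc[0], enc[1]))  # 3 toggles with bus-invert
```

The second word is sent inverted as 01101001 with invert_ctrl = 1, so only two data wires and the control wire change.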
16
16 Parameter definitions
Parameter
– An architectural feature that can be varied, over a small set of possible values, without changing the application’s essential functionality.
Configuration
– A selection of a particular value for every architecture parameter.
Static vs. dynamic parameter
– Static: value is set before fabricating the IC.
– Dynamic: value is set after fabricating the IC.
17
17 Potential tradeoffs experiment [ICCAD99]
[Figure: platform with microprocessor, I-cache, D-cache, memory, DMA, bridge, FPGA, peripheral, system bus, and peripheral bus.]

Component   Parameter         Possible values
I-cache     Size              32k, 16k, 8k, 4k, 2k, 1k, 512, 256, 128
            Line              8, 16, 32
            Associativity     2, 4, 8
D-cache     Size              32k, 16k, 8k, 4k, 2k, 1k, 512, 256, 128
            Line              8, 16, 32
            Associativity     2, 4, 8
Mp-c bus    Data bus width    4, 8, 16, 32
            Data bus invert   on or off
Sys. bus    Data bus width    4, 8, 16, 32
            Data bus invert   on or off
18
18 Potential tradeoffs experiment [ICCAD99]
[Figure: simulation flow for the microprocessor, cache, and memory. A C program feeds an instruction set simulator, a cache simulator, a memory simulator, and a bus simulator; each reports power, summed into total power.]
Cache: Dinero [Edler, Hill]
ISS: [Tiwari96]
19
19 Potential tradeoffs experiment
[Plot: X-axis: execution time (sec); Y-axis: power (watt).]
Tradeoff between performance and power
Computed power for all 45,568 configurations
– For each of four C applications
– Used microprocessor, cache, and bus simulators (1 week of CPU time)
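One common way to read a power versus execution-time scatter like this one is to keep only the configurations that no other configuration beats on both axes. The slides do not say the authors did this; the sketch below, with the hypothetical function `pareto_front` and made-up sample points chosen only to exercise it, shows the idea.

```python
def pareto_front(points):
    """Return the (time, power) points not dominated by any other point.

    A point q dominates p when q is no worse on both axes and differs from p.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] <= p[0] and q[1] <= p[1] for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Hypothetical (execution time in s, power in W) pairs, not data from the talk.
sample = [(1.0, 3.0), (0.5, 10.0), (0.8, 12.0), (0.1, 40.0)]
print(pareto_front(sample))  # [(0.1, 40.0), (0.5, 10.0), (1.0, 3.0)]
```

The quadratic scan is adequate for a few thousand points; sorting by time and sweeping once would scale better for larger sets.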
20
20 Potential tradeoffs experiment
Narrower bus required a larger cache size
Bus: 8-1/32-1; I: 32k, 8, 8; D: 16k, 8, 16; 0.995 sec, 3.4 W, 30K
Bus: 16-1/32-1; I: 16k, 8, 16; D: 32k, 8, 8; 0.389 sec, 11.4 W, 21kG
Bus: 32-1/32-0; I: 16k, 4, 4; D: 16k, 4, 4; 0.086 sec, 43.6 W, 20kG
21
21 Potential tradeoffs experiment
Performance varied by 11x
Power varied by 13x
Area varied by 1x
Energy consumption varied by 2x
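The performance and power figures can be checked against the three labeled points on slide 20. The sketch below uses only those three time and power pairs; the helper name `spread` is hypothetical, and energy is taken as power times execution time. The three points alone span about 1.3x in energy, so the 2x energy figure presumably reflects the full set of configurations rather than just these points.

```python
# (execution time in seconds, power in watts) for the three points on slide 20
points = {
    "Bus 8-1/32-1": (0.995, 3.4),
    "Bus 16-1/32-1": (0.389, 11.4),
    "Bus 32-1/32-0": (0.086, 43.6),
}


def spread(values):
    """Ratio of the largest to the smallest value."""
    values = list(values)
    return max(values) / min(values)


time_spread = spread(t for t, _ in points.values())
power_spread = spread(p for _, p in points.values())
energy_joules = {name: t * p for name, (t, p) in points.items()}
energy_spread = spread(energy_joules.values())

print(f"performance spread: {time_spread:.1f}x")  # 11.6x
print(f"power spread: {power_spread:.1f}x")       # 12.8x
print(f"energy spread: {energy_spread:.1f}x")     # 1.3x
```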
22
22 Potential tradeoffs experiment
Bus: 8-1/4-0; I: 1k, 2, 4; D: 512, 2, 4; 5 ms, 0.02 W, 18kG
Bus: 16-1/32-1; I: 1k, 4, 4; D: 512, 4, 8; 3 ms, 0.07 W, 17kG
Bus: 32-1/32-1; I: 1k, 4, 4; D: 512, 4, 8; 2 ms, 0.19 W, 15kG
23
23 Potential tradeoffs experiment
Performance varied by 2.5x
Power varied by 9.5x
Area varied by 1x
Energy consumption varied by 4x
24
24 Potential tradeoffs experiment
How much variation in total system power and performance can we obtain just by varying the cache and bus parameters?
– 9 to 14x improvement in power/performance
How interdependent are these two types of parameters?
– Fixing the cache parameter values first and then selecting the bus parameter values results in non-optimal solutions
25
25 Many more parameters possible
Some examples include:
– Code compression
– Address bus encoding
– Multiple levels of memory hierarchy
– CPU parameters (e.g., voltage scale, DP width)
– Peripheral core parameters (our current focus)
– Fertile research area
Can yield even larger tradeoffs if we:
– Create parameter-aware compiler
– Adapt OS?
26
26 Outline
Introduction
Parameterized Systems-on-a-Chip
Exploring Parameter Configurations
Conclusions
27
27 Evaluation by gate-level simulation
Capture each core in HDL, synthesize, simulate
[Figure: the programmable platform goes through HDL synthesis and HDL simulation to yield total power; the flow repeats after each reconfiguration.]
Hours (often tens) per configuration
28
28 Evaluation by system-level simulation
[Figure: a C program and trace generator drive OO models of the platform: instruction set, cache, memory, bus, DMA, bridge, and peripheral simulators. Each reports power, summed into total power; the flow repeats after each reconfiguration.]
Minutes per configuration
Contrast with hours per configuration
29
29 Evaluation by trace-simulation
[Figure: a C program and trace generator produce instruction, address, and bus traces; OO non-functional models (instruction, cache, memory, bus, DMA, bridge, and peripheral trace simulators) each report power, summed into total power; the flow repeats after each reconfiguration.]
Note that the cache simulator is non-functional
Same approach for others
– Get traces from a small number of system simulations
Seconds per configuration
30
30 Evaluation by trace-analysis
Further speedup:
– Statistically characterize traces
– Still only a small number of system simulations
[Figure: a C program and trace generator produce instruction, address, and bus statistics; trace analyzers (instruction, cache, memory, bus, DMA, bridge, peripheral) apply equations to report power, summed into total power; the flow repeats after each reconfiguration.]
Milliseconds per configuration
31
31 Trace-analysis approach for cache
Given a trace of memory references and the cache parameters
– Size (S)
– Line/block size (L)
– Associativity (A)
compute the number of misses (N)
[Plot: number of misses (N) versus size (S).]
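For reference, the quantity N can always be obtained by brute force: replay the trace through a model of the cache. The sketch below does that for a set-associative cache with LRU replacement. It is a plain simulation, not the authors' trace-analysis equations (which aim to avoid exactly this per-configuration replay), and the LRU policy, the byte-addressed trace, and the name `count_misses` are assumptions for illustration.

```python
from collections import OrderedDict


def count_misses(trace, size_bytes, line_bytes, associativity):
    """Count misses for a byte-address trace on a set-associative LRU cache."""
    num_sets = size_bytes // (line_bytes * associativity)
    if num_sets < 1:
        raise ValueError("cache too small for this line size and associativity")
    # One ordered dict per set; insertion order tracks recency (oldest first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for address in trace:
        block = address // line_bytes
        index = block % num_sets
        tag = block // num_sets
        lines = sets[index]
        if tag in lines:
            lines.move_to_end(tag)          # hit: mark as most recently used
        else:
            misses += 1
            if len(lines) >= associativity:
                lines.popitem(last=False)   # evict the least recently used line
            lines[tag] = True
    return misses


trace = [0, 4, 8, 64, 0, 128, 64]  # hypothetical byte addresses
print(count_misses(trace, size_bytes=128, line_bytes=8, associativity=2))  # 5
print(count_misses(trace, size_bytes=128, line_bytes=8, associativity=4))  # 4
```

Sweeping S, L, and A through such a simulator for every configuration is the cost that the trace-analysis approach is meant to avoid.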
32
32 Trace-analysis approach for cache
33
33 Trace-analysis approach for cache
Capture improvements obtainable by:
– changing line-size at small/large values of cache-size
– changing associativity at small/large values of cache-size
34
34 Trace-analysis approach for bus
[Figure labels: items/second; bus width; number of transfers per item; random data; capacitance.]
35
35 Trace-analysis approach for bus
Bus equation:
– m items/second (denotes the traffic N on the bus)
– n bits/item
– k-bit-wide bus
– bus-invert encoding
– random data assumption
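The bus equation itself does not appear in this transcript, so the following is only a sketch of how the listed quantities could combine, not the authors' formula. It assumes each item needs ceil(n/k) transfers, that random data toggles half the wires per transfer under binary encoding, that bus-invert picks the cheaper of the plain and inverted forms (counting the invert line), and that each toggle costs the standard dynamic switching energy of 0.5·C·V². The per-wire capacitance, supply voltage, and all function names are hypothetical.

```python
from math import ceil, comb


def transfers_per_second(items_per_second, bits_per_item, bus_width):
    """m items/s, n bits/item, k-bit bus: each item takes ceil(n/k) transfers."""
    return items_per_second * ceil(bits_per_item / bus_width)


def expected_toggles_binary(bus_width):
    """Random data: each of the k wires toggles with probability 1/2."""
    return bus_width / 2


def expected_toggles_bus_invert(bus_width):
    """Random data on k wires plus one invert line, sending the cheaper form.

    If h data bits differ from the previous transfer, the two options cost
    h and k + 1 - h toggles in total, so the encoder pays min(h, k + 1 - h).
    """
    k = bus_width
    return sum(comb(k, h) * min(h, k + 1 - h) for h in range(k + 1)) / 2 ** k


def bus_power_watts(m, n, k, wire_capacitance_farads, vdd_volts, bus_invert=False):
    """Estimated switching power of the bus wires (encoder overhead ignored)."""
    toggles = expected_toggles_bus_invert(k) if bus_invert else expected_toggles_binary(k)
    energy_per_toggle = 0.5 * wire_capacitance_farads * vdd_volts ** 2
    return transfers_per_second(m, n, k) * toggles * energy_per_toggle


print(expected_toggles_binary(8))      # 4.0
print(expected_toggles_bus_invert(8))  # 3.26953125
```

Under these assumptions bus-invert saves roughly 18% of the toggles on an 8-bit bus with random data, and a narrower bus trades fewer wires for more transfers per item.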
36
36 Trace-analysis experiments
[Figure: CPU with I-cache and D-cache, memory, Bus A, Bus B, a bridge, and a peripheral bus with peripherals 1 through n.]
Cache parameters
– size: 128, 256, 512, 1k, 2k, 4k, 8k, 16k, 32k
– assoc: 2, 4, 8
– line: 8, 16, 32
Bus parameters
– width: 4, 8, 16, 32
– code: binary/bus-invert
Analyzed 45K parameter sets exhaustively for each of 4 examples.
37
37 Experiment Results
Diesel application’s performance
Blue (light-gray) is system-simulation-based
Red (dark-gray) is trace-analysis-based
4% error
320x faster
38
38 Experiment Results
Diesel application’s energy consumption
Blue (light-gray) is obtained using full simulation
Red (dark-gray) is obtained using our equations
2% error
420x faster
39
39 Experiment Results
CKey application’s performance
Blue (light-gray) is obtained using full simulation
Red (dark-gray) is obtained using our equations
8% error
125x faster
40
40 Experiment Results
CKey application’s energy consumption
Blue (light-gray) is obtained using full simulation
Red (dark-gray) is obtained using our equations
3% error
125x faster
41
41 Experiment Results
125 - 400x speedup
1-18% absolute error (power & performance)
2% average power error
[Plot axes: time (hours); power error (%).]
42
42 Conclusions
Parameters can improve usefulness of programmable platforms
– by adapting platform to particular application and to power/performance constraints
Good tradeoff range even for basic parameters
Fast and accurate evaluation seems possible
Future research
– Fast evaluation techniques for general cores
– More parameters for SOCs, static and dynamic
– Couple with parameter-aware compilers
– Dynamic re-configuration (adapt)