
Wire-driven Microarchitectural Design Space Exploration. School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332


1 Wire-driven Microarchitectural Design Space Exploration
Mongkol Ekpanyapong, Sung Kyu Lim, Chinnakrishnan Ballapuram, Hsien-Hsin “Sean” Lee
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
ISCAS 2005, Kobe, Japan

2 Microarchitecture Design Trend
- Transistors are almost free → billions of billions [Pat Gelsinger keynote in DAC-42]
- Processor architects tend to:
  - Increase module capacity to improve performance (e.g. caches, BTB, ROB)
  - Increase the die dimension
  - Assume communications are free, too
- But...
[Figure: wire delay is 20 ns for a 0.5 mm wire and 80 ns for a 1 mm wire.]
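The two data points in the figure (20 ns at 0.5 mm, 80 ns at 1 mm) are consistent with the quadratic length dependence of an unbuffered RC wire. The sketch below assumes a purely quadratic model; the function name and the constant, fitted to the 1 mm point, are illustrative and not taken from the talk.

```python
def unbuffered_wire_delay_ns(length_mm: float, k_ns_per_mm2: float = 80.0) -> float:
    """Delay of an unbuffered wire under an assumed quadratic model: k * L^2.

    k = 80 ns/mm^2 is an assumption fitted to the slide's 1 mm data point.
    """
    return k_ns_per_mm2 * length_mm ** 2


# Doubling the wire length quadruples the delay under this model.
for length in (0.5, 1.0, 2.0):
    print(f"{length} mm -> {unbuffered_wire_delay_ns(length):.0f} ns")
```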

3 Alleviating Wire Delay
- Buffer insertion to speed up wires
  - In reality, chip size is growing
  - Issues: many via cuts, area, power, ...
- Flip-flop insertion to meet cycle time (P4 dedicates 2 pipe stages to communication)
  - Latency is not scalable!
[Figure: Module 1 connected to Module 2 through inserted flip-flops (FF).]
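Why flip-flop insertion costs latency can be seen with simple arithmetic: a wire whose delay exceeds the clock period has to be split into several cycles. The following is a minimal sketch under that assumption; the function names and the example numbers are hypothetical.

```python
import math


def wire_latency_cycles(wire_delay_ns: float, clock_period_ns: float) -> int:
    """Cycles needed to cross a wire if each segment must fit in one clock period."""
    return max(1, math.ceil(wire_delay_ns / clock_period_ns))


def flip_flops_needed(wire_delay_ns: float, clock_period_ns: float) -> int:
    """Flip-flops inserted along the wire: one fewer than the number of cycles."""
    return wire_latency_cycles(wire_delay_ns, clock_period_ns) - 1


# Hypothetical numbers: a 1.0 ns wire at a 0.4 ns clock period.
print(wire_latency_cycles(1.0, 0.4), flip_flops_needed(1.0, 0.4))
```

As the target frequency rises, the same wire needs more cycles, which is the sense in which the latency does not scale.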

4 Motivation
- Wires, in particular global wires, are a problem in deep-submicron processor design
- Conventional architecture techniques that increase module sizes (e.g. caches) will no longer guarantee performance improvement
- Early design space exploration (DSE) at the microarchitecture level needs to take “wire impact” into account
- A highly efficient DSE framework is imperative

5 Algorithms

6 Dynamic Communication-aware Profile-guided Floorplanning [DAC-42]
[Flow diagram]
- Inputs: technology parameters, architecture description, application, target frequency
- Blocks: CACTI, GENESYS, profiling, floorplanning, cycle-based simulator
- Intermediate data: module-level netlist + profile; module-level layout + wire latency
- Use the traffic profile for floorplanning

7 AMPLE: Adaptive Microarchitectural PLanning Engine
Wire-driven automated design space exploration
[Flow diagram]
- Inputs: technology parameters, architecture description, application, target frequency
- Blocks: CACTI, GENESYS, profiling, floorplanning, cycle-based simulator, adaptive parameter tuning
- Intermediate data: module-level netlist + profile; module-level layout + wire latency

8 Adaptive Parameter Tuning Algorithm
[Flowchart: Initialization -> for each uarch parameter: Gradient Search -> End]
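The flowchart can be read as an outer loop that tunes one parameter at a time. The sketch below shows only that loop structure; `adaptive_parameter_tuning`, `tune_one`, and `double_once` are hypothetical names, and `double_once` is a trivial stand-in for the gradient search step, not the authors' implementation.

```python
from typing import Callable, Dict, List

Config = Dict[str, int]


def adaptive_parameter_tuning(
    initial: Config,
    priority_order: List[str],
    tune_one: Callable[[Config, str], Config],
) -> Config:
    """Visit parameters in priority order, tuning one at a time."""
    config = dict(initial)  # copy so the caller's configuration is untouched
    for param in priority_order:
        config = tune_one(config, param)
    return config


def double_once(config: Config, param: str) -> Config:
    """Hypothetical stand-in for gradient search: double the parameter once."""
    updated = dict(config)
    updated[param] *= 2
    return updated


start = {"BTB": 256, "RUU": 128}
result = adaptive_parameter_tuning(start, ["BTB", "RUU"], double_once)
print(result)
```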

9 AMPLE: Initialization
[Flowchart] Initialization consists of:
- Smart Start
- Optional: profile-guided Microarch_Planning(), (N+1) iterations for N uarch parameters
- Priority_search(), based on the Microarch_Planning results

10 Smart Start: Initial Microarchitecture Configurations
- Good starting points can reduce design space exploration time
- Applications are classified into three categories: processor-bound, cache-sensitive, and bandwidth-bound

Class             width  BTB  RUU  LSQ  IL1  DL1  L2    L3  ALU  FPU
Processor bound   8      512  256  128  16K  8K   128K  0   6    4
Cache sensitive   4      256  128  256  8K   16K  512K  0   4    2
Bandwidth bound   4      256  128  128  8K   8K   128K  0   4    2
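One way to use the table on this slide is as a lookup from application class to a starting configuration. The sketch below encodes the three rows as read from the table, assuming that "K" means kilobytes; the dictionary keys and the `smart_start` function are hypothetical names.

```python
from typing import Dict

# Starting points per application class. Cache sizes are in KB (assumed unit).
SMART_START: Dict[str, Dict[str, int]] = {
    "processor_bound": {"width": 8, "BTB": 512, "RUU": 256, "LSQ": 128,
                        "IL1_KB": 16, "DL1_KB": 8, "L2_KB": 128, "L3_KB": 0,
                        "ALU": 6, "FPU": 4},
    "cache_sensitive": {"width": 4, "BTB": 256, "RUU": 128, "LSQ": 256,
                        "IL1_KB": 8, "DL1_KB": 16, "L2_KB": 512, "L3_KB": 0,
                        "ALU": 4, "FPU": 2},
    "bandwidth_bound": {"width": 4, "BTB": 256, "RUU": 128, "LSQ": 128,
                        "IL1_KB": 8, "DL1_KB": 8, "L2_KB": 128, "L3_KB": 0,
                        "ALU": 4, "FPU": 2},
}


def smart_start(app_class: str) -> Dict[str, int]:
    """Return a fresh copy of the starting configuration for an application class."""
    return dict(SMART_START[app_class])


print(smart_start("cache_sensitive")["L2_KB"])
```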

11 Priority Search
- Prioritize microarchitectural parameters: high-impact parameters are tuned first
- A correlation metric can identify critical parameters, but it requires a long runtime
- Gradient First-order Ratio (GFR) is proposed instead
  [Equation not preserved in this transcript. Its annotations read: "a uarch parameter (e.g. BTB)" and "the uarch parameter that has the max IPC gain".]
- Higher GFR → higher priority
[Flowchart: Initialization -> for each uarch parameter: Gradient Search -> End]
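The GFR formula itself is not preserved here, so the sketch below rests on an assumption suggested by the slide's annotations: each parameter's IPC gain is divided by the largest IPC gain among all parameters. The function name, the input format, and the IPC numbers are hypothetical. One baseline run plus one perturbed run per parameter gives N+1 evaluations for N parameters.

```python
from typing import Dict, List, Tuple


def gfr_priorities(
    baseline_ipc: float, perturbed_ipc: Dict[str, float]
) -> Tuple[Dict[str, float], List[str]]:
    """Rank parameters by an ASSUMED GFR: gain(p) / max gain over all parameters."""
    gains = {p: ipc - baseline_ipc for p, ipc in perturbed_ipc.items()}
    max_gain = max(gains.values())
    if max_gain <= 0:
        # No parameter helps: every ratio is zero and the input order is kept.
        return {p: 0.0 for p in gains}, list(gains)
    gfr = {p: gain / max_gain for p, gain in gains.items()}
    order = sorted(gfr, key=gfr.get, reverse=True)  # higher GFR is tuned first
    return gfr, order


# Hypothetical IPC values after perturbing each parameter once.
gfr, order = gfr_priorities(1.00, {"BTB": 1.02, "L2": 1.10, "RUU": 1.05})
print(order)
```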

12 Adaptive Parameter Tuning Algorithm
[Flowchart: Initialization -> for each uarch parameter: Gradient Search -> End]

13 Gradient Search Algorithm
[Flowchart] Gradient Search:
- Update parameter and prune
- Profile-guided Microarch_Planning()
- Compute gain
- Repeat while Gain > Threshold && Acyclic, then return
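The loop on this slide can be sketched as follows. Several details are assumptions rather than the authors' method: the step doubles the parameter, the gain is the relative IPC change, an upper bound does the pruning, and a visited set stands in for the "Acyclic" condition. `toy_ipc` is a made-up evaluation function that replaces the profile-guided planning and simulation step.

```python
import math
from typing import Callable, Dict

Config = Dict[str, int]


def gradient_search(
    config: Config,
    param: str,
    evaluate: Callable[[Config], float],
    threshold: float = 0.01,
    max_value: int = 4096,
) -> Config:
    """Keep stepping one parameter while the relative gain exceeds the threshold."""
    current = dict(config)
    current_score = evaluate(current)
    visited = {current[param]}
    while True:
        candidate = dict(current)
        candidate[param] = current[param] * 2  # assumed step: doubling
        if candidate[param] > max_value or candidate[param] in visited:
            break  # pruned as unrealistic, or the search would revisit a value
        score = evaluate(candidate)
        gain = (score - current_score) / current_score  # assumed gain definition
        if gain <= threshold:
            break
        visited.add(candidate[param])
        current, current_score = candidate, score
    return current


def toy_ipc(config: Config) -> float:
    """Made-up model: diminishing returns minus a size penalty (e.g. wire latency)."""
    return 1.0 + 0.1 * math.log2(config["BTB"]) - 0.0001 * config["BTB"]


best = gradient_search({"BTB": 128}, "BTB", toy_ipc)
print(best)
```

With this toy model the search grows the BTB until the size penalty outweighs the benefit, which mirrors the talk's point that larger modules stop paying off once wire effects are counted.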

14 Compute Gain and New Parameters
- Let [p,i] be a microarchitecture parameter p at iteration i
- A step size is also defined (its symbol is not preserved in this transcript)
- Gain equation: [not preserved in this transcript]
- Parameter calculation equation: [not preserved in this transcript]
- Parameters are pruned or rounded if unrealistic
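The slide says that computed parameters are rounded when unrealistic but does not say how. A minimal sketch of one possibility, assuming that realistic sizes are powers of two within fixed bounds; the function name and the bounds are hypothetical.

```python
import math


def snap_to_power_of_two(value: float, lower: int, upper: int) -> int:
    """Clamp to [lower, upper], then round to the nearest power of two.

    Assumes lower and upper are themselves powers of two and lower > 0.
    """
    clamped = min(max(value, lower), upper)
    low_pow = 2 ** math.floor(math.log2(clamped))
    high_pow = 2 ** math.ceil(math.log2(clamped))
    snapped = low_pow if clamped - low_pow <= high_pow - clamped else high_pow
    return int(min(max(snapped, lower), upper))


# A computed value of 300 entries is rounded to 256; 400 is rounded to 512.
print(snap_to_power_of_two(300, 64, 2048), snap_to_power_of_two(400, 64, 2048))
```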

15 Search Pruning Rationale
Reduce search time by pruning unrealistic parameters:
- Cache size order: L1 < L2 < L3
- Issue width ≥ number of ALUs
- No search in floating-point units for integer applications
- Upper and lower bounds on the number of modules and on module size
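The pruning rules above translate naturally into a validity check applied before a configuration is evaluated. The sketch below is illustrative: the key names, the single `L1_KB` entry, the convention that a size of 0 means the cache level is absent, and the bounds format are all assumptions.

```python
from typing import Dict, List, Tuple

Config = Dict[str, int]


def is_realistic(config: Config, bounds: Dict[str, Tuple[int, int]]) -> bool:
    """Check cache ordering, issue width versus ALUs, and per-parameter bounds."""
    # Cache size order L1 < L2 < L3, skipping levels that are absent (size 0).
    caches = [config[k] for k in ("L1_KB", "L2_KB", "L3_KB") if config[k] > 0]
    if any(small >= large for small, large in zip(caches, caches[1:])):
        return False
    if config["width"] < config["ALU"]:  # issue width must cover the ALUs
        return False
    return all(low <= config[name] <= high for name, (low, high) in bounds.items())


def searchable_parameters(params: List[str], integer_app: bool) -> List[str]:
    """Skip the floating-point unit count when tuning for an integer application."""
    return [p for p in params if not (integer_app and p == "FPU")]


good = {"L1_KB": 16, "L2_KB": 128, "L3_KB": 0, "width": 8, "ALU": 6}
print(is_realistic(good, {"width": (1, 8)}))
print(searchable_parameters(["BTB", "FPU", "L2_KB"], integer_app=True))
```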

16 Experimental Results

17 DSE Runtime Comparison

Bench.                Brute Force         Simulated Annealing   AMPLE
                      Time    Iteration   Time    Iteration     Time    Iteration
164.gzip              314     384         34      43            29      36
175.vpr               406     384         50      43            42      35
181.mcf               209     384         29      43            18      36
254.gap               325     384         49      43            36      35
300.twolf             202     384         21      43            18      36
171.swim              830     768         62      43            45      33
179.art               1,561   768         126     43            86      38
189.lucas             396     768         22      43            15      32
Normalized Avg. Time  14.31               1.34                  1.00
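The normalized averages can be roughly cross-checked from the per-benchmark times. The slide does not state how the average was taken, so the sketch below assumes the simplest option, the ratio of total runtimes relative to AMPLE. That gives about 14.68 for brute force and 1.36 for simulated annealing, close to but not identical with the reported 14.31 and 1.34.

```python
# (benchmark, brute-force time, simulated-annealing time, AMPLE time),
# as read from the runtime table on this slide.
RUNTIMES = [
    ("164.gzip", 314, 34, 29),
    ("175.vpr", 406, 50, 42),
    ("181.mcf", 209, 29, 18),
    ("254.gap", 325, 49, 36),
    ("300.twolf", 202, 21, 18),
    ("171.swim", 830, 62, 45),
    ("179.art", 1561, 126, 86),
    ("189.lucas", 396, 22, 15),
]

brute_total = sum(row[1] for row in RUNTIMES)
sa_total = sum(row[2] for row in RUNTIMES)
ample_total = sum(row[3] for row in RUNTIMES)

# Ratio of total runtimes, normalized to AMPLE (an assumed averaging method).
brute_ratio = brute_total / ample_total
sa_ratio = sa_total / ample_total

print(f"brute force / AMPLE: {brute_ratio:.2f}")
print(f"simulated annealing / AMPLE: {sa_ratio:.2f}")
```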

18 Performance Comparison
[Chart; legend:]
- Best: best pick from brute force
- SA: simulated annealing
- Gra: AMPLE with a design goal of “performance”
- Gra II: AMPLE with a design goal of “performance + area”
- 1.0 = brute force average

19 Area Comparison
[Chart; legend:]
- Best: best pick from brute force
- SA: simulated annealing
- Gra: AMPLE with a design goal of “performance”
- Gra II: AMPLE with a design goal of “performance + area”
- 1.0 = brute force average

20 Contributions and Conclusion
- We propose the AMPLE DSE framework
  - Wire-delay conscious
  - Goal-directed: high performance, cost effectiveness
  - Highly efficient: an order of magnitude faster than time-limited (incomplete) brute force, and 1.34x faster than simulated annealing
- We show that AMPLE outperforms prior art in DSE turnaround time and DSE quality

21 Q & A
That’s All Folks!

