1
Temperature Aware Microprocessor Floorplanning Considering Application Dependent Power Load
*Chunta Chu, Xinyi Zhang, Lei He, and Tom Tong Jing
Electrical Engineering Department, University of California, Los Angeles, CA 90095
This work was partially supported by an NSF CAREER award and a UC MICRO grant sponsored by Altera and Intel.
*Chunta Chu is now with Apache Design Solutions.
2
Outline
  Motivation
  Problem formulation and models
  Experimental results
  Conclusion
3
Motivation
Ever-increasing integration levels and clock rates lead to higher temperatures and steeper temperature gradients
  Extra clock skew and performance degradation
  Excessive leakage
  Increased cooling cost
Higher clock rates require interconnect pipelining
A microprocessor floorplan should smooth the temperature gradient and also take interconnect pipelining into account
4
Existing Work
Quick but not accurate [Han: TACS ’05]
  Models temperature with a deterministic heat diffusion model
  No consideration of interconnect pipelining
More accurate but far less efficient [Sankaranarayanan: JILP ’05] and [Nookala: ISLPED ’06]
  Calculate temperature for each candidate floorplan
  No explicit interconnect pipelining
5
Primary Contribution
An efficient yet effective floorplanning method
  Explicit modeling of interconnect pipelining with the TPWL model [Long: DAC ’04]
  Stochastic heat diffusion model that avoids temperature calculation
  Reduces the highest temperature by up to 3 °C and runs up to 27x faster than the most accurate existing solution [Sankaranarayanan: JILP ’05]
6
Outline
  Motivation
  Problem formulation and models
  Experimental results
  Conclusion
7
Problem Formulation
Find a floorplan for the given soft modules of a microprocessor
Minimize [objective function not preserved], where CPI is the average cycles per instruction
8
CPI Model [He-Long, DAC ’04]
Pre-calculate CPI for a number of floorplans based on the predicted trajectory in the solution space
Calculate CPI for a new floorplan by table lookup, interpolating based on its distance to floorplans with known CPI
Less than 3% error compared to cycle-accurate uArch simulation
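The table-lookup step can be illustrated with a small sketch. The slide does not give the interpolation rule or the distance metric, so inverse-distance weighting over a Euclidean distance between hypothetical floorplan feature vectors is assumed here purely for illustration; `interpolate_cpi` and the table layout are made-up names, not the authors' code.

```python
import math

def interpolate_cpi(query, table, eps=1e-12):
    """Estimate CPI for `query` from (feature_vector, cpi) pairs.

    Assumption: inverse-distance weighting with Euclidean distance.
    The slide only says that interpolation is based on distance.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for features, cpi in table:
        distance = math.dist(query, features)
        if distance < eps:
            # The query coincides with a floorplan whose CPI is known.
            return cpi
        weight = 1.0 / distance
        weighted_sum += weight * cpi
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical pre-calculated table: two floorplans with known CPI.
known = [((0.0, 0.0), 1.0), ((2.0, 0.0), 2.0)]
midpoint_cpi = interpolate_cpi((1.0, 0.0), known)
exact_cpi = interpolate_cpi((2.0, 0.0), known)
```

A query halfway between the two known floorplans receives equal weights, so its estimate is the average of the two CPI values.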
9
Deterministic Heat Diffusion Model [Han: TACS ’05]
The heat diffusion between two modules M_i and M_j: [equation not preserved], where the two density terms are the average power densities over time
The total heat diffusion for module M_i: [equation not preserved]
The bigger the heat diffusion, the smaller the temperature gradient and Tmax
[Figure: panels (a) and (b), labeled H]
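The equations for this slide are not available in the transcript. As a rough sketch only, the code below assumes that pairwise heat diffusion is the difference in average power density multiplied by the shared boundary length, and that a module's total is the sum over its neighbors. Both the form and the names `heat_diffusion` and `total_heat_diffusion` are assumptions for illustration, not a confirmed restatement of [Han: TACS ’05].

```python
def heat_diffusion(density_i, density_j, shared_length):
    """Assumed form: (average density difference) x (shared boundary length)."""
    return (density_i - density_j) * shared_length

def total_heat_diffusion(module, density, shared):
    """Sum the assumed pairwise heat diffusion over all neighbors of `module`.

    density: dict mapping module name to average power density
    shared:  dict mapping module name to {neighbor name: shared length}
    """
    return sum(
        heat_diffusion(density[module], density[neighbor], length)
        for neighbor, length in shared[module].items()
    )

# Hypothetical three-module example.
density = {"M1": 4.0, "M2": 1.0, "M3": 2.0}
shared = {"M1": {"M2": 2.0, "M3": 1.0}}
h_m1 = total_heat_diffusion("M1", density, shared)
```

Under this assumed form, a hot module next to cool neighbors with long shared edges gets a large value, which matches the slide's statement that larger heat diffusion goes with a smaller temperature gradient.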
10
Recast of Problem Formulation
Find a floorplan for the given soft modules of a microprocessor
Minimize [objective function not preserved]
11
Primary Limitation of Deterministic Heat Diffusion
Average power density ignores power load correlation
  (a) Transient temperature is higher when power is positively correlated
  (b) Transient temperature is lower when power is negatively correlated
12
Power Correlation of Alpha-chip in SimpleScalar
[Figure: (a) positively correlated; (b) uncorrelated]
13
Calculation of Power Correlation
Treat the power of each module as a stochastic process
Obtain samples of this stochastic process for each module from the transient power simulated over the SPEC2000 benchmarks
Compute the power correlation between modules as the covariance between these stochastic processes
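A minimal sketch of this calculation, assuming each module's simulated transient power is available as an equal-length list of samples. The traces below are invented for illustration. The normalized (Pearson) coefficient is shown alongside the covariance because a later slide reports a correlation value of 0.9, which suggests a normalized measure, but that reading is an assumption.

```python
import math

def mean(samples):
    return sum(samples) / len(samples)

def covariance(xs, ys):
    """Population covariance of two equal-length power traces."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def correlation(xs, ys):
    """Pearson correlation: covariance normalized by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

# Invented transient power samples for three hypothetical modules.
trace_a = [1.0, 2.0, 3.0, 4.0]
trace_b = [2.0, 4.0, 6.0, 8.0]   # rises and falls together with trace_a
trace_c = [4.0, 3.0, 2.0, 1.0]   # moves opposite to trace_a

corr_ab = correlation(trace_a, trace_b)
corr_ac = correlation(trace_a, trace_c)
```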
14
Correlation between Modules
Module indices:
   1 Decode    2 Branch    3 RAT      4 RUU
   5 LSQ       6 IALU1     7 IALU2    8 IALU3
   9 IntReg   10 IL1      11 DL1     12 IALU4
  13 FPAdd    14 FPMul    15 FPReg   16 L2_1
  17 L2_2     18 L2_3
15
Correlation between Modules
Module indices:
   1 Decode    2 Branch    3 RAT      4 RUU
   5 LSQ       6 IALU1     7 IALU2    8 IALU3
   9 IntReg   10 IL1      11 DL1     12 IALU4
  13 FPAdd    14 FPMul    15 FPReg   16 L2_1
  17 L2_2     18 L2_3
The correlation between modules 3 and 10 is 0.9
16
Other Limitations: It Ignores Dead Space
Ignoring dead space may lead to a higher temperature.
A floorplan has dead spaces, and some modules can diffuse more heat into the dead space.
Example: M1's temperature is lower in (a) than in (b).
17
Other Limitations: It Ignores Module Geometry
M1 has a higher temperature in (a) than in (b), since M2's area is smaller than M3's area
Power density: M1 >> M4 > M2 = M3
Besides the shared length between modules, the depth of the adjacent module also has to be considered.
18
Other Limitations: It Ignores Border Effect
A module can diffuse different amounts of heat to the border depending on the package design
19
Stochastic Heat Diffusion Model
Given m modules, n dead spaces, and a power vector P_i = [p_i1, …, p_iT] over T time steps for module M_i
Mean power density for module M_i: E(P_Di)
  A_i is the area of module M_i
  P_Di is the transient power density vector, which equals P_i / A_i
  E(X) is the expectation value of vector X
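The definitions on this slide translate directly into a few lines of code. The sketch below follows the slide's symbols (P_i, A_i, P_Di, E); the function names and sample numbers are illustrative assumptions.

```python
def transient_power_density(power, area):
    """P_Di = P_i / A_i, applied to every time step of the power vector."""
    return [p / area for p in power]

def expectation(vector):
    """E(X): the mean of the vector over its T time steps."""
    return sum(vector) / len(vector)

def mean_power_density(power, area):
    """Mean power density of a module: E(P_Di)."""
    return expectation(transient_power_density(power, area))

# Invented example: T = 3 time steps, module area 2.0.
p_i = [2.0, 4.0, 6.0]
a_i = 2.0
pd_i = transient_power_density(p_i, a_i)
mean_pd_i = mean_power_density(p_i, a_i)
```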
20
Stochastic Heat Diffusion Model (Cont.)
If the adjacent module M_j or dead space N_j is totally inside the window, we modify P_Dj to [equation not preserved]
21
Stochastic Heat Diffusion Model (Cont.)
Heat diffusion to the adjacent modules: [equation not preserved]
  L_ij: shared length between M_i and M_j
Heat diffusion to the adjacent dead spaces: [equation not preserved]
  C_ij: shared length between M_i and N_j
Heat diffusion to the border: [equation not preserved]
  B_i: shared length between M_i and the border
Con_lateral and Con_adjacent: unit thermal conductance
22
Stochastic Heat Diffusion Model (Cont.)
Given m modules and n dead spaces
Power density covariance between M_i and M_j: [equation not preserved]
  E(P_Di, P_Dj) is the expectation value of P_Di P_Dj over T time steps
The standard deviation of the total heat diffusion for module M_i: [equation not preserved]
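The covariance equation itself is not available in the transcript. The sketch below uses the standard product-moment identity cov(X, Y) = E(XY) - E(X)E(Y), which is consistent with the slide's definition of E(P_Di, P_Dj) as the expectation of the element-wise product. Whether the authors normalize in exactly this way is an assumption, and the sample vectors are invented.

```python
def expectation(vector):
    """E(X): the mean of the vector over its T time steps."""
    return sum(vector) / len(vector)

def density_covariance(pd_i, pd_j):
    """cov(P_Di, P_Dj) = E(P_Di * P_Dj) - E(P_Di) * E(P_Dj)."""
    e_product = expectation([a * b for a, b in zip(pd_i, pd_j)])
    return e_product - expectation(pd_i) * expectation(pd_j)

# Invented power density vectors over T = 3 time steps.
pd_1 = [1.0, 2.0, 3.0]
pd_2 = [1.0, 2.0, 3.0]   # moves with pd_1: positive covariance
pd_3 = [3.0, 2.0, 1.0]   # moves against pd_1: negative covariance

cov_12 = density_covariance(pd_1, pd_2)
cov_13 = density_covariance(pd_1, pd_3)
```

Two modules with the same mean power density can therefore have covariances of opposite sign, which is the information that the average power density alone discards.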
23
Stochastic Heat Diffusion Model (Cont.)
The total stochastic heat diffusion for M_i: [equation not preserved]
Given Z potential hottest modules, the total stochastic heat flow is [equation not preserved]
  W_i: weight proportional to [expression not preserved]
24
Outline
  Motivation
  Problem formulation and models
  Experimental results
  Conclusion
25
Implementation and Experiment
The floorplanner uses sequence-pair-based simulated annealing [PARQUET]
Experiments consider the SPEC2000 benchmarks
One superscalar processor in 90nm technology
Modules are soft with aspect ratios between 0.33 and 3, and L2 is partitioned into three modules

  uP                      90nm
  Issue width             4
  Die area (mm^2)         100
  Die thickness (mm)      0.5
  Heat spreader (mm^2)    900
  Heat sink (mm^2)        2500
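For readers unfamiliar with the sequence-pair representation, the sketch below shows the textbook way to turn a sequence pair into module coordinates. It is a simple quadratic-time illustration of the general technique, not PARQUET's implementation; the function name and the example modules are assumptions. In an annealing loop, a move would perturb one of the two sequences (or a module's aspect ratio) and re-evaluate the packing.

```python
def pack_sequence_pair(pos_seq, neg_seq, widths, heights):
    """Compute lower-left coordinates from a sequence pair.

    Rules of the representation:
      a before b in both sequences           -> a is left of b
      a after b in pos_seq, before b in neg_seq -> a is below b
    """
    pos_index = {module: i for i, module in enumerate(pos_seq)}
    x, y = {}, {}
    for k, b in enumerate(neg_seq):
        x_b = 0.0
        y_b = 0.0
        for a in neg_seq[:k]:            # every a here precedes b in neg_seq
            if pos_index[a] < pos_index[b]:
                # a is left of b: b starts at or after a's right edge
                x_b = max(x_b, x[a] + widths[a])
            else:
                # a is below b: b starts at or above a's top edge
                y_b = max(y_b, y[a] + heights[a])
        x[b], y[b] = x_b, y_b
    chip_width = max(x[m] + widths[m] for m in neg_seq)
    chip_height = max(y[m] + heights[m] for m in neg_seq)
    return x, y, chip_width, chip_height

# Two hypothetical modules.
widths = {"a": 2.0, "b": 3.0}
heights = {"a": 1.0, "b": 4.0}

# Same order in both sequences: a is placed to the left of b.
x1, y1, w1, h1 = pack_sequence_pair(["a", "b"], ["a", "b"], widths, heights)

# Reversed order in the first sequence: a is placed below b.
x2, y2, w2, h2 = pack_sequence_pair(["b", "a"], ["a", "b"], widths, heights)
```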
26
Comparison with HotSpot tool [JILP ’05]
[JILP ’05] directly calculates temperature but ignores interconnect pipelining
Our model
  Reduces temperature by up to 3 °C with a 1.34% increase in area
  Runs up to 27x faster

  uP in 90nm    Tmax (°C)    Area (mm^2) (WS)    Runtime (s)
  [JILP ’05]    93.0         119.4 (4.7%)        2300
  Ours          90.0         121.0 (5.6%)        85
  Impact        -3.2%        +1.34%              1/27x
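The "Impact" row follows from the two rows above it. The short check below recomputes it from the table values; the helper name `percent_change` is illustrative.

```python
def percent_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old

# Values taken from the comparison table.
tmax_change = percent_change(90.0, 93.0)     # peak temperature, ours vs. [JILP '05]
area_change = percent_change(121.0, 119.4)   # area, ours vs. [JILP '05]
speedup = 2300 / 85                          # runtime ratio, [JILP '05] over ours

print(f"Tmax: {tmax_change:+.1f}%  Area: {area_change:+.2f}%  Speedup: {speedup:.0f}x")
```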
27
Impact of Thermal Modeling
Our stochastic thermal model can reduce temperature by up to 8.9 °C compared to the thermal-oblivious floorplanner
Compared with the deterministic model, our model obtains up to a 3.2 °C reduction of the on-chip peak temperature and 1.13x better CPI performance

uP in 90nm (percentages are relative to AC)

  Obj.     CPI                             Tmax (°C)                     Area (mm^2) (WS %)
           Best            Avg             Best           Avg            Best                  Avg
  AC       0.820           0.890           97.7           96.7           118.5 (3.05)          122.4 (6.89)
  ACH_d    0.995 +21.3%    1.000 +12.4%    92.0 -5.8%     92.2 -4.7%     122.0 (6.67) +2.9%    125.3 (9.08) +2.3%
  ACH_s    0.880 +7.3%     0.954 +7.2%     88.8 -9.1%     88.9 -8.1%     121.1 (6.10) +2.2%    123.2 (7.36) +0.6%

Obj: A: area; C: CPI; H_d: [Han: TACS ’05]; H_s: ours
28
Conclusions
We have developed a stochastic heat diffusion model that effectively captures the correlation of transient power over the workload
We have also developed an efficient yet effective thermal-aware uP floorplanner
In the future, we will extend this work to 3D integration and multi-core processors