Dynamic Power Redistribution in Failure-Prone CMPs Paula Petrica, Jonathan A. Winter * and David H. Albonesi Cornell University *Google, Inc.


1 Dynamic Power Redistribution in Failure-Prone CMPs
Paula Petrica, Jonathan A. Winter* and David H. Albonesi
Cornell University, *Google, Inc.

2 Motivation (Paula Petrica, WEED2010)
- Hardware failures are expected to become prominent in future generations
[Figure: a core divided into Front End (FE), Back End (BE), and Load-Store Queue (LSQ)]

3 Motivation
- Deconfiguration tolerates defects at the expense of performance
- Pipeline imbalance: units correlated with the deconfigured one might become overprovisioned
- Power inefficiencies
- Application specific
[Figure: a core divided into Front End (FE), Back End (BE), and Load-Store Queue (LSQ)]

4 Research Goal
Given a CMP with a set of failures and a power budget:
- Eliminate power inefficiencies
- Improve performance

5 Outline
- Motivation
- Architecture
- Power Harnessing
- Performance Boosting
- Power Transfer Runtime Manager
- Conclusions and future work

6 Architecture
Two-step approach:
- Harness power
- Transfer power
[Figure: Core 1 and Core 2, each with Front End (FE), Back End (BE), and Load-Store Queue (LSQ)]

7 Power Harnessing
[Figure: pipeline diagram showing FQ, Decode/Rename, Dispatch, ROB, IQ, Select, D-Cache, RF, BPred, and I-Cache, grouped into FE, BE, and LSQ]

8 Pipeline Imbalance
[Figure: Performance Loss and Power Saved]

9 Performance Boosting
- Distribute the accumulated margin of power to boost performance
- Temporarily enable a previously dormant feature
- Requirements:
  - Small area and fast power-up
  - Small PPR (Power-Performance Ratio)
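
The slide does not define PPR. As a minimal sketch, assuming PPR is the relative power increase divided by the relative performance increase (so a smaller value means a cheaper boost), it could be computed as below. The function name and the sample numbers are hypothetical and are not taken from the presentation.

```python
def power_performance_ratio(base_power, boosted_power, base_perf, boosted_perf):
    """Assumed definition: relative power increase / relative performance increase."""
    power_increase = (boosted_power - base_power) / base_power
    perf_increase = (boosted_perf - base_perf) / base_perf
    if perf_increase <= 0:
        # A technique that does not speed anything up is never worth its power.
        return float("inf")
    return power_increase / perf_increase


# Hypothetical numbers: 50% more power buys 25% more performance.
example_ppr = power_performance_ratio(10.0, 15.0, 2.0, 2.5)
```

Under this assumed definition, a technique with a linear power-performance relationship has a roughly constant PPR, while one with a superlinear power cost has a PPR that grows with the size of the boost.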

10 Performance Boosting Techniques
Speculative Cache Access
- Speculatively send L1 requests to the L2 cache
- Speculatively access both tag and data in the L2 cache at the same time (rather than serially)
- Turned on independently or in combination
- Approximately linear power-performance relationship
- Benefits applications limited by L1 cache capacity
[Figure: a load accessing the L1 cache, then on an L1 miss the L2 cache tag and data, with hit and miss paths to the lower hierarchy level; variants show the speculative L2 request and the parallel tag and data access]
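
The latency effect of the two speculative options can be illustrated with a simplified model. This is a sketch under stated assumptions, not the authors' simulator: it ignores contention and the extra power spent on wasted L2 accesses, it assumes a speculative L2 request starts at the same time as the L1 lookup, and all cycle counts are hypothetical.

```python
def l2_hit_latency(tag_cycles, data_cycles, parallel_tag_data=False):
    """L2 hit latency: tag then data (serial) or both at once (parallel)."""
    if parallel_tag_data:
        return max(tag_cycles, data_cycles)
    return tag_cycles + data_cycles


def l1_miss_l2_hit_latency(l1_cycles, tag_cycles, data_cycles,
                           speculative_l2_request=False,
                           parallel_tag_data=False):
    """Latency of a load that misses in L1 and hits in L2."""
    l2 = l2_hit_latency(tag_cycles, data_cycles, parallel_tag_data)
    if speculative_l2_request:
        # Assumption: the L2 access overlaps fully with the L1 lookup.
        return max(l1_cycles, l2)
    return l1_cycles + l2


# Hypothetical latencies: L1 = 2 cycles, L2 tag = 4 cycles, L2 data = 8 cycles.
baseline = l1_miss_l2_hit_latency(2, 4, 8)
tag_data_only = l1_miss_l2_hit_latency(2, 4, 8, parallel_tag_data=True)
spec_request_only = l1_miss_l2_hit_latency(2, 4, 8, speculative_l2_request=True)
both = l1_miss_l2_hit_latency(2, 4, 8, speculative_l2_request=True,
                              parallel_tag_data=True)
```

In this model the two options compose, which is consistent with the slide's note that they can be turned on independently or in combination.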

11 Performance Boosting Techniques
Boosting main memory performance: CLEAR [N. Kirman et al., HPCA 2005]
- Predict and speculatively retire long-latency loads
- Supply predicted values to destination registers
- Free processor resources for non-dependent instructions
- Linear power-performance relationship
- Benefits memory-bound applications

12 Performance Boosting Techniques
DVFS
- Scale up voltage and frequency
- Already built in
- Cubic power cost for linear performance benefit
- Benefits high-IPC applications
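
The cubic cost follows from the usual dynamic-power approximation, in which power is proportional to capacitance times voltage squared times frequency, combined with the assumption that voltage scales roughly in proportion to frequency. A minimal sketch of that arithmetic follows; the function names and wattages are hypothetical, and static power is ignored.

```python
def dvfs_power_scale(freq_scale):
    """Dynamic power multiplier when frequency and voltage both scale by freq_scale."""
    return freq_scale ** 3


def max_freq_scale(base_power, spare_power):
    """Largest frequency multiplier that fits in base_power + spare_power."""
    return ((base_power + spare_power) / base_power) ** (1.0 / 3.0)


# A 10% frequency increase costs about 33% more dynamic power.
ten_percent_cost = dvfs_power_scale(1.10)

# Conversely, 3.31 W of harnessed power on a 10 W core buys about 10% frequency.
affordable = max_freq_scale(10.0, 3.31)
```

Even in the best case, where performance tracks frequency linearly, this makes DVFS a more expensive use of harnessed power than a technique with a linear power-performance relationship.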

13 Comparison of Boosting Techniques
[Figure: Performance Improvement]

14 Architecture
Two-step approach:
- Harness power
- Transfer power
[Figure: Core 1 and Core 2, each with Front End (FE), Back End (BE), and Load-Store Queue (LSQ)]

15 Power Transfer Runtime Manager
Periodically coordinate a chip-wide effort to relocate power among cores:
- Obtain the current local hardware deconfiguration status (due to faults)
- Determine additional components to be deconfigured
- Transfer power to one or more mechanisms that make the best use of it

16 Power Transfer Runtime Manager
[Diagram: a Sampling Phase followed by a Steady Phase. Steps shown, in order: sample deconfigurations; choose additional deconfiguration; sample performance boosting; compute global throughput with fairness; choose best 4-core configuration; apply DVFS (greedy). Labels: Local decisions, Global decisions]
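
The "choose best configuration" step can be sketched as a search over per-core options under a chip-wide power budget. This is an illustration under stated assumptions, not the authors' algorithm: the slide does not specify the fairness metric, so the harmonic mean of per-core speedups is assumed here; the search is a plain exhaustive enumeration; the greedy DVFS step is omitted; and the option names, wattages, and speedups are hypothetical.

```python
from itertools import product
from statistics import harmonic_mean


def choose_configuration(core_options, power_budget):
    """Pick one (name, power_watts, speedup) option per core.

    Returns the combination with the highest harmonic-mean speedup
    (an assumed throughput-with-fairness metric) whose total power
    fits the budget, or (None, 0.0) if nothing fits.
    """
    best_combo, best_score = None, 0.0
    for combo in product(*core_options):
        total_power = sum(power for _, power, _ in combo)
        if total_power > power_budget:
            continue  # violates the chip-wide power budget
        score = harmonic_mean([speedup for _, _, speedup in combo])
        if score > best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score


# Hypothetical sampled options for a 2-core example.
sampled = [
    [("base", 10.0, 1.0), ("spec_cache", 12.0, 1.2)],
    [("base", 10.0, 1.0), ("clear", 13.0, 1.3)],
]
chosen, score = choose_configuration(sampled, power_budget=23.0)
chosen_names = tuple(name for name, _, _ in chosen)
```

Exhaustive enumeration grows exponentially with the number of cores, which is manageable for 4 cores but not for many-core chips.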

17 Global vs Local Optimization
100 4-core configurations, random errors and random SPEC CPU2000 benchmarks
[Figure: Speedup; values shown: 22.2%, 10.0%]

18 Diversity of Boosting Techniques
100 4-core configurations, random errors and random SPEC CPU2000 benchmarks
[Figure: Speedup; values shown: 22.2%, 6.3%]

19 Power Transfer Runtime Manager
100 4-core configurations, random errors and random SPEC CPU2000 benchmarks
[Figure: Speedup; values shown: 22.2%, 15.3%, 10.0%, 6.3%]

20 Conclusions
- We proposed a technique to increase performance under a given power budget in the presence of hard faults
- Exploited the deconfiguration capabilities already built into microprocessors
- Demonstrated that pipeline imbalances and additional deconfiguration are application-dependent
- Proposed several boosting techniques
- Demonstrated the potential for substantial performance gains for a 4-core CMP

21 Future Work
- Heuristic approaches to scale this problem to many cores
  - Simulated annealing, genetic algorithm
- Pareto-optimal fronts to reduce the number of combinations
- Hierarchical optimization
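
As a rough illustration of how a simulated-annealing heuristic might replace exhaustive search on many cores, consider the sketch below. It is not from the presentation: the neighbor move (re-pick one core's option at random), the linear cooling schedule, the harmonic-mean objective, and all numbers are assumptions chosen for brevity. Option 0 on every core is assumed to fit the power budget.

```python
import math
import random
from statistics import harmonic_mean


def total_power(core_options, choice):
    """Sum of the power of the chosen (power, speedup) option on each core."""
    return sum(core_options[core][opt][0] for core, opt in enumerate(choice))


def fairness_score(core_options, choice):
    """Assumed objective: harmonic mean of per-core speedups."""
    return harmonic_mean([core_options[core][opt][1]
                          for core, opt in enumerate(choice)])


def anneal(core_options, power_budget, steps=2000, start_temp=0.1, seed=0):
    rng = random.Random(seed)  # seeded so that runs are repeatable
    current = [0] * len(core_options)  # assumed to fit the budget
    current_score = fairness_score(core_options, current)
    best, best_score = list(current), current_score
    for step in range(steps):
        temp = start_temp * (1.0 - step / steps) + 1e-9  # linear cooling
        candidate = list(current)
        core = rng.randrange(len(core_options))
        candidate[core] = rng.randrange(len(core_options[core]))
        if total_power(core_options, candidate) > power_budget:
            continue  # reject moves that exceed the budget
        candidate_score = fairness_score(core_options, candidate)
        delta = candidate_score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score


# Hypothetical 8-core chip: each core can run base (10 W) or boosted (12 W).
options = [[(10.0, 1.0), (12.0, 1.2)] for _ in range(8)]
best_choice, best_score = anneal(options, power_budget=88.0)
```

The result is not guaranteed to be optimal; the point of the heuristic is that its cost grows with the number of steps rather than with the number of possible combinations.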

22 Questions?

