Phase Detection and Prediction on Real Systems for Workload-Adaptive Power Management
Canturk ISCI, Margaret MARTONOSI
Speaker notes: This talk presents a recent study on phase analysis for power characterization, in which we look at the problem from two different angles, namely control-flow-based and event-counter-based approaches, and evaluate their responses.
Program Phases
Distinct and often-recurring regions of program behavior.
Useful for: characterizing execution regions; using current phase/behavior to predict future behavior; managing dynamic adaptation.
How can we detect recurrent execution under real-system variability?
How can we predict future phase patterns?
How can we leverage predicted phase behavior for workload-adaptive power management?
Can we do better than simple, reactive methods?
Research Overview
Monitor application execution via specific features.
Classify features into phases.
Detect/predict phase behavior.
Apply dynamic power management guided by phase predictions.
Validate with real measurements.
[Diagram blocks: Application; Runtime Monitoring (Hardware Performance Counters, Dynamic Program Flow); Power Estimation; Phase Analysis; Dynamic Management; Real Measurements]
This Talk
This talk covers a specific recent project:
Track memory accesses per instruction (Mem/Uop) via performance counters.
Classify execution into phase patterns based on Mem/Uop rates.
Predict future behavior with the Global Phase History Table (GPHT) predictor.
Use phase predictions to guide dynamic voltage and frequency scaling (DVFS).
[Diagram blocks: Application; Runtime Monitoring (Hardware Performance Counters, Dynamic Program Flow); Power Estimation; Phase Analysis (Phase Classification, Phase Prediction); Dynamic Management; Real Measurements]
From Execution to Phases
[Figure: Mem/Uop rate (axis 0.000 to 0.020) and the corresponding phases (1 to 5) versus cycles (2.80E+10 to 3.30E+10)]
Assign different Mem/Uop ranges to different phases.
A higher phase number means a more memory-bound phase.
Phase patterns expose available recurrence!
This phase definition is simple, resilient to system variations, and invariant to dynamic power management actions.
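The range-based classification on this slide can be sketched in a few lines of Python. This is only an illustration: the slide does not list the actual range boundaries, so the `PHASE_BOUNDS` values below are assumed (evenly spaced over the plotted 0.000 to 0.020 axis), and the function names are hypothetical.

```python
from bisect import bisect_right

# ASSUMPTION: illustrative Mem/Uop boundaries between phases 1..5.
# The slide does not give the real thresholds.
PHASE_BOUNDS = [0.005, 0.010, 0.015, 0.020]


def classify_phase(mem_per_uop: float) -> int:
    """Return a phase number in 1..5; higher means more memory bound."""
    return bisect_right(PHASE_BOUNDS, mem_per_uop) + 1


def phases_from_counters(mem_accesses, uops):
    """Classify each sampling interval from raw counter deltas.

    Intervals with zero retired uops are treated as phase 1 (an assumption).
    """
    return [
        classify_phase(m / u) if u else 1
        for m, u in zip(mem_accesses, uops)
    ]
```

Because a phase depends only on the Mem/Uop ratio of an interval, the same code region should map to the same phase regardless of the frequency at which it ran, which is the invariance the slide points to.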
Predicting Phases with the GPHT
[Diagram: the GPHR holds the most recent phases Pt, Pt-1, Pt-2, …, Pt-N (GPHR depth), where Pt is the last observed phase from performance counters. Each PHT entry holds a tag (a stored phase pattern such as Pt’, Pt’-1, Pt’-2, …, Pt’-N), a prediction (such as Pt’+1), and an Age / Invalid field (examples shown: 15, 20, -1).]
Predicted phase: from GPHR(0) if there is no matching pattern; from the corresponding PHT prediction entry if there is a matching pattern in the PHT.
Similar to a global history branch predictor.
Implemented in the OS for on-the-fly phase prediction.
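The lookup rule on this slide (use the PHT prediction on a tag match, otherwise fall back to GPHR(0)) can be sketched as follows. This is not the authors' OS implementation. The class name `GPHT`, the method `update_and_predict`, the training rule (store the phase that actually followed each history), and the least-recently-used replacement standing in for the Age / Invalid field are all assumptions made for illustration.

```python
from collections import OrderedDict, deque


class GPHT:
    """Minimal sketch of a global-phase-history-table style predictor."""

    def __init__(self, gphr_depth=8, pht_entries=128):
        # gphr[0] is the most recent phase; phase 0 marks "no history yet".
        self.gphr = deque([0] * gphr_depth, maxlen=gphr_depth)
        # Maps a tag (tuple of recent phases) to the phase that followed it.
        # Insertion order doubles as LRU age (ASSUMED replacement policy).
        self.pht = OrderedDict()
        self.pht_entries = pht_entries
        self.pending_tag = None  # history used for the previous prediction

    def update_and_predict(self, observed_phase):
        """Record the newly observed phase and predict the next one."""
        # Train: the previous history should have predicted observed_phase.
        if self.pending_tag is not None:
            self.pht[self.pending_tag] = observed_phase
            self.pht.move_to_end(self.pending_tag)
            if len(self.pht) > self.pht_entries:
                self.pht.popitem(last=False)  # evict least recently used
        # Shift the observation into the history register.
        self.gphr.appendleft(observed_phase)
        tag = tuple(self.gphr)
        self.pending_tag = tag
        # Predict: PHT entry on a match, otherwise last value (GPHR(0)).
        if tag in self.pht:
            self.pht.move_to_end(tag)
            return self.pht[tag]
        return self.gphr[0]
```

For a repeating pattern such as 1, 1, 3, 1, 1, 3, … a last-value predictor is wrong on two of every three transitions, while this sketch predicts every transition correctly once each history has been seen.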
Prediction Accuracies
[Figure: prediction accuracy (%), axis 40 to 100, per benchmark for LastValue and for GPHT with GPHR depth 8 and 1024, 128, 64, and 1 PHT entries. Benchmarks: gzip_log, mcf_inp, gcc_200, gap_ref, gcc_166, apsi_ref, gcc_scilab, gcc_expr, ammp_in, parser_ref, mgrid_in, applu_in, equake_in, wupwise_ref, gcc_integrate, bzip2_program, bzip2_source, bzip2_graphic]
Compared against a reactive approach (last-value prediction).
GPHT performs significantly better for highly varying applications.
Up to 6X, and on average 2.4X, fewer mispredictions.
Good performance down to 128 PHT entries.
Converges to last-value prediction as the number of PHT entries approaches 1.
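For reference, the reactive baseline and the "misprediction improvement" ratio can be expressed as below. The function names are hypothetical, and the rates used in the usage note are made-up numbers, not results from the talk.

```python
def last_value_predictions(phases):
    """Reactive baseline: predict that the next phase equals the current one."""
    return phases[:-1]


def misprediction_rate(predicted, actual):
    """Fraction of intervals whose predicted phase differs from the actual one."""
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


def misprediction_improvement(rate_baseline, rate_new):
    """How many times fewer mispredictions the new predictor makes."""
    if rate_new == 0:
        return float("inf")
    return rate_baseline / rate_new
```

As an illustration with invented rates, a baseline that mispredicts 30% of intervals and a predictor that mispredicts 10% give `misprediction_improvement(0.30, 0.10)`, a 3X improvement.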
Phase Driven Dynamic Power Management
Phase definitions reflect memory boundedness, and therefore DVFS potential.
Each predicted phase maps to a corresponding (V,f) setting.
[Figure: implementation overview]
Speaker notes: Now we can use these phases to guide dynamic power management.
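The phase-to-setting mapping can be sketched as a lookup table plus a small governor that only switches when the target changes. The voltage and frequency values, the `DVFS_TABLE` name, and the `DvfsGovernor` class are all hypothetical; the slide does not list the actual operating points or the mechanism used to apply them.

```python
from typing import Callable, Dict, Optional, Tuple

Setting = Tuple[float, int]  # (voltage in V, frequency in MHz)

# ASSUMPTION: illustrative operating points only. More memory-bound phases
# (higher numbers) get lower settings because they lose less performance.
DVFS_TABLE: Dict[int, Setting] = {
    1: (1.40, 1500),
    2: (1.30, 1300),
    3: (1.20, 1100),
    4: (1.10, 900),
    5: (1.00, 700),
}


class DvfsGovernor:
    """Apply the (V,f) setting that corresponds to each predicted phase."""

    def __init__(self, apply_setting: Callable[[Setting], None]):
        # apply_setting is a caller-supplied hook (hypothetical) that would
        # program the platform's voltage/frequency controls.
        self.apply_setting = apply_setting
        self.current: Optional[Setting] = None

    def on_prediction(self, predicted_phase: int) -> Setting:
        target = DVFS_TABLE[predicted_phase]
        if target != self.current:  # avoid redundant transitions
            self.apply_setting(target)
            self.current = target
        return target
```

Skipping redundant transitions is a design choice of this sketch: mode switches have a cost, so the governor acts only when the predicted phase actually calls for a different setting.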
Complete Example
[Figure, three panels plotted against instructions: (1) Mem/Uop with actual phases and GPHT-predicted phases (phases 1 to 5); (2) power in W, baseline versus GPHT; (3) BIPS, baseline versus GPHT]
GPHT can accurately predict varying application behavior!
Significant power savings compared to baseline!
Negligible performance degradation!
Improvement over Reactive Methods
7% EDP (energy-delay product) improvement over reactive methods!
Comparable or lower performance degradation!
[Figure: EDP improvement and performance degradation for GPHT and last-value prediction, relative to baseline execution]
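The two metrics in these plots can be computed as sketched below, taking the energy-delay product as energy multiplied by execution time. The function names are hypothetical, and the numbers in the usage note are invented for illustration rather than taken from the measurements.

```python
def edp(avg_power_w: float, exec_time_s: float) -> float:
    """Energy-delay product in J*s: (power * time) * time."""
    energy_j = avg_power_w * exec_time_s
    return energy_j * exec_time_s


def edp_improvement(baseline_edp: float, managed_edp: float) -> float:
    """Fractional EDP reduction relative to baseline (0.25 means 25% better)."""
    return 1.0 - managed_edp / baseline_edp


def perf_degradation(baseline_time_s: float, managed_time_s: float) -> float:
    """Fractional slowdown relative to baseline (0.05 means 5% slower)."""
    return managed_time_s / baseline_time_s - 1.0
```

For example, with invented values, a run that drops from an EDP of 40 J*s to 30 J*s gives `edp_improvement(40.0, 30.0)`, a 25% improvement, and a run that takes 2.1 s instead of 2.0 s is about 5% slower.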
Conclusions
Phase characterizations help identify repetitive application behavior under real-system variability and dynamic management actions.
Runtime phase predictions with the Global Phase History Table can accurately predict future application behavior: up to 6X, and on average 2.4X, fewer mispredictions than reactive approaches.
Dynamic power management guided by these phase predictions helps improve system power/performance efficiency: 27% EDP improvement over baseline and 7% over reactive approaches.
The presented research framework and real-system experiments can guide phase-oriented characterization and dynamic adaptation applications.
Speaker notes: In this work, we showed our observations from a real-system experiment that evaluates workload power characterization with control-flow and event-counter based features. The results of our study showed the points listed on this slide. We hope that the resulting experimental framework and observations can guide phase-oriented characterization and system adaptation work on real systems.
Thanks!
EXTRAS
1.1) Why care about phases: examples
1.2) Why care about power phases: examples
1.3) What are the different features that previous studies looked at?
2) Experiment setup details
1.1) Why Care About Phases?
Characterizing execution regions.
[Figure: execution summarized into representative execution regions E1, E2, E3, E4]
1.1) Why Care About Phases?
Characterizing execution regions.
Managing dynamic adaptation.
[Figure: dynamic/adaptive management, with OFF and ON states]
1.1) Why Care About Phases?
Characterizing execution regions.
Managing dynamic adaptation.
Use current phase/behavior to predict future behavior.
[Figure: Load Refs and Store Misses versus time in seconds]
1.2) Why Care About Power Phases?
Useful for:
Guiding power budget / temperature limit management (e.g., Montecito/Foxton).
[Figure: power in W and temperature in °C versus time in seconds, showing uncontrolled temperature against enforced temperature, with a "Slow down!" action]
1.2) Why Care About Power Phases?
Useful for:
Guiding power budget / temperature limit management.
Power/temperature-aware scheduling [Bellosa et al. COLP’03].
[Figure: power in W versus time in seconds]
This helps in two ways: it reduces the cooling cost / heat removal rate for a server, and it extends battery life for a mobile system, as less cooling power/time is needed.
1.2) Why Care About Power Phases?
Useful for:
Guiding power budget / temperature limit management.
Power/temperature-aware scheduling.
Power balancing for multiprocessor systems / activity migration.
[Figure: power of Task1 and Task2 on Core/μP 1 and Core/μP 2. Options: swap the hot task, migrate the hot task, or slow down the hot core ("Speed up!" / "Slow down!")]
Older
This Talk
Classify application execution into phases based on HW performance counters.
Predict phase behavior.
Apply dynamic power management guided by phase predictions.
Validate with real measurements.
[Diagram blocks: Application; Runtime Monitoring (Hardware Performance Counters, Dynamic Program Flow); Power Estimation; Phase Analysis; Dynamic Management; Real Measurements]
Predicting Phases with the GPHT
[Diagram: the GPHR holds Pt, Pt-1, Pt-2, …, Pt-N (GPHR depth), where Pt is the last observed phase from performance counters. PHT entries hold tags (such as Pt’’, Pt’’-1, Pt’’-2, …, Pt’’-N and Pt’, Pt’-1, Pt’-2, …, Pt’-N), predictions (Pt’’+1, Pt’+1, P0), and an Age / Invalid field (examples shown: 15, 20, -1).]
Predicted phase: from GPHR(0) if there is no matching pattern; from the corresponding PHT prediction entry if there is a matching pattern in the PHT.
Similar to a global history branch predictor.
Implemented in the OS for on-the-fly phase prediction.