Princeton University Electrical Engineering. 12th International Symposium on High-Performance Computer Architecture (HPCA-12), Austin, TX, Feb 14, 2006.

Presentation transcript:

Phase Characterization for Power: Evaluating Control-Flow-Based and Event-Counter-Based Techniques
Canturk ISCI, Margaret MARTONOSI
Princeton University Electrical Engineering
12th International Symposium on High-Performance Computer Architecture (HPCA-12), Austin, TX, Feb 14, 2006

What are Program Phases?
• Distinct and often-recurring regions of program behavior
• Ex: Vortex

Power Behavior has Phases, too
• Recurring intervals of distinct power behavior
• Ex: Vortex

Phase Methods for Power
• Several studied program characteristics
Two main approaches:
• Control Flow Methods
  - Basic Block Vectors (BBVs) [Sherwood et al. ASPLOS’02]
• Event Monitoring Techniques
  - Performance Monitoring Counters (PMCs) [Isci and Martonosi Micro’03]
Key Question: How do these methods perform in terms of useful representations of power phase behavior?

Outline
• Real-system experimentation framework
• How do the control-flow-based and event-counter-based approaches perform in power characterization?
• Reasons why the two approaches can differ

Experimental Setup
[Diagram: the application binary runs under Pin with a Pintool, on top of the OS and hardware.]
• Pintool actions: instrument basic block heads; sample basic block head addresses; collect PMC event rates; enable/disable counters; read/flush power history; enable/disable power input
• Components: performance counter hardware; external power measurement via current probe; OS serial device file

A {BBV, PMC, Power} Sample
[Figure: one sample consists of a BBV, a PMC vector and one power number.]
• Visited basic blocks (addresses such as 0x804878d and 0x804879c) pass through the BBV hash to form the BBV
• PMCs form the PMC vector
• Power history (35.9W, 36.9W, 37.2W, 37.5W, 36.5W) is summarized as 1 power number (37W)

From Sample Vectors to Phases
• We have vectors as a proxy for power
  - Like vectors => like power
• Cluster similar vectors together and consider them a power phase
  - Here: First Pivot Clustering
  - Paper also shows agglomerative clustering
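The clustering step above can be sketched as follows. This is a minimal sketch of a leader-style reading of "first pivot" clustering, under assumptions the slide does not state: L1 distance between vectors, a fixed hypothetical threshold, and a sample that fits no existing phase becoming a new pivot. The function names, the threshold and the sample vectors are illustrative and not taken from the paper.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def first_pivot_clustering(samples: List[Sequence[float]],
                           threshold: float) -> List[int]:
    """Assign a phase id to every sample.

    Assumed scheme: walk the samples in order; a sample joins the first
    existing pivot within `threshold`, otherwise it becomes a new pivot.
    """
    pivots: List[Sequence[float]] = []
    labels: List[int] = []
    for vec in samples:
        for phase_id, pivot in enumerate(pivots):
            if l1_distance(vec, pivot) <= threshold:
                labels.append(phase_id)
                break
        else:
            # No pivot was close enough: this sample starts a new phase.
            pivots.append(vec)
            labels.append(len(pivots) - 1)
    return labels


# Hypothetical 4-dimensional sample vectors (made-up numbers).
samples = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.25, 0.25, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.0],
]
labels = first_pivot_clustering(samples, threshold=0.5)
print(labels)
```

A target number of phases could be reached by adjusting the threshold; how the paper does this is not stated on the slide.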

Evaluation
• Main steps
  - Cluster BBV samples
  - Cluster PMC vectors
  - Compare each to true measured power
• Also compare to
  - Oracle: classify directly for power
  - Random: assign samples to target clusters randomly
• Benchmarks:
  - 46 benchmark-input pairs from SPEC2K and other document creation, media and scientific applications
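The two reference classifications can be illustrated with a short sketch. The slide only says that Random assigns samples to target clusters randomly and that Oracle classifies directly for power; the equal-width binning used for the oracle here, the seed, and both function names are assumptions made for illustration.

```python
import random
from typing import List


def random_phases(num_samples: int, num_phases: int, seed: int = 0) -> List[int]:
    """Upper-bound reference: give every sample a random phase id."""
    rng = random.Random(seed)
    return [rng.randrange(num_phases) for _ in range(num_samples)]


def oracle_phases(power: List[float], num_phases: int) -> List[int]:
    """Lower-bound reference: classify on measured power itself.

    Assumption: equal-width bins between the minimum and maximum power.
    """
    lo, hi = min(power), max(power)
    width = (hi - lo) / num_phases
    if width == 0:
        return [0] * len(power)
    # Clamp the maximum value into the last bin.
    return [min(int((p - lo) / width), num_phases - 1) for p in power]


# Power history values as they appear on the sample slide.
power = [35.9, 36.9, 37.2, 37.5, 36.5]
oracle = oracle_phases(power, num_phases=2)
rand = random_phases(len(power), num_phases=2)
print(oracle, rand)
```

Any classification that groups samples by power directly would serve as the oracle; binning is only the simplest choice to write down.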

Results
• Consistent results regardless of clustering method
• SPECfp < SPECint < Others, following from variability of power and memory behavior

Errors with respect to Bounds
• Consistent results regardless of clustering method
• SPECfp < SPECint < Others, following from variability of power and memory behavior
• BBVs and PMCs both improve on the upper bound, but a significant gap to the lower bound remains
  - BBVs: 70% of Random
  - PMCs: 40% of Random

Errors with respect to Bounds
• Consistent results regardless of clustering method
• SPECfp < SPECint < Others, following from variability of power and memory behavior
• BBVs and PMCs both improve on the upper bound, but a significant gap to the lower bound remains
  - Oracle: 30% of BBVs
  - Oracle: 50% of PMCs

Comparing PMCs to BBVs
• Consistent results regardless of clustering method
• SPECfp < SPECint < Others, following from variability of power and memory behavior
• BBVs and PMCs both improve on the upper bound, but a significant gap to the lower bound remains
• PMCs generally lead to lower errors than BBVs
  - PMCs achieve 40% lower error than BBVs
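The percentages quoted on the three results slides can be cross-checked with simple arithmetic. The sketch below assumes that each figure is a ratio of average errors and that the slide values are rounded; it normalizes the Random error to 1.0 and uses only numbers stated on the slides.

```python
# Relative errors as stated on the slides, normalized to Random = 1.0.
random_err = 1.0
bbv_err = 0.70 * random_err   # "BBVs 70% of Random"
pmc_err = 0.40 * random_err   # "PMCs 40% of Random"

# Reduction of PMC error relative to BBV error.
pmc_reduction_vs_bbv = 1.0 - pmc_err / bbv_err

# Oracle error implied by each of the two Oracle statements.
oracle_via_bbv = 0.30 * bbv_err   # "Oracle 30% of BBVs"
oracle_via_pmc = 0.50 * pmc_err   # "Oracle 50% of PMCs"

print(f"PMC error reduction vs. BBV: {pmc_reduction_vs_bbv:.0%}")
print(f"Oracle / Random via BBVs: {oracle_via_bbv:.2f}")
print(f"Oracle / Random via PMCs: {oracle_via_pmc:.2f}")
```

Under these assumptions the reduction comes out at about 43%, in line with the rounded "40%" on the slide, and both Oracle statements imply an oracle error of roughly one fifth of Random.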

Different Target Number of Clusters
• PMCs perform relatively better for the practical range of target clusters
• Relative BBV error is significantly larger than PMC error for a small number of phases [1-10]

Why BBV and PMC Phases Differ
• The same basic blocks can have different power behavior
  - Different cache hit/miss patterns at different points
  - Operand-dependent behavior
• Different basic blocks can have similar power behavior
  - Different execution paths with effectively the same execution characteristics

Effectively Same Execution
• Mesh: Various computationally similar tasks
• Lead to many control-flow phases, not binding to application behavior
[Figure: BBV patterns and PMC events over the repeating regions M1 M2 M3 M1 M2 M3 M1 M2 M3]

Resulting Phases
• For power management, too-detailed control flow information can be detrimental!
[Figure: BBV patterns over the regions M1 M2 M3 M1 M2 M3 M1 M2 M3]
• Desired phases: H, L, M
• Observed phases:
  - BBV: BAACBACBACB
  - PMC: BA C

Potentials & Challenges
• Control flow (BBVs):
  - Perfect repeatability
  - Architectural independence
  - Detail at program level
  - Runtime applicability
  - BBV phases ≢ power phases
  - No physical binding to power
• Event counters (PMCs):
  - Runtime monitoring
  - Strong relation to power
  - Imperfect repeatability
  - Lack of detail
• Combining the strengths of two sides?
  - Mutual information, but direct combination of vectors does not help!
  - Future direction: Consider in terms of hierarchy/cooperation

Conclusions
• Phase characterizations with control flow and event counter features can provide insight into application power behavior on real systems
  - 2-6X better characterization than upper bounds
• The two features can suggest significantly different power phase characterizations under different scenarios
• PMC-based approaches generally provide a better proxy for changes in power behavior
  - 40% lower error than BBVs
• The resulting experimental framework and observations can help guide phase-oriented characterization and system adaptation work on real systems

Conclusions
• Real-system setup comparing power phase classifications based on BBVs and PMCs
  - 46 benchmark-input pairs / Live system runs
• Phase characterizations with control flow and event counter features can provide insight into application power behavior on real systems
  - 2-6X better characterization than upper bounds
• PMC-based approaches generally provide a better proxy for changes in power behavior
  - 40% lower error than BBVs
• The resulting experimental framework and observations can help guide phase-oriented characterization and system adaptation work on real systems

Conclusions
• Real-system setup comparing power phase classifications based on BBVs and PMCs
  - 46 benchmark-input pairs / Live system runs
• PMCs alone have 40% lower error than BBVs alone for power phases
  - Likely because they have a natural physical binding to power estimation
• But control flow information (BBVs) offers often-useful detail
  - Hybrids of PMC+BBV are likely to offer the best approach
• Both offer a useful foundation for a range of dynamic power adaptations

Thanks!

EXTRAS
• 1.1) Why care about phases examples
• 1.2) Why care about power phases examples
• 1.3) What are the different features that previous studies looked at?
• 2) Experiment setup details
• 3) Illustration of BBV, PMC & Power collection methodology
• 3.1) Similarity/Vector Distance concept
• 3.2) Error computation
• 3.3) Tracked Events
• 3.6) Results for agglomerative
• 3.7) More on Sources of contradiction
• 3.8) Explaining BBV patterns
• 3.9) Dcache example to varying data locality
• 3.95) Stream Example to Operand Dependent Behavior
• 3.96) 5 Phases to effectively same execution?
• 3.97) BBV+PMC Hierarchy
• 4) Power Behavior Under Pin
• 5) How bad are the sampling effects on BBVs?
• 10) How much do power phases differ from IPC phases? (What is the scatter for IPC & PWR from measurement)

1.1) Why Care About Phases?
• Characterizing execution regions
[Figure: execution regions E1, E2, E3, E4]

Why Care About Phases?
• Characterizing execution regions
• Managing dynamic adaptation
[Figure: adaptation switched OFF / ON]

Why Care About Phases?
• Characterizing execution regions
• Managing dynamic adaptation
• Use current phase/behavior to predict future behavior

Why Care About Power Phases?
• Useful for:
  - Guiding power budget / temperature limit management, e.g. Montecito/Foxton
[Figure: power [W] and temperature [°C] over time [s], showing uncontrolled T, enforced T and a "Slow down!" action]

Why Care About Power Phases?
• Useful for:
  - Guiding power budget / temperature limit management
  - Power/Temperature aware scheduling [Bellosa et al. COLP’03]
[Figure: power [W] over time [s]]

Why Care About Power Phases?
• Useful for:
  - Guiding power budget / temperature limit management
  - Power/Temperature aware scheduling
  - Power balancing for multiprocessor systems/activity migration
[Figure: power of Task1 and Task2 on Core/μP 1 and Core/μP 2, with actions "Swap hot task", "Slow down!" and "Speed up!"]

Evaluating Phase Methods for Power
• Several methods, looking at different similarity features (grouped on the slide as "From event monitors" and "From (sampled) control flow"):
  - Specific metrics (IPC, EPI)
  - Hardware performance vectors
  - Branch counts
  - Working sets
  - Basic block vectors
  - Procedures
• All lead to different interesting characterizations
• We specifically look at basic block vectors (BBVs) & performance counter based (PMC) vectors
  - How do these behave in terms of power representation?
  - Is there a dominant method or does a combination work better?

2) Experimental Framework
• Goal: To acquire control flow, performance metric and power behavior of workload execution at matching & controlled observation points on a real system
• Control flow:
  - Sampled PC
  - Basic block vectors (BBVs)
• Performance related events:
  - Performance monitoring counters (PMCs)
  - PMC vectors
• Power:
  - External measurements via current probe/DMM
• Verification

2) Setup
[Diagram: on the experimental machine, the application binary runs under Pin, which provides analysis and instrumentation.]
• Pin actions: instrument basic block heads; sample basic block head addresses; collect PMC event rates; start/stop/reset counters; read/flush power history; detach/attach power input
• Components: performance counter hardware; external power measurement via current probe; OS serial device file

Experimental Setup
[Diagram: on the experimental machine, the application binary runs under Pin.]
• Pin actions: instrument basic block heads; sample basic block head addresses; collect PMC event rates; start/stop/reset counters; read/flush power history; detach/attach power input
• Components: performance counter hardware; external power measurement via current probe; OS serial device file

2) Experiment Setup Details
• PIN kit 1795
• 3-level trace instrumentation
  - ~Every user trace: conditional inlined trace count
  - Every K trace calls: sample EIP
  - Every 5-20M trace calls: generate BBV, collect PMCs and read power history
  - Constraint: instrumentation should not overwhelm power variations!
• BBV generation:
  - Sample BBL heads, hash into 32 dimensions
• PMC reading:
  - Single rotation subset of 15
  - Sample via syscalls at major instrumentation & reset for next
• Power reading:
  - Read from serial device buffer
  - Disable device at major instrumentation & exhaust buffer
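The BBV generation step (sample basic block heads, hash into 32 dimensions) can be sketched as below. The slide does not name the hash function or say whether the vector is normalized, so the modulo hash, the normalization to relative frequencies and the function name are all assumptions; only the 32 dimensions and the two example addresses come from the slides.

```python
from typing import Iterable, List

NUM_DIMS = 32  # dimensionality stated on the slide


def bbv_from_samples(block_heads: Iterable[int],
                     num_dims: int = NUM_DIMS) -> List[float]:
    """Build a hashed basic block vector from sampled block head addresses.

    Assumptions: a plain modulo hash and normalization to relative
    frequencies. The hash used in the actual Pintool is not given.
    """
    counts = [0] * num_dims
    for addr in block_heads:
        counts[addr % num_dims] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * num_dims
    return [c / total for c in counts]


# Two addresses that appear on the sample slide, one of them visited twice.
sampled_heads = [0x804878D, 0x804879C, 0x804878D]
bbv = bbv_from_samples(sampled_heads)
print([(i, round(v, 3)) for i, v in enumerate(bbv) if v > 0])
```

Hashing keeps the vector length fixed regardless of how many distinct basic blocks a program has, at the cost of occasional collisions between unrelated blocks.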

3) Methodology Illustration
[Figure, shown as a six-step build: an execution timeline with visited basic blocks (addresses such as 0x804878d and 0x804879c), PMCs and power history, processed by 1st, 2nd and 3rd level analysis.]
• Step 1: Instr-n count = 5
• Step 2: Instr-n count = 11; power history: 36.5W
• Step 3: Instr-n count = 18; power history: 36.5W
• Step 4: Instr-n count ~ 1M; 2nd level analysis (H32); power history includes 37.2W and 36.5W
• Step 5: Instr-n count ~ 1M; power history includes 37.2W and 36.5W
• Step 6: Instr-n count ~ 100M; power history: 35.9W, 36.9W, 37.2W, 37.5W, 36.5W; 3rd level analysis produces 1 sample: BBV, PMC vector and 1 power number (37W)

What is the similarity of a vector?
• We use L1 (Manhattan) distance
• Simple example:
  - Consider 4 vectors, each with 4 dimensions
  - Log all distances in the similarity matrix
  - For 2 target phases: {0,1,2} & {3}
[Figure: the four example vectors and their similarity matrix]
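The similarity matrix idea can be illustrated with a small sketch. The slide's own example vectors did not survive in the transcript, so the four vectors below are made-up values chosen so that vectors 0, 1 and 2 are mutually close and vector 3 is far away, matching the grouping {0,1,2} & {3}. The function names are illustrative.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def similarity_matrix(vectors: List[Sequence[float]]) -> List[List[float]]:
    """Pairwise L1 distances; entry [r][c] compares samples r and c."""
    return [[l1_distance(u, v) for v in vectors] for u in vectors]


# Hypothetical vectors, not the ones from the slide.
vectors = [
    [4, 0, 0, 0],
    [3, 1, 0, 0],
    [3, 0, 1, 0],
    [0, 0, 0, 4],
]
matrix = similarity_matrix(vectors)
for row in matrix:
    print(row)
```

With these values every distance among vectors 0 to 2 is 2, while every distance to vector 3 is 8, so a split into two phases separates vector 3 from the rest.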

Error Computation
• For 5 target phases:
• Representative power:
  - Mean of power values for each phase
• Error for a benchmark:
  - RMS error over all samples after phase classification, with respect to the representative power of that phase
• Single-number measure of how much power variation remains within each phase, averaged non-uniformly over all phases
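The error metric described above can be written down directly. This is a minimal sketch, assuming the error is reported in absolute watts without further normalization; the function name and the sample power values are made up for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List


def phase_rms_error(power: List[float], labels: List[int]) -> float:
    """RMS deviation of each sample from its phase's mean power."""
    by_phase: Dict[int, List[float]] = defaultdict(list)
    for p, phase in zip(power, labels):
        by_phase[phase].append(p)
    # Representative power: mean of the power values in each phase.
    representative = {ph: sum(v) / len(v) for ph, v in by_phase.items()}
    squared = [(p - representative[ph]) ** 2 for p, ph in zip(power, labels)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical power samples in watts.
power = [36.0, 38.0, 40.0, 40.0]
err_two_phases = phase_rms_error(power, [0, 0, 1, 1])
err_one_phase = phase_rms_error(power, [0, 0, 0, 0])
print(err_two_phases, err_one_phase)
```

Because the sum runs over samples rather than phases, phases with more samples weigh more, which is one way to read "averaged non-uniformly over all phases".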

Tracked Events

Results
• Consistent results regardless of clustering method
• SPECfp < SPECint < Others, following from variability of power and memory behavior
• BBVs and PMCs both improve on the upper bound, but a significant gap to the lower bound remains
• PMCs generally lead to lower errors than BBVs
[Figure panels: First pivot; Agglomerative (compl.)]

Sources of Contradiction between Control Flow and Performance Metrics
• Dynamic change in data locality
  - Sort a 10K / 10M entry (presorted/random)
  - More prominent with TP/DB applications
• Effectively same execution
  - Scientific/computational programs
  - Compute similarity between PMC / PC vector samples
• Operand dependent behavior
  - Overflow/Operand width dependent execution
  - More to be observed with power-aware architectural choices, e.g. Pentium M execution width scaling
[Wu et al. Micro’05] [Gochman et al. ITJ Q2’03]

The Reasons of Contradiction
• No global observation explains all cases
• Memory subsystem
• Memory boundness
• Dynamically varying data locality
• Number of traversed basic blocks
• Effectively same execution
• Operand dependent behavior

Effectively Same Execution
• Mesh: Various computationally similar tasks
• Lead to many control-flow phases, not binding to application behavior
[Figure: repeating regions M1 M2 M3 M1 M2 M3 M1 M2 M3]

Motivating Ex: Dcache Microkernel
• Specify L1 hit rate, generate desired hits via random linked list traversal
[Figure: linked list nodes A, C, M, P, Z laid out against L1 size, L2 size and Mem size]

Dcache: Control flow
• Sample PC every 1M instructions ± 100, map to basic blocks
[Figure panels: L1 Intensive; L2 Intensive; Mem Intensive]

Dcache: Performance counters and power
• Sample PMCs every 100M instructions, collect power from current probe
[Figure panels: L1 Intensive; L2 Intensive; Mem Intensive]

Operand Dependent Behavior
• Stream: 4 repetitive operations
• Reaches OVF after 261 iterations
[Figure: BBV sequences]

Operand Dependent Behavior
• Drastic change in power behavior
• Not seen in control flow, but is followed by PMC vectors
• Timeline shows the actual impact

Resulting Phases
• BBVs distinguish all different regions of operation
• However, the distance between the M phases is still larger than the distance between H, L and M3, even for N=3
• Too much granularity conceals the available information
[Figure: regions M1 M2 M3 M1 M2 M3 M1 M2 M3]
• Desired phases: H, L, M
• 5 phases:
  - PMC: BDCEDEDED
  - BBV: BDACBDCBDCE
• 3 phases:
  - PMC: BA C
  - BBV: BAACBACBACB


BBV+PMC Hierarchy, an Example
[Figure: phase sequence BACDBCDBACDBCD]

BBV+PMC Hierarchy, an Example
• Knowledge of BBV repetition information helps detect/predict actual recurrent behavior
[Figure: phase sequence BACDBCDBACDBCD; PMCs: BAAC, BAAB, with "?" marking the value to be predicted]

Flagging ‘Problem’ Phases
- Phases with the same control flow but varying behavior can be identified, to avoid false predictions based on their recurrence
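One possible way to flag such phases is sketched below: a control-flow (BBV) phase is marked as a "problem" phase when its samples fall into more than one event-counter (PMC) phase. The function name, the flagging rule and the label sequences are illustrative assumptions, not the method used in the study.

```python
from collections import defaultdict


def flag_problem_phases(bbv_labels, pmc_labels):
    """Return BBV phases whose samples fall into more than one PMC phase."""
    seen = defaultdict(set)
    for bbv, pmc in zip(bbv_labels, pmc_labels):
        seen[bbv].add(pmc)
    return sorted(b for b, pmcs in seen.items() if len(pmcs) > 1)


# Hypothetical per-sample phase labels from the two classifications.
bbv_labels = list("BACDBCDB")
pmc_labels = list("BAACBAAB")
problem_phases = flag_problem_phases(bbv_labels, pmc_labels)
print(problem_phases)
```

A predictor could then treat recurrences of a flagged BBV phase with less confidence than recurrences of phases whose event-counter behavior is stable.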

4) Gcc Native/Pin Powers
- (a) Native execution power behavior without instrumentation
- (b) Flattened power behavior with Pin basic block instrumentation
- (c) Improved external power behavior with Pin trace instrumentation and conditional inlining
- (d) Power behavior assigned to application execution by Pintool
- [Figure labels: instrumentation dominant; execution dominant]

4) Power Results with Pin
- Do we still have the hook on power variability?
- [Figure: native power vs. power from Pin for Mcf, Vortex and Gzip]

5) Sampling Effect on BBVs
- Is sampling good enough? Are the sampled BBVs meaningful?
- [Figure: full-blown BBV similarity matrices vs. our sampled and hashed BBV similarity matrices]

5) Similarity Matrix Example
- Consider 4 vectors, each with 4 dimensions
- Log all distances in the similarity matrix
- Color-scale from black to white (only above the diagonal)

5) Interpreting Similarity Matrix Plot
- The level of darkness at any location (r,c) shows the amount of similarity between vectors (samples) r and c, e.g. 0 and 2
- All samples are perfectly similar to themselves: all (r,r) are black
- Vertically above the diagonal: similarity of the sample at the diagonal to previous samples, e.g. 1 vs. 0
- Horizontally right of the diagonal: similarity of the sample at the diagonal to future samples, e.g. 1 vs. 2, 3
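As a minimal sketch of the similarity matrix described above, the following Python computes pairwise L1 (Manhattan) distances for four 4-dimensional vectors and fills only the diagonal and the entries above it. The vector values and function names are illustrative assumptions, not data from the talk.

```python
def l1_distance(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def similarity_matrix(vectors):
    """Upper-triangular matrix of pairwise L1 distances.

    Entry [r][c] with c >= r holds the distance between samples r and c;
    entries below the diagonal are left as None (not plotted).
    """
    n = len(vectors)
    matrix = [[None] * n for _ in range(n)]
    for r in range(n):
        for c in range(r, n):
            matrix[r][c] = l1_distance(vectors[r], vectors[c])
    return matrix


# Hypothetical sample vectors (4 samples, 4 dimensions each).
samples = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
]
sim = similarity_matrix(samples)
for row in sim:
    print(row)
```

In this made-up data, samples 0 and 2 are identical, so their distance is 0 and the color scale described above would draw that cell as black, like the diagonal.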

5) Full vs. Sampled BBV Phases
- Comparison with Gzip
- [Figure: sampled BBV phases vs. full-blown BBV phases]

10) Why Need Power Phases? (Power ≡? IPC)
- How different is characterization for IPC from characterization for power?
- Bzip2-graphic: very strong relation between power and IPC domains
- Swim: still a strong relation, but the power and IPC domains differ across benchmarks (a different relation)
- Mcf: similar power range to bzip2, achieved with much lower IPC domains
- Ghostscript: similar IPC can lead to drastically different power ranges within a benchmark, at the lower end
- Lame: similar observation at the higher end, with distinct IPC domains overlapping in power
- Art: the actual spread due to measurement is much smaller!

Ditched Slides

Phase Analysis & Real Systems
- Phases: self-similar, mostly recurrent execution regions
- Useful for characterization and dynamic-adaptive management
  - SimPoints [Sherwood et al., ASPLOS’02]
  - Multiconfigurable HW [Dhodapkar and Smith, ISCA’02]
- Real systems impose additional constraints
  - Larger granularities, O(ms)
  - Applicability to large-scale management methods: dynamic voltage/frequency scaling, thermal management
- Identifying recurrence under inexact replication of repetitive behavior!

Phase Analysis & Real Systems
- Phases: self-similar, mostly recurrent execution regions
- Real-system experiments
  - Long execution timescale observations
  - Incorporating system effects / verification with real measurements
- Useful for characterization and dynamic-adaptive management
- [Figure: execution regions E1 to E5]

What are Program Phases?
- Distinct and often-recurring regions of program behavior

Why Care About Phases?
- Characterizing execution regions
- Managing dynamic adaptation
- Use current phase/behavior to predict future behavior
- [Figure: execution regions E1 to E5]

Power Phases
- Recurring intervals of distinct power behavior
- Example: Vortex [Figure; x-axis: billions of instructions]

Evaluating Phase Methods for Power
- Similarity based on: metrics (IPC, EPI, etc.), hardware performance vectors, BBVs, working sets, procedures, branches
- Sampling quanta: instruction/code/time/energy intervals
- All lead to different interesting characterizations
- We specifically look at basic block vectors (BBVs), from (sampled) control flow, and performance-counter-based (PMC) vectors, from performance monitoring counters
- How do these behave in terms of power representation?
- Is there a dominant method, or does a combination work better?

Outline
- Background and definitions
- Real-system experimentation framework
- How do the control-flow-based and event-counter-based approaches perform in power characterization?
- Reasons why the two approaches can differ

Experimental Framework
- Goal: acquire the control flow, performance metric and power behavior of workload execution at matching and controlled observation points on a real system
- These guide phase classification of control-flow-based and event-based behavior, with validation against power measurements
- Control flow:
  - From sampled PC
  - Construct basic block vectors (BBVs) for each observed sample
- Performance-related events:
  - From performance monitoring counters (PMCs)
  - Construct PMC vectors for each sample
- Power:
  - From external measurements via current probe/DMM
  - Provides the actual power behavior for each observed sample, for verification
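The BBV construction step can be sketched roughly as follows. This assumes that sampled basic block head addresses are hashed into a fixed number of buckets (32 here, following the "BBV 32" notation used elsewhere in the slides) and that each vector is normalized to sum to 1. The modulo hash, the normalization and all names are illustrative assumptions, not the study's actual implementation.

```python
from collections import Counter

BBV_DIMS = 32  # assumed vector length, taken from the "BBV 32" notation


def build_bbv(sampled_addresses, dims=BBV_DIMS):
    """Build a normalized basic block vector from sampled block-head addresses.

    Each address is mapped to a bucket with a simple modulo hash (an
    illustrative choice, not necessarily the hash used in the study).
    """
    counts = Counter((addr >> 2) % dims for addr in sampled_addresses)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * dims
    return [counts.get(i, 0) / total for i in range(dims)]


# Hypothetical addresses observed during one sampling interval.
interval = [0x400100, 0x400100, 0x400104, 0x400200]
bbv = build_bbv(interval)
print(bbv[:4])
```

Normalizing each vector keeps intervals with different numbers of samples comparable under an L1 distance.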

Experimental Setup
- [Diagram components: application binary, application, OS, hardware, Pin, performance counter hardware, external power measurement via current probe, OS serial device file]
- [Diagram actions: instrument basic block heads; sample basic block head addresses; collect PMC event rates; start/stop/reset counters; read/flush power history; detach/attach power input]

Pin Experimental Setup
- [Diagram components: application binary, application, OS, hardware, performance counter hardware, external power measurement via current probe, OS serial device file]
- [Diagram actions: instrument basic block heads; sample basic block head addresses; collect PMC event rates; enable/disable counters; read/flush power history; enable/disable power input]

Phase Classification
- For each benchmark:
  - N sets of {BBV 32, PMC 15, Power}
  - Cluster BBV 32 and PMC 15 into F sets using the L1 distance between vectors, giving control-flow-based and event-counter-based phases
- Clustering methods:
  - First Pivot: simple, online method, modified for a fixed target number
  - Agglomerative: complex, offline method; links pairs of clusters based on a linkage criterion (average linkage or complete linkage)
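The agglomerative option can be illustrated with a deliberately naive sketch: start with one cluster per sample and repeatedly merge the closest pair under the chosen linkage until the target number of clusters remains. The quadratic search, the tie-breaking, the function names and the example vectors are assumptions made for illustration only.

```python
def l1(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def linkage_distance(c1, c2, vectors, method):
    """Distance between two clusters given as lists of sample indices."""
    dists = [l1(vectors[i], vectors[j]) for i in c1 for j in c2]
    if method == "complete":
        return max(dists)  # farthest pair of members
    if method == "average":
        return sum(dists) / len(dists)  # mean over all member pairs
    raise ValueError("unknown linkage: " + method)


def agglomerative(vectors, target, method="average"):
    """Merge the closest pair of clusters until `target` clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > target:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = linkage_distance(clusters[a], clusters[b], vectors, method)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical 2-dimensional sample vectors.
vectors = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1], [0.4, 0.6]]
phases_complete = agglomerative(vectors, target=2, method="complete")
phases_average = agglomerative(vectors, target=2, method="average")
print(phases_complete, phases_average)
```

Because every merge needs all pairwise distances, this style of clustering is offline, in contrast to the single-pass First Pivot method.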

Phase Classification
- For each benchmark: N sets of {BBV 32, PMC 15, Power}
- Cluster BBV 32 and PMC 15 using the L1 distance between vectors, giving control-flow-based and event-counter-based phases
- Clustering methods: First Pivot, Agglomerative [average and complete linkage]
- First Pivot: [Figure]

From Sample Vectors to Phases
- If we knew power precisely, we could divide power into ranges and classify points directly
- But we have vectors as a proxy for power
  - Like vectors => like power
  - Cluster similar vectors together and consider them a power phase
- Here: First Pivot clustering
- The paper also shows agglomerative clustering
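One common reading of first-pivot clustering is sketched below: the first sample becomes a pivot, each later sample joins the first pivot within a distance threshold, and a sample that matches no pivot becomes a new pivot. The threshold search used to reach a fixed target number of clusters, and all names and values, are assumptions; the slides do not spell out how the method was modified.

```python
def l1(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def first_pivot(vectors, threshold):
    """Assign each sample to the first pivot within `threshold` (L1 distance).

    A sample that matches no existing pivot becomes the pivot of a new cluster.
    Returns (labels, pivots), where labels[i] is the cluster index of sample i.
    """
    pivots, labels = [], []
    for v in vectors:
        for k, p in enumerate(pivots):
            if l1(v, p) <= threshold:
                labels.append(k)
                break
        else:
            pivots.append(v)
            labels.append(len(pivots) - 1)
    return labels, pivots


def first_pivot_with_target(vectors, target, step=0.05, max_threshold=2.0):
    """Raise the threshold until at most `target` clusters remain.

    The linear threshold search is a hypothetical way to reach a fixed
    target number of phases; 2.0 is the largest possible L1 distance
    between two vectors that each sum to 1.
    """
    threshold = step
    while threshold <= max_threshold:
        labels, pivots = first_pivot(vectors, threshold)
        if len(pivots) <= target:
            return labels, threshold
        threshold += step
    return first_pivot(vectors, max_threshold)[0], max_threshold


# Hypothetical normalized sample vectors.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.0]]
labels, pivots = first_pivot(vectors, threshold=0.3)
target_labels, used_threshold = first_pivot_with_target(vectors, target=2)
print(labels, target_labels)
```

The single pass over the samples is what makes this style of clustering usable online.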

Evaluation
- Compute the classification error of each method using the sample standard deviation of power
- Upper and lower bounds:
  - Oracle: classify directly for power
  - Random: assign samples to target clusters randomly
- Benchmarks: 46 benchmark-input pairs from SPEC2K and other document creation, media and scientific applications
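A rough sketch of this evaluation follows. It assumes the error is the size-weighted mean of the per-cluster sample standard deviation of power, and that the oracle splits the measured power range into equal-width bins; the slides state neither detail, so both are assumptions, as are the function names and power values.

```python
import random
import statistics


def classification_error(power, labels):
    """Size-weighted mean of per-cluster sample standard deviation of power.

    The weighting is an assumption; clusters with one sample contribute 0.
    """
    clusters = {}
    for p, k in zip(power, labels):
        clusters.setdefault(k, []).append(p)
    weighted = sum(
        len(vals) * (statistics.stdev(vals) if len(vals) > 1 else 0.0)
        for vals in clusters.values()
    )
    return weighted / len(power)


def oracle_labels(power, target):
    """Classify directly on power by splitting its range into equal bins."""
    lo, hi = min(power), max(power)
    width = (hi - lo) / target or 1.0  # avoid zero width for constant power
    return [min(int((p - lo) / width), target - 1) for p in power]


def random_labels(n, target, seed=0):
    """Assign samples to target clusters at random (seeded for repeatability)."""
    rng = random.Random(seed)
    return [rng.randrange(target) for _ in range(n)]


# Hypothetical power samples (watts) and a phase labelling to evaluate.
power = [20.0, 21.0, 20.5, 40.0, 41.0, 39.5, 30.0, 30.5]
method = [0, 0, 1, 1, 1, 1, 2, 2]

err_method = classification_error(power, method)
err_oracle = classification_error(power, oracle_labels(power, 3))
err_random = classification_error(power, random_labels(len(power), 3))
print(err_method / err_oracle)  # error relative to the oracle bound
```

Reporting a method's error as a multiple of the oracle error, as in the last line, gives a figure that can be compared across benchmarks.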

Evaluation and Results
- Consistent results regardless of clustering method
- SPECfp < SPECint < Others, following from the variability of power and memory behavior
- [Chart annotations: BBVs 12.9%, PMCs 7.3%; BBVs 1.9%, PMCs 1.5%; BBVs 5.6%, PMCs 3.3%]

Errors with respect to Bounds
- Consistent results regardless of clustering method
- SPECfp < SPECint < Others, following from the variability of power and memory behavior
- BBVs and PMCs both improve on the upper bound, but a significant gap to the lower bound remains
- BBVs: 3.3X of Oracle; PMCs: 1.9X of Oracle

Resulting Phases
- BBVs distinguish all the different regions of operation
- However, the distance between the M phases is still larger than the distance between H, L and M3, even for N=3
- Too much granularity conceals the available information
- [Figure: M1 M2 M3 repeated three times]

Effectively Same Execution
- Mesh: various computationally similar tasks
- These lead to many control-flow phases that are not tied to application behavior
- [Figure: M1 M2 M3 repeated three times]

Potentials & Challenges with the Phase Characterization Approaches
- Control flow (BBVs):
  - Repeatability
  - Architecturally independent
  - Runtime applicability [sampling, mapping to BBLs]
  - Managing dimensions
  - False alarms with effectively same execution behavior
  - Misses on varying data locality and operand-dependent behavior
- Event counters (PMCs):
  - Runtime applicability
  - Imperfect repeatability
  - Managing variable event ranges
  - Lack of detail
- Combining the strengths of the two sides?
  - They have mutual information, but direct combination of vectors does not help!
  - Future direction: consider in terms of a hierarchy

Background: Power and Phases
- Runtime processor power monitoring and estimation [Micro’03]
  - Sample PMCs to estimate power for 22 chip components
  - Real measurement feedback for tuning and verification
- Workload power phase behavior with power vectors [WWC’03]
  - Consider power estimations as power vectors
  - Characterize “power phases” based on vector similarity