Detecting Recurrent Phase Behavior under Real-System Variability
Canturk Isci and Margaret Martonosi
Parapet Research Group, Princeton University EE
IEEE International Symposium on Workload Characterization (IISWC ’05), Austin, TX, Oct 06, 2005

Phase Analysis & Real Systems
- Phases: self-similar, mostly recurrent execution regions
- How to identify phase recurrences when real-system effects make them inexact replicas?
- Useful for characterization, dynamic-adaptive management
[Figure labels: E1, E2, E3, E4, E5]

Underlying Research Questions
- What are the types and extent of system-induced variations?
- How do phases manifest themselves with real-system effects?
- Can we extract recurrent behavior in spite of these variations? If so, how?

Background: Power and Phases
- Runtime processor power monitoring and estimation [Micro’03]
  - Sample performance monitoring counters (PMCs) to estimate powers for 22 chip components
  - Real measurement feedback for tuning and verification
- Workload power phase behavior with power vectors [WWC’03]
  - Consider power estimations as power vectors
  - Characterize “power phases” based on vector similarity

Variability in Real-System Runs
- Initial idea: look at the phase distributions of apps and use some signature analysis to detect/predict phases
- HOWEVER: multiple runs inevitably exhibit different behavior
  - Quantities & durations vary
  - Phase distributions vary
- Two kinds of variation: metric variability and time variability

Underlying Research Questions
- What are the types and extent of system-induced variations?
  - Metric variability
  - Time variability
- How do phases manifest themselves with real-system effects?
- Can we extract recurrent behavior in spite of these variations? If so, how?

Real-System Variability Effects on Phases
[Figure: metric vs. time t, with the phase sequence after each real-system effect]
- Ideal: ABCB
- Glitch: ABCBDB
- Gradient: ABCBDEB
- Shift: ABCBDEB
- Mutation: ABCBDEF
- Time Dilation: ABCBDEF

Real-System Variability Effects on Phases
- A direct apples-to-apples comparison of phase signatures is not very relevant in the real world!
[Figure: phase sequences from Ideal (ABCB) through Glitch, Gradient, Shift, Mutation and Time Dilation to FINAL (ABCBDEF)]

How do Phase Distributions Compare?
[Figure: phase sequences for 2 run snippets of gcc]

How do Phase Distributions Compare?
- Mutual histograms for 2 runs of gcc
- Each bin counts how many times run1 was in one phase (e.g. ‘U’) while run2 was in another (e.g. ‘G’)
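
A mutual histogram of this kind can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the name `mutual_histogram` and the phase-label strings are made up, and the sketch assumes both runs are sampled at the same rate and compared sample by sample.

```python
from collections import Counter

def mutual_histogram(run1_phases, run2_phases):
    """Count how often run 1 is in phase p while run 2 is in phase q."""
    # zip() pairs samples by index and stops at the shorter run.
    return Counter(zip(run1_phases, run2_phases))

# Hypothetical phase labels, one character per sample (not measured data).
run1 = "UUGGA"
run2 = "UGGGA"
hist = mutual_histogram(run1, run2)

# Number of samples where run 1 was in 'U' while run 2 was in 'G'.
print(hist[("U", "G")])
```

If the two runs went through identical phase sequences, every count would fall on the diagonal (same phase in both runs); off-diagonal mass is a sign of the variability discussed above.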

Improving Phase Analysis Using Transitions
[Figure: metric vs. time t for the Ideal run (ABCB) and the Final run (ABCBDEF)]

Improving Phase Analysis Using Transitions
- Value-based phase representations do not show good correlation
[Figure: Value Based Phases (VBP) vs. time t for ABCB and ABCBDEF]

Our Proposed Solution with Transitions
- Tracking phase transitions rather than phase sequences is more useful in detecting recurrent behavior
[Figure: Transition Based Phases (TBP) vs. time t for ABCB and ABCBDEF]

Our Transition-Guided Detection Framework
- Benchmark run #1 and benchmark run #2: sample PMCs to form 12D vectors (vector stream #1, vector stream #2)
- Identify transitions (TBP init #1, TBP init #2)
- Apply glitch/gradient filtering (TBP gg #1, TBP gg #2)
- Apply near-neighbor blurring (TBP ggN #1)
- Apply cross-correlation
  - Match ⇒ peak at best alignment
  - Mismatch ⇒ no observable peak
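
The "identify transitions" step can be sketched as thresholding the distance between consecutive sample vectors. The slides do not give the distance metric or the threshold on this slide, so the Manhattan distance, the `threshold` value and the name `identify_transitions` below are assumptions for illustration only; the vectors are 3D rather than 12D to keep the example short.

```python
def identify_transitions(vectors, threshold):
    """Return a binary sequence: 1 where a sample differs enough from
    the previous sample to count as a phase transition, else 0."""
    if not vectors:
        return []
    transitions = [0]  # the first sample has no predecessor
    for prev, cur in zip(vectors, vectors[1:]):
        # Manhattan (L1) distance: an assumed metric, not from the slides.
        dist = sum(abs(a - b) for a, b in zip(prev, cur))
        transitions.append(1 if dist > threshold else 0)
    return transitions

# Hypothetical vector stream (made-up values, not measurements).
stream = [
    (0.10, 0.10, 0.20),
    (0.10, 0.12, 0.20),  # small change: same phase
    (0.50, 0.60, 0.20),  # large change: transition
    (0.50, 0.60, 0.20),
]
tbp_init = identify_transitions(stream, threshold=0.2)
print(tbp_init)
```

The resulting binary sequence plays the role of "TBP init" in the framework above; the later stages operate only on this sequence, not on the metric values.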

Sampling Effects: Glitches & Gradients
- Nothing happens without disturbances: Glitches
  - Glitch: instability where before & after are the same
  - Spurious transitions
- Nothing happens instantaneously: Gradients
  - Gradient: instability where before & after are different
  - A single true transition
- Glitch/Gradient Filtering
  - Very simple: no consecutive transitions
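
One reading of the "no consecutive transitions" rule is sketched below: every run of back-to-back transition samples is collapsed so that only its first sample remains. This is an assumption about the rule, not the authors' implementation; in particular, the slide does not say whether a glitch run is dropped entirely rather than collapsed, and the name `filter_consecutive` is hypothetical.

```python
def filter_consecutive(transitions):
    """Collapse each run of consecutive 1s to a single 1 at its start."""
    filtered = []
    prev = 0
    for t in transitions:
        # Keep a transition only if the previous sample was not one.
        filtered.append(1 if t and not prev else 0)
        prev = t
    return filtered

# Hypothetical raw transition sequence with two multi-sample runs.
tbp_init = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
tbp_gg = filter_consecutive(tbp_init)
print(tbp_gg, sum(tbp_init), "->", sum(tbp_gg))
```

Even this simple rule reduces the transition count whenever an unstable region spans several samples.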

Time Shifts
- Cross-correlation of binary sequences shows the highest matching of signatures at the best alignment
- Ex: matching and mismatch cases, and “The Peak”
  - Matching case: Gcc1-Gcc2. Strong peak indicates good match!
  - Mismatch case: Gcc-Equake. Low peak signifies mismatch!
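
For binary transition sequences, cross-correlation at a given lag reduces to counting the positions where both sequences have a transition. The sketch below is a plain, unnormalized version under that assumption; the names `cross_correlate` and `best_alignment`, the `max_lag` parameter and the example sequences are all hypothetical.

```python
def cross_correlate(a, b, max_lag):
    """Map each lag in [-max_lag, max_lag] to sum_i a[i] * b[i + lag]."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):  # ignore samples shifted out of range
                total += x * b[j]
        scores[lag] = total
    return scores

def best_alignment(a, b, max_lag):
    """Return (lag, score) of the highest cross-correlation peak."""
    scores = cross_correlate(a, b, max_lag)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]

# Hypothetical signatures: run2 is run1 delayed by 2 samples (truncated).
run1 = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1]
run2 = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0]
lag, peak = best_alignment(run1, run2, max_lag=5)
print(lag, peak)
```

A recurring signature produces one lag whose score stands out from the rest; for unrelated signatures the scores stay low at every lag.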

Time Dilations
- Observation: dilations exist as small jitters (a few samples)
- Proposed solution: “Near-Neighbor Blurring”
  - Blur edges slightly
  - Consider transitions as distributions around their actual locations
  - Tolerance: spread of this distribution, [t-x, t+x] samples
- Ex: matching improvement with tolerance=2
[Figure: transitions of run1 and a second run vs. time t; Mismatch! without blurring, Match! with blurring]
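
Near-neighbor blurring can be sketched as spreading every transition at sample t over the window [t-x, t+x]. The slide speaks of a distribution around each transition without giving its shape, so the flat (binary) spread used here is an assumption, as are the name `blur` and the example sequences.

```python
def blur(transitions, tolerance):
    """Mark every sample within `tolerance` of a transition as 1."""
    n = len(transitions)
    blurred = [0] * n
    for t, flag in enumerate(transitions):
        if flag:
            lo = max(0, t - tolerance)
            hi = min(n, t + tolerance + 1)
            for j in range(lo, hi):
                blurred[j] = 1
    return blurred

def overlap(a, b):
    """Count samples where both sequences are 1 (correlation at lag 0)."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical runs whose transitions are jittered by one sample.
run1 = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
run2 = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
print(overlap(run1, run2))           # jitter hides the match
print(overlap(blur(run1, 1), run2))  # blurring one run recovers it
```

Only one of the two sequences is blurred in this sketch. A larger tolerance absorbs more jitter but also makes unrelated transitions more likely to overlap.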

Results
- How do we quantify phase recognition quality?
- Matching Score:
  - Range of values ≥ 0
  - Higher is better

Results
- Detection Results (green: highest match; red: highest mismatch)

Receiver Operating Characteristics
- Best detection scheme (tolerance=1) achieves 100% hit detection with <5% false alarms (using the same threshold for all apps!)
- Very high detect threshold: P{hit} = 0, P{false alarm} = 0
- Detect threshold of 0: P{hit} = 1, P{false alarm} = 1
- Desired operating point: P{hit} ~ 1, P{false alarm} ~ 0
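
The threshold sweep behind an ROC curve can be sketched as follows. The matching scores here are invented for illustration and are not the talk's results; the name `roc_point` and the choice of `>=` for the comparison are also assumptions.

```python
def roc_point(match_scores, mismatch_scores, threshold):
    """Return (P{hit}, P{false alarm}) for one detect threshold."""
    # A pair is "detected" when its matching score reaches the threshold.
    hits = sum(s >= threshold for s in match_scores)
    false_alarms = sum(s >= threshold for s in mismatch_scores)
    return hits / len(match_scores), false_alarms / len(mismatch_scores)

# Made-up scores: same-application pairs vs. different-application pairs.
match_scores = [0.50, 0.43, 0.36]
mismatch_scores = [0.07, 0.14, 0.21, 0.0]

for threshold in (0.0, 0.25, 1e9):
    print(threshold, roc_point(match_scores, mismatch_scores, threshold))
```

With these invented numbers a threshold of 0 detects everything, a very high threshold detects nothing, and a threshold between the two score groups sits at the desired operating point.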

Comparison: TBP Outperform VBP
- In all cases transitions perform better
- In almost all cases near-neighbor blurring improves detection

Conclusions
- Detecting phase behavior on real systems has interesting challenges resulting from system-induced variability
- Phase transition information improves detection capabilities
  - TBP show 6X better detection capabilities than VBP
- Supporting methods, such as Glitch/Gradient Filtering and Near-Neighbor Blurring, improve detectability of transition signatures
  - Near-neighbor blurring with tolerance=1 achieves 100% recurrence detection with <5% false alarms
- Resulting infrastructure can enable a range of phase-oriented system adaptations!

Thanks!

BACKUPS
0.5) How much noise, how much variation?
1) Variation in time sequences of phase distributions for two gcc runs; recurrent phases with ammp
2) Refined transition counts for different thresholds
3) Advantages with Power/PMC Vectors
4) Threshold vs. Hits & Misses with Tolerance=1
5) How about an instruction-based sampling/control-flow-based approach?
6) What’s the source of variability?
7) Glitches/Gradients vs. sampling frequency?
8) Use of this framework?
9) Multithreaded / OLTP-like benchmarks?
10) SMT/CMP/multiprogramming environment?

Noise vs. Variations
[Figure: Measured and Modeled behavior for Gcc, Gzip, Vpr, Vortex, Gap, Crafty]
- Stable apps (Vpr/Crafty) change very little; variable ones change much more

1) Phase Distributions Along Execution Timeline for 2 Runs of Gcc

1) Recurrence Example with Ammp
- Although obvious to the eye, comparing phase sequences directly does not reveal the recurrence clearly!

2) Refined Transitions for Different Thresholds
[Figure panels: Gcc, Equake]

3) Advantages with Power/PMC Vectors
- Direct relation to actual processor power consumption
- Acquired at runtime
- Identify program phases with no programmatic knowledge of the application

4) Threshold vs. Hits & Misses with Tolerance=1
- 100% hits with <5% false alarms for thresholds between 3/14 ≈ 0.21 and 4/14 ≈ 0.29

5) How about instruction-based sampling / control-flow-based approaches?
- We have tried 3 methods:
  - OS/USR counting with PMCs
    - Doesn’t eliminate variability
  - Binding to threads in sampling
    - Didn’t solve variability/registration problems
  - Dynamic instrumentation with Pin
    - Got back to perfect repeatability
    - Lost the actual benchmark execution behavior that flows through the processor
- PC sampling doesn’t solve variability if we simply sample PCs every 1ms or so (application execution time varies)
- Sampling at fixed instruction counts for a specific PID makes it deterministic
  - Has its downsides: uncontrolled timing behavior, and not being able to bind to the flow through the processor

6) What’s the Source of Variability?
We don’t have a perfect, classified answer yet. Maybe Pin/atom can help.
- Different locality at different runs
- Intensity of spontaneous system processes
- Inexact memory access patterns / swaps
- Different cache/TLB/BHT etc. states

7) Glitches/Gradients vs. Sampling Frequency
- Reducing frequency smooths glitches, BUT dithers gradients
  - More sluggish, LPF’ed response
  - Also smooths actual phase changes
- We use 100ms to meet the limitations of the high-frequency corner:
  - No observable perturbation to actual execution
  - Limited by RS232 speed
  - Close lower bound to acquire 3-4 DMM samples

8) What’s the Use of This?
- First, this is a GENERIC system for recurrence detection under variability!!
- Can be used to detect/predict phases with specific features:
  - Memory boundness
  - Hotspots
- Can be stretched to security/reliability:
  - Matching signatures with PIDs
- Specific promising avenues:
  - CMP workload balancing by signature power
  - Activity migration in the case of hotspot signatures
  - **DVFS at experienced signatures**
    - Need help from BBVs when phase behavior changes with the actions taken!!

9) Multithreaded/OLTP-Like Benchmarks?
- No fundamental analysis problem, as we don’t try to bind to processes
- Some of the experimented ones: Mozilla, Xmms, Mplayer
  - FLAT power behavior
  - Not interesting
- Need more infrastructure work to get OLTP-like applications running on our platform
  - Interesting follow-on to see variability of these apps

10) SMT/CMP/Multiprogramming Environments
- Don’t have the SMT/CMP platforms hooked up to the multimeter (yet)
- SMT should be similar, as long as the multi-app behavior is somewhat repeatable
- CMP less clear: one PMC set & power measurement per core? Overall per chip?
- We have tried multiprogramming on our P4:
  - Memory-intensive apps create too much swapping/thrashing for the behavior to be somewhat repeatable
  - Not useful for phase detection
  - How deterministic is task switching?

OLD/EXTRA Slides

Phase Analysis & Real Systems
- Phases: self-similar, mostly recurrent execution regions
- Useful for characterization, dynamic-adaptive management
  - SimPoints [Sherwood et al., ASPLOS’02]
  - Multiconfigurable HW [Dhodapkar and Smith, ISCA’02]
- Real systems impose additional constraints
  - Larger granularities, O(ms)
  - Applicability to large-scale management methods
    - Dynamic voltage/frequency scaling
    - Thermal management
- Identifying recurrence under inexact replication of repetitive behavior!

Leverages from Previous Work
- Runtime processor power monitoring and estimation [Micro’03]
  - Sample PMCs to estimate powers for 22 chip components
  - Real measurement feedback for tuning and verification
- Workload power phase behavior with power vectors [WWC’03]
  - Consider power estimations as power vectors
  - Characterize “power phases” based on vector similarity
  - Evaluate against real measurements
- Improvements in this work
  - Reduce dimensions for better discrimination
  - Track phase transitions with vector distances

Real-System Variability Effects on Phases
- A direct apples-to-apples comparison of phase signatures is not very relevant in the real world!

Fundamental Challenge: How do we still extract recurrent behavior information?
- Speech/humming recognition:
  - Stored libraries, signal stats
  - Pitch tracking
- Image/biomedical:
  - Image warping
  - Registration/mutual information
- Architects:
  - Simple to apply online
  - Implementable w/o massive state & combinationals

Possible Solution with Transitions
- Trying to detect the application from its behavior
  - Upper case: Hit!
  - Lower case: False alarm?
- Tracking phase transitions rather than phase sequences proves to be more useful in detecting recurrent behavior*
[Figure panels: Gcc1-Gcc2, Gcc-Equake]

Sampling Effects: Glitches & Gradients
- Nothing happens without disturbances: Glitches
  - Glitch: instability where before & after are the same
  - Spurious transitions
- Nothing happens instantaneously: Gradients
  - Gradient: instability where before & after are different
  - A single true transition

Glitch/Gradient Filtering
- Very simple: no consecutive transitions
- Leads to large reductions in transition count
- We call these “Refined Transitions” (TBP gg)
[Figure: Gcc example, transitions identified from PMCs and actual measured power behavior]

Time Shifts
- We have binary information
  - We can do this more cheaply than with shifted correlation coefficients
  - Using cross-correlations shows equally useful results
  - Easily implementable
- Ex: matching and mismatch cases, and “The Peak”
[Figure panels: Gcc1-Gcc2, Gcc-Equake]

Time Dilations
- Observation: dilations exist as small jitters (a few samples)
- Proposed solution: “Near-Neighbor Blurring”
  - Blur edges slightly
  - Consider transitions as distributions around their actual locations
  - Tolerance: spread of this distribution, [t-x, t+x] samples
- Ex: matching improvement with tolerance=4
[Figure: run1 vs. run2 shown twice, labeled Mismatch! and Match!]

Results
- How do we quantify phase recognition quality?
- Matching Score:
- Detection Results (green: highest match; red: highest mismatch)

Comparison of Methods
- Comparing 3 cases: Original (Value Based) Phases vs. Refined Transitions vs. Near-Neighbor Blurred Transitions
- In all cases transitions perform better
- In almost all cases near-neighbor blurring improves detection

Conclusions
- Phase-recurrent behavior detection on real systems has interesting problems resulting from system-induced variability
- Looking at phase transition information improves detection capabilities
  - TBP show 6X better detection capabilities than VBP
- Supporting methods, such as Glitch/Gradient Filtering and Near-Neighbor Blurring, improve detectability of transition signatures
  - Near-neighbor blurring with tolerance=1 achieves 100% detection with <5% false alarms