Published by Wilfrid King. Modified over 9 years ago.
1
Phase Analysis on Real Systems
Canturk Isci, Margaret Martonosi
Parapet Research Group, Princeton University
EE Vice-Versa Talk #2, Apr 29, 2005
2
Phase Analysis – Challenges on Real-Systems Canturk Isci - Margaret Martonosi
Previously…
Runtime processor power monitoring and estimation
Power phase behavior of programs (power vectors)
[Figure: power client and power server setup; measured vs. modeled power for gcc, gzip, vpr, vortex, gap and crafty]
4
Today!
Phase detection on real systems: variability effects and potentials for repeatability
Virtual memory behavior, tuning: initial results
What’s going on? BBVs, PMCs, PVs… and POWER
Simple metric prediction studies: short term vs. long term
[Legend: MAJOR, MINOR, MAYBE]
5
Phase Detection with Power Vectors
The initial idea was to look at the phase distributions of applications and use some signature analysis to detect/predict phases.
HOWEVER: multiple runs inevitably exhibit different real-system behavior:
The quantities and durations vary
The phase distributions vary
[Figure annotations: Metric Var, Time Var]
6
Variability Effects in Real System Behavior
A direct apples-to-apples comparison of phase signatures is not very relevant in the real world!
7
Ammp and Apples
Although obvious to the eye, comparing phase sequences directly does not reveal the recurrence clearly!
8
How do Phase Distributions Compare?
Ex: 2 runs of gcc
10
We Got Ourselves a Problem:
How do we extract this recurrent behavior information?
Speech/humming recognition: stored libraries, signal statistics, pitch tracking
Image/biomedical: image warping, registration/mutual information
Architects: simple to apply online, implementable without massive state and combinational logic
11
Interesting Observation with Transitions
Trying to detect the application from its behavior
Upper case: Hit! Lower case: False alarm?
Tracking phase transitions rather than phase sequences proves to be more useful in detecting recurrent behavior*
[Figure panels: Gcc1-Gcc2, Gcc-Equake]
12
Our Transition-Guided Detection Framework
Benchmark run #1, benchmark run #2: sample PMCs to form 12D vectors, giving vector stream #1 and vector stream #2
Identify transitions, giving T init #1 and T init #2
Apply glitch/gradient filtering, giving T gg #1 and T gg #2
Apply near-neighbor blurring, giving T ggN #1
Apply cross correlation
Match ⇒ Peak at best alignment
Mismatch ⇒ No observable peak
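The slide does not say how transitions are identified from the 12D vector streams. As a minimal sketch, assuming a transition is flagged whenever the Manhattan distance between consecutive sample vectors exceeds a threshold (the distance metric, the function name `identify_transitions` and the `threshold` parameter are all assumptions for illustration):

```python
def identify_transitions(vectors, threshold):
    """Return a 0/1 list with 1 where a sample differs from its predecessor
    by more than `threshold` (Manhattan distance, an assumed metric)."""
    flags = [0]  # the first sample has no predecessor to compare against
    for prev, cur in zip(vectors, vectors[1:]):
        distance = sum(abs(a - b) for a, b in zip(prev, cur))
        flags.append(1 if distance > threshold else 0)
    return flags


# Hypothetical stream of four 12D samples with one abrupt change.
stream = [[0.0] * 12, [0.0] * 12, [1.0] * 12, [1.0] * 12]
print(identify_transitions(stream, threshold=5.0))
```

The resulting binary sequence plays the role of "T init" in the flow above.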
13
Sampling Effects: Glitches & Gradients
Nothing happens without disturbances: glitches
Glitch: instability where the behavior before and after is the same (spurious transitions)
Nothing happens instantaneously: gradients
Gradient: instability where the behavior before and after is different (a single true transition)
14
Glitch/Gradient Filtering
Very simple: no consecutive transitions
Leads to large reductions in transition count
We call these “Refined Transitions” (T gg)
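One possible reading of the "no consecutive transitions" rule, sketched under the assumption that each run of back-to-back transition samples is collapsed to its first sample (the slide does not state which sample of a run is kept, and `refine_transitions` is a hypothetical name):

```python
def refine_transitions(flags):
    """Collapse every run of consecutive 1s to a single 1 at the run start."""
    refined = []
    previous = 0
    for flag in flags:
        # Keep a transition only if the previous sample was not one as well.
        refined.append(1 if flag and not previous else 0)
        previous = flag
    return refined


t_init = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1]
print(refine_transitions(t_init))
```

Under this reading a multi-sample gradient counts as one transition, which is where the reduction in transition count comes from.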
15
(Also Helps with Threshold Choices)
16
Time Shifts
We have binary information, so we can do something cheaper than shifted correlation coefficients
Cross-correlations show equally useful results and are easily implementable
Ex: matching and mismatch cases, and “The Peak”
[Figure panels: Gcc1-Gcc2, Gcc-Equake]
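A minimal sketch of cross-correlating two transition sequences over a range of time shifts and locating the peak. The function names, the `max_lag` parameter and the sample sequences are assumptions for illustration, not taken from the talk:

```python
def cross_correlation(a, b, max_lag):
    """Return {lag: sum over i of a[i] * b[i + lag]} for lags in [-max_lag, max_lag]."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for i, value in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                total += value * b[j]
        scores[lag] = total
    return scores


def best_alignment(a, b, max_lag):
    """Return (lag, score) for the highest cross-correlation value."""
    scores = cross_correlation(a, b, max_lag)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


run1 = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
run2 = [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1]  # run1 delayed by two samples
print(best_alignment(run1, run2, max_lag=5))
```

For binary inputs each product is just a logical AND, so the sum reduces to counting coinciding transitions at each shift.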
17
Time Dilations
Observation: dilations exist as small jitters (a few samples)
Proposed solution: “Near-Neighbor Blurring”
Blur edges slightly
Consider transitions as distributions around their actual locations
Tolerance: spread of this distribution, [t-x, t+x] samples
Ex: matching improvement with tolerance=4:
run1: 00100000010010000000000
run2: 01000000010000100001000
Mismatch!
[Figure: run1 after blurring, with fractional weights such as .6, .8, 1, .6, .4, .2 spread around each transition, compared against run2 01000000010000100001000: Match!]
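A minimal sketch of near-neighbor blurring on the two example sequences. It assumes a triangular weighting, 1 - d/(tolerance+1) at distance d from a transition, which is consistent with the fractional values visible on the slide (.8, .6, .4, .2 for tolerance=4) but is not confirmed by it. Taking the maximum where neighborhoods overlap is also an assumption:

```python
def blur_transitions(flags, tolerance):
    """Spread each transition over [t - tolerance, t + tolerance] with
    linearly decaying weights (assumed triangular kernel)."""
    n = len(flags)
    blurred = [0.0] * n
    for t, flag in enumerate(flags):
        if not flag:
            continue
        for offset in range(-tolerance, tolerance + 1):
            i = t + offset
            if 0 <= i < n:
                weight = 1.0 - abs(offset) / (tolerance + 1)
                blurred[i] = max(blurred[i], weight)  # assumed overlap rule
    return blurred


def overlap(weights, flags):
    """Zero-lag matching value between a weighted and a binary sequence."""
    return sum(w * f for w, f in zip(weights, flags))


run1 = [int(c) for c in "00100000010010000000000"]
run2 = [int(c) for c in "01000000010000100001000"]
print(overlap(run1, run2))
print(overlap(blur_transitions(run1, tolerance=4), run2))
```

With these inputs the zero-lag overlap rises from 1 for the raw sequences to about 2.4 after blurring, because the jittered transitions in run2 now fall inside the blurred neighborhoods of run1.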
19
Results
How do we quantify the strength of the peak?
Matching score: [equation not preserved in this transcript]
Detection results: [table not preserved in this transcript] (green: highest match; red: highest mismatch)
20
Receiver Operating Characteristics
Our best detection scheme (tolerance=1) achieves 100% hit detection with <5% false alarms. (For a uniform threshold!)
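A minimal sketch of how hit and false-alarm rates can be computed for one uniform threshold, and swept to trace an ROC curve. The matching scores below are made-up placeholders, not results from the talk, and the function names are hypothetical:

```python
def roc_point(match_scores, mismatch_scores, threshold):
    """Return (hit rate, false-alarm rate) when every pair scoring at or
    above `threshold` is declared a match."""
    hits = sum(1 for s in match_scores if s >= threshold)
    false_alarms = sum(1 for s in mismatch_scores if s >= threshold)
    return hits / len(match_scores), false_alarms / len(mismatch_scores)


def roc_curve(match_scores, mismatch_scores):
    """Sweep the threshold over all observed scores, highest first."""
    thresholds = sorted(set(match_scores) | set(mismatch_scores), reverse=True)
    return [(t,) + roc_point(match_scores, mismatch_scores, t) for t in thresholds]


same_app = [0.9, 0.8, 0.7]          # hypothetical scores, same benchmark
other_app = [0.75, 0.3, 0.2, 0.1]   # hypothetical scores, different benchmarks
for threshold, hit_rate, false_alarm_rate in roc_curve(same_app, other_app):
    print(threshold, hit_rate, false_alarm_rate)
```

"Uniform threshold" here means the same cut-off is applied to every benchmark pair rather than tuning it per application.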
21
Comparison of Methods
Comparing 3 cases: original (value-based) phases vs. refined transitions vs. near-neighbor blurred transitions
In all cases transitions perform better
In almost all cases near-neighbor blurring improves detection
22
Conclusions
Detecting phase-recurrent behavior on real systems poses interesting problems that result from system-induced variability
Looking at phase transition information, in part, improves detection capabilities
Supporting methods such as glitch/gradient filtering and near-neighbor blurring improve the detectability of transition signatures
23
Today!
Phase detection on real systems: variability effects and potentials for repeatability
Virtual memory behavior, tuning: initial results
What’s going on? BBVs, PMCs, PVs… and POWER
Simple metric prediction studies: short term vs. long term
24
Workload Phases ⇒ Memory Behavior?
A few of the inspirations:
Redhat Magazine Issue #1 [Dec 2004]
Dynamically Tracking Page Miss Ratio Curve [ASPLOS 2005]
Gokul Kandiraju [PhD Thesis 2004]
Can we track phase behavior from PMCs and VM-related statistics to dynamically manage memory behavior?
Indicator ⇒ Action ⇒ Effect:
Less page locality ⇒ fetch fewer contiguous pages at once
Recurring references with high reuse distance ⇒ launder less aggressively
Targets: execution time and energy
James Donald -
25
Platform
P4, no SMT, 256 MB memory, Linux 2.4.7-10
SPEC2K is designed to fit in 256 MB
Choose high-memory benchmarks + multiprogramming
Multiprogramming combinations of these lead to lots of thrashing
26
Effect of Thrashing with Multiprogramming
In most cases, thrashing leads to a 5-10% power/performance penalty
Applu+Apsi! 6X time, 2.5X energy
Non-thrashing combinations achieve a 5-10% improvement
27
Action ⇒ Effect
Non-intrusive tuning possibilities:
Kswapd:tries_base: max # of pages the swapout daemon tries to free at once
Kswapd:swap_cluster: # of pages the swapout daemon writes at once
Page-cluster: log2(# of contiguous pages) the kernel reads at once at a page fault
Intrusive tuning possibilities:
Page scanning period (overhead if tasks fit in memory)
Page age counters (reuse vs. pollution)
Inactive-clean percentage (balance I/O and memory demand)
Task memory allocation (workload-dependent memory demand)
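Because page-cluster is a base-2 logarithm, each increment doubles the number of pages read per fault. A small sketch of that arithmetic follows; the helper names are hypothetical, and the reader assumes a Linux system that exposes the setting at /proc/sys/vm/page-cluster, returning None anywhere else:

```python
from pathlib import Path


def pages_read_per_fault(page_cluster):
    """page-cluster stores log2 of the page count, so the count is 2**page_cluster."""
    return 1 << page_cluster


def read_page_cluster(path="/proc/sys/vm/page-cluster"):
    """Return the current setting, or None where the file is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


for setting in range(5):
    print(setting, pages_read_per_fault(setting))

current = read_page_cluster()
if current is not None:
    print("current page-cluster:", current, "pages:", pages_read_per_fault(current))
```

A setting of 0 therefore reads a single page per fault, which corresponds to the "fetch fewer contiguous pages" action for workloads with little page locality.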
28
Non-intrusive Results
Gzip: gzip + gzip + gzip
Gap: gap + gzip
Bzip2: bzip2 + bzip2
Tries_base and swap_cluster have no visible effect
Page-cluster shows ~7% improvement with respect to the default
29
Conclusions and Todos
Multiprogramming involving thrashing has a lot of potential for performance/power improvement
The cases we experimented with don’t show promising actions
Intrusive actions may be more useful, leading to effective actions as well as better (per-task) tracking
NEXT STEPS:
Looking into mm for potential dynamic tunings
Defining indicators that track relevant behavior: page miss ratio / swap rates / bus utilization
Q: Is there any potential?
30
Tomorrow!
Phase detection on real systems: variability effects and potentials for repeatability
Virtual memory behavior, tuning: initial results
What’s going on? BBVs, PMCs, PVs… and POWER
Simple metric prediction studies: short term vs. long term
31
Comparing Phase Methods for Power
All lead to different interesting characterizations
How do these compare in terms of power representation?
Is there a dominant method, or does a (hierarchical) combination work better?
We specifically look at BBVs (from sampled PC traces) and PMC power vectors (from performance monitoring counters)
Similarity based on: metrics (IPC, EPI, etc.), hardware performance vectors, BBVs, working sets, procedures, branches
Sampling quanta: code/time/energy intervals
32
Different Phases
Ex: Dcache microkernel
Specify an L1 hit rate, then generate approximately the desired hits via random linked-list traversal
[Figure: linked-list nodes A, C, M, P, Z shown against the cache size]
33
Dcache Performance Traces
Each hit rate range is obvious
Trends are NOT identical across metrics: linear L1 misses vs. nonlinear IPC
FOR A SINGLE METRIC: how you capture phases depends on the metric and the chosen threshold
34
Dcache PC Traces
No visible phases from PC samples
Address space sampling alone is NOT sufficient!!
35
Experiment Setup
PIN kit 1795
3-level trace instrumentation:
~Every user trace: conditional inlined trace count
Every 50-200K trace calls: sample EIP
Every 5-20M trace calls: generate BBV, collect PMCs and read power history
Constraint: instrumentation should not overwhelm power variations!!
BBV generation: sample BBL heads, hash into 32 dimensions (based on Jenkins)
PMC reading: single rotation subset; sample via ‘popen’s due to platform conflicts
Power reading: read from the serial device buffer; no polling possible, so disable the device at major instrumentation and exhaust the buffer
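A minimal sketch of hashing sampled basic-block head addresses into a 32-dimensional BBV. The slide only says the hash is "based on Jenkins"; this sketch assumes Bob Jenkins' one-at-a-time hash over the four little-endian bytes of a 32-bit address, and the normalization step and the function name `sampled_bbv` are likewise assumptions:

```python
def jenkins_one_at_a_time(data: bytes) -> int:
    """Bob Jenkins' one-at-a-time hash, kept to 32 bits."""
    h = 0
    for byte in data:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def sampled_bbv(bbl_heads, dimensions=32):
    """Count sampled block-head addresses per hash bucket and normalize."""
    counts = [0] * dimensions
    for address in bbl_heads:
        key = (address & 0xFFFFFFFF).to_bytes(4, "little")  # assumes 32-bit addresses
        counts[jenkins_one_at_a_time(key) % dimensions] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


samples = [0x08048000, 0x08048000, 0x08048010]  # hypothetical sampled EIPs
print(sampled_bbv(samples))
```

Hashing trades exact per-block counts for a fixed, small vector, so distinct blocks can collide in one dimension.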
36
BBV Results
Is sampling good enough? Are they meaningful?
[Figure panels: B. Calder’s full-blown BBV similarity matrices; our sampled and hashed BBV similarity matrices]
37
Power Results
Do we still have the hook on power variability?
[Figure panels: Native vs. From PIN, shown for two cases]
38
Currently…
Still need to verify benchmarks for power and validity
Constructing power vectors with the reduced set
Applying symmetric phase analyses to BBVs and PMCs
Power representation of phases with respect to measurements
90-10 prediction with regression trees
39
Today!
Phase detection on real systems: variability effects and potentials for repeatability
Virtual memory behavior, tuning: initial results
What’s going on? BBVs, PMCs, PVs… and POWER
Simple metric prediction studies: short term vs. long term
40
Metric (IPC) Value Prediction
No big challenge to get good results, but improving at the edges is interesting
Statistical predictor: transition-guided, history-based (EWMA) IPC prediction
Instead of a fixed history window, use the stable regions between transitions as your history in a circular buffer
Transitions are based on a threshold
Threshold = 0 ⇒ “Last Value Predictor”
Our experience: variabilities are bursty transitions
There are stable regions with probable gradients between transitions
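A minimal sketch of a transition-guided EWMA predictor along these lines. The class name, the relative-change transition test, the `alpha` weight and the buffer capacity are assumptions for illustration. The history is emptied at every transition, so only the current stable region feeds the average:

```python
from collections import deque


class TransitionGuidedPredictor:
    def __init__(self, threshold, alpha=0.5, capacity=64):
        self.threshold = threshold             # relative change counted as a transition
        self.alpha = alpha                     # weight of newer samples in the EWMA
        self.history = deque(maxlen=capacity)  # circular buffer of the stable region

    def observe(self, ipc):
        """Record a sample, restarting the history if it is a transition."""
        if self.history:
            last = self.history[-1]
            if abs(ipc - last) > self.threshold * abs(last):
                self.history.clear()
        self.history.append(ipc)

    def predict(self):
        """EWMA over the current stable region, or None before any sample."""
        if not self.history:
            return None
        estimate = self.history[0]
        for value in list(self.history)[1:]:
            estimate = self.alpha * value + (1 - self.alpha) * estimate
        return estimate


predictor = TransitionGuidedPredictor(threshold=0.10)
for sample in [1.00, 1.04, 1.02, 2.00, 2.05]:
    predictor.observe(sample)
    print(sample, predictor.predict())
```

With threshold=0 every change restarts the history, so the prediction collapses to the last observed value, matching the "Last Value Predictor" case on the slide.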
41
Ammp, thr=0% (Last Value)
42
Ammp, thr=10%
43
Using Stability Considerations (8) in IPC Predictions
44
Predicting Durations
X=f(x) approach: F(x) = x, x/2, x/8, …
Initial stability requirement: 2, 8, …
Table based? The idea was: at each transition, predict once for the duration based on history:
Log(prev_duration) = key values [0,1,2,3,4,5]
History: |5|3|5|3|5| followed by 3; |1|3|5|1|3| followed by 5
- need to filter bursts somehow
- partial matchings??
NOT EXPLORED!!
45
Ammp Duration Prediction
Predict based on F(x)=x/8
Stability criterion = 8 samples
Extend the duration while stability continues
IPC based on last value
Predictions only at checkpoints
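One way to read the F(x)=x/8 scheme, sketched under several assumptions: a run counts as stable while the relative IPC change stays within a threshold, the first prediction is made once the run reaches the stability criterion, each prediction says the run lasts F(x) more samples (at least one), and a new prediction is made only when that checkpoint is reached. The function name and the 10% threshold are hypothetical:

```python
def duration_checkpoints(ipc, threshold=0.10, stability=8,
                         f=lambda x: max(1, x // 8)):
    """Return (sample index, stable run length, predicted extra samples)
    for every checkpoint at which a duration prediction is made."""
    checkpoints = []
    run = 1                  # length of the current stable run, in samples
    next_check = stability   # run length at which the next prediction happens
    for i in range(1, len(ipc)):
        prev = ipc[i - 1]
        if abs(ipc[i] - prev) > threshold * abs(prev):
            run, next_check = 1, stability   # transition: start over
            continue
        run += 1
        if run == next_check:
            predicted = f(run)
            checkpoints.append((i, run, predicted))
            next_check = run + predicted     # extend while stability continues
    return checkpoints


trace = [1.0] * 20 + [2.0] * 5   # hypothetical IPC samples
for checkpoint in duration_checkpoints(trace):
    print(checkpoint)
```

Longer stable runs produce longer predictions, so checkpoints become sparser the longer a phase lasts.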
46
Long Term IPC Prediction with Gradients
Last value is not very useful in the long term
Instead of 0th order, consider a 1st-order prediction:
Needs additional ΔIPC information
Next IPC = Current IPC + ΔIPC
Ex: F(x)=x/8
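A small sketch contrasting 0th-order (last value) and 1st-order prediction. The slide gives only "Next IPC = Current IPC + ΔIPC"; estimating ΔIPC as the difference of the last two samples and scaling it by a `horizon` in samples are assumptions made here for illustration:

```python
def predict_ipc(history, horizon=1, order=1):
    """Predict IPC `horizon` samples ahead from a list of past samples."""
    current = history[-1]
    if order == 0 or len(history) < 2:
        return current                       # last-value prediction
    delta = history[-1] - history[-2]        # assumed estimate of the gradient
    return current + horizon * delta         # linear extrapolation


samples = [1.00, 1.05, 1.10]   # hypothetical IPC samples on a steady gradient
print(predict_ipc(samples, horizon=4, order=0))
print(predict_ipc(samples, horizon=4, order=1))
```

On a steady gradient the last-value predictor falls further behind as the horizon grows, while the 1st-order form follows the trend.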
47
Improvements?
Using prediction probability tables: P{N more | 20 stable @ IPC}
Ex: Vortex
N           P(N|20)
0-9         0.111111111
10-79       0.577777778
79-99       0.022222222
100-1000    0.288888889
1000+       0
Using adaptive functions based on history
Table-based function approaches
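A minimal sketch of how such a table could be built from observed stable-run lengths. The bin edges approximate the ranges on the slide, and the run lengths below are made up; neither the function name nor the data come from the talk:

```python
from bisect import bisect_right


def continuation_table(run_lengths, stable=20, edges=(10, 80, 100, 1000)):
    """Estimate P(N more stable samples | already `stable` samples stable).

    Bins are [0, 10), [10, 80), [80, 100), [100, 1000) and [1000, inf)."""
    extra = [length - stable for length in run_lengths if length >= stable]
    counts = [0] * (len(edges) + 1)
    for n in extra:
        counts[bisect_right(edges, n)] += 1
    total = len(extra)
    return [c / total for c in counts] if total else counts


runs = [5, 25, 40, 60, 130, 2000]   # hypothetical stable-run lengths in samples
print(continuation_table(runs))
```

Runs shorter than 20 samples never satisfy the condition, so they are excluded before the probabilities are normalized.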