
Detecting Recurrent Phase Behavior under Real-System Variability

Canturk Isci and Margaret Martonosi. Phase Analysis & Real Systems. Phases are self-similar, mostly recurrent execution regions, useful for characterization and dynamic-adaptive management.


Presentation Transcript


  1. Detecting Recurrent Phase Behavior under Real-System Variability. Canturk Isci, Margaret Martonosi

  2. Phase Analysis & Real Systems
     • Phases: Self-similar, mostly recurrent execution regions
     • Useful for characterization and dynamic-adaptive management
     • How can we identify phase recurrences when real-system effects make them inexact replicas?

  3. Underlying Research Questions
     • What are the types and extent of system-induced variations?
     • How do phases manifest themselves with real-system effects?
     • Can we extract recurrent behavior in spite of these variations? If so, how?

  4. Background: Power and Phases
     • Runtime processor power monitoring and estimation [Micro’03]
       • Sample performance monitoring counters (PMCs) to estimate power for 22 chip components
       • Real measurement feedback for tuning and verification
     • Workload power phase behavior with power vectors [WWC’03]
       • Treat the power estimates as power vectors
       • Characterize “power phases” based on vector similarity

  5. Variability in Real-System Runs
     • The initial idea was to look at the phase distributions of applications and use some signature analysis to detect/predict phases
     • HOWEVER:
       • Multiple runs inevitably exhibit different behavior
       • Quantities & durations vary
       • Phase distributions vary
     [Figure labels: Metric Variability, Time Variability]

  6. Underlying Research Questions
     • What are the types and extent of system-induced variations?
       • Metric variability
       • Time variability
     • How do phases manifest themselves with real-system effects?
     • Can we extract recurrent behavior in spite of these variations? If so, how?

  7. Real-System Variability Effects on Phases
     [Figure: metric vs. time t for the Ideal signature and its variants: Glitch, Gradient, Shift, Mutation, Time Dilation]

  8. Real-System Variability Effects on Phases
     • A direct apples-to-apples comparison of phase signatures is not very meaningful in the real world!
     [Figure: Ideal signature vs. FINAL signature after Glitch, Gradient, Shift, Mutation and Time Dilation]

  9. Underlying Research Questions
     • What are the types and extent of system-induced variations?
     • How do phases manifest themselves with real-system effects?
     • Can we extract recurrent behavior in spite of these variations? If so, how?

  10. Improving Phase Analysis Using Transitions
      [Figure: metric vs. time t for the Ideal and Final signatures]

  11. Improving Phase Analysis Using Transitions: Value Based Phases (VBP)
      • Value-based phase representations do not show good correlation
      [Figure: phase-value sequences over time t for two runs]

  12. Our Proposed Solution with Transitions: Transition Based Phases (TBP)
      • Tracking phase transitions rather than phase sequences is more useful in detecting recurrent behavior
      [Figure: binary transition sequences over time t for two runs, 1 at each transition and 0 elsewhere]
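To make the contrast between value-based and transition-based phases concrete, here is a minimal Python sketch. The function name `to_transitions` and the phase IDs of both runs are hypothetical, chosen only for illustration; the slides do not give the actual data.

```python
def to_transitions(phase_ids):
    """Return 1 where the phase ID differs from the previous sample, else 0.

    The first sample has no predecessor, so it is marked 0 (an assumption).
    """
    if not phase_ids:
        return []
    return [0] + [int(cur != prev) for prev, cur in zip(phase_ids, phase_ids[1:])]


# Hypothetical phase IDs: same shape in both runs, but different values.
run1_phases = [1, 1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 2]
run2_phases = [4, 4, 4, 5, 5, 5, 6, 6, 6, 5, 5, 5]

# Value-based comparison: the IDs never coincide sample for sample.
value_matches = sum(a == b for a, b in zip(run1_phases, run2_phases))

# Transition-based comparison: both runs change phase at the same samples.
run1_tbp = to_transitions(run1_phases)
run2_tbp = to_transitions(run2_phases)

print(value_matches)         # 0
print(run1_tbp == run2_tbp)  # True
```

In this toy case the value sequences share nothing, while the transition sequences are identical, which is the property the TBP representation relies on.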

  13. Our Transition-Guided Detection Framework (the INTRO)
      • Benchmark run #1, Benchmark run #2
      • Sample PMCs to form 12D vectors ⇒ Vector stream #1, Vector stream #2
      • Identify transitions ⇒ TBPinit #1, TBPinit #2
      • Apply glitch/gradient filtering ⇒ TBPgg #1, TBPgg #2
      • Apply near-neighbor blurring ⇒ TBPggN #1
      • Apply cross correlation
        • Match ⇒ Peak at best alignment
        • Mismatch ⇒ No observable peak
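The "Identify transitions" step could be sketched as follows. The slides do not say how a transition is detected in the vector stream, so the Manhattan distance, the `threshold` parameter and the function names are all assumptions made for illustration. The example uses 3-element vectors for brevity where the framework uses 12D vectors.

```python
def manhattan(u, v):
    """Sum of absolute element-wise differences between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def initial_transitions(vectors, threshold):
    """Mark a transition (1) where consecutive vectors differ by more than threshold.

    Both the distance metric and the threshold are assumptions of this sketch.
    """
    if not vectors:
        return []
    return [0] + [int(manhattan(prev, cur) > threshold)
                  for prev, cur in zip(vectors, vectors[1:])]


# Hypothetical vector stream: small jitter, one large change, small jitter.
stream = [(1.0, 1.0, 1.0), (1.0, 1.1, 1.0), (3.0, 2.0, 1.0), (3.0, 2.0, 1.1)]
print(initial_transitions(stream, threshold=0.5))  # [0, 0, 1, 0]
```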

  14. Sampling Effects: Glitches & Gradients
      • Nothing happens without disturbances → Glitches
        • Glitch: Instability where before & after are the same → Spurious transitions
      • Nothing happens instantaneously → Gradients
        • Gradient: Instability where before & after are different → A single true transition
      • Glitch/Gradient Filtering:
        • Very simple: no consecutive transitions
      GLITCHES:
        Initial transitions: 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0
        Refined transitions: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
      GRADIENTS:
        Initial transitions: 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0
        Refined transitions: 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0
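One way to realize the glitch/gradient rule is sketched below. It assumes that the filter can see the sampled values on both sides of a run of consecutive transitions; the slide states only the rule, so the function name, the `same` comparison and the sample data are hypothetical.

```python
def filter_glitches_gradients(samples, transitions, same=lambda a, b: a == b):
    """Collapse each run of consecutive transitions.

    Glitch (value before the run equals value after it): drop the whole run.
    Gradient (values differ): keep a single transition at the start of the run.
    transitions[i] == 1 is assumed to mean samples[i] differs from samples[i-1].
    """
    n = len(transitions)
    refined = [0] * n
    i = 0
    while i < n:
        if transitions[i] == 0:
            i += 1
            continue
        start = i
        while i < n and transitions[i] == 1:
            i += 1
        end = i - 1                      # last sample that still changed
        before = samples[max(start - 1, 0)]
        after = samples[end]
        if not same(before, after):
            refined[start] = 1           # one true transition
    return refined


# Hypothetical one-sample glitch: the value returns to where it started.
glitch_samples = [5, 5, 5, 9, 5, 5, 5]
glitch_trans = [0, 0, 0, 1, 1, 0, 0]
print(filter_glitches_gradients(glitch_samples, glitch_trans))
# [0, 0, 0, 0, 0, 0, 0]

# Hypothetical gradient: the value ramps from 1 to 4 over several samples.
gradient_samples = [1, 1, 2, 3, 4, 4, 4]
gradient_trans = [0, 0, 1, 1, 1, 0, 0]
print(filter_glitches_gradients(gradient_samples, gradient_trans))
# [0, 0, 1, 0, 0, 0, 0]
```

For vector-valued samples, `same` could be replaced by a distance test against a threshold; that choice is also left as an assumption.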

  15. Time Shifts
      • Cross-correlation of binary sequences shows the highest matching of signatures at the best alignment
      • Ex: Matching and mismatch cases, and “The Peak”
        • Matching case: Gcc1-Gcc2. Strong peak indicates good match!
        • Mismatch case: Gcc-Equake. Low peak signifies mismatch!
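A minimal sketch of cross-correlating two binary transition sequences, in plain Python. The sequences, the `max_lag` parameter and the function names are hypothetical; the real experiments use the Gcc and Equake traces shown on the slide.

```python
def cross_correlation(a, b, max_lag):
    """Return {lag: sum over i of a[i] * b[i + lag]} for lags in [-max_lag, max_lag]."""
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                total += x * b[j]
        scores[lag] = total
    return scores


def best_alignment(a, b, max_lag):
    """Return (lag, score) at the peak of the cross-correlation."""
    scores = cross_correlation(a, b, max_lag)
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Hypothetical signatures: run2 is run1 delayed by two samples.
run1 = [0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0]
run2 = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0]
# A different application with unrelated transition positions.
other = [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(best_alignment(run1, run2, max_lag=5))   # (2, 3): strong peak at lag 2
print(best_alignment(run1, other, max_lag=5))  # peak score is only 1
```

The matching pair aligns all three transitions at a single lag, while the mismatched pair never aligns more than one.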

  16. Time Dilations
      • Observation: Dilations exist as small jitters (a few samples)
      • Proposed solution: “Near-Neighbor Blurring”
        • Blur edges slightly → Consider transitions as distributions around their actual locations
        • Tolerance: Spread of this distribution, [t-x, t+x] samples
      • Ex: Matching improvement with tolerance=2
      [Figure: unblurred transitions of run1 and run2 over time t. Mismatch!]

  17. Time Dilations
      • Observation: Dilations exist as small jitters (a few samples)
      • Proposed solution: “Near-Neighbor Blurring”
        • Blur edges slightly → Consider transitions as distributions around their actual locations
        • Tolerance: Spread of this distribution, [t-x, t+x] samples
      • Ex: Matching improvement with tolerance=2
      [Figure: run1 transitions blurred to values 1, .7 and .3, compared against run2 over time t. Match!]
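Near-neighbor blurring can be sketched as below for tolerance=2. The weights 1, .7 and .3 are read loosely from the values visible on the slide and should be treated as an assumption, as should the function names and the two example runs. Only one of the two runs is blurred, following the framework slide, where only TBPggN #1 is produced.

```python
def blur(transitions, weights=(1.0, 0.7, 0.3)):
    """Spread each transition over its neighbors.

    weights[d] is the value assigned d samples away from a transition,
    so the tolerance is len(weights) - 1 (2 with the default weights).
    Overlapping spreads keep the larger value.
    """
    n = len(transitions)
    tolerance = len(weights) - 1
    blurred = [0.0] * n
    for t, flag in enumerate(transitions):
        if not flag:
            continue
        for d in range(-tolerance, tolerance + 1):
            j = t + d
            if 0 <= j < n:
                blurred[j] = max(blurred[j], weights[abs(d)])
    return blurred


def overlap(a, b):
    """Zero-lag correlation: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical runs: the same two transitions, jittered by 1 and 2 samples.
run1 = [0] * 16
run2 = [0] * 16
run1[3] = run1[10] = 1
run2[4] = run2[12] = 1

print(overlap(run1, run2))        # 0: exact positions never coincide
print(overlap(blur(run1), run2))  # about 1.0: both transitions now contribute
```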

  18. Our Transition-Guided Detection Framework (the SUMMARY)
      • Benchmark run #1, Benchmark run #2
      • Sample PMCs to form 12D vectors ⇒ Vector stream #1, Vector stream #2
      • Identify transitions ⇒ TBPinit #1, TBPinit #2
      • Apply glitch/gradient filtering ⇒ TBPgg #1, TBPgg #2
      • Apply near-neighbor blurring ⇒ TBPggN #1
      • Apply cross correlation
        • Match ⇒ Peak at best alignment
        • Mismatch ⇒ No observable peak

  19. Results
      • How do we quantify phase recognition quality?
      • Matching Score:
        • Range of values ≥ 0
        • Higher is better

  20. Results
      • Detection results (green: highest match; red: highest mismatch)

  21. Receiver Operating Characteristics
      • Best detection scheme (tolerance=1) achieves 100% hit detection with <5% false alarms
        • (Using the same threshold for all apps!)
      [Figure: ROC curve with three annotated points]
        • Detect threshold of 0: P{hit} = 1, P{false alarm} = 1
        • Desired operating point: P{hit} ~ 1, P{false alarm} ~ 0
        • Very high detect threshold: P{hit} = 0, P{false alarm} = 0
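The three annotated operating points follow directly from how a detection threshold is applied to matching scores. The Python sketch below illustrates this with made-up scores; the function name `roc_point` and all score values are hypothetical and are not the measured results.

```python
def roc_point(match_scores, mismatch_scores, threshold):
    """Return (P{hit}, P{false alarm}) when 'detected' means score >= threshold."""
    hits = sum(s >= threshold for s in match_scores) / len(match_scores)
    false_alarms = sum(s >= threshold for s in mismatch_scores) / len(mismatch_scores)
    return hits, false_alarms


# Hypothetical matching scores (all >= 0, higher is better).
match_scores = [0.6, 0.5, 0.8]       # pairs of runs of the same application
mismatch_scores = [0.1, 0.15, 0.2]   # pairs of runs of different applications

print(roc_point(match_scores, mismatch_scores, 0))     # (1.0, 1.0)
print(roc_point(match_scores, mismatch_scores, 0.25))  # (1.0, 0.0)
print(roc_point(match_scores, mismatch_scores, 10))    # (0.0, 0.0)
```

Sweeping the threshold between these extremes traces out the ROC curve; a single threshold that separates the two score groups corresponds to the desired operating point.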

  22. Comparison: TBP Outperforms VBP
      • In all cases, transitions perform better
      • In almost all cases, near-neighbor blurring improves detection

  23. Conclusions
      • Detecting phase behavior on real systems poses interesting challenges that result from system-induced variability
      • Phase transition information improves detection capabilities
        • TBP shows 6X better detection capability than VBP
      • Supporting methods, such as glitch/gradient filtering and near-neighbor blurring, improve the detectability of transition signatures
        • Near-neighbor blurring with tolerance=1 achieves 100% recurrence detection with <5% false alarms
      • The resulting infrastructure can enable a range of phase-oriented system adaptations!

  24. Thanks!

  25. BACKUPS
      • 0.5) How much noise, how much variation?
      • 1) Variation in time sequences of phase distributions for two gcc runs; recurrent phases with ammp
      • 2) Refined transition counts for different thresholds
      • 3) Advantages with Power/PMC Vectors
      • 4) Threshold vs. Hits & Misses with Tolerance=1
      • 5) How about an instruction-based sampling / control flow-based approach?
      • 6) What’s the source of variability?
      • 7) Glitches/Gradients vs. sampling frequency?
      • 8) Use of this framework?
      • 9) Multithreaded / OLTP-like benchmarks?
      • 10) SMT/CMP/multiprogramming environment?

  26. 0.5) Noise vs. Variations
      • Stable apps (Vpr, Crafty) change very little; variable ones change much more
      [Figure: Measured and Modeled traces for Gap, Vortex, Gzip, Vpr, Gcc, Crafty]

  27. 1) Phase Distributions Along Execution Timeline for 2 Runs of Gcc

  28. 1) Recurrence Example with Ammp
      • Although obvious to the eye, comparing phase sequences directly does not reveal the recurrence clearly!

  29. 2) Refined Transitions for Different Thresholds
      [Figure panels: Gcc, Equake]

  30. 3) Advantages with Power/PMC Vectors
      • Direct relation to actual processor power consumption
      • Acquired at runtime
      • Identify program phases with no programmatic knowledge of the application

  31. 4) Threshold vs. Hits & Misses with Tolerance=1
      • 100% hits with <5% false alarms for thresholds between 3/14 = 0.21 and 4/14 = 0.29

  32. 5) How about instruction-based sampling / control flow-based approaches?
      • We have tried 3 methods:
        • OS/USR counting with PMCs
          • Doesn’t eliminate variability
        • Binding to threads in sampling
          • Didn’t solve variability/registration problems
        • Dynamic instrumentation with Pin
          • Got back to perfect repeatability
          • Lost the actual benchmark execution behavior that flows through the processor
      • PC sampling doesn’t solve variability if we simply sample PCs every 1 ms or so (application execution time varies)
      • Sampling at fixed instruction counts for a specific PID makes it deterministic
        • Has its downsides: uncontrolled timing behavior, and not being able to bind to the flow through the processor

  33. 6) What’s the Source of Variability?
      • We don’t have a perfect, classified answer yet
      • Maybe Pin/ATOM can help
        • Different locality in different runs
        • Intensity of spontaneous system processes
        • Inexact memory access patterns / swaps
        • Different cache/TLB/BHT etc. states

  34. 7) Glitches/Gradients vs. Sampling Frequency
      • Reducing the frequency smooths glitches, BUT dithers gradients → More sluggish, low-pass-filtered response
      • Also smooths actual phase changes
      • We use 100 ms to meet the limitations at the high-frequency corner:
        • No observable perturbation to actual execution
        • Limited by RS232 speed
        • Close to the lower bound for acquiring 3-4 DMM samples
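The trade-off can be illustrated with a small Python sketch that models a lower sampling frequency as averaging over a longer window. That model, the function name `resample` and the sample values are assumptions for illustration; the slide does not describe the measurement path at this level of detail.

```python
def resample(samples, factor):
    """Average each group of `factor` consecutive samples into one sample.

    Trailing samples that do not fill a whole group are dropped.
    """
    return [sum(samples[i:i + factor]) / factor
            for i in range(0, len(samples) - factor + 1, factor)]


# Hypothetical metric traces at the original sampling rate.
glitch = [5, 5, 5, 9, 5, 5, 5, 5]   # one-sample disturbance
step = [1, 1, 4, 4, 4, 4, 4, 4]     # real phase change from 1 to 4

print(resample(glitch, 4))  # [6.0, 5.0]: the glitch shrinks from +4 to +1
print(resample(step, 4))    # [2.5, 4.0]: the phase edge becomes an in-between value
```

The same averaging that suppresses the glitch also turns the sharp phase change into an intermediate sample, which is the low-pass behavior noted on the slide.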

  35. 8) What’s the Use of This?
      • First, this is a GENERIC system for recurrence detection under variability!
      • Can be used to detect/predict phases with specific features:
        • Memory boundness
        • Hotspots
      • Can be stretched to security/reliability:
        • Matching signatures with PIDs
      • Specific promising avenues:
        • CMP workload balancing by signatures → power
        • Activity migration in the case of hotspot signatures
        • DVFS at experienced signatures
      • Needs help from BBVs when phase behavior changes with the actions taken!

  36. 9) Multithreaded/OLTP-Like Benchmarks?
      • No fundamental analysis problem, as we don’t try to bind to processes
      • Some of the applications we experimented with:
        • Mozilla, Xmms, Mplayer
        • FLAT power behavior → Not interesting
      • Need more infrastructure work to get OLTP-like applications running on our platform
      • Interesting follow-on: see the variability of these apps

  37. 10) SMT/CMP/Multiprogramming Environments
      • We don’t have the SMT/CMP platforms hooked up to the multimeter (yet)
      • SMT should be similar, as long as the multi-app behavior is somewhat repeatable
      • CMP is less clear: one PMC set & power measurement per core? Overall per chip?
      • We have tried multiprogramming on our P4:
        • Memory-intensive apps create too much swapping/thrashing for the behavior to be even somewhat repeatable
        • Not useful for phase detection
        • How deterministic is task switching?

  38. OLD/EXTRA Slides
