
Phase Analysis on Real Systems

Phase Analysis on Real Systems. Canturk Isci, Margaret Martonosi. Previously: runtime processor power monitoring and estimation; power phase behavior of programs (Power Vectors). Today: phase detection on real systems, including variability effects and potentials for repeatability.


Presentation Transcript


  1. Phase Analysis on Real Systems • Canturk ISCI, Margaret MARTONOSI

  2. Previously… • Runtime processor power monitoring and estimation • Power Phase Behavior of programs (Power Vectors) Canturk Isci - Margaret Martonosi

  3. Today! • Phase detection on real systems: • Variability effects and potentials for repeatability • Virtual memory behavior – Tuning • Initial results • What’s going on? • BBVs – PMCs – PVs… and POWER • Simple metric prediction studies • Short term vs. long term [Slide labels: MAJOR, MINOR, MAYBE]

  4. Phase Detection with Power Vectors • Initial idea was to look at phase distributions of applications and use some signature analysis to detect/predict phases • HOWEVER: • Multiple runs inevitably exhibit different real-system behavior • The quantities & durations vary • The phase distributions vary [Figure labels: Metric Var, Time Var]

  5. Variability Effects in Real System Behavior • A direct apples-to-apples comparison of phase signatures is not very relevant in the real world!

  6. How do Phase Distributions Compare? Ex: 2 runs of gcc [Figure panels: We Want, We Get]

  7. We Got Ourselves a Problem: • How do we extract this recurrent behavior information? • Speech/Humming recognition: • Stored libraries, signal stats • Pitch tracking • Image/Biomedical: • Image warping • Registration/Mutual information • Architects: • Simple to apply online • Implementable w/o massive state & combinationals

  8. Interesting Observation with Transitions • Trying to detect application from behavior • Upper Case: • Hit! • Lower Case: • False alarm? • Tracking phase transitions rather than phase sequences proves to be more useful in detecting recurrent behavior* [Figure panels: Gcc1-Gcc2, Gcc-Equake]

  9. Our Transition-Guided Detection Framework (The INTRO) • Benchmark run #1, Benchmark run #2 • Sample PMCs to form 12D vectors ⇒ Vector stream #1, Vector stream #2 • Identify transitions ⇒ Tinit #1, Tinit #2 • Apply glitch/gradient filtering ⇒ Tgg #1, Tgg #2 • Apply near-neighbor blurring ⇒ TggN #1 • Apply cross correlation • Match ⇒ Peak at best alignment • Mismatch ⇒ No observable peak

  10. Sampling Effects: Glitches & Gradients • Nothing happens without disturbances ⇒ Glitches • Glitch: Instability where before & after are the same ⇒ Spurious transitions • Nothing happens instantaneously ⇒ Gradients • Gradient: Instability where before & after are different ⇒ A single true transition
      GLITCHES:
      Initial Transitions: 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0
      Refined Transitions: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
      GRADIENTS:
      Initial Transitions: 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 0 1 1 1 1 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 0 0 0 0
      Refined Transitions: 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0

  11. Glitch/Gradient Filtering • Very simple: no consecutive transitions • Leads to large reductions in transition count • We call these “Refined Transitions (Tgg)”
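The filtering rule can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the slides only state "no consecutive transitions", so the way glitches are told apart from gradients here (comparing the phase label just before and just after each run of transition flags) is an assumption inferred from the glitch/gradient definitions. The function name `refine_transitions` and the `phases` input are hypothetical.

```python
from typing import List, Sequence


def refine_transitions(flags: Sequence[int], phases: Sequence[int]) -> List[int]:
    """Collapse each run of consecutive transition flags.

    flags[i]  -- 1 if sample i was marked as a transition, else 0
    phases[i] -- phase label of sample i (assumed available; hypothetical input)
    """
    refined = [0] * len(flags)
    i = 0
    while i < len(flags):
        if flags[i] == 0:
            i += 1
            continue
        start = i
        while i < len(flags) and flags[i] == 1:
            i += 1
        # The run covers samples [start, i). Compare the phase on both sides.
        before = phases[start - 1] if start > 0 else None
        after = phases[i] if i < len(flags) else None
        if before != after:
            # Gradient: behavior really changed, keep one transition.
            refined[start] = 1
        # Otherwise it is a glitch: the whole run is dropped.
    return refined


flags = [0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0]
phases = [0, 0, 5, 0, 0, 0, 3, 4, 1, 1, 1]
print(refine_transitions(flags, phases))
```

In this toy input the first run returns to the phase it started from (a glitch), while the second run settles on a new phase (a gradient), so only one refined transition survives.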

  12. Time Shifts • We have binary information ⇒ We can do something cheaper than shifted correlation coefficients • Cross-correlations show equally useful results • Easily implementable • Ex: Matching and mismatch cases, and “The Peak” [Figure panels: Gcc1-Gcc2, Gcc-Equake]
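A minimal sketch of cross-correlating two binary transition streams over a range of time shifts. For 0/1 data each product is a logical AND and each sum is a count, which is why this is cheaper than computing correlation coefficients. The function names and the `max_shift` parameter are illustrative assumptions, not taken from the slides.

```python
from typing import List, Sequence, Tuple


def cross_correlation(a: Sequence[int], b: Sequence[int], max_shift: int) -> List[Tuple[int, int]]:
    """Return (shift, score) pairs, where score counts samples i with a[i] = b[i + shift] = 1."""
    results = []
    for shift in range(-max_shift, max_shift + 1):
        score = 0
        for i, bit in enumerate(a):
            j = i + shift
            if bit and 0 <= j < len(b) and b[j]:
                score += 1
        results.append((shift, score))
    return results


def best_alignment(a: Sequence[int], b: Sequence[int], max_shift: int) -> Tuple[int, int]:
    """Return the (shift, score) pair with the highest score (the peak)."""
    return max(cross_correlation(a, b, max_shift), key=lambda pair: pair[1])


run1 = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
run2 = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]  # same pattern, delayed by 2 samples
print(best_alignment(run1, run2, max_shift=5))
```

A matching pair of runs produces one shift whose score clearly stands out; for unrelated runs the scores stay low and flat across shifts.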

  13. Time Dilations • Observation: Dilations exist as small jitters (a few samples) • Proposed Solution: “Near-Neighbor Blurring” • Blur edges slightly ⇒ Consider transitions as distributions around their actual locations • Tolerance: Spread of this distribution, [t-x, t+x] samples • Ex: Matching improvement with tolerance=4 [Figure panels: run1/run2 Mismatch!; run1/run2 Match!]
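A minimal sketch of near-neighbor blurring, assuming the simplest possible distribution: a flat (box) spread that marks every sample within the tolerance of a transition. The slides do not specify the shape of the distribution, so the box shape and the names `blur_transitions` and `overlap` are assumptions.

```python
from typing import List, Sequence


def blur_transitions(flags: Sequence[int], tolerance: int) -> List[int]:
    """Spread each transition at sample t over [t - tolerance, t + tolerance]."""
    n = len(flags)
    blurred = [0] * n
    for t, bit in enumerate(flags):
        if bit:
            for k in range(max(0, t - tolerance), min(n, t + tolerance + 1)):
                blurred[k] = 1
    return blurred


def overlap(a: Sequence[int], b: Sequence[int]) -> int:
    """Count samples where both streams have a transition."""
    return sum(1 for x, y in zip(a, b) if x and y)


run1 = [0, 0, 0, 1, 0, 0, 0, 0]
run2 = [0, 0, 0, 0, 1, 0, 0, 0]  # same transition, jittered by one sample
print(overlap(run1, run2))                       # no match without blurring
print(overlap(blur_transitions(run1, 1), run2))  # match once run1 is blurred
```

Only one of the two streams is blurred here, mirroring the framework diagram where blurring produces a single TggN stream.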

  14. Our Transition-Guided Detection Framework (The SUMMARY) • Benchmark run #1, Benchmark run #2 • Sample PMCs to form 12D vectors ⇒ Vector stream #1, Vector stream #2 • Identify transitions ⇒ Tinit #1, Tinit #2 • Apply glitch/gradient filtering ⇒ Tgg #1, Tgg #2 • Apply near-neighbor blurring ⇒ TggN #1 • Apply cross correlation • Match ⇒ Peak at best alignment • Mismatch ⇒ No observable peak

  15. Results • How do we quantify the strength of the peak? • Matching Score: • Detection Results: (green: highest match; red: highest mismatch)

  16. Receiver Operating Characteristics • Our best detection scheme (tolerance=1) achieves 100% hit detection with <5% false alarms. • (For a uniform threshold!)

  17. Comparison of Methods • Comparing 3 cases: • Original (Value-Based) Phases vs. Refined Transitions vs. Near-Neighbor Blurred Transitions • In all cases transitions perform better • In almost all cases near-neighbor blurring improves detection

  18. Conclusions • Phase-recurrent behavior detection on real systems has interesting problems resulting from system-induced variability • Looking at phase transition information improves detection capabilities in part • Supporting methods such as Glitch/Gradient Filtering and Near-Neighbor Blurring improve detectability of transition signatures

  19. Today! • Phase detection on real systems: • Variability effects and potentials for repeatability • Virtual memory behavior – Tuning • Initial results • What’s going on? • BBVs – PMCs – PVs… and POWER • Simple metric prediction studies • Short term vs. long term

  20. Workload Phases ⇒ Memory Behavior? • A Few of the Inspirations: • Redhat Magazine Issue #1 [Dec 2004] • Dynamically Tracking Page Miss Ratio Curve [ASPLOS 2005] • Gokul Kandiraju [PhD Thesis 2004] • Can we track phase behavior from PMCs and VM-related stats to dynamically manage memory behavior? • Less page locality ⇒ fetch fewer contiguous pages at once • Recurring reference with high reuse distance ⇒ launder less aggressively • Targets • Exec time & Energy [Diagram: Indicator ?? Action ?? Effect] Canturk Isci - Margaret Martonosi - James Donald

  21. Platform • P4, No SMT, 256 MB Mem, Linux 2.4.7-10 • SPEC2K is designed to fit in 256 MB • Choose high-memory benchmarks + multiprogramming • Multiprogramming combinations of these lead to lots of thrashing

  22. Action ⇒ Effect [Diagram: Indicator, Action, Effect] • Non-intrusive tuning possibilities: • Kswapd:tries_base: Max # of pages swapout daemon tries to free at once • Kswapd:swap_cluster: # of pages swapout daemon writes at once • Page-cluster: Log2(# of contiguous pages) kernel reads at once at a page fault • Intrusive tuning possibilities: • Page scanning period (Overhead if tasks fit in Mem) • Page age counters (reuse vs. pollution) • Inactive-Clean Percentage (balance I/O and Mem demand) • Task memory allocation (Workload-dependent Mem demand)
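Because page-cluster is a base-2 logarithm, each increment of the setting doubles the number of contiguous pages read at once. A tiny sketch of the arithmetic (the helper name `pages_per_read` is hypothetical, and the values below are only illustrative, not the settings used in the experiments):

```python
def pages_per_read(page_cluster: int) -> int:
    """Number of contiguous pages read at once for a given page-cluster setting."""
    return 1 << page_cluster  # 2 ** page_cluster


for setting in range(5):
    print(setting, pages_per_read(setting))
```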

  23. Non-intrusive Results • Gzip: gzip + gzip + gzip • Gap: gap + gzip • Bzip2: bzip2 + bzip2 • Tries_base and swap_cluster have no visible effect • Page-cluster shows ~7% improvement over the default

  24. Conclusions and Todos • Multiprogramming involving thrashing has a lot of potential for performance/power improvement • The cases we experimented with don’t show promising actions • Intrusive actions may be more useful, leading to effective actions as well as better (per-task) tracking • NEXT STEPS: • Looking into mm for potential dynamic tunings • Defining indicators that track relevant behavior • Page miss ratio / Swap rates / Bus utilization • Q: Is there any potential?

  25. Tomorrow! • Phase detection on real systems: • Variability effects and potentials for repeatability • Virtual memory behavior – Tuning • Initial results • What’s going on? • BBVs – PMCs – PVs… and POWER • Simple metric prediction studies • Short term vs. long term

  26. Comparing Phase Methods for Power • All lead to different interesting characterizations • How do these compare in terms of power representation? • Is there a dominant method or does a (hierarchical) combination work better? • We specifically look at BBVs (from sampled PC traces) & PMC-Power Vectors (from performance monitoring counters)

  27. Ex: Dcache Microkernel • Specify L1 hit rate, generate ~desired hits via random linked list traversal [Figure labels: A Cache Size; C M P Z; Different Phases]

  28. Dcache Performance Traces • Each hit rate range is obvious • Trends NOT identical across metrics: Linear L1 misses vs. nonlinear IPC • FOR A SINGLE METRIC: How you capture phases depends on metric and chosen threshold

  29. Dcache PC Traces • No visible phases from PC samples • Address space sampling alone is NOT sufficient!!

  30. Experiment Setup • PIN kit 1795 • 3-level trace instrumentation • ~Every user trace: Conditional inlined trace count • Every 50-200K trace calls: Sample EIP • Every 5-20M trace calls: • Generate BBV & collect PMCs & read PWR history • Constraint: Instrumentation should not overwhelm power variations!! • BBV Generation: • Sample BBL heads ⇒ hash into 32 dimensions (based on Jenkins) • PMC Reading: • Single rotation subset • Sample via ‘popen’s due to platform conflicts • Power Reading: • Read from serial device buffer • No polling possible ⇒ disable device at major instrumentation & exhaust buffer
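The BBV generation step can be illustrated with a short sketch that hashes sampled basic-block head addresses into a 32-dimensional vector. The slide only says the hash is "based on Jenkins"; the one-at-a-time variant used here, the 8-byte little-endian encoding of addresses, and the normalization to frequencies are all assumptions, and `sampled_bbv` is a hypothetical name.

```python
from typing import Iterable, List


def jenkins_one_at_a_time(data: bytes) -> int:
    """Bob Jenkins' one-at-a-time hash, 32-bit result."""
    h = 0
    for byte in data:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def sampled_bbv(bbl_heads: Iterable[int], dims: int = 32) -> List[float]:
    """Build a normalized BBV from sampled basic-block head addresses."""
    counts = [0] * dims
    total = 0
    for addr in bbl_heads:
        key = addr.to_bytes(8, "little")  # assumed encoding of the address
        counts[jenkins_one_at_a_time(key) % dims] += 1
        total += 1
    if total == 0:
        return [0.0] * dims
    return [c / total for c in counts]


samples = [0x08048000, 0x08048010, 0x08048000, 0x0804A1F0]  # made-up addresses
print(sampled_bbv(samples))
```

Two sampling intervals can then be compared by a distance between their vectors, which is what a similarity matrix visualizes.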

  31. BBV Results • Is sampling good enough? Are they meaningful? [Figure panels: B. Calder’s full-blown BBV SimMatrices; our sampled & hashed BBV SimMatrices]

  32. Power Results • Do we still have the hook on power variability? [Figure panels: From PIN, Native]

  33. Currently… • Still need to verify benchmarks for power and validity • Constructing power vectors with the reduced set • Applying symmetric phase analyses to BBVs and PMCs • Power representation of phases with respect to measurements • 90-10 Prediction with regression trees

  34. Today! • Phase detection on real systems: • Variability effects and potentials for repeatability • Virtual memory behavior – Tuning • Initial results • What’s going on? • BBVs – PMCs – PVs… and POWER • Simple metric prediction studies • Short term vs. long term

  35. Metric (IPC) Value Prediction • No big challenge to get good results, but improving for edges is interesting • Statistical Predictor: Transition-guided, history-based (EWMA) IPC prediction • Instead of a fixed history window, use stable regions between transitions as your history in a circular buffer • Transitions based on a threshold • Threshold = 0 ⇒ “Last Value Predictor” • Our experience: • Variabilities are bursty transitions • There are stable regions with probable gradients between transitions
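A minimal sketch of a transition-guided EWMA predictor along these lines. Several details are assumptions rather than facts from the slides: a transition is declared when the new sample differs from the previous sample by more than `threshold` (relative), the history is cleared at each transition, and the class name, `alpha`, and `capacity` are hypothetical.

```python
from collections import deque


class TransitionGuidedEwma:
    """EWMA over the current stable region only; history resets at transitions."""

    def __init__(self, threshold: float, alpha: float = 0.5, capacity: int = 16):
        self.threshold = threshold
        self.alpha = alpha
        self.history = deque(maxlen=capacity)  # circular buffer of the stable region

    def update(self, ipc: float) -> float:
        """Record a new IPC sample and return the prediction for the next one."""
        if self.history:
            last = self.history[-1]
            if abs(ipc - last) > self.threshold * abs(last):
                self.history.clear()  # transition: forget the old region
        self.history.append(ipc)
        return self.predict()

    def predict(self) -> float:
        """EWMA over the buffered samples, oldest first."""
        samples = list(self.history)
        estimate = samples[0]
        for value in samples[1:]:
            estimate = self.alpha * value + (1 - self.alpha) * estimate
        return estimate


predictor = TransitionGuidedEwma(threshold=0.10)
for sample in [1.0, 1.05, 2.0, 2.1]:
    print(sample, predictor.update(sample))
```

With `threshold=0` every change counts as a transition, so the buffer never holds two different values and the predictor degenerates to a last-value predictor, as the slide notes.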

  36. Ammp, thr=0% (Last Value)

  37. Ammp, thr=10%

  38. Using Stability Considerations (8) in IPC Predictions

  39. Predicting Durations • X=f(x) approach: • F(x) = x, x/2, x/8, … • Initial stability requirement: 2, 8, … • Table based? • Idea was: • At each transition: predict once for duration based on history: • Log(prev_duration) = key values [0,1,2,3,4,5] • History: • |5|3|5|3|5| ⇒ 3 • |1|3|5|1|3| ⇒ 5 • Need to filter bursts somehow • Partial matchings?? • NOT EXPLORED!!

  40. Ammp Duration Prediction • Predict based on F(x)=x/8 • Stability criterion = 8 samples • Extend duration ⇒ stability continues • IPC based on last value • Predictions only at checkpoints
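One possible reading of this scheme, sketched under stated assumptions: once a region has been stable for x samples (with x at least the stability criterion), predict that it stays stable for F(x) = x/8 more samples, and make the next prediction only when that predicted duration expires (a checkpoint). Integer division, the minimum extension of one sample, and both function names are assumptions for illustration.

```python
from typing import List


def predict_extra_duration(stable_len: int, divisor: int = 8, min_stable: int = 8) -> int:
    """Predicted number of additional stable samples after stable_len stable ones."""
    if stable_len < min_stable:
        return 0  # stability criterion not met yet: no prediction
    return max(1, stable_len // divisor)


def checkpoint_schedule(region_length: int, divisor: int = 8, min_stable: int = 8) -> List[int]:
    """Sample offsets within one stable region at which a prediction is made."""
    checkpoints = []
    x = min_stable
    while x < region_length:
        checkpoints.append(x)
        x += predict_extra_duration(x, divisor, min_stable)
    return checkpoints


print(checkpoint_schedule(20))
print(predict_extra_duration(64))
```

Under this reading, checkpoints grow sparser the longer a region stays stable, so long stable phases need few predictions.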

  41. Long Term IPC Prediction with Gradients • Last value not very useful in the long term • Instead of a 0th-order prediction, consider a 1st-order prediction: • Need additional ΔIPC information • Next IPC = Current IPC + ΔIPC • Ex: F(x)=x/8
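The difference between the two prediction orders can be shown with a short sketch. Taking ΔIPC as the difference between the last two samples and extrapolating it linearly over several steps are assumptions made for illustration; the slide only gives the one-step form.

```python
from typing import Sequence


def last_value_prediction(history: Sequence[float], steps_ahead: int = 1) -> float:
    """0th-order prediction: IPC stays at its current value."""
    return history[-1]


def first_order_prediction(history: Sequence[float], steps_ahead: int = 1) -> float:
    """1st-order prediction: Next IPC = Current IPC + delta IPC, extended linearly."""
    if len(history) < 2:
        return history[-1]
    delta = history[-1] - history[-2]
    return history[-1] + steps_ahead * delta


ramp = [1.0, 1.5, 2.0]  # IPC rising steadily, as in a gradient
print(last_value_prediction(ramp, steps_ahead=4))
print(first_order_prediction(ramp, steps_ahead=4))
```

On a steady gradient the last-value predictor falls further behind with every step, while the first-order predictor follows the trend.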

  42. Improvements? • Using Prediction Probability Tables: • P{N more|20 stable @ IPC} • Ex: Vortex • Using adaptive functions based on history • Table based function approaches
