
Evaluating the Impact of Thread Escape Analysis on Memory Consistency Optimizations

Evaluating the Impact of Thread Escape Analysis on Memory Consistency Optimizations. Chi-Leung Wong, Zehra Sura, Xing Fang, Kyungwoo Lee, Samuel P. Midkiff, Jaejin Lee and David Padua. University of Illinois at Urbana-Champaign, IBM T.J. Watson Research Center, Purdue University.


Presentation Transcript


  1. Evaluating the Impact of Thread Escape Analysis on Memory Consistency Optimizations • Chi-Leung Wong, Zehra Sura, Xing Fang, Kyungwoo Lee, Samuel P. Midkiff, Jaejin Lee and David Padua • University of Illinois at Urbana-Champaign • IBM T.J. Watson Research Center • Purdue University • Seoul National University

  2. Outline • Memory Models • The Pensieve System • Escape Analyses • Qualitative Impact of Escape Analyses on Delay Set Analysis and Synchronization Analysis • Experimental Results • Conclusion

  3. Memory Models • Consider the following code segments (data is initially 0 and data_ready is initially false): • Thread 1 : data = 100; data_ready = true; • Thread 2 : while (!data_ready); t = data; • Can t == 0? • Yes, if reordering happens, e.g. Thread 1 executes as: data_ready = true; data = 100; • Such reordering can be done by the compiler and the hardware • Memory models tell us the answer • Sequential Consistency says no
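The two code segments above can be written as a small Java program. This is only a sketch for illustration: the class and method names are hypothetical, and it uses a hand-written `volatile` flag, which is the standard Java way to forbid this particular reordering. It is not the Pensieve mechanism, which aims to give sequential consistency without such annotations.

```java
// Hypothetical demo class; mirrors Thread 1 and Thread 2 from the slide.
final class DataReadyDemo {
    static int data = 0;                        // plain field, initially 0
    static volatile boolean dataReady = false;  // volatile: accesses to it are not reordered with the data accesses

    // Runs both threads once and returns the value Thread 2 read into t.
    static int runOnce() {
        data = 0;
        dataReady = false;
        final int[] t = new int[1];

        Thread writer = new Thread(() -> {
            data = 100;        // Thread 1, first statement
            dataReady = true;  // Thread 1, second statement
        });
        Thread reader = new Thread(() -> {
            while (!dataReady) { /* spin until the flag is set */ }
            t[0] = data;       // Thread 2 reads data after seeing the flag
        });

        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return t[0];
    }

    public static void main(String[] args) {
        System.out.println("t = " + runOnce());
    }
}
```

If `volatile` is removed, the Java memory model no longer rules out t == 0 (or a reader that never sees the flag), which is the situation the slide describes.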

  4. Objective of the Pensieve Project • Provide sequential consistency (SC) on top of Intel x86 memory models • Implementation based on Jikes RVM • All analyses done at JIT time • Need to minimize both analysis time and application execution time

  5. Enforcing SC • Done by enforcing orders between memory accesses • Not all orderings need to be enforced • Only enforce the orders that are really needed • Delay Set Analysis (DSA) [SS88] computes such orders • Our approach: an approximation of DSA • Orders are enforced by inserting fences in the generated code

  6. Original DSA • Program edge • x executes before y in the same thread • Conflict edge • x and x’ are conflicting accesses • Order of access affects program outcome • In this paper: • both access the same memory location • at least one of them is a write

  7. Original DSA (Cont’d) • Critical cycle • Minimal • Cannot form a smaller cycle using a subset of its nodes • Mixed • Contains both program edges and conflict edges • Enforce program edges on a critical cycle • [Figure: example cycles labeled Not mixed, Not minimal, Mixed, Minimal]

  8. Approximate DSA • Approximation of a critical cycle • x precedes y • Conflicting accesses exist for • x and x’ • y and y’ • y’ precedes x’ • Enforce program edges on an approximate critical cycle
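The approximate critical-cycle check can be sketched in Java as follows. This is a simplified illustration, not the Pensieve implementation: the `Access` class and all method names are hypothetical, each thread is assumed to be straight-line code, and a conflict is assumed to involve two different threads.

```java
import java.util.Arrays;
import java.util.List;

final class ApproxDelaySketch {
    // One memory access; all names are hypothetical.
    static final class Access {
        final int thread;       // id of the issuing thread
        final int position;     // program order within that thread
        final String location;  // memory location touched
        final boolean isWrite;
        Access(int thread, int position, String location, boolean isWrite) {
            this.thread = thread;
            this.position = position;
            this.location = location;
            this.isWrite = isWrite;
        }
    }

    // Conflict edge: same location, at least one write (assumed: different threads).
    static boolean conflicts(Access a, Access b) {
        return a.thread != b.thread
                && a.location.equals(b.location)
                && (a.isWrite || b.isWrite);
    }

    // Program edge: a executes before b in the same thread.
    static boolean precedes(Access a, Access b) {
        return a.thread == b.thread && a.position < b.position;
    }

    // True if x -> y lies on an approximate critical cycle:
    // x conflicts with x', y conflicts with y', and y' precedes x'.
    static boolean needsDelay(Access x, Access y, List<Access> all) {
        if (!precedes(x, y)) return false;
        for (Access xPrime : all) {
            if (!conflicts(x, xPrime)) continue;
            for (Access yPrime : all) {
                if (conflicts(y, yPrime) && precedes(yPrime, xPrime)) return true;
            }
        }
        return false;
    }

    // The data / data_ready example from the Memory Models slide.
    static final Access W_DATA = new Access(1, 0, "data", true);
    static final Access W_READY = new Access(1, 1, "data_ready", true);
    static final Access R_READY = new Access(2, 0, "data_ready", false);
    static final Access R_DATA = new Access(2, 1, "data", false);
    static final List<Access> ALL = Arrays.asList(W_DATA, W_READY, R_READY, R_DATA);

    public static void main(String[] args) {
        System.out.println(needsDelay(W_DATA, W_READY, ALL));  // writer's two stores
        System.out.println(needsDelay(R_READY, R_DATA, ALL));  // reader's two loads
    }
}
```

On the data / data_ready example, both program edges are reported as delays, so a fence would be placed between the two stores in Thread 1 and between the two loads in Thread 2.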

  9. The Pensieve System • [Figure: compilation pipeline with boxes Source Program; Program Analyses (Thread Escape Analysis, Synchronization Analysis, Delay Set Analysis); Orders to Enforce; Code Optimizations; Fence Insertion & Optimization; Target Program]

  10. Escape Analyses • Identify objects which may be accessed by two or more threads • Output: set of variables • {v | v points to an object that may be accessed by >= 2 threads}

  11. Impact on Delay Set Analysis • x, y, y’, x’ must all be escaping accesses • Cannot form a cycle if one of them is not an escaping access • Fewer escaping accesses implies fewer possible pairs (x, y) • Fewer checks to be done • Fewer delays
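The effect on the number of checks can be illustrated with a small counting sketch. It assumes one straight-line method whose accesses are given in program order, with a flag per access saying whether it touches an escaping object. The class and method names are hypothetical.

```java
// Hypothetical sketch: counts candidate (x, y) program-order pairs in one method.
final class EscapeFilterSketch {
    // escaping[i] is true if the i-th access (in program order) is an escaping access.
    // With useEscapeInfo == false, every ordered pair is a candidate;
    // with useEscapeInfo == true, both x and y must be escaping accesses.
    static int candidatePairs(boolean[] escaping, boolean useEscapeInfo) {
        int count = 0;
        for (int i = 0; i < escaping.length; i++) {
            for (int j = i + 1; j < escaping.length; j++) {
                if (!useEscapeInfo || (escaping[i] && escaping[j])) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        boolean[] accesses = {true, false, false, true, false};
        System.out.println(candidatePairs(accesses, false)); // all pairs
        System.out.println(candidatePairs(accesses, true));  // escaping pairs only
    }
}
```

For five accesses of which two are escaping, the candidate pairs drop from 10 to 1, which is why a more precise escape analysis means fewer delay checks.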

  12. Impact on Synchronization Analysis • Synchronization analysis reduces the number of conflict edges considered by DSA • Considers synchronized constructs • Considers calls to start() and join() • Our system only considers t1.join() if • it can be matched with some t2.start() call • t1 and t2 are not escaping • More precise escape information • more join() calls matched • more precise DSA result
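A small Java program shows the kind of start()/join() pattern this slide refers to. It is an illustrative sketch with hypothetical names, not code from the study. The thread object `child` is a local variable that does not escape, so its join() can be matched with its start().

```java
// Hypothetical demo: conflicting accesses that are already ordered by start() and join().
final class StartJoinDemo {
    static int input;   // written by the parent before start(), read by the child
    static int output;  // written by the child, read by the parent after join()

    static int run() {
        input = 21;                                           // parent write, before start()
        Thread child = new Thread(() -> output = input * 2);  // child reads input, writes output
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return output;                                        // parent read, after join()
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The parent and child accesses to `input` and `output` conflict, but Java guarantees that start() orders the parent's earlier actions before the child's, and that join() orders the child's actions before the parent's later ones. Conflict edges between such accesses therefore cannot take part in a reordering that changes the outcome.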

  13. Escape Analyses Comparison • In this study, we compare 4 algorithms: • Connectivity Analysis (Pensieve) • Field Base Analysis (Pensieve) • For comparison purposes • Bogda’s Analysis • Removing Unnecessary Synchronization in Java. (OOPSLA 1999) • Ruf’s Analysis • Effective Synchronization Removal for Java. (PLDI 2000)

  14. Connectivity Escape Analysis • An object is escaping if both • It is reachable by more than one thread, in one of two possible cases: • Reachable from a static field • Passed from a thread constructor • It is accessed by more than one thread • Does not assume that this is escaping in run() by default • Field insensitive for most memory accesses • i.e., does not distinguish x.f from x.g • Except accesses to Runnable objects
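The two conditions above can be sketched as a reachability pass followed by a filter. This is a toy model under stated assumptions, not the Pensieve algorithm: objects are plain strings, the points-to graph and the per-object sets of accessing threads are given as inputs, and every name is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ConnectivityEscapeSketch {
    // roots: objects referenced by static fields or passed to a thread constructor.
    // An object is reported as escaping if it is reachable from a root
    // AND it is accessed by at least two threads.
    static Set<String> escaping(Map<String, List<String>> pointsTo,
                                Collection<String> roots,
                                Map<String, Set<String>> accessedBy) {
        Set<String> reachable = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String obj = work.pop();
            if (!reachable.add(obj)) continue;  // already visited
            List<String> successors = pointsTo.get(obj);
            if (successors != null) work.addAll(successors);
        }
        Set<String> result = new TreeSet<>();   // sorted for stable output
        for (String obj : reachable) {
            Set<String> threads = accessedBy.get(obj);
            if (threads != null && threads.size() >= 2) result.add(obj);
        }
        return result;
    }

    // Small made-up example used for illustration.
    static Set<String> demo() {
        Map<String, List<String>> pointsTo = new HashMap<>();
        pointsTo.put("shared", Arrays.asList("buffer"));   // static field -> buffer
        pointsTo.put("task", Arrays.asList("scratch"));    // thread-constructor argument -> scratch
        pointsTo.put("local", Arrays.asList("tmp"));       // never reachable from a root

        Map<String, Set<String>> accessedBy = new HashMap<>();
        accessedBy.put("shared", new HashSet<>(Arrays.asList("T1", "T2")));
        accessedBy.put("buffer", new HashSet<>(Arrays.asList("T1", "T2")));
        accessedBy.put("task", new HashSet<>(Arrays.asList("T2")));
        accessedBy.put("scratch", new HashSet<>(Arrays.asList("T2")));
        accessedBy.put("local", new HashSet<>(Arrays.asList("T1")));
        accessedBy.put("tmp", new HashSet<>(Arrays.asList("T1")));

        return escaping(pointsTo, Arrays.asList("shared", "task"), accessedBy);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the made-up example, `task` and `scratch` are reachable through a thread constructor but are accessed by one thread only, so they are not reported. An analysis that uses reachability alone would report them.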

  15. Field Base Escape Analysis • An object is escaping if it is • Reachable from a static field, or • Passed from a thread constructor • Does not assume that this is escaping in run() by default • Similar to connectivity base analysis, but field sensitive • Suppose O1, O2 are of the same type • O1.f is different from O1.g • O1.f is the same as O2.f
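The O1/O2 rule above amounts to identifying a memory location by its type and field name rather than by the individual object. A minimal sketch, with a hypothetical class and method name and types given as plain strings:

```java
// Hypothetical sketch of the field-based abstraction: a location is (type, field).
final class FieldBaseSketch {
    // Two accesses are treated as the same location if the receiver types
    // and the field names both match; the receiver object itself is ignored.
    static boolean sameLocation(String type1, String field1, String type2, String field2) {
        return type1.equals(type2) && field1.equals(field2);
    }

    public static void main(String[] args) {
        // O1 and O2 are both of (made-up) type Node.
        System.out.println(sameLocation("Node", "f", "Node", "g")); // O1.f vs O1.g
        System.out.println(sameLocation("Node", "f", "Node", "f")); // O1.f vs O2.f
    }
}
```

The first comparison is false and the second is true, matching the two bullets: fields are kept apart, objects of the same type are merged.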

  16. Bogda’s Escape Analysis • An object is escaping if it is reachable: • By a static field • By a Runnable object • Via more than 1 field reference

  17. Ruf’s Escape Analysis • An object is escaping if both • Reachable from either • A static field or • A Runnable object • Synchronized by more than one thread • Adapted for our own use • “synchronized” → “accessed”

  18. Experimental Settings (Machine) • Intel (Dell PowerEdge 6600 SMP) • 4 Intel hyperthreaded 1.5 GHz Xeon processors • with 1 MB cache each • 6 GB system memory

  19. Experimental Settings (Software) • Original • default Jikes RVM implementation • base case for performance comparison • Enforcing SC • Empty • Arg Escaping • Connectivity analysis • Field-base analysis • Bogda’s analysis (bogda) • Ruf’s analysis

  20. Measurements • Escape Analysis Time • Impact on Delay Set Analysis Time • Impact on Synchronization Analysis Time • Slowdown due to fence insertion • Delay Set Analysis only • Delay Set Analysis with Synchronization Analysis

  21. Escape Analysis Time

  22. Impact on Delay Set Analysis Time

  23. Impact on Synchronization Analysis Time

  24. Escape+DSA+ Synchronization Analysis Time / Compilation Time

  25. Slowdown (DSA Only)

  26. Slowdown (DSA+Sync Analysis)

  27. Slowdown of connect (DSA+Sync Analysis)

  28. Conclusions • Evaluated the interaction between escape analysis and synchronization/delay set analysis • Montecarlo and jbb motivate enabling field sensitivity for connectivity base analysis

  29. Backup Slides Follow

  30. Number of Delay Checks Performed

  31. Total Compilation Time

  32. Number of Delays Found (DSA Only)

  33. Number of Delays Found (DSA + Sync Analysis)
