Analyses and Optimizations for Multithreaded Programs - PowerPoint PPT Presentation

Presentation Transcript

  1. Analyses and Optimizations for Multithreaded Programs • John Whaley, IBM Tokyo Research Laboratory • Martin Rinard, Alex Salcianu, Brian Demsky, MIT Laboratory for Computer Science

  2. Motivation • Threads are Ubiquitous • Parallel Programming for Performance • Manage Multiple Connections • System Structuring Mechanism • Overhead • Thread Management • Synchronization • Opportunities • Improved Memory Management

  3. What This Talk is About • New Abstraction: Parallel Interaction Graph • Points-To Information • Reachability and Escape Information • Interaction Information • Caller-Callee Interactions • Starter-Startee Interactions • Action Ordering Information • Analysis Algorithm • Analysis Uses (synchronization elimination, stack allocation, per-thread heap allocation)

  4. Outline • Example • Analysis Representation and Algorithm • Lightweight Threads • Results • Conclusion

  5. Sum Sequence of Numbers: 9 8 1 5 3 7 2 6

  6. Group in Subsequences: (1 5) (3 7) (2 6) (9 8)

  7. Sum Subsequences (in Parallel): 1 + 5 = 6, 3 + 7 = 10, 2 + 6 = 8, 9 + 8 = 17

  8. Add Sums Into Accumulator • Subsequence sums: 17, 10, 8, 6 • Accumulator: 0

  9. Add Sums Into Accumulator • Subsequence sums: 17, 10, 8, 6 • Accumulator: 17

  10. Add Sums Into Accumulator • Subsequence sums: 17, 10, 8, 6 • Accumulator: 23

  11. Add Sums Into Accumulator • Subsequence sums: 17, 10, 8, 6 • Accumulator: 33

  12. Add Sums Into Accumulator • Subsequence sums: 17, 10, 8, 6 • Accumulator: 41

  13. Common Schema • Set of tasks • Chunk tasks to increase granularity • Tasks have both • Independent computation • Updates to shared data

  14. Realization in Java class Accumulator { int value = 0; synchronized void add(int v) { value += v; } }

  15. Realization in Java class Task extends Thread { Vector work; Accumulator dest; Task(Vector w, Accumulator d) { work = w; dest = d; } public void run() { int sum = 0; Enumeration e = work.elements(); while (e.hasMoreElements()) sum += ((Integer) e.nextElement()).intValue(); dest.add(sum); } } (figure: a Task object whose work and dest fields reference a Vector and an Accumulator; values shown: 0, 2, 6)

  16. Realization in Java (same Task class as slide 15; the figure adds an Enumeration object)

  17. Realization in Java void generateTask(int l, int u, Accumulator a) { Vector v = new Vector(); for (int j = l; j < u; j++) v.addElement(new Integer(j)); Task t = new Task(v,a); t.start(); } void generate(int n, int m, Accumulator a) { for (int i = 0; i < n; i++) generateTask(i*m, (i+1)*m, a); }
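The schema on the preceding slides can be tried end to end with a small driver. The following is only a sketch under stated assumptions: `Accumulator` and `Task` follow the slides, while `ParallelSumDemo`, `sumInParallel`, the `get()` accessor, the use of `List` in place of `Vector`, and the `join()` calls are additions made here so the result can be observed. It assumes task `i` covers the half-open range `[i*m, (i+1)*m)`.

```java
import java.util.ArrayList;
import java.util.List;

// Shared accumulator: add() is synchronized because several tasks update it.
class Accumulator {
    private int value = 0;
    synchronized void add(int v) { value += v; }
    synchronized int get() { return value; }   // accessor added for this sketch
}

// Each task sums its own chunk locally, then performs one shared update.
class Task extends Thread {
    private final List<Integer> work;
    private final Accumulator dest;

    Task(List<Integer> work, Accumulator dest) {
        this.work = work;
        this.dest = dest;
    }

    @Override
    public void run() {
        int sum = 0;
        for (int x : work) sum += x;   // independent computation
        dest.add(sum);                 // update to shared data
    }
}

class ParallelSumDemo {
    // Hypothetical driver: starts n tasks of m numbers each and waits for all of them.
    static int sumInParallel(int n, int m) {
        Accumulator a = new Accumulator();
        List<Task> tasks = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<Integer> chunk = new ArrayList<>();
            for (int j = i * m; j < (i + 1) * m; j++) chunk.add(j);
            Task t = new Task(chunk, a);
            tasks.add(t);
            t.start();
        }
        for (Task t : tasks) {
            try {
                t.join();              // wait so the total is complete before reading it
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return a.get();
    }

    public static void main(String[] args) {
        System.out.println(sumInParallel(4, 2)); // sums the numbers 0..7
    }
}
```

In this shape the chunk list is only ever touched by the task that owns it, while the accumulator is shared by all tasks, which is the distinction the analysis later exploits.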

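  18. Task Generation (figure: Accumulator holding 0)

  19. Task Generation (figure: Accumulator holding 0; a new Vector)

  20. Task Generation (figure: Accumulator holding 0; the Vector with element 2)

  21. Task Generation (figure: Accumulator holding 0; the Vector with elements 2, 6)

  22. Task Generation (figure: a Task whose work field references the Vector and whose dest field references the Accumulator)

  23. Task Generation (figure: as on slide 22, plus a second Vector; elements shown: 2, 6, 8, 9)

  24. Task Generation (figure: a second Task with work and dest fields, referencing the second Vector and the Accumulator)

  25. Task Generation (figure: three Tasks, each with a work reference to its own Vector and a dest reference to the Accumulator; elements shown: 1, 2, 5, 6, 8, 9)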

  26. Analysis

  27. Analysis Overview • Interprocedural • Interthread • Flow-sensitive • Statement ordering within thread • Action ordering between threads • Compositional, Bottom Up • Explicitly Represent Potential Interactions Between Analyzed and Unanalyzed Parts • Partial Program Analysis

  28. Analysis Result for run Method public void run() { int sum = 0; Enumeration e = work.elements(); while (e.hasMoreElements()) sum += ((Integer) e.nextElement()).intValue(); dest.add(sum); } • Abstraction: Points-to Graph • Nodes Represent Objects • Edges Represent References (figure: points-to graph with nodes this, Task, Enumeration, Vector, Accumulator and edges work, dest)

  29. Analysis Result for run Method (code and graph as on slide 28) • Inside Nodes • Objects Created Within Current Analysis Scope • One Inside Node Per Allocation Site • Represents All Objects Created At That Site

  30. Analysis Result for run Method (code and graph as on slide 28) • Outside Nodes • Objects Created Outside Current Analysis Scope • Objects Accessed Via References Created Outside Current Analysis Scope

  31. Analysis Result for run Method (code and graph as on slide 28) • Outside Nodes • One per Static Class Field • One per Parameter • One per Load Statement • Represents Objects Loaded at That Statement

  32. Analysis Result for run Method (code and graph as on slide 28) • Inside Edges • References Created Inside Current Analysis Scope

  33. Analysis Result for run Method (code and graph as on slide 28) • Outside Edges • References Created Outside Current Analysis Scope • Potential Interactions in Which Analyzed Part Reads Reference Created in Unanalyzed Part
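To make the node and edge vocabulary concrete, here is one possible in-memory encoding of the points-to graph for `run`. It is a sketch, not the representation used by the authors: the class `PointsToGraphSketch`, the enum names, and the string encoding of edges (`"node.field"` or a bare variable name mapped to a set of target nodes) are assumptions made for illustration, and the classification of each node below is a reading of the slide's figure rather than something the slides state.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class PointsToGraphSketch {
    // INSIDE nodes stand for objects allocated in the analysis scope;
    // PARAMETER, LOAD and CLASS are the three flavours of outside node.
    enum NodeKind { INSIDE, PARAMETER, LOAD, CLASS }
    enum EdgeKind { INSIDE, OUTSIDE }

    final Map<String, NodeKind> nodes = new LinkedHashMap<>();
    // Key: a variable name, or "node.field"; value: the nodes it may reference.
    final Map<String, Set<String>> inside = new LinkedHashMap<>();
    final Map<String, Set<String>> outside = new LinkedHashMap<>();

    void node(String name, NodeKind kind) { nodes.put(name, kind); }

    void edge(EdgeKind kind, String from, String to) {
        Map<String, Set<String>> m = (kind == EdgeKind.INSIDE) ? inside : outside;
        m.computeIfAbsent(from, k -> new LinkedHashSet<>()).add(to);
    }

    // Assumed graph for Task.run(): `this` is a parameter, `work` and `dest`
    // are read through it, and the Enumeration is allocated within the scope.
    static PointsToGraphSketch forRun() {
        PointsToGraphSketch g = new PointsToGraphSketch();
        g.node("Task", NodeKind.PARAMETER);
        g.node("Vector", NodeKind.LOAD);
        g.node("Accumulator", NodeKind.LOAD);
        g.node("Enumeration", NodeKind.INSIDE);
        g.edge(EdgeKind.INSIDE, "this", "Task");
        g.edge(EdgeKind.OUTSIDE, "Task.work", "Vector");
        g.edge(EdgeKind.OUTSIDE, "Task.dest", "Accumulator");
        g.edge(EdgeKind.INSIDE, "e", "Enumeration");
        return g;
    }

    public static void main(String[] args) {
        PointsToGraphSketch g = forRun();
        System.out.println("inside edges:  " + g.inside);
        System.out.println("outside edges: " + g.outside);
    }
}
```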

  34. Concept of Escaped Node • Escaped Nodes Represent Objects Accessible Outside Current Analysis Scope • parameter nodes, load nodes • static class field nodes • nodes passed to unanalyzed methods • nodes reachable from unanalyzed but started threads • nodes reachable from escaped nodes • Node is Captured if it is Not Escaped
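The last bullet makes escape a reachability property: a node escapes if it is one of the directly escaping nodes or can be reached from one. A minimal sketch of that closure follows. The class `EscapeSketch`, its method names, and the flattened successor map (field labels ignored) are assumptions for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class EscapeSketch {
    // seeds: nodes that escape directly (parameter, load and class nodes,
    // nodes passed to unanalyzed methods, started-thread nodes).
    // Returns the seeds plus every node reachable from them.
    static Set<String> escaped(Map<String, Set<String>> succ, Set<String> seeds) {
        Set<String> seen = new LinkedHashSet<>(seeds);
        Deque<String> worklist = new ArrayDeque<>(seeds);
        while (!worklist.isEmpty()) {
            String n = worklist.poll();
            for (String m : succ.getOrDefault(n, Collections.<String>emptySet())) {
                if (seen.add(m)) worklist.add(m);   // first visit: explore later
            }
        }
        return seen;
    }

    // A node is captured if it is not escaped.
    static Set<String> captured(Set<String> allNodes, Set<String> escaped) {
        Set<String> result = new LinkedHashSet<>(allNodes);
        result.removeAll(escaped);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> succ = new LinkedHashMap<>();
        succ.put("P", Collections.singleton("X"));  // parameter node P references X
        succ.put("Y", Collections.singleton("Z"));  // Y and Z are only locally reachable
        Set<String> esc = escaped(succ, Collections.singleton("P"));
        Set<String> all = new LinkedHashSet<>();
        Collections.addAll(all, "P", "X", "Y", "Z");
        System.out.println("escaped:  " + esc);
        System.out.println("captured: " + captured(all, esc));
    }
}
```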

  35. Why Escaped Concept is Important • Completeness of Analysis Information • Complete information for captured nodes • Potentially incomplete for escaped nodes • Lifetime Implications • Captured nodes are inaccessible when analyzed part of the program terminates • Memory Management Optimizations • Stack allocation • Per-Thread Heap Allocation

  36. Intrathread Dataflow Analysis • Computes a points-to escape graph for each program point • Points-to escape graph is a triple <I, O, e> • I - set of inside edges • O - set of outside edges • e - escape information for each node

  37. Dataflow Analysis • Initial state: $I$: formals point to parameter nodes, classes point to class nodes; $O = \emptyset$ • Transfer functions: $I' = (I \setminus \mathit{Kill}_I) \cup \mathit{Gen}_I$, $O' = O \cup \mathit{Gen}_O$ • Confluence operator is $\cup$
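The kill/gen form of the transfer functions and the union confluence operator can be written down directly over sets of edges. This is a sketch only: the class `DataflowStep` and the encoding of an edge as a string such as `"l->n1"` are assumptions chosen to keep the example short.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class DataflowStep {
    // Inside edges: remove the killed edges, then add the generated ones.
    static Set<String> transferInside(Set<String> in, Set<String> kill, Set<String> gen) {
        Set<String> out = new TreeSet<>(in);
        out.removeAll(kill);
        out.addAll(gen);
        return out;
    }

    // Outside edges are never killed: they only accumulate.
    static Set<String> transferOutside(Set<String> out, Set<String> gen) {
        Set<String> result = new TreeSet<>(out);
        result.addAll(gen);
        return result;
    }

    // Confluence at a control-flow merge: union of the incoming edge sets.
    static Set<String> join(Set<String> a, Set<String> b) {
        Set<String> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> in = new TreeSet<>(Arrays.asList("l->n1", "x->n2"));
        Set<String> kill = new TreeSet<>(Arrays.asList("l->n1"));
        Set<String> gen = new TreeSet<>(Arrays.asList("l->n3"));
        System.out.println(transferInside(in, kill, gen));
        System.out.println(join(in, gen));
    }
}
```

Because the confluence operator is union, information from both branches of a conditional is kept, so the result at a merge point is a may-point-to approximation.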

  38. Intraprocedural Analysis • Must define transfer functions for: • copy statement l = v • load statement l1 = l2.f • store statement l1.f = l2 • return statement return l • object creation site l = new cl • method invocation l = l0.op(l1…lk)

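  39. copy statement l = v • $\mathit{Kill}_I = \mathit{edges}(I, l)$ • $\mathit{Gen}_I = \{l\} \times \mathit{succ}(I, v)$ • $I' = (I \setminus \mathit{Kill}_I) \cup \mathit{Gen}_I$ (figure: existing edges from l and v)

  40. copy statement l = v • $\mathit{Kill}_I = \mathit{edges}(I, l)$ • $\mathit{Gen}_I = \{l\} \times \mathit{succ}(I, v)$ • $I' = (I \setminus \mathit{Kill}_I) \cup \mathit{Gen}_I$ (figure: generated edges from l)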
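As an illustration of the copy-statement transfer function, the sketch below applies it to a small edge map. The class `CopyTransfer` and the encoding of the inside edge set as a map from a variable name to the set of nodes it points to are assumptions, not part of the talk.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class CopyTransfer {
    // Returns the inside edges after `l = v`; the input map is left unchanged.
    static Map<String, Set<String>> copy(Map<String, Set<String>> in, String l, String v) {
        Map<String, Set<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, Set<String>> e : in.entrySet()) {
            out.put(e.getKey(), new LinkedHashSet<>(e.getValue()));
        }
        // Gen: l points to everything v pointed to before the statement.
        Set<String> gen = new LinkedHashSet<>(in.getOrDefault(v, Collections.<String>emptySet()));
        out.remove(l);     // Kill: all existing edges from l
        out.put(l, gen);   // add the generated edges
        return out;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> in = new LinkedHashMap<>();
        in.put("v", new LinkedHashSet<>(Arrays.asList("n1", "n2")));
        in.put("l", new LinkedHashSet<>(Arrays.asList("n3")));
        System.out.println("before: " + in);
        System.out.println("after:  " + copy(in, "l", "v"));
    }
}
```

The old target of `l` disappears because a local variable is overwritten by the assignment, which is what makes the kill set safe here.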

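  41. load statement l1 = l2.f • $S_E = \{ n_2 \in \mathit{succ}(I, l_2) \mid \mathit{escaped}(n_2) \}$ • $S_I = \bigcup \{ \mathit{succ}(I, n_2, f) \mid n_2 \in \mathit{succ}(I, l_2) \}$ • case 1: l2 does not point to an escaped node ($S_E = \emptyset$) • $\mathit{Kill}_I = \mathit{edges}(I, l_1)$ • $\mathit{Gen}_I = \{l_1\} \times S_I$ (figure: existing edges from l1 and l2, with field edge f)

  42. load statement l1 = l2.f • $S_E = \{ n_2 \in \mathit{succ}(I, l_2) \mid \mathit{escaped}(n_2) \}$ • $S_I = \bigcup \{ \mathit{succ}(I, n_2, f) \mid n_2 \in \mathit{succ}(I, l_2) \}$ • case 1: l2 does not point to an escaped node ($S_E = \emptyset$) • $\mathit{Kill}_I = \mathit{edges}(I, l_1)$ • $\mathit{Gen}_I = \{l_1\} \times S_I$ (figure: generated edges from l1)

  43. load statement l1 = l2.f • case 2: l2 does point to an escaped node ($S_E \neq \emptyset$) • $\mathit{Kill}_I = \mathit{edges}(I, l_1)$ • $\mathit{Gen}_I = \{l_1\} \times (S_I \cup \{n\})$ • $\mathit{Gen}_O = (S_E \times \{f\}) \times \{n\}$, where n is the load node for l1 = l2.f (figure: existing edges from l1 and l2)

  44. load statement l1 = l2.f • case 2: l2 does point to an escaped node ($S_E \neq \emptyset$) • $\mathit{Kill}_I = \mathit{edges}(I, l_1)$ • $\mathit{Gen}_I = \{l_1\} \times (S_I \cup \{n\})$ • $\mathit{Gen}_O = (S_E \times \{f\}) \times \{n\}$, where n is the load node for l1 = l2.f (figure: generated edges from l1 to load node n, and outside edge f to n)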
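Both cases of the load statement can be handled by one routine that first computes the escaped base nodes and the known field targets. The following is a sketch under assumptions: the class `LoadTransfer`, the in-place update of the maps, and the edge encoding (a variable name or `"node.field"` mapped to its target nodes) are illustrative choices, and the escape information is passed in as a precomputed set.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class LoadTransfer {
    // Applies `l1 = l2.f` to inside edges I and outside edges O (both updated in place).
    // loadNode is the outside node that stands for objects read by this statement.
    static void load(Map<String, Set<String>> I, Map<String, Set<String>> O,
                     Set<String> escaped, String l1, String l2, String f, String loadNode) {
        Set<String> sE = new LinkedHashSet<>();  // escaped nodes that l2 may point to
        Set<String> sI = new LinkedHashSet<>();  // targets of f known from inside edges
        for (String n2 : I.getOrDefault(l2, Collections.<String>emptySet())) {
            if (escaped.contains(n2)) sE.add(n2);
            sI.addAll(I.getOrDefault(n2 + "." + f, Collections.<String>emptySet()));
        }
        if (!sE.isEmpty()) {
            // Case 2: unanalyzed code may have written f, so read through the load node
            // and record an outside edge from every escaped base node to it.
            sI.add(loadNode);
            for (String n2 : sE) {
                O.computeIfAbsent(n2 + "." + f, k -> new LinkedHashSet<>()).add(loadNode);
            }
        }
        I.put(l1, sI);  // kill the old edges of l1 and install the generated ones
    }

    public static void main(String[] args) {
        Map<String, Set<String>> I = new LinkedHashMap<>();
        Map<String, Set<String>> O = new LinkedHashMap<>();
        I.put("this", new LinkedHashSet<>(Collections.singleton("Task")));
        Set<String> escaped = Collections.singleton("Task");  // parameter nodes escape
        load(I, O, escaped, "w", "this", "work", "L_work");
        System.out.println("I = " + I);
        System.out.println("O = " + O);
    }
}
```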

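  45. store statement l1.f = l2 • $\mathit{Gen}_I = (\mathit{succ}(I, l_1) \times \{f\}) \times \mathit{succ}(I, l_2)$ • $I' = I \cup \mathit{Gen}_I$ (figure: existing edges from l1 and l2)

  46. store statement l1.f = l2 • $\mathit{Gen}_I = (\mathit{succ}(I, l_1) \times \{f\}) \times \mathit{succ}(I, l_2)$ • $I' = I \cup \mathit{Gen}_I$ (figure: generated field edges f)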

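  47. object creation site l = new cl • $\mathit{Kill}_I = \mathit{edges}(I, l)$ • $\mathit{Gen}_I = \{\langle l, n \rangle\}$, where n is the inside node for l = new cl (figure: existing edges from l)

  48. object creation site l = new cl • $\mathit{Kill}_I = \mathit{edges}(I, l)$ • $\mathit{Gen}_I = \{\langle l, n \rangle\}$, where n is the inside node for l = new cl (figure: generated edge from l to n)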
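The store and object-creation transfer functions differ in one respect worth seeing side by side: creation kills the old edges of the local variable, while a store only adds edges. The sketch below shows both. The class `StoreAndNewTransfer`, the node names such as `N_vector`, and the map encoding of inside edges (a variable name or `"node.field"` mapped to target nodes) are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class StoreAndNewTransfer {
    // `l = new cl`: l now points only to the inside node of this allocation site.
    static void create(Map<String, Set<String>> I, String l, String insideNode) {
        Set<String> gen = new LinkedHashSet<>();
        gen.add(insideNode);
        I.put(l, gen);  // replaces (kills) whatever l pointed to before
    }

    // `l1.f = l2`: add an f edge from every node l1 may point to
    // to every node l2 may point to. Nothing is killed.
    static void store(Map<String, Set<String>> I, String l1, String f, String l2) {
        Set<String> targets = new LinkedHashSet<>(I.getOrDefault(l2, Collections.<String>emptySet()));
        List<String> bases = new ArrayList<>(I.getOrDefault(l1, Collections.<String>emptySet()));
        for (String n1 : bases) {
            I.computeIfAbsent(n1 + "." + f, k -> new LinkedHashSet<>()).addAll(targets);
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> I = new LinkedHashMap<>();
        create(I, "v", "N_vector");   // v = new Vector()
        create(I, "t", "N_task");     // t = new Task(...)
        store(I, "t", "work", "v");   // t.work = v
        System.out.println(I);
    }
}
```

Stores do not kill because one node may stand for many run-time objects (for example, all objects created at one allocation site), so an old field edge may still be valid for some of them.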

  49. Method Call • Analysis of a method call: • Start with points-to escape graph before the call site • Retrieve the points-to escape graph from analysis of callee • Map outside nodes of callee graph to nodes of caller graph • Combine callee graph into caller graph • Result is the points-to escape graph after the call site

  50. Start With Graph Before Call • Points-to Escape Graph before call to t = new Task(v,a) (figure: variables a, t, v)