1 / 55

Static and Dynamic Program Analysis: Synergies and Applications

Static and Dynamic Program Analysis: Synergies and Applications. Lecture 23 CS 6340. Today’s Computing Platforms. Trends:. Traits:. parallel cloud mobile. numerous diverse distributed. Unprecedented software challenges:

shiloh
Download Presentation

Static and Dynamic Program Analysis: Synergies and Applications

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Static and Dynamic Program Analysis: Synergies and Applications Lecture 23 CS 6340

  2. Today’s Computing Platforms Trends: Traits: • parallel • cloud • mobile • numerous • diverse • distributed Unprecedented software challenges: reliability, productivity, scalability, energy-efficiency

  3. A Challenge in Mobile Computing • Rich apps are hindered • by resource-constrained • mobile devices (battery, • CPU, memory, ...) How can we seamlessly partition mobile apps and offload compute- intensive parts to the cloud?

  4. A Challenge in Cloud Computing service levelagreements energyefficiency data locality scheduling How can we automatically predict performance metrics of programs?

  5. A Challenge in Parallel Computing • Most Java programs are • so rife with concurrency • bugs that they work only • by accident. • – Brian Goetz • Java Concurrency in Practice “ ” How can we automatically make concurrent programs more reliable?

  6. Terminology • Program Analysis • Discovering facts about programs • Dynamic Analysis • Program analysis using program executions • Static Analysis • Program analysis without running programs

  7. This Talk • Techniques Synergistically combine diverse techniques to solve modern software challenges Challenges programreliability programscalability staticanalysis dynamicanalysis programperformance machinelearning

  8. Our Result: Mobile Computing • Techniques Seamless Program Partitioning: Upto 20X decrease in energy used on phone Challenges programreliability programscalability staticanalysis dynamicanalysis programperformance machinelearning

  9. Our Result: Cloud Computing • Techniques Automatic Performance Prediction: Prediction error < 7% at < 6% program runtime cost Challenges programreliability programscalability staticanalysis dynamicanalysis programperformance machinelearning

  10. Our Result: Parallel Computing • Techniques Scalable Program Verification: 400 concurrency bugs in 1.5 MLOC Java programs Challenges programreliability programscalability staticanalysis dynamicanalysis programperformance machinelearning

  11. Talk Outline • Overview • Seamless Program Partitioning • Automatic Performance Prediction • Scalable Program Verification • Putting It All Together

  12. Talk Outline • Overview • Seamless Program Partitioning • Automatic Performance Prediction • Scalable Program Verification • Putting It All Together

  13. Program Partitioning: CloneCloud [EuroSys’11] Offline Optimization using ILP static analysis yieldsconstraints rich app ….……offload………………..………..resume….. …. …... ….. offload • dictates correct solutions • uses program’s call graphto avoid nested migration ………… …….. ……..... resume dynamic analysis yields objective function How do we automatically find which function(s) to migrate? • dictates optimal solutions • uses program’s profiles tominimize time or energy

  14. CloneCloud on Face Detection App • phone = HTC G1 running Dalvik VM on Android OS • cloud = desktop running Android x86 VM on Linux 1 image 100 images energy used on phone (J) • Upto 20X decrease in energy used on phone • Similar for total running time, and other apps

  15. Talk Outline • Overview • Seamless Program Partitioning • Automatic Performance Prediction • Scalable Program Verification • Putting It All Together

  16. Talk Outline • Overview • Seamless Program Partitioning • Automatic Performance Prediction • Scalable Program Verification • Putting It All Together

  17. From Offline to Online Program Partitioning • CloneCloud uses same partitioning regardless of input • But different partitionings optimal for different inputs • 1 image: phone-only optimal • 100 images: (phone + cloud) optimal • Challenge: automatically predict running time of a function on an input • can be used to decide online whether or not to partition • but also has many other applications

  18. The Problem: Predicting Program Performance Inputs: Program P Input I class Bldg {List events, floors;main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time;}Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…));} } 9 2 1 0 6 2 1 5 5 4 8 7 7 0 java Elev /foo/input.txt Output: Estimated running time of P(I) Goals: Accurate Efficient General-purpose Automatic

  19. Our Solution: Mantis [NIPS’10] program P input I Offline Online performancemodel estimated runningtime of P(I) training inputsI1, …, IN

  20. R = .3 + .5 f4 + .8 f42 Offline Stage of Mantis class Bldg {List events, floors;main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time;}Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…));} } • Instrument program with broad classes of features • Collect feature values and timeon training data • Express time as function of few features using sparse regression • Obtain evaluators for featuresusing static slicing analysis f4+= t; f1++; f2++; f3++;

  21. Static Slicing Analysis • static-slice(v) = { all actions that may affect value of v } • Computed using data and control dependencies

  22. From Dependencies to Slices class Bldg {List events, floors;main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time; }Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…));} } f4 += t; static-slice(f4) =

  23. R = .3 + .5 f4 + .8 f42 Offline Stage of Mantis class Bldg {List events, floors;main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time;}Bldg() {floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…));} } • Instrument program with broad classes of features • Collect feature values and timeon training data • Express time as function of few features using sparse regression • Obtain evaluators for featuresusing static slicing analysis • If feature is costly, discard it and repeat process f4+= t; f1++; f2++; f3++;

  24. Offline Stage of Mantis • Instrument program with broad classes of features • Collect feature values and timeon training data • Express time as function of few features using sparse regression • Obtain evaluators for featuresusing static slicing analysis • If feature is costly, discard it and repeat process dynamicanalysis trainingdata profiledata rejectedfeatures machinelearning staticanalysis chosen features

  25. Mantis on Apache Lucene • Popular open-source indexing and search engine • Datasets used: Shakespeareand King James Bible • 1000 inputs, 100 for training • Feature counts: • instrumented = 6,900 • considered = 410 (6%) • chosen = 2 • Similar results for other apps prediction error = 4.8% running time ratio = 28X (program/slices)

  26. Talk Outline • Overview • Seamless Program Partitioning • Automatic Performance Prediction • Scalable Program Verification • Putting It All Together

  27. Talk Outline • Overview • Seamless Program Partitioning • Automatic Performance Prediction • Scalable Program Verification • Putting It All Together

  28. class Bldg {List events, floors;main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time;}Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…));} } f4 += t; Static Analysis of Concurrent Programs Data or control flow of one thread can be affected by actions of other threads! Either prove these actions thread-local ... … or include these actions in the slice as well

  29. Bldg Elev floors events floors List List floors elems elems Elev Object[] Object[] 0 1 0 1 BP Floor Floor BP v The Thread-Escape Problem • thread-local(p,v): Is v reachable from single thread at pon all inputs? class Bldg { List events, floors; main() { Bldg b = new Bldg();for (…) BP v = b.events.get(i); int t = v.time; } Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…)); } } p: a program state at p = local = shared

  30. The Thread-Escape Problem • thread-local(p,v): Is v reachable from single thread at pon all inputs? • Needs to reason about all inputs  Use static analysis

  31. The Need for Program Abstractions • All static analyses need abstraction • represent sets of concrete entities as abstract entities • Why? • Cannot reason directly about infinite concrete entities • For scalability • Our static analysis: • How are pointer locations abstracted? • How is control flow abstracted?

  32. Example: Trivial Pointer Abstraction thread-local(p, v)? class Bldg { List events, floors; main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time; } Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…)); } } Bldg Elev floors events p: floors List List floors elems elems Elev Object[] Object[] 0 1 0 1 BP Floor Floor BP v class List {List() { this.elems = new Object[…];} }

  33. Example: Allocation Sites Pointer Abstraction thread-local(p, v)? class Bldg { List events, floors; main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time; } Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…)); } } Bldg Elev floors events p: floors List List floors elems elems Elev Object[] Object[] 0 1 0 1 BP Floor Floor BP v class List {List() { this.elems = new Object[…];} }

  34. Bldg Elev floors events floors List List floors elems elems Elev Object[] Object[] 0 1 0 1 BP Floor Floor BP v Example: k-CFA Pointer Abstraction thread-local(p, v)? class Bldg { List events, floors; main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time; } Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…)); } } p: class List {List() { this.elems = new Object[…];} }

  35. Complexity of Static Analyses precise scalable H = allocation sites, I = call sites L = program points, F = fields Challenge: an abstraction that is both precise and scalable Our Static Analysis: Q = queries

  36. Drawback of Existing Static Analyses • Different queries require different parts of the program to be abstracted precisely • But existing analyses use the same abstraction to prove all queries simultaneously • ⇒existing analyses sacrifice precision and/or scalability P P⊢Q1? static analysis Q1 abstraction A P⊢Q2? Q2

  37. Insight 1: Client-Driven Static Analysis • Query-driven: allows using separate abstractions for proving different queries • Parametrized: parameter dictates how much precision to use for each program part for a given query Q2 Q1 static analysis static analysis abstraction A2 abstraction A1 P P⊢Q1? P⊢Q2?

  38. h1 h2 h3 h4 h5 h6 h7 Example: Client-Driven Static Analysis thread-local(p, v)? class Bldg { List events, floors; main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time; } Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…)); } } h1: Bldg Elev floors events p: floors List List floors h2: elems elems Elev h3: Object[] Object[] h4: 0 1 0 1 h5: BP Floor Floor BP h6: v class List {List() { this.elems = new Object[…];} } h7:

  39. Insight 2: Leveraging Dynamic Analysis • Challenge: Efficiently find cheap parameter to prove query • 2^H choices, most choices imprecise or unscalable • Our solution: Use dynamic analysis • parameter is inferred efficiently (linear in H) • it can fail to prove query, but it is precise in practice and no cheaper parameter can prove query H inputsI1... In static analysis dynamic analysis Q P⊢Q? abstraction A P

  40. Bldg Elev floors events floors List List floors elems elems Elev Object[] Object[] 0 1 0 1 BP Floor Floor BP v h1 h2 h3 h4 h5 h6 h7 Example: Leveraging Dynamic Analysis thread-local(p, v)? class Bldg { List events, floors; main() { Bldg b = new Bldg(); for (…) BP v = b.events.get(i); int t = v.time; } Bldg() { floors = new List(); for (…) floors.add(new Floor()); for (…) Elev e = new Elev(floors); e.start(); events = new List(); for (…) events.add(new BP(…)); } } h1: p: h2: h3: h4: h5: h6: class List {List() { this.elems = new Object[…];} } h7:

  41. Using Dynamic Analysis for Static Analysis • Previous work: • Counterexamples: query is false on some input • suffices if most queries are expected to be false • Likely invariants: a query true on some inputs is likely true on all inputs [Daikon (Ernst et al.’01)] • Our approach [POPL’12]: • Proofs: a query true on some inputs is likely trueon all inputs and for likely the same reason

  42. Benchmark Characteristics

  43. Precision Comparison Previous Approach Our Approach • Pointer abstraction: • Allocation sites • Control abstraction: • Flow insensitive • Context insensitive • Pointer abstraction: • 2-partition • Control abstraction: • Flow sensitive • Context sensitive

  44. Precision Comparison Previous Approach Our Approach • Previous scalable approach resolves 27% of queries • Our approach resolves 82% of queries • 55% of queries are proven thread-local • 27% of queries are observed thread-shared

  45. Running Time Breakdown

  46. Sparsity of Our Abstraction

  47. Feedback from Real-World Deployments • 16 bugs in jTDS (JDBC driver for Microsoft SQL Server) • Before: As far as we know, there are no concurrency issues in jTDS • After: Probably the whole synchronization approach in jTDS should be revised from scratch • 17 bugs in Apache Commons Pool (object pooling library) • Thanks to an audit by Mayur Naik many potential synchronization • issues have been fixed -- Release notes for Commons Pool 1.3 • 319 bugs in Apache Derby (relational database engine) • This looks like very valuable information … could this tool be run • on a regular basis? It is likely that new races could get introduced • as new code is submitted ” “ “ ” “ ” “ ”

  48. Talk Outline • Overview • Seamless Program Partitioning • Automatic Performance Prediction • Scalable Program Verification • Putting It All Together

  49. Talk Outline • Overview • Seamless Program Partitioning • Automatic Performance Prediction • Scalable Program Verification • Putting It All Together

  50. Program Analyses as Building Blocks applied to partitioning,but also applicable for better scheduling, etc. performance analysis partitioning analysis slicing analysis call-graph analysis pointer analysis datarace analysis applied to datarace analysis, but also used for deadlock detection, etc. applied to slicing, but can also yield simpler memory models, etc. thread-escape analysis may-happen-in- parallel analysis CHORD: a versatile program analysis platform[PLDI’11 tutorial]

More Related