Dataflow Analysis for Concurrent Programs using Datarace Detection


Presentation Transcript


  1. Dataflow Analysis for Concurrent Programs using Datarace Detection Ravi Chugh, Jan W. Voung, Ranjit Jhala, Sorin Lerner LBA Reading Group Michelle Goodstein 6/5/08

  2. Outline • Motivation • Overview of Radar • Radar Formalization • Radar Optimizations • Radar(Relay) • Evaluation & Results • Conclusions

  3. Motivation • Want to apply dataflow analysis to concurrent programs without: • Requiring annotations • Escape analysis (loss of precision) • Custom concurrency analysis • Model checking (combinatorial explosion)

  4. Introducing Radar • Scheme for concurrent dataflow analysis • Starts with sequential dataflow analysis • Race detection creates concurrent analysis • Can use already-created race detectors • We’ll see it applied to Relay

  5. Outline • Motivation • Overview of Radar • Radar Formalization • Radar Optimizations • Radar(Relay) • Evaluation & Results • Conclusions

  6. Assumptions • For each procedure, either • Have access to code • Have access to a sound summary • Shared memory is sequentially consistent

  7. Radar’s Key Insights • Adjustability of sequential analysis: • Concurrent dataflow facts are a subset of sequential dataflow facts • “Missing facts” • Facts that can be killed by other threads • Suppose we have a fact about lvalue l • “At line y, l is not null” • Enough to know if another thread can write to l concurrently • “At line z, another thread can write to l”

  8. Radar’s Key Insights • Pseudo-races: • Identify “missing facts” • Remove them from the sequential analysis • Solution: insert a pseudo-read for location l • Ask a race detector: “is there a race at this point for l?” • Yes → Another thread can write. Remove fact. • No → No other thread can write. Retain fact. • Producer/Consumer examples follow • Non-null dataflow analysis • Sequential analysis on left • Facts “killed” by concurrency crossed out in red

  9. First example: non-null facts • Producer–Consumer • Pseudo-read for px->data at line PA • Consumer thread can execute line C5 → Race! • px->data is crossed out at line PA

  10. Second example: non-null facts • Modified producer/consumer • Still race-free, other than perf_ctr • Now, producer acquires/releases lock twice

  11. Second example: non-null facts • Insert pseudo-read at P5 on px->data • Races with C5 write to cx->data • Kills px->data at P5 and wherever it propagates • At P8, px->data is not necessarily non-null • Null pointer dereference! • Note: no data races (except on perf_ctr) • We can detect this!

  12. Outline • Motivation • Overview of Radar • Radar Formalization • Radar Optimizations • Radar(Relay) • Evaluation & Results • Conclusions

  13. Sequential Dataflow Analysis • Representation: nodes in CFG • Flow function F(n,d,p): facts true after point p • n: node, d: incoming dataflow fact, p: program point • lvals(f): lvalues fact f depends on • ThreadKill(p,l): computes whether a race can occur on l at program point p • Adjusted flow function: $F_{adj}(n,d,p) = \{\, f \mid f \in F(n,d,p) \,\wedge\, \forall l \in \mathit{lvals}(f).\ \neg\,\mathit{ThreadKill}(p,l) \,\}$
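
The adjusted flow function can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the encoding of a fact as a pair (description, lvalues it depends on), the function name `adjust`, and the toy `toy_thread_kill` oracle are all hypothetical and do not come from the slides.

```python
from typing import Callable, FrozenSet, Set, Tuple

# Assumed encoding: a fact is (description, lvalues the fact depends on).
Fact = Tuple[str, FrozenSet[str]]


def lvals(fact: Fact) -> FrozenSet[str]:
    """Lvalues that the fact depends on."""
    return fact[1]


def adjust(seq_facts: Set[Fact], point: str,
           thread_kill: Callable[[str, str], bool]) -> Set[Fact]:
    """Keep the sequential facts whose lvalues cannot race at `point`."""
    return {f for f in seq_facts
            if not any(thread_kill(point, l) for l in lvals(f))}


# Hypothetical race oracle: px->data may be written concurrently at P5.
def toy_thread_kill(point: str, lvalue: str) -> bool:
    return (point, lvalue) in {("P5", "px->data")}


seq = {("px != NULL", frozenset({"px"})),
       ("px->data != NULL", frozenset({"px->data"}))}
adjusted = adjust(seq, "P5", toy_thread_kill)  # drops the px->data fact
```

The sequential analysis itself is untouched; only its output at each point is filtered, which is why the concurrent facts are always a subset of the sequential ones.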

  14. Is Radar Sound? • Suppose there is an oracle function O • Given a program point p and a location l • Returns whether a race is possible • Suppose Radar is given a race detector R • Radar is sound if O(p,l) implies R(p,l) • If there is a race, R will report it • R can also return false positives

  15. Outline • Motivation • Overview of Radar • Radar Formalization • Radar Optimizations • Radar(Relay) • Evaluation & Results • Conclusions

  16. Radar Optimizations • Reduce the number of ThreadKill calls • Handle function calls

  17. Reduce ThreadKill calls • Running the race detector on the cross product of program points and lvalues is expensive • Many program points have similar behavior • Each lvalue in a region is either: • Racy for the entire region • Not racy for the entire region • Compute once for the entire region • Region Map: points → “regions”
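
A minimal sketch of the region idea, assuming a region map given as a plain dictionary and a race detector exposed as a function of (region, lvalue). The names `make_region_thread_kill`, `toy_detector`, and `toy_region_map` are hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict, Hashable, List, Tuple


def make_region_thread_kill(
    region_map: Dict[str, Hashable],
    race_in_region: Callable[[Hashable, str], bool],
) -> Callable[[str, str], bool]:
    """Build a ThreadKill that asks the detector once per (region, lvalue)."""
    cache: Dict[Tuple[Hashable, str], bool] = {}

    def thread_kill(point: str, lvalue: str) -> bool:
        key = (region_map[point], lvalue)
        if key not in cache:
            cache[key] = race_in_region(*key)  # the expensive query
        return cache[key]

    return thread_kill


detector_calls: List[Tuple[Hashable, str]] = []


def toy_detector(region: Hashable, lvalue: str) -> bool:
    """Toy detector that records how often it is queried."""
    detector_calls.append((region, lvalue))
    return region == "unlocked" and lvalue == "px->data"


toy_region_map = {"P1": "locked", "P2": "locked",
                  "P5": "unlocked", "P6": "unlocked"}
thread_kill = make_region_thread_kill(toy_region_map, toy_detector)
answers = [thread_kill(p, "px->data") for p in ("P1", "P2", "P5", "P6")]
```

Four program points are queried here, but the detector runs only twice, once per region.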

  18. Incorporating Function Calls • To handle function calls: • Introduce a new kind of region: Interprocedural Summary Region (SumReg) • At a particular call site, approximately summarizes the regions that execution can pass through • To maintain soundness: • Suppose there is a transitively reachable path from a callsite cs to a racy region • The summary region must report that cs is racy

  19. Radar’s Requirements • Race Detection Engine • Region × Lvalue → raciness • Region Map • Points → Race-equivalent Regions • Summary Region Map • Callsites → Summary Regions

  20. Outline • Motivation • Overview of Radar • Radar Formalization • Radar Optimizations • Radar(Relay) • Evaluation & Results • Conclusions

  21. Relay • Static race detection tool • Lockset-based • Works bottom up • Scales to the Linux kernel

  22. Relay • Uses relative lockset analysis: • L+, L- : • L+ : locks definitely acquired since function entry point • L- : locks possibly released since function entry point • Relative lockset for exit point of function is stored as summary of function’s behavior • Approximates effect of function call on locks currently held
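
To make the last bullet concrete, here is one plausible way a callee's relative lockset summary could be applied to the caller's current relative lockset. The composition rule below is an assumption made for illustration, not a formula taken from the slides: it assumes the callee's L+ and L- are disjoint, and the name `apply_summary` is hypothetical.

```python
from typing import FrozenSet, Tuple

# Assumed encoding: (L+, L-) as a pair of frozensets of lock names.
RelLockset = Tuple[FrozenSet[str], FrozenSet[str]]


def apply_summary(held: RelLockset, summary: RelLockset) -> RelLockset:
    """Approximate the caller's relative lockset after a call.

    A lock stays definitely acquired unless the callee may release it,
    and the callee's own acquisitions are added. The possibly-released
    set is updated symmetrically.
    """
    plus, minus = held
    f_plus, f_minus = summary
    new_plus = (plus - f_minus) | f_plus
    new_minus = (minus - f_plus) | f_minus
    return (new_plus, new_minus)


# Caller holds lock "a"; the callee acquires "b" and may release "a".
before = (frozenset({"a"}), frozenset())
callee = (frozenset({"b"}), frozenset({"a"}))
after = apply_summary(before, callee)
```

Because only the summary is consulted, the callee's body does not need to be re-analyzed at each call site, which fits a bottom-up analysis.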

  23. Radar(Relay) • Race Detection Engine • Relay • Region Map • Maps program point → (g, (L+,L-)) • g: function name • (L+,L-): relative lockset summary for function g • Summary Region Map • Function g being called at the call site cs in function h • Computes AllUnlocks(cs) = ∪ L- in g • Suppose Region is (h, (L+,L-)) • Returns (h, (L+ − AllUnlocks(cs), L- ∪ AllUnlocks(cs)))
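
The summary region computation can be sketched with set operations. This sketch assumes that AllUnlocks(cs) is the union of the L- sets found in the callee, and that the returned region removes those locks from L+ and adds them to L-; the operators are partly lost in the transcript, so treat that reading as an assumption. The function names are hypothetical.

```python
from typing import FrozenSet, Iterable, Tuple

# Assumed encoding: a region is (function name, (L+, L-)).
Lockset = Tuple[FrozenSet[str], FrozenSet[str]]
Region = Tuple[str, Lockset]


def all_unlocks(callee_minus_sets: Iterable[FrozenSet[str]]) -> FrozenSet[str]:
    """Union of every possibly-released set (L-) seen in the callee."""
    result: FrozenSet[str] = frozenset()
    for minus in callee_minus_sets:
        result |= minus
    return result


def summary_region(region: Region, unlocks: FrozenSet[str]) -> Region:
    """Weaken the caller's region by the locks the callee may release."""
    func, (plus, minus) = region
    return (func, (plus - unlocks, minus | unlocks))


# Caller h holds locks a and b; somewhere inside g, b or c may be released.
caller_region = ("h", (frozenset({"a", "b"}), frozenset()))
unlocks = all_unlocks([frozenset({"b"}), frozenset({"b", "c"})])
summ = summary_region(caller_region, unlocks)
```

Dropping every lock the callee might release anywhere is conservative: it can only make the call site look more racy, which matches the soundness requirement on summary regions.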

  24. Pseudoreads • Suppose at some program point p fact f holds • RegionMap(p): region (g, (L+,L-)) • For all lvalues l ∈ lvals(f): • Pretend to read l at p with relative lockset (L+,L-) • For any other lvalue m which might be aliased… • Intersection of positive locksets is empty → report race
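
The lockset check behind a pseudo-read can be sketched as follows. This is a simplification under stated assumptions: locksets are treated as already resolved to the set of locks definitely held at each access, whereas Relay works with relative locksets. The `Access` class, `may_race`, and the toy alias oracle are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet


@dataclass(frozen=True)
class Access:
    lvalue: str
    is_write: bool
    held: FrozenSet[str]  # locks definitely held at the access (assumed resolved)


def may_race(a: Access, b: Access,
             may_alias: Callable[[str, str], bool]) -> bool:
    """Two accesses in different threads may race if at least one writes,
    the lvalues may alias, and no common lock protects both."""
    return ((a.is_write or b.is_write)
            and may_alias(a.lvalue, b.lvalue)
            and not (a.held & b.held))


# Toy alias oracle: px->data and cx->data may refer to the same cell.
def toy_may_alias(x: str, y: str) -> bool:
    return {x, y} <= {"px->data", "cx->data"}


consumer_write = Access("cx->data", True, frozenset({"buf_lock"}))
pseudo_read = Access("px->data", False, frozenset())             # no lock held
locked_read = Access("px->data", False, frozenset({"buf_lock"}))
```

With these inputs, `may_race(pseudo_read, consumer_write, toy_may_alias)` holds, so a fact about px->data would be killed at that point, while the read made under `buf_lock` is protected and the fact would be retained.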

  25. Relay with Radar: Implementation • First Pass: Run Relay • Computes the relative lockset associated with each function • Second Pass: Sequential Analysis • Pretend no races exist • Collect all the possible queries about races • Third Pass: Run Relay, Adding Pseudo-reads • Insert a pseudo-access wherever a race query exists • Fourth Pass: Adjusted Sequential Analysis • At each pseudo-access for l, query the race detector • If a race could occur, kill facts depending on l

  26. Outline • Motivation • Overview of Radar • Radar(Relay) • Radar Formalization • Radar Optimizations • Evaluation & Results • Conclusions

  27. Evaluation • Focus on non-null dataflow analysis • Used 4 black boxes to answer race queries • Steensgaard’s pointer analysis • If a value is reachable from a global → true • Radaralias • Region map always returns empty lockset • Answers the question of whether any two values alias • Radar • Optimistic • Always returns false • Unsound, and overly precise

  28. Results

  29. Terminology • Blob nodes: • Many lvalues on the heap are merged into one node by alias analysis • Can lead to false positives when checking null dereferences • Other work shows that heap structures are hard to account for • Next figure excludes “blob nodes” for pointer dereferences • Non-blob dereferences: • Apache: 52% • SSL: 76% • Linux: 71%

  30. Results

  31. Results • Consider gap between Seq and Steensgaard • Check how much is bridged by Radar • With and without locks

  32. Outline • Motivation • Overview of Radar • Radar(Relay) • Radar Formalization • Radar Optimizations • Evaluation & Results • Conclusions

  33. Conclusions • Radar is • Scalable • Not tied to particular concurrency models • Tunable to desired precision • Radar(Relay) • Good precision relative to sequential, Steensgaard • Future Work • More types of analysis • Race detection for other concurrency constructs
