230 likes | 351 Views
This paper presents a novel dynamic analysis tool, CHECKMATE, designed for detecting both resource and communication deadlocks in concurrent programs. With over 6,500 reported deadlock issues, traditional detection methods often fall short. CHECKMATE addresses these limitations by allowing the verification of all event orderings through advanced data structures without reliance on idiom-based checks. The tool enhances the detection of various deadlock types and reduces false positives, demonstrating effectiveness through applications on multiple Java libraries and benchmarks.
E N D
Pallavi Joshi* MayurNaik† KoushikSen* David Gay‡ *UC Berkeley †Intel Labs Berkeley ‡Google Inc An Effective Dynamic Analysis for Detecting Generalized Deadlocks
Motivation • Today’s concurrent programs are rife with deadlocks • 6,500/198,000 (~ 3%) of bug reports in Sun’s bug database at http://bugs.sun.com are deadlocks • Deadlocks are difficult to detect • Usually triggered non-deterministically, on specific thread schedules • Fixing other concurrency bugs like races can introduce new deadlocks • Our past experience with reporting races: developers often ask for deadlock checker
Motivation • Most of previous deadlock detection work has focused on resource deadlocks • Example // Thread 1 // Thread 2 sync(L1) { sync(L2) { sync(L2) { sync(L1) { …. …. } } } } L1 T1 T2 L2
Motivation • Other kinds of deadlocks, e.g. communication deadlocks, are equally notorious • Example //Thread 1 // Thread 2 if(!b) { b = true; sync(L) { sync(L) { L.wait(); L.notify(); } } } T1 T2 if(!b) b = true notify L wait L b is initially false
Goal • Build a dynamic analysis based tool that • detects communication deadlocks • scales to large programs • has low false positive rate
Our Initial Effort • Take cue from existing dynamic analyses for other concurrency errors • Existing dynamic analyses check for the violation of an idiom • Races • every shared variable is consistently protected by a lock • Resource deadlocks • no cycle in lockgraph • Atomicity violations • atomic blocks should have the pattern (R+B)*N(L+B)*
Our Initial Effort • Which idiom should we check for communication deadlocks?
Our Initial Effort • Recommended usage of condition variables // F1 // F2 sync (L) { sync (L) { while (!b) b = true; L.wait (); L.notifyAll (); assert (b == true); } }
Our Initial Effort • Recommended usage pattern (or idiom) based checking does not work • Example // Thread 1 // Thread 2 sync (L1) sync (L2) while (!b) L2.wait (); sync (L1) sync (L2) L2.notifyAll (); No violation of idiom, but still there is a deadlock!
Revisiting existing analyses • Relax the dependencies between relevant events from different threads • verify all possible event orderings for errors • use data structures to check idioms (vector clocks, lock-graphs etc.) to implicitly verify all event orderings
Revisiting existing analyses • Idiom based checking does not work for communication deadlocks • But, we can still explicitly verify all orderings of relevant events for deadlocks
Trace Program // Thread 1 // Thread 2 if (!b) { b = true; sync (L) { sync (L) { L.wait (); L.notify (); } } } b is initially false T2 T1 lock L wait L lock L notify L unlock L unlock L
Trace Program T2 T1 • Thread t1 { • lock L; • wait L; • unlock L; • } lock L wait L • Thread t2 { • lock L; • notify L; • unlock L; • } lock L notify L unlock L unlock L
Trace Program T2 T1 • Thread t1 { Thread t2 { • lock L; lock L; • wait L; || notify L; • unlock L; unlock L; • } } lock L wait L lock L notify L unlock L unlock L
Trace Program • Built out of only a subset of events • usually much smaller than original program • Throws away a lot of dependencies between threads • could give false positives • but increases coverage
Trace Program : Add Dependencies // Thread 1 // Thread 2 if (!b) { b = true; sync (L) { sync (L) { L.wait (); L.notify (); } } } b is initially false T2 T1 if (!b) lock L wait L unlock L b = true lock L notify L unlock L
Trace Program : Add Dependencies T2 T1 if (!b) • Thread t1 { • if (!b) { • lock L; • wait L; • unlock L; • } • } lock L • Thread t2 { • b = true; • lock L; • notify L; • unlock L; • } wait L unlock L b = true lock L notify L unlock L
Trace Program : Add Power • Use static analysis to add to the predictive power of the trace program • Thread t1 { • if (!b) { • lock L; • wait L; • unlock L; • } • } // Thread 1 // Thread 2 @ !b => L.wait() if (!b) { b = true; sync (L) { sync (L) { L.wait (); L.notify (); } } } b is initially false
Trace Program : Other Errors • Effective for concurrency errors that cannot be detected using an idiom • communication deadlocks, deadlocks because of exceptions, … // Thread 1 // Thread 2 while (!b) { try{ sync (L) { foo(); L.wait (); b = true; } sync (L) { L.notify(); } } } catch (Exception e) {…} b is initially false • can throw an exception
Implementation and Evaluation • Implemented for deadlock detection • both communication and resource deadlocks • Built a prototype tool for Java called CHECKMATE • Experimented with a number of Java libraries and applications • log4j, pool, felix, lucene, jgroups, jruby.... • Found both previously known and unknown deadlocks (17 in total)
Conclusion • CHECKMATE is a novel dynamic analysis for finding deadlocks • both resource and communication deadlocks • Effective on a number of real-world Java benchmarks • Trace program based approach is generic • can be applied to other errors, e.g. deadlocks because of exceptions