
MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs

Shan Lu (shanlu@cs.uiuc.edu). Shan Lu, Soyeon Park, Chongfeng Hu, Xiao Ma, Weihang Jiang, Zhenmin Li, Raluca A. Popa, and Yuanyuan Zhou. University of Illinois.


Presentation Transcript


  1. MUVI: Automatically Inferring Multi-Variable Access Correlations and Detecting Related Semantic and Concurrency Bugs Shan Lu (shanlu@cs.uiuc.edu) Shan Lu, Soyeon Park, Chongfeng Hu, Xiao Ma, Weihang Jiang, Zhenmin Li, Raluca A. Popa, and Yuanyuan Zhou University of Illinois http://opera.cs.uiuc.edu

  2. Bugs are bad! • Software bugs are costly! • Account for 40% of system failures [Marcus2000] • Cost US economy $59.5 billion annually [NIST] • Techniques to improve program correctness are desired

  3. Software bug categories • Memory bugs • Improper memory accesses and usage • Well studied, with effective detection tools • Semantic bugs • Violations of design requirements or programmer intentions • Biggest part (~80%*) of software bugs • No silver bullet • Concurrency bugs • Wrong synchronization in concurrent execution • Increasingly important as concurrent programs become pervasive • Hard to detect *Have Things Changed Now? An Empirical Study of Bug Characteristics in Modern Open Source Software [ASID’06]

  4. An important type of semantic information • Software programs contain many variables • Variables are NOT isolated • Semantic bonds exist among variables • Correct programs consistently access correlated variables [Figure: variables s, t, u, v, w, x, y, z linked by variable access correlations]

  5. Variable correlation in programs • Semantic correlation widely exists among variables • MySQL, constraint specification: class THD { … char* db; int db_length; } • Linux, different representation: struct net_device_stats { … long rv_packets; long rv_bytes; } • Linux, different aspects: struct fb_var_screeninfo { … int red_msb; int blue_msb; int green_msb; int transp_msb; } • MySQL, implementation demand: struct st_test_file * cur_file; struct st_test_file * file_stack;

  6. Variable access correlation (constraint) • Maintaining correlation usually needs consistent access • db and db_length: write ⇒ access* • rv_packets and rv_bytes: write ⇒ write • red/…/transp: access ⇒ access • file_stack and cur_file: write ⇒ write • General form of a variable access correlation: A1(x) ⇒ A2(y), where A1 and A2 are each one of access, read, or write • *access: read or write

  7. Violating the correlations leads to bugs: inconsistent update bugs • Programmers may forget to access correlated variables • A type of semantic bug not handled by previous tools • Mostly consistent access to correlated variables: correct • Inconsistent access: BUG! • Confirmed by Linux developers • More examples of inconsistent update bugs are in our paper.

  8. Violating the correlations leads to bugs (ii): multi-variable concurrency bugs (Mozilla) • struct JSCache { … JSEntry table[SIZE]; bool empty; } • Thread 1: js_FlushPropertyCache ( … ) { memset ( cachetable, 0, SIZE); … cacheempty = TRUE; } • Thread 2: js_PropertyCacheFill ( … ) { cachetable[indx] = obj; … cacheempty = FALSE; } • Lock annotations on the slide: lock ( T ) / unlock ( T ) and lock ( E ) / unlock ( E ) in each thread, marked BUG • Programmers may forget to synchronize concurrent accesses to correlated variables • This is NOT a traditional data race bug • Bug occurs even if accesses to each single variable are well synchronized

  9. Our contribution • A technique to automatically infer variable access correlation • Bug detection based on variable access correlation • Inconsistent-update semantic bugs • Multi-variable concurrency bugs • Found correlations and new bugs in real-world applications (Linux device drivers, Mozilla, MySQL, Httpd) • > 6000 variable correlations • 39 new inconsistent-update semantic bugs • 4 new multi-variable concurrency bugs from Mozilla

  10. Outline • Motivation • What is variable access correlation • MUVI variable access correlation inference • MUVI bug detection • Inconsistent-update semantic bug detection • Multi-variable concurrency bug detection • Evaluation • Conclusions

  11. Basic idea of correlation inference • Our target: access correlation A1(x) ⇒ A2(y) • Our inference method: statistically infer access correlation based on variable access patterns in source code • Assumption: mature program, mostly correct • x and y appear together many times • x and y seldom appear separately • How to judge "together"? • Our metric: static code distance within a function scope • Our paper talks about other potential metrics • How to do this efficiently?

  12. Frequent itemset mining • A common data mining technique • Itemset: a set of items (no order) • E.g. (v, w, x, y, z) • Sub-itemset: • E.g. (w, y) • Itemset database • E.g. (v, w, x, y, z), (v, w, y, z, s), (v, w, y, t), (v, x, m, n) • Goal: find frequent sub-itemsets in an itemset database • Support: number of appearances • E.g. support of (w, y) is 3 • Frequent: support > threshold
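
The support counting on this slide can be sketched in a few lines of Python. This is a minimal illustration, not MUVI's miner: it only counts pairs by brute force, the function name `frequent_pairs` and the threshold value 2 are assumptions made for the example, and a real frequent itemset miner would avoid enumerating every combination.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(database, threshold):
    """Return {pair: support} for every pair whose support exceeds threshold."""
    support = Counter()
    for itemset in database:
        # Sort so each unordered pair has one canonical tuple form.
        for pair in combinations(sorted(set(itemset)), 2):
            support[pair] += 1
    return {pair: n for pair, n in support.items() if n > threshold}


# The itemset database shown on the slide.
database = [
    {"v", "w", "x", "y", "z"},
    {"v", "w", "y", "z", "s"},
    {"v", "w", "y", "t"},
    {"v", "x", "m", "n"},
]

# Hypothetical threshold: a pair is frequent if its support is greater than 2.
pairs = frequent_pairs(database, threshold=2)
print(pairs[("w", "y")])  # support of (w, y), matching the slide's example
```

With this threshold only (v, w), (v, y), and (w, y) survive, each with support 3.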

  13. Flowchart of variable correlation inference • Source files → Pre-processing → Itemset database → Mining → Frequent variable sets → Post-processing → Variable access correlation • Two steps on the chart are marked "How?"

  14. MUVI inference algorithm (pre-process) • What is an item? • A variable • What is an itemset? • A function • What to put into an itemset? • Accessed variables • Access type (read/write) • Program source code → ? → Itemset database

  15. MUVI inference algorithm (pre-process) • Input: program • Output: an itemset database • Flow-insensitive, inter-procedural analysis • Consider global variables and structure-typed variables • Also consider variables accessed in callee functions • Example program: int x; f1 ( ) { read x; } f2 ( ) { S t; write t.y; } int z; f3 ( ) { read z; f1 ( ); f2 ( ); } • Resulting database: f1: {read, x} • f2: {write, S::y} • f3: {read, z}, {read, x} (from f1), {write, S::y} (from f2) • …
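
As a rough sketch of the pre-processing step, the following Python builds the itemset database for the slide's f1/f2/f3 example from already-extracted access facts. The input dictionaries and the name `build_itemsets` are assumptions for illustration; the slide does not say how MUVI extracts accesses from source code or how deep it follows callees, so this version simply follows every call transitively.

```python
def build_itemsets(direct_accesses, calls):
    """Map each function to the (access_type, variable) items it and its callees touch."""

    def collect(func, seen):
        if func in seen:  # guard against recursive call cycles
            return set()
        seen.add(func)
        items = set(direct_accesses.get(func, ()))
        for callee in calls.get(func, ()):
            items |= collect(callee, seen)
        return items

    return {func: collect(func, set()) for func in direct_accesses}


# Accesses written directly in each function body (from the slide's example).
direct_accesses = {
    "f1": {("read", "x")},
    "f2": {("write", "S::y")},  # t.y is recorded by structure type and field
    "f3": {("read", "z")},
}
# Call graph: f3 calls f1 and f2.
calls = {"f3": ["f1", "f2"]}

itemsets = build_itemsets(direct_accesses, calls)
for func in sorted(itemsets):
    print(func, sorted(itemsets[func]))
```

The itemset for f3 ends up containing the accesses of f1 and f2 as well as its own read of z.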

  16. MUVI inference algorithm (post-process) • Input: frequent variable sets (x, y), which appear together in many functions • Pruning • What if x and y appear separately many times? • Prune out low confidence (conditional probability) pairs • What if x is too popular, e.g. stderr, stdout? • Categorize based on access type • write(x) ⇒ write(y)? Or write(x) ⇒ read(y)? etc. • Output: variable correlation A1(x) ⇒ A2(y)
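
The confidence used for pruning can be read as a conditional probability: of the functions that contain x, what fraction also contain y. A minimal sketch follows, reusing the small database from slide 12 rather than real program data; the function name `confidence` is an assumption, and the slide does not give the cutoff MUVI uses.

```python
def confidence(database, x, y):
    """Fraction of itemsets containing x that also contain y."""
    with_x = [itemset for itemset in database if x in itemset]
    if not with_x:
        return 0.0
    return sum(1 for itemset in with_x if y in itemset) / len(with_x)


database = [
    {"v", "w", "x", "y", "z"},
    {"v", "w", "y", "z", "s"},
    {"v", "w", "y", "t"},
    {"v", "x", "m", "n"},
]

print(confidence(database, "w", "y"))  # w never appears without y
print(confidence(database, "v", "x"))  # v often appears without x
print(confidence(database, "x", "v"))  # but x never appears without v
```

Confidence is directional, which is why a pair can yield x ⇒ y without yielding y ⇒ x.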

  17. Outline • Motivation • MUVI variable access correlation inference • MUVI bug detection • Inconsistent-update semantic bug detection • Multi-variable concurrency bug detection • Evaluation • Conclusions

  18. Inconsistent-update bug detection • Step 1: get all write(x) ⇒ acc(y) correlations • Step 2: get all violations of the above correlations • Step 3: prune out unlikely bugs • Code analysis to check caller and callee functions • Example: write (fb_var_screeninfo::blue_msb) ⇒ access (fb_var_screeninfo::transp_msb), #support = 11, #violation = 1 (function neofb_check_var): an inconsistent-update bug
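
Step 2 can be sketched as a scan over the itemset database for functions that write x but never access y. The code below is an illustration under assumptions: `find_violations` is a made-up name, the two supporting functions are hypothetical stand-ins (the slide reports 11 real ones but does not list them), and the pruning of step 3 is not modeled.

```python
def find_violations(itemsets, x, y):
    """Return functions that write x but neither read nor write y."""
    violations = []
    for func, items in itemsets.items():
        writes_x = ("write", x) in items
        accesses_y = any(var == y for _, var in items)
        if writes_x and not accesses_y:
            violations.append(func)
    return sorted(violations)


BLUE = "fb_var_screeninfo::blue_msb"
TRANSP = "fb_var_screeninfo::transp_msb"

# Toy database: two hypothetical functions follow the correlation, one does not.
itemsets = {
    "hypothetical_driver_a": {("write", BLUE), ("write", TRANSP)},
    "hypothetical_driver_b": {("write", BLUE), ("read", TRANSP)},
    "neofb_check_var": {("write", BLUE)},
}

print(find_violations(itemsets, BLUE, TRANSP))
```

Every reported function is only a candidate; the slide's step 3 then checks callers and callees before calling it a bug.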

  19. Multi-variable concurrency bug detection: MUVI lock-set algorithm • Original algorithm • Look for common locks among conflicting accesses to each shared variable • MV lock-set algorithm • Look for common locks among conflicting accesses to each shared variable and their correlated accesses
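
The difference between the two checks can be sketched on the Mozilla example from slide 8. This is a heavily simplified illustration: it treats every recorded access as conflicting and ignores threads and access types, the name `common_locks` is made up, and the assumption that `table` is guarded by lock T and `empty` by lock E is one reading of the slide's lock annotations.

```python
def common_locks(accesses, variables):
    """Intersect the lock sets held at every access to any of the given variables."""
    locksets = [set(held) for var, held in accesses if var in variables]
    if not locksets:
        return set()
    return set.intersection(*locksets)


# (variable, locks held) for the accesses in both threads; locking is assumed.
accesses = [
    ("table", {"T"}),  # thread 1: memset of the cache table
    ("empty", {"E"}),  # thread 1: empty = TRUE
    ("table", {"T"}),  # thread 2: table[indx] = obj
    ("empty", {"E"}),  # thread 2: empty = FALSE
]

# Original lock-set view: each variable alone has a common lock, so no report.
print(common_locks(accesses, {"table"}))
print(common_locks(accesses, {"empty"}))

# Multi-variable view: the correlated pair shares no lock, so it is reported.
print(common_locks(accesses, {"table", "empty"}))
```

An empty intersection for the correlated pair is the signal that the two variables can be updated non-atomically even though neither has a classic data race.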

  20. Multi-variable concurrency bug detection: other MUVI extension algorithms • MUVI happens-before algorithm • Original: check the happens-before relation among conflicting accesses to each single variable • MUVI: check the happens-before relation among conflicting accesses to each single variable and correlated accesses • Other extensions • Extending hybrid race detection • Extending atomicity violation bug detection

  21. Outline • Motivation • MUVI variable access correlation inference • MUVI bug detection • Inconsistent-update semantic bug detection • Multi-variable concurrency bug detection • Evaluation • Conclusions

  22. Methodology • For variable correlation and inconsistent-update bug detection: • Linux (device driver) • Mozilla • MySQL • PostgreSQL • All latest versions • For multi-variable concurrency bug detection: • Five existing real bugs from Mozilla and MySQL • Found four new multi-variable concurrency bugs during the detection process

  23. Results on correlation inference • Macro, inline functions • coincidence

  24. Inconsistent-update bug detection results • Semantic exceptions • Wrong correlations • No future read access

  25. Multi-variable concurrency bug detection results • MV-Happens-Before has similar results • Variables are conditionally correlated • The correlation is missed by MUVI

  26. Multi-variable concurrency bug detection results • 4 new multi-variable concurrency bugs detected! Wrong result!

  27. Conclusion • Variable access correlations can be inferred • Variable access correlation is important • Help detect two types of bugs • Other usage • Provide specifications to ease programming • Provide hints for assigning locks or TMs • E.g. AtomicSet, AutoLocker, Colorama

  28. Related work • Program specification inference • [ErnstICSE00], [EnglerSOSP01], [KremenekOSDI06], [LiblitPLDI03], [WhaleyISSTA02], [YangICSE06], etc. • Code pattern mining • [LiOSDI04], [LiFSE05], [LivshitsFSE05], etc. • Concurrency bug detection • [ChoiPLDI02], [EnglerSOSP03], [FlanaganPOPL04], [SavageTOCS97], [Praun01], [XuPLDI05], [YuSOSP05], etc. • Techniques for easing concurrent programming • [Harris03], [HerlihyISCA93], [McCloskeyPOPL06], [Rajwar02], [Hammond04], [Moore06], [Rossbach07], etc.

  29. Acknowledgement • Prof. Stefan Savage (shepherd) • Anonymous reviewers • Prof. Liviu Iftode • GOOGLE student travel grant • NSF, DOE, Intel research grants

  30. Thanks! http://opera.cs.uiuc.edu
