
Fault Diagnosis* of Software Systems


Presentation Transcript


  1. Fault Diagnosis* of Software Systems. Rui Maranhão, Dept. of Informatics Engineering, Faculty of Engineering of the University of Porto. FEUP, 13 Jan 2010. (* a.k.a. automatic debugging)

  2. Fault Diagnosis* of Software Systems. Rui Abreu, Software Technology Dept., Delft University of Technology, NL. PARC, 18 Jul 2009. (* a.k.a. automatic debugging)

  3. About the speaker… [career timeline: LESI, UM · PhD, TUD · ST, UU · Philips Research Labs · Siemens, Porto · Ass. prof., FEUP]

  4. PhD defense (4 Nov 2009) • picasaweb.google.com/rui.maranhao

  5. Outline
  • Fault Diagnosis
  • Spectrum-Based Fault Localization
    • Statistics-based
    • Reasoning-based
  • Summary

  6. Software Faults
  • Faults (bugs) have been around since the beginning of computer science
  • They can have serious financial or life-threatening consequences

  7. Fault Diagnosis
  Identify the (SW) component(s) that are the root cause of a failure.
  [diagram: a system of components f1..f5 mapping x to y = f(x,h); the legend distinguishes healthy from faulty components]
  x, y: observation vectors
  f: system function; fj: component functions
  h: system health state vector; hj: component health variables
  Diagnosing a failure means solving the inverse problem h = f^-1(x, y).
  Diagnosis: h4 = fault state; or (h2 and h5) = fault state; or …

  8. Fault Diagnosis Approaches
  [diagram: the same faulty SW diagnosed two ways, by SFL and by MBD; x: fault (bug, defect)]

  9. Model-Based Diagnosis
  [diagram: components f1..f5, each paired with a model M1..M5; the modeled output y' is compared (=?) against the observed y = f(x,h)]
  • Suppose we have component models (Mi) of all fi
  • Then we can infer the location(s) of failure (Model-Based Diagnosis):
    compute y' = M(x,h') and search for h' = h such that y' is consistent with y (pass/fail)

  10. Spectrum-based Fault Localization
  [diagram: components f1..f5, numbered 1..5; an execution trace records which components each run touched]
  • Suppose we only have a trace of the involvement of each fi (plus a pass/fail test oracle)
  • Then we can infer the location(s) of failure (Spectrum-Based Fault Localization):
    correlate the trace with pass/fail test outcomes

  11. MBD vs. SFL
  • MBD
    • Reasoning approach based on (propositional) models
    • High(est) diagnostic accuracy
    • Prohibitive (modeling and/or diagnosis) cost for [embedded] SW
  • SFL
    • Statistical approach based on spectra (traces)
    • Lower diagnostic accuracy: cannot reason over multiple faults
    • No modeling cost (except test oracle) + low diagnosis cost

  12. Outline
  • Fault Diagnosis
  • Spectrum-based Fault Localization
    • Statistics-based
    • Reasoning-based
  • Summary

  13.–18. SFL: Principle (1)–(6)
  [animation over six frames: test runs arrive one by one for a 12-component program; for each component, two counters accumulate how often it was touched in a passing run and how often in a failing run. Legend: not touched / touched, pass / touched, fail]

  19. SFL: Principle (7)
  [final frame of the animation]
  System components are ranked according to their likelihood of causing the detected errors (see the sketch below).
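The counter bookkeeping behind the animation is simple. A minimal sketch in C, assuming one boolean spectrum row per test run; the names update_counters, n_fail, and n_pass are mine, not from the talk:

  #include <stdbool.h>

  #define N_COMPONENTS 12

  /* Per-component counters, as accumulated frame by frame above:
     n_fail[j]: runs that touched component j and failed
     n_pass[j]: runs that touched component j and passed */
  static int n_fail[N_COMPONENTS];
  static int n_pass[N_COMPONENTS];

  /* Fold one test run into the counters: spectrum[j] is true iff
     component j was touched during the run; failed is the verdict. */
  void update_counters(const bool spectrum[N_COMPONENTS], bool failed)
  {
      for (int j = 0; j < N_COMPONENTS; j++) {
          if (spectrum[j]) {
              if (failed) n_fail[j]++;
              else        n_pass[j]++;
          }
      }
  }

Ranking then amounts to sorting components by a similarity coefficient computed from these counters, as slide 22 makes precise.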

  20. SFL Example (1)

  void RationalSort( int n, int *num, int *den ) {        // block c1
      int i, j;
      for ( i = n-1; i >= 0; i-- ) {                      // block c2
          for ( j = 0; j < i; j++ ) {                     // block c3
              if ( RationalGT( num[j], den[j],
                               num[j+1], den[j+1] ) ) {   // block c4
                  swap( &num[j], &num[j+1] );
                  /* swap( &den[j], &den[j+1] ); */       // <== FAULT
              }
          }
      }
  }
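The slide omits the two helpers the sort relies on. A minimal sketch of plausible definitions so the example compiles; the talk does not show them, so treat both bodies as assumptions (in particular, RationalGT is assumed to test num1/den1 > num2/den2 by cross-multiplication with positive denominators):

  /* Hypothetical helper: exchange two ints in place. */
  void swap( int *a, int *b ) {
      int t = *a; *a = *b; *b = t;
  }

  /* Hypothetical helper: true iff n1/d1 > n2/d2.
     Cross-multiplication avoids integer division; assumes d1, d2 > 0. */
  int RationalGT( int n1, int d1, int n2, int d2 ) {
      return n1 * d2 > n2 * d1;
  }

Because the faulty version swaps only the numerators, later comparisons pair each numerator with the wrong denominator, which is the intermediate error shown next.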

  21. SFL Example (2)
  [diagram: how the fault introduces an intermediate error that propagates to the output]

  22. SFL Example (3)

  (hit) spectrum:
         c1   c2   c3   c4   error
  (I1)    1    0    0    0     0   (P)
  (I2)    1    1    0    0     0   (P)
  (I3)    1    1    1    1     0   (P)
  (I4)    1    1    1    1     0   (P)
  (I5)    1    1    1    1     1   (F)
  (I6)    1    1    1    0     0   (P)

  n11:    1    1    1    1     n11: # hits where run is failing
  n10:    5    4    3    2     n10: # hits where run is passing
  n01:    0    0    0    0     n01: # misses where run is failing
  s:     1/6  1/5  1/4  1/3    s = n11 / (n11 + n10 + n01)   (Jaccard similarity coefficient)
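The table can be reproduced mechanically. A minimal sketch in C that hard-codes the six hit spectra and prints the Jaccard coefficient per block (1/6, 1/5, 1/4, 1/3):

  #include <stdio.h>

  #define RUNS  6
  #define COMPS 4

  int main(void)
  {
      /* Hit spectra for runs I1..I6 over blocks c1..c4 (from the table). */
      int hit[RUNS][COMPS] = {
          {1,0,0,0}, {1,1,0,0}, {1,1,1,1},
          {1,1,1,1}, {1,1,1,1}, {1,1,1,0}
      };
      int failed[RUNS] = {0,0,0,0,1,0};   /* only I5 fails */

      for (int j = 0; j < COMPS; j++) {
          int n11 = 0, n10 = 0, n01 = 0;
          for (int i = 0; i < RUNS; i++) {
              if ( hit[i][j] &&  failed[i]) n11++;   /* hit, failing run    */
              if ( hit[i][j] && !failed[i]) n10++;   /* hit, passing run    */
              if (!hit[i][j] &&  failed[i]) n01++;   /* missed, failing run */
          }
          double s = (double)n11 / (n11 + n10 + n01);   /* Jaccard */
          printf("c%d: s = %.4f\n", j + 1, s);
      }
      return 0;
  }

The faulty block c4 gets the highest score (1/3) and would be inspected first.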

  23. Shortcomings of using Statistics

  test suite:
   a  b  x  P/F
   1  0  1   F
   0  1  1   F
   1  0  0   F
   0  1  0   F
   1  1  1   P

  if (a == 1) y = f1(x);
  if (b == 1) y = f2(x);
  if (x == 0) y = f3(y);
  return y;

  f1(x) { /* c1 */ return x * 100; }
  f2(x) { /* c2 */ return x / 100; }
  f3(y) { /* c3 */ return y + 1; }

  24. Similarity-based Approach

        c1   c2   c3   P/F
         1    0    0    1 (F)
         0    1    0    1 (F)
         1    0    1    1 (F)
         0    1    1    1 (F)
         1    1    0    0 (P)

  n11:   2    2    2
  n10:   1    1    0
  n01:   1    1    1
  s:    1/2  1/2  2/3

  c3 is ranked highest instead of c1, c2!

  25. Reasoning-based Approach (1)

   c1  c2  c3   P/F
    1   0   0    1 (F)   c1 must be faulty
    0   1   0    1 (F)   c2 cannot be single fault
    1   0   1    1 (F)   c3 cannot be single fault
    0   1   1    1 (F)   c2, c3 cannot be double fault
    1   1   0    0 (P)

  26. Reasoning-based Approach (2)

   c1  c2  c3   P/F
    1   0   0    1 (F)
    0   1   0    1 (F)   c2 must be faulty
    1   0   1    1 (F)   c1 cannot be single fault
    0   1   1    1 (F)   c3 cannot be single fault
    1   1   0    0 (P)   c1, c3 cannot be double fault

  27. Reasoning-based Approach (3)

   c1  c2  c3   P/F
    1   0   0    1 (F)   c1, c2 can be double fault
    0   1   0    1 (F)
    1   0   1    1 (F)
    0   1   1    1 (F)
    1   1   0    0 (P)

  Summary:
  • c1 and c2 are each implicated, but neither can be a single fault
  • {c1, c2} can be a double fault
  • neither {c1, c3} nor {c2, c3} can be a double fault
  • so {c1, c2} is the only possible diagnosis (subsuming the triple fault {c1, c2, c3})

  28. Idea: Extend SFL with MBD
  • MBD
    • Reasoning approach based on a generic model
    • High(er) diagnostic accuracy
    • Prohibitive (modeling and/or diagnosis) cost for [embedded] SW
  • SFL
    • Statistical approach based on spectra
    • Lower diagnostic accuracy: cannot reason over multiple faults
    • No modeling cost (except test oracle) + low diagnosis cost
  [DX’08, DX’09, SARA’09, IJCAI’09, ASE’09]

  29. Spectrum-Based Reasoning

  Generic component model: ∀j: h_j ⇒ (inp_ok_j ⇒ out_ok_j)

   c1  c2  c3   P/F      conflict set
    1   0   0    1 (F)   {h1}
    0   1   0    1 (F)   {h2}
    1   0   1    1 (F)   {h1, h3}
    0   1   1    1 (F)   {h2, h3}
    1   1   0    0 (P)

  Diagnosis candidates = minimal hitting sets of the conflicts.
  In this case there is only one diagnosis candidate, {c1, c2} => correct diagnosis
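Computing the minimal hitting sets can be sketched compactly. A small C program (my own illustration, not the Zoltar implementation) that encodes each failing run's touched components as a conflict bitmask and keeps every candidate that hits all conflicts and is not a superset of a smaller diagnosis; for the table above it prints only {c1, c2}:

  #include <stdio.h>

  int main(void)
  {
      /* Conflicts = touched components of the failing runs (bit j = c<j+1>):
         {c1}, {c2}, {c1,c3}, {c2,c3}. */
      int conflicts[] = { 0x1, 0x2, 0x5, 0x6 };
      int n_conflicts = 4, n_comps = 3;

      int found[8]; int n_found = 0;
      for (int size = 1; size <= n_comps; size++)              /* smallest first */
          for (int d = 1; d < (1 << n_comps); d++) {
              if (__builtin_popcount(d) != size) continue;     /* GCC/Clang builtin */
              int hits_all = 1, subsumed = 0;
              for (int i = 0; i < n_conflicts; i++)
                  if (!(d & conflicts[i])) hits_all = 0;       /* misses a conflict */
              for (int i = 0; i < n_found; i++)
                  if ((d & found[i]) == found[i]) subsumed = 1; /* superset of a diagnosis */
              if (hits_all && !subsumed) {
                  found[n_found++] = d;
                  printf("diagnosis candidate: {");
                  for (int j = 0; j < n_comps; j++)
                      if (d & (1 << j)) printf(" c%d", j + 1);
                  printf(" }\n");
              }
          }
      return 0;
  }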

  30. Bayesian Probability Ranking
  Many diagnostic candidates dk (e.g., d1 = {c3}, d2 = {c4, c5}, …)
  => Diagnosis = candidates ordered in terms of probability
  Bayes’ rule: Pr(dk | obs_i) = Pr(obs_i | dk) · Pr(dk | obs_{i-1}) / Pr(obs_i)
  A priori: Pr(dk) = p^|dk| · (1 - p)^(M - |dk|), where p is the a priori fault probability of a component
  Intermittent component failure model: gj ∈ [0, 1], the probability that faulty component j still behaves correctly
  System failure model: a run fails when at least 1 involved faulty component fails (Pr = 1 - gj)
  => Pr(obs_i | dk) = 1 - ∏j gj for a failing run; Pr(obs_i | dk) = ∏j gj for a passing run
     (products over the components j of dk involved in the run)

  31. Example Diagnosis

   c1  c2  c3   P/F
    1   1   0    1 (F)       d1 = {c1, c2}
    0   1   1    1 (F)       d2 = {c1, c3}
    1   0   0    1 (F)       …
    1   0   1    0 (P)

  Pr(obs | d1) = (1 - g1·g2) · (1 - g2) · (1 - g1) · g1
  MLE: maximize Pr(obs | d1) w.r.t. g1, g2: g1 = 0.47, g2 = 0.19, Pr(d1) = 0.19

  Pr(obs | d2) = (1 - g1) · (1 - g3) · (1 - g1) · g1·g3
  MLE: maximize Pr(obs | d2) w.r.t. g1, g3: g1 = 0.41, g3 = 0.50, Pr(d2) = 0.04

  => Diagnosis = < {c1, c2}, {c1, c3} >  (i.e., start testing c1 and c2)
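The two likelihood products can be checked numerically. A minimal sketch that only evaluates the closed-form expressions at the g values quoted above (the values themselves come from the talk's MLE optimization, which this sketch does not redo):

  #include <stdio.h>

  int main(void)
  {
      /* d1 = {c1,c2}; per-observation factors, as on the slide:
         (1 1 0, F) -> 1 - g1*g2    (1 0 0, F) -> 1 - g1
         (0 1 1, F) -> 1 - g2       (1 0 1, P) -> g1        */
      double g1 = 0.47, g2 = 0.19;
      double pr_d1 = (1 - g1*g2) * (1 - g2) * (1 - g1) * g1;

      /* d2 = {c1,c3}, analogously (here g1 = 0.41, g3 = 0.50). */
      double h1 = 0.41, g3 = 0.50;
      double pr_d2 = (1 - h1) * (1 - g3) * (1 - h1) * h1 * g3;

      printf("Pr(obs|d1) = %.3f\n", pr_d1);   /* ~0.18 */
      printf("Pr(obs|d2) = %.3f\n", pr_d2);   /* ~0.04 */
      return 0;
  }

After applying Bayes' rule with the priors, d1 = {c1, c2} clearly outranks d2, so c1 and c2 are tested first.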

  32. Implementation (Zoltar)
  [pipeline diagram: system (instrumented to obtain spectrum) → spectrum + pass/fail → diagnostic engine* → diagnosis]
  * Tarantula [GAT], Barinel [TUD]

  33. Diagnostic Work: Metric
  • Wasted effort W
    • Excess work to find the faulty components, as opposed to measuring total work
    • Independent of the number of faults C
  • As an example (traced in the sketch below), suppose
    • an M = 10-component program in which c1 and c2 are faulty
    • D = <{1,3}, {3,5}, {2,4}>
    • then W = 2/10
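A minimal sketch (names mine) that walks the ranked diagnoses, counts inspections of healthy components, and stops once both faults are found; it prints W = 2/10, the wasted inspections being c3 and c5:

  #include <stdio.h>
  #include <stdbool.h>

  #define M 10   /* total number of components */

  int main(void)
  {
      int diag[3][2] = { {1,3}, {3,5}, {2,4} };   /* ranked diagnoses D */
      bool faulty[M + 1] = { false };
      faulty[1] = faulty[2] = true;               /* c1 and c2 are the real faults */

      bool inspected[M + 1] = { false };
      int wasted = 0, found = 0, n_faults = 2;

      for (int k = 0; k < 3 && found < n_faults; k++)
          for (int i = 0; i < 2 && found < n_faults; i++) {
              int c = diag[k][i];
              if (inspected[c]) continue;         /* never inspect twice */
              inspected[c] = true;
              if (faulty[c]) found++;
              else           wasted++;            /* healthy: wasted effort */
          }

      printf("W = %d/%d\n", wasted, M);           /* W = 2/10 */
      return 0;
  }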

  34. Diagnostic Work vs. # Observations (5 faults / 20) [plot]

  35. Case Studies (Siemens set)

  36. Case Studies (PSC/NXP)

  37. Outline
  • Fault Diagnosis
  • Spectrum-based Fault Localization
    • Statistics-based
    • Reasoning-based
  • Summary

  38. Summary
  • Fault diagnosis is a critical success factor in developing SW
  • SFL is a viable approach to SW fault diagnosis
  • The Bayesian reasoning extension yields more accuracy at a marginal increase in complexity
  • The work is appreciated by the scientific community: 28 research papers published and a best demo award @ ASE’09
  • On its way: Java + Eclipse plug-in + graphical display of results
  • Acknowledgments: Arjan van Gemund (TUD), Peter Zoeteweij (ex-TUD), Johan de Kleer (PARC), Wolfgang Mayer (UniSA), Markus Stumptner (UniSA), Rob Golsteijn (Philips/NXP Semiconductors), Hasan Sozer (UTwente), Mehmet Aksit (UTwente), …

  39. [cartoon]
  “Help me! I’ve got a problem. We might lose the picture for a while…”
  “No way, not now! They’re about to score!”
  “I’ll recover for you!”

  40. Questions?

  41. Assignments
  • Debugging functional languages
    • Would these concepts apply to, e.g., Scala?
  • Visualizations for spreadsheets’ diagnostic reports
    • How to automatically detect errors?
  • Mobile Apps: testing and debugging
    • Android; Windows Phone (2 different assignments)
  • SFL for software evolution
