Generating Analyses for Detecting Faults in Path Segments

Wei Le* and Mary Lou Soffa, University of Virginia. *Currently with Rochester Institute of Technology.


Presentation Transcript


  1. Generating Analyses for Detecting Faults in Path Segments. Wei Le* and Mary Lou Soffa, University of Virginia (*currently with Rochester Institute of Technology)

  2. Motivation
     • Static analysis: an integral part of fault detection
     • High code coverage
     • No executables required
     • Find faults early, so cheaper to fix

  3. Challenges of Current Static Analysis
     • Precision: many false positives and little support for diagnosis
     • Scalability: manual annotations sometimes required
     • Generality: hard-coded heuristics, new tools for different types of faults
     • Important to achieve all three

  4. Precision: Path-Sensitive Analyses
     • Heuristics based: ESP [das02] (based on an assumption of typestate fault)
     • Summary based: Saturn [xie07] (lacks interprocedural path-sensitivity)
     • Partially exploring the state space: Prefix [bush00]
     • Exhaustive analysis based on the structure of a program

  5. Framework: Athena
     • Automatically generate analyses from specifications:
       • precise: low false positives and rich diagnostic info
         • interprocedural path-sensitive analysis
         • reports path segments of a fault
       • scalable: only covers code relevant to the fault
         • demand-driven analysis
       • general: data- and control-centric, liveness and safety
         • a specification technique and a generation algorithm

  6. Faults
     • Commonality of the faults (generality)
       • The violations are always observable at certain statements
       • We are able to construct constraints to express violations
     • Locality of a fault (scalability)
       • Only segments of the paths are relevant to the fault
       • Only a limited number of statements on those paths contribute to the fault
       • Fault locality holds for a variety of faults

  7. Athena: Components
     [Figure: component diagram. Generate Analyses: Specification Language, Specification Repository, Parser, Analyzer Generator. Precision and Scalability of the Analyses: Path-Sensitive Demand-Driven Template.]

  8. Athena: Workflow
     • Step 1: Specifying Faults. Figure labels: Definition of a Fault, Information for Detecting the Fault, Spec
     • Step 2: Generating Analysis. Figure labels: Spec, Parser, Syntax trees, Analyzer Generator, Demand-Driven Template, Code modules, Analyzer for the Spec
     • Step 3: Analyzing programs with Generated Analysis. Figure labels: Program, Generated Analysis, Path Segment, Path Classification (Infeasible, Safe, Faulty (severity, root cause), Don't-know)

  9. Components I: Specification and Language
     • Spec: <program point, constraints>, <program point, actions>
     • Language: attributes and operators on attributes
     • Attributes: abstractions on program objects, e.g. len(s)
     • Operators: comparison (>, <), computation (+, -), command (:=)
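The <program point, constraint> pairing can be sketched in C as a pair of predicates over a statement. This is only an illustration under simplifying assumptions: the `Statement` layout, the `FaultSpec` struct, and every function name below are hypothetical and are not taken from Athena; the constraint Size(dst) >= Len(src) follows the buffer overflow query used elsewhere in the slides.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified statement: an opcode name plus two abstract
   attribute values. These names are illustrative, not Athena's. */
typedef struct {
    const char *opcode; /* e.g. "strcpy" */
    size_t dst_size;    /* Size(dst): bytes available in the destination */
    size_t src_len;     /* Len(src): length of the source string */
} Statement;

/* One specification entry: <program point, constraint>. */
typedef struct {
    bool (*at)(const Statement *);   /* program point: where to check */
    bool (*safe)(const Statement *); /* constraint: what must hold there */
} FaultSpec;

typedef enum { NOT_APPLICABLE, SAFE, VIOLATION } Outcome;

static bool at_strcpy(const Statement *s) {
    return strcmp(s->opcode, "strcpy") == 0;
}

static bool size_ge_len(const Statement *s) {
    return s->dst_size >= s->src_len; /* comparison operator on attributes */
}

/* Buffer overflow entry: at strcpy, require Size(dst) >= Len(src). */
static const FaultSpec BUFFER_OVERFLOW = { at_strcpy, size_ge_len };

/* Apply one specification entry to one statement. */
static Outcome check(const FaultSpec *spec, const Statement *s) {
    if (!spec->at(s)) return NOT_APPLICABLE;
    return spec->safe(s) ? SAFE : VIOLATION;
}
```

Calling `check(&BUFFER_OVERFLOW, &stmt)` on a statement that is not a `strcpy` yields `NOT_APPLICABLE`, which mirrors the idea that a constraint is only examined at the program points named by the specification.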

  10. Grammar of the Language
      Specification → Vars VarList DefineFault FaultSigList DetectFault DetectSigList
      VarList → Var*
      Var → VarType namelist;
      VarType → Vbuffer | Vint | Vany | Vptr | ...
      FaultSigList → FaultSigItem <or FaultSigItem>*
      DetectSigList → DetectSigItem <or DetectSigItem>* | # include < ExistentSpec >
      FaultSigItem → CodeSignature ProgramPoint S-Constraint Condition | CodeSignature ProgramPoint L-Constraint Condition
      DetectSigItem → CodeSignature ProgramPoint Update Action
      ProgramPoint → $LangSyntax$ | Condition | $LangSyntax$ && Condition
      Condition → Attribute Comparator Attribute | !Condition | [Condition] | Condition && Condition | Condition || Condition
      Action → Attribute := Attribute | ^ Condition | [Action] | Action && Action | Action || Action | Condition → Action
      Attribute → PrimitiveAttribute(var, ...) | Constant | !Attribute | ¬Attribute | [Attribute] | Attribute ° Attribute | Attribute Op Attribute | min(Attribute, Attribute) | [Attribute, Attribute]
      PrimitiveAttribute → Size | Len | Value | MatchOperand | TMax | TMin | ...
      Constant → 0 | true | false | ...
      Comparator → = | ≠ | > | < | ≥ | ≤ | ∈ | ∉
      Op → + | − | * | ∪ | ∩

  11. Grammar of the Language (repeat of the grammar on slide 10)

  12. Specification
      [Figure: buffer overflow specification]

  13. Component II: Demand-Driven Template
      • Formulate fault detection problems into queries about program facts, e.g., variable relations
      • Scalable: buffer overflow detection [le08]

  14. Demand-Driven Template
      [Figure: worked example on procedure bar(). Program statements: s = (char*)malloc(80); x[10] = '0'; branch strlen(t) < 8 (yes/no); strcpy(s,t); strcat(x,t). Query: size(s) >= len(t). Annotations along the path: size(s) >= len(t) with len(t) < 8; 80/8 >= len(t) with len(t) < 8: safe. Template steps: Raise Queries, Propagate Queries, Update Queries, Evaluate Queries (yes/no), Resolution.]

  15. Demand-Driven Template
      • Rules for propagating queries
        • Interprocedural, path-sensitive, context-sensitive
        • Branch, loop, call, infeasible path
      • Evaluating queries (integer constraints)
        • Algebra rules, inequalities
        • Integer constraint solver
      [Figure: template loop. Raise Queries, Propagate Queries, Update Queries, Evaluate Queries (yes/no).]
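The raise, propagate, update, evaluate loop can be illustrated with a small C sketch for a single straight-line path. It is a deliberately reduced model, not Athena's algorithm: the `Node` encoding, the `Verdict` names, and `resolve_query` are hypothetical, only one path is considered, and the query is fixed to size(s) >= len(t) with sizes compared directly in bytes.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { NODE_OTHER, NODE_ALLOC, NODE_LEN_LT } NodeKind;

typedef struct {
    NodeKind kind;
    long value; /* NODE_ALLOC: bytes allocated for s;
                   NODE_LEN_LT: bound b of the path fact len(t) < b */
} Node;

typedef enum { SAFE, FAULTY, DONT_KNOW } Verdict;

/* The query "size(s) >= len(t)" is raised at the last node of the path and
   propagated backward. Each relevant node updates the query; propagation
   stops as soon as the query can be evaluated. */
Verdict resolve_query(const Node *path, size_t n) {
    bool have_size = false, have_bound = false;
    long size = 0, len_max = 0;

    for (size_t i = n; i-- > 0; ) {
        switch (path[i].kind) {
        case NODE_ALLOC:  /* update: size(s) becomes a constant */
            if (!have_size) { size = path[i].value; have_size = true; }
            break;
        case NODE_LEN_LT: /* update: len(t) < b implies len(t) <= b - 1 */
            if (!have_bound) { len_max = path[i].value - 1; have_bound = true; }
            break;
        default:          /* irrelevant statement: propagate unchanged */
            break;
        }
        if (have_size && have_bound) /* evaluate: all needed facts known */
            return size >= len_max ? SAFE : FAULTY;
    }
    return DONT_KNOW; /* reached the start of the path unresolved */
}
```

Because the walk starts at the statement of interest and stops once the query is resolved, statements before the resolving node are never visited, which is the source of the scalability argument made on the slide.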

  16. Components III: Parser and Code Generator
      [Figure: component diagram as on slide 7]

  17. Parsing Specification (YACC)
      • CodeSignature: GetOp(s) = strcpy
      • S_Constraint: Size(Src1(s)) ≥ Len(Src2(s))
      • Syntax trees: non-leaf nodes are operators, leaves are attributes
      • Tree A (CodeSignature): = with children GetOp and strcpy
      • Tree B (S_Constraint): ≥ with children Size º Src1 and Len º Src2
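A syntax tree whose leaves are attributes and whose inner nodes are operators can be evaluated bottom-up. The C sketch below does this for the constraint Size(Src1(s)) ≥ Len(Src2(s)); the `Operand`, `Statement`, and `Tree` types and all function names are hypothetical stand-ins chosen for the example, and only two operator kinds (composition and ≥) are modeled.

```c
#include <stddef.h>

typedef struct { long size; long len; } Operand;  /* hypothetical abstract operand */
typedef struct { Operand src1, src2; } Statement; /* hypothetical two-operand call */

typedef const Operand *(*Selector)(const Statement *); /* Src1, Src2 */
typedef long (*Measure)(const Operand *);              /* Size, Len */

typedef enum { T_COMPOSE, T_GE } TreeKind;

typedef struct Tree {
    TreeKind kind;
    Measure outer;                   /* T_COMPOSE: outer º inner */
    Selector inner;
    const struct Tree *left, *right; /* T_GE: left >= right */
} Tree;

static const Operand *src1(const Statement *s) { return &s->src1; }
static const Operand *src2(const Statement *s) { return &s->src2; }
static long size_of(const Operand *o) { return o->size; }
static long len_of(const Operand *o) { return o->len; }

/* Evaluate a tree bottom-up: composition nodes apply attribute functions,
   the comparison node combines the values of its two subtrees. */
static long eval(const Tree *t, const Statement *s) {
    if (t->kind == T_COMPOSE)
        return t->outer(t->inner(s));
    return eval(t->left, s) >= eval(t->right, s);
}

/* Builds the tree for Size(Src1(s)) >= Len(Src2(s)) and evaluates it. */
static long strcpy_constraint_holds(const Statement *s) {
    const Tree size_src1 = { T_COMPOSE, size_of, src1, NULL, NULL };
    const Tree len_src2  = { T_COMPOSE, len_of,  src2, NULL, NULL };
    const Tree root      = { T_GE, NULL, NULL, &size_src1, &len_src2 };
    return eval(&root, s);
}
```

Interpreting the tree at run time is a simplification made for the sketch; the slides describe generating code from the tree instead, which avoids the interpretation overhead.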

  18. Code Generation
      Code signature tree: = with children GetOp and strcpy
      • Find the function that implements the semantics of leaf attributes:
        const char *GetOp(statement t) { C_Syntax(t); return t.opcode; }
      • Construct a function that implements the semantics of the tree, based on the semantics of the operators:
        bool IsStrcpy(statement t) { return strcmp(GetOp(t), "strcpy") == 0; }
      • Create the instance of the call: IsStrcpy(n)

  19. Generating Analysis
      [Figure: generation pipeline. Labels: Spec, Parser, Syntax trees, Analyzer Generator, Code modules, Demand-Driven Template, Analyzer for the Spec. Generated code modules are attached to the template steps: Raise Queries: if(isnode(s)) q = raiseQ(s); Update Queries: if(isnode(s)) updateQ(q). Remaining template steps: Propagate Queries, Evaluate Queries (yes/no).]
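One way to picture generated code modules being plugged into a fixed template is a struct of function pointers that the template calls at its raise and update steps. The C sketch below assumes that design; `Stmt`, `Query`, `CodeModules`, and `run_template` are hypothetical names, the walk is over a flat statement array rather than a control-flow graph, and "updating" a query is reduced to counting the update points seen.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *opcode; } Stmt; /* hypothetical statement */

typedef struct {
    const Stmt *raised_at; /* statement where the query was raised */
    int updates;           /* number of update points seen while propagating */
} Query;

/* Code modules that a generator would emit from one specification. */
typedef struct {
    bool (*is_fault_point)(const Stmt *);  /* where to raise a query */
    bool (*is_update_point)(const Stmt *); /* where to update a query */
} CodeModules;

/* Fixed template: raise a query at each fault point, then propagate it
   backward and update it at each update point. Returns the query count. */
size_t run_template(const CodeModules *m, const Stmt *stmts, size_t n,
                    Query *out, size_t cap) {
    size_t count = 0;
    for (size_t i = 0; i < n && count < cap; i++) {
        if (!m->is_fault_point(&stmts[i]))
            continue;
        Query q = { &stmts[i], 0 };           /* raise */
        for (size_t j = i; j-- > 0; )         /* propagate backward */
            if (m->is_update_point(&stmts[j]))
                q.updates++;                  /* update */
        out[count++] = q;
    }
    return count;
}

/* Example modules for a buffer overflow style specification. */
static bool is_strcpy(const Stmt *s) { return strcmp(s->opcode, "strcpy") == 0; }
static bool is_malloc(const Stmt *s) { return strcmp(s->opcode, "malloc") == 0; }
```

Swapping in a different `CodeModules` value changes which fault is analyzed without touching `run_template`, which is the separation between specification-specific code and the shared template that the slide describes.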

  20. Experimental Setup
      • Athena (analyzes C/C++/C#): built with YACC, Phoenix and Disolver

  21. Can We Generate Analyses for Different Faults?
      • Detection: 84 faults of four types from 9 benchmarks, 68 new
      • False positive/negative: 18 false positives, missed 3
      • Path segments: generally relevant to 1-4 procedures; maximum 35 procedures
      • Scalability: apache (268.9 k): 4 hours; ffmpeg (48.1 k): 2.3 hours
      • New faults: many located along the same paths; dynamic tools would halt
      • Main source of imprecision: infeasible paths and pointers
      • Locality helped achieve the scalability; without guidance, manual inspection is hard
      • Code complexity matters; generality does compromise scalability, but the analysis is still scalable

  22. Comparable with Manually Customized Detectors?
      • Heuristics designed for suppressing false positives may hurt the detection rate
      • Lack of interprocedural path-sensitivity
      • Heuristics of applying consistency rules

  23. Related Work
      • Static fault detection: type based, model checking, data flow analysis
      • Path-sensitive fault detection: Prefix, Metal, ESP, Archer, Saturn, Calysto (exhaustive static analysis)
        • Athena is demand-driven, more precise, scalable and general
      • Slicing and other demand-driven analyses
        • Athena is the first to use demand-driven analysis for computing path segments of faults

  24. Conclusions
      • Athena generates demand-driven, path-based, symbolic analyses for detecting specified faults
      • Faults develop along paths but exhibit locality, so demand-driven, path-based analysis is more precise and scalable
      • A specification maps a fault detection problem to constraints on program objects at program points
      • Specifying different faults requires only a limited set of attributes; the expressive power comes from composing the attributes

  25. Thank you and Questions?

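  26. Branch Analysis and Fault Detection
      [Figure: worked example with two query columns. Program statements: p[10]; scanf(%s, t); i = strlen(t); branch i < 10 (yes); strcpy(p,t). Branch analysis: the query Value(i) < 10 becomes Len(t) < 10, then Len(t) < 10 with IsEntry(t): feasible. Fault detection: the query Size(p) ≥ Len(t) at strcpy(p,t), combined with Len(t) < 10, becomes 10 ≥ Len(t): safe.]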
