Verifying Dereference Safety via Expanding-Scope Analysis

Verifying Dereference Safety via Expanding-Scope Analysis. Alexey Loginov (GrammaTech, Inc.). Joint work with E. Yahav, S. Chandra, S. Fink (IBM TJ Watson), N. Rinetzky (Tel-Aviv University), and M.G. Nanda (IBM IRL).

Presentation Transcript


  1. Verifying Dereference Safety via Expanding-Scope Analysis • Alexey Loginov (GrammaTech, Inc.) • Joint work with: E. Yahav, S. Chandra, S. Fink (IBM TJ Watson); N. Rinetzky (Tel-Aviv University); M.G. Nanda (IBM IRL)

  2. Why Null-Dereference Analysis? • Common problem • …or symptom of other problems • Null-dereference warning may help in identifying root cause • Relevant to all software • Specification is obvious (absence of NPE) • Requires no user interaction

  3. Why Sound Null-Dereference Analysis? • Safety guarantees are important in some domains • Results can become an in-code specification, e.g., via JSR 305 • Annotations can help with code understanding • Annotations can simplify future analyses (e.g., after modifications) • Precise and efficient sound analysis is challenging • Lessons carry over to other static analyses

  4. Example answers expected
       class A {
         final A a = new A();
         static main() {
           B b = new B();
           initB(b);
           a.foo(b);        // okay
         }
         foo(B b) {
           b.f.fun();       // okay
           b.f.f.gun();     // null-deref.
         }
         static initB(B b) {
           b.f = new F();   // okay
           b.f.f = null;    // okay
         }
       }
     • Interprocedural information is needed often • Allocations in callers (e.g., new B()) common • Allocations in callees (e.g., new F()) common
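The slide's listing is abbreviated: it omits return types and does not show the bodies of B and F. A compilable rendering might look like the sketch below. The class name DerefExample, the bodies of B and F, the static modifier on a, and the run() helper are all assumptions added for illustration; they are not taken from the slides.

```java
// Hypothetical class bodies: the slide only shows which fields and methods are used.
class F {
    F f;              // left null by initB in the example
    void fun() { }
    void gun() { }
}

class B {
    F f;
}

class DerefExample {
    // Assumption: made static so the static entry point can use it.
    static final DerefExample a = new DerefExample();

    static void initB(B b) {
        b.f = new F();   // okay: every caller passes a freshly allocated B
        b.f.f = null;    // okay: b.f was just allocated
    }

    void foo(B b) {
        b.f.fun();       // okay: initB assigned b.f
        b.f.f.gun();     // null dereference: initB set b.f.f to null
    }

    // Runs the slide's main() body and reports whether the dereference failed.
    static boolean run() {
        B b = new B();
        initB(b);
        try {
            a.foo(b);    // okay: a is a non-null final field
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }
}
```

Deciding that `b.f.fun()` is safe requires knowing what initB did (a callee allocation) and that main passed a fresh B (a caller allocation), which is the interprocedural information the slide refers to.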

  5. Common approaches • Most existing tools perform intraprocedural analysis • Have to make assumptions about callers/callees • Option 1: pessimistic assumptions about callers/callees • Result: a sea of false alarms

  6. Results of pessimistic intraproc. analysis
       class A {
         final A a = new A();
         static main() {
           B b = new B();
           initB(b);
           a.foo(b);        // null deref.
         }
         foo(B b) {
           b.f.fun();       // two null derefs.
           b.f.f.gun();     // null deref.
         }
         static initB(B b) {
           b.f = new F();   // null deref.
           b.f.f = null;    // okay
         }
       }
     • Reports four false alarms • Only real error is on line 10

  7. Common approaches • Most existing tools perform intraprocedural analysis • Have to make assumptions about callers/callees • Option 2: optimistic assumptions about callers/callees • Result: missing real errors (catching the most glaring ones)

  8. Results of optimistic intraproc. analysis
       class A {
         final A a = new A();
         static main() {
           B b = new B();
           initB(b);
           a.foo(b);        // okay
         }
         foo(B b) {
           b.f.fun();       // okay
           b.f.f.gun();     // okay
         }
         static initB(B b) {
           b.f = new F();   // okay
           b.f.f = null;    // okay
         }
       }
     • Misses the real error on line 10

  9. Common approaches • Most existing tools perform intraprocedural analysis • Have to make assumptions about callers/callees • Option 3: mostly optimistic assumptions • Detects inconsistencies in programmer’s beliefs • Test x == null: belief that x could be null before test • Dereference of x without a test: belief that x cannot be null • Allow analysis to dismiss assumptions contradicted by beliefs • Result: missing real errors, reporting safe dereferences as unsafe • Generally, few false alarms but many missed errors • Same result as option 2 (optimistic assumptions) in our example
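To make the notion of "beliefs" concrete, here is a small illustration (not taken from the slides; the class and method names are hypothetical). The first method tests its argument against null and then dereferences it unguarded, which is the kind of inconsistency a belief-based analysis looks for. The second method never tests its argument, so there is no contradiction to detect.

```java
class BeliefExample {
    // The test expresses the belief "s may be null"; the unguarded
    // dereference after it expresses the belief "s cannot be null".
    static int inconsistent(String s) {
        if (s == null) {
            System.err.println("s is null");
        }
        return s.length();   // contradicts the belief implied by the test
    }

    // No test anywhere: the only belief is "s cannot be null", so a mostly
    // optimistic analysis has nothing to contradict, even if a caller passes null.
    static int unchecked(String s) {
        return s.length();
    }
}
```

The slides' running example resembles `unchecked`: foo never tests b.f.f, so this option finds nothing to report there, matching the result of option 2.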

  10. Prospects for interprocedural analysis • Whole-program analysis cannot scale to large software • Majority of instructions are relevant to null-dereference analysis • Can’t prune down program to a small relevant subset • Need mechanism to break down a program’s complexity

  11. Expanding-Scope Analysis • Holy Grail • Cost: INTRAprocedural analysis • Precision: INTERprocedural (whole-program) analysis • Staged approach • Analyze dereferences with limited interprocedural context • Verify dereferences with the least amount of context • Increase interprocedural context for harder cases • In simplest form • Start with local analysis (with pessimistic assumptions) • Verify some dereferences without considering context • Consider remaining dereferences with extra level of context • Verify some dereferences within a call subtree of immediate callers • … • We refer to individual analyses as Limited-Scope Analyses
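The staged approach above can be sketched as a simple driver loop. This is only a minimal sketch under stated assumptions: the name ExpandingScopeDriver, the generic dereference type D, and the verifiesAtScope callback (standing in for one limited-scope analysis) are hypothetical and do not describe the tool's actual interfaces.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.BiPredicate;

class ExpandingScopeDriver {
    /**
     * Tries to verify each dereference with the least context first.
     * verifiesAtScope.test(d, k) is a placeholder for a limited-scope
     * analysis of dereference d with k levels of calling context.
     * Returns the dereferences still unverified at the scope limit.
     */
    static <D> Set<D> unverifiedAfter(Collection<D> derefs, int maxScope,
                                      BiPredicate<D, Integer> verifiesAtScope) {
        Set<D> remaining = new LinkedHashSet<>(derefs);
        for (int scope = 0; scope <= maxScope && !remaining.isEmpty(); scope++) {
            final int s = scope;
            // Only the harder cases left over from the previous stage are re-analyzed.
            remaining.removeIf(d -> verifiesAtScope.test(d, s));
        }
        return remaining;
    }
}
```

Because verified dereferences leave the work set, the more expensive, wider-scope stages see fewer and fewer of them.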

  12. Expanding-Scope Analysis [Figure: diagram in which nodes labeled f surround a dereference site "… f.foo() …"]

  13. Expanding-Scope Analysis
     [Figure: call graph of the running example, one box per method]
       main:   B b = new B(); initB(b); a.foo(b);
       initB:  b.f = new F(); b.f.f = null;
       foo:    b.f.fun(); b.f.f.gun();

  14. Abstract Domain • Product of three abstract domains • Abstract domain for may-alias analysis • Implementation: flow- & context-insensitive Andersen-style • Abstract domain for must-alias analysis • Implementation: demand-driven (based on def-use chains) • Set APnn of non-null access paths • Access paths denote l-value expressions: • (VarId | StaticFieldId).InstanceFieldId* • Finiteness of domain guaranteed by (parameterized) bounds on • Size of APnn • Maximal length of access paths in APnn • Only the final component (set of non-null access paths APnn) changes
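The third component of the domain, the bounded set of non-null access paths, can be sketched as follows. The class names AccessPath and NonNullPaths, the parse helper, and the choice to measure path length as the number of instance fields are assumptions made for illustration; the slides give only the grammar and the two bounds.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// An access path: (VarId | StaticFieldId).InstanceFieldId*
final class AccessPath {
    final String root;          // VarId or StaticFieldId
    final List<String> fields;  // sequence of instance fields

    AccessPath(String root, List<String> fields) {
        this.root = root;
        this.fields = Collections.unmodifiableList(new ArrayList<>(fields));
    }

    // Hypothetical helper: treats the first dot-separated component as the root.
    static AccessPath parse(String text) {
        String[] parts = text.split("\\.");
        return new AccessPath(parts[0], Arrays.asList(parts).subList(1, parts.length));
    }

    @Override public boolean equals(Object o) {
        return o instanceof AccessPath
            && root.equals(((AccessPath) o).root)
            && fields.equals(((AccessPath) o).fields);
    }

    @Override public int hashCode() {
        return 31 * root.hashCode() + fields.hashCode();
    }
}

// The set APnn, kept finite by bounds on its size and on path length.
final class NonNullPaths {
    private final int maxPaths;
    private final int maxLength;   // assumed to count instance fields
    private final Set<AccessPath> paths = new LinkedHashSet<>();

    NonNullPaths(int maxPaths, int maxLength) {
        this.maxPaths = maxPaths;
        this.maxLength = maxLength;
    }

    // Returns false when a bound forces the fact to be dropped.
    boolean add(AccessPath p) {
        if (paths.contains(p)) return true;
        if (p.fields.size() > maxLength || paths.size() >= maxPaths) return false;
        return paths.add(p);
    }

    boolean isKnownNonNull(AccessPath p) {
        return paths.contains(p);
    }
}
```

In this sketch, dropping a path when a bound is hit only loses precision: a path absent from the set is treated as possibly null, so the dereference stays unverified rather than being wrongly marked safe.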

  15. Transfer Functions (statements) Let [symbol lost in extraction] = InstanceFieldId* (sequences of instance fields)

  16. Transfer Functions (conditions)

  17. Staged Analysis in SALSA (Scalable Analysis via Lazy Scope expAnsion) • Real OO applications (e.g., web applications) have wide call graphs • High scope limits are too expensive to analyze • New stages help stave off the need for high scope limits • Pruning • Verifies dereferences of (non-null) final and stationary fields • Special local (scope-0) analyses • Caller-guarantee analysis (top-down in call graph) • Propagates callers’ guarantees to callees • E.g., for references passed as arguments down deep call chains • Callee-guarantee analysis (bottom-up in call graph) • Propagates callees’ guarantees up to callers • E.g., for field initializations in deep initialization call chains
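The slides do not spell out how callers' guarantees are combined. One plausible reading, sketched below under that assumption, is that a parameter is treated as non-null inside a callee only if every known call site passes a non-null argument for it. The class name CallerGuaranteeSketch and the boolean-array encoding of call-site facts are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class CallerGuaranteeSketch {
    /**
     * callSites holds one boolean[] per call site of the method; entry i is
     * true when the argument for parameter i is known non-null at that site.
     * Returns, per parameter, whether all call sites guarantee non-nullness.
     */
    static boolean[] nonNullParams(int paramCount, List<boolean[]> callSites) {
        boolean[] guaranteed = new boolean[paramCount];
        if (callSites.isEmpty()) {
            return guaranteed;            // no known callers: assume nothing
        }
        Arrays.fill(guaranteed, true);
        for (boolean[] site : callSites) {
            for (int i = 0; i < paramCount; i++) {
                guaranteed[i] &= site[i]; // one unsafe call site removes the guarantee
            }
        }
        return guaranteed;
    }
}
```

Applied top-down in the call graph, a fact established in a callee can in turn feed that callee's own call sites, which is how a guarantee could travel down a deep call chain without analyzing the whole chain at once.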

  18. Staged Analysis in SALSA (Scalable Analysis via Lazy Scope expAnsion)
     [Figure: pipeline of analysis stages, in order]
       pruning
       limited-scope data-flow analyses:
         caller-guarantee
         callee-guarantee
         scope-1 (subtrees of depth 1 from parents)
         scope-2 (subtrees of depth 2 from grandparents)
         …
       symbolic
       output: high-priority and low-priority warnings

  19. Steps of staged interproc. analysis
     • Pruning (final & stationary fields) • Limited-scope analysis • Scope-0 (local analysis): caller-guarantee, callee-guarantee • Scope-1 analysis
       class A {
         final A a = new A();
         static main() {
           B b = new B();
           initB(b);
           // b.f ∈ APnn
           a.foo(b);
         }
         foo(B b) {
           // b ∈ APnn
           b.f.fun();
           b.f.f.gun();
         }
         static initB(B b) {
           // b ∈ APnn
           b.f = new F();
           b.f.f = null;
         }
       }

  20. Experimental results • 21 (mostly open-source) applications • ~3K-465K bytecodes; ~300-37K dereferences • Avg: ~90% of dereferences verified soundly and automatically • ~8% dismissed by Pruning • ~77% dismissed by caller-guarantee analysis • ~5% dismissed by remaining stages • Final scope limit: between 2 and 5 (chosen heuristically) • Diminishing returns after local analyses (caller-/callee-guarantee) • Higher scope limits useful in the absence of caller/callee guarantees • Max. access-path length: 2 for all but four applications • Higher access-path lengths had no effect for most applications • Helped C-like applications (direct field dereferences without getters)

  21. Experimental results • Expected many false alarms due to simple abstract domain • Implemented heuristic symbolic path-validity checking • This phase selected ~20% as high-priority warnings • Surprisingly low incidence of false alarms due to path-correlation • Biggest domain shortcoming: not tracking access-path types • Causes unnecessarily high cost of verifying certain dereferences • Includes too many irrelevant code portions when verifying a dereference • Produces false alarms due to examining type-infeasible paths • Results are encouraging for the simplicity of the domain

  22. Tool-User Interaction • The output includes suggested annotations • Ordered by the number of warnings guaranteed to be dismissed • Actual number would require an alternate abstract domain • Current annotation options • Field f is non-null • Parameter p or return value of method foo() is non-null • User may choose to accept some annotations • We studied annotations for 8 benchmarks with high warning counts • A few hours' effort on unfamiliar code • Result: 30% decrease in warning counts
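As an illustration of the three annotation options (field, parameter, return value), the sketch below declares a local stand-in annotation so that it compiles without extra libraries. The names NonNullSketch and AnnotatedNode are hypothetical; a real project would more likely use a JSR 305-style annotation from a library. The runtime check is one way an accepted annotation could be backed by an assertion.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Objects;

// Hypothetical stand-in for a non-null annotation, declared locally.
@Retention(RetentionPolicy.CLASS)
@Target({ElementType.FIELD, ElementType.PARAMETER, ElementType.METHOD})
@interface NonNullSketch { }

class AnnotatedNode {
    @NonNullSketch final String label;            // "Field f is non-null"

    AnnotatedNode(@NonNullSketch String label) {  // "Parameter p is non-null"
        // Runtime check backing the accepted annotation.
        this.label = Objects.requireNonNull(label, "label");
    }

    @NonNullSketch String describe() {            // "Return value is non-null"
        return "node:" + label;
    }
}
```

With the field annotated, every dereference of `label` can be dismissed in one step, which is why a single accepted annotation can remove several warnings at once.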

  23. Summary • Novel expanding-scope analysis • Applicable to multiple abstract domains • Scalable and precise null-dereference analysis • Staged analysis makes a simple abstract domain effective • Vision: improve programs’ specifications and robustness • Cleanse programs by examining warnings and suggested annotations • Check accepted annotations with assertions or symbolic techniques • Extend the program’s specification and analyzability via annotations
