Verifying Dereference Safety via Expanding-Scope Analysis

Alexey Loginov (GrammaTech, Inc.)
Joint work with: E. Yahav, S. Chandra, S. Fink (IBM TJ Watson), N. Rinetzky (Tel-Aviv University), M.G. Nanda (IBM IRL)

Why Null-Dereference Analysis?
• Common problem
  • …or symptom of other problems
  • Null-dereference warning may help in identifying root cause
• Relevant to all software
• Specification is obvious (absence of NPE)
• Requires no user interaction

Why Sound Null-Dereference Analysis?
• Safety guarantees are important in some domains
• Results can become an in-code specification, e.g., via JSR 305
  • Annotations can help with code understanding
  • Annotations can simplify future analyses (e.g., after modifications)
• Precise and efficient sound analysis is challenging
  • Lessons carry over to other static analyses

Example answers expected

    class A {
      final A a = new A();
      static main() {
        B b = new B();
        initB(b);
        a.foo(b);        // okay
      }
      foo(B b) {
        b.f.fun();       // okay
        b.f.f.gun();     // null-deref.
      }
      static initB(B b) {
        b.f = new F();   // okay
        b.f.f = null;    // okay
      }
    }

• Interprocedural information is needed often
  • Allocations in callers (e.g., new B()) common
  • Allocations in callees (e.g., new F()) common

Common approaches
• Most existing tools perform intraprocedural analysis
  • Have to make assumptions about callers/callees
• Option 1: pessimistic assumptions about callers/callees
  • Result: a sea of false alarms

Results of pessimistic intraproc. analysis

    class A {
      final A a = new A();
      static main() {
        B b = new B();
        initB(b);
        a.foo(b);        // null deref.
      }
      foo(B b) {
        b.f.fun();       // two null derefs.
        b.f.f.gun();     // null deref.
      }
      static initB(B b) {
        b.f = new F();   // null deref.
        b.f.f = null;    // okay
      }
    }

• Reports four false alarms
• Only real error is on line 10

Common approaches
• Most existing tools perform intraprocedural analysis
  • Have to make assumptions about callers/callees
• Option 2: optimistic assumptions about callers/callees
  • Result: misses real errors (catches only the most glaring ones)

Results of optimistic intraproc. analysis

    class A {
      final A a = new A();
      static main() {
        B b = new B();
        initB(b);
        a.foo(b);        // okay
      }
      foo(B b) {
        b.f.fun();       // okay
        b.f.f.gun();     // okay
      }
      static initB(B b) {
        b.f = new F();   // okay
        b.f.f = null;    // okay
      }
    }

• Misses the real error on line 10

Common approaches
• Most existing tools perform intraprocedural analysis
  • Have to make assumptions about callers/callees
• Option 3: mostly optimistic assumptions
  • Detects inconsistencies in programmer’s beliefs
    • Test x == null: belief that x could be null before test
    • Dereference of x without a test: belief that x cannot be null
  • Allow analysis to dismiss assumptions contradicted by beliefs
• Result: missing real errors, reporting safe dereferences as unsafe
  • Generally, few false alarms but many missed errors
  • Same result as option 2 (optimistic assumptions) in our example

Prospects for interprocedural analysis
• Whole-program analysis cannot scale to large software
• Majority of instructions are relevant to null-dereference analysis
  • Can’t prune down program to a small relevant subset
• Need mechanism to break down a program’s complexity

Expanding-Scope Analysis
• Holy Grail
  • Cost: INTRAprocedural analysis
  • Precision: INTERprocedural (whole-program) analysis
• Staged approach
  • Analyze dereferences with limited interprocedural context
  • Verify dereferences with the least amount of context
  • Increase interprocedural context for harder cases
• In simplest form
  • Start with local analysis (with pessimistic assumptions)
  • Verify some dereferences without considering context
  • Consider remaining dereferences with extra level of context
  • Verify some dereferences within a call subtree of immediate callers
  • …
• We refer to individual analyses as Limited-Scope Analyses
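The staged loop on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `LimitedScopeAnalysis` and `ExpandingScopeDriver` are hypothetical names, dereferences are represented as plain strings, and the actual limited-scope analysis is hidden behind an interface. It is not SALSA's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical oracle: can this dereference be proven safe with `scope` levels of caller context? */
interface LimitedScopeAnalysis {
    boolean verifies(String dereference, int scope);
}

/** Hypothetical driver for the expanding-scope loop. */
final class ExpandingScopeDriver {
    /**
     * Runs limited-scope analyses with scope 0, 1, ..., maxScope.
     * Each verified dereference is recorded with the smallest scope that proved it.
     * Dereferences still unverified at maxScope are absent from the result
     * and would be reported as warnings.
     */
    static Map<String, Integer> run(List<String> dereferences,
                                    LimitedScopeAnalysis analysis,
                                    int maxScope) {
        Map<String, Integer> verifiedAt = new LinkedHashMap<>();
        List<String> remaining = new ArrayList<>(dereferences);
        for (int scope = 0; scope <= maxScope && !remaining.isEmpty(); scope++) {
            List<String> stillOpen = new ArrayList<>();
            for (String d : remaining) {
                if (analysis.verifies(d, scope)) {
                    verifiedAt.put(d, scope);   // cheap cases never reach larger scopes
                } else {
                    stillOpen.add(d);           // retry with one more level of context
                }
            }
            remaining = stillOpen;
        }
        return verifiedAt;
    }
}
```

The point of the design is that the expensive, wider scopes are only ever applied to the dereferences that the cheaper scopes could not verify.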

Expanding-Scope Analysis
[Figure: call-graph diagram of functions f surrounding a dereference site f.foo()]

Expanding-Scope Analysis
[Figure: call graph of the running example; main calls initB and foo]
• main: B b = new B(); initB(b); a.foo(b);
• initB: b.f = new F(); b.f.f = null;
• foo: b.f.fun(); b.f.f.gun();

Abstract Domain
• Product of three abstract domains
  • Abstract domain for may-alias analysis
    • Implementation: flow- & context-insensitive Andersen-style
  • Abstract domain for must-alias analysis
    • Implementation: demand-driven (based on def-use chains)
  • Set APnn of non-null access paths
    • Access paths denote l-value expressions: (VarId | StaticFieldId).InstanceFieldId*
• Finiteness of domain guaranteed by (parameterized) bounds on
  • Size of APnn
  • Maximal length of access paths in APnn
• Only the final component (set of non-null access paths APnn) changes
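A bounded set of non-null access paths might be represented as below. This is a minimal sketch: the class names `AccessPath` and `NonNullSet` are hypothetical, and measuring path length as the number of components (root plus fields) is an assumption, since the slide does not say how length is counted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** An access path: a root (variable or static field) followed by instance fields, e.g. b.f.f. */
final class AccessPath {
    private final List<String> parts; // parts.get(0) is the root; the rest are instance fields

    AccessPath(String root, String... fields) {
        List<String> p = new ArrayList<>();
        p.add(root);
        p.addAll(Arrays.asList(fields));
        this.parts = Collections.unmodifiableList(p);
    }

    /** Assumption: length counts all components, so "b.f" has length 2. */
    int length() { return parts.size(); }

    @Override public boolean equals(Object o) {
        return o instanceof AccessPath && parts.equals(((AccessPath) o).parts);
    }
    @Override public int hashCode() { return parts.hashCode(); }
    @Override public String toString() { return String.join(".", parts); }
}

/** The APnn component: a set of access paths known to be non-null, with both bounds enforced. */
final class NonNullSet {
    private final int maxPaths;
    private final int maxLength;
    private final Set<AccessPath> paths = new LinkedHashSet<>();

    NonNullSet(int maxPaths, int maxLength) {
        this.maxPaths = maxPaths;
        this.maxLength = maxLength;
    }

    /** Records a non-null fact; returns false and drops the fact if a bound would be exceeded. */
    boolean add(AccessPath p) {
        if (p.length() > maxLength) return false;
        if (!paths.contains(p) && paths.size() >= maxPaths) return false;
        paths.add(p);
        return true;
    }

    boolean isNonNull(AccessPath p) { return paths.contains(p); }

    int size() { return paths.size(); }
}
```

Dropping a fact when a bound is hit keeps the analysis sound: the set only ever under-approximates the paths that are non-null, so a dropped fact can cost a warning but cannot hide an error.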

Transfer Functions (statements)
• Notation: a symbol (lost in extraction) is defined as InstanceFieldId* (sequences of instance fields)
[Table of transfer functions not recoverable]

Transfer Functions (conditions)
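The transfer-function tables from these two slides are not recoverable from the transcript. The sketch below is therefore not the authors' definition; it only illustrates, under stated assumptions, what transfer functions over a set of non-null access paths could look like. Paths are dot-separated strings, `NonNullTransfer` and its method names are hypothetical, the may-alias analysis is passed in as a predicate that is assumed to be reflexive, and the size and length bounds are not enforced here.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.BiPredicate;

/** Hypothetical transfer functions over APnn, with access paths written as "b.f.f". */
final class NonNullTransfer {

    /** Statement x = new T(): facts rooted at x are stale; x itself is now non-null. */
    static Set<String> assignNew(Set<String> in, String x) {
        Set<String> out = killRoot(in, x);
        out.add(x);
        return out;
    }

    /** Statement x = null, or x = (value of unknown nullness): drop all facts rooted at x. */
    static Set<String> assignUnknown(Set<String> in, String x) {
        return killRoot(in, x);
    }

    /**
     * Statement base.f = v. Kills every path that may read the stored field,
     * i.e. any path q.f... whose prefix q may alias base. If base itself is
     * unaffected, base is non-null afterwards (the store dereferenced it) and
     * base.f is non-null when v is.
     */
    static Set<String> storeField(Set<String> in, String base, String f,
                                  boolean valueNonNull,
                                  BiPredicate<String, String> mayAlias) {
        Set<String> out = new LinkedHashSet<>();
        for (String path : in) {
            if (!mayReadStoredField(path, base, f, mayAlias)) out.add(path);
        }
        if (!mayReadStoredField(base, base, f, mayAlias)) {
            out.add(base);
            if (valueNonNull) out.add(base + "." + f);
        }
        return out;
    }

    /** Condition: true branch of x != null (equivalently, false branch of x == null). */
    static Set<String> assumeNonNull(Set<String> in, String x) {
        Set<String> out = new LinkedHashSet<>(in);
        out.add(x);
        return out;
    }

    private static Set<String> killRoot(Set<String> in, String x) {
        Set<String> out = new LinkedHashSet<>();
        for (String path : in) {
            if (!path.equals(x) && !path.startsWith(x + ".")) out.add(path);
        }
        return out;
    }

    private static boolean mayReadStoredField(String path, String base, String f,
                                              BiPredicate<String, String> mayAlias) {
        String[] parts = path.split("\\.");
        StringBuilder prefix = new StringBuilder(parts[0]);
        for (int i = 1; i < parts.length; i++) {
            if (parts[i].equals(f) && mayAlias.test(prefix.toString(), base)) return true;
            prefix.append('.').append(parts[i]);
        }
        return false;
    }
}
```

Applied to the body of initB from the running example, with an alias predicate that treats only identical paths as aliases, the two stores leave b and b.f in the set and never add b.f.f.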

Staged Analysis in SALSA (Scalable Analysis via Lazy Scope expAnsion)
• Real OO applications (e.g., web applications) have wide call graphs
  • High scope limits are too expensive to analyze
• New stages help stave off the need for high scope limits
  • Pruning
    • Verifies dereferences of (non-null) final and stationary fields
  • Special local (scope-0) analyses
    • Caller-guarantee analysis (top-down in call graph)
      • Propagates callers’ guarantees to callees
      • E.g., for references passed as arguments down deep call chains
    • Callee-guarantee analysis (bottom-up in call graph)
      • Propagates callees’ guarantees up to callers
      • E.g., for field initializations in deep initialization call chains
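One step of the caller-guarantee idea can be sketched as follows: a parameter is treated as non-null inside a callee only if every known call site passes a provably non-null argument. The names `CallSite` and `CallerGuarantee` are hypothetical, and the sketch covers a single call-graph level; a full top-down pass would visit methods from callers to callees so that facts established here feed the callee's own call sites.

```java
import java.util.Arrays;
import java.util.List;

/** A call site: the callee's name and, per argument, whether the caller proved it non-null. */
final class CallSite {
    final String callee;
    final boolean[] argNonNull;

    CallSite(String callee, boolean... argNonNull) {
        this.callee = callee;
        this.argNonNull = argNonNull;
    }
}

/** Hypothetical single-level caller-guarantee computation. */
final class CallerGuarantee {
    /**
     * Returns, for each parameter of `method`, whether all call sites guarantee it non-null.
     * A method with no known call sites gets no guarantees (pessimistic assumption).
     */
    static boolean[] nonNullParams(String method, int arity, List<CallSite> sites) {
        boolean[] guaranteed = new boolean[arity];
        Arrays.fill(guaranteed, true);
        boolean hasCaller = false;
        for (CallSite site : sites) {
            if (!site.callee.equals(method)) continue;
            hasCaller = true;
            for (int i = 0; i < arity; i++) {
                // Missing information about an argument counts as "not proven".
                boolean proven = i < site.argNonNull.length && site.argNonNull[i];
                guaranteed[i] &= proven;
            }
        }
        if (!hasCaller) Arrays.fill(guaranteed, false);
        return guaranteed;
    }
}
```

In the running example, main passes a freshly allocated b to both initB and foo, so parameter b of each method would be marked non-null at entry.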

Staged Analysis in SALSA (Scalable Analysis via Lazy Scope expAnsion)
[Figure: pipeline of analysis stages]
• pruning
• limited-scope data-flow analyses
  • caller-guarantee
  • callee-guarantee
  • scope-1: subtrees of depth 1 from parents
  • scope-2: subtrees of depth 2 from grandparents
  • …
• symbolic: remaining warnings split into high priority and low priority

Steps of staged interproc. analysis
• Pruning (final & stationary fields)
• Limited-scope analysis
  • Scope-0 (local analysis)
    • Caller-guarantee (local) analysis
    • Callee-guarantee (local) analysis
  • Scope-1 analysis

    class A {
      final A a = new A();
      static main() {
        B b = new B();
        initB(b);          // b.f ∈ APnn
        a.foo(b);
      }
      foo(B b) {           // b ∈ APnn
        b.f.fun();
        b.f.f.gun();
      }
      static initB(B b) {  // b ∈ APnn
        b.f = new F();
        b.f.f = null;
      }
    }

Experimental results
• 21 (mostly open-source) applications
  • ~3K-465K bytecodes; ~300-37K dereferences
• Avg: ~90% of dereferences verified soundly and automatically
  • ~8% dismissed by Pruning
  • ~77% dismissed by caller-guarantee analysis
  • ~5% dismissed by remaining stages
• Final scope limit: between 2 and 5 (chosen heuristically)
  • Diminishing returns after local analyses (caller-/callee-guarantee)
  • Higher scope limits useful in the absence of caller/callee guarantees
• Max. access-path length: 2 for all but four applications
  • Higher access-path lengths had no effect for most applications
  • Helped C-like applications (direct field dereferences without getters)

Experimental results
• Expected many false alarms due to simple abstract domain
  • Implemented heuristic symbolic path-validity checking
  • This phase selected ~20% as high-priority warnings
  • Surprisingly low incidence of false alarms due to path-correlation
• Biggest domain shortcoming: not tracking access-path types
  • Causes unnecessarily high cost of verifying certain dereferences
    • Includes too many irrelevant code portions when verifying a dereference
  • Produces false alarms due to examining type-infeasible paths
• Results are encouraging for the simplicity of the domain

Tool-User Interaction
• The output includes suggested annotations
  • Ordered by the number of warnings guaranteed to be dismissed
  • Actual number would require an alternate abstract domain
• Current annotation options
  • Field f is non-null
  • Parameter p or return value of method foo() is non-null
• User may choose to accept some annotations
• We studied annotations for 8 benchmarks with high warning counts
  • A few hours’ effort for unfamiliar code
  • Result: 30% decrease in warning counts
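The three annotation options on this slide (non-null field, non-null parameter, non-null return value) might look like this in source code. The sketch declares a local stand-in annotation, `NonNullHint`, so that it compiles without extra libraries; in practice a JSR 305-style annotation such as `javax.annotation.Nonnull` would play this role. The classes `Payload` and `Holder` are hypothetical, and the runtime check is one possible way to back an accepted annotation with an assertion.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Objects;

/** Local stand-in for a JSR 305-style non-null annotation (assumption, not a real library type). */
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.PARAMETER, ElementType.METHOD})
@interface NonNullHint {}

/** Hypothetical payload type used only for illustration. */
final class Payload {
    String describe() { return "payload"; }
}

/** Hypothetical class showing where each kind of suggested annotation would go. */
final class Holder {
    @NonNullHint
    private final Payload f;                      // "field f is non-null"

    Holder(@NonNullHint Payload f) {              // "parameter p is non-null"
        // Runtime check backing the accepted annotation: fails fast at the boundary.
        this.f = Objects.requireNonNull(f, "f");
    }

    @NonNullHint
    Payload get() {                               // "return value of get() is non-null"
        return f;
    }
}
```

With the field annotated and checked at construction, a dereference such as `holder.get().describe()` no longer depends on interprocedural reasoning about how the holder was built.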

Summary
• Novel expanding-scope analysis
  • Applicable to multiple abstract domains
• Scalable and precise null-dereference analysis
  • Staged analysis makes a simple abstract domain effective
• Vision: improve programs’ specifications and robustness
  • Cleanse programs by examining warnings and suggested annotations
  • Check accepted annotations with assertions or symbolic techniques
  • Extend the program’s specification and analyzability via annotations
