String Analysis for Dynamic Field Access

Esben Andreasen and Magnus Madsen

Department of Computer Science

Aarhus University

Motivation
• Static Analysis of JavaScript
• type analysis, bug finding or refactoring
• a key component is Points-To analysis
• Analysis of JavaScript is difficult:
• a flexible object model (prototype-based)
• dynamic field access
• coercions and eval
• non-standard scope rules
• ...

Today

Static Field Access: o.f, where f is a fixed field name

Dynamic Field Access (DFA): o[e], where e is any expression

v = o["p"]

v = o["p" + "q"]

v = o[???]

[Diagram: when the field name is unknown, the read on o may resolve to any field of Array (concat, every, filter, forEach, indexOf, join, lastIndexOf, length, map, pop, push, reduce, reduceRight, reverse, shift, slice, some, sort, splice, unshift) or of Object (__defineGetter__, __defineSetter__, __lookupGetter__, __lookupSetter__, constructor, hasOwnProperty, and 4 more).]
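To make the diagram concrete, here is a small sketch in plain JavaScript showing that one dynamic read site on an array can resolve to an own field, a field inherited from Array.prototype, or a field inherited from Object.prototype. The helper name `dynamicRead` is illustrative and not taken from the talk.

```javascript
// Illustrative helper: a single dynamic field access with a computed name.
function dynamicRead(o, name) {
  return o[name];
}

const arr = [1, 2, 3];

// The same access site yields very different values depending on the string.
const own = dynamicRead(arr, "length");                // own field of the array
const fromArray = dynamicRead(arr, "pu" + "sh");       // inherited from Array.prototype
const fromObject = dynamicRead(arr, "hasOwnProperty"); // inherited from Object.prototype
```

An analysis that knows nothing about `name` has to assume that any of these fields may be read.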

Dynamic Writes

o["p"] = v

o["p" + "q"] = v

o[???] = v

All fields of o may now point to v!

o[e1][e2] = v (where e1 may be e.g. "prototype" and e2 e.g. "toString")

All fields of Object may now point to v!
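The effect of an unknown field name on a points-to analysis can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis from the talk: the abstract object is modelled as a map from field names to sets of abstract values, and `null` stands for an unknown string (the `???` above). The names `makeAbstractObject` and `abstractWrite` are hypothetical.

```javascript
// Hypothetical abstract object: each field name maps to the set of
// abstract values the field may point to.
function makeAbstractObject(fieldNames) {
  const fields = new Map();
  for (const name of fieldNames) fields.set(name, new Set());
  return fields;
}

// key is a known string, or null when the analysis knows nothing about it.
function abstractWrite(fields, key, value) {
  if (key !== null) {
    // Known field name: only that field is affected.
    if (!fields.has(key)) fields.set(key, new Set());
    fields.get(key).add(value);
    return;
  }
  // Unknown field name: every field may now point to the value.
  for (const pointsTo of fields.values()) pointsTo.add(value);
}

const abstractO = makeAbstractObject(["p", "pq", "toString"]);
abstractWrite(abstractO, "p", "v1");  // o["p"] = v1
abstractWrite(abstractO, null, "v2"); // o[???] = v2
```

After the second write, `v2` shows up in the points-to set of every field, including `toString`, which is the precision loss the slide describes.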

Spurious Event Handlers

var elm = $("#button");

elm.onclick = function() {}

elm[???] = function() {}

The function is registered as all possible event handlers

Usage in Practice

Survey by Richards et al. [1]:

• 8.3% of all reads are dynamic
• 10.3% of all writes are dynamic

Dynamic field access is prevalent in libraries:

• jQuery, Mootools and Prototype: 300+ DFAs

[1]: Gregor Richards, Sylvain Lebresne, Brian Burg, and Jan Vitek. An Analysis of the Dynamic Behavior of JavaScript Programs. In PLDI, 2010.

Proposed Solution

What we need: A way to distinguish strings flowing into dynamic field accesses.

Solution: A set of light-weight lattices.

• focus on concatenate, equal and join
• compact and efficient
• ideally O(1) time and space
(Simple) Lattices
• Constant String (CS)

single concrete string, e.g. "push".

• String Set (SS)

set of up to k strings, e.g. "pop" and "push".

• Length Interval (LI)

min. and max. length, e.g. the interval [3; 4] for the strings "pop" and "push".
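The join operations of the String Set and Length Interval lattices can be sketched as below. This is an illustrative sketch only: the bound `K = 2`, the `TOP` marker, and all function names are assumptions made for the example.

```javascript
const TOP = "TOP"; // hypothetical marker for "any string"
const K = 2;       // hypothetical bound on the size of a string set

// String Set: union of both sets, falling back to TOP above the bound.
function joinStringSets(a, b) {
  if (a === TOP || b === TOP) return TOP;
  const union = new Set([...a, ...b]);
  return union.size <= K ? union : TOP;
}

// Length Interval: [min, max] over the lengths of the given strings.
function lengthInterval(strings) {
  const lengths = strings.map((s) => s.length);
  return [Math.min(...lengths), Math.max(...lengths)];
}

// Joining two intervals takes the smallest interval covering both.
function joinIntervals([lo1, hi1], [lo2, hi2]) {
  return [Math.min(lo1, lo2), Math.max(hi1, hi2)];
}

const popPush = joinStringSets(new Set(["pop"]), new Set(["push"]));
const tooMany = joinStringSets(popPush, new Set(["shift"]));
const interval = lengthInterval(["pop", "push"]);
```

With these inputs, `popPush` still holds both strings, `tooMany` collapses to `TOP` because it exceeds the bound, and `interval` is the `[3; 4]` interval from the slide.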

Character Inclusion (CI)

Sets of characters which may and must occur.

Example for the strings "pop" and "push": must = {p}, may = {h, o, p, s, u}.
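A minimal sketch of the character inclusion lattice, assuming the usual reading of may and must sets: the join intersects the must sets and unites the may sets, while concatenation unites both. The function names are hypothetical.

```javascript
// Abstract a single concrete string: every character both may and must occur.
function abstractChars(s) {
  const chars = new Set(s); // iterating a string yields its characters
  return { must: chars, may: new Set(chars) };
}

// Join: a character must occur only if it must occur on both sides.
function joinCI(a, b) {
  return {
    must: new Set([...a.must].filter((c) => b.must.has(c))),
    may: new Set([...a.may, ...b.may]),
  };
}

// Concatenation: characters of either operand occur in the result.
function concatCI(a, b) {
  return {
    must: new Set([...a.must, ...b.must]),
    may: new Set([...a.may, ...b.may]),
  };
}

const popOrPush = joinCI(abstractChars("pop"), abstractChars("push"));
```

For "pop" and "push", only `p` occurs in both strings, so it is the sole member of the must set, while the may set collects every character seen.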

Prefixes and Suffixes
• Prefix-Suffix Characters (PS)

first and last character, e.g. ("p", "p") for the string "pop" and ("p", "h") for the string "push".

• Prefix Suffix Inclusion (PSI)

may and must sets of characters for the first and last character.

(like the character inclusion lattice.)
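The simpler of the two lattices, Prefix-Suffix Characters, can be sketched as follows. This is an illustration under assumptions: strings are taken to be non-empty, and the `ANY` marker and function names are hypothetical.

```javascript
const ANY = "ANY"; // hypothetical marker for "unknown character"

// Abstract a non-empty concrete string by its first and last character.
function abstractPS(s) {
  return { first: s[0], last: s[s.length - 1] };
}

// Join: keep a character only if both sides agree on it.
function joinPS(a, b) {
  return {
    first: a.first === b.first ? a.first : ANY,
    last: a.last === b.last ? a.last : ANY,
  };
}

// Concatenation: the prefix comes from the left operand,
// the suffix from the right operand.
function concatPS(a, b) {
  return { first: a.first, last: b.last };
}

const popPushPS = joinPS(abstractPS("pop"), abstractPS("push"));
const withSuffix = concatPS(popPushPS, abstractPS("Back"));
```

The join of "pop" and "push" keeps the shared first character `p` but loses the last character. Concatenation restores a precise last character when the right operand has one, which is why the operation is cheap and often useful.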

Index Predicate

A boolean-valued predicate which may/must hold for each of the first few string indices (up to a fixed bound).

Examples:

• isUppercase (useful for e.g. "lastIndexOf")
• isUnderscore (useful for e.g. "__defineGetter__")
• isDigit (useful for array indices)
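As a sketch of how such a predicate abstracts one concrete string, the following evaluates it at the first `bound` indices and records the result as a vector of booleans. The bound of 8, the ASCII-only predicates, and the function names are assumptions made for the example.

```javascript
// Evaluate a character predicate at each of the first `bound` indices.
// Strings shorter than the bound yield a shorter vector.
function indexPredicate(s, predicate, bound) {
  const bits = [];
  for (let i = 0; i < bound && i < s.length; i++) {
    bits.push(predicate(s[i]));
  }
  return bits;
}

// ASCII-only versions of the three predicates from the list above.
const isUppercase = (c) => c >= "A" && c <= "Z";
const isUnderscore = (c) => c === "_";
const isDigit = (c) => c >= "0" && c <= "9";

const upper = indexPredicate("lastIndexOf", isUppercase, 8);
const under = indexPredicate("__defineGetter__", isUnderscore, 8);
const digits = indexPredicate("42", isDigit, 8);
```

For "lastIndexOf" the only uppercase character within the first eight indices is the `I` at index 4, so a single position is set. That is enough to tell it apart from, for example, "lastIndexof" or "LastIndexOf".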
Index Predicate: Concatenation

[Figure: bit-vector example of concatenating (+) two index-predicate values, including the may-case.]

String Hash
• Pick a distributive hash function h : String → U, where U is some universe of a fixed size.
• Distributive: the hash of a concatenation is determined by the hashes of its two parts.
• The lattice is the powerset lattice of U.
• Length Hash (LH)
• Hashes the length instead of the string itself.
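One hash function with the distributive property is the sum of character codes modulo the universe size. The sketch below uses it purely as an assumption: the transcript does not preserve the function used in the talk, and the universe size of 64 and all names are hypothetical.

```javascript
const UNIVERSE_SIZE = 64; // hypothetical size of the universe

// Assumed hash: sum of the UTF-16 code units, modulo the universe size.
function stringHash(s) {
  let sum = 0;
  for (let i = 0; i < s.length; i++) {
    sum = (sum + s.charCodeAt(i)) % UNIVERSE_SIZE;
  }
  return sum;
}

// Distributivity: the hash of a concatenation follows from the two hashes.
function combineHashes(h1, h2) {
  return (h1 + h2) % UNIVERSE_SIZE;
}

// Length Hash: hash the length instead of the string itself.
function lengthHash(s) {
  return s.length % UNIVERSE_SIZE;
}

const direct = stringHash("foo" + "bar");
const combined = combineHashes(stringHash("foo"), stringHash("bar"));
```

Here `direct` and `combined` agree, so an analysis never needs the concrete strings to hash a concatenation. The length hash combines in the same way, since lengths add under concatenation.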
String Hash: Example

foo = (the quick brown fox)

4, 33, 29, 40, 13

• String Constant
• Prefix Suffix
• Character Inclusion
• Index Predicate
String Hash: Concatenation

Example: Let and

Assume a universe of size then:

Easy to compute in iterations.

Paper has solution in time.
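A naive way to concatenate two abstract hash values is to combine every pair of elements, which takes at most one iteration per pair. The sketch below assumes that two hashes combine by modular addition (an assumption, as is the function name); the concrete sets and the universe size of 8 are made up for illustration and are not the ones from the slide.

```javascript
// Abstract values are sets of hash values drawn from {0, ..., universeSize - 1}.
// Assumed combination of two hashes: addition modulo the universe size.
function concatHashSets(a, b, universeSize) {
  const result = new Set();
  for (const x of a) {
    for (const y of b) {
      result.add((x + y) % universeSize);
    }
  }
  return result;
}

// Hypothetical inputs: two abstract values over a universe of size 8.
const left = new Set([1, 5]);
const right = new Set([2, 7]);
const concatenated = concatHashSets(left, right, 8);
```

Since each set holds at most `universeSize` elements, the nested loop is bounded by the square of the universe size, independent of the strings being analysed.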

Overview
• The hybrid lattice H is the (reduced) product of
• the string set lattice (SS)
• the character inclusion lattice (CI)
• the string hash lattice (SH)
Evaluation

Q1: How precise are the lattices, independent of any particular analysis, for reasoning about strings used in DFAs?

Q2: To what degree does a more precise string lattice, for DFAs, improve overall precision and performance of a static analysis?

Evaluation: Dynamic Analysis

We perform a record and replay of several popular JavaScript libraries:

Record:

• The history of every string flowing to a DFA.
• The field names of every receiver object at a DFA.

Replay:

• For every DFA merge the histories of strings and determine for each lattice if it has a false positive.

(example next slide)

Dynamic Analysis: Example

e = (c ? "a" : "b") + "x"

o = {"ax": 1, "bx": 2}

v = o[e];

evaluating o[e] in the abstract?

[Diagram: expression tree for e, where the join of "a" and "b" is concatenated (+) with "x".]
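The replay step for this example can be sketched for two of the lattices. This is a simplified illustration, not the evaluation harness from the talk: the receiver is given an extra field "cx" (hypothetical) so that a false positive becomes visible, the string set bound is omitted for brevity, and all names are assumptions.

```javascript
const TOP = "TOP"; // hypothetical marker for "any string"

// Constant String lattice.
function joinConst(a, b) { return a === b ? a : TOP; }
function concatConst(a, b) { return a === TOP || b === TOP ? TOP : a + b; }
function matchesConst(abs, field) { return abs === TOP || abs === field; }

// String Set lattice (bound omitted for brevity).
function joinSet(a, b) { return new Set([...a, ...b]); }
function concatSet(a, b) {
  const out = new Set();
  for (const x of a) for (const y of b) out.add(x + y);
  return out;
}
function matchesSet(abs, field) { return abs.has(field); }

// Recorded history of e at the DFA, and the field names of the receiver.
const recorded = ["ax", "bx"];
const receiverFields = ["ax", "bx", "cx"]; // "cx" is a hypothetical extra field

// A false positive is a field the abstraction admits but no run accessed.
function falsePositives(fields, matches) {
  return fields.filter((f) => matches(f) && !recorded.includes(f));
}

const constValue = concatConst(joinConst("a", "b"), "x");
const setValue = concatSet(joinSet(new Set(["a"]), new Set(["b"])), new Set(["x"]));

const constFalsePositives = falsePositives(receiverFields, (f) => matchesConst(constValue, f));
const setFalsePositives = falsePositives(receiverFields, (f) => matchesSet(setValue, f));
```

The constant lattice loses everything at the join of "a" and "b", so it also admits "cx". The string set keeps exactly "ax" and "bx" and reports no false positives.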

Evaluation: Dynamic Analysis

[Chart: DFAs with zero false positives, per lattice. Annotations: Const. Str. insufficient; Hybrid; Prefix/Suffix Incl.]

Evaluation: Static Analysis

Flow-sensitive dataflow analysis for JavaScript

• inter-procedural, context-insensitive
• instantiated with the constant and hybrid lattices

Benchmarks

• Mozilla Sunspider + Google Octane
• Various GitHub projects
Summary
• Dynamic field access is common in JavaScript.
• Simple constant propagation is insufficient for reasoning about dynamic field accesses.
• The proposed hybrid lattice improves precision and performance for 7 out of 10 benchmark programs.

Thank You!

Arrays

var a = [1, 2, 3];

var i = 0;

while (...) {

var x = a[i++];

}

String Patterns

"a|b|c|d".split("|")

Number- & Type String Lattices
• Number String (N)

A powerset lattice of the strings:

{Infinity, -Infinity, NaN, 0, 1, ...}

• Type String (T)

A powerset lattice of the strings:

{Boolean, Function, Object, String, Undefined}