
# Pointer Analysis

Rupesh Nasre. Advisor: Prof R Govindarajan.


Apr 05, 2008.

Pointer Analysis.

Outline.

• Motivation and Introduction.

• Related Work.

• Preliminary Results.

• Research Directions.


What is Pointer Analysis?

Pointer analysis is the mechanism of statically finding out the possible run-time values of a pointer and the relation of various pointers with each other.

Relation between pointers.

• p = arr + ii;
  q = arr + jj;
  if (p == q) {
      fun();
  }

• q = p;
  ...
  if (p == q) {
      fun();
  }

Variants of Pointer Analysis.

• Alias analysis: do p and q point to the same memory location?

• Points-to analysis: does p point to memory location x?
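The two variants are closely related: one common way to answer a may-alias query is to check whether two points-to sets overlap. The following is a minimal sketch, assuming each points-to set is encoded as a bitmask over at most 32 memory locations. The names `pts_set`, `points_to` and `may_alias` and the bit encoding are illustrative assumptions, not taken from the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding: bit i set means "may point to memory location i".
 * Assumes unsigned has at least 32 bits. */
typedef unsigned pts_set;

/* Points-to query: may the pointer point to location loc? */
bool points_to(pts_set pts, int loc)
{
    assert(loc >= 0 && loc < 32);
    return ((pts >> loc) & 1u) != 0;
}

/* Alias query: two pointers may alias if their points-to sets overlap. */
bool may_alias(pts_set pts_p, pts_set pts_q)
{
    return (pts_p & pts_q) != 0;
}
```

With this encoding, a pointer whose set is {0, 1} may alias one whose set is {1}, but not one whose set is {2}.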

Why Pointer Analysis?

• for parallelization:
  fun(p);
  fun(q);

• for common subexpression elimination:
  x = p + 2;
  y = q + 2;
  if (p == q) {
      fun();
  }

• for other optimizations.
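As one illustration of why an optimizer needs alias information, consider a load that follows two stores. This is a small sketch written for this transcript, not taken from the slides, and the function name `store_store_load` is hypothetical. The final read of `*p` can be replaced by the constant 1 only if the analysis proves that `p` and `q` never alias.

```c
#include <assert.h>
#include <stddef.h>

/* Stores through p, then through q, then reads *p again.
 * If p and q may alias, the final read cannot be folded to the constant 1. */
int store_store_load(int *p, int *q)
{
    assert(p != NULL && q != NULL);
    *p = 1;
    *q = 2;     /* overwrites *p when q == p */
    return *p;  /* 1 if p and q are distinct, 2 if they alias */
}
```

Calling it with two distinct variables returns 1, while passing the same address for both arguments returns 2.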

Introduction.

• Flow sensitivity.

• Context sensitivity.

• Field sensitivity.

• Unification based.

• Inclusion based.

Flow sensitivity.

p = &x;
p = &y;
label:
...

flow-sensitive, at label: {(p, &y)}.

flow-insensitive: {(p, &x), (p, &y)}.
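The difference above can be sketched for straight-line code consisting only of statements of the form `ptr = &target`. This is a simplified illustration under stated assumptions: variables are named by small integer indices, points-to sets are bitmasks, and there is no branching, so the flow-sensitive version can always perform a strong update. The type and function names are hypothetical.

```c
#include <assert.h>

/* Hypothetical variable indices for the example: p, x, y. */
enum { VAR_P, VAR_X, VAR_Y, VAR_COUNT };

/* One statement of the form  ptr = &target; */
struct addr_assign {
    int ptr;
    int target;
};

/* Flow-insensitive: statement order is ignored, so the pointer may
 * point to every target ever assigned to it. Returns a bitmask. */
unsigned flow_insensitive(const struct addr_assign *stmts, int n, int ptr)
{
    unsigned pts = 0;
    for (int i = 0; i < n; i++) {
        assert(stmts[i].target >= 0 && stmts[i].target < VAR_COUNT);
        if (stmts[i].ptr == ptr)
            pts |= 1u << stmts[i].target;
    }
    return pts;
}

/* Flow-sensitive, evaluated after the last statement: on straight-line
 * code each assignment kills the previous value (strong update). */
unsigned flow_sensitive(const struct addr_assign *stmts, int n, int ptr)
{
    unsigned pts = 0;
    for (int i = 0; i < n; i++) {
        assert(stmts[i].target >= 0 && stmts[i].target < VAR_COUNT);
        if (stmts[i].ptr == ptr)
            pts = 1u << stmts[i].target;
    }
    return pts;
}
```

For the two statements `p = &x; p = &y;`, the flow-insensitive result for `p` contains both `x` and `y`, while the flow-sensitive result at the end contains only `y`.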

Context sensitivity.

caller1() {
    fun(p);
}

caller2() {
    fun(q);
}

fun(int *ptr) {
    r = ptr;
}

context-insensitive: {(r, p), (r, q)}.

context-sensitive: {(r, p)} along call-path caller1, {(r, q)} along call-path caller2.

Field sensitivity.

x.f = p;

or

p = x.f;

field-sensitive: {(x.f, p)}.

field-insensitive: {(x, p)}.

Unification based.

one(&s1);
one(&s2);
two(&s3);

one(struct s *p) {
    p->a = 3;
    two(p);
}

two(struct s *q) {
    q->b = 4;
}

unification-based: {(p, &s1), (p, &s2), (p, &s3), (q, &s1), (q, &s2), (q, &s3)}.
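The result above can be reproduced with a union-find structure in which an assignment `dst = src` merges the two pointers into one equivalence class that shares a single points-to set. This is a deliberately simplified, one-level sketch and not Steensgaard's full algorithm. The constraint encoding (`p = &s1`, `p = &s2`, `q = &s3`, and `q = p` from the call `two(p)`) and all names below are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical indices for the pointers and objects in the example. */
enum { U_P, U_Q, U_PTRS };
enum { U_S1, U_S2, U_S3, U_OBJS };

struct unify_state {
    int parent[U_PTRS];    /* union-find forest over pointer variables */
    unsigned pts[U_PTRS];  /* points-to bitmask, meaningful at each root */
};

void unify_init(struct unify_state *s)
{
    for (int i = 0; i < U_PTRS; i++) {
        s->parent[i] = i;
        s->pts[i] = 0;
    }
}

int unify_find(struct unify_state *s, int v)
{
    assert(v >= 0 && v < U_PTRS);
    while (s->parent[v] != v) {
        s->parent[v] = s->parent[s->parent[v]];  /* path halving */
        v = s->parent[v];
    }
    return v;
}

/* Constraint for  ptr = &obj; */
void unify_addr_of(struct unify_state *s, int ptr, int obj)
{
    assert(obj >= 0 && obj < U_OBJS);
    s->pts[unify_find(s, ptr)] |= 1u << obj;
}

/* Constraint for  dst = src;  both pointers end up in one class. */
void unify_copy(struct unify_state *s, int dst, int src)
{
    int a = unify_find(s, dst);
    int b = unify_find(s, src);
    if (a == b)
        return;
    s->parent[b] = a;
    s->pts[a] |= s->pts[b];
}

unsigned unify_points_to(struct unify_state *s, int ptr)
{
    return s->pts[unify_find(s, ptr)];
}
```

After adding the four constraints, both `p` and `q` report the set {s1, s2, s3}: merging is cheap, but `p` picks up `&s3` even though no statement assigns it.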

Inclusion based.

one(&s1);
one(&s2);
two(&s3);

one(struct s *p) {
    p->a = 3;
    two(p);
}

two(struct s *q) {
    q->b = 4;
}

inclusion-based: {(p, &s1), (p, &s2), (q, &s1), (q, &s2), (q, &s3)}.
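An inclusion-based analysis treats `dst = src` as a one-way subset constraint, pts(dst) ⊇ pts(src), and propagates sets until nothing changes. The following is a minimal fixpoint sketch under stated assumptions: only address-of and copy constraints are handled, points-to sets are bitmasks, and the constraint encoding of the example (`p = &s1`, `p = &s2`, `q = &s3`, and `q = p` from the call `two(p)`) as well as all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical indices for the pointers and objects in the example. */
enum { I_P, I_Q, I_PTRS };
enum { I_S1, I_S2, I_S3, I_OBJS };
enum { I_MAX_COPIES = 16 };

struct copy_edge {
    int dst;
    int src;
};

struct incl_state {
    unsigned pts[I_PTRS];                   /* points-to bitmask per pointer */
    struct copy_edge copies[I_MAX_COPIES];  /* constraints pts(dst) >= pts(src) */
    int ncopies;
};

void incl_init(struct incl_state *s)
{
    for (int i = 0; i < I_PTRS; i++)
        s->pts[i] = 0;
    s->ncopies = 0;
}

/* Constraint for  ptr = &obj; */
void incl_addr_of(struct incl_state *s, int ptr, int obj)
{
    assert(ptr >= 0 && ptr < I_PTRS && obj >= 0 && obj < I_OBJS);
    s->pts[ptr] |= 1u << obj;
}

/* Constraint for  dst = src;  recorded as a one-way inclusion edge. */
void incl_copy(struct incl_state *s, int dst, int src)
{
    assert(dst >= 0 && dst < I_PTRS && src >= 0 && src < I_PTRS);
    assert(s->ncopies < I_MAX_COPIES);
    s->copies[s->ncopies].dst = dst;
    s->copies[s->ncopies].src = src;
    s->ncopies++;
}

/* Propagate along copy edges until a fixpoint is reached. */
void incl_solve(struct incl_state *s)
{
    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = 0; i < s->ncopies; i++) {
            unsigned *d = &s->pts[s->copies[i].dst];
            unsigned merged = *d | s->pts[s->copies[i].src];
            if (merged != *d) {
                *d = merged;
                changed = true;
            }
        }
    }
}
```

After solving, `p` keeps {s1, s2} while `q` holds {s1, s2, s3}, because information flows from `p` to `q` but not back.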

Like all other important problems in Computer Science...

• Alias analysis without memory allocation, intra-procedural, flow-sensitive, supporting arbitrary levels of indirection, is NP-hard.

• For two levels of indirection, it is still NP-hard.

• Even flow-insensitive analysis is NP-hard (for arbitrary levels of indirection).

• With dynamic memory allocation, allowing structs, it becomes undecidable.

• Even for scalars (no structs), it remains undecidable.

G Ramalingam, The undecidability of aliasing, TOPLAS 1994.

Venkatesan Chakaravarthy, New results on the computability and complexity of points-to analysis, POPL 2003.

But the good news is...

• For single pointer dereference, even a flow-sensitive analysis with only scalars and well-defined types is in P, if dynamic memory allocation is not allowed.

• For arbitrary number of dereferences, if the analysis is flow-insensitive, it is in P.

G Ramalingam, The undecidability of aliasing, TOPLAS 1994.

Venkatesan Chakaravarthy, New results on the computability and complexity of points-to analysis, POPL 2003.

Open Problems.

• When dynamic memory allocation is not allowed, but arbitrary number of levels of dereferencing is allowed, the problem is NP-hard. Is it in NP?

• Is the above problem for bounded number of dereferences in P?

• When dynamic memory is allowed, is the problem decidable?

Related Work.

• Choi et al, POPL 1993.

• flow sensitive.

• solution set for each program point.

• alias sets for each CFG node.

• uses worklists for efficiency.

• precise but inefficient.

J D Choi, M Burke, P Carini, Efficient flow-sensitive interprocedural computation of pointer-induced aliases and side effects, POPL 1993.

Related Work.

• Andersen, PhD Thesis, 1994.

• flow insensitive.

• context insensitive.

• inclusion based.

• each variable represented using separate node.

• precision used as upper bound.

Lars Ole Andersen, Program Analysis and Specialization for the C Programming Language, PhD thesis, 1994.

Related Work.

• Burke et al, LCPC 1995.

• flow insensitive.

• alias solution for each procedure.

• worklist used for efficiency.

• can filter alias information based on scoping.

• nearly as precise as Andersen's.

M Burke, P Carini, J D Choi, M Hind, Flow-insensitive interprocedural alias analysis in the presence of function pointers, LCPC 1995.

Related Work.

• Reps et al, POPL 1995.

• problem formulated using graph reachability.

• poly-time algorithm for interprocedural finite distributive subset-based problems.

• graph reachability used for aliasing.

Thomas Reps, Susan Horwitz, Mooly Sagiv, Precise Interprocedural Dataflow Analysis via Graph Reachability, POPL 1995.

Related Work.

• Steensgaard, POPL 1996.

• flow insensitive.

• context insensitive.

• field insensitive.

• unification based.

• linear space and almost linear time algorithm.

• imprecise but sets lower bound on time complexity.

Bjarne Steensgaard, Points-to Analysis in Almost Linear Time, POPL 1996.

Related Work.

• Ghiya et al, POPL 1996.

• flow sensitive.

• context sensitive.

• field insensitive.

• makes use of direction, interference and shape.

• classifies as tree, dag or cyclic graph.

Rakesh Ghiya, Laurie Hendren, Is it a Tree, a DAG, or a Cyclic Graph? A Shape Analysis For Heap Directed Pointers in C, POPL 1996.

Related Work.

• Cheng et al, PLDI 2000.

• uses access paths.

• flow insensitive.

• field sensitive.

• cost effective context sensitivity.

• works well for large number of indirect function calls.

Ben-Chung Cheng, Wen-Mei Hwu, Modular Interprocedural Pointer Analysis using Access Paths: Design, Implementation, and Evaluation, PLDI 2000.

Related Work.

• Whaley et al, PLDI 2004.

• context sensitive.

• field sensitive.

• partially flow sensitive.

• inclusion based.

• scalable (10 min, 400 MB, 8000 methods).

• ordered BDDs.

John Whaley, Monica Lam, Cloning-based Context-sensitive Pointer Alias Analysis Using Binary Decision Diagrams, PLDI 2004.

Related Work.

• Lattner et al, PLDI 2007.

• context sensitive.

• flow insensitive.

• field sensitive.

• unification based.

• scalable.

• efficient (3 sec for 200K lines).

• low storage requirement (30MB).

Chris Lattner, Andrew Lenharth, Vikram Adve, Making Context Sensitive Points-to Analysis with Heap Cloning Practical For The Real World, PLDI 2007.

Our Experiments.

• framework = LLVM.

• algorithm = Andersen.

• benchmark = SPEC 2000.


Research Directions.

• Pointer arithmetic.

void f(struct list *p, struct list *q) {
    struct list *tmp;
    tmp = p->next;
    p->next = q->next;
    q->next = q->next->next;
    p->next->next = tmp;
}

Research Directions.

• Profiling.

• at specific program points like function entry, exit.

• for hot functions.

• for fat pointers.

Research Directions.

• Complex data structures.

• a recursive data structure is merged into a single node.

• some programs have a single global data structure to operate on, like symbol table, dictionary.

• how to characterize complexity of a data structure?

Rupesh Nasre.

Apr 05, 2008.

Pointer Analysis.

188.ammp Description.

Benchmark Program General Category:

Computational Chemistry. Modeling large systems of molecules usually associated with Biology.

Benchmark Description:

The benchmark runs molecular dynamics (i.e. solves the ODE defined by Newton's equations for the motions of the atoms in the system) on a protein-inhibitor complex which is embedded in water (see Harrison 1993 for descriptions of the algorithm and stability analysis on it). The energy is approximated by a classical potential or "force field". The protein is HIV protease complexed with the inhibitor indinavir. There are 9582 atoms in the water and protein, making this representative of a typical large simulation. This benchmark is derived from published work on understanding drug resistance in HIV (Weber and Harrison 1999).

Input Description:

The problem tracks how the atoms move from initial coordinates and initial velocities.

Conferences.

POPL: Principles of Programming Languages.

PLDI: Programming Language Design and Implementation.

MSP: Memory Systems Performance.

LCPC: Languages and Compilers for Parallel Computing.

Related Work.

• Raman et al, MSP 2005.

• uses executable instructions.

• run time (dynamic).

• collects RDS profile.

• no type information.

• interesting properties of data structures are found out.