
The PPA Algorithm

Jeff Da Silva, September 10th, 2004. The Pointer Alias Analysis Problem: statically decide, for any pair of pointers at any point in the program, whether the two pointers point to the same memory location.


Presentation Transcript


  1. The PPA Algorithm. Jeff Da Silva, September 10th, 2004

  2. The Pointer Alias Analysis Problem
     • Statically decide, for any pair of pointers at any point in the program, whether the two pointers point to the same memory location.
     Example: a store *A = ~ followed by a load ~ = *B.

  3. Pointer Analysis Issues
     • Scalability vs. Accuracy
       • Generally, a ‘difficult’ tradeoff exists between:
         • the amount of computation and memory required vs.
         • the accuracy of the analysis.
       • Precision/Efficiency tradeoff: where is the sweet spot?
     • Which metric should be used?
       • Direct metric
       • Report performance applied to an optimization
       • Dynamically measure false positives
     • Which benchmark suite?
     • Are the results reproducible?

  4. Pointer Analysis Issues
     • Complications associated with pointer arithmetic, casting, function pointers, long jumps, and multithreaded applications.
       • Can these be ignored?
     • Different pointer analysis uses have different needs.
     • A universal pointer analysis probably doesn’t exist.

  5. Pointer Analysis Design Choices
     • Flow sensitivity
     • Context sensitivity
     • Heap modeling
     • Aggregate modeling
     • Alias representation
     • Whole program
     • Incremental compilation

  6. Probabilistic Pointer Analysis (PPA)
     [Figure: spectrum of pointer analyses with the labels Address-taken, Steensgaard, SPAN, and BDD based; Chen, et al. is noted as the only other PPA.]
     Fast end of the spectrum:
     • Linear time complexity
     • Inaccurate: many false ‘maybe’ outputs
     • Memory required: negligible
     Probabilistic Pointer Analysis:
     • Polynomial time complexity (guessing)
     • Inaccurate: many false ‘maybe’ outputs, but provides an approximate probability metric
     • Does not require the entire program
     • Memory required: yet to be determined
     • Scalable accuracy/efficiency tradeoff
     Precise end of the spectrum:
     • Doubly exponential
     • Accurate: very few ‘maybe’ outputs (control deps/runtime)
     • Requires entire program info
     • Memory required: oodles
     • Does not scale well

  7. PPA Algorithm Objectives
     • An interprocedural, flow-sensitive, context-sensitive/merged approach that uses transfer functions.
     • Must be scalable and should require less space and time than any traditional analysis.
     • Provide an approximate probability for the ‘Maybe’ output.

  8. Design Choices (tentative)
     • Flow sensitivity: flow sensitive
     • Context sensitivity: context merged
     • Heap modeling: allocation site
     • Aggregate modeling: arrays aggregated, structs separated
     • Alias representation: points-to
     • Whole program: not required
     • Incremental compilation: limited support

  9. How is Probabilistic Pointer Analysis used?
     [Figure: block diagram with the boxes Source Code, Speculative Parallelizing (TLS) Compiler, Probabilistic Pointer Analysis, Probabilistic Dependence Analysis, Dynamic Profiling, and Speculative Parallelized Executable.]

  10. The Probabilistic Pointer Analysis (PPA) Problem
     • Probabilistic Pointer Analysis (PPA): for any pair of pointers, at any point in the program, statically estimate the probability that the two pointers point to the same memory location.
     Example: a store *A = ~ followed by a load ~ = *B.

  11. The Traditional Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     [Figure: points-to graph over the nodes x, y, z, r, s, t, a, b, c.]

  12. The Traditional Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     if(…) x=&s; s=&c;
     [Figure: points-to graph over the nodes x, y, z, r, s, t, a, b, c.]

  13. The Traditional Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     if(…) x=&s; s=&c;
     r=&b; z=&r;
     [Figure: points-to graph over the nodes x, y, z, r, s, t, a, b, c.]

  14. The Traditional Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     if(…) x=&s; s=&c;
     r=&b; z=&r;
     if(…) y = x;
     [Figure: points-to graph over the nodes x, y, z, r, s, t, a, b, c.]

  15. The Traditional Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     if(…) x=&s; s=&c;
     r=&b; z=&r;
     if(…) y = x;
     *x = &a;
     [Figure: points-to graph over the nodes x, y, z, r, s, t, a, b, c.]

  16. The Probabilistic Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     [Figure: points-to graph over the nodes x, y, z, r, s, t, a, b, c; all six edges are labeled 1.0.]

  17. The Probabilistic Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     if(…) /*60% taken*/ x=&s; s=&c;
     [Figure: points-to graph over the same nodes; edge probabilities shown: 0.4, 0.6, 0.6, 0.4.]

  18. The Probabilistic Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     if(…) /*60% taken*/ x=&s; s=&c;
     r=&b; z=&r;
     [Figure: points-to graph over the same nodes; edge probabilities shown: 0.6, 0.4, 0.6, 0.4.]

  19. The Probabilistic Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     if(…) /*60% taken*/ x=&s; s=&c;
     r=&b; z=&r;
     if(…) /*10% taken*/ y = x;
     [Figure: points-to graph over the same nodes; edge probabilities shown: 0.04, 0.96, 0.6, 0.4, 0.6, 0.4.]

  20. The Probabilistic Points-To Graph
     int a, b, c; int *r, *s, *t; int **x, **y, **z;
     x=&r; y=&s; z=&t; r=&a; s=&b; t=&c;
     if(…) /*60% taken*/ x=&s; s=&c;
     r=&b; z=&r;
     if(…) /*10% taken*/ y = x;
     *x = &a;
     [Figure: points-to graph over the nodes x, y, z, r, s, t, a, b, c; edge probabilities shown: 0.04, 0.6, 0.96, 0.4 on the edges out of the top row and 0.4, 0.16, 0.4, 0.24, 0.6, 0.6, 0.6 on the edges below.]
     What is the probability that **y points to a?
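The running example of slides 16 to 20 can be replayed with a small sketch. It is not the deck's algorithm: it assumes that branch outcomes and points-to edges are independent, the helper names `merge`, `assign`, and `store_through` are made up for illustration, and `z` and `t` are left out because they do not affect the final question.

```python
def merge(taken, fallthrough, p_taken):
    """Join two points-to graphs, weighting each by its branch probability."""
    out = {}
    for ptr in taken:
        targets = set(taken[ptr]) | set(fallthrough[ptr])
        out[ptr] = {v: p_taken * taken[ptr].get(v, 0.0)
                       + (1.0 - p_taken) * fallthrough[ptr].get(v, 0.0)
                    for v in targets}
    return out


def assign(graph, ptr, targets):
    """Return a copy of graph in which ptr has the given target distribution."""
    out = {k: dict(v) for k, v in graph.items()}
    out[ptr] = dict(targets)
    return out


def store_through(graph, ptr, value):
    """Model *ptr = &value: each possible target t of ptr is overwritten with
    probability P(ptr -> t) and keeps its old targets otherwise."""
    out = {k: dict(v) for k, v in graph.items()}
    for t, p in graph[ptr].items():
        updated = {v: (1.0 - p) * q for v, q in graph[t].items()}
        updated[value] = updated.get(value, 0.0) + p
        out[t] = updated
    return out


# x = &r; y = &s; r = &a; s = &b;
g = {"x": {"r": 1.0}, "y": {"s": 1.0}, "r": {"a": 1.0}, "s": {"b": 1.0}}

# if (...) /* 60% taken */ { x = &s; s = &c; }
taken = assign(assign(g, "x", {"s": 1.0}), "s", {"c": 1.0})
g = merge(taken, g, 0.6)

# r = &b;
g = assign(g, "r", {"b": 1.0})

# if (...) /* 10% taken */ y = x;
g = merge(assign(g, "y", g["x"]), g, 0.1)

# *x = &a;
g = store_through(g, "x", "a")

# P(**y refers to a) = sum over t of P(y -> t) * P(t -> a)
p_yy_a = sum(p * g[t].get("a", 0.0) for t, p in g["y"].items())
print(round(p_yy_a, 3))  # prints 0.592
```

Under these assumptions the sketch reproduces the edge labels 0.96, 0.04, 0.16, 0.24, 0.4, and 0.6 listed on the slide, and gives 0.04 × 0.4 + 0.96 × 0.6 = 0.592 for the closing question. The transcript itself does not state the answer.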

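  21. A Probabilistic Points-To Matrix
     [Figure: the probabilistic points-to graph over the nodes x, y, z, r, s, t, a, b, c, with edge probabilities 0.04, 0.6, 0.96, 0.4 and 0.4, 0.16, 0.24, 0.6, 0.6.]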

  22. My PPA Algorithm
     PPA Algorithm Goal:
     • For each program point, generate a probabilistic points-to graph that specifies, for each pointer, the set of probabilities that it points to each location.

  23. Definition: Probability Analysis
     • Let <p,v> denote a points-to relationship from a pointer p to a location v.
     • At every static program point s there exists a probability function P(s, <p,v>) that denotes the probability that p points to v during dynamic program execution:
       P(s, <p,v>) = D(s, <p,v>) / D(s)
     • Here D(s) is the number of times s is (expected to be) dynamically visited, and D(s, <p,v>) is the number of those visits at which the points-to relation <p,v> holds.
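As a small illustration of the definition, the ratio D(s, <p,v>) / D(s) can be computed from a recorded execution trace. This is a sketch only: the trace format, the function name `points_to_probabilities`, and the sample data are hypothetical and not taken from the deck.

```python
from collections import Counter


def points_to_probabilities(trace):
    """trace is a list of (s, snapshot) pairs, one per dynamic visit of program
    point s; snapshot maps each pointer to the location it holds at that visit."""
    visits = Counter()   # D(s)
    holds = Counter()    # D(s, <p,v>)
    for s, snapshot in trace:
        visits[s] += 1
        for p, v in snapshot.items():
            holds[(s, p, v)] += 1
    return {key: count / visits[key[0]] for key, count in holds.items()}


# Hypothetical profile: program point "s1" is visited 10 times.
trace = [("s1", {"x": "s"})] * 6 + [("s1", {"x": "r"})] * 4
probs = points_to_probabilities(trace)
print(probs[("s1", "x", "s")], probs[("s1", "x", "r")])  # prints 0.6 0.4
```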

  24. Conservative Probability
     • My algorithm is a conservative may-alias analysis.
     • A probability of 0.0 [P(s,<p,v>) = 0.0] indicates that a points-to relation <p,v> will never hold.
       • The converse is not true.
     • A probability of 1.0 [P(s,<p,v>) = 1.0] indicates that a points-to relation <p,v> will always hold.
       • The converse is not necessarily true: a dynamic points-to relationship <p,v> that always holds may not be reported with a probability of 1.0.

  25. Location Sets
     • Each node in the graph is implemented with a location set, which is a triple of the form <name, offset, stride> consisting of: a variable name that describes the memory block, an offset within that block, and a stride (in bytes) that characterizes the recurring structure of data vectors.
     Example declarations: struct ds { int e,f,g; } … int a; struct ds b; int c[100]; struct ds d[100];
     Aggregate modeling: arrays aggregated, structs separated
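One plausible reading of the triple for the declarations above is sketched here. The concrete values are assumptions, not taken from the slide: a 4-byte `int`, no struct padding, a stride of 0 for non-array objects, and a stride equal to the element size for arrays. The helper names are made up for illustration.

```python
from collections import namedtuple

LocationSet = namedtuple("LocationSet", ["name", "offset", "stride"])

INT_SIZE = 4                             # assumed sizeof(int)
DS_OFFSETS = {"e": 0, "f": 4, "g": 8}    # assumed layout of struct ds, no padding
DS_SIZE = 3 * INT_SIZE


def scalar(name):
    return LocationSet(name, 0, 0)


def struct_field(name, field):
    # Structs are separated: each field gets its own offset.
    return LocationSet(name, DS_OFFSETS[field], 0)


def array(name, element_size):
    # Arrays are aggregated: one location set stands for every element.
    return LocationSet(name, 0, element_size)


def array_of_structs_field(name, field):
    # Field `field` of every element of an array of struct ds.
    return LocationSet(name, DS_OFFSETS[field], DS_SIZE)


print(scalar("a"))                       # int a;
print(struct_field("b", "f"))            # b.f
print(array("c", INT_SIZE))              # c[i], for any i
print(array_of_structs_field("d", "g"))  # d[i].g, for any i
```

With this reading, `c[3]` and `c[97]` share one node, while `b.e` and `b.f` are distinct nodes, which is what "arrays aggregated, structs separated" asks for.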

  26. Special Location Sets
     • Each dynamic memory allocation site has its own name. E.g., the location set that represents a field f in a structure dynamically allocated at site s is <s, f, 0>.
     • Additional location sets:
       • UND: undefined
       • UNK: unknown
       • NULL: C null

  27. Basic Pointer Assignment Transformations
     Ignoring pointer arithmetic and casting for now.

  28. PPA
     • Let X_s represent the probabilistic points-to graph/matrix at a specific program point s.
     [Figure: X_IN enters a basic pointer assignment instruction and X_OUT leaves it.]
     • Claim: there exists a transformation function T_i(X) for every instruction i, such that X_OUT = T_i(X_IN).

  29. Linear Transformations
     • A transformation T(X) is linear iff the following relationships hold for all points-to matrices U and V and every scalar c:
       • T(U+V) = T(U) + T(V)
       • T(cU) = cT(U)
     • If T_B and T_A are linear transformations represented by the matrices B and A respectively, then:
       • T_B(T_A(X)) = [B][A][X]

  30. Linear Points-To Representation
     • A points-to matrix is used to represent the points-to graph.
     • Matrix row/column labeling:
       • Location sets are denoted with L<id>
       • Pointers are denoted with P<id>
     • Rules for linearity:
       • Pointers can only point to location sets
       • Location sets always point to themselves with probability 1.0
       • All rows sum to 1.0
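The three rules can be checked mechanically. The sketch below builds one possible initial matrix and validates it. The row ordering in `NAMES`, the choice to start every pointer at UND, and the function names are assumptions made for illustration.

```python
NAMES = ["L1", "L2", "L5", "P1", "P2", "P5",
         "L3", "L4", "L6", "UND", "NULL", "UNK"]
IDX = {name: i for i, name in enumerate(NAMES)}
POINTERS = {"P1", "P2", "P5"}


def initial_matrix():
    """Location sets point to themselves; pointers start at UND (assumed)."""
    n = len(NAMES)
    x = [[0.0] * n for _ in range(n)]
    for name in NAMES:
        target = "UND" if name in POINTERS else name
        x[IDX[name]][IDX[target]] = 1.0
    return x


def is_valid(x, tol=1e-9):
    """Check the three linearity rules on a points-to matrix."""
    for name in NAMES:
        row = x[IDX[name]]
        # All rows sum to 1.0.
        if abs(sum(row) - 1.0) > tol:
            return False
        # Nothing points to a pointer label; targets are location sets only.
        if any(abs(row[IDX[p]]) > tol for p in POINTERS):
            return False
        # Location sets point to themselves with probability 1.0.
        if name not in POINTERS and abs(row[IDX[name]] - 1.0) > tol:
            return False
    return True


x0 = initial_matrix()
print(is_valid(x0))
```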

  31. Linear Points-To Representation
     int *a; /*L1, P1*/
     int *b; /*L2, P2*/
     int x[N]; /*L3*/
     int y[N]; /*L4*/
     int *tmp; /*L5, P5*/
     …
     ~ = (int*)calloc(N, sizeof(int)) /*L6*/;
     [Figure: points-to graph with the nodes a, b, tmp, x, y, UND, NULL, UNK, and the allocation site L6.]

  32. Linear Points-To Representation
     Same declarations as on the previous slide.
     [Figure: the same graph relabeled with L1, L2, L5, P1, P2, P5, L3, L4, L6, UND, NULL, UNK.]

  33. Points-To Matrix
     Same declarations as on the previous slide.
     [Figure: the corresponding points-to matrix.]

  34. Points-To Matrix Properties
     [Figure: block structure of the points-to matrix, with blocks labeled Ø, I, Ø.]

  35. The Transformation Matrix
     • For every basic pointer assignment there exists a linear transformation matrix T such that: X_OUT = T X_IN
     [Figure: X_IN enters a basic pointer assignment instruction and X_OUT leaves it.]

  36. The Pointer Assignment Operation
     MATLAB code:
       % PPA_ptra: Probabilistic Pointer Analysis pointer assignment function
       % Returns the PPA ptr assignment transformation matrix
       function T = PPA_ptra(ptr, loc, N)
       T = eye(N);
       T(ptr,ptr) = 0.0;
       T(ptr,loc) = 1.0;

  37. Pointer Assignment Example
     int *a; /*L1, P1*/ int *b; /*L2, P2*/ int x[N]; /*L3*/ int y[N]; /*L4*/ int *tmp; /*L5, P5*/
     tmp = a;
     S1: P5 -> P1
     T(P5->P1) = T_S1 = eye(12); T_S1(P5,P5) = 0.0; T_S1(P5,P1) = 1.0

  38. Pointer Assignment Example
     int *a; /*L1, P1*/ int *b; /*L2, P2*/ int x[N]; /*L3*/ int y[N]; /*L4*/ int *tmp; /*L5, P5*/
     a = x;
     S2: P1 -> L3
     T(P1->L3) = T_S2 = eye(12); T_S2(P1,P1) = 0.0; T_S2(P1,L3) = 1.0
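A 0-based Python port of the MATLAB `PPA_ptra` function makes the two examples easy to try. This is a sketch: the index ordering in `NAMES` and the initial matrix `x0` (every pointer starts at UND) are assumptions, and the statements are applied in the order `a = x; tmp = a;` purely for illustration.

```python
NAMES = ["L1", "L2", "L5", "P1", "P2", "P5",
         "L3", "L4", "L6", "UND", "NULL", "UNK"]
IDX = {name: i for i, name in enumerate(NAMES)}
N = len(NAMES)


def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def ppa_ptra(ptr, loc, n):
    """Port of PPA_ptra: identity, except row ptr selects row loc."""
    t = eye(n)
    t[ptr][ptr] = 0.0
    t[ptr][loc] = 1.0
    return t


# Assumed initial state: location sets point to themselves, pointers to UND.
x0 = eye(N)
for p in ("P1", "P2", "P5"):
    x0[IDX[p]][IDX[p]] = 0.0
    x0[IDX[p]][IDX["UND"]] = 1.0

t_s2 = ppa_ptra(IDX["P1"], IDX["L3"], N)   # a = x;    (S2: P1 -> L3)
t_s1 = ppa_ptra(IDX["P5"], IDX["P1"], N)   # tmp = a;  (S1: P5 -> P1)

# X_OUT = T_S1 T_S2 X_IN for the sequence "a = x; tmp = a;"
x1 = matmul(t_s1, matmul(t_s2, x0))
print(x1[IDX["P1"]][IDX["L3"]], x1[IDX["P5"]][IDX["L3"]])
```

Row P1 of `T_S2 X` is a copy of row L3 of `X`, and because a location set points to itself with probability 1.0, `a` ends up pointing to `x`. That is one reason the self-edges on location sets are useful.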

  39. Combining Transformation Matrices
     X_OUT = T_2 T_1 X_IN
     [Figure: X_IN enters basic pointer assignment instruction T_1, followed by basic pointer assignment instruction T_2, which produces X_OUT.]

  40. Combining Pointer Assignment Example
     int *a; /*L1, P1*/ int *b; /*L2, P2*/ int x[N]; /*L3*/ int y[N]; /*L4*/ int *tmp; /*L5, P5*/
     void swap() { tmp = a; a = b; b = tmp; }
     S1: P5 -> P1
     S2: P1 -> P2
     S3: P2 -> P5
     T_swap = T_S3 T_S2 T_S1
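The product T_S3 T_S2 T_S1 can be formed and inspected directly. The sketch below reuses a 0-based port of `PPA_ptra`; the index ordering in `NAMES` is an assumption.

```python
NAMES = ["L1", "L2", "L5", "P1", "P2", "P5",
         "L3", "L4", "L6", "UND", "NULL", "UNK"]
IDX = {name: i for i, name in enumerate(NAMES)}
N = len(NAMES)


def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def ppa_ptra(ptr, loc):
    """Identity matrix, except that row ptr selects row loc."""
    t = eye(N)
    t[IDX[ptr]][IDX[ptr]] = 0.0
    t[IDX[ptr]][IDX[loc]] = 1.0
    return t


t_s1 = ppa_ptra("P5", "P1")   # tmp = a;
t_s2 = ppa_ptra("P1", "P2")   # a = b;
t_s3 = ppa_ptra("P2", "P5")   # b = tmp;

# The first statement executed is the rightmost factor.
t_swap = matmul(t_s3, matmul(t_s2, t_s1))

for p in ("P1", "P2", "P5"):
    row = t_swap[IDX[p]]
    print(p, "takes the old row of", NAMES[row.index(1.0)])
```

The result says that after `swap()`, `a` holds what `b` held, while `b` and `tmp` both hold what `a` held, which matches the C code.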

  41. Combining Pointer Assignment Example
     [Figure: T_swap drawn as a graph between two copies of the labels L1, L2, L5, P1, P2, P5, L3, L4, L6, UND, NULL, UNK.]

  42. Combining Pointer Assignment Example
     [Figure: the same T_swap diagram with edge probabilities 0.3, 0.9, 0.9, 0.1, 0.7, 0.7, 0.1, 0.3, 0.9, 0.1.]

  43. Control flow and loops
     [Figure: control-flow graph with a block A, a back edge labeled N, and edges labeled p, q, r.]
     • Loops are found and back edges are labeled with their back edge count. [Assume all loops have a constant trip count for now.]
       • Denoted with a capital letter
     • All other edges are labeled with their basic block fan-in probability; these sum to 1.
       • Denoted with a lowercase letter

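  44. The Effect of Control Flow
     [Figure: a single block A.]
     T = T_A
     [Figure: control-flow graph with blocks A, B, C and edge probabilities p and q.]
     T = T_C [p T_B T_A + q T_A]

  45. The Effect of Control Flow
     [Figure: control-flow graph with blocks A, B, C, D and edge probabilities p and q.]
     T = T_D [p T_B + q T_C] T_A

  46. The Effect of Control Flow
     [Figure: control-flow graph with block A, blocks B1, B2, …, Bn with edge probabilities p1, p2, …, pn, and block C.]
     T = T_C [p1 T_B1 + p2 T_B2 + … + pn T_Bn] T_A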

  47. Example
     int *a; /*L1, P1*/ int *b; /*L2, P2*/ int x[N]; /*L3*/ int y[N]; /*L4*/ int *tmp; /*L5, P5*/
     void might_alias() { if(!RANDOM(10)) a = b; }
     BB1: if() /*0.1*/
     BB2: S1: P1 -> P2
     fi
     BB3:
     T_might_alias = T_BB3 [0.1 T_BB2 T_BB1 + 0.9 T_BB1]
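The formula for `might_alias` can be evaluated numerically. In this sketch T_BB1 and T_BB3 are taken to be the identity, on the assumption that those blocks contain no pointer assignments. Only the three pointer labels are modeled, and the helper names are made up for illustration.

```python
NAMES = ["P1", "P2", "P5"]          # reduced label set, for brevity
IDX = {name: i for i, name in enumerate(NAMES)}
N = len(NAMES)


def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def add(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * u for u in row] for row in a]


def ppa_ptra(ptr, loc):
    t = eye(N)
    t[IDX[ptr]][IDX[ptr]] = 0.0
    t[IDX[ptr]][IDX[loc]] = 1.0
    return t


t_bb1 = eye(N)                 # assumed: BB1 has no pointer assignment
t_bb2 = ppa_ptra("P1", "P2")   # a = b;
t_bb3 = eye(N)                 # assumed: BB3 has no pointer assignment

# T_might_alias = T_BB3 [0.1 T_BB2 T_BB1 + 0.9 T_BB1]
t_might_alias = matmul(
    t_bb3,
    add(scale(0.1, matmul(t_bb2, t_bb1)), scale(0.9, t_bb1)),
)
print(t_might_alias[IDX["P1"]])
```

Row P1 of the result mixes 0.9 of the old `a` with 0.1 of the old `b`, and every other row is unchanged.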

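  48. Loops: Constant Trip Count
     [Figure: a block A with a back edge labeled N.]
     T = T_A [T_A]^N = [T_A]^(N+1)
     [Figure: a block A with a back edge labeled N, nested with a block B inside a loop with a back edge labeled M.]
     T = [ T_B [T_A]^(N+1) ]^(M+1)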
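Both formulas reduce to repeated matrix multiplication. The sketch below uses a plain loop for the power, and the 2x2 permutation matrix `flip` is only a stand-in for a loop body transformation. The function names are hypothetical.

```python
def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(t, k):
    """t multiplied by itself k times; k = 0 gives the identity."""
    result = eye(len(t))
    for _ in range(k):
        result = matmul(t, result)
    return result


def loop(t_a, n):
    """Body A with back edge count N: [T_A]^(N+1)."""
    return mat_pow(t_a, n + 1)


def nested_loop(t_a, n, t_b, m):
    """[ T_B [T_A]^(N+1) ]^(M+1)"""
    return mat_pow(matmul(t_b, loop(t_a, n)), m + 1)


flip = [[0.0, 1.0], [1.0, 0.0]]    # stand-in body: exchanges two rows

print(loop(flip, 2))                    # three iterations: equals flip
print(nested_loop(flip, 2, eye(2), 1))  # (I * flip)^2: equals the identity
```

For large trip counts, exponentiation by squaring would cut the number of multiplications from N+1 to O(log N); the simple loop is kept here for readability.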

  49. Loop Transformation Types
     • Identity
     • Converges
     • Periodic
     • Converges and Periodic
     Example loops:
     for(i=0;i<N;i++) { swap(); }
     for(i=0;i<N;i++) { if(RANDOM(10)) { a = b; swap(); } }
     for(i=0;i<N;i++) { if(!RANDOM(10)) { a = b; } }
     [Figure: results labeled "If Odd" and "If Even".]
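Two of these behaviours can be observed by raising a body matrix to increasing powers. This sketch models only the three pointer labels and builds the body matrices by hand. Pairing the `swap()` loop with "periodic" and the guarded `a = b` loop with "converges" is this sketch's reading, since the slide's figure is not available.

```python
def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_pow(t, k):
    result = eye(len(t))
    for _ in range(k):
        result = matmul(t, result)
    return result


# Row/column order: P1 (a), P2 (b), P5 (tmp).
# swap(): a takes old b, b takes old a, tmp takes old a.
t_swap = [[0.0, 1.0, 0.0],
          [1.0, 0.0, 0.0],
          [1.0, 0.0, 0.0]]

# if(!RANDOM(10)) a = b;  -> 0.1 * T(a = b) + 0.9 * I
t_guard = [[0.9, 0.1, 0.0],
           [0.0, 1.0, 0.0],
           [0.0, 0.0, 1.0]]

# Periodic: odd and even powers of t_swap alternate between two matrices.
print(mat_pow(t_swap, 3) == t_swap, mat_pow(t_swap, 4) == mat_pow(t_swap, 2))

# Converging: the weight that a keeps its old target decays as 0.9^k.
t50 = mat_pow(t_guard, 50)
print(t50[0])
```

After k iterations of the guarded loop, row P1 is 0.9^k on P1 and 1 - 0.9^k on P2, so the matrix approaches the one for an unconditional `a = b`.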

  50. Loops: Non-Constant Trip Count
     [Figure: a block A with a back edge labeled N.]
     Geometric Series Transform [gstr] operation:
     T = 1/(N+1) [ [T_A]^0 + [T_A]^1 + … + [T_A]^N ] = gstr(T_A, 0, N)
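A direct way to evaluate the geometric series transform is to accumulate successive powers. The slide only shows `gstr(TA, 0, N)`; treating the last two arguments as an inclusive range of exponents whose powers are averaged is this sketch's assumption.

```python
def eye(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gstr(t, lo, hi):
    """Average of t^k for k = lo..hi (assumed meaning of the arguments)."""
    n = len(t)
    power = eye(n)
    for _ in range(lo):               # power = t^lo
        power = matmul(t, power)
    total = [[0.0] * n for _ in range(n)]
    for _ in range(lo, hi + 1):
        total = [[u + v for u, v in zip(rt, rp)]
                 for rt, rp in zip(total, power)]
        power = matmul(t, power)
    count = hi - lo + 1
    return [[u / count for u in row] for row in total]


flip = [[0.0, 1.0], [1.0, 0.0]]   # stand-in loop body: exchanges two rows

print(gstr(flip, 0, 1))   # (I + flip) / 2
print(gstr(flip, 0, 2))   # (I + flip + I) / 3
```

Because each power of a row-stochastic matrix is row-stochastic, the average keeps every row summing to 1.0, so the result is still a valid transformation under the linearity rules.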
