
Hongtao Yu Zhaoqing Zhang Xiaobing Feng Wei Huo

Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code. Hongtao Yu Zhaoqing Zhang Xiaobing Feng Wei Huo Institute of Computing Technology, Chinese Academy of Sciences { htyu, zqzhang, fxb, huowei }@ict.ac.cn. Jingling Xue



  1. Level by Level: Making Flow- and Context-Sensitive Pointer Analysis Scalable for Millions of Lines of Code Hongtao Yu Zhaoqing Zhang Xiaobing Feng Wei Huo Institute of Computing Technology, Chinese Academy of Sciences { htyu, zqzhang, fxb, huowei }@ict.ac.cn Jingling Xue University of New South Wales jingling@cse.unsw.edu.au

  2. Outline • Introduction • Framework • Analyzing a Level • Experiments • Conclusion

  3. Introduction • Motivation • Who needs flow- and context-sensitive (FSCS) pointer analysis? • Software checking tools • Program understanding • Parallelization tools • Hardware synthesis • Existing methods cannot scale to large real programs • Aiming at millions of lines of C code

  4. Improve scalability • For flow-sensitivity • Decreasing iterations in dataflow analysis • Saving space of points-to graph • For context-sensitivity • Summary-based • Low storage penalty • Low apply penalty

  5. Idea • Level by Level analysis • Analyze the pointers in decreasing order of their points-to levels • Suppose int **q, *p, x; then q has level 2, p has level 1, and x has level 0. • Fast flow-sensitive analysis on full sparse SSA • Fast and accurate context-sensitive analysis using a full transfer function

  6. Contribution • performs a full-sparse flow-sensitive pointer analysis using a flow-insensitive algorithm • performs a context-sensitive pointer analysis efficiently with a precise full transfer function • yields a flow- and context-sensitive interprocedural may/must mod/ref on a compact SSA form • analyzes a million lines of code in minutes, faster than the state-of-the-art FSCS pointer analysis algorithms

  7. Framework • Compute points-to levels • For each points-to level, from the highest to the lowest: • Bottom-up: evaluate transfer functions • Top-down: propagate points-to sets • Incrementally build the call graph • Figure 1. Level-by-level pointer analysis (LevPA).

  8. Points-to level • Property 1. If a variable x is possibly pointed to by a pointer y, then ptl(x) ≤ ptl(y). • Property 2. If a variable y is possibly assigned to x, then ptl(x) = ptl(y). • Compute points-to levels with a unification-based pointer analysis
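
The level computation can be illustrated with a rough sketch. It assumes a simplified Steensgaard-style unification in which every equivalence class has at most one pointee class, and it defines the level of a variable as the length of its pointee chain. The class `Unifier` and the helpers `ptl`, `addr`, `copy`, and `deref` are hypothetical names chosen for this illustration, not the authors' implementation, and cyclic points-to chains are rejected rather than handled. The constraints encode the example program that the next slide introduces.

```python
class Unifier:
    """Union-find over variables; each class has at most one pointee class."""

    def __init__(self):
        self.parent = {}
        self.pointee = {}  # class representative -> a member of its pointee class

    def find(self, v):
        self.parent.setdefault(v, v)
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def target(self, v):
        """Return the pointee class of v, creating a placeholder if none exists."""
        r = self.find(v)
        if r not in self.pointee:
            self.pointee[r] = ("deref", r)
        return self.find(self.pointee[r])

    def union(self, a, b):
        a, b = self.find(a), self.find(b)
        if a == b:
            return
        self.parent[b] = a
        ta, tb = self.pointee.pop(a, None), self.pointee.pop(b, None)
        if ta is not None or tb is not None:
            self.pointee[a] = ta if ta is not None else tb
        if ta is not None and tb is not None:
            self.union(ta, tb)  # merged classes must share one pointee class


def ptl(u, v, active=None):
    """Points-to level: length of the pointee chain starting at v."""
    active = set() if active is None else active
    r = u.find(v)
    if r not in u.pointee:
        return 0
    if r in active:
        raise ValueError("cyclic points-to chain; not handled in this sketch")
    active.add(r)
    level = 1 + ptl(u, u.pointee[r], active)
    active.discard(r)
    return level


u = Unifier()

def addr(x, a):   # x = &a
    u.union(u.target(x), a)

def copy(x, y):   # x = y
    u.union(u.target(x), u.target(y))

def deref(x):     # the class that *x belongs to
    return u.target(x)

# Constraints of the example program (parameter passing is modelled as a copy).
addr("x", "a"); addr("y", "b"); copy("p", "x"); copy("q", "y")
addr("x", "c"); addr("y", "e"); addr("x", "d"); addr("y", "d")
addr("c", "t")
copy(deref("p"), deref("q"))   # *p = *q
addr(deref("q"), "obj")        # *q = &obj

levels = {v: ptl(u, v) for v in ["x", "y", "p", "q", "a", "b", "c", "d", "e", "t", "obj"]}
```

Under these assumptions the sketch assigns level 2 to x, y, p, and q, level 1 to a through e, and level 0 to t and obj.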

  9. Example • int obj, t; • main() { • L1: int **x, **y; • L2: int *a, *b, *c, *d, *e; • L3: x = &a; y = &b; • L4: foo(x, y); • L5: *b = 5; • L6: if ( … ) { x = &c; y = &e; } • L7: else { x = &d; y = &d; } • L8: c = &t; • L9: foo(x, y); • L10: *e = 10; } • void foo( int **p, int **q) { • L11: *p = *q; • L12: *q = &obj; • } • ptl(x, y, p, q) = 2 • ptl(a, b, c, d, e) = 1 • ptl(t, obj) = 0 • analyze • first { x, y, p, q } • then { a, b, c, d, e } • last { t, obj }

  10. Bottom-up analyze level 2 • void foo( int **p, int **q) { • L11: *p = *q; • L12: *q = &obj; } • main() { • L1: int **x, **y; • L2: int *a, *b, *c, *d, *e; • L3: x = &a; y = &b; • L4: foo(x, y); • L5: *b = 5; • L6: if ( … ) { x = &c; y = &e; } • L7: else { x = &d; y = &d; } • L8: c = &t; • L9: foo( x, y); • L10: *e = 10; }

  11. Bottom-up analyze level 2 • void foo( int **p, int **q) { • L11: *p1 = *q1; • L12: *q1 = &obj; } • p1's points-to set depends on formal-in p • q1's points-to set depends on formal-in q • main() { • L1: int **x, **y; • L2: int *a, *b, *c, *d, *e; • L3: x = &a; y = &b; • L4: foo(x, y); • L5: *b = 5; • L6: if ( … ) { x = &c; y = &e; } • L7: else { x = &d; y = &d; } • L8: c = &t; • L9: foo(x, y); • L10: *e = 10; }

  12. Bottom-up analyze level 2 • void foo( int **p, int **q) { • L11: *p1 = *q1; • L12: *q1 = &obj; } • p1's points-to set depends on formal-in p • q1's points-to set depends on formal-in q • main() { • L1: int **x, **y; • L2: int *a, *b, *c, *d, *e; • L3: x1 = &a; y1 = &b; • L4: foo(x1, y1); • L5: *b = 5; • L6: if ( … ) { x2 = &c; y2 = &e; } • L7: else { x3 = &d; y3 = &d; } • x4=ϕ (x2, x3); y4=ϕ (y2, y3) • L8: c = &t; • L9: foo(x4, y4); • L10: *e = 10; } • x1 → { a } • y1 → { b } • x2 → { c } • y2 → { e } • x3 → { d } • y3 → { d } • x4 → { c, d } • y4 → { e, d }

  13. Full-sparse Analysis • Achieve flow-sensitivity flow-insensitively • Regard each SSA name as a unique variable • Set constraint-based pointer analysis • Full sparse • Saving time • Saving space
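
As a hedged illustration of treating each SSA name as its own variable, the sketch below solves the level-2 constraints of main with a plain flow-insensitive fixpoint over inclusion constraints. The function `solve` and the constraint encoding are assumptions made for this example; ϕ nodes are modelled as two copy constraints.

```python
def solve(addr_of, copies):
    """Flow-insensitive inclusion solver: lhs ⊇ {obj} and lhs ⊇ rhs."""
    pts = {}
    for lhs, obj in addr_of:
        pts.setdefault(lhs, set()).add(obj)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in copies:
            before = len(pts.setdefault(lhs, set()))
            pts[lhs] |= pts.get(rhs, set())
            if len(pts[lhs]) != before:
                changed = True
    return pts


# SSA form of main: every definition of x and y gets a fresh name.
addr_of = [("x1", "a"), ("y1", "b"),   # L3
           ("x2", "c"), ("y2", "e"),   # L6
           ("x3", "d"), ("y3", "d")]   # L7
copies = [("x4", "x2"), ("x4", "x3"),  # x4 = ϕ(x2, x3)
          ("y4", "y2"), ("y4", "y3")]  # y4 = ϕ(y2, y3)

pts = solve(addr_of, copies)
```

Because the SSA names already separate the definitions, the solver needs no per-statement points-to graph: `pts["x1"]` stays `{a}` at L4 while `pts["x4"]` becomes `{c, d}` at L9, matching the sets on the previous slide.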

  14. Top-down analyze level 2 • main: Propagate to callsite • main() { • L1: int **x, **y; • L2: int *a, *b, *c, *d, *e; • L3: x = &a; y = &b; • L4: foo(x, y); • L5: *b = 5; • L6: if ( … ) { x = &c; y = &e; } • L7: else { x = &d; y = &d; } • L8: c = &t; • L9: foo(x, y); • L10: *e = 10; } • At L4: foo.p → { a }, foo.q → { b } • At L9: foo.p → { c, d }, foo.q → { d, e } • Merged: foo.p → { a, c, d }, foo.q → { b, d, e } • void foo( int **p, int **q) { • L11: *p = *q; • L12: *q = &obj; }
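
The propagation step can be sketched as a union of the actual arguments' points-to sets over all call sites. The function `propagate` and the dictionary layout are hypothetical and exist only for this illustration; the input sets are the ones derived for the call sites L4 and L9.

```python
def propagate(callsites, formals):
    """Merge the points-to sets of the actuals at every call site into the formals."""
    merged = {f: set() for f in formals}
    for actual_sets in callsites.values():
        for formal, pts in zip(formals, actual_sets):
            merged[formal] |= pts
    return merged


# Points-to sets of the actuals (x, y) at each call to foo.
callsites = {
    "L4": ({"a"}, {"b"}),
    "L9": ({"c", "d"}, {"d", "e"}),
}
foo_formals = propagate(callsites, ["p", "q"])
```

The merged sets lose the per-call-site distinction, which is why the later slides attach context conditions to recover it.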

  15. Top-down analyze level 2 • void foo( int **p, int **q) { • L11: *p = *q; • L12: *q = &obj; } • foo: Expand pointer dereferences (merging calling contexts here) • void foo( int **p, int **q) { • μ(b, d, e) • L11: *p1 = *q1; • χ(a, c, d) • L12: *q1 = &obj; • χ(b, d, e) • } • main() { • L1: int **x, **y; • L2: int *a, *b, *c, *d, *e; • L3: x = &a; y = &b; • L4: foo(x, y); • L5: *b = 5; • L6: if ( … ) { x = &c; y = &e; } • L7: else { x = &d; y = &d; } • L8: c = &t; • L9: foo(x, y); • L10: *e = 10; }

  16. Context Condition • To be context-sensitive • Points-to relation ci • p ⟹ v (p → v): p must (may) point to v, where p is a formal parameter. • Context Condition ℂ(c1,…,ck) • a Boolean function consisting of higher-level points-to relations • Context-sensitive μ and χ • μ(vi, ℂ(c1,…,ck)) • vi+1=χ(vi, M, ℂ(c1,…,ck)) • M ∈ {may, must} indicates a weak/strong update

  17. Context-sensitive μ and χ void foo( int **p, int **q) { μ(b, q⟹b) μ(d,q→d) μ(e,q→e) L11: *p1 = *q1; a=χ(a , must, p⟹a) c=χ(c , may, p→c) d=χ(d , may, p→d) L12: *q1 = &obj; b=χ(b , must, q⟹b) d=χ(d , may, q→d) e=χ(e , may, q→e) }

  18. Bottom-up analyze level 1 void foo( int **p, int **q) { μ(b1, q⟹b) μ(d1,q→d) μ(e1,q→e) L11: *p1 = *q1; a2=χ(a1 , must, p⟹a) c2=χ(c1 , may, p→c) d2=χ(d1 , may, p→d) L12: *q1 = &obj; b2=χ(b1 , must, q⟹b) d3=χ(d2 , may, q→d) e2=χ(e1 , may, q→e) }

  19. Points-to Set • Local Points-to Set • Loc(p) = { <v, ℂ(c1,…,ck)> | ℂ(c1,…,ck) is a context condition }. • p can point to v if and only if ℂ(c1,…,ck) holds. • Loc(p) is computed explicitly during the bottom-up analysis. • Dependence Set • Dep(p) = { <q, ℂ(c1,…,ck)> | q is a formal-in parameter of level lev and ℂ(c1,…,ck) is a context condition }. • Ptr(p) includes Ptr(q) if and only if ℂ(c1,…,ck) holds.

  20. Transfer function • Trans(proc, v) • < Loc(v), Dep(v), ℂ(c1,…,ck), M > • v is a formal-out parameter • ℂ(c1,…,ck) is a context condition. • v can be modified at a callsite invoking proc only if ℂ(c1,…,ck) holds at the callsite • M ∈ {may, must} indicates the may/must mod effect • Trans(proc) • the set of all individual transfer functions Trans(proc, v).

  21. Bottom-up analyze level 1 • Trans(foo, a) = < { }, { <b, q⟹b> , < d, q→d>, < e, q→e>} , p⟹a, must > void foo( int **p, int **q) { μ(b1, q⟹b) μ(d1, q→d) μ(e1, q→e) L11: *p1 = *q1; a2=χ(a1 , must, p⟹a) c2=χ(c1 , may, p→c) d2=χ(d1 , may, p→d) L12: *q1 = &obj; b2=χ(b1 , must, q⟹b) d3=χ(d2 , may, q→d) e2=χ(e1 , may, q→e) } • Trans(foo, c) = < { }, { <b, q⟹b> , < d, q→d>, < e, q→e>} , p→c, may > • Trans(foo, b) = < {< obj, q⟹b> }, { } , q⟹b, must > • Trans(foo, e) = < {< obj, q→e> }, { } , q→e, may > • Trans(foo, d) = < {< obj, q→d> }, { <b, p→d ∧ q⟹b> , < d, p→d>, < e, p→d ∧ q→e> } , p→d ∨ q→d, may >

  22. Bottom-up analyze level 1 • int obj, t; • main() { • L1: int **x, **y; • L2: int *a, *b, *c, *d, *e; • L3: x1 = &a; y1 = &b; • μ(b1, true) • L4: foo(x1, y1); • a2=χ(a1, must, true) • b2=χ(b1, must, true) • L5: *b1 = 5; • L6: if ( … ) { x2 = &c; y2 = &e; } • L7: else { x3 = &d; y3 = &d; } • x4=ϕ (x2, x3); y4=ϕ (y2, y3) • L8: c1 = &t; • μ(d1, true) • μ(e1, true) • L9: foo(x4, y4); • c2=χ(c1, may, true) • d2=χ(d1, may, true) • e2=χ(e1, may, true) • L10: *e1 = 10; } • at L4, p ⟹ a and q ⟹ b hold • at L9, p → c, p → d, q → e, and q → d hold
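
How the mod conditions of Trans(foo, v) select the χ operations at each call site can be sketched as follows. The tuple encoding of conditions, the names `holds`, `MOD_FOO`, and `mod_effects`, and the reading of the two relations (p ⟹ v holds when the actual's points-to set is exactly {v}, p → v holds when v is in that set) are assumptions made for this illustration. Only the last two components of each transfer function, the context condition and M, are modelled.

```python
def holds(cond, ctx):
    """Evaluate a context condition against the points-to sets of the actuals."""
    kind = cond[0]
    if kind == "must":               # p => v: p points to v and nothing else
        _, p, v = cond
        return ctx[p] == {v}
    if kind == "may":                # p -> v: v is among p's targets
        _, p, v = cond
        return v in ctx[p]
    if kind == "or":
        return any(holds(c, ctx) for c in cond[1:])
    if kind == "and":
        return all(holds(c, ctx) for c in cond[1:])
    raise ValueError(f"unknown condition kind: {kind}")


# Context condition and mod effect M of each Trans(foo, v), taken from slide 21.
MOD_FOO = {
    "a": (("must", "p", "a"), "must"),
    "b": (("must", "q", "b"), "must"),
    "c": (("may", "p", "c"), "may"),
    "d": (("or", ("may", "p", "d"), ("may", "q", "d")), "may"),
    "e": (("may", "q", "e"), "may"),
}


def mod_effects(trans, ctx):
    """Variables that a call may or must modify in the given calling context."""
    return {v: m for v, (cond, m) in trans.items() if holds(cond, ctx)}


at_L4 = mod_effects(MOD_FOO, {"p": {"a"}, "q": {"b"}})
at_L9 = mod_effects(MOD_FOO, {"p": {"c", "d"}, "q": {"d", "e"}})
```

With these inputs, `at_L4` contains must-modifications of a and b, and `at_L9` contains may-modifications of c, d, and e, which corresponds to the χ operations placed after L4 and L9 on the slide.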

  23. BDD and context condition • Context conditions are implemented using BDDs • Compact representation • Efficient Boolean operations • Variable x1 represents p → a, x2 represents q → a, and x3 represents p → b • [Figure: BDD for ℂ = (p → a ∧ q → a) ∨ p → b over the variables x1, x2, x3] • If only p → b holds at a call site, we can evaluate ℂ|x1=0; x2=0; x3=1 to see whether ℂ holds at the call site.
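
The evaluation of ℂ at a call site can be illustrated with a toy decision diagram. This is only a sketch: it builds the diagram by exhaustive Shannon expansion, which is exponential in the number of variables, and it uses nested tuples instead of a real BDD package. The names `build` and `evaluate` are hypothetical.

```python
def build(f, order, env=None):
    """Build a reduced decision diagram for f by Shannon expansion over `order`.

    Internal nodes are (var, low, high) tuples; leaves are True or False.
    """
    env = dict(env or {})
    if not order:
        return bool(f(env))
    var, rest = order[0], order[1:]
    low = build(f, rest, {**env, var: False})
    high = build(f, rest, {**env, var: True})
    return low if low == high else (var, low, high)  # drop redundant tests


def evaluate(node, assignment):
    """Follow one path from the root to a leaf under a full assignment."""
    while not isinstance(node, bool):
        var, low, high = node
        node = high if assignment[var] else low
    return node


# x1: p -> a, x2: q -> a, x3: p -> b, so C = (x1 and x2) or x3.
condition = lambda e: (e["x1"] and e["x2"]) or e["x3"]
bdd = build(condition, ["x1", "x2", "x3"])

# Call site where only p -> b holds.
holds_at_callsite = evaluate(bdd, {"x1": False, "x2": False, "x3": True})
```

Here `holds_at_callsite` is `True`, since x3 alone satisfies ℂ. A production implementation would rely on a BDD library with hash-consed nodes so that conditions shared between transfer functions are stored once.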

  24. Experiment • Analyzes a million lines of code in minutes • Faster than the state-of-the-art FSCS pointer analysis algorithms. Table 2. Performance (secs).

  25. Conclusion • We present a scalable method for flow- and context-sensitive pointer analysis • Analyzes the pointers in a program level by level in terms of their points-to levels. • Fast flow-sensitive analysis on full sparse SSA form • Fast and accurate context-sensitive analysis using full transfer functions represented by BDDs. • Can analyze a million lines of C code in minutes, faster than the state-of-the-art methods.

  26. Thanks
