Pointer Analysis for Multithreaded Programs

Radu Rugina and Martin Rinard, MIT Laboratory for Computer Science

Outline
• Example
• Review of Pointer Analysis for Sequential Programs
• Pointer Analysis for Multithreaded Programs
• Experimental Results
• Conclusions
R. Rugina, M. Rinard PLDI 99

Example
• two concurrent threads
• two questions:
  • Q1: what location is written by *p=1 ?
  • Q2: what location is written by *p=2 ?
• or, equivalently:
  • Q1: what does p point to in the left thread?
  • Q2: what does p point to after both threads have completed?

Two Possible Executions

Analysis Result
• Result = a points-to graph at each program point

Analysis of Multithreaded Programs
• Problem:
  • analyze interactions between concurrent threads
• Straightforward solution:
  • analyze all possible interleavings and merge results
  • fails because of exponential complexity
  • for n threads with $s_1, \ldots, s_n$ statements:
    $\text{Number of interleavings} = \binom{s_1 + \cdots + s_n}{s_1, \ldots, s_n} = \frac{(s_1 + \cdots + s_n)!}{s_1! \cdots s_n!}$
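To get a feel for how fast the interleaving count grows, the multinomial formula above can be evaluated directly. This is only an illustrative sketch; the function name `interleavings` is an assumption and does not come from the slides.

```python
from math import factorial, prod


def interleavings(*stmt_counts: int) -> int:
    """Multinomial coefficient (s1 + ... + sn)! / (s1! * ... * sn!)."""
    total = factorial(sum(stmt_counts))
    # The division is exact because a multinomial coefficient is an integer.
    return total // prod(factorial(s) for s in stmt_counts)


print(interleavings(2, 2))        # 6
print(interleavings(10, 10))      # 184756
print(interleavings(10, 10, 10))  # 5550996791340
```

Even three threads of ten statements each already give more than five trillion interleavings, which is why enumerating them is not practical.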

Our Approach
• We introduce interference information:
  • interference = points-to edges created by the other concurrent threads
  • models the effect of “all possible interleavings”
• Efficiency: polynomial complexity in program size
• Derive dataflow equations:
  • recursive equations
  • fixed-point algorithms to solve the equations
  • theoretically less precise than “all interleavings”
  • in practice: no loss of precision

Algorithm Overview
• intra-procedural:
  • flow-sensitive (dataflow analysis)
  • handles unstructured flow of control
  • defines dataflow equations for:
    • pointer assignments
    • parallel constructs
• inter-procedural:
  • context-sensitive
  • handles recursive functions

Review of Pointer Analysis for Sequential Programs

Points-to Graphs
• Points-to graphs [EGH94]
  • nodes = program variables
  • edges = points-to relationships
• Example: [Figure: example points-to graph]

Basic Pointer Assignments
• Four types of pointer assignments:
  • x = &y (address-of assign)
  • x = y (copy assign)
  • x = *y (load assign)
  • *x = y (store assign)
• More complex assignments:
  • transformed into a sequence of basic statements, for example *z = &t; becomes tmp = &t; *z = tmp;

Generated Edges
[Figure: points-to edges generated by each basic assignment. Panels: address-of: x = &y; copy: x = y; load: x = *y; store: *x = y]

Strong vs. Weak Updates
• strong updates:
  • kill existing points-to relationships
  • result in more precise analysis results
• weak updates:
  • leave existing points-to edges in place
• reasons for weak updates:
  • control flow uncertainty: if (cond) p = &q; else p = &r; *p = &x;
  • arrays of pointers: v[i] = &x;
  • heap-allocated pointers: p = malloc(sizeof(int*)); *p = &x;
[Figure: points-to graph over the nodes p, q, r, x, y, z]

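Dataflow Information
• copy: x = y
  • $\text{gen} = \{ (x, t) \mid (y, t) \in C \}$
  • $\text{kill} = \{ (x, z) \mid (x, z) \in C \}$
  • $\text{strong} = \neg (\text{array\_elem}(x) \lor \text{heap}(x))$
• address-of: x = &y
  • $\text{gen} = \{ (x, y) \}$
  • $\text{kill} = \{ (x, z) \mid (x, z) \in C \}$
  • $\text{strong} = \neg (\text{array\_elem}(x) \lor \text{heap}(x))$
• load: x = *y
  • $\text{gen} = \{ (x, u) \mid (y, t) \in C \land (t, u) \in C \}$
  • $\text{kill} = \{ (x, z) \mid (x, z) \in C \}$
  • $\text{strong} = \neg (\text{array\_elem}(x) \lor \text{heap}(x))$
• store: *x = y
  • $\text{gen} = \{ (z, t) \mid (x, z) \in C \land (y, t) \in C \}$
  • $\text{kill} = \{ (z, w) \mid (x, z) \in C \land (z, w) \in C \}$
  • $\text{strong} = (\{ z \mid (x, z) \in C \} = \{ v \}) \land \neg (\text{array\_elem}(v) \lor \text{heap}(v))$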
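The gen, kill and strong definitions above can be sketched as a transfer function over a set of points-to edges. This is a minimal sketch for the sequential case, not the authors' implementation. It assumes the usual update rule, which the slide does not spell out: the new graph is (C - kill) ∪ gen for a strong update and C ∪ gen for a weak one. The statement encoding and the `weak_locs` parameter (locations that are array elements or heap cells) are hypothetical names.

```python
def transfer(stmt, C, weak_locs=frozenset()):
    """Apply one basic pointer assignment to the edge set C and return the new set.

    stmt is a tuple (kind, x, y) with kind in {"addr", "copy", "load", "store"}.
    weak_locs (assumed name) lists locations that are array elements or heap cells.
    """
    kind, x, y = stmt
    if kind == "store":                                   # *x = y
        targets = {z for (a, z) in C if a == x}
        gen = {(z, t) for z in targets for (b, t) in C if b == y}
        # Strong only if x has exactly one target and it is not an array/heap cell.
        strong = len(targets) == 1 and not (targets & weak_locs)
        kill = {(z, w) for (z, w) in C if z in targets}
    else:
        if kind == "addr":                                # x = &y
            gen = {(x, y)}
        elif kind == "copy":                              # x = y
            gen = {(x, t) for (b, t) in C if b == y}
        elif kind == "load":                              # x = *y
            gen = {(x, u) for (b, t) in C if b == y
                   for (c, u) in C if c == t}
        else:
            raise ValueError(f"unknown statement kind: {kind}")
        strong = x not in weak_locs
        kill = {(a, z) for (a, z) in C if a == x}
    # Assumed update rule: kill edges only when the update is strong.
    return (C - kill) | gen if strong else C | gen


# Control-flow uncertainty: p may point to q or r, so *p = &x is a weak update.
C = {("p", "q"), ("p", "r"), ("q", "y"), ("r", "z")}
C = transfer(("addr", "tmp", "x"), C)     # tmp = &x
C = transfer(("store", "p", "tmp"), C)    # *p = tmp
print(sorted(e for e in C if e[0] in ("q", "r")))
```

In this run both q and r keep their old targets and gain x, which is the weak-update behaviour described for control-flow uncertainty.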

Dataflow Analysis
• the dataflow information is: $\langle C, I, E \rangle \in P^3$
  • C = the current points-to relationships
  • I = the interference information from other threads
  • E = edges created by the current thread
• as a set of edges, $P^3$ is a lattice:
  • partial order relation = set inclusion
  • merge operator = set union
  • $\langle C_1, I_1, E_1 \rangle \sqcup \langle C_2, I_2, E_2 \rangle = \langle C_1 \cup C_2, I_1 \cup I_2, E_1 \cup E_2 \rangle$
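The lattice structure can be illustrated with a small sketch in which the triple is a tuple of edge sets, merge is componentwise union and the partial order is componentwise inclusion. The names `Info`, `merge` and `leq` are assumptions made for this illustration.

```python
from typing import FrozenSet, NamedTuple, Tuple

Edge = Tuple[str, str]  # (source, target) points-to edge


class Info(NamedTuple):
    """Dataflow triple <C, I, E>; each component is a set of points-to edges."""
    C: FrozenSet[Edge]
    I: FrozenSet[Edge]
    E: FrozenSet[Edge]


def merge(a: Info, b: Info) -> Info:
    """Merge operator: componentwise set union."""
    return Info(a.C | b.C, a.I | b.I, a.E | b.E)


def leq(a: Info, b: Info) -> bool:
    """Partial order: componentwise set inclusion."""
    return a.C <= b.C and a.I <= b.I and a.E <= b.E


# Merging the two branches of: if (cond) p = &q; else p = &r;
then_branch = Info(frozenset({("p", "q")}), frozenset(), frozenset({("p", "q")}))
else_branch = Info(frozenset({("p", "r")}), frozenset(), frozenset({("p", "r")}))
joined = merge(then_branch, else_branch)
print(sorted(joined.C))  # [('p', 'q'), ('p', 'r')]
```

The merged triple is an upper bound of both branch results, which is what a control-flow join needs.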

Abstract Interpretation
• P = set of points-to graphs
• Stat = set of program statements
• abstract semantics is defined by a functional: $[\![\,\cdot\,]\!] : \text{Stat} \to (P^3 \to P^3)$

Parallel par Statements
• syntax: par { {t1}, ..., {tn} }
• concurrent execution
• interleaving semantics
• may be nested
• interference:
  • is the union of points-to edges created by all other concurrent threads
  • may be different for different concurrent threads

Analysis of Individual Threads
• Interference information:
  • I = “global” interference, generated by enclosing par’s
  • $L_i$ = “local” interference, generated by the current par
• E = points-to edges created by the current thread
• Analysis result for thread $t_i$:
  • $\langle C_i', I_i, E_i \rangle = [\![ t_i ]\!] \langle C_i, I_i, \emptyset \rangle$
  • $I_i = I \cup L_i$
  • $C_i = C \cup L_i$

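Parend Analysis
• Analysis result: $\langle C', I', E' \rangle = [\![ \text{par} ]\!] \langle C, I, E \rangle$
• $\langle C_i', I_i, E_i \rangle = [\![ t_i ]\!] \langle C_i, I_i, \emptyset \rangle$
• $I' = I$
• $E' = E \cup \left( \bigcup_i E_i \right)$
• $C' = \bigcap_i C_i'$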

Analysis of Entire par Construct
• Recursive dataflow equations:
  • information flowing INTO the par construct (thread rule):
    • $C_i = C \cup L_i$
    • $I_i = I \cup L_i$
    • $\langle C_i', I_i, E_i \rangle = [\![ t_i ]\!] \langle C_i, I_i, \emptyset \rangle$
  • information flowing OUT of the par construct (par rule):
    • $E' = E \cup \left( \bigcup_i E_i \right)$
    • $C' = \bigcap_i C_i'$
    • $\langle C', I, E' \rangle = [\![ \text{par} ]\!] \langle C, I, E \rangle$
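The recursive equations for a par construct can be solved by iterating until the created-edge sets stop changing. The sketch below is illustrative only and rests on several assumptions that the slides do not state: the local interference of thread i is taken to be the union of the edges created by the other threads of the same par; after every assignment the interference I is added back into C and the generated edges are recorded in E; only address-of and copy statements on plain variables are modeled; and there is at least one thread. The two-thread program is a hypothetical stand-in for the example slide, whose figure is not preserved in this transcript.

```python
def step(stmt, C, I, E):
    """One basic assignment on <C, I, E> (address-of and copy only)."""
    kind, x, y = stmt
    if kind == "addr":                            # x = &y
        gen = {(x, y)}
    elif kind == "copy":                          # x = y
        gen = {(x, t) for (b, t) in C if b == y}
    else:
        raise ValueError(f"unsupported statement kind: {kind}")
    kill = {(a, z) for (a, z) in C if a == x}     # strong update: x is a plain variable
    # Assumption: interference edges are re-added after every statement.
    return (C - kill) | gen | I, I, E | gen


def analyze_thread(stmts, C, I):
    """Run a straight-line thread body from <C, I, {}>; return (C', E)."""
    E = set()
    for s in stmts:
        C, I, E = step(s, C, I, E)
    return C, E


def analyze_par(threads, C, I, E):
    """Fixed-point solution of the thread rule and the par rule."""
    created = [set() for _ in threads]            # current guess for each E_i
    while True:
        entries, exits, new_created = [], [], []
        for i, stmts in enumerate(threads):
            # Assumed local interference: edges created by all other threads.
            L = set().union(*(created[j] for j in range(len(threads)) if j != i))
            Ci, Ii = C | L, I | L
            Ci_out, Ei = analyze_thread(stmts, Ci, Ii)
            entries.append(Ci)
            exits.append(Ci_out)
            new_created.append(created[i] | Ei)   # only grow, so the loop terminates
        if new_created == created:
            break
        created = new_created
    C_out = set.intersection(*exits)              # C' = intersection of the C_i'
    E_out = E | set().union(*created)             # E' = E united with all E_i
    return C_out, I, E_out, entries


# Hypothetical program:  p = &x;  par { { *p = 1; } { p = &y; } }  *p = 2;
# The write *p = 1 is not a pointer assignment, so the left thread body is empty.
C_out, I_out, E_out, entries = analyze_par(
    [[], [("addr", "p", "y")]], {("p", "x")}, set(), set())
print(sorted(entries[0]))   # [('p', 'x'), ('p', 'y')]
print(sorted(C_out))        # [('p', 'y')]
```

Under these assumptions the left thread sees p pointing to either x or y, while after the par construct p points only to y.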

Example Analysis

Inter-Procedural Analysis
• Context-sensitive:
  • procedures re-analyzed at each call site
• Ghost variables:
  • replace variables not in the scope of the procedure
  • distinguish locals in different activations of recursive functions
• Sequential Partial Transfer Functions (Seq-PTFs) [WL95]
  • associate a points-to output graph with an input context
  • can be reused when there is a match for the input context

Multithreaded Extensions
• Multithreaded Input Context = input points-to information + interference information
• Multithreaded PTF = associates an output points-to graph + created edges with an input context
• Mapping and unmapping:
  • map the interference information I
  • unmap created points-to edges E

Other Parallel Constructs
• Parallel for loops: for(i=0; i<n; i++) spawn thread(i); sync;
  • generate a symmetric dataflow equation: $[\![ t_1 ]\!] \langle C \cup E_1, I \cup E_1, \emptyset \rangle = \langle C_1', I \cup E_1, E_1 \rangle$
• Conditional Thread Creation: if (c1) spawn thread1(); if (c2) spawn thread2(); sync;
  • merge analysis result with initial points-to graph: $C' = \bigcap_i (C_i' \cup C_i)$
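The rule for conditional thread creation can be compared with the unconditional parend rule in a few lines. This sketch assumes the per-thread entry graphs and exit graphs have already been computed. The function names and the sample edge sets are hypothetical.

```python
def parend_unconditional(exits):
    """C' = intersection of the exit graphs C_i' (every thread definitely runs)."""
    return set.intersection(*exits)


def parend_conditional(entries, exits):
    """C' = intersection of (C_i' united with C_i): a thread may not have been created."""
    return set.intersection(*(c_out | c_in for c_in, c_out in zip(entries, exits)))


# Hypothetical two-thread case: thread 1 does nothing, thread 2 executes p = &y.
entries = [{("p", "x"), ("p", "y")}, {("p", "x")}]
exits = [{("p", "x"), ("p", "y")}, {("p", "y")}]

print(sorted(parend_unconditional(exits)))         # [('p', 'y')]
print(sorted(parend_conditional(entries, exits)))  # [('p', 'x'), ('p', 'y')]
```

With conditional creation the edge from p to x survives, because the thread that overwrites p might never have been spawned.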

Advanced Features
• Recursive procedures:
  • result in recursive dataflow equations
  • fixed-point algorithm to solve recursion
• Function pointers:
  • result in a dynamic call-graph
  • handled using the computed pointer information
  • methodology: analyze all possible callees and merge results
• Thread-private global variables:
  • at parbegin nodes: save their values in the parent thread and make them point to unknown in the child threads
  • at parend nodes: restore saved values in the parent thread

Algorithm Evaluation
• Soundness:
  • the multithreaded algorithm conservatively approximates all possible interleavings of concurrent threads’ statements
• Termination of fixed-point algorithms:
  • follows from the monotonicity of the abstract semantics functional
• Complexity of fixed-point algorithms:
  • worst-case size of points-to graphs: $O(n^2)$, where n = |Stat|
  • n program points imply worst-case $O(n^3)$ iterations
  • worst-case polynomial complexity: $O(n^4)$
• Precision of analysis:
  • if the concurrent threads do not (pointer-)interfere then this algorithm gives the same result as the “ideal algorithm”

Experimental Results
• Implementation:
  • SUIF infrastructure; Cilk benchmarks
• Benchmark characteristics: [Table: benchmark characteristics]

Precision Measurements
• Pointer values at load/store:
  • usually unique target: 83% of the loads, 88% of the stores
  • few potentially uninitialized pointers
  • very few pointers with more than two targets
• Comparison:
  • Multithreaded, Interleaved, Sequential: the Interleaved result lies between the MT and Seq results
  • results: Multithreaded = Sequential
  • conclusion: Multithreaded = Interleaved

Applications
• Current Uses:
  • MIT RAW project
    • memory disambiguation for static promotion (ISCA 99)
    • C-to-silicon compiler generating small memories (FCCM 99)
  • automatic parallelization of divide-and-conquer algorithms (PPoPP 99)
• Future Uses:
  • data race detection in multithreaded programs
  • static elimination of array bounds checks

Future
• Multithreaded programs:
  • are becoming very common
  • are hard to debug
  • are hard to analyze
• The current algorithm:
  • gives precise MT pointer information
  • may be used as a foundation for other MT analyses
  • gives a framework for other MT analyses

Additional Slides

Challenging Benchmark Set
• Applications Heavily Optimized By Hand
  • Pousse: timed competition, won ICFP '98 contest
• Pointer Arithmetic
• Casts
• Divide and Conquer Algorithms
• Recursion
• Pointers Into Heap-Allocated Arrays
• Pointer-Based Data Structures (octrees, hash tables, ...)
• Recursive Linked Data Structures Allocated On Stack

Related Work
• Pointer analysis:
  • existing pointer analyses focus on sequential programs [LR92], [LRZ93], [CBC93], [EGH94], [Ruf95], [WL95], [And94], [Ste96], [SH97]
  • flow-sensitive vs. flow-insensitive analysis
  • context-sensitive vs. context-insensitive analysis
• Multithreaded program analysis:
  • relatively unexplored field
  • flow-sensitive analysis:
    • dataflow framework for bitvector problems [KSV96]
    • does not apply to pointer analysis
  • flow-insensitive analysis:
    • trivially models the interleaving semantics of concurrent threads
    • locality analysis [ZH97] (uses type-inference techniques)
