data flow analysis n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Data-Flow Analysis PowerPoint Presentation
Download Presentation
Data-Flow Analysis

Loading in 2 Seconds...

play fullscreen
1 / 38

Data-Flow Analysis - PowerPoint PPT Presentation


  • 243 Views
  • Uploaded on

Data-Flow Analysis. Static Analysis Inspections Dependence analysis Symbolic execution Software Verification Data flow analysis Concurrency analysis Interprocedural analysis. Approaches. Dynamic Analysis Assertions Error seeding, mutation testing Coverage criteria

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Data-Flow Analysis' - josephine-strong


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
approaches
Static Analysis

Inspections

Dependence analysis

Symbolic execution

Software Verification

Data flow analysis

Concurrency analysis

Interprocedural analysis

Approaches
  • Dynamic Analysis
    • Assertions
    • Error seeding, mutation testing
    • Coverage criteria
    • Fault-based testing
    • Object oriented testing
    • Regression testing
data flow analysis dfa
Data Flow Analysis (DFA)
  • Efficient technique for proving properties about programs
  • Not as powerful as automated theorem provers, but requires less human expertise
  • Uses an annotated control flow graph model of the program
    • Compute facts for each node
  • Use the flow in the graph to compute facts about the whole program
    • We’ll focus on single units
some examples of dfa techniques
Some examples of DFA techniques
  • DFA used extensively in program optimization
    • e.g., determine if a definition is dead (and can be removed) determine if a variable always has a constant value determine if an assertion is always true and can be removed
  • DFA can also be used to find anomalies in the code
    • Find def/ref anomalies [Osterweil and Fosdick]
    • Cecil/Cesar system demonstrated the ability to prove general user-specified properties [Olender and Osterweil]
    • FLAVERS demonstrated applicability to concurrent system[Dwyer and Clarke]
  • Why “anomalies” and not faults?
    • May not correspond to an actual executable failure
data flow analysis1
Data flow analysis
  • First,determine local information that is true at each node in the CFG
    • e.g., What variables are defined What variables are referenced
    • Usually stored in sets
      • e.g.,ref(n) is the set of variables referenced at node n
  • Second, use this local information and control flow information to compute global information about the whole program
    • Done incrementally by looking at each node’s successors or predecessors
    • Use a fixed point algorithm-continue to update global information until a fixed point is reached
reaching definitions
Reaching Definitions
  • Definition reaches a nodeif there is a def clear path from the definition to that node
  • Definition of x at node 1 reaches nodes 2,3,4,5 but not 6

x:=

1

2

Could be used to determine data dependencies, useful for debugging, data flow testing, etc.

5

x:=

3

6

4

:=x

Start from a definition, and move forward on the graph to see how far it reaches

computing global information reaching definitions

Definitions that might reach a node

Definitions that must reach a node

reaching_def={xi}

reaching_def={xi,yj}

1

1

2

2

reaching_def={yj}

reaching_def={yj}

3

3

Computing global information- reaching definitions

Start from a definition, and move forward on the graph to see how far it reaches

reaching_def={xi,yj}

reaching_def={yj}

reaching definitions1
Reaching Definitions
  • Xi means that the definition of variable x at node i {might|must} reach the current node
    • Also stated as {possible|definite}

{some|all}

{any|all}

{may|must}

computing values for a node
Computing values for a node

Forward flow

  • Keep track of the definitions into a node that have not been redefined
  • Flow out of a node depends on flow in and on what happens in the node

y:= …x

In(n)

def(n)={y}

ref(n)={x…}

Out(n)

example possible reaching definitions

def={x}

ref={ }

def={y}

ref={x}

X1

Y2

X4

def={}

ref={x}

def={x}

ref={x,y}

def={}

ref={ }

Example: Possible Reaching Definitions

{ }

x = foo()

{X1}

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + y;

end if;

...

{X1}

y = x + 2

{X1,Y2}

{X1,Y2}

if(x > 0)

{X1,Y2}

{X1,Y2}

x = x + y

{X4,Y2}

Forward flow,any path problem,

def/ref sets are the initial facts associated with each node

{X1,Y2,X4}

{X1,Y2,X4}

example definite reaching definitions

X1

Y2

X4

Example: Definite Reaching Definitions

{ }

x = foo()

{X1}

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + y;

end if;

...

{X1}

y = x + 2

{X1,Y2}

{X1,Y2}

if(x > 0)

{X1,Y2}

{X1,Y2}

x = x + y

{X4,Y2}

Forward flow,all path problem,

def/ref sets are the initial facts associated with each node

{Y2}

{Y2}

example definite reaching definitions1

X1

Y2

X4

Example: Definite Reaching Definitions

What happens when we add a loop?

{ }

x = foo()

{X1}

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + y;

end if;

...

{X1}

y = x + 2

{X1,Y2}

{X1,Y2}

if(x > 0)

{X1,Y2}

{X1,Y2}

x = x + y

{X4,Y2}

Forward flow,all path problem,

def/ref sets are the initial facts associated with each node

{Y2}

{Y2}

example definite reaching definitions2

X1

Y2

X4

Example: Definite Reaching Definitions

What happens when we add a loop?

{ }

x = foo()

{X1}

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + y;

end if;

...

{X1}

y = x + 2

{X1,Y2}

X

{X1,Y2}

if(x > 0)

X

{X1,Y2}

X

{X1,Y2}

x = x + y

X

{X4,Y2}

Forward flow,all path problem,

def/ref sets are the initial facts associated with each node

{Y2}

{Y2}

computing values for a node1
Computing values for a node

Forward flow

  • Keep track of the definitions into a node that have not been redefined
  • Flow out of a node depends on flow in and on what happens in the node

y:= …x

In(n)

Out(n)

For definite reaching defs:

In(n) = kєpredOut(k)Out(n) = In(n)-def(n) U def(nn)

live variables
Live Variables
  • a variable, x, is live at node p if there exists a def-clear path wrt x from node p to a use of x
  • x is live at 2, 3, 4, but not

at node 5

x:=

1

2

Used to determine what variables need to be kept in a register after executing a node

5

3

6

x:=

Start from a use, and move backward on the graph until a def is encountered

4

:=x

computing global information live variables

Possible live variables

Definite live variables

1

1

2

2

3

3

4

4

live={x}

live={x}

live={y}

live={x, y}

Computing global information- live variables

live={x,y}

live={x}

live variables computing values for a node
Live Variables: Computing values for a node

Backward flow

  • Keep track of the (forward) references that have not found their (backward) definition site
  • Flow out of a node depends on flow in and on what happens in the node

Out(n)

y:= …x

In(n)

example possible live variables
Example: Possible Live Variables

{ }

x = foo()

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + y;

end if;

...

{x}

{x}

y = x + 2

{x,y}

{x,y}

if(x > 0)

{x,y}

{x,y}

Backward flow,any path problem,

def/ref sets are the initial facts associated with each node

x = x + y

{}

{}

{}

example definite live variables
Example: Definite Live Variables

{ }

x = foo()

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + y;

end if;

...

{x}

{x}

y = x + 2

{x}

{x}

if(x > 0)

{}

{x,y}

Backward flow,all path problem,

def/ref sets are the initial facts associated with each node

x = x + y

{}

{}

{}

example definite live variables1
Example: Definite Live Variables

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + y;

end if;

...

{ }

x = foo()

{x}

{x}

y = x + 2

  • Backward flow,all path problem
  • def/ref sets are the initial facts associated with each node
  • In(n) = kєsucc(n)Out(k)Out(n) = In(n)-def(n) U ref(n)

{x}

{x}

if(x > 0)

{}

{x,y}

x = x + y

{}

{}

{}

data flow analysis decisions
Data Flow Analysis decisions
  • Backward/Forward
  • Any/All
  • Facts
  • Equations
  • Initial values

1

2

3

4

5

Union or Intersection

constant propagation
Constant Propagation
  • Some variables at a point in a program may only take on one value
  • If we know this, can optimize the code when it is compiled
constant propagation1
Constant Propagation

x = 3

int x,y;

...

x := 3;

y := x + 2;

if x > z then

x := x + y;

end if;

...

y = x + 2

if(x > z)

x = x + y

constant propagation2
Constant Propagation

U=unknown

N= not a constant

(x,y)

(U,U)

x = 3

(3,U)

int x,y;

...

x := 3;

y := x + 2;

if x > z then

x := x + y;

end if;

...

(3,U)

y = x + 2

(3,5)

(3,5)

if(x >z)

(3,5)

Forward flow,all paths problem

Facts are the computations and assignments made at each node

(3,5)

x = x + y

(8,5)

(N,5)(N,5)

constant propagation with a loop
Constant Propagation with a loop

U=unknown

N= not a constant

(U,U)

x = 3

(3,U)

(3,U)

y = x + 2

(3,5)

(3,5)

(N,5)

if(x > z)

(3,5)

(N,5)

Forward flow,all paths problem

Facts are the computations and assignments made at each node

(3,5)

(N,5)

x = x + y

(8,5)

(N,5)

(N,5)

fixed point
Fixed point
  • The data flow analysis algorithm will eventually terminate
    • If there are only a finite number of possible sets that can be associated with a node
    • If the function that determines the sets that can be associated with a node is monotonic
constant propagation with a loop1
Constant Propagation with a loop

U=unknown

N= not a constant

(U,U)

x = 3

(3,U)

(3,U)

y = x + 2

(3,5)

if(x > 0)

(3,5)

(N,5)

Forward flow,all paths problem

Facts are the computations and assignments made at each node

(3,5)

(N,5)

x = x + y

(8,5)

(N,5)

(N,5)

dave detects anomalous def ref behavior
DAVE: detects anomalous def/ref behavior
  • L. D. Fosdick and Leon J. Osterweil, " Data Flow Analysis in Software Reliability", ACM Computing Surveys, September 1976, 8 (3), pp. 306-330.
  • Application independent specification of erroneous behavior
      • use of an uninitialized variable
      • redefinition of a variable that is not referenced
anomalous pairs of ref def information

Unreferencedefinition

Anomalous pairsof ref/def information

d - defined, r - referenced, u - undefined

d..r: defined variable reaches a reference

u..d: undefined variable reaches a definition

d..d: definition is redefined before being used

d..u: definition is undefined before being used

u..r: undefined variable reaches a reference

undefined

reference

consider an unreferenced definition
Consider an unreferenced definition
  • For a definition of a, want to know if that definition is not going to be referenced
    • Is there some path where a is redefined or undefined before being used?
    • May be indicative of a problem
    • Usually just a programming convenience and not a problem
  • For a definition of a, want to know if on all paths is a redefined or undefined before being used
    • May be indicative of a problem
    • Or could just be wasteful
some versus all

def a

def a

ref a

def a

def a

Some versus All

For some path, a is redefined without a subsequent reference

For all paths, a is redefined without a subsequent reference

unreferenced definition computing values for a node
Unreferenced definition: Computing values for a node

Forward flow

  • Keep track of the definitions into a node that have not been referenced
  • Flow out of a node depends on flow in and on what happens in the node

y:= …x

In(n)

Out(n)

unreferenced definition computing values for a node1
Unreferenced definition: Computing values for a node

Forward flow

  • Keep track of the definitions into a node that have not been referenced
  • Flow out of a node depends on flow in and on what happens in the node

y:= …x

In(n)

Out(n)

Forward flow,all paths problem

In(n) = kєpred(n)Out(k)Out(n) = In(n)-ref(n) U def(n)

unreferenced definitions

Need to look at each node where there is a def

Unreferenced definitions

(unreferenced defs)

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + 1;

end if;

y:= …

{ }

x = foo()

{x}

{x}

y = x + 2

{y}

{y}

if(x > 0)

{y}

{y}

x = x + 1

{x,y}

{y}

Forward flow,all paths problem

y:= z

{y}

continuing with the unreferenced def example

def={x}

ref={ }

def={y}

ref={x}

def={}

ref={x}

def={x}

ref={x}

def={y}

ref={ }

Continuing with the unreferenced def example
  • A definiton is redefined without being used if def(n)  In(n)-ref(n)
  • Must compute this for each node
  • For this example, the last node would report a redefinition of y on all paths
  • Finds the location where the redef occurs

{ }

x = foo()

{x}

{x}

y = x + 2

{y}

{y}

if(x > 0)

{y}

{y}

x = x +1

{x,y}

y:= …

{y}

{y}

unreferenced definitions finds the location of the first def of the pair
Unreferenced definitions, (finds the location of the first def of the pair)

(unreferenced defs)

int x,y;

...

x := foo();

y := x + 2;

if x > 0 then

x := x + 1;

end if;

y := ...

{x,y}

x = foo()

{y}

{y}

y = x + 2

{y} double def

{y}

if(x > 0)

{y}

{y}

x = x +1

{y}

{y}

y:= …

Backward flow,all paths problem

{y}

in values depend on direction
In values depend on direction

Forward flow

Backward flow

y:= …

In(n)

y:= …

Out(n)

Out(n)

In(n)

data flow analysis problem
Data Flow Analysis Problem
  • Need to determine the information that should be computed at a node
  • Need to determine how that information should flow from node to node
    • Backward or Forward
    • Union or Intersection
  • Often there is more than one way to solve a problem
    • Can often be solved forward or backward, but usually one way is easier than the other