Program Slicing by Mark Weiser, 5th International Conference on Software Engineering, San Diego, 1982


- Program slicing is a method that reduces a program to the portion that is of “interest.” The reasons behind the “interest” may be:
- the desire to fix a bug
- the desire to make a modification

- In both cases, there is a desire to understand the behavior of a portion of the program.

- Program slicing may be viewed as another method of decomposing a program, according to some criterion.
- The criterion is defined by specifying a set of program variables at some set of statements.
- Usually it is one statement and one variable of interest.
- Criterion = <i, v>, where i = statement number, v = variable

- The set of all statements that would “influence” the variable at the statement specified in the criterion <i, v> is considered the program slice.

- Original Program

1. Begin
2. Read (x, y)
3. Total = 0.0
4. Sum = 0.0
5. If (x <= 1)
6. then Sum = y
7. else Begin
8. Read (z)
9. Total = x * y
10. End
11. Write (Total, Sum)
12. End

If Criterion = <12, {z}>

1. Begin
2. Read (x, y)
5. If (x <= 1)
6. then Sum = y
7. else Begin
8. Read (z)
12. End

or

1. Begin
8. Read (z)
12. End

If Criterion = <9, {x}>

1. Begin
2. Read (x, y)
12. End

Why not?

If Criterion = <12, {Total}>

1. Begin
2. Read (x, y)
3. Total = 0.0
5. If (x <= 1)
6. then Sum = y
7. else Begin
9. Total = x * y
12. End
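To make the last slice concrete, here is a minimal sketch in Python that transcribes the original program and the slice for <12, {Total}> and checks that both produce the same value of Total. The function names and the use of an iterator to stand in for Read are assumptions made for illustration; they are not part of the slides.

```python
def original(inputs):
    """Statements 1-12 of the original program; returns (Total, Sum)."""
    read = iter(inputs)                # hypothetical stand-in for Read
    x, y = next(read), next(read)      # 2. Read (x, y)
    total = 0.0                        # 3. Total = 0.0
    sum_ = 0.0                         # 4. Sum = 0.0
    if x <= 1:                         # 5. If (x <= 1)
        sum_ = y                       # 6. then Sum = y
    else:                              # 7. else Begin
        z = next(read)                 # 8. Read (z)
        total = x * y                  # 9. Total = x * y
    return total, sum_                 # 11. Write (Total, Sum)


def slice_on_total(inputs):
    """Slice for <12, {Total}>: statements 1, 2, 3, 5, 6, 7, 9, 12."""
    read = iter(inputs)
    x, y = next(read), next(read)      # 2. Read (x, y)
    total = 0.0                        # 3. Total = 0.0
    if x <= 1:                         # 5. If (x <= 1)
        sum_ = y                       # 6. kept so the branch keeps its two arms
    else:                              # 7. else Begin
        total = x * y                  # 9. Total = x * y
    return total


# The slice agrees with the original on Total for inputs on both branches.
for inputs in [(0, 5, 7), (1, 2, 3), (2, 3, 4), (10, 0.5, 9)]:
    assert original(inputs)[0] == slice_on_total(inputs)
```

Statements 4, 8, 10 and 11 are deleted in the slice, yet the value of Total observed at the end is unchanged.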

1. The slice must be obtained from the original program by deleting statements.

To make sure that this works, we need to delete in such a manner that the result after the deletion is still “meaningful.” That is, no statement increases the number of its immediate successors as a result of a statement deletion. (So be careful with the “branch” statement, which has multiple successor paths.)

2. The behavior of the slice must be the same as that of the original program as observed through the slicing criterion.

This is a reasonable requirement, except that we need to relax it so that it applies only to terminating programs. The non-terminating example below shows that we cannot ensure the 2nd property if x = 0.

1. Begin
2. Read (x)
3. If x = 0
4. then Begin
(some infinite loop without changing x)
5. x = 1
6. end
7. else x = 2
8. end

The problem is statement 5, which may never be executed if x = 0. So if criterion = <8, {x}>, the slice must include statement 5, which may never execute, and we cannot compare the slice with the original program if they do not terminate.

- Because of this special case, in which two programs cannot be compared for “equality,” we cannot find the “smallest” slice by checking whether a candidate behaves the same as another slice.
- So finding a “minimal” slice is a problem.

- A clarification first: the dataflow algorithm alluded to here is not the DFD (data flow diagram) that some of you may be thinking of. “Dataflow” here refers to the path from a variable's use back to that variable's source of definition.
- Let C = criterion = <i, v>
- Let R[0,C](n) = the variables at statement n that can directly affect what is expressed as being of interest through C = <i, v>.

We will use these variables that directly affect C as the guide to choosing the statements that we want to include in the slice.

- This is a recursive definition. Given C = <i, V>
- R[0,C](n) = all variables v such that either:
1. n = i and v is in V, OR
2. n is an immediate predecessor of a statement m such that either:
a. v is in REF(n) and there is a variable w in both DEF(n) and R[0,C](m)
OR
b. v is not in DEF(n) and v is in R[0,C](m)

where DEF(n) = the set of variables whose values are altered in statement n, and REF(n) = the set of variables that are referenced in statement n.

Sample program:

1. Begin
2. Read (x, y)
3. If (x > y)
4. then z = x - y
5. else z = y - x
6. Write (z)

And let the criterion C be C = <6, {z}>

Trace through the R[0,C]'s:

1. R[0,C](6) = {z} because of definition part 1)
2. R[0,C](5) = {y, x} because of definition 2 a)
3. R[0,C](4) = {y, x} because of definition 2 a), since statement 4 also defines z and its immediate successor is statement 6
4. R[0,C](3) = {y, x} because of definition 2 b)
5. R[0,C](2) = { } (Read defines x and y and references no variables; this is like a procedure call to a Read subroutine)
6. R[0,C](1) = { }

So for C = <6, {z}>, R[0,C] = {z, y, x}
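The trace can be checked mechanically. The following is a minimal sketch in Python, assuming the six-statement sample program is encoded by hand as a successor table with DEF and REF sets; the names `SUCC`, `DEF`, `REF`, and `relevant_vars` are illustrative, not from the paper. It applies rules 1, 2a, and 2b repeatedly until nothing changes.

```python
# Hand-written encoding of the sample program (an assumption of this sketch).
SUCC = {1: [2], 2: [3], 3: [4, 5], 4: [6], 5: [6], 6: []}
DEF = {1: set(), 2: {"x", "y"}, 3: set(), 4: {"z"}, 5: {"z"}, 6: set()}
REF = {1: set(), 2: set(), 3: {"x", "y"}, 4: {"x", "y"}, 5: {"x", "y"}, 6: {"z"}}


def relevant_vars(succ, defs, refs, criterion):
    """Compute R[0,C](n) for every statement n by iterating to a fixed point."""
    i, V = criterion
    R = {n: set() for n in succ}
    R[i] = set(V)                          # rule 1: n = i and v in V
    changed = True
    while changed:
        changed = False
        for n in succ:
            new = set(R[n])
            for m in succ[n]:              # n is an immediate predecessor of m
                if defs[n] & R[m]:         # rule 2a: n defines a relevant variable
                    new |= refs[n]
                new |= R[m] - defs[n]      # rule 2b: relevant and not redefined by n
            if new != R[n]:
                R[n] = new
                changed = True
    return R


R0 = relevant_vars(SUCC, DEF, REF, (6, {"z"}))
assert R0[5] == {"x", "y"} and R0[2] == set()
assert set().union(*R0.values()) == {"x", "y", "z"}
```

The per-statement sets match the six lines of the trace, and their union is {z, y, x}.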

- Given the criterion <i, V> and the resulting R[0,C], the statements that should be included in the slice are those that change any of the variables in R[0,C].
- More formally, let S[0,C] be the set of statements that should be included in the slice based on R[0,C]:
S[0,C] = every statement x such that

DEF(x) ∩ R[0,C] ≠ Ø


Given: for C = <6, {z}>, R[0,C] = {z, y, x}

- S[0,C] = every statement x such that DEF(x) ∩ R[0,C] ≠ Ø
- Statement 2: {x, y} ∩ {z, y, x} ≠ Ø
- Statement 4: {z} ∩ {z, y, x} ≠ Ø
- Statement 5: {z} ∩ {z, y, x} ≠ Ø

So S[0,C] = {statements 2, 4, 5}, and they should be included in the slice.
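As a small sketch of this step in Python, assuming the DEF sets of the sample program are written out by hand and R[0,C] is taken as the set {z, y, x} given above (the names `DEF`, `R0`, and `defining_statements` are illustrative only):

```python
# DEF sets of the six-statement sample program, keyed by statement number.
DEF = {1: set(), 2: {"x", "y"}, 3: set(), 4: {"z"}, 5: {"z"}, 6: set()}
R0 = {"z", "y", "x"}  # R[0,C] for C = <6, {z}>


def defining_statements(defs, relevant):
    """Statements whose DEF set intersects the relevant variables."""
    return {stmt for stmt, defined in defs.items() if defined & relevant}


S0 = defining_statements(DEF, R0)
assert S0 == {2, 4, 5}
```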

- Intuitively, the branch statement in our example, statement 3, which decides which path to take, seems to “influence” C = <6, {z}>. Should we not include the branching statement in the slice?

- First, a few more definitions:
- A) In a program graph, a statement x is said to dominate statement y if x is on every path from the Begin statement to y.
- B) An inverse dominator, D(n), of a statement n is a statement x that is on every path from n to the end of the program.

[Figure: program graph with nodes B, R, P, S, X, Y]

A) X is a dominator of Y

B) X and Y are inverse dominators of B

- It would seem that if the branch statement influences which path to take and in one of the paths there is a statement in S[0,C], then that branch statement should be included in the Slice as an “indirect” influencer.
- More formally:
Let ND(Branch) be the set of statements which are on a path from the Branch statement to its nearest inverse dominator, x, excluding Branch and x themselves. Then the Branch statement has indirect influence if

S[0,C] ∩ ND(Branch) ≠ Ø


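[Figure: flow graph of the sample program. The branch “x > y ?” leads to either “z = x - y” or “z = y - x”, and both lead to “Write (z)”.]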

- Statement 3 is the Branch statement.
- Statement 6 is the nearest inverse dominator of statement 3.
- Statements 4 and 5 are in ND(Branch).
- Also recall that S[0,C] = {statements 2, 4, 5}.
- {statements 4, 5} ∩ {statements 2, 4, 5} = {statements 4, 5}
- Thus ND(Branch) ∩ S[0,C] ≠ Ø

Thus we should include statement 3 in the slice.
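The same check can be sketched in Python. This is a minimal illustration, assuming the sample program's flow graph is encoded by hand as a successor table and that every statement other than the last has at least one successor; the function names are hypothetical. Inverse dominators are computed with a standard iterative set intersection, and ND(Branch) is collected by walking from the branch until its nearest inverse dominator is reached.

```python
SUCC = {1: [2], 2: [3], 3: [4, 5], 4: [6], 5: [6], 6: []}


def inverse_dominators(succ, exit_node):
    """Map each statement n to the set of statements on every path from n to the end."""
    nodes = set(succ)
    pdom = {n: set(nodes) for n in nodes}      # start from "everything" and shrink
    pdom[exit_node] = {exit_node}
    changed = True
    while changed:
        changed = False
        for n in nodes - {exit_node}:
            new = {n} | set.intersection(*(pdom[s] for s in succ[n]))
            if new != pdom[n]:
                pdom[n] = new
                changed = True
    return pdom


def nearest_inverse_dominator(pdom, n):
    """The inverse dominator of n that all other inverse dominators of n follow."""
    strict = pdom[n] - {n}
    return next(d for d in strict if pdom[d] == strict)


def nd(succ, pdom, branch):
    """Statements on a path from branch to its nearest inverse dominator, exclusive."""
    stop = nearest_inverse_dominator(pdom, branch)
    seen, stack = set(), list(succ[branch])
    while stack:
        n = stack.pop()
        if n in (stop, branch) or n in seen:
            continue
        seen.add(n)
        stack.extend(succ[n])
    return seen


pdom = inverse_dominators(SUCC, 6)
assert nearest_inverse_dominator(pdom, 3) == 6
assert nd(SUCC, pdom, 3) == {4, 5}
assert nd(SUCC, pdom, 3) & {2, 4, 5}           # non-empty, so statement 3 joins the slice
```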

- Let CS(S[0,C]) = the set of statements that includes the branches that will influence the statements in S[0,C].
- Let B[0,C] = those branch statements that are in CS(S[0,C]).

- Now, what about those statements that might influence the branch statements, B[0,C]?

- Consider a slicing criterion at a branch statement to be <b, REF(b)>, where b is the branch statement and REF(b) is the set of variables that influence the choice of path from b.
- Let BC(b) denote this branch criterion, <b, REF(b)>.
- Then the set of variables that directly influence the branch statement b can be denoted as:
R[0,BC(b)](n)

- The next level of variables of influence (which includes the direct and the indirect), R[1,C](n), would be defined as:
R[1,C](n) = R[0,C](n) U { R[0,BC(b)](n), for all b in B[0,C] }

- The next level of statements of influence would include all those statements that directly or indirectly influence the criterion:
S[1,C] = { statements x : DEF(x) ∩ R[1,C](n) ≠ Ø or x is in B[0,C] }

- Because the program can have multiple levels of branches and paths, to come up with a program slice based on a criterion C, one may have to recursively process the direct and the indirect influencing variables and the associated statements as follows:
- R[i+1,C](n) = R[i,C](n) U { R[0,BC(b)](n) such that b is in B[i,C] }
- B[i+1,C] = CS(S[i+1,C])
- S[i+1,C] = { statements x : DEF(x) ∩ R[i+1,C](n) ≠ Ø or x is in B[i,C] }
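The iteration can be sketched end to end in Python. This is a rough illustration under several assumptions: the sample program is encoded by hand, the ND sets of its branch statements are supplied as a table rather than computed, the DEF test is applied against the union of the relevant variables over all statements (as in the worked example for S[0,C]), and all names are hypothetical. The loop stops when neither the statement set nor the branch set grows.

```python
SUCC = {1: [2], 2: [3], 3: [4, 5], 4: [6], 5: [6], 6: []}
DEF = {1: set(), 2: {"x", "y"}, 3: set(), 4: {"z"}, 5: {"z"}, 6: set()}
REF = {1: set(), 2: set(), 3: {"x", "y"}, 4: {"x", "y"}, 5: {"x", "y"}, 6: {"z"}}
ND = {3: {4, 5}}  # ND(b) for each branch statement b, written out by hand


def relevant_vars(succ, defs, refs, i, V):
    """R[0,<i,V>](n) for every n, by fixed-point iteration."""
    R = {n: set() for n in succ}
    R[i] = set(V)
    changed = True
    while changed:
        changed = False
        for n in succ:
            new = set(R[n])
            for m in succ[n]:
                if defs[n] & R[m]:
                    new |= refs[n]
                new |= R[m] - defs[n]
            if new != R[n]:
                R[n], changed = new, True
    return R


def defining(defs, R):
    """Statements whose DEF set meets the union of the relevant variables."""
    union = set().union(*R.values())
    return {x for x in defs if defs[x] & union}


def slice_statements(succ, defs, refs, nd, criterion):
    i, V = criterion
    R = relevant_vars(succ, defs, refs, i, V)        # R[0,C]
    S = defining(defs, R)                            # S[0,C]
    B = {b for b in nd if nd[b] & S}                 # B[0,C]
    while True:
        for b in B:                                  # R[i+1,C] = R[i,C] U R[0,BC(b)]
            Rb = relevant_vars(succ, defs, refs, b, refs[b])
            for n in R:
                R[n] |= Rb[n]
        new_S = defining(defs, R) | B                # S[i+1,C]
        new_B = {b for b in nd if nd[b] & new_S}     # B[i+1,C]
        if (new_S, new_B) == (S, B):
            return S
        S, B = new_S, new_B


assert slice_statements(SUCC, DEF, REF, ND, (6, {"z"})) == {2, 3, 4, 5}
```

For C = <6, {z}> the first pass yields {2, 4, 5}, the second adds branch statement 3, and the third changes nothing.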

- People and experiment:
- 21 experienced programmers
- 3 programs, each with a bug, were given to be debugged
- Then program slices were shown:
- 2 local, adjacent code slices
- 3 non-adjacent code slices (including slices relevant / not relevant to debugging)

- Result:
- Programmers “remembered” the 2 adjacent code slices more than the non-adjacent code slices, except for the non-adjacent code slice that contained the bug.
- The non-adjacent code slice that was relevant to the bug was “remembered” as often as the adjacent code slices.

- Also, some automatic slicers were built, with some success.

- Slice Coverage = mean slice length / program length. (This compares the mean length of the slices with the whole program, with the suspicion that the smaller this metric is, the more “independent” parts the program contains; thus it is not very cohesive?)
- Overlap = number of statements found uniquely in a slice versus those that are not unique. (The smaller the number, the more interdependencies among the code.)
- Clustering = ratio of the number of statements that are adjacent in the slice to the total number of slice statements. (A low number may mean the code is spread out and intertwined, like spaghetti code.)

- Parallelism = number of slices that have only a small number of statements in common. (If the number of common statements among the slices is zero and there are many such slices, then the slices may be executed in parallel to save computation time?)
- Tightness = number of statements that are in every slice. (If this number is high among the subroutines, then perhaps the subroutines should be combined into one?)
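Two of these metrics are defined precisely enough to compute directly. The sketch below, in Python, evaluates Slice Coverage and Tightness for the three slices of the 12-statement example program shown earlier (for <12, {z}>, <9, {x}>, and <12, {Total}>). Representing each slice as a set of statement numbers, and the function names, are assumptions of this illustration.

```python
def slice_coverage(slices, program_length):
    """Mean slice length divided by program length."""
    mean_length = sum(len(s) for s in slices) / len(slices)
    return mean_length / program_length


def tightness(slices):
    """Number of statements that appear in every slice."""
    return len(set.intersection(*slices))


slices = [
    {1, 2, 5, 6, 7, 8, 12},       # slice for <12, {z}>
    {1, 2, 12},                   # slice for <9, {x}>
    {1, 2, 3, 5, 6, 7, 9, 12},    # slice for <12, {Total}>
]
assert slice_coverage(slices, 12) == 0.5   # mean length 6 of 12 statements
assert tightness(slices) == 3              # statements 1, 2 and 12
```

The other three metrics would need a more exact definition of “unique,” “adjacent,” and “small number” before they could be computed the same way.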

In the spirit of REFACTORING, can these metrics be used as guidelines to improve design and code?