CS5103 Software Engineering

Lecture 17

Debugging

Today’s class

Delta Debugging

Motivation

Algorithm

In practice

Statistical Debugging

Tarantula

Dynamic Slicing

Debugging

Something we do when testing finds a bug

Basic Process

Reproduce the bug

Locate the fault

Fix

Bug localization: Basic idea

Suspiciousness score(s) = (number of failing tests that cover s) / (number of all tests that cover s)
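
The ratio above can be sketched in a few lines of Python. This is only an illustration: the `coverage` and `outcomes` dictionaries and the test names are hypothetical and do not come from the lecture.

```python
def suspiciousness(coverage, outcomes):
    """Score each statement as (failing tests covering it) / (all tests covering it).

    coverage: dict mapping a test name to the set of statement ids it executes
              (hypothetical input format)
    outcomes: dict mapping a test name to True (passed) or False (failed)
    """
    scores = {}
    statements = set().union(*coverage.values())
    for s in statements:
        covering = [t for t, stmts in coverage.items() if s in stmts]
        failing = [t for t in covering if not outcomes[t]]
        scores[s] = len(failing) / len(covering)
    return scores


# Toy data: three tests over three statements; t1 and t3 fail.
coverage = {"t1": {1, 2, 3}, "t2": {1, 3}, "t3": {1, 2}}
outcomes = {"t1": False, "t2": True, "t3": False}
scores = suspiciousness(coverage, outcomes)
```

In this toy data, statement 2 is covered only by failing tests, so it receives the highest score.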

Debugging

Sometimes the input is too complex…

Quite common in real world (compiler, office, browser, database, OS, …)

Locate the relevant inputs

Consider Mozilla Firefox

Takes HTML pages as input

A large number of bugs are related to loading certain HTML pages

Corner cases in HTML syntax

Incompatibility between browsers

Corner cases in JavaScript, CSS, …

Error handling for incorrect HTML, JavaScript, CSS, …

How do we go from this

<SELECT NAME="op sys" MULTIPLE SIZE=7>

<OPTION VALUE="All">All<OPTION VALUE="Windows 3.1">Windows 3.1<OPTION VALUE="Windows 95">Windows 95<OPTION

VALUE="Windows 98">Windows 98<OPTION VALUE="Windows ME">Windows ME<OPTION VALUE="Windows 2000">Windows

2000<OPTION VALUE="Windows NT">Windows NT<OPTION VALUE="Mac System 7">Mac System 7<OPTION VALUE="Mac System

7.5">Mac System 7.5<OPTION VALUE="Mac System 7.6.1">Mac System 7.6.1<OPTION VALUE="Mac System 8.0">Mac System

8.0<OPTION VALUE="Mac System 8.5">Mac System 8.5<OPTION VALUE="Mac System 8.6">Mac System 8.6<OPTION VALUE="Mac

System 9.x">Mac System 9.x<OPTION VALUE="MacOS X">MacOS X<OPTION VALUE="Linux">Linux<OPTION

VALUE="BSDI">BSDI<OPTION VALUE="FreeBSD">FreeBSD<OPTION VALUE="NetBSD">NetBSD<OPTION

VALUE="OpenBSD">OpenBSD<OPTION VALUE="AIX">AIX<OPTION VALUE="BeOS">BeOS<OPTION VALUE="HP-UX">HP-UX<OPTION

VALUE="IRIX">IRIX<OPTION VALUE="Neutrino">Neutrino<OPTION VALUE="OpenVMS">OpenVMS<OPTION

VALUE="OS/2">OS/2<OPTION VALUE="OSF/1">OSF/1<OPTION VALUE="Solaris">Solaris<OPTION

VALUE="SunOS">SunOS<OPTION VALUE="other">other</SELECT>

</td>

<td align=left valign=top>

<SELECT NAME="priority" MULTIPLE SIZE=7>

<OPTION VALUE="--">--<OPTION VALUE="P1">P1<OPTION VALUE="P2">P2<OPTION VALUE="P3">P3<OPTION

VALUE="P4">P4<OPTION VALUE="P5">P5</SELECT>

</td>

<td align=left valign=top>

<SELECT NAME="bug severity" MULTIPLE SIZE=7>

<OPTION VALUE="blocker">blocker<OPTION VALUE="critical">critical<OPTION VALUE="major">major<OPTION

VALUE="normal">normal<OPTION VALUE="minor">minor<OPTION VALUE="trivial">trivial<OPTION

VALUE="enhancement">enhancement<

To this…

<SELECT NAME="priority" MULTIPLE SIZE=7>

Motivation

Turn bug reports with real web pages into minimized test cases

The minimized test case should still be able to reveal the bug

Benefits of simplification

Easy to communicate

Remove duplicates

Easy debugging

Involves less potentially buggy code

Shorter execution time

Delta Debugging

The problem definition

A program exhibits an error for an input

The input is a set of elements

E.g., a sequence of API calls, a text file, a serialized object, …

Problem:

Find a smaller subset of the input that still causes the failure

A generic algorithm

How do people handle this problem?

Binary search

Cut the input in half

Try to reproduce the bug

Iterate

Delta Debugging Version 1

The set of elements in the bug-revealing input is I

Assumptions

Each subset of I is a valid input:

Each subset of I -> success / fail

A single input element E causes the failure

E causes the failure in every case, whatever other elements it is combined with (monotonicity)

Solution is simple

Go with the binary search process

Throw away half of the input elements if the remaining elements still cause the failure

A single element: we are done!
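
A minimal sketch of this binary search, assuming the version 1 assumptions hold. The function name `dd_v1` and the `fails` oracle are hypothetical names chosen for illustration, not part of the lecture.

```python
def dd_v1(elements, fails):
    """Binary-search reduction under the version 1 assumptions.

    elements: list of input elements
    fails:    hypothetical oracle; fails(subset) is True when the subset
              still reproduces the failure
    """
    while len(elements) > 1:
        mid = len(elements) // 2
        first, second = elements[:mid], elements[mid:]
        if fails(first):
            elements = first      # throw away the second half
        elif fails(second):
            elements = second     # throw away the first half
        else:
            # Neither half fails alone, so the single-element assumption is violated.
            raise ValueError("neither half reproduces the failure")
    return elements


# Toy oracle: the failure occurs whenever element 6 is present.
result = dd_v1(list(range(1, 9)), lambda subset: 6 in subset)
```

With eight elements the loop needs three rounds to narrow the input down to the single element 6.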

Delta Debugging Version 1

This is just binary search: easy to automate

The assumptions do not always hold

Let’s look at the assumptions:

If I1 ∪ I2 = fail, then

either I1 = fail and I2 = pass

or I1 = pass and I2 = fail

It is interesting to see what happens when this is not the case

Case I: multiple failing branches

What happens if I1 = fail and I2 = fail?

A subset of I1 fails and also a subset of I2 fails

We can simply continue to search I1 and I2

And we find two failure-causing elements

They may be due to the same bug or not

Case II: Interference

What happens if I1 = pass and I2 = pass?

This means that a subset of I1 and a subset of I2 cause the failure only when they are combined

This is called interference

Handling Interference

The cute trick

Consider I1 = pass and I2 = pass

But I1 ∪ I2 = fail

An element D1 in I1 and an element D2 in I2 together cause the failure

We do binary search in I2 while keeping all of I1

Split I2 into P1 and P2, then try I1 ∪ P1 and I1 ∪ P2

Continue until you find D2, such that I1 ∪ D2 causes the failure

Then do binary search in I1 while keeping D2, until you find D1

Return D1 ∪ D2

Example I: Handle interference

Consider 8 input elements, of which 3 and 7 cause the failure when applied together

Configuration      Result
1 2 3 4            pass
5 6 7 8            pass (interference!)
1 2 3 4 5 6        pass
1 2 3 4 7 8        fail
1 2 3 4 7          fail
1 2 7              pass
3 4 7              fail
3 7                fail

Example II: Handle multiple interference

Consider 8 input elements, of which 3, 5 and 7 cause the failure when applied together

Configuration      Result
1 2 3 4            pass
5 6 7 8            pass (interference!)
1 2 3 4 5 6        pass
1 2 3 4 7 8        pass (second interference! What to do? Go on with I1 ∪ P1!)
1 2 3 4 5 6 7      fail
1 2 3 4 5 7        fail
1 2 5 7            pass
3 4 5 7            fail
3 5 7              fail

Delta Debugging Version 2

The set of elements in the bug-revealing input is I

New Assumptions

Each subset of I is a valid input

A subset of input elements E causes the failure

E causes the failure in every case, whatever other elements it is combined with

Delta Debugging Version 2

Algorithm

Split I into I1 and I2

Case I: I1 = fail and I2 = pass

Try I1

Case I: I1 = pass and I2 = fail

Try I2

Case I: I1 = fail and I2 = fail

Try both I1 and I2

Case II: I1 = pass and I2 = pass

Handle interference for I1 and I2
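
The case analysis above can be sketched as a small recursive function. This is an illustrative reading of the slides, not a reference implementation: `dd_v2`, the `fails` oracle, and the `kept` parameter (elements held fixed while another part is searched) are hypothetical names, and the sketch assumes that `elements` together with `kept` always fails.

```python
def dd_v2(elements, fails, kept=()):
    """Return a subset of `elements` that still fails when combined with `kept`."""
    kept = list(kept)
    if len(elements) <= 1:
        return list(elements)
    mid = len(elements) // 2
    i1, i2 = elements[:mid], elements[mid:]
    f1, f2 = fails(i1 + kept), fails(i2 + kept)
    if f1 and f2:
        # Case I, both halves fail: search both and report both results.
        return dd_v2(i1, fails, kept) + dd_v2(i2, fails, kept)
    if f1:
        return dd_v2(i1, fails, kept)
    if f2:
        return dd_v2(i2, fails, kept)
    # Case II, interference: search I2 while keeping I1, then I1 while keeping D2.
    d2 = dd_v2(i2, fails, kept + i1)
    d1 = dd_v2(i1, fails, kept + d2)
    return d1 + d2


inputs = list(range(1, 9))
example_1 = dd_v2(inputs, lambda s: {3, 7} <= set(s))
example_2 = dd_v2(inputs, lambda s: {3, 5, 7} <= set(s))
```

Run on the two eight-element examples, the sketch isolates elements 3 and 7 in the first case and 3, 5 and 7 in the second. It may run a few more tests than the traces on the slides because it always tests both halves.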

Real example: GNU Compiler

This input program (bug.c) causes GCC 2.95.2 to crash when all optimizations are enabled

Minimize it to debug GCC

Consider each character as an element

Real example: GNU Compiler

Our delta debugging process

Create the appropriate subset of bug.c

Feed it to gcc

Continue according to whether Gcc crashes
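
The "feed it to GCC and check whether it crashes" step needs an automatic oracle. One possible sketch is shown below. The function name `crashes`, the file name `candidate.c`, and the crash heuristics (killed by a signal, or an "internal compiler error" message on stderr) are assumptions made for illustration. The compiler command is passed in by the caller, so nothing here is specific to a particular GCC version.

```python
import subprocess
import tempfile
from pathlib import Path


def crashes(source_text, command):
    """Return True if running `command <file>` on source_text looks like a compiler crash.

    command: list of program and arguments, for example a compiler invocation
             chosen by the caller (hypothetical; not fixed by the lecture)
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "candidate.c"
        path.write_text(source_text)
        proc = subprocess.run(command + [str(path)], capture_output=True, text=True)
    # On POSIX a negative return code means the process was killed by a signal.
    killed_by_signal = proc.returncode < 0
    reported_ice = "internal compiler error" in proc.stderr.lower()
    return killed_by_signal or reported_ice
```

A delta debugging loop would call such an oracle once per candidate subset, which is why the number of tests matters so much in the results that follow.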

GCC compiler example

The minimized code:

t(double z[],int n){int i,j;for(;;){i=i+j+1;z[i]=z[i]*(z[0]+0);}return z[n];}

The test case is 1-minimal

No single character can be removed

Every removable space is gone

The function name has been shortened from mult to a single t

GCC was executed 700+ times

The input was reduced to 10% of its initial size
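
The 1-minimality property can be checked mechanically. A minimal sketch, assuming a hypothetical `fails` oracle over lists of elements (the toy oracle below is made up for the example):

```python
def is_one_minimal(elements, fails):
    """True if `elements` fails and removing any single element makes it pass."""
    if not fails(elements):
        return False
    return all(
        not fails(elements[:k] + elements[k + 1:])
        for k in range(len(elements))
    )


# Toy oracle: the "failure" occurs when the characters contain the substring "ab".
toy_fails = lambda chars: "ab" in "".join(chars)
minimal = is_one_minimal(list("ab"), toy_fails)
not_minimal = is_one_minimal(list("xab"), toy_fails)
```

Note that 1-minimal does not mean globally minimal: it only rules out removing one element at a time.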

Another example: GDB

GDB is the debugger from GNU

It was updated from 4.16 to 4.17

Version 4.17 is no longer compatible with DDD (a GUI for GNU software development tools)

178,000 lines of code changed from 4.16

How do we know which code change(s) cause the failure?

Results

After a lot of work (by machine)

The 178 KLOC of changes were grouped into 8,700 groups (commits)

Delta debugging was applied to the groups

It worked out the failure-causing change(s) in 470 tests

It took 48 hours

Doing this by hand would be a nightmare!

Importance of input elements

It is important to define input elements well

So that subsets of input elements are valid inputs

So that the number of input elements is small

Consider the examples

GCC example: we use characters as elements, which is simple but not ideal. If the bug occurs after the parser, the failure is not monotonic because removing characters introduces syntax errors

GDB example: we group changed lines into groups, reducing the input size to 5% of the original. Two days is acceptable, but what about 40 days?

Limitations of Delta debugging

Relies on the assumptions

Monotonicity does not always hold

Relies on good input elements; always providing valid inputs improves efficiency

Requires automatic test oracles

Good for regression testing

Not good for development-time testing

Statistical Debugging

Delta Debugging

Narrow down the input to be considered

Statistical Debugging

Narrow down the code to be considered

Statistical Debugging

Basic Idea

Consider a number of test cases, some of which pass and some of which fail

If a statement is covered mostly by failed test cases, it is highly likely to be the buggy part of the code

Tarantula

A classical tool for statistical debugging

Use the following formulas

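Color = red + pass / (fail + pass) * (green), where pass and fail measure how many passing and failing tests cover the statement (in the Tarantula paper, the percentages of passing and failing tests that execute it), so the color shifts from red toward green as passing coverage dominates

Brightness = max (pass, fail)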
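
A small sketch of the two formulas. The hue scale from 0.0 (red) to 1.0 (green), the function name, and the use of per-suite percentages for pass and fail are assumptions of this sketch, following the usual description of Tarantula rather than anything stated on the slide.

```python
def tarantula_color(passed, failed, total_passed, total_failed):
    """Return (hue, brightness) for one statement, or None if it is never covered.

    passed / failed:             passing / failing tests that cover the statement
    total_passed / total_failed: passing / failing tests in the whole suite
    hue: 0.0 means red (suspicious), 1.0 means green (likely fine)
    """
    pct_pass = passed / total_passed if total_passed else 0.0
    pct_fail = failed / total_failed if total_failed else 0.0
    if pct_pass + pct_fail == 0:
        return None  # not covered by any test, so there is nothing to color
    hue = pct_pass / (pct_pass + pct_fail)
    brightness = max(pct_pass, pct_fail)
    return hue, brightness


# Suite with 8 passing and 2 failing tests.
only_failing = tarantula_color(0, 2, 8, 2)   # covered by failing tests only
only_passing = tarantula_color(8, 0, 8, 2)   # covered by passing tests only
mixed = tarantula_color(4, 1, 8, 2)          # half of each group covers it
```

A statement executed only by failing tests comes out pure red, one executed only by passing tests pure green, and brightness drops when few tests of either kind reach the statement.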

Context based statistical debugging

Do not just consider a statement in isolation

Runtime Control Flow Graph

Also consider connections

Outcomes of branches

Connections on a runtime-CFG

Runtime Control Flow Graph

void replaceFirst(sx, sy) {
    for (int i = 0; i < len; i++) {
        if (arr[i] == sx) {
            arr[i] = sz;
            // should break;
        }
        if (arr[i] == sy) {
            arr[i] = sz;
            // should break;
        }
    }
}

[Figure: runtime control flow graphs for two passing runs and one failing run]

Limitations

Questions:

If a statement is covered only by passed test cases, can it be the root cause of the bug found?

If a statement is covered only by failed test cases, must it be the root cause of the bug found?

Example

void f(int a, int b) {
    if (a > 0) {   // error: should be >=
        do something;
    }
    if (b < 0) {
        do something;
    }
}

Test cases (a, b):

(3, 2)

(2, 1)

(0, -1)

(2, 0)
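
To see what this example says about the two questions, the coverage of each test can be simulated. This is a sketch under stated assumptions: the statement labels (`if_a`, `body_a`, `if_b`, `body_b`) are made up, and a test is treated as failing when the buggy predicate takes a different path than the intended `a >= 0` would, since the slides do not say what "do something" computes.

```python
def run_f(a, b, fixed=False):
    """Return the set of statements executed by f(a, b).

    fixed=True evaluates the intended predicate a >= 0 and serves as the oracle.
    """
    covered = {"if_a"}
    first_branch = (a >= 0) if fixed else (a > 0)
    if first_branch:
        covered.add("body_a")
    covered.add("if_b")
    if b < 0:
        covered.add("body_b")
    return covered


tests = [(3, 2), (2, 1), (0, -1), (2, 0)]
# Assumption: a test fails when the buggy run differs from the fixed run.
failing = [t for t in tests if run_f(*t) != run_f(*t, fixed=True)]
passing = [t for t in tests if t not in failing]

cov_fail = set().union(*(run_f(*t) for t in failing))
cov_pass = set().union(*(run_f(*t) for t in passing))
only_failing = cov_fail - cov_pass   # statements covered only by failing tests
only_passing = cov_pass - cov_fail   # statements covered only by passing tests
```

Under these assumptions only (0, -1) fails. The body of the second `if` is covered only by the failing test even though it is correct, while the faulty predicate is covered by every test, passing or failing.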

Dynamic Slicing

Another way to narrow down code to be considered in debugging

Data Dependencies

A data dependency links the use of a variable to the definition of that variable

Example:

s1: x = 3;
s2: if (y > 5) {
s3:     y = y + x;   // data-dependent on the definition of x in s1
s4: }

Control Dependencies

A control dependency links the statements in a branch's basic block to the predicate that decides whether they execute

Example:

s1: x = 3;
s2: if (y > 5) {
s3:     y = y + x;   // control-dependent on the predicate y > 5 in s2
s4: }

Dynamic Slicing

Describe dependencies among code elements

If a variable has an incorrect value, the bug should be in its backward dynamic slice

Like runtime control flow graph

A map from static slicing to the executed code

Algorithm
  • A dependence edge is introduced from a load to a store if during execution, at least once, the value stored by the store is indeed read by the load (mark dependence edge)
  • No static analysis is needed.
Algorithm II Example

1: b=0
2: a=2
3: 1 <= i <= N
4: if ((i++)%2 == 1)
5: a=a+1
6: b=a*2
7: z=a+b
8: print(z)

For input N=1, the trace is: 1^1 2^1 3^1 4^1 5^1 7^1 8^1 (statement^instance)

[Figure: control flow graph of the program, with T/F edges leaving the predicates at 3 and 4]
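
A backward dynamic slice over one trace can be sketched as follows. The trace encoding (a list of tuples with definitions, uses, and the controlling predicate instance) is a hypothetical format, and the def/use sets given for the eight-statement example above are my reading of it for input N=1, not data from the lecture.

```python
def backward_slice(trace, criterion_index):
    """Backward dynamic slice over a single execution trace.

    trace: list of (stmt, defs, uses, ctrl) in execution order, where defs and
           uses are sets of variable names and ctrl is the trace index of the
           predicate instance the entry is control-dependent on (or None)
    Returns the set of statement numbers the chosen trace entry depends on.
    """
    last_def = {}   # variable -> trace index of its most recent store
    deps = []       # deps[k] = trace indices that entry k depends on
    for k, (stmt, defs, uses, ctrl) in enumerate(trace):
        # A load depends on the store that actually produced the value it reads.
        edges = {last_def[v] for v in uses if v in last_def}
        if ctrl is not None:
            edges.add(ctrl)
        deps.append(edges)
        for v in defs:
            last_def[v] = k
    seen, work = set(), [criterion_index]
    while work:
        k = work.pop()
        if k not in seen:
            seen.add(k)
            work.extend(deps[k])
    return {trace[k][0] for k in seen}


# Assumed trace for N=1: statements 1 2 3 4 5 7 8 (statement 6 is not executed).
trace = [
    (1, {"b"}, set(), None),
    (2, {"a"}, set(), None),
    (3, {"i"}, {"N"}, None),
    (4, {"i"}, {"i"}, 2),
    (5, {"a"}, {"a"}, 3),
    (7, {"z"}, {"a", "b"}, None),
    (8, set(), {"z"}, None),
]
slice_of_z = backward_slice(trace, 6)   # value printed at statement 8
slice_of_a = backward_slice(trace, 4)   # value of a written at statement 5
```

Only dependences that actually occurred during the run are recorded, so no static analysis is involved. Statement 6 is absent from both slices because it never executed.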

Efficiency: Summary
  • For an execution of 130M instructions:
    • space requirement: reduced from 1.5GB to 94MB (I further reduced the size by a factor of 5 by designing a generic compression technique [MICRO’05]).
    • time requirement: reduced from >10 Mins to <30 seconds.
    • http://jslice.sourceforge.net/
Summary of debugging

Debugging is a follow-up step of testing

Bug localization and bug fixing are tasks that depend heavily on human intelligence

Tools can help us to narrow the scope to consider

Bug localization

Reduce the code to be considered

Delta debugging

Reduce the inputs to be considered
