
MSc Software Maintenance MS Viðhald hugbúnaðar




  1. MSc Software Maintenance (MS Viðhald hugbúnaðar) • Lectures 15 & 16: Programmers Use Slices When Debugging • Dr Andy Brooks

  2. Case Study • Reference: Programmers Use Slices When Debugging, Mark Weiser, Communications of the ACM, Volume 25, Number 7, pp 446-452, 1982.

  3. The basic debugging method • Reading 1 million lines of code, from beginning to end, to locate and remove a bug is not efficient. • 100 LOC/day equates to 10000 days... • 1000 LOC/day equates to 1000 days... • The basic debugging method is to begin at the statement where the error appears and then reason backwards about the previous sequence of statements.
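The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch; the function name `days_to_read` is made up for the illustration.

```python
def days_to_read(total_loc: int, loc_per_day: int) -> float:
    """Days needed to read a program from beginning to end at a fixed rate."""
    return total_loc / loc_per_day


# The two reading rates quoted on the slide, for a 1 million line program.
for rate in (100, 1000):
    print(f"{rate} LOC/day -> {days_to_read(1_000_000, rate):.0f} days")
```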

  4. Reasoning backwards • Reasoning backwards to determine all the influences on a variable usually reveals that many statements in the program have no influence. Sometimes you reason backward to the hardware or translation software...

  5. Program Slicing • “The process of stripping a program of statements without influence on a given variable at a given statement is called program slicing.” • “An elementary slicing criterion of a program P is a tuple <i,V> where i denotes a specific statement in P and V is a subset of variables in P.”

  6. A program and a program slice • 1 BEGIN • 2 READ(X,Y) • 3 TOTAL:=0.0 • 4 SUM:=0.0 • 5 IF X<=1 • 6 THEN SUM:=Y • 7 ELSE BEGIN • 8 READ(Z) • 9 TOTAL:=X*Y • 10 END • 11 WRITE(TOTAL,SUM) • 12 END. Slice on Z at statement 12: BEGIN • READ(X,Y) • IF X<=1 • THEN • ELSE READ(Z) • END. TOTAL, SUM and Y have no influence on Z.

  7. A program and a program slice • 1 BEGIN • 2 READ(X,Y) • 3 TOTAL:=0.0 • 4 SUM:=0.0 • 5 IF X<=1 • 6 THEN SUM:=Y • 7 ELSE BEGIN • 8 READ(Z) • 9 TOTAL:=X*Y • 10 END • 11 WRITE(TOTAL,SUM) • 12 END. Slice on X at statement 9: BEGIN • READ(X,Y) • END.

  8. A program and a program slice • 1 BEGIN • 2 READ(X,Y) • 3 TOTAL:=0.0 • 4 SUM:=0.0 • 5 IF X<=1 • 6 THEN SUM:=Y • 7 ELSE BEGIN • 8 READ(Z) • 9 TOTAL:=X*Y • 10 END • 11 WRITE(TOTAL,SUM) • 12 END. Slice on TOTAL at statement 12: BEGIN • READ(X,Y) • TOTAL:=0.0 • IF X<=1 • THEN • ELSE TOTAL:=X*Y • END.
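The three slices on slides 6 to 8 can be reproduced mechanically. The sketch below is not Weiser's algorithm; it is a simplified backward walk that only handles loop-free code like this example and may over-approximate on other programs. Statements are numbered 1 to 12 in listing order. The `Stmt` record, the hand-written `defs`/`refs`/`guard` annotations and the function name `backward_slice` are all assumptions made for the illustration, as is the choice to consider only statements before the criterion statement. Purely syntactic lines (BEGIN, END, THEN, ELSE) are left out of the result.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stmt:
    text: str
    defs: frozenset = frozenset()   # variables assigned by the statement
    refs: frozenset = frozenset()   # variables read by the statement
    guard: Optional[int] = None     # number of the IF that controls it


PROGRAM = {
    1: Stmt("BEGIN"),
    2: Stmt("READ(X,Y)", defs=frozenset({"X", "Y"})),
    3: Stmt("TOTAL:=0.0", defs=frozenset({"TOTAL"})),
    4: Stmt("SUM:=0.0", defs=frozenset({"SUM"})),
    5: Stmt("IF X<=1", refs=frozenset({"X"})),
    6: Stmt("THEN SUM:=Y", defs=frozenset({"SUM"}), refs=frozenset({"Y"}), guard=5),
    7: Stmt("ELSE BEGIN"),
    8: Stmt("READ(Z)", defs=frozenset({"Z"}), guard=5),
    9: Stmt("TOTAL:=X*Y", defs=frozenset({"TOTAL"}), refs=frozenset({"X", "Y"}), guard=5),
    10: Stmt("END"),
    11: Stmt("WRITE(TOTAL,SUM)", refs=frozenset({"TOTAL", "SUM"})),
    12: Stmt("END."),
}


def backward_slice(program, stmt_no, variables):
    """Statement numbers that may influence `variables` just before `stmt_no`."""
    relevant = set(variables)   # variables whose earlier values still matter
    needed_guards = set()       # IF statements controlling something in the slice
    in_slice = set()
    for n in range(stmt_no - 1, 0, -1):
        s = program[n]
        if n in needed_guards or (s.defs & relevant):
            in_slice.add(n)
            if s.guard is None:
                relevant -= s.defs          # unconditional assignment: earlier values are dead
            else:
                needed_guards.add(s.guard)  # conditional assignment: keep the IF, do not kill
            relevant |= s.refs
    return sorted(in_slice)


print(backward_slice(PROGRAM, 12, {"Z"}))      # slide 6
print(backward_slice(PROGRAM, 9, {"X"}))       # slide 7
print(backward_slice(PROGRAM, 12, {"TOTAL"}))  # slide 8
```

For the three criteria on the slides this yields statements 2, 5, 8; statement 2; and statements 2, 3, 5, 9, which match the slices shown once the BEGIN/END brackets are added back.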

  9. Experimental Hypothesis H1 • “... debugging programmers, working backwards from the variable and statement of a bug's appearance, use that variable and statement as a slicing criterion to construct mentally the corresponding program slice.” • Experimental Hypothesis H2 • “... programmers look at code only in contiguous pieces.”

  10. “Slices are generally not contiguous pieces, but contain statements scattered throughout the code.” [Diagram: two columns of program statements. In the column labelled "contiguous" the marked statements form one unbroken block; in the column labelled "slice" the marked statements are scattered through the code.]
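The distinction between a contiguous fragment and a slice can be stated as a simple check on statement numbers. This is an illustrative sketch only; `is_contiguous` is a hypothetical helper, and the statement numbers used are those of the small example program on slides 6 to 8.

```python
def is_contiguous(statement_numbers) -> bool:
    """True if the statements form one unbroken run of consecutive numbers."""
    numbers = sorted(set(statement_numbers))
    if not numbers:
        return True
    return numbers[-1] - numbers[0] + 1 == len(numbers)


print(is_contiguous([3, 4, 5, 6]))  # a contiguous fragment
print(is_contiguous([2, 3, 5, 9]))  # the slice on TOTAL at statement 12
```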

  11. Method • Programmers debug three programs. • Test programmers' memory of various code fragments • particularly the program slice relevant to the bug. “If the programmers did slice, then their memories for the relevant slices should be at least as good as their memories of contiguous code, and somewhat better than their memories of other non-contiguous code.” Andy says: this is more like a properly stated hypothesis.

  12. Andy says: no protocol analysis • It is important to recognise that programmers were not observed working with the programs. • Their actions and the program statements they considered were not recorded. • Testing programmers' memory is an indirect measurement. • And you may not be measuring what you think you are measuring...

  13. Materials • Three programs written in Algol-W • Program sizes from 75 to 150 lines of code • Program TALLY • An IBM scientific subroutine • poorly structured and non-mnemonic variable names • Program PAYROLL • written for the experiment • computes salaries and deductions • well structured and mnemonic variable names • Program EVADE • written for the experiment • simulation of random aircraft turns • well structured and mnemonic variable names

  14. Program bugs • The bugs were chosen so that the entire experiment could be completed in less than an hour.

  15. 5 types of program fragments shown to programmers: • Relevant slice • Relevant contiguous • overlapped the relevant slice • Irrelevant contiguous • did not overlap relevant contiguous • did not overlap relevant slice • program TALLY had no irrelevant contiguous • Irrelevant slice • Jumble • every 3rd or 4th statement

  16. Fragment overlap: relevant slice & relevant contiguous • Andy asks: what was the number of statements in the relevant slices? • Overlap is the fraction of statements shared by two fragments.
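The slide does not say what the fraction is taken relative to. The sketch below assumes overlap is measured against the first fragment, that is, shared statements divided by the size of fragment A; this is an assumption, not Weiser's stated definition. The statement numbers are taken from the small example program on slides 6 to 8, not from the experimental programs.

```python
def overlap(fragment_a, fragment_b) -> float:
    """Fraction of fragment_a's statements that also appear in fragment_b.

    Assumption: the denominator is the size of fragment_a.
    """
    a, b = set(fragment_a), set(fragment_b)
    if not a:
        return 0.0
    return len(a & b) / len(a)


relevant_slice = [2, 3, 5, 9]        # slice on TOTAL at statement 12
relevant_contiguous = [3, 4, 5, 6]   # a hypothetical contiguous fragment
print(overlap(relevant_slice, relevant_contiguous))
```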

  17. Syntactic changes • Syntactic changes were made to the code fragments to prevent recognition by a particular detail: • Variables and constants in the fragments were renamed as single letters followed by a unique number. • Indenting was adjusted from the original program to a form internally consistent with each fragment.
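The renaming step can be sketched as follows. This is only an illustration of the idea: the slides do not say how Weiser performed the renaming, which letters were used, or how constants were handled. The function name `anonymise`, the keyword list, and the single letter `V` are assumptions, and the sketch renames identifiers only, leaving numeric literals untouched.

```python
import re
from itertools import count

# Assumed keyword list, covering only the small example program.
KEYWORDS = {"BEGIN", "END", "IF", "THEN", "ELSE", "READ", "WRITE"}


def anonymise(fragment: str, letter: str = "V") -> str:
    """Replace each distinct identifier with `letter` plus a unique number."""
    names = {}
    counter = count(1)

    def rename(match):
        word = match.group(0)
        if word.upper() in KEYWORDS:
            return word                      # keep language keywords as they are
        if word not in names:
            names[word] = f"{letter}{next(counter)}"
        return names[word]                   # same identifier, same new name

    return re.sub(r"[A-Za-z_][A-Za-z_0-9]*", rename, fragment)


print(anonymise("IF X<=1 THEN SUM:=Y ELSE TOTAL:=X*Y"))
```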

  18. Participants • Experienced Algol-W programmers • Graduate student teaching assistants • all from the University of Michigan in Ann Arbor • 26 volunteers • 4 participated in pilot studies • 1 did not follow instructions in the experiment • 21 final participants

  19. Andy's view • Pilot studies are conducted: • To check experimental materials are in order. • Are the instructions clear? • To check experimental processes are sound. • Is there sufficient time to complete tasks? • Do participants behave in the way expected? • Weiser reports that pilot studies were conducted but fails to report on actions taken as a result of the pilot studies. Any actions taken should be briefly reported.

  20. Procedure • Participants were given all three programs to debug in random order. • Participants were then asked to rate 14 program fragments for how sure they were the fragment had been used in one of the three programs. • remember, program TALLY had no irrelevant contiguous fragment (3*5-1 = 14) • Code fragments were given in random order each on a separate page with its rating scale. • Participants were told not to look back either at the programs or at previously rated code fragments.
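The count of 14 fragments and their random ordering can be sketched as below. The slides do not say how the randomisation was done or whether each participant received a different order, so the seeded shuffle and the function name `build_fragment_list` are assumptions for illustration.

```python
import random

PROGRAMS = ["TALLY", "PAYROLL", "EVADE"]
FRAGMENT_TYPES = [
    "relevant slice",
    "relevant contiguous",
    "irrelevant contiguous",
    "irrelevant slice",
    "jumble",
]


def build_fragment_list(seed=None):
    """All (program, fragment type) pairs shown to a participant, shuffled."""
    fragments = [
        (program, kind)
        for program in PROGRAMS
        for kind in FRAGMENT_TYPES
        # TALLY had no irrelevant contiguous fragment.
        if (program, kind) != ("TALLY", "irrelevant contiguous")
    ]
    random.Random(seed).shuffle(fragments)
    return fragments


print(len(build_fragment_list(seed=1)))  # 3*5 - 1
```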

  21. Part of the relevant slice for PAYROLL

  22. Fragment shown to participants • Rating scale: recognition

  23. Results • All 21 participants found the bugs in TALLY and EVADE but only 17 found the bug in PAYROLL. • Table IV: Debugging times (minutes) • Andy asks: what were the minimum and maximum times?

  24. Results • A two-way analysis of variance using Friedman's test indicated an overall difference in the ratings of the different fragments. • fragment type, program type • Andy says: it is important for an overall test to be significant before looking at individual differences. The test is named, but the alpha level is not reported here (0.05?, 0.01?).

  25. Results • Figure 3: by fragment type (values shown on the slide: 54%, 28%, 24%) • Why is recognition so high?

  26. Significant differences: Wilcoxon matched-pairs signed-ranks test • The difference between relevant slices and irrelevant slices is significant at the 0.03 level. • The difference between relevant slices and jumbles is very significant at the 0.005 level.

  27. Results • Irrelevant contiguous was recognised because the programs were small and the irrelevant contiguous fragments were close to the output statements which wrote the incorrect variable values. • Participants would likely have examined code around these output statements.

  28. Results • Figure 4: by fragment type and program type

  29. Results: Figure 4 • TALLY shows the greatest recognition of the relevant slice fragment. • Because TALLY was poorly structured (many GOTOs), perhaps more programmers adopted a slicing strategy to debug it.

  30. Results: Table V • To conclude the experiment, participants were asked about the typicalness of the programs and the bugs. • Table V shows that the mean ratings were at least 2.4 on a 1 to 4 scale. • 4 meant “very typical” • 1 meant “not at all typical”. • Weiser reasonably concluded that no program was especially atypical.

  31. Examples of slices: Figure 6 • Slices that are large in relation to the program (e.g. 563/662 statements) are less useful to the program maintainer.

  32. Implications • Tools that automatically generate program slices can help maintainers debug faulty code. • Novice programmers should be taught the concept of slicing. Today, researchers study many different kinds of slicing techniques. Dynamic slicing makes use of knowledge about the input, and this can greatly reduce the size of slices.
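To see why knowing the input can shrink a slice, the sketch below computes a dynamic slice for the small example program from slides 6 to 8, with statements numbered 1 to 12 in listing order. It walks backwards over only the statements that actually executed for a given value of X. This is a simplified illustration for loop-free code, not a published dynamic slicing algorithm; the `DEPS` table, `executed_statements` and `dynamic_slice` are hypothetical names with hand-written dependence information.

```python
# statement number: (variables defined, variables referenced, controlling IF or None)
DEPS = {
    2: ({"X", "Y"}, set(), None),        # READ(X,Y)
    3: ({"TOTAL"}, set(), None),         # TOTAL:=0.0
    4: ({"SUM"}, set(), None),           # SUM:=0.0
    5: (set(), {"X"}, None),             # IF X<=1
    6: ({"SUM"}, {"Y"}, 5),              # THEN SUM:=Y
    8: ({"Z"}, set(), 5),                # READ(Z)
    9: ({"TOTAL"}, {"X", "Y"}, 5),       # TOTAL:=X*Y
    11: (set(), {"TOTAL", "SUM"}, None), # WRITE(TOTAL,SUM)
}


def executed_statements(x):
    """Statements executed, in order, for a given input value of X."""
    branch = [6] if x <= 1 else [8, 9]
    return [2, 3, 4, 5] + branch + [11]


def dynamic_slice(trace, variables):
    """Executed statements that influenced `variables` at the end of the run."""
    relevant = set(variables)
    needed_guards = set()
    in_slice = set()
    for n in reversed(trace):
        defs, refs, guard = DEPS[n]
        if n in needed_guards or (defs & relevant):
            in_slice.add(n)
            relevant -= defs      # the assignment really happened, so it kills
            relevant |= refs
            if guard is not None:
                needed_guards.add(guard)
    return sorted(in_slice)


print(dynamic_slice(executed_statements(0), {"TOTAL"}))  # THEN branch taken
print(dynamic_slice(executed_statements(5), {"TOTAL"}))  # ELSE branch taken
```

With X = 0 only statement 3 remains, and with X = 5 statements 2, 5 and 9 remain; both are smaller than the four-statement static slice on TOTAL shown on slide 8.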

  33. Slicing or not? “Because the relevant slice fragment overlapped the relevant sequential fragment in each program, this experiment gives no absolute assurance that relevant slices were not recognised only because of that overlap.” Table II indicates that recognition ratings between relevant slice fragments and relevant sequential fragments are poorly correlated. This suggests that participants could have been recognising relevant slice fragments because they had indeed been slicing, but...

  34. Andy's view • In experimental work it is better to measure directly than indirectly. • Nowadays, it is possible to build and use tools to record all user actions and so help establish if program slicing occurred or not. • Even in Weiser's day, he could have recorded participants speaking their thoughts and actions aloud and then analysed the recordings to help establish if program slicing had occurred or not.

  35. Andy's view • At the very least, Weiser should have asked his participants at the end of the experiment what actions they performed to debug the programs. • Because the programs were so small, it is quite possible that relevant slice recognition occurred because (some or all) participants had simply read all the code involved. • It would be interesting to know what the recognition rates would have been if fragments shown to participants had not been syntactically altered.

  36. You never really know what is going on inside someone's head.
