
Improving the Invariant Model of DIDUCE

CS 343 -- Research Proposal

12 June 2002

Katy Innes and Andy Westbrook

Overview

  • Review of DIDUCE

  • What’s wrong with DIDUCE’s current model?

  • How do we propose to fix it?

  • Related work

  • Other presentations

  • Summer break!

Review of DIDUCE

  • A dynamic invariant checker

  • Instruments user-specified portions of the Java Bytecode for a particular program

    • Maintains a hypothetical invariant for the value of many variables at selected program points

    • Does so using a bitmask which is the “meet” of all values seen so far

      • The meet operator used in this case is the bitwise-or operator
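
The bitmask model described above can be sketched in a few lines. This is a minimal illustration of the slide's simplified description (the mask is the bitwise OR of all values seen), not DIDUCE's own code; the class name `BitmaskInvariant` and the method `observe` are hypothetical, and values are assumed to be non-negative integers.

```python
class BitmaskInvariant:
    """Invariant kept as the bitwise OR of every value observed so far."""

    def __init__(self):
        self.mask = None  # no hypothesis until the first value arrives

    def observe(self, value):
        """Return True if value violates the current mask, then widen the mask."""
        if self.mask is None:
            self.mask = value
            return False
        violated = (value | self.mask) != self.mask  # value sets a bit never seen set
        self.mask |= value
        return violated


inv = BitmaskInvariant()
for v in (1, 4, 5):
    print(v, inv.observe(v))
```

In this run, 4 is reported because it sets a new bit, while 5 is accepted silently because both of its bits are already in the mask.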

What’s Wrong with DIDUCE?

  • The invariant model

    • It is heavily associated with the binary representation of integers

      • If a variable is allowed to take on values 1 and 4, it must also be allowed to take on value 5, since 1 | 4 = 5

    • This model is of little use for floating point numbers

    • Empirically, this model has been shown to be meaningful with reference types only for distinguishing between null and non-null

For Example

  • The paper mentions a bug found in MAJC where a state variable takes on a new state

    • This variable is 0 for empty, 1 for occupied, or 2 for pending

      • The error occurs when it takes on 2 for the first time

    • But if the variable took on 1 for empty, 2 for occupied, and 3 for pending, DIDUCE would not find this bug, because 3 = 1 | 2 is already covered by the mask

  • Would DIDUCE be better if it could handle either case?
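
The two encodings in this example can be checked directly against the OR-based mask. The helper `accepted_by_mask` below is a hypothetical name used only for this sketch, and it again assumes the simplified model from the review slide.

```python
from functools import reduce
from operator import or_


def accepted_by_mask(seen, new_value):
    """True if new_value sets no bit outside the OR of the values already seen."""
    mask = reduce(or_, seen, 0)
    return (new_value | mask) == mask


# Encoding from the paper: 0 = empty, 1 = occupied, 2 = pending
print(accepted_by_mask([0, 1], 2))  # False: the new state is flagged

# Shifted encoding: 1 = empty, 2 = occupied, 3 = pending
print(accepted_by_mask([1, 2], 3))  # True: the new state goes unnoticed
```

Whether the bug is caught depends only on how the states happen to be numbered, which is the weakness the proposal targets.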

Our Improvement (Perhaps)

  • Rather than use a bit vector for each invariant, we will use a set of ranges

    • For example, we might associate the range 1-2 with the previous example

    • We might have multiple ranges, or ranges of width one

  • To handle reference types, we would assign each class a number and treat a reference as an integer whose value is the number of the class of the object it points to
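
A set-of-ranges invariant might look like the following sketch. The names `RangeSetInvariant` and `type_number` are hypothetical, the numbering scheme (first-seen order, with 0 reserved for a null reference) is an assumption, and Python's `type()` stands in for a Java class lookup.

```python
class RangeSetInvariant:
    """Invariant kept as a list of inclusive [low, high] ranges."""

    def __init__(self, ranges=None):
        self.ranges = [list(r) for r in (ranges or [])]

    def contains(self, value):
        return any(low <= value <= high for low, high in self.ranges)

    def add_point(self, value):
        self.ranges.append([value, value])  # a range of width one


_type_ids = {}


def type_number(obj):
    """Map an object to the integer assigned to its class (assumed scheme)."""
    if obj is None:
        return 0  # assumption: null gets its own number
    return _type_ids.setdefault(type(obj), len(_type_ids) + 1)


state = RangeSetInvariant([(1, 2)])   # shifted encoding: empty and occupied seen
print(state.contains(3))              # the pending state is outside every range
print(type_number("a") == type_number("b"))  # same class, same number
```

With ranges, the shifted encoding from the earlier example no longer hides the new state, since 3 lies outside the range 1-2.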


  • We developed a measurement of confidence for each range in an invariant

    • It is [formula not preserved in the transcript]

    • This rewards small ranges that contain a large number of observed values

Reporting Violations

  • When we observe a value that does not fall into a range, we report a violation

    • These violations are sorted by the confidence of the invariant model violated

      • This confidence is the mean of the confidences of the ranges defining the invariant

  • We also create a new range for that invariant, containing just the observed value
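
The reporting steps above can be sketched as follows. Because the confidence formula is not available in this transcript, each range simply carries a caller-supplied `confidence` value. The names `Range`, `Invariant`, and `observe`, the confidence given to a new range, and the choice to list the highest-confidence violations first are all assumptions.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Range:
    low: int
    high: int
    confidence: float  # supplied by the caller; the slide's formula is not reproduced


@dataclass
class Invariant:
    name: str
    ranges: list = field(default_factory=list)

    def observe(self, value, new_confidence=0.0):
        """Return the violated invariant's confidence, or None if value fits a range."""
        if any(r.low <= value <= r.high for r in self.ranges):
            return None
        # Confidence of the invariant = mean of the confidences of its ranges
        violated = mean(r.confidence for r in self.ranges) if self.ranges else 0.0
        self.ranges.append(Range(value, value, new_confidence))  # new width-one range
        return violated


state = Invariant("slot.state", [Range(1, 2, 0.875)])
noisy = Invariant("buffer.length", [Range(0, 1000, 0.25), Range(5000, 9000, 0.75)])

violations = []
for inv, value in [(noisy, 3000), (state, 3)]:
    conf = inv.observe(value)
    if conf is not None:
        violations.append((conf, inv.name, value))

violations.sort(key=lambda v: v[0], reverse=True)  # most trusted invariants first
print(violations)
```

Sorting this way puts the violation of the tightly bounded state variable ahead of the one on the loosely bounded length.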

Efficiency Improvements

  • To improve efficiency, ranges are merged

    • Two ranges are merged only if the difference in confidence between the initial range with the higher confidence and the newer range is less than some empirically determined constant

    • This will result in merging ranges that are close together and have similar confidence

  • We will also limit the number of ranges per program point and will drop ranges with low confidence
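
One possible reading of the merge rule is sketched below: the "newer range" is taken to be the merged range, and the merge goes ahead only if its confidence is within `max_drop` of the better of the two originals. That reading, the tuple layout `(low, high, count)`, and the function names are assumptions. `placeholder_confidence` is a stand-in chosen only so the example runs (observations per unit of width); it is not the formula from the slides.

```python
def placeholder_confidence(r):
    """Stand-in measure: observed values per unit of range width (assumption)."""
    low, high, count = r
    return count / (high - low + 1)


def try_merge(a, b, confidence, max_drop):
    """Return the merged range if confidence drops by less than max_drop, else None."""
    merged = (min(a[0], b[0]), max(a[1], b[1]), a[2] + b[2])
    best_before = max(confidence(a), confidence(b))
    if best_before - confidence(merged) < max_drop:
        return merged
    return None


def prune(ranges, confidence, max_ranges):
    """Keep only the max_ranges ranges with the highest confidence."""
    return sorted(ranges, key=confidence, reverse=True)[:max_ranges]


near = try_merge((1, 2, 10), (3, 3, 4), placeholder_confidence, max_drop=1.0)
far = try_merge((1, 2, 10), (100, 100, 4), placeholder_confidence, max_drop=1.0)
print(near)  # adjacent ranges merge
print(far)   # distant ranges stay separate

kept = prune([(1, 2, 10), (50, 59, 1), (100, 100, 4)], placeholder_confidence, 2)
print(kept)
```

Under this stand-in, merging distant ranges produces a wide, sparsely populated range whose confidence collapses, so only nearby ranges pass the threshold.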

More Efficiency Improvements

  • Deinstrumentation

    • When the program has been running for a suitably long period of time and has no high-confidence ranges for a particular invariant, we stop checking that invariant

    • We hypothesize that this will eliminate checking of variables that hold random or arbitrary values or can take on most of the values allowed by their type
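
A deinstrumentation check along these lines could be sketched as below. Using an observation count as the measure of "suitably long" and a fixed `min_confidence` threshold are assumptions, as are the names `CheckedInvariant` and `record`.

```python
class CheckedInvariant:
    """Tracks whether an invariant is still worth checking."""

    def __init__(self, min_observations, min_confidence):
        self.min_observations = min_observations  # assumed proxy for running time
        self.min_confidence = min_confidence      # assumed "high confidence" cutoff
        self.observations = 0
        self.active = True

    def record(self, range_confidences):
        """Count one observation; deactivate after warm-up if no range is confident."""
        if not self.active:
            return
        self.observations += 1
        warmed_up = self.observations >= self.min_observations
        has_confident_range = any(c >= self.min_confidence for c in range_confidences)
        if warmed_up and not has_confident_range:
            self.active = False  # stop checking this invariant


noisy = CheckedInvariant(min_observations=3, min_confidence=0.8)
for _ in range(3):
    noisy.record([0.1, 0.2])
print(noisy.active)  # deactivated after the warm-up period
```

An invariant whose ranges never become confident, such as one on a variable holding arbitrary values, drops out after the warm-up, while one with at least one confident range stays active.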

Related Work

  • Daikon: tracks all observed values and then, after the run completes, determines invariants

    • Requires extensive training data

    • This provides better invariants than our proposal but at a much, much higher cost

  • A number of languages (e.g., Ada) support range-based subtyping

    • This supports our hypothesis that ranges are meaningful invariants