
Automated Whitebox Fuzz Testing (NDSS 2008)


Presentation Transcript


1. Automated Whitebox Fuzz Testing (NDSS 2008)
Patrice Godefroid, Microsoft (Research), pg@microsoft.com
Michael Y. Levin, Microsoft (CSE), mlevin@microsoft.com
David Molnar, UC Berkeley, dmolnar@eecs.berkeley.edu
Presented by: Edmund Warner, University of Central Florida, April 7, 2011

2. Acknowledgments
- Figures are taken directly from the paper or the original presentation slides
- Some slides are reused from the original presentation

3. Overview
- Definition of whitebox fuzz testing
- The search algorithm
- SAGE (Scalable, Automated, Guided Execution)
- Test findings
- Conclusions

4. What is Whitebox Fuzz Testing?
- Fuzz testing is a form of blackbox random testing
- It can be remarkably effective, but it has limitations
- Given the branch statement if (x == 10) then ..., the then-branch has only a 1 in 2^32 chance of being executed when x is a random 32-bit input
- Random testing can therefore yield low code coverage (a quick sketch of the odds follows)
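
To make those odds concrete, here is a minimal sketch (our illustration, not from the paper) of blackbox random fuzzing against the slide's branch:

```python
import random

def program(x: int) -> None:
    # The hard-to-hit branch from the slide.
    if x == 10:
        print("then-branch reached!")

# Blackbox random fuzzing: each random 32-bit input hits the branch with
# probability 1 / 2**32, so even a million trials will almost surely miss it.
for _ in range(1_000_000):
    program(random.getrandbits(32))
```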

5. Whitebox Fuzz Testing
- Combine fuzz testing with dynamic test generation:
- Run the code on an input
- Collect constraints on the inputs with symbolic execution
- Generate new constraints by negating the collected ones
- Solve the constraints with a constraint solver
- Synthesize new inputs (one round of this loop is sketched below)
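
A minimal sketch of one round of this loop (our illustration, not SAGE's implementation; the hypothetical 4-byte "secret" foreshadows the paper's running example on the next slides):

```python
SECRET = b"bad!"   # hypothetical 4-byte value guarding a buggy branch

def run(inp: bytes) -> None:
    """Step 1: run the code on the input."""
    assert inp != SECRET, "bug reached!"

def collect_path_constraint(inp: bytes):
    """Step 2: symbolic-execution stand-in. Record, for each byte
    comparison, whether the equality held on this concrete run."""
    return [(i, SECRET[i], inp[i] == SECRET[i]) for i in range(len(SECRET))]

def solve(constraints, base: bytes) -> bytes:
    """Step 4: toy 'constraint solver' that constructs a witness input."""
    out = bytearray(base)
    for i, ch, must_equal in constraints:
        if must_equal:
            out[i] = ch
        elif out[i] == ch:
            out[i] = (ch + 1) % 256   # any byte different from ch
    return bytes(out)

def new_inputs(inp: bytes):
    """Steps 3 and 5: negate each constraint in turn, solve, and collect
    the synthesized child inputs."""
    run(inp)
    pc = collect_path_constraint(inp)
    return [solve(pc[:j] + [(pc[j][0], pc[j][1], not pc[j][2])], inp)
            for j in range(len(pc))]

print(new_inputs(b"good"))   # [b'bood', b'gaod', b'godd', b'goo!']
```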

6. Whitebox Fuzz Testing
- In theory, this approach can lead to full program path coverage
- In practice, it falls short and the search is incomplete:
- The number of execution paths in the program is huge
- Symbolic execution, constraint generation, and constraint solving are necessarily imprecise

7. The Search Algorithm
- With blackbox fuzzing, the error in the paper's small example program is unlikely to be caught: only 5 of the 16 feasible program paths reach it, a vanishing fraction of the 2^(8*4) possible 4-byte inputs
- This is rather simple, however, for dynamic test generation

8. Dynamic Test Generation
- For instance, running the input "good" on the program yields a path constraint built from the conditional statements crossed: <i0 != 'b', i1 != 'a', i2 != 'd', i3 != '!'>
- Negating, say, the last constraint creates a new path constraint <i0 != 'b', i1 != 'a', i2 != 'd', i3 = '!'>, which the constraint solver satisfies with an input such as "goo!"
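
For reference, here is a Python rendering of the program these constraints come from (reconstructed from the paper's running example, so details may differ from the paper's C version):

```python
def top(inp: bytes) -> None:
    # Each comparison below contributes one constraint (i0..i3) to the
    # path constraint of a run. Of the 16 possible outcomes of the four
    # comparisons, 5 reach the error (all four true, or exactly three).
    cnt = 0
    if inp[0] == ord('b'): cnt += 1    # i0
    if inp[1] == ord('a'): cnt += 1    # i1
    if inp[2] == ord('d'): cnt += 1    # i2
    if inp[3] == ord('!'): cnt += 1    # i3
    if cnt >= 3:
        raise RuntimeError("abort() reached")   # the error

top(b"good")   # all comparisons fail: PC = <i0!='b', i1!='a', i2!='d', i3!='!'>
top(b"goo!")   # still safe (cnt == 1), but explores a new path
try:
    top(b"bad!")   # cnt == 4 >= 3: the input random fuzzing almost never finds
except RuntimeError as e:
    print(e)
```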

9. Limitations
- Path explosion:
- Does not scale to large, realistic programs
- Can be alleviated with different methods in the search algorithm
- Imperfect symbolic execution:
- Complex program statements (pointer manipulation)
- OS and library functions (costly to model precisely)

10. The Search Algorithm
- Solution: generational search
- Places the initial input in a workList
- Runs the program on it and checks for bugs in this first execution
- The workList is then processed by selecting an element and expanding it:
- The program is run with each child input
- Each child is assigned a score
- Each child is added to the workList

11. The Search Algorithm
- More on ExpandExecution:
- Tests the program with the input
- Generates the path constraint (PC) for that execution
- Attempts to expand each constraint in the PC
- Any child inputs obtained this way are saved for later execution

12. The Search Algorithm
- What does this mean? Given an input with its PC, the search attempts to expand all constraints in the PC:
- instead of just the last one, as in a depth-first search
- or just the first one, as in a breadth-first search
- A bound parameter is used to limit backtracking through parent nodes
- End result: the largest search space is achieved in the shortest amount of time (a self-contained sketch follows)
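
Slides 10-12 describe the generational search; here is a self-contained sketch (our paraphrase of the paper's SearchAlgorithm and ExpandExecution figures, not SAGE's actual code), reusing the toy stand-ins from the earlier sketch:

```python
import heapq

SECRET = b"bad!"   # same hypothetical toy example as before

def run_with_checks(inp: bytes) -> bool:
    """Instrumented-execution stand-in: report whether the bug fired."""
    return sum(inp[i] == SECRET[i] for i in range(4)) >= 3

def compute_pc(inp: bytes):
    """Symbolic-execution stand-in: one (index, byte, taken?) entry per
    conditional crossed by this execution."""
    return [(i, SECRET[i], inp[i] == SECRET[i]) for i in range(4)]

def solve(pc, base: bytes) -> bytes:
    """Constraint-solver stand-in: construct a witness for the constraints."""
    out = bytearray(base)
    for i, ch, must_equal in pc:
        if must_equal:
            out[i] = ch
        elif out[i] == ch:
            out[i] = (ch + 1) % 256
    return bytes(out)

def score(inp: bytes) -> int:
    """Stand-in for the block-coverage heuristic used to rank children."""
    return sum(inp[i] == SECRET[i] for i in range(4))

def expand_execution(inp: bytes, bound: int = 0):
    """Expand EVERY constraint past `bound` -- not only the last (as DFS
    would) nor only the first (as BFS would). `bound` limits backtracking
    into the path prefix inherited from the parent execution."""
    pc = compute_pc(inp)
    return [solve(pc[:j] + [(pc[j][0], pc[j][1], not pc[j][2])], inp)
            for j in range(bound, len(pc))]

def search(seed: bytes, bound: int = 0):
    worklist = [(-score(seed), 0, seed)]   # max-heap via negated scores
    seen, crashes, tick = {seed}, [], 0
    while worklist:
        _, _, item = heapq.heappop(worklist)   # pick the best-scoring input
        if run_with_checks(item):
            crashes.append(item)               # bug found; do not expand
            continue
        for child in expand_execution(item, bound):
            if child not in seen:
                seen.add(child)
                tick += 1
                heapq.heappush(worklist, (-score(child), tick, child))
    return crashes

print(search(b"good"))   # crashing inputs reached within a few generations
```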

13. SAGE
- Scalable, Automated, Guided Execution
- Can test any file-reading program running on Windows by treating the bytes read from files as symbolic input

14. SAGE Architecture
- Instead of being source-based, SAGE instruments machine code, which pays off in several ways:
- Multitude of languages and build processes: no need for specific source, compiler, or build operations; slower to start, but encompasses much more
- Compiler and post-build transformations: by performing symbolic execution on the binary code that actually ships, SAGE can also detect bugs introduced by the compiler and post-processing tools
- Unavailability of source: source-based instrumentation may be difficult for self-modifying or JITed code, and SAGE does not need data types or structures that are not visible at the machine-code level

15. Constraint Generation
- SAGE is trace-based: it replays a recorded execution trace to update the concrete and symbolic stores
- This allows constraints to be built over the input values
- For conditional jumps, it uses bitvectors to tag the EFLAGS consumed by the jumps (a simplified sketch follows)
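
A highly simplified sketch (ours; SAGE's actual trace replay is far more involved) of how replaying a trace can maintain concrete and symbolic stores side by side and tag the flags used by conditional jumps:

```python
# Recorded trace of a toy run: (opcode, dst, src) triples. "input[k]"
# sources mark bytes read from the (symbolic) input file.
trace = [
    ("mov", "eax", "input[0]"),   # eax becomes symbolic: input byte 0
    ("add", "eax", 5),            # the symbolic expression grows
    ("cmp", "eax", 10),           # sets flags; tag them with the expression
    ("jz",  "taken", None),       # conditional jump observed as taken
]

concrete = {"input[0]": 5}        # the concrete input used for this run
symbolic = {}                     # register/memory -> symbolic expression
flags_tag = None                  # symbolic expression tagging the flags
path_constraint = []

for op, dst, src in trace:
    if op == "mov":
        concrete[dst] = concrete.get(src, src)
        if isinstance(src, str) and src.startswith("input["):
            symbolic[dst] = src                  # taint flows in from input
    elif op == "add":
        concrete[dst] += src
        if dst in symbolic:
            symbolic[dst] = f"({symbolic[dst]} + {src})"
    elif op == "cmp":
        # Tag the flags with a symbolic comparison when an operand is
        # symbolic -- the analogue of SAGE's bitvector-tagged EFLAGS.
        if dst in symbolic:
            flags_tag = f"{symbolic[dst]} == {src}"
    elif op == "jz" and flags_tag is not None:
        # Constrain the input to follow the branch the trace actually took.
        taken = (dst == "taken")
        path_constraint.append(flags_tag if taken else f"not ({flags_tag})")

print(path_constraint)   # ['(input[0] + 5) == 10']
```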

16. Constraint Optimization
- SAGE employs a number of optimization techniques to improve speed and decrease memory consumption:
- Tag caching
- Unrelated constraint elimination
- Local constraint caching
- Flip count limit
- Concretization
- Constraint subsumption
- Constraint subsumption checks whether newly created constraints imply, or are implied by, existing ones (a toy example follows)
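
A toy example (ours, far simpler than SAGE's implementation) of the constraint-subsumption idea: when constraints on the same input byte imply one another, keep only the strongest:

```python
def subsumes(new, old):
    """True if constraint `new` implies constraint `old`; constraints are
    (var, op, bound) with op in {'<', '>'}, e.g. x < 5 implies x < 10."""
    var1, op1, b1 = new
    var2, op2, b2 = old
    return var1 == var2 and op1 == op2 and (
        (op1 == '<' and b1 <= b2) or (op1 == '>' and b1 >= b2))

def add_constraint(pc, new):
    """Drop constraints already implied by `new`, and skip `new` entirely
    if an existing constraint implies it."""
    if any(subsumes(old, new) for old in pc):
        return pc                          # new constraint is redundant
    return [old for old in pc if not subsumes(new, old)] + [new]

pc = []
for c in [("i0", "<", 10), ("i0", "<", 5), ("i0", "<", 8)]:
    pc = add_constraint(pc, c)
print(pc)   # [('i0', '<', 5)] -- only the strongest constraint survives
```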

17. Findings
- Generational search vs. depth-first search: on the Media 1, 2, and 3 applications tested, DFS terminated in ~11 hours having found nothing, while the generational search ran slightly longer and found 15 crashes in 4 buckets in Media 3
- Bogus seed files find few bugs
- Divergences are common: ~60%
- Most bugs are shallow
- Impact of the block-coverage heuristic: 10407 blocks added instead of 10633; not very effective in most cases

18. Conclusions
- Most unique bugs found occur on well-formed input, and within few generations
- The sample size may be limited, but the success in finding previously missed bugs argues for the new search strategy
- SAGE still needs enhancement in precision and power

19. Contributions
- A critical vulnerability in the ANI animated-cursor code (patched in MS07-017) was found that had been missed by extensive blackbox testing and static analysis
- A new search algorithm for systematic test generation was introduced, optimized for large applications
- SAGE was introduced and implemented, and it scales to programs with hundreds of millions of instructions

20. Weaknesses
- The paper itself is hard to understand in certain areas
- Nondeterminism sometimes shows up in the measured coverage of the program: same input, same program, same machine, yet different coverage

21. Improvements
- Paper: more figures explaining the heuristics and rules
- Nondeterminism: export input coverage results to a database to be checked, so that nothing is repeated
