
Mining Windows Kernel API Rules

Jinlin Yang, jinlin@cs.virginia.edu, 09/28/2005, CS696



  1. Mining Windows Kernel API Rules Jinlin Yang jinlin@cs.virginia.edu 09/28/2005 CS696

  2. My Background
     • Bounded exhaustive testing, 09/2001-01/2004
       • D. Coppit, J. Yang, S. Khurshid, W. Le, and K. Sullivan. Software Assurance by Bounded Exhaustive Testing. IEEE Transactions on Software Engineering, April 2005.
       • K. Sullivan, J. Yang, D. Coppit, S. Khurshid, and D. Jackson. Software Assurance by Bounded Exhaustive Testing. ISSTA ’04.
     • Temporal properties inference, 01/2004-present
       • J. Yang and D. Evans. Dynamically Inferring Temporal Properties. PASTE ’04.
       • J. Yang and D. Evans. Automatically Inferring Temporal Properties for Program Evolution. ISSRE ’04.
       • J. Yang and D. Evans. Automatically Discovering Temporal Properties for Program Verification. Submitted to FMSD.
       • J. Yang, D. Evans, D. Bhardwaj, T. Bhat, and M. Das. Terracotta: Mining Temporal API Rules from Imperfect Traces. Submitted to ICSE ’06.

  3. Overview
     • Problem: the lack of specifications is a major obstacle to defect detection
     • Solution: automatically infer specifications from execution traces
     • Benefits: better understanding of legacy code and the opportunity to find more defects
     • Experiments on finding kernel API rules
       • Found one previously unknown bug in Windows
       • Found interesting properties that should have been checked

  4. Problem
     • Defect detection techniques
       • Generic properties
         • E.g. pointer and buffer usage
         • PREfix [Bush et al, SP&E00], PREfast
         • Very effective
       • Application-specific properties
         • E.g. lock/unlock, resource creation/deletion
         • SLAM/SDV [Ball et al, SPIN01], ESP [Das et al, PLDI02]
     • Where do we get such properties?

  5. My Approach
     [Figure: pipeline diagram. Stages: Instrumentation, Running, Inference, Post-processing. Artifacts: Program, Instrumented Program, Test Suite, Execution Traces, Property Templates, Inferred Properties, Report.]
     J. Yang and D. Evans. Dynamically Inferring Temporal Properties. PASTE ’04.

  6. An Example
     • Alternating template: (PS)*, P≠S
     • P and S are placeholders for events
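A minimal sketch of checking the Alternating template (PS)* against one trace. It assumes the trace is first projected onto the two events bound to P and S (other events are ignored); that projection step, the function name `satisfies_alternating`, and the sample trace are illustrative assumptions, not details taken from the slides.

```python
def satisfies_alternating(trace, p, s):
    """Return True if the occurrences of p and s in `trace` match (PS)*."""
    if p == s:
        raise ValueError("the Alternating template requires P != S")
    expected = p  # the next relevant event must be P
    for event in trace:
        if event not in (p, s):
            continue  # assumption: events other than P and S are ignored
        if event != expected:
            return False
        expected = s if event == p else p
    # A match must end after an S (or contain no P/S events at all).
    return expected == p


# Hypothetical trace using two kernel API names from the slides.
trace = [
    "ExAcquireFastMutexUnsafe", "KeSetTimer", "ExReleaseFastMutexUnsafe",
    "ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe",
]
print(satisfies_alternating(
    trace, "ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe"))
```

The check is a two-state automaton, so it makes a single pass over the trace.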

  7. Implementation
     • Terracotta
       • Inference engine
       • Context-aware trace analysis
       • Heuristics for prioritizing and presenting properties
     • Performance is linear in the length of the trace and the number of distinct events
     • More information: http://www.cs.virginia.edu/terracotta

  8. Lessons
     • Missing interesting properties
       • Original algorithm requires 100% satisfaction
       • Real world is never perfect
         • Traces collected by sampling
         • Object information unavailable
         • Imperfect programs
       • Can we develop better inference to handle this?
     • Too much noise in the results
       • Interesting properties are buried among uninteresting ones
       • Can we develop heuristics to select the interesting ones?

  9. Refinement of Inference
     • How to detect interesting properties in the face of imperfect traces?
     • Example
       • PS PS PS PS PS PS PS PS PS PPP
       • The dominant behavior is that P and S alternate
       • 10 subtraces, 90% satisfy Alternating
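The example can be reproduced with a short sketch that scores a property by the fraction of subtraces matching (PS)*, then compares that score to a threshold. The function name `alternating_ratio` is hypothetical, and the subtraces are written as strings over the letters P and S purely for illustration; the 0.90 threshold is the value reported on the kernel results slide.

```python
import re

ALTERNATING = re.compile(r"(PS)*")


def alternating_ratio(subtraces):
    """Fraction of subtraces that match the Alternating template exactly."""
    if not subtraces:
        raise ValueError("need at least one subtrace")
    matching = sum(1 for sub in subtraces if ALTERNATING.fullmatch(sub))
    return matching / len(subtraces)


# The slide's example: nine well-behaved subtraces and one violation.
subtraces = ["PS"] * 9 + ["PPP"]
ratio = alternating_ratio(subtraces)
print(ratio)          # 0.9
print(ratio >= 0.90)  # True: accepted despite the imperfect subtrace
```

A strict checker would reject this property outright because of the single PPP subtrace; the ratio keeps it as the dominant behavior.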

  10. Refinement of Inference (2)
     • How to pick out interesting properties?
     • Which one is more likely to be interesting?
     • Heuristic: C→D is often more interesting
     • Compute the call graph for Windows binaries
     • Keep A→B if B is not reachable from A
     Case 1:
         void A(){ ... B(); ... }
         void KeSetTimer(){ KeSetTimerEx(); }
     Case 2:
         void x(){ C(); ... D(); }
         void x(){ ExAcquireFastMutexUnsafe(&m); ... ExReleaseFastMutexUnsafe(&m); }
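The call-graph filter can be sketched as a reachability test: a candidate pair (A, B) is kept only if B cannot be reached from A. The dictionary below is a toy call graph built from the two cases on the slide, not the real call graph of any Windows binary, and the helper names `reachable` and `keep_property` are hypothetical.

```python
from collections import deque


def reachable(call_graph, src, dst):
    """Breadth-first search: can `dst` be reached from `src` via calls?"""
    seen = {src}
    queue = deque([src])
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee == dst:
                return True
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return False


def keep_property(call_graph, a, b):
    """Keep the pair (a, b) only if b is not reachable from a."""
    return not reachable(call_graph, a, b)


# Toy call graph (assumption): caller -> list of direct callees.
call_graph = {
    "KeSetTimer": ["KeSetTimerEx"],
    "x": ["ExAcquireFastMutexUnsafe", "ExReleaseFastMutexUnsafe"],
}

# Case 1: KeSetTimer simply wraps KeSetTimerEx, so the pair is dropped.
print(keep_property(call_graph, "KeSetTimer", "KeSetTimerEx"))
# Case 2: acquire and release are siblings under x, so the pair is kept.
print(keep_property(call_graph, "ExAcquireFastMutexUnsafe",
                    "ExReleaseFastMutexUnsafe"))
```

The intuition is that a pair explained by one function calling the other says little about how callers must use the API, while a pair of siblings reflects a usage rule.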

  11. Refinement of Inference (3)
     • Heuristic: the more similar two events are, the more likely the property is interesting
     • Relative edit distance between A and B
       • Partition A and B into words
       • A has wA words, B has wB words, and they share w common words
     • For example:
       • Ke Acquire In Stack Queued Spin Lock → Ke Release In Stack Queued Spin Lock
       • Similarity = 85.7%
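The transcript does not preserve the similarity formula. One formula consistent with the 85.7% figure is 2w / (wA + wB), since both names have 7 words and share 6 of them (12 / 14 ≈ 0.857). The sketch below uses that formula as an assumption; the CamelCase splitting rule and the function names are also illustrative, and this is a word-overlap measure rather than a character-level edit distance.

```python
import re
from collections import Counter


def split_words(name):
    """Split a CamelCase API name into words (simple rule, an assumption)."""
    return re.findall(r"[A-Z][a-z0-9]*", name)


def similarity(a, b):
    """Assumed formula: 2 * w / (wA + wB), with w counted as a multiset."""
    words_a, words_b = split_words(a), split_words(b)
    if not words_a and not words_b:
        return 0.0
    common = sum((Counter(words_a) & Counter(words_b)).values())
    return 2 * common / (len(words_a) + len(words_b))


score = similarity("KeAcquireInStackQueuedSpinLock",
                   "KeReleaseInStackQueuedSpinLock")
print(f"{score:.1%}")  # 85.7%
```

With a cutoff such as the 0.5 value mentioned on the results slide, this pair would pass easily, while a pair of unrelated names with no shared words would score 0.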

  12. Results: Kernel
     • Approximation
       • PAL threshold = 0.90
       • 7611 properties
     • Call-graph and edit-distance based reduction
       • Use the call graph of ntoskrnl.exe, edit dist > 0.5
       • 142 properties, a 53-fold reduction!
       • Small enough for manual inspection
     • 56 apparently interesting properties (40%)
       • Locking discipline
       • Resource allocation and deletion

  13. Results: Kernel (2)
     • Found interesting properties that should be checked
       • Several types of kernel SpinLock
       • The Static Driver Verifier should have checked them
     • ESP found one previously unknown bug in ntfs.sys
       • Double-acquire of FastMutex
       • Confirmed and fixed by the responsible developers
     Static Driver Verifier: Finding Bugs in Device Drivers at Compile-Time. WinHEC, April 2004.
     M. Das, S. Lerner, and M. Seigle. ESP: Path-Sensitive Program Verification in Polynomial Time. PLDI ’02.

  14. Summary of Experiments
     • We inferred interesting rules about kernel APIs!
       • SDV already encodes some properties: http://download.microsoft.com/download/5/b/5/5b5bec17-ea71-4653-9539-204a672f11cf/SDV-intro.doc
       • We inferred undocumented ones too
     • Inference scales well to realistic traces
     • Approximation is effective at tolerating imperfect traces and detecting dominant patterns
     • Call-graph and edit-distance based reduction is very effective
     • Checking inferred properties with a defect detection tool is promising
     • Other experiments: Vulcan APIs, Daisy file system

  15. Conclusion
     • Constructing interesting properties is important and difficult
     • Automatic inference from execution traces is lightweight and effective
     • Practical value
       • Helps developers understand legacy code
       • Gives us the opportunity to leverage sophisticated static analysis tools to find application-specific defects

  16. Q & A
     • For more information: jinlin@cs.virginia.edu, http://www.cs.virginia.edu/terracotta
     • Great collaborators
       • UVa: David Evans, Ed Mitchell
       • Microsoft: Stephen Adams, Deepali Bhardwaj, Thirumalesh Bhat, Manuvir Das, Damian Hasse, Marne Staples, Rick Vicik, Jason Yang, Zhe Yang
