
Model Checking x86 Executables with CodeSurfer/x86 and WPDS++

G. Balakrishnan (1), T. Reps (1,2), N. Kidd (1), A. Lal (1), J. Lim (1), D. Melski (2), R. Gruian (2), S. Yong (2), C.-H. Chen (2), and T. Teitelbaum (2,3)
(1) University of Wisconsin; (2) GrammaTech, Inc.; (3) Cornell University


Presentation Transcript


  1. Model Checking x86 Executables with CodeSurfer/x86 and WPDS++
     G. Balakrishnan (1), T. Reps (1,2), N. Kidd (1), A. Lal (1), J. Lim (1), D. Melski (2), R. Gruian (2), S. Yong (2), C.-H. Chen (2), and T. Teitelbaum (2,3)
     (1) University of Wisconsin; (2) GrammaTech, Inc.; (3) Cornell University

  2. Static Bug-Detection Tools
     • IR construction: source code → front end → CFG + call graph + other info
     • IR exploration: IR + state machine → model checker → error report or OK

  3. Static Bug-Detection Tools
     • IR recovery: executable → memory-access analyzer → CFG + call graph + memory-access info
     • IR exploration: IR + state machine → model checker → error report or OK

  4. Why Executables?
     • Reveals platform-specific choices made by compiler
     • What you see is what you get
     • Some source-level issues go away
     • Better platform for finding security vulnerabilities
     • Source-code tools: lack of fidelity can allow vulnerabilities to escape detection

  5. Minimizing Data Lifetime?
     • Windows
       • Login process keeps a user’s password in the heap after a successful login
     • Should minimize data lifetime by
       • clearing memory
       • calling free()
     • But . . .
       • the compiler might optimize away the memory-clearing code (“useless-code” elimination)
     Code as written:                    memset(buffer, '\0', len); free(buffer);
     After useless-code elimination:     free(buffer);
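
A minimal sketch of one common workaround for the problem on this slide. It is not from the presentation, and the names `secure_clear` and `use_secret` are hypothetical. The idea is to write the zeros through a `volatile`-qualified pointer, which mainstream compilers treat as stores they may not discard. Platform routines such as `SecureZeroMemory` (Windows), `explicit_bzero` (BSD/glibc), or C11 Annex K `memset_s` serve the same purpose where available.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: zero n bytes at p through a volatile-qualified
 * pointer, so the stores are not removed as "useless" code even though
 * the buffer is freed immediately afterwards. */
void secure_clear(void *p, size_t n)
{
    volatile unsigned char *bytes = (volatile unsigned char *)p;
    while (n--) {
        *bytes++ = 0;
    }
}

/* Hypothetical usage: copy a secret into the heap, use it, clear it, free it.
 * Returns 0 on success and -1 if the allocation fails. */
int use_secret(const char *secret)
{
    size_t len = strlen(secret) + 1;
    char *buffer = malloc(len);
    if (buffer == NULL) {
        return -1;
    }
    memcpy(buffer, secret, len);
    /* ... authenticate with buffer ... */
    secure_clear(buffer, len);   /* takes the place of memset(buffer, '\0', len) */
    free(buffer);
    return 0;
}
```

Whether the clearing code survived can only be confirmed by inspecting the generated executable, which is the point the slide makes.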

  6. Puzzle
       int callee(int a, int b) {
         int local;
         if (local == 5) return 1;
         else return 2;
       }
       int main() {
         int c = 5;
         int d = 7;
         int v = callee(c,d);
         // What is the value of v here?
         return 0;
       }
     Answer: 1 (for the Microsoft compiler)

  7. Tutorial on x86 (Intel Syntax)
       p = q;
       p = *q;
       *p = q;
       p = &a[2];

  8. Tutorial on x86 (Intel Syntax)
       mov ecx, edx          ecx = edx;
       mov ecx, [edx]        ecx = *edx;
       mov [ecx], edx        *ecx = edx;
       lea ecx, [esp+8]      ecx = &a[2];
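
The four instruction forms above can be read as four pointer operations in C. The sketch below is an illustration only; the function names are hypothetical, and the instruction noted in each comment is the slide's pairing, not necessarily what a given compiler emits for that function.

```c
#include <stddef.h>

/* mov ecx, edx : register-to-register copy (p = q). */
int *copy_pointer(int *q)
{
    int *p = q;
    return p;
}

/* mov ecx, [edx] : load the value stored at the address held in q (p = *q). */
int *load_through(int **q)
{
    int *p = *q;
    return p;
}

/* mov [ecx], edx : store q at the address held in p (*p = q). */
void store_through(int **p, int *q)
{
    *p = q;
}

/* lea ecx, [esp+8] : compute an address without touching memory (p = &a[2]).
 * The +8 on the slide assumes 4-byte ints and a local array that starts at
 * [esp]; here the array is passed in, so only the address arithmetic matches. */
int *address_of_third(int a[])
{
    return &a[2];
}
```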

  9. Puzzle
       Standard prolog       Prolog for 1 local
       push ebp              push ebp
       mov ebp, esp          mov ebp, esp
       sub esp, 4            push ecx

       int callee(int a, int b) {
         int local;
         if (local == 5) return 1;
         else return 2;
       }
       int main() {
         int c = 5;
         int d = 7;
         int v = callee(c,d);
         // What is the value of v here?
         return 0;
       }
     Answer: 1 (for the Microsoft compiler)

       mov [ebp+var_8], 5
       mov [ebp+var_C], 7
       mov eax, [ebp+var_C]
       push eax
       mov ecx, [ebp+var_8]
       push ecx
       call _callee
       . . .
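
The answer follows from the instruction sequence: `main` leaves 5 in `ecx` just before the call, and the one-local prolog allocates `local` with `push ecx`, so the slot starts out holding 5. The sketch below replays that sequence on a toy stack model. The `Machine` type, the 0xCCCCCCCC fill pattern, and the function names are assumptions made for illustration. With the standard prolog the real slot would hold an indeterminate value; the fill pattern only stands in for that.

```c
#include <stdint.h>

enum { STACK_WORDS = 32 };

/* Toy model: one array slot per 32-bit stack word; esp and ebp are array
 * indices, and the stack grows toward index 0. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    int esp;
    int ebp;
    uint32_t eax, ecx;
} Machine;

static void push(Machine *m, uint32_t value)
{
    m->esp -= 1;
    m->mem[m->esp] = value;
}

/* Replays main's call sequence and callee's prolog, then evaluates
 * "local == 5". Pass 1 for the "push ecx" prolog, 0 for "sub esp, 4". */
int simulate_callee(int prolog_uses_push_ecx)
{
    Machine m;
    for (int i = 0; i < STACK_WORDS; i++) {
        m.mem[i] = 0xCCCCCCCCu;     /* arbitrary stand-in for stale stack contents */
    }
    m.esp = STACK_WORDS;
    m.ebp = STACK_WORDS;
    m.eax = 0;
    m.ecx = 0;

    /* main: the constants stand for the loads from var_C (d) and var_8 (c). */
    m.eax = 7; push(&m, m.eax);     /* mov eax, [ebp+var_C] ; push eax */
    m.ecx = 5; push(&m, m.ecx);     /* mov ecx, [ebp+var_8] ; push ecx */
    push(&m, 0);                    /* call _callee: placeholder return address */

    /* callee prolog */
    push(&m, (uint32_t)m.ebp);      /* push ebp */
    m.ebp = m.esp;                  /* mov ebp, esp */
    if (prolog_uses_push_ecx) {
        push(&m, m.ecx);            /* push ecx: allocates local and stores 5 in it */
    } else {
        m.esp -= 1;                 /* sub esp, 4: allocates local, leaves it as is */
    }

    uint32_t local = m.mem[m.ebp - 1];   /* local lives at [ebp-4] */
    return (local == 5) ? 1 : 2;
}
```

In this model `simulate_callee(1)` takes the `return 1` branch, matching the slide's answer, while `simulate_callee(0)` depends entirely on the stale contents of the slot.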

  10. The Vision
      • Code-inspection tools for security analysts
      • Analyses for identifying
        • security vulnerabilities and bugs
        • malicious behavior (code vs. memory snapshots)
        • commonalities and differences
      • Platform for
        • de-compilation
        • code obfuscation
        • installation of protection mechanisms
        • remediation of security vulnerabilities
        • de-obfuscation (w/ assistance from dyn. tools)

  11. What Should a Tool Provide?
      • IR recovery
        • control-flow graph (w/ indirect jumps resolved)
        • call graph (w/ indirect calls resolved)
        • identification of variables
        • values of pointers
        • used, killed, and possibly-killed variables for CFG nodes
        • data dependences
        • identification of types: base types, pointer types, structs, and classes
      • GUI for code browsing and navigation
      • Scripting language
        • API for accessing the IR
        • API for modifying the IR
      • IR exploration
        • API for traversal/searching/pattern matching
        • API for defining static-analyzers/model-checkers
        • GUI to investigate warnings
      • Cooperation with dynamic tools
      No use of symbol-table or debugging information!!!

  12. CodeSurfer/x86 Architecture
      • Input: Binary
      • IDA Pro: Parse Binary
      • Initial estimate of
        • code vs. data
        • procedures
        • call sites
        • malloc sites
      • Connector: Value-set Analysis
        • fleshed-out CFGs
        • fleshed-out call graph
        • used, killed, may-killed variables for CFG nodes
        • points-to sets
        • reports of violations
      • CodeSurfer: Build CFGs, Build SDG, Browse
      • Clients: Security Analyzers, Decompiler, Binary Rewriter, User Scripts

  18. IR Recovery: Scope of our Ambitions
      • Programs that conform to a “standard compilation model”
        • procedures
        • activation records
        • global data region
        • heap-allocated structs/objects (malloc/new)
        • virtual functions
        • dynamically linked libraries
      • Report violations
        • violations of stack protocol
        • return address modified within procedure
      Memory-safety violations!

  19. Static Analysis of Executables: State of the Art Prior to CS/x86
      • Relies on symbol-table/debugging info
        • Atom, EEL, Vulcan, Rival
      • Able to track only data movements via registers
        • EEL, Cifuentes, Debbabi, Debray
      • Poor treatment of memory operations
        • Overly conservative treatment → many false positives
        • Non-conservative treatment → many false negatives
      • Limited usefulness for security analysis

  20. An Application of CodeSurfer/x86
      • Project at MIT Lincoln Labs (originally classified)
        • Adopted CodeSurfer/x86 (replacing IDA Pro)
        • DARPA funding under “Dynamic quarantine of worms”
        • PI: Rob Cunningham; PM: Anup Ghosh
      • Given a worm . . .
        • What are its target-discovery, propagation, and activation mechanisms?
        • What is its payload?
      • Use of CodeSurfer/x86’s analysis mechanisms
        • Find system calls
        • Find their arguments
        • Follow dependences backwards to find where their values come from
        • . . .
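
"Follow dependences backwards" amounts to a backward reachability query over a dependence graph. The sketch below shows that traversal on a small adjacency matrix. It is a generic illustration, not the CodeSurfer/x86 API; `DependenceGraph`, `add_dependence`, and `backward_slice` are hypothetical names, and the graph is assumed to have at most `MAX_NODES` nodes.

```c
#include <stdbool.h>
#include <stddef.h>

enum { MAX_NODES = 16 };

/* Hypothetical dependence graph: depends_on[u][v] is true when node u uses
 * a value produced by node v. node_count must not exceed MAX_NODES. */
typedef struct {
    size_t node_count;
    bool depends_on[MAX_NODES][MAX_NODES];
} DependenceGraph;

void add_dependence(DependenceGraph *g, size_t user, size_t producer)
{
    if (user < g->node_count && producer < g->node_count) {
        g->depends_on[user][producer] = true;
    }
}

/* Sets in_slice[v] for every node that start depends on, directly or
 * transitively, including start itself. Uses an explicit worklist; each
 * node is pushed at most once, so MAX_NODES entries are enough. */
void backward_slice(const DependenceGraph *g, size_t start,
                    bool in_slice[MAX_NODES])
{
    size_t worklist[MAX_NODES];
    size_t top = 0;

    for (size_t i = 0; i < MAX_NODES; i++) {
        in_slice[i] = false;
    }
    if (start >= g->node_count) {
        return;
    }
    in_slice[start] = true;
    worklist[top++] = start;

    while (top > 0) {
        size_t u = worklist[--top];
        for (size_t v = 0; v < g->node_count; v++) {
            if (g->depends_on[u][v] && !in_slice[v]) {
                in_slice[v] = true;
                worklist[top++] = v;
            }
        }
    }
}
```

Starting the traversal at the node for a system-call argument yields the set of instructions that can influence that argument.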

  21. Demo
      • CodeSurfer/C
      • CodeSurfer/x86

  22. Why Executables?
      • Reveals platform-specific choices made by compiler
        • memory layout
          • padding between fields of a struct
          • which variables are adjacent?
        • register usage
        • execution order
        • optimizations performed
        • compiler bugs
      • Some source-level issues go away
        • analyze the actual library code, not hand-written stubs
        • in-line assembly code
        • use of multiple source languages
      • Better platform for finding security vulnerabilities
        • A source-code tool would have to duplicate all choices made by the compiler & optimizer

  23. IR Exploration
      • API for traversal/searching/pattern matching
      • API for defining static-analyzers/model-checkers
        • Use a script to traverse IR
        • Create a model of the program as Weighted PDS
        • Invoke analyzer (WPDS++)
      • Path Explorer tool
        • Software-assurance plug-in to CodeSurfer/x86
        • Performs security-related analyses on the IR
        • Uses the GUI to investigate warnings
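
A rough sketch of what "create a model of the program as Weighted PDS" can look like, using the usual encoding of an interprocedural CFG as pushdown rules over a single control location: an intraprocedural edge rewrites the top stack symbol, a call pushes the callee entry above the return site, and a procedure exit pops. This is not the WPDS++ API. The `PdsRule` type, the function names, and the integer `weight_id` handle are assumptions made for illustration.

```c
#include <stddef.h>

/* The kind doubles as the number of stack symbols on the right-hand side. */
typedef enum { RULE_POP = 0, RULE_STEP = 1, RULE_PUSH = 2 } RuleKind;

/* Hypothetical rule <p, from_node> -> <p, to_nodes...> with an attached weight.
 * CFG nodes are identified by small integers; weight_id is an opaque handle
 * to whatever dataflow transformer the analysis associates with the rule. */
typedef struct {
    int from_node;
    int to_nodes[2];
    RuleKind kind;
    int weight_id;
} PdsRule;

/* Intraprocedural edge n1 -> n2:  <p, n1> -> <p, n2> */
PdsRule intraprocedural_rule(int from, int to, int weight_id)
{
    PdsRule r = { from, { to, 0 }, RULE_STEP, weight_id };
    return r;
}

/* Call at call_site to a procedure with entry callee_entry, returning to
 * return_site:  <p, call_site> -> <p, callee_entry return_site> */
PdsRule call_rule(int call_site, int callee_entry, int return_site, int weight_id)
{
    PdsRule r = { call_site, { callee_entry, return_site }, RULE_PUSH, weight_id };
    return r;
}

/* Procedure exit:  <p, exit_node> -> <p, (empty)> */
PdsRule return_rule(int exit_node, int weight_id)
{
    PdsRule r = { exit_node, { 0, 0 }, RULE_POP, weight_id };
    return r;
}

/* Number of stack symbols on the rule's right-hand side (0, 1, or 2). */
size_t rhs_length(const PdsRule *r)
{
    return (size_t)r->kind;
}
```

A script that traverses the IR would emit one such rule per CFG edge, call site, and exit node before handing the rule set to the analyzer.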

  24. Related Work
      • Balakrishnan and Reps, “Analyzing memory accesses in x86 executables” [CC04] http://www.cs.wisc.edu/~reps/#cc04
      • Debray et al., “Alias analysis of executable code” [POPL 98]
      • Cifuentes et al., “Assembly to high-level language translation” [ICSM 98]
      • A. Mycroft, “Type-based decompilation” [ESOP 99]
      • Linn et al., “Stack analysis of x86 executables” [Unpublished]
      • Guo et al., “Practical and accurate low-level pointer analysis”
      • Amme et al., “Data dependence analysis of assembly code” [PACT 98]

  25. Questions & Discussion
