
The Deconstruction of Dyninst



  1. The Deconstruction of Dyninst Andrew R. Bernat University of Wisconsin bernat@cs.wisc.edu Barton P. Miller University of Wisconsin bart@cs.wisc.edu

  2. Challenges • Systems are becoming increasingly complex • Scale: thousands or millions of cores • Software: new languages, operating systems, … • Hardware: new or expanded architectures • End-to-end tool suites are not tenable • Tool designers find it increasingly hard to “keep up” with new systems • Componentization of tools is the answer! • Allows code sharing between research groups • Makes porting between systems easier • We can stop reinventing the wheel!

  3. Benefits of Components • Code/technique sharing between projects • How many CFG implementations exist? • We all have areas of expertise • Provides a more equal opportunity for smaller groups • Tools have an enormous “start-up” cost • Current fundamental software simply isn’t there • Increases collaboration

  4. Sharing Isn’t Always Easy • Local requirements are easier to understand • External users have wildly different requirements • How do we generalize our own code?

  5. Easier to Give than Receive • We have to be willing to use other people’s work • Components must be open • No licensing issues • Available source code • Abstractions must be extensible • Must be easy to add new capability to external components • Annotations may be the solution

  6. Why Binary Code? • Access to the source code often is not possible: • Proprietary software packages. • Stripped executables. • Proprietary libraries: communication (MPI, PVM), linear algebra (NAG), database query (SQL libraries). • Binary code is the only authoritative version of the program. • Changes occurring in the compile, optimize and link steps can create non-trivial semantic differences between the source and the binary. • Worms and viruses are rarely provided with source code.

  7. So You Want to Build a Binary Tool? • Binaries are weird • Functions overlap and share code • Code in data/data in code is a classic problem • Obfuscated and self-checksumming programs • Globally optimized programs (especially Windows) • Missing (or incorrect) symbol/debug information • We have what we think are best-of-breed solutions here… • Binaries are huge – 250MB of executable code! • Running programs are highly dynamic • Constantly load/unload libraries • Fork/exec auxiliary programs (e.g., ssh) • Create/destroy threads

  8. So You Want to Build a Binary Tool (cont’d) • Instruction sets are difficult • Each has its idiosyncrasies – register windows, variable-length instructions, predicates… • Operating systems keep changing • Particularly Linux and its interfaces • Languages become more abstract • How do we present a binary OpenMP construct to the user?

  9. Our Starting Point: Dyninst • A machine-independent library for machine level code analysis and modification. • Designed (and evolved) interface • We’ve learned a lot from user feedback • Designed: point and snippet • Evolved: queries, callbacks, replacements, ... • Originally developed as part of Paradyn – the deconstruction started a long time ago.

  10. Code: Query-Based • Find control-flow and data elements • Functions, loops, basic blocks • Variables and parameters • Query the call graph (caller/callee) • Iterate over an intra-procedural CFG • Key point: “global knowledge” kills scalability

  11. Modification: Replace and Wrap • Replace/Remove Function Calls • Redirect a function call (or remove it) • Replace Functions • Redirect all calls (current and future) made to a function so that they reach a new function • Replace Instructions • Code snippet executes instead • Wrap Functions • Allow the new function to call the replaced one (potentially with all its original parameters).

  12. Control: Callbacks and “Do It Now” • Callbacks as a notification interface • System calls (fork/exec/exit) • Thread create/destroy • Loaded libraries, ... • “Do It Now” operations • Create/attach, start, stop • Malloc/free • OneTimeCode • Load module

  13. Generation: Abstract Syntax Trees • Basic coding (lessons from compilers) • Arithmetic, if statements, loops • Function calls • Perfect for dynamic process data • Effective addresses (or values) of memory operations • PID/TID executing instrumentation • Read/write local or global variables • Read/write parameters and return values

  14. Machine Independent Code • Abstract Syntax Trees: one tree, different code per platform • SPARC code: sethi %hi(ctr); ld [. . .],%o1; add %o1,%o1,1; st %o1,[. . .] • IA32 code: incl ctr • Power code: cau r3,r0,hi%ctr; l r4,lo%ctr(r3); addi r4,1(r4); st r4,lo%ctr(r3)
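As a rough illustration of the idea on this slide, the sketch below builds one small abstract syntax tree for "increment ctr" and lowers it with two toy back ends. The class and function names are invented for this example and are not Dyninst's API; the emitted strings are pseudo-assembly, not real encodings.

```python
from dataclasses import dataclass

# Hypothetical AST node types, invented for this sketch (not Dyninst's API).
@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Const:
    value: int

@dataclass(frozen=True)
class AddAssign:
    """Represents `target = target + amount`."""
    target: Var
    amount: Const

def gen_load_store(node: AddAssign) -> list[str]:
    """Toy back end for a load/store machine: load, add, store."""
    return [
        f"load {node.target.name} -> r1",
        f"add r1, {node.amount.value} -> r1",
        f"store r1 -> {node.target.name}",
    ]

def gen_memory_operand(node: AddAssign) -> list[str]:
    """Toy back end for a machine that can update memory in one instruction."""
    if node.amount.value == 1:
        return [f"inc {node.target.name}"]
    return [f"add {node.target.name}, {node.amount.value}"]

# The same tree is handed to both back ends.
increment = AddAssign(Var("ctr"), Const(1))
print(gen_load_store(increment))
print(gen_memory_operand(increment))
```

The tool writer only constructs the tree; the choice of instruction sequence stays inside each back end.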

  15. The Future: BinInst • Tool-kit component architecture for binary analysis and editing • Open source • Open data structure definitions • Interactive/extendable abstractions • Machine-independent abstract interfaces • Batch-enabled analyses • Static and dynamic code patching • All major analysis products are exportable • Enhanced testability and accompanying test suites

  16. Dynamic Editing Scenario (Dynamic Instrumentation) • [Diagram; component labels in extraction order] Code Queries and Instrumentation Requests, AST, Symbol Table Dump, Call Graph, Binary Decode and Parsing, Binary Code, Code Gen, Instr Control, Intra-Proc CFG, Process Control, Stack Walker, Idiom Signatures, Disassembly, User Process

  17. Static Editing Scenario (Binary Rewriting) • [Diagram; component labels in extraction order] Code Queries and Instrumentation Requests, AST, Symbol Table Dump, Call Graph, Binary Decode and Parsing, Binary Code, Code Gen, Instr Control, Intra-Proc CFG, Binary Generator, Idiom Signatures, Disassembly

  18. Interactive Editing Scenario (Static or Dynamic) • [Diagram; component labels in extraction order] Symbol Table Dump, Call Graph, Binary Decode and Parsing, Binary Code, Code Gen, Instr Control, Intra-Proc CFG, Binary Generator, Idiom Signatures, Disassembly

  19. Analysis Scenario • [Diagram; component labels in extraction order] VSA, Buffer Overrun, Symbol Table Dump, Call Graph, Binary Decode and Parsing, Binary Code, Connector 2, Intra-Proc CFG, Idiom Signatures, Other Tool, Code Surfer, Disassembly

  20. [Untitled diagram; component labels in extraction order] AST, Code Queries and Instrumentation Requests, Symbol Table Parser, Symbol Table Dump, Call Graph, Code Gen, Code Parser, Binary Code, Intra Proc CFG, Instr Control, Instruction Decoder, Idiom Detector, Process Control, Idiom Signatures, Stack Walker, Disassembly • File formats shown: PE, ELF, COFF • Architectures shown: IA32, AMD64, Power

  21. SymtabAPI • Version 1.0 available as of June 5, 2007. • Supports ELF, XCOFF, PE (Linux, Solaris, AIX, Windows). • Debug information available in next release • Line numbers • Local variables • Variable/function types • Users can create new symbols and debug information and update existing ones • See http://www.paradyn.org/html/symtab1.0-features.html

  22. Stackwalker • Available soon on all Dyninst platforms. • Cross-platform API for collecting first- and third-party stackwalks. • Callback interface allows users to plug in their own stack walking mechanisms, e.g.: • Walking through non-standard stack frames created by optimized functions. • Using stackwalking debug information provided by another library.

  23. InstructionAPI • Decodes machine code into abstract instruction representation • Interface allows straightforward data flow and control flow analysis • Query interface is designed for analysis, e.g.: • Control flow targets • Registers read/written • Memory addresses accessed • Instructions can be annotated with analysis results • Provides disassembly interface • Pluggable formatters

  24. Ongoing Work • How do we make arbitrary binary code modification easy? • Binary Editing: Drew Bernat • How do we find functions in optimized code? • Machine Learning-based Parsing: Nate Rosenblum • How can we ensure this works on all platforms? • Automatic Test Generation: Greg Cooksey • How do we scale to millions of nodes? • Scalable Reliability: Dorian Arnold • Filesystem Interface to Group Operations: Mike Brim

  25. Demo: Unstrip • Demonstration tool that regenerates a stripped binary’s symbol table • Uses Dyninst stripped code parser to find function entry points • Derives correct function sizes from the binary • Uses SymtabAPI to write the new symbol table into the binary.

  26. Demo: Memory Checker • Simple memory checker library that locates the following: • Double free of buffers (fast!) • Free of invalid memory (fast!) • Trace malloc/free to determine largest memory users (slow…) • Uses Stackwalker to identify the call chain at allocation/deallocation sites • Future: use SymtabAPI to identify function names and call site line numbers
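The two fast checks listed on this slide can be sketched as bookkeeping over a stream of malloc/free events. This is only an illustration of the idea, not the demo's actual code: the class name, method names, and addresses are hypothetical, and a real tool would receive these events from instrumented calls and attach a stack walk to each one.

```python
class AllocationTracker:
    """Tracks live heap blocks from a stream of malloc/free events."""

    def __init__(self):
        self.live = {}       # address -> size of currently allocated blocks
        self.freed = set()   # addresses freed and not reallocated since
        self.errors = []     # (kind, address) tuples, in detection order

    def on_malloc(self, address, size):
        self.live[address] = size
        self.freed.discard(address)  # the allocator may reuse an address

    def on_free(self, address):
        if address in self.live:
            del self.live[address]
            self.freed.add(address)
        elif address in self.freed:
            self.errors.append(("double free", address))
        else:
            self.errors.append(("invalid free", address))

tracker = AllocationTracker()
tracker.on_malloc(0x1000, 64)
tracker.on_free(0x1000)
tracker.on_free(0x1000)   # second free of the same block
tracker.on_free(0x2000)   # address that was never allocated
print(tracker.errors)
```

Both checks are dictionary or set lookups per event, which fits the slide's "fast!" label; ranking the largest memory users would additionally require keeping per-call-site totals.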

  27. Back-up Slides

  28. The Basic Mechanism • [Diagram] Application Program: Function foo with an inserted jump to a Trampoline containing Instrumentation and the Relocated Instruction(s) • What about self-checksumming code? (read &foo)

  29. Invisible Relocation • Programs can be modified without changing their behavior • Now: emulate PC-relative code • Future: emulate self-checksumming operations • We can model and prove relocation correct • What does “changing behavior” mean? • How can we predict this? • Side benefit: less expensive relocation • Only emulate what is necessary to maintain original behavior
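One simple case of keeping PC-relative code correct after relocation is a relative branch: when the instruction moves, its displacement has to be recomputed so that the target stays the same. The sketch below assumes an x86-style convention in which the displacement is relative to the end of the instruction; the function names and addresses are made up, and the check that the new displacement fits the instruction's encoding is left out.

```python
def branch_target(address, length, displacement):
    """Target of a PC-relative branch (displacement measured from the next instruction)."""
    return address + length + displacement

def relocated_displacement(old_address, new_address, length, displacement):
    """Displacement that keeps the original target after the branch is moved."""
    target = branch_target(old_address, length, displacement)
    return target - (new_address + length)

# Hypothetical 5-byte branch moved from the function body into a trampoline.
old_addr, new_addr, length, disp = 0x400100, 0x7F0000, 5, 0x20
new_disp = relocated_displacement(old_addr, new_addr, length, disp)

# The relocated copy must reach the same address as the original.
print(hex(branch_target(new_addr, length, new_disp)))
```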

  30. Binary Code is Getting Messier • Parsed x86 Linux binaries on department server • Many binaries have non-contiguous functions • 190 out of 820 executables • 67 out of 235 libraries • Many binaries contain some code that appears to be shared between functions • Steps were taken to reduce instances of false sharing by recognizing some non-returning functions (exit, abort) • Libraries and program binaries exhibit differing characteristics.

  31. Typical Binaries Exhibit Gaps • 41% of all functions in surveyed binaries were unreachable through static parsing • Up to 90% of functions are unreachable in some binaries • [Diagram: a gap between Func A and Func B] Functions could exist at any point in the gap
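To make the notion of a gap concrete, the sketch below computes the address ranges of a code section that are not covered by any known function. The function name, the half-open range convention, and the addresses are assumptions chosen for illustration.

```python
def find_gaps(functions, section_start, section_end):
    """Return the [start, end) ranges of the section not covered by any function.

    `functions` is an iterable of (start, end) half-open address ranges;
    overlapping ranges are tolerated.
    """
    gaps = []
    cursor = section_start
    for start, end in sorted(functions):
        if start > cursor:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if cursor < section_end:
        gaps.append((cursor, section_end))
    return gaps

known = [(0x1000, 0x1040), (0x1080, 0x10C0)]   # Func A and Func B
for start, end in find_gaps(known, 0x1000, 0x1100):
    print(hex(start), hex(end))
```

Every byte inside a returned range is a candidate function entry that speculative parsing would have to consider.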

  32. Statistical Binary Parsing • Idea: Use the properties (features) of the code found through static parsing to improve detection of functions in the gaps • E.g., prefix matching, n-gram models of basic blocks, CFG-based features • Approaches parsing as a machine learning problem • Avoids pitfalls of narrow heuristics like stack frame preamble matching • Different compilers emit different idioms • Lack of a single idiom should not immediately disqualify a function candidate

  33. Scalability: Reliable and Scalable Tools (Dorian Arnold) • How do we build tools that efficiently scale to thousands or tens of thousands of nodes? • Start with our MRNet infrastructure: a TBŌN (tree-based overlay network) • Efficiently and flexibly handle multicasts and reductions. • How do we handle failures? • How do we recover from failures? • We use a relaxed consistency model and the natural redundancy in a tree-based computation to get fault tolerance for free.
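The reduction half of a tree-based overlay can be sketched as a post-order traversal in which every internal node combines the results of its children before passing one value upward. This is a single-process illustration only and does not use MRNet's API; the topology, node names, and sample values are hypothetical.

```python
def reduce_tree(node, children, leaf_value, combine):
    """Post-order reduction: each node combines the results of its children."""
    kids = children.get(node, [])
    if not kids:
        return leaf_value(node)
    result = reduce_tree(kids[0], children, leaf_value, combine)
    for kid in kids[1:]:
        result = combine(result, reduce_tree(kid, children, leaf_value, combine))
    return result

# Hypothetical topology: one front end, two internal nodes, four back ends.
topology = {"fe": ["i0", "i1"], "i0": ["b0", "b1"], "i1": ["b2", "b3"]}
samples = {"b0": 3, "b1": 5, "b2": 2, "b3": 7}

total = reduce_tree("fe", topology, samples.__getitem__, lambda a, b: a + b)
largest = reduce_tree("fe", topology, samples.__getitem__, max)
print(total, largest)
```

Because each node sees only its own children, the front end handles two messages here instead of four; the saving grows with the fan-out and depth of the tree.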

  34. Testing • We test Paradyn and Dyninst on multiple: • processors, operating systems, operating system versions; • compilers, languages, optimization levels; • stripped vs. non-stripped, threaded vs. non-threaded. • We need an automated testing mechanism (and so do you) • Input: declarative specification of the tests desired • Output: automatically built and run tests for all platforms/optimization levels/compilers/… • Challenges: • Not all combinations should be run on all platforms. • Combinations of specifications are inter-related. • Similar features appear differently on different platforms: e.g., specifying optimization level.
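A minimal sketch of the input/output relationship described on this slide: a declarative specification is expanded into concrete test configurations, and a validity rule drops combinations that should not run. The dimension names, values, and platform/compiler pairings below are illustrative assumptions, not the project's actual test specification.

```python
from itertools import product

# Hypothetical declarative specification; the values are illustrative only.
spec = {
    "platform": ["linux-x86", "aix-power", "windows-x86"],
    "compiler": ["gcc", "xlc", "msvc"],
    "optimization": ["none", "high"],
}

# Which compilers make sense on which platform (an assumption for this sketch).
compilers_for = {
    "linux-x86": {"gcc"},
    "aix-power": {"gcc", "xlc"},
    "windows-x86": {"msvc"},
}

def valid(combo):
    """Reject combinations that should not be run on a given platform."""
    return combo["compiler"] in compilers_for[combo["platform"]]

def expand(spec):
    """Yield every valid combination of the specification's dimensions."""
    keys = list(spec)
    for values in product(*(spec[k] for k in keys)):
        combo = dict(zip(keys, values))
        if valid(combo):
            yield combo

tests = list(expand(spec))
print(len(tests), "of", 3 * 3 * 2, "combinations are valid")
```

Platform-specific spellings, such as how an optimization level is passed to each compiler, would be resolved in a later step that maps each abstract value to a concrete flag.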

  35. Binary Analysis and Editing • Analysis: processing of the binary code to extract syntactic and symbolic information. • Symbol tables (if present) • Decode (disassemble) instructions • Control-flow information: basic blocks, loops, functions • Data-flow information: from basic register information to highly sophisticated (and expensive) analyses. • Binary rewriting: static (before execution) modification of a binary program: • Analyze the program and then insert, remove, or change the binary code, producing a new binary. • Dynamic instrumentation: dynamic (during execution) modification of a binary program: • Analyze the code of the running program and then insert, remove, or change the binary code, changing the execution of the program. • Can operate on running programs and servers.

  36. Uses of Binary Analysis and Editing • Cyber-forensics • Analysis: understand the nature of malicious code • Binary rewriting: produce a new version of the code that might be instrumented, sandboxed, or modified for study. • Dynamic instrumentation: same features, but can do it interactively on an executing program. • Hybrid static/dynamic: control execution and produce intermediate versions of the binary that can be re-executed (and further instrumented). • Program tracing: instructions, memory accesses, function calls, system calls, ... • Debugging • Testing • Performance profiling • Performance modeling • Reverse engineering

  37. Patching in Instrumentation • [Diagram] Application Program: Function foo • Base Trampoline: Save Regs, Instrument, Restore Regs, Relocated Instruction(s) • Mini Trampolines: Instrumentation Code, Instrumentation Code

  38. Instrumentation Overview • Typical instrumentation • Non-recursive • Single-threaded • Simple • Minimal register usage • Lots of extra work! • Can we do better? • Generated sequence: Save GPRs; Save FPRs; Save Flags; Thread Index; Tramp Guard; load $foo -> r1; add r1, 1 -> r1; store r1 -> $foo; Restore GPRs; Restore FPRs; Restore Flags

  39. User Code Optimization • Produce efficient code for specific platforms • IA-32: “increment memory location” • May have minimal direct effect • Processors are very good at optimization • However, these optimizations drive other profitable optimizations – an optimization cascade • Remove register uses -> avoid saving the registers • Remove inter-instrumentation dependencies

  40. Register Save/Restore Costs • Save/restore only registers that are both live and used during instrumentation • Perform liveness analysis and track register usage during code generation • Skip floating-point if unused • Unexpected overhead: special-purpose registers • IA-32 condition flags are expensive to access
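The "live and used" rule can be sketched as a set intersection on top of a backward liveness pass. The example below handles straight-line code only; the instruction encoding as (defs, uses) pairs and the register names are assumptions made for this sketch, and real liveness analysis also has to handle branches and calls.

```python
def live_before(instructions, live_out):
    """Backward liveness over straight-line code.

    Each instruction is a (defs, uses) pair of register-name sets.
    Returns the set of live registers before each instruction.
    """
    live = set(live_out)
    result = []
    for defs, uses in reversed(instructions):
        live = (live - set(defs)) | set(uses)
        result.append(frozenset(live))
    result.reverse()
    return result

# Hypothetical application code that follows the instrumentation point.
app_code = [
    ({"r3"}, {"r1", "r2"}),   # r3 = r1 + r2
    ({"r1"}, {"r3"}),         # r1 = r3
]
live_at_point = live_before(app_code, live_out={"r1"})[0]

# Registers the generated instrumentation code writes.
used_by_instrumentation = {"r1", "r4"}

# Save only what is both live in the application and clobbered by us.
to_save = sorted(live_at_point & used_by_instrumentation)
print(to_save)
```

Here r4 is clobbered but dead and r2 is live but untouched, so neither needs a save/restore pair.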

  41. Unnecessary Dyninst Services • Instrumentation includes auxiliary code • Recursive trampoline guards • Multithreaded index calculation • This code may not be necessary • User code makes no function calls -> no recursion • Single-threaded instrumentation -> no MT needed • Solution: automatically prune “dead” code • Another cascade of optimizations

  42. Bringing It All Together • Automatic optimizations • Disable tramp guard • Skip multithread code • IA-32 specific generation • Avoid register saves • Original sequence: Save GPRs; Save FPRs; Save Flags; Thread Index; Tramp Guard; load $foo -> r1; add r1, 1 -> r1; store r1 -> $foo; Restore GPRs; Restore FPRs; Restore Flags • Optimized code: inc $foo
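The cascade on this slide can be sketched as a trampoline builder that emits each auxiliary step only when it is needed. The function and parameter names are hypothetical and the output is a list of pseudo-instructions; the optimized call also assumes that no register touched by the instrumentation, including the condition flags, is live at the instrumentation point.

```python
def build_trampoline(body, makes_calls, multithreaded, regs_to_save):
    """Assemble an instrumentation sequence, omitting steps that are not needed."""
    seq = [f"save {r}" for r in regs_to_save]
    if multithreaded:
        seq.append("thread index")   # only needed for per-thread data
    if makes_calls:
        seq.append("tramp guard")    # only needed if recursion is possible
    seq.extend(body)
    seq.extend(f"restore {r}" for r in regs_to_save)
    return seq

unoptimized = build_trampoline(
    ["load $foo -> r1", "add r1, 1 -> r1", "store r1 -> $foo"],
    makes_calls=True,
    multithreaded=True,
    regs_to_save=["GPRs", "FPRs", "flags"],
)
optimized = build_trampoline(
    ["inc $foo"],
    makes_calls=False,
    multithreaded=False,
    regs_to_save=[],
)
print(len(unoptimized), "steps ->", len(optimized), "step")
```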

  43. Optimization Results • [Results chart not preserved in the transcript] • Optimized code: inc $foo

  44. Dyninst Automated Testing • A test suite of almost 100 operation-specific tests. • Runs each night on each platform on the nightly build. • Variations for different compilers, languages (C, C++, Fortran), stripped vs. non-stripped code, etc. • Results reported on the web (reachable from paradyn.org or dyninst.org home pages): http://www.paradyn.org/testresults/dyntable.html

  45. Instrumentation Overhead • Relocation allows us to instrument anywhere • But what about efficiency? • Causes of overhead: • Relocated instructions • Instrumentation code • Register save/restore code • Dyninst-provided features • Recursive tramp guard • Multithread index calculation • What can we simplify or remove?

  46. Parsing and Function Discovery • Parsing sites are determined in three ways: • Function locations as reported by symbol table (when available) • Targets of inter-procedural control transfer operations • Speculative parsing of gaps between procedures • The goal is to find function prologues
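The first two sources of parsing sites can be sketched as a worklist traversal: start from the entries the symbol table reports and follow call targets until nothing new appears. The call map below stands in for actually decoding each function body, and all names and addresses are hypothetical.

```python
def discover_functions(entry_points, call_targets):
    """Worklist traversal from known entries, following call targets.

    `call_targets` maps a function entry address to the entry addresses it
    calls (a stand-in for decoding the function body).
    """
    found = set()
    worklist = list(entry_points)
    while worklist:
        addr = worklist.pop()
        if addr in found:
            continue
        found.add(addr)
        worklist.extend(call_targets.get(addr, ()))
    return found

symbol_table_entries = [0x1000]
calls = {
    0x1000: [0x1100, 0x1200],
    0x1100: [0x1200],
    0x1300: [0x1000],   # never called from a known function
}
reachable = discover_functions(symbol_table_entries, calls)
print(sorted(hex(a) for a in reachable))
```

The function at 0x1300 is never reached this way, which is the situation the third source, speculative parsing of gaps, is meant to cover.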

  47. Experimental Framework • Data set consists of 583 Linux binaries with full symbol information on department server • Each binary provides reference, training, and test data • Reference: the location of all code in the binaries, obtained by parsing with full symbol information • Training: the functions in the binary obtained through static parsing only • Test: the candidate functions obtained by parsing from every byte within the remaining gaps • The classifier is trained and tested on individual binaries • Use Condor to accelerate evaluation.

  48. The Learning System • Language models • Unigram and bigram models, and edit distance of known basic blocks • Maximum common prefix length • Compares a candidate block’s initial instructions to every known function entry block • Reachability measure (log normalized) • The feature weights are trained using logistic regression • [Diagram] Weighted features f1, f2, f3, f4 are summed (∑) and passed to a decision function: a binary classifier for candidate entry blocks
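The decision step in the diagram can be sketched as a logistic model: a weighted sum of feature values is squashed into a probability and compared against a threshold. The weights, bias, threshold, and feature values below are made up for illustration; in the system described here the weights would come from training with logistic regression.

```python
import math

def entry_probability(features, weights, bias):
    """Logistic model: sigmoid of the weighted feature sum."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def is_function_entry(features, weights, bias, threshold=0.5):
    """Binary decision for one candidate entry block."""
    return entry_probability(features, weights, bias) >= threshold

# Made-up weights for four features f1..f4 (not trained values).
weights = [1.5, 0.8, 2.0, 0.5]
bias = -2.0

strong_candidate = [0.9, 0.7, 0.8, 0.6]
weak_candidate = [0.1, 0.0, 0.1, 0.2]

for name, feats in [("strong", strong_candidate), ("weak", weak_candidate)]:
    p = entry_probability(feats, weights, bias)
    print(name, round(p, 3), is_function_entry(feats, weights, bias))
```

Because the score is a sum, a low value for any single feature lowers the probability without ruling the candidate out, which matches the earlier point that the lack of one idiom should not disqualify a function candidate.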

  49. Preliminary Results • [Results table not preserved in the transcript] • Some brand new results just improved these numbers!

  50. The Deconstruction of Dyninst
