
Binary Translation and Applications

R. Sekar, Stony Brook University


Presentation Transcript


  1. Binary Translation and Applications
     R. Sekar, Stony Brook University

  2. Binary Translation for Protecting Applications
     • Basic approach: instrument OS + application to enforce policies that protect the application from a hostile OS
     • Why binary translation?
       • Versatile: enforce a wide range of properties
         • Low-level: memory pages, instructions/operands, …
         • Higher-level: fine-grained (data-structure level) memory isolation, policies on callable functions and parameters, …
         • Global: information flow, control-flow integrity, …
       • Wide applicability:
         • COTS and legacy applications available only in binary form
         • Components in hand-written assembly code
           • Parts of the OS, performance-critical components of some applications (e.g., Firefox, GIMP, some media codecs, …)
       • Flexibility:
         • Apply different policies on different OS components (allows some components to be more trusted than others)

  3. Binary Translation Today …
     • State of the art uses dynamic binary translation
       • Instrument each instruction just before first execution
       • Side-steps key challenges with COTS binaries: accurate disassembly and dynamically generated code
     • Is dynamic translation really practical?
       • Yes! It is already in wide use in a number of tools
         • Valgrind, VMware, QEMU, …
       • One-time translation overhead can be
         • Relatively low for coarse-grained instrumentation
           • i.e., where only a small fraction of instructions are instrumented
         • Easily amortized for long-running applications

  4. So what is the problem?
     • Very high overheads for fine-grained instrumentation
       • Many security applications instrument most instructions, e.g., taint-tracking and fine-grained memory isolation
       • Dynamic approaches save/restore registers and flags so that they can be used by the added instrumentation
         • Need many memory reads and writes for each instruction
       • Purely dynamic approach precludes most optimizations (which rely on static analysis for soundness)
       • 400% to 4000% (e.g., Valgrind memcheck) overhead!
       • Large applications can take several minutes to start up!
     • Difficulty in reasoning about higher level properties
       • Another limitation of pure dynamic translation

  5. Our Approach
     • Use (mostly) static binary rewriting to reduce overheads
       • Eliminate most runtime overhead for disassembly or translation
       • Can use static analysis to reduce instrumentation overheads
     • But COTS binaries pose many challenges
       • Binaries lack information available to compilers
         • Variable kind (local/global), size or type
         • Function boundaries or number of parameters
       • Position-independent code (PIC), non-standard use of stack, functions with side-effects, and aliasing are common
       • More complications: hand-written assembly, exceptions, multi-threading, unrestricted pointer arithmetic and pointer forging, …
     • Solution
       • Develop a static-analysis-based approach that systematically overcomes these challenges
       • Can also form the basis for reasoning about higher level properties

  6. Previous Results: Binary Taint-Tracking
     • Metadata for each word of data
     • Metadata for M: TAGMAP[M/4]
     • [Figure: ADDRESS SPACE mapped onto TAGMAP; tag 1 = UNTRUSTED, tag 0 = TRUSTED]
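
The word-granularity tag map on this slide can be sketched as follows. This is only an illustration under stated assumptions: the `TagMap` class, its method names, and the toy 64-byte address space are hypothetical; the slide itself gives only the indexing rule TAGMAP[M/4] and the tag values 0 (trusted) and 1 (untrusted).

```python
WORD_SIZE = 4              # bytes per tagged word, from the slide's TAGMAP[M/4]
TRUSTED, UNTRUSTED = 0, 1  # tag values shown on the slide


class TagMap:
    """Hypothetical tag map: one tag per 4-byte word of a toy address space."""

    def __init__(self, address_space_bytes):
        # Round up so a partial trailing word still gets a tag.
        n_words = (address_space_bytes + WORD_SIZE - 1) // WORD_SIZE
        self.tags = bytearray(n_words)  # all words start out TRUSTED (0)

    def get(self, addr):
        # Metadata for address M lives at TAGMAP[M / 4].
        return self.tags[addr // WORD_SIZE]

    def set(self, addr, tag):
        self.tags[addr // WORD_SIZE] = tag


tagmap = TagMap(64)
tagmap.set(0x10, UNTRUSTED)  # taints the whole word covering 0x10..0x13
```

Because the tag is per word, every byte address within the same 4-byte word shares one tag, which is why the lookup needs only a divide (or a shift by 2).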

  7. Taint-Tracking: Problem with Performance
     • R = R + M
       • Save R1, R2, R3 in memory
       • R1 = &M
       • R2 = TAGMAP[R1 >> 2]
       • R3 = RegTaint[R]
       • Taint = R2 || R3
       • RegTaint[R] = Taint
       • Restore R1, R2, R3
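
The instrumentation listed on this slide can be simulated in a few lines. This is a minimal sketch, assuming a toy machine: the `memory`, `regs`, and `reg_taint` dictionaries and the function name `add_reg_mem` are hypothetical, and the save/restore of scratch registers (the part the slide identifies as costly) is not modeled because Python locals stand in for R1 to R3.

```python
WORD_SHIFT = 2             # one tag per 4-byte word: index = address >> 2

memory = {0x20: 7}         # toy memory: address -> word value (assumed)
tagmap = [0] * 16          # toy TAGMAP: 0 = trusted, 1 = untrusted
regs = {"R": 5}            # toy register file
reg_taint = {"R": 0}       # RegTaint: one taint bit per register


def add_reg_mem(r, addr):
    """Simulate `R = R + M` together with its taint-propagation steps."""
    mem_tag = tagmap[addr >> WORD_SHIFT]  # R2 = TAGMAP[R1 >> 2], with R1 = &M
    reg_tag = reg_taint[r]                # R3 = RegTaint[R]
    reg_taint[r] = mem_tag | reg_tag      # Taint = R2 || R3; RegTaint[R] = Taint
    regs[r] += memory[addr]               # the original instruction


tagmap[0x20 >> WORD_SHIFT] = 1            # mark the word at 0x20 as untrusted
add_reg_mem("R", 0x20)
```

With 0/1 tags, the bitwise OR used here gives the same result as the logical OR on the slide. On real hardware, each of the commented steps is one or more extra instructions, plus the memory traffic for saving and restoring R1 to R3.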

  8. Binary Taint-Tracking: Key Results
     • A new modular, scalable static analysis for binaries that recovers information about
       • Local variables and function parameters
       • PC-relative addressing (PIC code)
       • Limited information about aliasing
     • Effective new optimizations supported by static analysis
       • Register-caching of metadata (taint)
       • Metadata-sharing among locations with equal taint labels
       • “Fast path” code specialization
     • Good performance
       • Our 30% to 160% overhead (1.3x to 2.6x slowdown) is much lower than in previous work (4x to 40x slowdown)
       • Fast enough for online operation with no perceptible slowdown for many CPU-intensive applications (e.g., media players)
       • No perceptible difference to application startup times
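
The slide does not spell out what “fast path” code specialization involves. One common reading, assumed here, is that a cheap guard selects an uninstrumented copy of the code when no input is tainted and falls back to the fully instrumented copy otherwise. The function names below are hypothetical, and the sketch is not taken from the presentation.

```python
def sum_words_slow(values, taints):
    """Instrumented version: propagates taint alongside every addition."""
    total, taint = 0, 0
    for value, tag in zip(values, taints):
        total += value
        taint |= tag
    return total, taint


def sum_words_fast(values):
    """Specialized version: no taint bookkeeping; the result is known clean."""
    return sum(values), 0


def sum_words(values, taints):
    # Guard: take the fast path only when every input is untainted.
    if not any(taints):
        return sum_words_fast(values)
    return sum_words_slow(values, taints)
```

In this toy version the guard scans every tag, so it saves little; a real system would need a much cheaper check (for example, at a coarser granularity) for the specialization to pay off.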

  9. Key Research Problems
     • Static analysis of binary code
       • Cope with challenges posed by low-level code (hand-written assembly, pointer fabrication, violation of stack/calling conventions, …)
       • Reconstruct higher level views needed to support required security properties
     • Robust disassembly
       • Avoid optimistic assumptions or assumptions regarding compilers used for code generation
         • Rely on static analysis instead
         • If assumptions are unavoidable, then verify them at runtime
       • Fall back to dynamic translation when all else fails
         • Indirect calls that cannot be analyzed, dynamically generated code, …
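
The "verify at runtime, fall back to dynamic translation" strategy for indirect calls might look roughly like the sketch below. Everything in it is an assumption made for illustration: the address table, the placeholder strings standing in for code, and the names `dispatch_indirect` and `dynamic_translate` are hypothetical and do not come from the presentation.

```python
# Original code address -> statically rewritten code (placeholder strings).
static_table = {0x1000: "static_f", 0x2000: "static_g"}

# Cache of targets translated at run time, filled lazily.
dynamic_cache = {}


def dynamic_translate(target):
    """Stand-in for a dynamic translator; returns a placeholder code handle."""
    return f"dyn_{target:#x}"


def dispatch_indirect(target):
    """Resolve an indirect call target that static analysis could not pin down."""
    # Runtime check: is the target one the static rewriter already handled?
    code = static_table.get(target)
    if code is not None:
        return code
    # Otherwise fall back to dynamic translation, caching the result.
    if target not in dynamic_cache:
        dynamic_cache[target] = dynamic_translate(target)
    return dynamic_cache[target]
```

For example, `dispatch_indirect(0x1000)` resolves to the statically rewritten entry, while an unknown target such as `0x3000` is translated once on first use and served from the cache afterwards.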

  10. Key Research Problems (Continued)
     • Threat analysis and defenses
       • Analyze the full range of threats (especially low-level threats) posed by hostile OSes and develop defense mechanisms
     • Leveraging compiler support (when available)
       • Utilizing type or other information provided by compiler to enhance property enforcement
       • Tie-in to SVM and certifying translation
     • Exploit hardware features
       • Features for enhanced isolation, multi-core, …

  11. Questions?
