1 / 16

Decompilation of Binary Programs

Decompilation of Binary Programs. Christina Cifuentes & K. John Gough School of Computing Science Queensland University of Technology Presented by Conny Chan. Overview. Usage of decompiler Phases of the decompiler Front-end, UDM, Back-end The decompiling system Signature generator

diella
Download Presentation

Decompilation of Binary Programs

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Decompilation of Binary Programs Christina Cifuentes & K. John Gough School of Computing Science Queensland University of Technology Presented by Conny Chan

  2. Overview • Usage of decompiler • Phases of the decompiler • Front-end, UDM, Back-end • The decompiling system • Signature generator • Conclusion • Discussion

  3. Reverse Compiler • Perform inverse process of a compiler • Usage: • Maintenance of Code • Lost code recovery • Migration of application to new HW platform. • Translation of the obsolete code into new code. • Software Security • Malicious code detection Binary code HLL program

  4. Note in advance: • Decompiler in this article: • Experimental decompiler for the DOS OS • Intel i80286 architecture • Read .com and .exe files • Produce C programs as output.

  5. Decompiler Structure Binary Program Front-end (machine dependent) UDM (analysis) Back-end (language dependent) HLL program

  6. Loader Parser The Front-end Virtual Memory Binary Program 1. Low-level Intermediate code 2. Control flow graph Semantic Analysis

  7. The Semantic Analysis • Performs idiom analysis e.g. • type propagation. • E.g. long variable found at –1 to –4. neg dx neg ax sbb dx, 0 neg dx:ax [bp-2] … [bp-4] merged [bp-2]:[bp-4]

  8. The Universal Decompiling Machine (UDM) UDM Data Flow Analysis 1. Low-level Intermediate code 2. Control flow graph 1. High-level Intermediate code 2. Structured control flow graph Control Flow Analysis

  9. The Back-end Restructuring 1. High-level Intermediate code 2. Structured control flow graph HLL Program HLL Code Generation

  10. The HLL code generation • Defines: • Global variables • Emits code for each function • In each function: • Comments of such procedures • Variables and procedures named in loc1, proc2, etc.

  11. The Decompiling System Signature Generator (dccSign) + Decompiler (dcc)

  12. Extract Signatures e.g. printf(), scanf() Signature Generator (dccSign) Decompiler (dcc) Signature Database Compiler Signatures Library Signatures Signature Database • If a library function matched, replaced by the library name instead of analyzed by dcc.

  13. Signatures • Library Signature • A series of instructions that identifies library function for a compiler. • Compiler Signature • A series of instructions that identifies a particular version of a compiler.

  14. Decompiler (dcc) Signature Checker Signature Checker • Determined if known compiler is used. • Check first n bytes of instructions with pattern-matching

  15. Conclusion • Present one way for decompiling binary program. • Prove the feasibility of writing a decompiler for a contemporary machine architecture.

  16. Discussion • Is decompilation legal?

More Related