
BinHunt: Automatically Finding Semantic Differences in Binary Programs

Debin Gao, Michael K. Reiter, Dawn Song. ICICS 2008: 10th International Conference on Information and Communications Security.


Presentation Transcript


  1. BinHunt: Automatically Finding Semantic Differences in Binary Programs • Debin Gao • Michael K. Reiter • Dawn Song • ICICS 2008: 10th International Conference on Information and Communications Security

  2. Conference • ICICS: • A bi-annual International Conference on Information, Communications and Signal Processing. The conference covers areas in Information Engineering, Communication Systems, Signal Processing, Multimedia Processing and Applications.

  3. Papers • Session V: Software security • BinHunt: Automatically Finding Semantic Differences in Binary Programs. Debin Gao (a), Mike Reiter (b) and Dawn Song (c) • Enhancing Java ME Security Support with Resource Usage Monitoring. Paolo Mori, Fabio Martinelli, Alessandro Castrucci and Francesco Roperti. IIT-CNR, Italy • Pseudo-randomness Inside Web Browsers. Guan Zhi, Zhang Long, Zhong Chen and Nan Xianghao. Peking University, China

  4. Authors • Debin Gao • Michael K. Reiter • Dawn Song

  5. Debin Gao • Assistant Professor, School of Information Systems, Singapore Management University • Automatically Adapting a Trained Anomaly Detector to Software Patches. Peng Li, Debin Gao and Michael K. Reiter. In Proceedings of the 12th International Symposium on Recent Advances in Intrusion Detection (RAID 2009) • Bridging the Gap between Data-flow and Control-flow Analysis for Anomaly Detection. Peng Li, Hyundo Park, Debin Gao and Jianming Fu. In Proceedings of the 24th Annual Computer Security Applications Conference (ACSAC 2008) • Gray-Box Extraction of Execution Graphs for Anomaly Detection. Debin Gao, Michael K. Reiter and Dawn Song. In Proceedings of the 11th ACM Conference on Computer and Communications Security (CCS 2004) • On Gray-Box Program Tracking for Anomaly Detection. Debin Gao, Michael K. Reiter and Dawn Song. In Proceedings of the 13th USENIX Security Symposium (USENIX Security 2004)

  6. Michael K. Reiter • Lawrence M. Slifkin Distinguished Professor, Department of Computer Science, University of North Carolina at Chapel Hill • Automatically adapting a trained anomaly detector to software patches. P. Li, D. Gao and M. K. Reiter. In Recent Advances in Intrusion Detection, 12th International Symposium, RAID 2009 • Fast and black-box exploit detection and signature generation for commodity software. X. Wang, Z. Li, J. Y. Choi, J. Xu, M. K. Reiter and C. Kil. ACM Transactions on Information and System Security 12(2) • On gray-box program tracking for anomaly detection. D. Gao, M. K. Reiter and D. Song. In Proceedings of the 13th USENIX Security Symposium

  7. Dawn Song • Associate Professor, Computer Science Division, University of California, Berkeley • Research Projects • BitBlaze: Binary analysis for COTS protection and malicious code defense • Binary Code Extraction and Interface Identification for Security Applications. Juan Caballero, Noah M. Johnson, Stephen McCamant, and Dawn Song. In Proceedings of the 17th Annual Network and Distributed System Security Symposium, February 2010. • Loop-Extended Symbolic Execution on Binary Programs. Prateek Saxena, Pongsin Poosankam, Stephen McCamant, and Dawn Song. In Proceedings of the ACM/SIGSOFT International Symposium on Software Testing and Analysis (ISSTA), July 2009. • BitBlaze: A New Approach to Computer Security via Binary Analysis. Dawn Song, David Brumley, Heng Yin, Juan Caballero, Ivan Jager, Min Gyung Kang, Zhenkai Liang, James Newsome, Pongsin Poosankam, and Prateek Saxena. In Proceedings of the 4th International Conference on Information Systems Security.

  8. Introduction • BinHunt: finds semantic differences in binary programs by analyzing their control flow, using a new graph isomorphism technique, symbolic execution, and theorem proving • Semantic differences: changes in program functionality • Syntactic differences: e.g., different register allocation and basic block re-ordering

  9. Challenge • A small change in the source code may cause the compiler to use a different register allocation in other parts of the program in which the corresponding source code remains the same • A small change in the source code may change the size of a small number of basic blocks, which further triggers the compiler to re-order many other basic blocks in the binary file

  10. Idea • The control flow of a program is much more resistant to “superficial” changes like different register allocations and basic block re-ordering, and therefore is a more attractive feature for finding semantic differences

  11. Assumptions • Source code of the binary files is not available • Function names extracted from these binary files are unreliable for binary difference analysis, since they can be changed easily

  12. System Overview (1) • Input: • two binary files • Output: • a matching between functions in the two binary files • a matching between basic blocks in each pair of matched functions • a matching strength for each match of functions or basic blocks

  13. System Overview (2) • Decision: • The matchings together with the matching strengths tell us where the semantic differences are • Unmatched functions and unmatched basic blocks, as well as matched functions and matched basic blocks with low matching strengths, constitute the semantic differences found between the two binary files

  14. Disassembler • parse each binary file • locate the code segment • Realization: • Implement a plug-in to IDA Pro

  15. IR Converter • IR: a dozen different statements, which are type-checked and free of side effects • Easy: our symbolic execution and theorem proving are applied on a much simpler set of instructions • Reliable: reduce the language variation in performing the same functionality

  16. CFG Constructor • CFG (control flow graph): a set of nodes, each representing a basic block, and a set of directed edges representing the control flow among the basic blocks • CG (call graph): a set of nodes corresponding to the functions in the file, and a set of directed edges representing calls among the functions
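The two graph definitions on this slide can be sketched as plain data structures. This is a minimal illustration, not BinHunt's implementation: the class names `BasicBlock`, `CFG` and `CallGraph`, their fields, and the example addresses are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class BasicBlock:
    addr: int                                   # start address (hypothetical field)
    instrs: List[str] = field(default_factory=list)


@dataclass
class CFG:
    """Control flow graph of one function: nodes are basic blocks."""
    blocks: Dict[int, BasicBlock] = field(default_factory=dict)
    edges: Set[Tuple[int, int]] = field(default_factory=set)

    def add_block(self, block: BasicBlock) -> None:
        self.blocks[block.addr] = block

    def add_edge(self, src: int, dst: int) -> None:
        # A directed edge means control may flow from block src to block dst.
        if src not in self.blocks or dst not in self.blocks:
            raise KeyError("both endpoints must be existing blocks")
        self.edges.add((src, dst))

    def successors(self, addr: int) -> List[int]:
        return sorted(d for s, d in self.edges if s == addr)


@dataclass
class CallGraph:
    """Call graph of one binary: nodes are functions, edges are calls."""
    functions: Dict[str, CFG] = field(default_factory=dict)
    calls: Set[Tuple[str, str]] = field(default_factory=set)

    def callees(self, name: str) -> List[str]:
        return sorted(c for f, c in self.calls if f == name)


# Example: an if/else "diamond" inside a function called main (made-up addresses).
cfg = CFG()
for addr in (0x10, 0x20, 0x30, 0x40):
    cfg.add_block(BasicBlock(addr))
for src, dst in [(0x10, 0x20), (0x10, 0x30), (0x20, 0x40), (0x30, 0x40)]:
    cfg.add_edge(src, dst)

cg = CallGraph(functions={"main": cfg, "helper": CFG()},
               calls={("main", "helper")})
```

Keeping edges as a set of pairs makes the later "is there an edge between these two nodes" checks in graph matching a constant-time lookup.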

  17. Graph Isomorphism Engine • Basic Block Comparison • Symbolic Execution and Theorem Proving • Maximum common subgraph isomorphism problem • Backtracking Algorithm

  18. Symbolic Execution • Definition • represent values of program variables with symbolic values instead of concrete (initialized) data, and manipulate expressions involving symbolic values • Procedure • Step 1: • find all the input and output registers and variables • Step 2: • use symbolic execution to represent the final values of the output registers and variables
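The two steps above can be sketched on a toy three-address IR. This is only an illustration under stated assumptions: the tuple format `(dst, op, src1, src2)`, the operation names, and the function `symbolic_execute` are hypothetical and are not BinHunt's IR or API.

```python
def symbolic_execute(block):
    """Symbolically execute one basic block.

    block: list of (dst, op, src1, src2) statements, where each source
    operand is a register/variable name (str) or an integer constant.
    Returns (inputs, outputs): inputs are the names read before being
    written (step 1); outputs map every written name to an expression
    tree over those inputs (step 2).
    """
    env = {}       # name -> symbolic expression for its current value
    inputs = []    # names whose initial value is used by the block

    def value(operand):
        if isinstance(operand, int):
            return operand                # constants stay concrete
        if operand in env:
            return env[operand]           # value computed earlier in the block
        if operand not in inputs:
            inputs.append(operand)
        return operand                    # a symbolic input stands for itself

    for dst, op, a, b in block:
        env[dst] = (op, value(a), value(b))
    return inputs, env


# Example: ebx = (eax + 1) * 2, computed through a temporary t.
block = [("t", "add", "eax", 1),
         ("ebx", "mul", "t", 2)]
inputs, outputs = symbolic_execute(block)
```

Here `inputs` is `['eax']` and `outputs['ebx']` is the tree `('mul', ('add', 'eax', 1), 2)`, which no longer mentions the temporary `t`.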

  19. Theorem Proving • Realization • STP: a decision procedure for the satisfiability of quantifier-free formulas in the theory of bit-vectors and arrays • Procedure • pick the symbolic representation of one register/variable from each basic block and use STP to test if they are equivalent, assuming that the inputs to the basic blocks share the same values • Assurance • if two basic blocks are found to be different by our technique of symbolic execution and theorem proving, then they must not be functionally equivalent • This property holds even if the two binary files are compiled using different compilers or compiler options.
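The equivalence test described above can be illustrated without a solver. The sketch below is a stand-in, not how STP works: STP decides bit-vector formulas, whereas this toy enumerates every input value of an assumed 8-bit word. The expression format (nested tuples), the operation table `OPS`, and the function names are assumptions for the example.

```python
from itertools import product

MASK = 0xFF  # toy 8-bit word size (assumption) so exhaustive search stays small

OPS = {
    "add": lambda a, b: (a + b) & MASK,
    "sub": lambda a, b: (a - b) & MASK,
    "mul": lambda a, b: (a * b) & MASK,
    "xor": lambda a, b: a ^ b,
    "shl": lambda a, b: (a << (b % 8)) & MASK,
}


def evaluate(expr, env):
    """Evaluate an expression tree: int constant, input name, or (op, a, b)."""
    if isinstance(expr, int):
        return expr & MASK
    if isinstance(expr, str):
        return env[expr]
    op, a, b = expr
    return OPS[op](evaluate(a, env), evaluate(b, env))


def equivalent(expr1, inputs1, expr2, inputs2):
    """True if the expressions agree whenever paired inputs share a value.

    inputs1[i] is paired with inputs2[i], so blocks that use different
    registers for the same role can still be compared.
    """
    if len(inputs1) != len(inputs2):
        return False
    for values in product(range(MASK + 1), repeat=len(inputs1)):
        env1 = dict(zip(inputs1, values))
        env2 = dict(zip(inputs2, values))
        if evaluate(expr1, env1) != evaluate(expr2, env2):
            return False
    return True


# eax + eax in one block versus ecx << 1 in the other: same function,
# different register and different instruction.
same = equivalent(("add", "eax", "eax"), ["eax"], ("shl", "ecx", 1), ["ecx"])
diff = equivalent(("add", "eax", "eax"), ["eax"], ("add", "ecx", 1), ["ecx"])
```

Enumeration only scales to tiny word sizes and few inputs; a decision procedure such as STP is what makes the same check practical on real 32-bit register values.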

  20. Matching Strength • Basic Block • 1.0: functionally equivalent and registers used are the same • 0.9: functionally equivalent while registers used are different • lower: scored on how functionally equivalent they are • Function • 1.0: instructions (x86 or IR) of the two functions are the same • others: subgraph measurement divided by the number of nodes in the CFG that has fewer nodes, where subgraph measurement is defined as the summation of matching strengths of matched nodes (basic blocks)
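The function-level score above is simple arithmetic and can be written out directly. The function name, its parameters, and the example values are hypothetical, chosen only to illustrate the formula on the slide.

```python
def function_matching_strength(block_strengths, num_nodes_a, num_nodes_b,
                               identical_instructions=False):
    """Matching strength of two matched functions.

    block_strengths: matching strengths of the matched basic-block pairs.
    num_nodes_a, num_nodes_b: number of basic blocks in each function's CFG.
    identical_instructions: True when the two functions have the same
    instructions, which scores 1.0 by definition.
    """
    if identical_instructions:
        return 1.0
    smaller = min(num_nodes_a, num_nodes_b)
    if smaller == 0:
        return 0.0                        # nothing to match (sketch's choice)
    subgraph_measurement = sum(block_strengths)
    return subgraph_measurement / smaller


# Three matched blocks (one exact, two with different registers) between
# a 4-block function and a 5-block function: (1.0 + 0.9 + 0.9) / 4.
strength = function_matching_strength([1.0, 0.9, 0.9], 4, 5)
```

For this example the score is about 0.7; the unmatched fourth block of the smaller function is what pulls it below the 0.9 to 1.0 range.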

  21. Backtracking Algorithm • D: • contains all possible pairs of nodes that might still be matched (initially V X M) • M: • contains matched node pairs (initially empty)
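The roles of D and M can be seen in a generic backtracking search for a maximum common induced subgraph. This is a textbook-style sketch, not BinHunt's customized algorithm: the function name, the `compatible` filter, and the pruning bound are assumptions, and here D simply starts as all cross pairs of nodes from the two graphs.

```python
def max_common_subgraph(nodes1, edges1, nodes2, edges2,
                        compatible=lambda u, v: True):
    """Return a largest list of node pairs that preserves edge structure.

    edges1 and edges2 are sets of directed (src, dst) pairs. `compatible`
    is a hypothetical hook where a basic-block comparison could rule out
    pairs before the search starts.
    """
    best = []

    def consistent(u, v, M):
        # Self-loops must agree, and so must edges to every matched pair.
        if ((u, u) in edges1) != ((v, v) in edges2):
            return False
        for u2, v2 in M:
            if ((u, u2) in edges1) != ((v, v2) in edges2):
                return False
            if ((u2, u) in edges1) != ((v2, v) in edges2):
                return False
        return True

    def backtrack(D, M):
        nonlocal best
        if len(M) > len(best):
            best = list(M)
        # Upper bound on how many more pairs D can still contribute.
        bound = min(len({u for u, _ in D}), len({v for _, v in D}))
        if len(M) + bound <= len(best):
            return
        (u, v), rest = D[0], D[1:]
        if consistent(u, v, M):
            # Matching (u, v) removes every other candidate using u or v.
            remaining = [(a, b) for a, b in rest if a != u and b != v]
            backtrack(remaining, M + [(u, v)])
        backtrack(rest, M)                # also try leaving (u, v) unmatched

    D = [(u, v) for u in nodes1 for v in nodes2 if compatible(u, v)]
    backtrack(D, [])
    return best


# A diamond-shaped CFG versus the same diamond with one extra exit block.
g1_nodes = ["a", "b", "c", "d"]
g1_edges = {("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")}
g2_nodes = ["x", "y", "z", "w", "q"]
g2_edges = {("x", "y"), ("x", "z"), ("y", "w"), ("z", "w"), ("w", "q")}
matching = max_common_subgraph(g1_nodes, g1_edges, g2_nodes, g2_edges)
```

All four blocks of the diamond are matched, and the extra block `q` is left unmatched, which is the kind of leftover that would be reported as a difference.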

  22. Case Study: gzip

  23. Case Study: tar (1)

  24. Case Study: tar (2)

  25. Case Study: tar (3)

  26. Related Work & Conclusion • BinDiff/BindView • construct a maximal subgraph isomorphism between the sets of functions in two versions of the same executable file • BinHunt: • contributes a more thorough technique (backtracking) for identifying the maximum common subgraph isomorphism • uses a novel technique for basic block comparison based on symbolic execution and theorem proving

  27. Reference

  28. Thank you!
