
WANG Zhi

Eureka: A Framework for Enabling Static Malware Analysis. The 13th European Symposium on Research in Computer Security (ESORICS), 2008. WANG Zhi. Outline: 1. Overview of Generic Unpacker. 2. System Call Level Heuristic. 3. Statistics-Based Unpacking. 4. Evaluation Metrics.



Presentation Transcript


  1. Eureka: A Framework for Enabling Static Malware Analysis. The 13th European Symposium on Research in Computer Security (ESORICS), 2008. WANG Zhi

  2. Outline • 1. Overview of Generic Unpacker • 2. System Call Level Heuristic • 3. Statistics-Based Unpacking • 4. Evaluation Metrics

  3. Overview of Unpacker • Static analysis: decompile the binary and analyze its logical structure, flow, and stored data. • Dynamic analysis: monitor the behavior of the malware binary at runtime. • Fine-grained monitoring (instruction-level) • Coarse-grained monitoring (page-level)

  4. Generic Automatic Unpackers • PolyUnpack: instruction-level tracking, model-based trigger, slow • Renovo: instruction-level tracking, heuristic trigger, slow • OmniUnpack: page-level tracking, heuristic trigger, fast • Eureka: system-call-level tracking, heuristic and statistical trigger, fast • The variability in unpacking strategies comes from the granularity at which unpacking behavior is tracked.

  5. Eureka • Coarse-grained execution tracing: NtTerminateProcess, NtCreateProcess • Statistical bigram analysis

  6. Coarse-grained Execution Tracing • Eureka uses the program-exit event as a trigger. • A call to NtTerminateProcess implies that the unpacked malicious payload has been successfully decrypted. • A large fraction of current malware uses a new process (NtCreateProcess) to execute the unpacked malicious payload.
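The two triggers on this slide can be illustrated with a minimal sketch. It assumes the system calls have already been recorded as a list of names; the function name `find_dump_trigger` and the sample trace are hypothetical and are not taken from Eureka itself.

```python
from typing import Iterable, Optional, Tuple

# The two system calls the slide names as unpacking triggers.
TRIGGER_CALLS = {"NtTerminateProcess", "NtCreateProcess"}


def find_dump_trigger(trace: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (index, name) of the first trigger call in a trace, or None."""
    for index, name in enumerate(trace):
        if name in TRIGGER_CALLS:
            return index, name
    return None


# Hypothetical trace, invented for illustration only.
trace = [
    "NtOpenFile",
    "NtAllocateVirtualMemory",
    "NtProtectVirtualMemory",
    "NtCreateProcess",
    "NtTerminateProcess",
]
print(find_dump_trigger(trace))  # (3, 'NtCreateProcess')
```

A trace that contains neither call returns `None`, which corresponds to the case where this heuristic alone never fires.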

  7. Problems • Not all malware exits; some keeps an executing version resident in memory. • Packers can generate spurious process-creation events. • Malware authors can simply avoid exiting the malware process. • The two simple heuristics above may work for a large fraction of today's malware (as much as 80%), but the same may not hold for future malware.

  8. Evaluation

  9. Statistical bigram analysis • Mine statistical patterns in x86 code. • Use simple n-gram analysis. • Use IDA Pro to extract the regions of an executable that are marked as functions. • Look for the most common bigrams (opcode pairs or 2-byte opcodes) and spaced bigrams (byte pairs separated by one or more bytes). • Finding: FF 15 (call), FF 75 (push), E8---00, and E8---FF are prevalent in x86 code.
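Counting these patterns in a byte buffer can be sketched as follows. This is an illustration rather than Eureka's implementation: it assumes each dash in E8---00 stands for one intervening byte (a gap of three), and the helper `count_pair` and the sample bytes are made up for the example.

```python
def count_pair(data: bytes, first: int, second: int, gap: int = 0) -> int:
    """Count offsets where `first` is followed by `second` after `gap` bytes.

    gap=0 counts an ordinary bigram; gap=3 counts a spaced bigram such as
    E8 ?? ?? ?? 00 (assumed reading of the slide's E8---00 notation).
    """
    last_start = len(data) - gap - 1
    return sum(
        1
        for i in range(max(last_start, 0))
        if data[i] == first and data[i + gap + 1] == second
    )


# Hand-written sample bytes, invented for illustration:
# FF 15 <4 bytes> | E8 <rel32 ending in 00> | FF 75 08 | E8 <rel32 ending in FF>
sample = bytes.fromhex("ff1500104000" "e810000000" "ff7508" "e8f0ffffff")

counts = {
    "FF 15": count_pair(sample, 0xFF, 0x15),
    "FF 75": count_pair(sample, 0xFF, 0x75),
    "E8---00": count_pair(sample, 0xE8, 0x00, gap=3),
    "E8---FF": count_pair(sample, 0xE8, 0xFF, gap=3),
}
print(counts)
```

Each of the four patterns occurs once in the sample buffer, so every count is 1.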

  10. Occurrence summary of bigrams

  11. Bigram Counts • Bigram counts during execution of goat file packed with Aspack

  12. Bigram Counts • Bigram counts during execution of goat file packed with Molbox

  13. Bigram Counts • Bigram counts during execution of goat file packed with Armadillo

  14. Bigram Counts • There are consistent and significant shifts in the bigram counts. • The simple bigram-counting approach had a success rate of over 95% in distinguishing between packed and unpacked malware instances.
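One way to make a "shift in bigram counts" concrete is to compare two count snapshots. The sketch below assumes the shift of interest is an increase and uses a hypothetical `min_increase` threshold; the slides give neither a threshold nor the counts used here, which are invented for illustration.

```python
from typing import Dict


def bigram_shift(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Per-bigram change in count between two snapshots."""
    keys = sorted(set(before) | set(after))
    return {key: after.get(key, 0) - before.get(key, 0) for key in keys}


def counts_increased(
    before: Dict[str, int], after: Dict[str, int], min_increase: int
) -> bool:
    """True if the summed increase reaches the (hypothetical) threshold."""
    return sum(bigram_shift(before, after).values()) >= min_increase


# Invented snapshots, not measurements from the presentation.
at_start = {"FF 15": 3, "FF 75": 1}
at_exit = {"FF 15": 120, "FF 75": 45}

print(bigram_shift(at_start, at_exit))
print(counts_increased(at_start, at_exit, min_increase=100))
```

In practice the threshold would have to be calibrated on known packed and unpacked samples.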

  15. Evaluation Metrics • Code-to-data ratio • An observable difference between packed and unpacked code is the amount of identifiable code and data found in the binary. • Use IDA Pro to identify valid code sequences. • In IDA Pro, data is represented by db, dw, or dd. • In packed executables, the ratio is below 3%. • In unpacked executables, the ratio is above 50%.
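The thresholds on this slide can be turned into a small check. The slide does not give the exact formula, so this sketch assumes the ratio is code bytes divided by data bytes; the function names and byte counts are hypothetical, and obtaining the counts from a disassembler is left out.

```python
PACKED_MAX = 0.03    # slide: packed executables fall below 3%
UNPACKED_MIN = 0.50  # slide: unpacked executables exceed 50%


def code_to_data_ratio(code_bytes: int, data_bytes: int) -> float:
    """Assumed definition: identified code bytes over identified data bytes."""
    if data_bytes == 0:
        return float("inf")
    return code_bytes / data_bytes


def classify(code_bytes: int, data_bytes: int) -> str:
    """Label a binary using the two thresholds quoted on the slide."""
    ratio = code_to_data_ratio(code_bytes, data_bytes)
    if ratio < PACKED_MAX:
        return "likely packed"
    if ratio > UNPACKED_MIN:
        return "likely unpacked"
    return "inconclusive"


# Invented byte counts for illustration only.
print(classify(2_000, 100_000))   # likely packed
print(classify(60_000, 40_000))   # likely unpacked
print(classify(10_000, 100_000))  # inconclusive
```

The "inconclusive" band between 3% and 50% is an addition of this sketch; the slide only states the two outer thresholds.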

  16. Code-to-data ratio • Figure panels: packed, unpacked

  17. Code-to-data ratio • Grey areas represent data; blue areas represent code. • Figure panels: original notepad.exe memory space, packed notepad.exe memory space

  18. Thank You !
