Network-Based Detection Methods for Defending Against Zero-day Attacks

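Network-Based Detection Methods for Defending Against Zero-day Attacks
April 12, 2007
Ikkyun Kim (ikkim21@etri.re.kr)
Information Security Research Division, Electronics and Telecommunications Research Institute (ETRI)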

Contents
• Vulnerability & Zero-day Attack
• Intrusion Detection
• Detection Model
• Research Trends: Zero-day Attack Detection

Vulnerabilities: Laws of Vulnerabilities
• The half-life of critical vulnerabilities is 21 days
• Half of the most prevalent vulnerabilities are replaced by new vulnerabilities every year
• The lifespan of some vulnerabilities and worms is unlimited
• 80% of worms and automated exploits occur in the first two half-lives
* Source: Gerhard Eschelbeck (Qualys), Black Hat 2004

Zero-day Attacks
The time gap between the disclosure of a vulnerability and the release of a worm that exploits it is decreasing:
• SQL Server buffer-overflow vulnerability: disclosed July 24, 2002; Slammer worm released Jan 25, 2003 (185-day attack)
• LSASS buffer-overflow vulnerability: disclosed Apr 11, 2004; Sasser worm released Apr 30, 2004 (19-day attack)
• Future worm: 2006 ?? (0-day attack)
• A zero-day attack is a computer threat that exploits undisclosed or unpatched computer application vulnerabilities (as defined by Wikipedia)
• May 2005: zero-day exploits for unknown vulnerabilities in Mozilla Firefox
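
The day counts on this slide follow directly from the quoted dates. A minimal sketch of that arithmetic, using only the dates given on the slide (the dictionary layout and function name are illustrative):

```python
from datetime import date

# (vulnerability disclosure, worm release), as quoted on the slide
TIMELINE = {
    "Slammer (SQL Server buffer overflow)": (date(2002, 7, 24), date(2003, 1, 25)),
    "Sasser (LSASS buffer overflow)": (date(2004, 4, 11), date(2004, 4, 30)),
}

def exposure_days(disclosed: date, released: date) -> int:
    """Days between public disclosure and the release of the worm."""
    return (released - disclosed).days

GAPS = {name: exposure_days(*dates) for name, dates in TIMELINE.items()}
for name, gap in GAPS.items():
    print(f"{name}: {gap}-day attack")
```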

Network Intrusion Detection - I: Misuse Analysis (Signature-based)
[Figure: signature-based detection architecture. Labels: Packet Sensor, Audit Data, Packet Parsing, Comparison Objects, Detection Engine (Pattern Matching), Signature Database, Alert, Response, Rule Manager, Response Manager, Expert Manager]

Signature-based ID: DPI (High-Performance Pattern Matching)
• By 2006, 75 percent of Global 2000 enterprises will replace or augment their firewall approach with deep packet inspection capabilities
• By 2005, enterprises will no longer use software-based application proxy firewalls
• Source: Deep Packet Inspection: Next Phase of Firewall Evolution (Gartner, 21 November 2002)
• Today: deep packet inspection capability in ASIC-based appliances
  • TippingPoint: IPS 5000E
  • TopLayer: IPS 5500
  • Cisco: IPS 4255
• By 2009 the UTM space will be the largest single market.

Network Intrusion Detection - II: Anomaly Detection
[Figure: anomaly detection architecture. Labels: Packet Sensor, Audit Data, Packet Parsing, Learning & Comparison Objects, Anomaly Analysis Engine (Statistical, Data Mining, Neural Net.), Learning, Profile, Alert, Report]

Zero-day Attack Protection
Anomaly Detection + Signature Generation + High-Performance FW (IPS)
• Current signature generation process
  • New worm outbreak
  • Anomalies are reported by people via phone/email/newsgroup
  • Worm trace is captured
  • Manual analysis by security experts
  • Signature generation
• Labor-intensive, human-mediated

Worm Model: Control Flow Hijacking
[Figure: packet layout. Labels: IP HDR, TCP/UDP HDR, UPR. LYR. PAYLOAD containing Exploit (ReturnAddr), NOP NOP NOP NOP, Decryption Code, Attack Code]
• Epsilon-Gamma-Pi Model (* Source [Crandall05])
• Epsilon (ε) = Exploit Vector
• Gamma (γ) = Bogus Control Data
• Pi (π) = Payload

ε-γ-π Model: CodeRed II Case
• Epsilon (ε) = HTTP header
• Gamma (γ) = Return address
• Pi (π) = CodeRed shellcode
* Source [Crandall05]

Control Hijacking - Example: Buffer Overflow
[Figure: normal stack vs. smashed stack]

Recent Worm Exploits

Polymorphic Worm
• The worm randomly generates a new key and the corresponding decryptor code for each copy (Mutation A, Mutation B, Mutation C); every mutation decrypts and executes the same worm body
• To detect an unknown mutation of a known virus, emulate CPU execution of the suspect code until the current sequence of instruction opcodes matches the known sequence for the virus body

Polymorphic Engine: Mutation Engine
• ADMutate alters each of these elements
  • NOP substitution with operationally inert commands
  • Shellcode encoded by XORing with a randomly generated key
  • Return address modulated: the least significant byte is altered to jump into different parts of the NOPs
[Figure: layout of a mutated exploit. Labels: NOP substitutes ("Another NOP", "Yet another NOP", "A different NOP", "Here's a NOP"), polymorphic XOR decoder, XOR'ed machine code: execve (/bin/sh), modulated pointer to NOP substitutes]
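
The XOR-encoding step can be illustrated with a harmless placeholder instead of real shellcode. This is a minimal sketch of the idea only, not ADMutate's actual code; the single-byte key, the function names, and the placeholder body are assumptions made for illustration.

```python
import secrets

def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key; applying it twice restores the input."""
    return bytes(b ^ key for b in data)

def mutate(body: bytes) -> tuple[int, bytes]:
    """Return a fresh random non-zero key and the body encoded under it."""
    key = secrets.randbelow(255) + 1  # 1..255, so the encoding is never the identity
    return key, xor_encode(body, key)

BODY = b"PLACEHOLDER-WORM-BODY"  # harmless stand-in, not real shellcode
key_a, variant_a = mutate(BODY)
key_b, variant_b = mutate(BODY)

# Both variants decode back to the same body, whatever keys were drawn.
assert xor_encode(variant_a, key_a) == xor_encode(variant_b, key_b) == BODY
```

Because the bytes of the encoded body change with every key, a fixed substring signature taken from one encoded variant does not match another variant encoded under a different key.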

Metamorphic Code - Examples
• Code reordering: instructions that are independent are re-ordered
  • Original: MOV EAX, [X]; MOV EBX, [Y]; ADD EAX, EBX; MOV [X], EAX
  • Reordered: MOV EBX, [Y]; MOV EAX, [X]; ADD EAX, EBX; MOV [X], EAX
• Garbage code insertion: instructions are inserted that are semantic no-ops (they do not affect the code and registers, and therefore the execution)
  • Original: MOV EAX, [X]; MOV EBX, [Y]; ADD EAX, EBX; MOV [X], EAX
  • With garbage: MOV EAX, [X]; MOV EBX, [Y]; ADD EAX, EBX; PUSH ESI; MOV [X], EAX; POP ESI
• Equivalent code replacement: register renaming, or semantically equivalent code
  • Original: MOV EAX, [X]; MOV EBX, [Y]; ADD EAX, EBX; MOV [X], EAX
  • Replaced: XOR EAX, EAX; ADD EAX, [X]; ADD EAX, [Y]; MOV [X], EAX
• Register reassignment: swaps the usage of the registers, which causes extensive "minor" changes in the code sequence
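
That the variants on this slide are semantically equivalent can be checked with a toy interpreter. The sketch below is an illustration under stated assumptions: it models only the five mnemonics used on the slide, treats [X] and [Y] as named memory cells, and compares only the final memory contents (register contents may differ between variants).

```python
def run(program, memory):
    """Interpret a tiny x86-like subset and return the final memory contents."""
    mem = dict(memory)
    regs = {"EAX": 0, "EBX": 0, "ESI": 0}
    stack = []

    def read(operand):
        return mem[operand[1:-1]] if operand.startswith("[") else regs[operand]

    def write(operand, value):
        value &= 0xFFFFFFFF  # 32-bit wrap-around
        if operand.startswith("["):
            mem[operand[1:-1]] = value
        else:
            regs[operand] = value

    for line in program:
        op, _, rest = line.partition(" ")
        args = [a.strip() for a in rest.split(",")]
        if op == "MOV":
            write(args[0], read(args[1]))
        elif op == "ADD":
            write(args[0], read(args[0]) + read(args[1]))
        elif op == "XOR":
            write(args[0], read(args[0]) ^ read(args[1]))
        elif op == "PUSH":
            stack.append(read(args[0]))
        elif op == "POP":
            write(args[0], stack.pop())
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return mem

ORIGINAL = ["MOV EAX, [X]", "MOV EBX, [Y]", "ADD EAX, EBX", "MOV [X], EAX"]
REORDERED = ["MOV EBX, [Y]", "MOV EAX, [X]", "ADD EAX, EBX", "MOV [X], EAX"]
GARBAGE = ["MOV EAX, [X]", "MOV EBX, [Y]", "ADD EAX, EBX",
           "PUSH ESI", "MOV [X], EAX", "POP ESI"]
REPLACED = ["XOR EAX, EAX", "ADD EAX, [X]", "ADD EAX, [Y]", "MOV [X], EAX"]

START = {"X": 7, "Y": 5}
RESULTS = [run(p, START) for p in (ORIGINAL, REORDERED, GARBAGE, REPLACED)]
```

All four byte-wise different instruction sequences leave the same value in X, which is why byte-pattern signatures over the code itself are fragile against metamorphism.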

Zero-Day Attack Detection: Research Trends
• Network-based
  • Prevalence model
    • Autograph / Polygraph
    • EarlyBird
  • Other types
    • PAYL
    • Packet Vaccine
  • Malicious code detection
    • SigFree
    • Polymorphic detection: network-level execution
    • Control flow graph
• Host-based
  • MINOS
  • DACODA

Prevalence Model – (1) • * Source [Singh04] [1] EarlyBird
• Key observation: define worm behavior
  • Content invariance: portions of a worm are invariant (e.g. the decryption routine)
  • Content prevalence: the invariant content appears frequently on the network
  • Address dispersion: the distribution of destination addresses is more uniform, so that the worm spreads fast
• Two consequences
  • Content prevalence: Rabin fingerprinting of 40-byte substrings with 1/60 sampling; the prevalence threshold is 3
  • Address dispersion: thresholds of 30 sources and 30 destinations
• Packet content examination can be evaded with simple polymorphism (Stefan Savage, UCSD *)
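
The interplay of content prevalence and address dispersion can be sketched as follows. This is a simplified illustration, not EarlyBird's implementation: it keys the tables on the raw 40-byte substrings instead of sampled Rabin fingerprints, keeps exact address sets rather than compact counters, and treats the thresholds as "greater than or equal", which is an assumption. The class and method names are hypothetical.

```python
from collections import defaultdict

SUBSTRING_LEN = 40          # substring length from the slide
PREVALENCE_THRESHOLD = 3    # content prevalence threshold from the slide
DISPERSION_THRESHOLD = 30   # distinct sources and destinations from the slide

class ContentTracker:
    def __init__(self):
        self.count = defaultdict(int)
        self.sources = defaultdict(set)
        self.destinations = defaultdict(set)

    def observe(self, src: str, dst: str, payload: bytes) -> set:
        """Record one packet; return the substrings that now cross all thresholds."""
        substrings = {payload[i:i + SUBSTRING_LEN]
                      for i in range(len(payload) - SUBSTRING_LEN + 1)}
        flagged = set()
        for s in substrings:
            self.count[s] += 1
            self.sources[s].add(src)
            self.destinations[s].add(dst)
            if (self.count[s] >= PREVALENCE_THRESHOLD
                    and len(self.sources[s]) >= DISPERSION_THRESHOLD
                    and len(self.destinations[s]) >= DISPERSION_THRESHOLD):
                flagged.add(s)
        return flagged
```

Content that repeats only between one pair of hosts never reaches the dispersion thresholds, while the same 40 bytes seen between 30 distinct sources and 30 distinct destinations is flagged.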

Prevalence Model – (2) • * Source [Kim 04] [2] Autograph
• Key observations
  • Targets TCP worms that propagate via scanning
  • A worm's payloads share a common substring
  • The vulnerability-exploit part is not easily mutable (not polymorphic)
• Step 1: select suspicious flows using heuristics
  • Flows from scanners are suspicious
• Step 2: generate a signature using content-prevalence analysis
  • All instances of a worm have a common byte pattern specific to the worm
• Content-based Payload Partitioning (COPP)
  • Partition where the Rabin fingerprint of a sliding window matches the breakmark
  • Configurable parameters: content block size (minimum, average, maximum), breakmark, sliding window
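
Content-based payload partitioning can be sketched in a few lines. The sketch below is an approximation under stated assumptions: a simple polynomial hash stands in for the Rabin fingerprint, a boundary is declared where the hash modulo the average block size equals the breakmark, and all default parameter values are illustrative rather than Autograph's.

```python
def fingerprint(window: bytes) -> int:
    """Stand-in polynomial hash; Autograph itself uses Rabin fingerprints."""
    h = 0
    for b in window:
        h = (h * 257 + b) % 1_000_003
    return h

def copp(payload: bytes, window: int = 4, avg_size: int = 16,
         min_size: int = 8, max_size: int = 64, breakmark: int = 7) -> list:
    """Split a payload into content blocks whose boundaries depend on content."""
    blocks, start = [], 0
    for end in range(1, len(payload) + 1):
        size = end - start
        if size < min_size:
            continue  # never cut a block shorter than the minimum
        hit = fingerprint(payload[max(0, end - window):end]) % avg_size == breakmark
        if hit or size >= max_size:
            blocks.append(payload[start:end])
            start = end
    if start < len(payload):
        blocks.append(payload[start:])  # trailing remainder, may be short
    return blocks
```

Because a boundary is decided by the bytes in the sliding window rather than by a fixed offset, inserting or deleting bytes early in a flow tends to shift the boundaries along with the content, so the blocks covering the shared worm substring can still line up across flows.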

Prevalence Model – (2) • * Source [Kim 04] [2] Autograph
[Figure: Autograph architecture]
• Step 1: select suspicious flows using heuristics
• Step 2: generate signatures using content-prevalence analysis
• A protocol through which multiple distributed Autograph monitors may share information

Prevalence Model – (3) • * Source [Newsome05] [3] Polygraph
• No one substring is specific enough
• BUT there are multiple substrings
  • Protocol framing
  • Value used to overwrite the return address
  • (Parts of poorly obfuscated code)
• Approach: combine the substrings (3-byte size)
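
Two simple ways to combine several short substrings into one signature are a conjunction (all tokens present) and an ordered token subsequence. The sketch below illustrates both matchers; the tokens and sample payloads are hypothetical, and the token-extraction step that Polygraph performs on a pool of suspicious flows is not shown.

```python
def matches_conjunction(payload: bytes, tokens) -> bool:
    """True if every token occurs somewhere in the payload, in any order."""
    return all(token in payload for token in tokens)

def matches_subsequence(payload: bytes, tokens) -> bool:
    """True if the tokens occur in the given order without overlapping."""
    position = 0
    for token in tokens:
        index = payload.find(token, position)
        if index < 0:
            return False
        position = index + len(token)
    return True

# Hypothetical tokens: protocol framing plus two bytes of a return address.
TOKENS = [b"GET ", b" HTTP/1.1\r\n", b"\xff\xbf"]

VARIANT = b"GET /\x41\x7f\x02 HTTP/1.1\r\nHost: x\r\n\x13\x37\xff\xbf"
BENIGN = b"GET /index.html HTTP/1.1\r\nHost: x\r\n"
REORDERED = b"\xff\xbfGET / HTTP/1.1\r\n"
```

The benign request contains the framing tokens but not the return-address bytes, so neither matcher fires; the reordered payload satisfies the conjunction but not the stricter ordered subsequence.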

Summary of Prevalence Model Detection Prevalence Model

Payload Anomaly [4] PAYL • * Source [Wang05]
• Compute a "normal profile" of a site's unique content flow, and use this information to detect anomalous data
• n-gram
  • An n-gram is a sequence of n adjacent byte values in a packet payload
  • A sliding window of width n is passed over the whole payload one byte at a time, and the frequency of each n-gram is computed
  • The frequency count distribution represents a model of the content flow ("statistical centroid")
• Compare the similarity between test data and the trained models
  • Mahalanobis distance
  • If the distance of a test datum is greater than the threshold, the system issues an alert
[Figure: character distributions of CodeRed II vs. a normal HTTP request]
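
The 1-gram case of this model fits in a short sketch. It assumes the simplified Mahalanobis distance, d = Σ |x_i − μ_i| / (σ_i + α), with a smoothing factor α; the value of α, the function names, and the sample requests are illustrative, and a real deployment would keep separate models per port and payload length.

```python
from statistics import mean, pstdev

def byte_distribution(payload: bytes) -> list:
    """Relative frequency of each of the 256 byte values (1-grams)."""
    counts = [0] * 256
    for b in payload:
        counts[b] += 1
    total = len(payload) or 1
    return [c / total for c in counts]

def train(payloads) -> tuple:
    """Build the centroid: per-byte mean and standard deviation of frequency."""
    distributions = [byte_distribution(p) for p in payloads]
    columns = list(zip(*distributions))
    return [mean(c) for c in columns], [pstdev(c) for c in columns]

def distance(payload: bytes, model, alpha: float = 0.001) -> float:
    """Simplified Mahalanobis distance between a payload and the centroid."""
    means, stdevs = model
    observed = byte_distribution(payload)
    return sum(abs(x - m) / (s + alpha)
               for x, m, s in zip(observed, means, stdevs))

MODEL = train([
    b"GET /index.html HTTP/1.1\r\n",
    b"GET /about.html HTTP/1.1\r\n",
    b"GET /news.html HTTP/1.1\r\n",
])
```

A request resembling the training data stays close to the centroid, while a payload padded with filler bytes and byte values never seen in training lands far away; an alert is raised once the distance exceeds a site-specific threshold.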

Jump Address Detection – (1) [5] Packet Vaccine • * Source [X. Wang, CCS2006]
• Vaccine generation
  • Detection of anomalous packet payloads, e.g. a byte sequence resembling a jump address
  • Randomization of the selected contents
• Exploit detection
  • Detects an exploit attempt: the vaccine should now trigger an exception in a vulnerable program
• Vulnerability diagnosis
  • Correlates the exception with the vaccine to acquire information about the exploit: the corrupted pointer content and its location in the exploit packet
• Signature generation
  • Creates variations of the original exploit to probe the vulnerable program, in an effort to identify the exploit conditions necessary for generating a signature

Jump Address Detection – (2) [5] Packet Vaccine • * Source [X. Wang, CCS2006]
• Vaccine generation
  • A key step in most exploits is to inject a jump address that redirects the control flow of a vulnerable program
  • Such an address points to
    • the stack or heap in a code-injection attack
    • a global library entry in an existing-code attack
• Approach
  • Check every 4-byte sequence (32-bit system) or 8-byte sequence (64-bit system)
  • Randomize those that fall in the address range of the potential jump targets in a protected program
  • The randomized packet should cause an exception: a segmentation fault (SEGV) or an illegal instruction fault (ILL)
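
The scan-and-randomize approach can be sketched for the 32-bit case. The sketch assumes little-endian addresses and a single hypothetical stack range; the range, the function names, and the choice to replace the whole 4-byte word with a random out-of-range value are assumptions for illustration, not details taken from the paper.

```python
import random
import struct

# Hypothetical address range of potential jump targets in the protected
# program (e.g. its stack); real values depend on the process being protected.
JUMP_RANGES = [(0xBF000000, 0xBFFFFFFF)]

def in_ranges(value: int, ranges) -> bool:
    return any(low <= value <= high for low, high in ranges)

def make_vaccine(payload: bytes, ranges=JUMP_RANGES, rng=None) -> bytes:
    """Return a copy of the payload with address-like 4-byte words scrambled."""
    rng = rng or random.Random()
    out = bytearray(payload)
    i = 0
    while i + 4 <= len(out):
        (value,) = struct.unpack_from("<I", out, i)  # little-endian 32-bit word
        if in_ranges(value, ranges):
            scrambled = rng.getrandbits(32)
            while in_ranges(scrambled, ranges):      # must land outside the range
                scrambled = rng.getrandbits(32)
            struct.pack_into("<I", out, i, scrambled)
            i += 4                                   # skip the word just replaced
        else:
            i += 1                                   # slide one byte at a time
    return bytes(out)
```

If the scrambled word really was the injected jump address, the vulnerable program jumps to a random location and is expected to raise SEGV or ILL instead of running the attack code; the offset of the scrambled word then points at the corrupted pointer inside the packet.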

Attack Scenario: BLASTER Worm
• DCOM object: insufficient bounds checking
• RPC Endpoint Mapper listens on ports 135, 139, 445, 593
[Figure: message sequence between attacker and victim]
1. Probe: connection scan attempt (TCP 135)
2. RPC DCOM request (TCP 135), causing a buffer overflow
3. Shellcode: binds port 4444 (victim listening on TCP 4444)
4. Start TFTP server
5. Remote command attempt
5.1 tftp <host> GET msblast.exe
6. TFTP download request (UDP 69)
7. Delivery of the main worm body, msblast.exe (TFTP: UDP 69)
7.1 start msblast.exe
8. SYN flooding of WindowsUpdate.com

Malicious Code Detection Approach – (1)
[Figure: services on Windows and Linux platforms that accept data only: Apache web server (port 80), Web service (port 80), BIND (port 53), Remote access services (ports 111, 137, 138, 139), SNMP (port 161), MS-SQL servers (port 1434), Workstation services (ports 139, 445), Mail (port 25), Database servers (ports 1521, 3306, 5432)]
• Assumptions
  • Buffer overflow attacks typically contain executables, whereas legitimate client requests never contain executables in most Internet services
  • Therefore, if a packet contains executables, it is treated as an attack

Malicious Code Detection • * Source [Wang06] [6] SigFree
• SigFree blocks attacks by detecting the presence of code
• Signature-free
  • Immune to most attack-side obfuscation methods
  • Generic code-data separation criteria
• Transparency
  • Negligible throughput degradation
  • Economical deployment with very low maintenance cost
• Scope
  • Buffer overflow attacks against web services (port 80)
  • Strictly speaking, it is not a buffer overflow (BOF) detection algorithm but an executable code detection algorithm
  • Application-level attacks such as data manipulation and SQL injection are out of scope
  • IA-32 (Intel)
  • Packet-based (no reassembly)
• Assumption: normal requests do not contain executable code

SigFree Overview • * Source [Wang06] [6] SigFree
• SigFree architecture
  • Scheme 1: exploits the OS characteristics of a program (faster)
  • Scheme 2: exploits the data flow characteristics of a program (more robust)
[Figure: SigFree architecture. Labels: "Extended instruction flow graph", "All Possible instruction"]

SigFree - Limitations • * Source [Wang06] [6] SigFree
• SigFree cannot fully handle branch-function-based obfuscation
• SigFree cannot detect shellcode written in alphanumeric form
• SigFree cannot detect malicious code that consists of fewer useful instructions than the current threshold of 15
• SigFree cannot detect encrypted executable code

Network-Level Execution – (1) • * Source [Michalls06] [7] Polymorphic Shellcode Detection
• Executes every potential instruction sequence, aiming to identify the execution behavior of polymorphic shellcode
• Compares each execution profile against the behavior observed to be inherent to polymorphic shellcode
[Figure: the input stream is scanned by byte shifting; each candidate position goes through disassembly, execution, and memory-read counting. Labels: Decryptor (start, end), Mem read loop, Invalid memory accesses & invalid instructions, "If over threshold, attack decision"]

Network-Level Execution – (2) • * Source [Michalls06] [7] Polymorphic Shellcode Detection
• Pattern 1: during decryption, the decryptor must read the encrypted payload in order to decrypt it
  • Criterion 1: the number of payload reads in an execution chain exceeds the Payload Read Threshold (PRT)
• Pattern 2: a mandatory operation of every polymorphic shellcode is to find its location in memory using some form of "Get PC" (%eip)
  • Criterion 2: the chain executes some form of "Get PC" (%eip) code
[Figure: an execution chain containing the "Get PC" code, and the execution chain for payload reads]
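
The two criteria can be expressed as a check over an execution trace produced by an emulator. The emulator itself is out of scope here; the sketch assumes a hypothetical trace format of (mnemonic, memory read address or None) tuples, an illustrative list of GetPC idioms, an illustrative default PRT, and that both criteria must hold for a chain to be flagged.

```python
# Mnemonics commonly used to obtain the program counter (illustrative list).
GETPC_MNEMONICS = {"call", "fstenv", "fnstenv"}

def is_polymorphic_shellcode(trace, buf_start: int, buf_end: int,
                             prt: int = 8) -> bool:
    """Apply both criteria to one execution chain.

    trace: iterable of (mnemonic, read_address_or_None) tuples (hypothetical
    emulator output). buf_start/buf_end delimit the inspected input in the
    emulator's virtual memory. prt is the Payload Read Threshold; the default
    is an arbitrary example value.
    """
    payload_reads = 0
    saw_getpc = False
    for mnemonic, read_address in trace:
        if mnemonic in GETPC_MNEMONICS:
            saw_getpc = True                      # criterion 2
        if read_address is not None and buf_start <= read_address < buf_end:
            payload_reads += 1                    # counts toward criterion 1
    return saw_getpc and payload_reads > prt

# Hypothetical chain: a GetPC call followed by a 20-iteration decryption loop.
DECRYPTOR_TRACE = [("call", None)] + [("xor", 0x1000 + i) for i in range(20)]
```

Random data that happens to decode to a few valid instructions rarely both executes a GetPC idiom and reads many bytes of its own buffer, which is what makes the combination a usable heuristic.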

PW Detection : CFG – (1) • * Source [Kruegel 05] [8] Control Flow Graph Extraction
• Perform a linear disassembly from the first byte of a stream to extract the machine instructions
• Remove invalid basic blocks (resulting from the disassembly of non-code byte streams)
• A block is invalid
  • if it contains one or more invalid instructions,
  • if it is on a path to an invalid block, or
  • if it ends in a control transfer instruction that jumps into the middle of another instruction

PW Detection : CFG – (2) [8] CFG Construction • * Source [Kruegel 05]
• CFG construction: linear disassembly of the byte stream
  • Nodes: describe a sequence of instructions without any jumps
  • Edges: jump instructions making the transition from one node to another
• Robustness to modification
  • Junk insertion, register renaming, code transposition, instruction substitution
• Uniqueness
  • Different executable regions should map to different fingerprints
• Example: MOV A 10; MOV B 10; ADD B; JMP BLOCK2; MOV A 15; MOV B 20; MUL B
• CFG of binary code: a cluster of closely connected nodes
• CFG of a random sequence: isolated nodes

PW Detection : CFG – (3) • * Source [Kruegel 05] [8] Graph Coloring
• Classify instructions into 14 sets
• A 14-bit colour value is associated with each node (1 bit corresponds to 1 class)
• When one or more instructions of a certain class appear in the basic block, the corresponding bit of the basic block's colour value is set to 1
• E.g.
  • MOV A, B: 00000000000010
  • MUL A, 10: 00000000000001
  • PUSH A: 00000000010000
  • Node colour: 00000000010011
• Append the 14-bit colour value to each node in the adjacency matrix of the subgraph
• Concatenate the rows as before to get the new fingerprint
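
The node colour is the bitwise OR of the class bits of the instructions in a basic block. The sketch below reproduces the slide's example; only the three bit positions shown on the slide are used, and mapping a mnemonic directly to a bit is a simplification of the 14 instruction classes, whose full membership is not given here.

```python
# Bit positions taken from the slide's example; the remaining classes of the
# 14-class scheme are omitted because the slide does not list them.
CLASS_BIT = {"MUL": 0, "MOV": 1, "PUSH": 4}

def node_colour(basic_block) -> int:
    """OR together the class bit of every instruction in the basic block."""
    colour = 0
    for instruction in basic_block:
        mnemonic = instruction.split()[0].upper()
        colour |= 1 << CLASS_BIT[mnemonic]
    return colour

BLOCK = ["MOV A, B", "MUL A, 10", "PUSH A"]
COLOUR_BITS = format(node_colour(BLOCK), "014b")
```

Because a bit only records whether a class occurs, repeating or reordering instructions of the same class inside a block leaves the colour unchanged, which is the robustness property the fingerprint relies on.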

Classification of Malicious Code Detection: Static vs. Dynamic Analysis
• Static analysis: structure of executables
  • Analysis without execution
  • Disassembling -> string analysis
  • Frequency analysis
  • The structure of a program is described by its control flow graph (CFG)
  • Cannot be used to detect novel malware instances
  • Used to recognize obfuscated variants of the same code instance
• Dynamic analysis: behavior of executables
  • The file is executed in a safe environment
  • VMware, sandbox
  • The behavior of a program is the observable effect that it has on its environment
  • RegMon, FileMon, syscall monitoring
  • Considers the behavior of a whole class of malware

Payload Analysis
• Payload types: executables, exploits, known files, data with unknown file type
[Figure: analysis flow. Labels: Known Filetype, Unknown Filetype, Static Analysis, Crypto Analysis, Dynamic Analysis, Find out "responsible" Program (Loader)]

Host-based Detection – (1) [9] MINOS • * Source [Crandall04]
• Tagged architecture that tracks the integrity of every memory word
  • Network data is tainted
  • Control data (return pointers, function pointers, jump targets, etc.) should not be tainted
  • Taint tracking with every instruction
• Great for catching worms
  • Uses the γ mapping
• Implemented as a full-system tagging scheme in a virtual machine
  • Linux (modified kernel)
    • Tracks integrity in the file system
    • Virtual memory swapping
  • Windows (unmodified)
• Works great as a honeypot for catching worms
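
The core rule, that data arriving from the network carries a taint tag which propagates through computation and must never be used as control data, can be sketched as below. This is a generic taint-tracking illustration, not Minos's actual tagging policy or hardware design; all names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Word:
    """A memory word together with its one-bit integrity tag."""
    value: int
    tainted: bool = False

class TaintedControlTransfer(Exception):
    """Raised when tainted data is about to be used as a jump target."""

def from_network(value: int) -> Word:
    return Word(value, tainted=True)    # everything off the wire is tainted

def add(a: Word, b: Word) -> Word:
    # The result is tainted if either operand is tainted.
    return Word((a.value + b.value) & 0xFFFFFFFF, a.tainted or b.tainted)

def jump(target: Word) -> int:
    """Model a control transfer: refuse targets derived from network data."""
    if target.tainted:
        raise TaintedControlTransfer(hex(target.value))
    return target.value

SAVED_RETURN = Word(0x08048000)                   # written by the program itself
OVERWRITTEN = add(from_network(0xBFFFF000), Word(0x10))  # derived from a packet
```

A return through SAVED_RETURN proceeds normally, while a return through OVERWRITTEN raises the exception regardless of which address the attacker chose, which is why the check catches previously unknown exploits.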

Host-based Detection – (2) [10] DACODA • * Source [Crandall05] • DAvis malCODe Analyzer • Discover invariants in the exploit vector (ε) • Symbolic execution on the system trace during attacks that Minos catches • Used for an empirical analysis of polymorphism and metamorphism • Quantify and understand the limits

Bibliography
• [Singh04] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated worm fingerprinting. In OSDI, 2004.
• [Kim 04] H.-A. Kim and B. Karp. Autograph: Toward automated, distributed worm signature detection. In USENIX Security Symposium, pages 271-286, 2004.
• [Newsome05] J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for polymorphic worms. In Proceedings of the IEEE Symposium on Security and Privacy, May 2005.
• [Wang05] K. Wang, G. Cretu, and S. J. Stolfo. Anomalous payload-based worm detection and signature generation. In Proceedings of the 14th USENIX Security Symposium, Baltimore, MD, USA, July 31 – August 5, 2005.
• [X. Wang 06] XiaoFeng Wang, Zhuowei Li, Jun Xu, Michael K. Reiter, Chongkyung Kil, and Jong Youl Choi. Packet Vaccine: Black-box exploit detection and signature generation. In CCS, 2006.
• [Wang06] SigFree: A signature-free buffer overflow attack blocker. In USENIX Security Symposium, 2006.
• [Michalls06] Michalis Polychronakis, Kostas G. Anagnostakis, and Evangelos P. Markatos. Network-level polymorphic shellcode detection using emulation. In DIMVA, 2006.
• [Kruegel05] C. Kruegel, E. Kirda, D. Mutz, W. Robertson, and G. Vigna. Polymorphic worm detection using structural information of executables. In Proceedings of the 8th International Symposium on Recent Advances in Intrusion Detection (RAID), September 2005.
• [Crandall 04] Jedidiah R. Crandall and Frederic T. Chong. Minos: Control data attack prevention orthogonal to memory model. In IEEE/ACM International Symposium on Microarchitecture, pages 221-232, IEEE Computer Society, 2004.
• [Crandall 05] J. R. Crandall, Z. Su, S. F. Wu, and F. T. Chong. On deriving unknown vulnerabilities from zero-day polymorphic and metamorphic worm exploits. In ACM CCS, pages 235–248, November 2005.
