Learn how Bouncer filters bad input to prevent software vulnerabilities and exploits. Explore its architecture, symbolic execution, and path conditions for effective protection.
Bouncer: securing software by blocking bad input • Miguel Castro • Manuel Costa, Lidong Zhou, Lintao Zhang, and Marcus Peinado • Microsoft Research
Software is vulnerable • bugs are vulnerabilities • attackers can exploit vulnerabilities • to crash programs • to gain control over the execution • vulnerabilities are routinely exploited • we keep finding new vulnerabilities
How we secure software today • static analysis to remove vulnerabilities • checks to prevent exploits • type-safe languages • unsafe languages with instrumentation • but what do you do when a check fails? • (usually) no code to recover from failure • restarting may be the only option • bad because programs are left vulnerable to • loss of data and low-cost denial of service attacks
Blocking bad input • Bouncer filters check input when it is received • filters block bad input before it is processed • example: drop TCP connection with bad message • input is bad if it can exploit a vulnerability • most programs can deal with input errors • programs keep working under attack • correctly because filters have no false positives • efficiently because filters have low overhead
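A minimal sketch of that idea in C, with hypothetical names (on_receive, bouncer_filter, handle_message) and a placeholder filter body; it only illustrates where a generated filter would sit at the input boundary, not Bouncer's actual API:

#include <stdbool.h>
#include <stddef.h>

/* generated by Bouncer for one vulnerability; returns true for bad input
 * (placeholder body here: a real filter encodes the exploit conditions) */
static bool bouncer_filter(const unsigned char *msg, size_t len) {
    (void)msg; (void)len;
    return false;
}

/* the program's existing message handler (stub for this sketch) */
static void handle_message(const unsigned char *msg, size_t len) {
    (void)msg; (void)len;
}

void on_receive(const unsigned char *msg, size_t len) {
    if (bouncer_filter(msg, len))
        return;                  /* bad input: drop it, e.g. close the TCP connection */
    handle_message(msg, len);    /* good input is processed exactly as before */
}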
Outline • architecture • symbolic execution • symbolic summaries • precondition slicing • evaluation
Bouncer architecture (diagram)
• attacks reach a program instrumented to detect attacks & log inputs, which captures a sample exploit
• the sample is replayed on a program instrumented to detect attacks & generate a trace, producing trace conditions
• Bouncer generates filter conditions for the sample and combines sample conditions into a filter
• generation of alternative exploits produces new exploits that feed back into the loop
Example
• vulnerable code:
char buffer[1024];
char p0 = 'A';
char p1 = 0;
if (msg[0] > 0) p0 = msg[0];
if (msg[1] > 0) p1 = msg[1];
if (msg[2] == 0x1) {
  sprintf(buffer, "\\servers\\%s\\%c", msg+3, p0);
  StartServer(buffer, p1);
}
• sample exploit: 97 97 97 1 1 1 97 0
Symbolic execution • analyze trace to compute path conditions • execution with any input that satisfies path conditions follows the path in the trace • inputs that satisfy path conditions are exploits • execution follows same path as with sample exploit • use path conditions as initial filter • no false positives: only block potential exploits
Computing the path conditions • start with symbolic values for input bytes: b0,… • perform symbolic execution along the trace • keep symbolic state for memory and registers • add conditions on symbolic state for: • branches: ensure same outcome • indirect control transfers: ensure same target • load/store to symbolic address: ensure same target
Example symbolic execution
• trace:
mov eax, msg
movsx eax, [eax]
cmp eax, 0
jle L
mov p0, al
L: mov eax, msg
movsx eax, [eax+1]
cmp eax, 0
jle M
mov p1, al
M: …
• symbolic state: *msg holds b0,b1,…; after the first block p0 holds b0; eax is first (movsx b0), then (movsx b1); eflags is first (cmp (movsx b0) 0), then (cmp (movsx b1) 0)
• path conditions: the untaken jle branches add (jg (cmp (movsx b0) 0)) [b0 > 0] and (jg (cmp (movsx b1) 0)) [b1 > 0]
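The bookkeeping in this example fits in a few lines of C. A minimal sketch, assuming symbolic expressions are kept as plain strings and using hypothetical names (path_add, the hard-coded trace steps); the real analysis of course works on the instruction-level trace:

#include <stdio.h>
#include <string.h>

static char eax[64], eflags[64];            /* symbolic register state        */
static char path[1024] = "";                /* conjunction of path conditions */

static void path_add(const char *cond) {    /* append one path condition      */
    if (path[0]) strcat(path, " and ");
    strcat(path, cond);
}

int main(void) {
    char cond[128];

    /* mov eax, msg; movsx eax, [eax]  ->  eax holds (movsx b0)                    */
    strcpy(eax, "(movsx b0)");
    /* cmp eax, 0                      ->  eflags hold (cmp (movsx b0) 0)          */
    snprintf(eflags, sizeof eflags, "(cmp %s 0)", eax);
    /* jle L not taken                 ->  add (jg (cmp (movsx b0) 0)), i.e. b0 > 0 */
    snprintf(cond, sizeof cond, "(jg %s)", eflags);
    path_add(cond);

    /* movsx eax, [eax+1]; cmp eax, 0; jle M not taken  ->  b1 > 0                 */
    strcpy(eax, "(movsx b1)");
    snprintf(eflags, sizeof eflags, "(cmp %s 0)", eax);
    snprintf(cond, sizeof cond, "(jg %s)", eflags);
    path_add(cond);

    /* prints the two (jg ...) conditions, i.e. b0 > 0 and b1 > 0 */
    printf("path conditions: %s\n", path);
    return 0;
}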
Properties of path conditions • path conditions can filter with no false positives b0 > 0 ∧ b1 > 0 ∧ b2 = 1 ∧ b1503 = 0 ∧ bi ≠ 0 for all 2 < i < 1503 • they catch many exploit variants [Vigilante] • but they usually have false negatives: • fail to block exploits that follow a different path • example: they will not block exploits with b0 ≤ 0 • attacker can craft exploits that are not blocked • we generalize filters to block more attacks
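These path conditions translate directly into a check over the raw message bytes. A minimal sketch of such an initial filter (hypothetical function name, not code generated by Bouncer), assuming the bytes are compared as signed chars as in the vulnerable code:

#include <stdbool.h>
#include <stddef.h>

/* returns true iff the input satisfies the path conditions above */
static bool initial_filter_blocks(const unsigned char *msg, size_t len) {
    if (len < 1504)
        return false;                        /* conditions refer to bytes up to b1503 */
    if (!((signed char)msg[0] > 0 &&         /* b0 > 0 */
          (signed char)msg[1] > 0 &&         /* b1 > 0 */
          msg[2] == 0x1 &&                   /* b2 = 1 */
          msg[1503] == 0))                   /* b1503 = 0 */
        return false;
    for (size_t i = 3; i < 1503; i++)        /* bi != 0 for all 2 < i < 1503 */
        if (msg[i] == 0)
            return false;
    return true;                             /* input would follow the exploit path: drop it */
}

Any input for which this check returns true drives the program down the same path as the sample exploit, which is why the filter has no false positives.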
Symbolic summaries • symbolic execution in library functions • adds many conditions • little information to guide analysis to remove them • symbolic summaries • use knowledge of library function semantics • replace conditions added during library calls • are generated automatically from a template • template is written once for each function • summary is computed by analyzing the trace
Example symbolic summary • the vulnerability is in the call • sprintf(buffer, "\\servers\\%s\\%c", msg+3, p0) • the symbolic summary is • bi ≠ 0 for all 2 < i < 1016 • it holds exactly when the formatted string does not fit in buffer • it is expressed as a condition on the input • summary is computed by • using concrete and symbolic argument values • traversing the trace backwards to find the size of buffer
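A quick sanity check of the bound, assuming the format string expands to "\servers\" (9 characters) followed by the %s argument, a '\', the %c character, and the terminating NUL: the result fits in the 1024-byte buffer only if the %s argument is at most 1024 - 9 - 1 - 1 - 1 = 1012 bytes, i.e. only if some byte among msg[3] … msg[1015] is zero. A minimal sketch of the summary as an executable check (hypothetical name, not the code Bouncer generates):

#include <stdbool.h>
#include <stddef.h>

/* true iff the input satisfies the summary condition bi != 0 for 2 < i < 1016 */
static bool sprintf_summary_holds(const unsigned char *msg, size_t len) {
    if (len < 1016)
        return false;            /* the condition mentions bytes up to b1015 */
    for (size_t i = 3; i < 1016; i++)
        if (msg[i] == 0)
            return false;        /* %s argument terminates in time: formatted string fits */
    return true;                 /* no terminator among msg[3..1015]: sprintf overflows buffer */
}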
Pre-condition slicing • analyze code and trace to compute a path slice: • slice is a subsequence of trace instructions whose execution is sufficient to exploit the vulnerability • generalize filter using the slice • keep conditions added by instructions in slice • discard the other conditions • reduces false negative rate • does not introduce false positives
Computing the slice • add instruction with the vulnerability to slice • traverse trace backwards • track dependencies for instructions in slice • add instructions to slice when • branches: • path from branch may not visit last slice instruction • path to last slice instruction can change dependencies • other instructions: can change dependencies • combination of static and dynamic analysis
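A highly simplified sketch of this backward traversal; the types and helpers (insn, is_branch, branch_must_be_kept, may_change_dependencies, add_to_slice) are hypothetical and stand in for the combined static and dynamic analysis, so this is only the control skeleton, not Bouncer's implementation:

#include <stdbool.h>

typedef struct insn insn;                             /* one executed trace entry */

extern bool is_branch(const insn *i);
extern bool may_change_dependencies(const insn *i);   /* writes a tracked location */
extern bool branch_must_be_kept(const insn *i);       /* some path from the branch misses the
                                                         last slice instruction, or a path to
                                                         it can change the dependencies */
extern void add_to_slice(const insn *i);              /* records i and adds the locations it
                                                         reads to the tracked dependencies */

void compute_slice(const insn *const *trace, int vuln_index) {
    add_to_slice(trace[vuln_index]);                  /* start at the vulnerable instruction */
    for (int k = vuln_index - 1; k >= 0; k--) {       /* walk the trace backwards */
        const insn *i = trace[k];
        bool keep = is_branch(i) ? branch_must_be_kept(i)
                                 : may_change_dependencies(i);
        if (keep)
            add_to_slice(i);      /* only conditions added by slice instructions are kept */
    }
}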
Example
char buffer[1024];
char p0 = 'A';
char p1 = 0;
if (msg[0] > 0) p0 = msg[0];
if (msg[1] > 0) p1 = msg[1];
if (msg[2] == 0x1) {
  sprintf(buffer, "\\servers\\%s\\%c", msg+3, p0);
  StartServer(buffer, p1);
}
Slicing example
• trace (instruction labels 1–e):
1 mov eax, msg
2 movsx eax, [eax+1]
3 cmp eax, 0
4 jle 6
5 mov p1, al
6 mov eax, msg
7 movsx eax, [eax+2]
8 cmp eax, 1
9 jne N
a movsx eax, p0
b mov ecx, msg
c add ecx, 3
d push eax   # call sprintf
e push ecx
• dependencies tracked during the backward traversal: msg, msg[2], msg[3], msg[4], msg[5], …, eax, ecx, eflags
• slice (built backwards from the sprintf call): …, e, d, c, b, 9, 8, 7, 6
Example filter after each phase • after symbolic execution b0 > 0 ∧ b1 > 0 ∧ b2 = 1 ∧ b1503 = 0 ∧ bi ≠ 0 for all 2 < i < 1503 • after symbolic summary b0 > 0 ∧ b1 > 0 ∧ b2 = 1 ∧ bi ≠ 0 for all 2 < i < 1016 • after slicing b2 = 1 ∧ bi ≠ 0 for all 2 < i < 1016 • the last filter is optimal
Deployment scenarios • distributed scenario • instrument production code to detect exploits • run Bouncer locally on each exploit we detect • deploy improved filter after processing an exploit • centralized scenario • software vendor runs cluster to compute filters • vendor receives sample exploits from customers • run Bouncer iterations in parallel in the cluster
Evaluation • implemented Bouncer prototype • detected memory corruption attacks with DFI • generated traces with Nirvana • used Phoenix to implement slicing • evaluated Bouncer with four real vulnerabilities • SQL server, ghttpd, nullhttpd, and stunnel • started from sample exploit described in literature • ran iterations with search for alternative exploits • single machine; max experiment duration: 24h
Filter accuracy • Bouncer filters have no false positives • perfect filters for two vulnerabilities
Throughput with filters • [chart of service throughput with Bouncer filters; recoverable axis label: 50 Mbits/sec]
Conclusion • filters block bad input before it is processed • Bouncer filters have • low overhead • no false positives • no false negatives for some vulnerabilities • programs keep running under attack • a lot left to do