Automatic for the people: Reducing inadvertent leaks by personal machines

Landon Cox

Duke University

Inadvertent leaks
  • Usability and privacy: A Study of Kazaa ...
    • Good and Krekelberg, CHI, 2003
    • In 12 hours, found 150 inboxes on Kazaa
    • Observed people downloading dummy inbox
  • Problem hasn’t gone away
Technical solution?
  • Servers: Asbestos, HiStar, Flume
  • Languages: Jif, Laminar, Resin
  • Desktop: PrivacyScope, TightLip
  • [Diagram labels: processes, files, network, and IPC around a reference monitor; policy from user, admin, dev, or automation]

Automatic policy specification
  • State of the art: pattern matching
    • Look for strings that look like SSNs, CCs, etc.
    • find_SSNs, Firefly, SENF, Spider, etc.
    • A bit brittle and error-prone
    • High false positive/negative rates
  • Let’s take a different approach
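
To illustrate why pattern matching is brittle, here is a minimal sketch of an SSN-shaped scanner. The regular expression and the sample strings are illustrative assumptions; they are not taken from find_SSNs or any other tool named on the slide.

```python
import re

# Hypothetical pattern: three digits, two digits, four digits, dash-separated.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def looks_sensitive(text: str) -> bool:
    """Return True if the text contains a string shaped like an SSN."""
    return SSN_PATTERN.search(text) is not None


# False positive: a part number that happens to share the SSN shape.
print(looks_sensitive("Order part 123-45-6789 by Friday"))
# False negative: the same nine digits written without dashes.
print(looks_sensitive("SSN 123456789"))
```

Loosening the pattern to catch the second string would match any nine-digit number, which trades the false negative for more false positives.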
Key observations

1) Personal machines often cache sensitive data

2) Servers force clients to access files using crypto

3) Crypto is a general technique, used across administrative domains and applications

RedFlag overview
  • Identifies processes that store decrypted data
    • Unobtrusive (requires no user input)
    • Compatible with legacy applications
    • Compatible with existing Internet protocols
  • High-level insights
    • Stop trying to figure out what sensitive data looks like
    • Use heuristics of how sensitive data is handled
Caveats
  • We cannot stop all inadvertent leaks
    • But can stop a large, important class of leaks
  • Trust and threat model
    • Uncompromised host
    • No IP spoofing or DNS hijacking
    • Correct, trusted reference monitor (take your pick)
    • Buggy/absent access-control policies
RedFlag system overview
  • Monitor sockets
  • Inspect process
  • Compose rules

Monitoring sockets
  • Goal
    • Try to identify incoming encrypted data
    • Only at application level (e.g., SSL)
  • Easy for most widely used apps
    • Look at remote port (e.g., 443 or 993)
  • Not always sufficient
    • Non-standard ports: Skype, Groove, Groupwise
    • XMPP sends both SSL and non-SSL data to the same port (5222/TCP)
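
The port check above can be sketched as a simple lookup. This is a minimal sketch, not RedFlag's code: the function name is hypothetical, and treating every port outside the well-known set as "ambiguous" is an assumption made for illustration.

```python
# Remote ports named on the slide as carrying application-level SSL.
ENCRYPTED_PORTS = {443, 993}


def classify_remote_port(port: int) -> str:
    """Classify a connection by its remote port alone.

    Assumption: any port not known to be SSL-only is "ambiguous" and
    needs a second look (for example 5222, which XMPP uses for both
    SSL and non-SSL traffic).
    """
    if port in ENCRYPTED_PORTS:
        return "encrypted"
    return "ambiguous"


for port in (443, 993, 5222):
    print(port, classify_remote_port(port))
```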
Information entropy
  • Compute entropy score for ambiguous ports
    • Negligible performance overhead
    • If score above threshold (~7.9 bits/byte), invoke inspection process
  • Can induce false positives
    • Compressed data sent in the clear (e.g., MP3s)
    • On-the-fly compression schemes (e.g., HTTP Content-Encoding: gzip)
  • Luckily, doesn’t need to be 100% accurate
    • Really just a performance optimization to save work
    • Only used as a first-pass filter
    • Correct any mistakes in inspection phase
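
The entropy score can be sketched as Shannon entropy over byte frequencies, H = -Σ p_i log2(p_i), measured in bits per byte. The slide gives only the approximate threshold; the function names and the choice to score a whole buffer at once are assumptions.

```python
import math
from collections import Counter

ENTROPY_THRESHOLD = 7.9  # bits per byte, approximate value from the slide


def entropy_bits_per_byte(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution in data."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def should_inspect(data: bytes) -> bool:
    """First-pass filter: flag buffers whose bytes look uniformly random."""
    return entropy_bits_per_byte(data) > ENTROPY_THRESHOLD


uniform = bytes(range(256)) * 16        # every byte value equally frequent
text = b"GET /index.html HTTP/1.1\r\n" * 100
print(entropy_bits_per_byte(uniform), should_inspect(uniform))
print(entropy_bits_per_byte(text), should_inspect(text))
```

A sample of n bytes can score at most log2(n) bits per byte, so a threshold near 7.9 only makes sense for buffers well over 256 bytes.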
Inspect process
  • Goals of inspection
    • Infer when file write depends on network read
    • Determine whether file write is decrypted data
  • Use taint-tracking
    • Too slow to perform in critical path of desktop apps
    • Perform asynchronously via deterministic replay
    • Fork if network monitor flags process (port or entropy)
    • Log libc calls in original, use log in replay process
    • Attach taint-tracker to replayed process (e.g., PIN)
    • Perform analysis on a free core in the background
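
The log-and-replay idea can be sketched with a toy example: the original run records the result of each nondeterministic call, and the replayed run reads those results back from the log instead of touching the network. The classes below are hypothetical stand-ins and do not reflect how Jockey logs libc calls.

```python
import io
from collections import deque


class Recorder:
    """Wraps a read function and logs every result (the original process)."""

    def __init__(self, read_fn):
        self._read_fn = read_fn
        self.log = []

    def read(self, n: int) -> bytes:
        data = self._read_fn(n)
        self.log.append(data)
        return data


class Replayer:
    """Serves logged results in order (the replayed process)."""

    def __init__(self, log):
        self._log = deque(log)

    def read(self, n: int) -> bytes:
        # n is ignored: the log already holds what the original call returned.
        return self._log.popleft()


def handle(conn) -> bytes:
    """Deterministic application logic: length-prefixed message, upper-cased."""
    header = conn.read(4)
    body = conn.read(int.from_bytes(header, "big"))
    return body.upper()


network = io.BytesIO(b"\x00\x00\x00\x05hello")  # stand-in for a socket
recorder = Recorder(network.read)
original = handle(recorder)
replayed = handle(Replayer(recorder.log))  # could run later, on another core
print(original, replayed)
```

Because the replayed run sees the same inputs, a slow analysis such as taint tracking can be attached to it without delaying the original process.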
Taint tracking
  • Implement with PIN
    • Rewrite instructions to propagate taint
    • Record taint in shadow memory
  • Key questions
    • What are the taint sources?
    • What info to send to the policy composer?
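
Shadow memory can be sketched as a second byte array that mirrors the data and holds one taint label per byte. This is a simplified model under stated assumptions: the class and method names are hypothetical, propagation is shown only for a byte copy, and a real Pin tool would instead instrument individual machine instructions.

```python
class TaintedMemory:
    """Toy address space with one shadow label byte per data byte."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.shadow = bytearray(size)   # 0 means untainted
        self.sources = [None]           # label i -> source description

    def register_source(self, description: str) -> int:
        """Allocate a label for a taint source (e.g., a remote endpoint)."""
        self.sources.append(description)
        return len(self.sources) - 1

    def network_read(self, addr: int, payload: bytes, label: int) -> None:
        """Taint source: store incoming bytes and mark them with label."""
        end = addr + len(payload)
        self.data[addr:end] = payload
        self.shadow[addr:end] = bytes([label]) * len(payload)

    def copy(self, dst: int, src: int, n: int) -> None:
        """Propagation: copying data copies its labels too."""
        self.data[dst:dst + n] = self.data[src:src + n]
        self.shadow[dst:dst + n] = self.shadow[src:src + n]

    def sources_of(self, addr: int, n: int) -> set:
        """Which sources does a buffer (e.g., about to be written) depend on?"""
        return {self.sources[l] for l in self.shadow[addr:addr + n] if l}


mem = TaintedMemory(64)
label = mem.register_source("74.125.45.83:443")
mem.network_read(0, b"<!DOCTYPE", label)
mem.copy(32, 0, 9)
print(mem.sources_of(32, 9))
```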
  • [Figure: address space (“<!DOCTYPE html PUBLIC ...”) and shadow memory; a one-byte taint label (0 0 0 0 0 1) points to the description “/tmp/attach.pdf, 74.125.45.83:443”]
  • Fine when there is no ambiguity about the source
  • But what about ambiguous ports?

Ambiguous ports
  • Search process memory for AES s-boxes
    • S-boxes are set by algorithm designer
    • S-boxes are unlikely to appear randomly
    • (also look for well-known transformations)
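
An s-box search can be sketched as a byte-signature scan. The 16 bytes below are the first row of the AES S-box from FIPS 197; using only that prefix as the signature, and scanning a flat byte string rather than a live process image, are simplifying assumptions.

```python
from typing import List

# First 16 entries of the AES S-box (FIPS 197).
AES_SBOX_PREFIX = bytes([
    0x63, 0x7C, 0x77, 0x7B, 0xF2, 0x6B, 0x6F, 0xC5,
    0x30, 0x01, 0x67, 0x2B, 0xFE, 0xD7, 0xAB, 0x76,
])


def find_sbox_offsets(image: bytes) -> List[int]:
    """Return every offset at which the S-box prefix occurs in image."""
    offsets = []
    start = image.find(AES_SBOX_PREFIX)
    while start != -1:
        offsets.append(start)
        start = image.find(AES_SBOX_PREFIX, start + 1)
    return offsets


# Stand-in for a library data section with a table embedded at offset 10.
image = b"\x00" * 10 + AES_SBOX_PREFIX + b"\xff" * 5
print(find_sbox_offsets(image))
```

Implementations that store only precomputed lookup tables derived from the S-box would not match this signature, which is presumably why the slide also mentions well-known transformations.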
Ambiguous ports
  • If we find s-boxes in a library data section
    • Assume image is a crypto library
    • Vast majority of crypto libraries include AES implementation
  • Instrument lib to set “crypto bit” of inbound taint labels
    • If crypto bit == 1, network data was “routed” through crypto lib
    • If crypto bit == 0, assume network data was not decrypted
  • Also use s-boxes as taint source
    • Data derived from s-boxes have “AES bit” set
    • Can use to gauge strength of crypto algorithm
  • [Figure: one-byte taint label with an ID index (0 0 0 0 0 1), an AES bit (1), and a crypto bit (1)]
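
The one-byte taint label can be sketched as a packed bit field. The exact layout is an assumption inferred from the figure: six high bits for the ID index, then the AES bit, then the crypto bit in the lowest position. The function names are hypothetical.

```python
from typing import Tuple

CRYPTO_BIT = 0x01   # assumed position: bit 0
AES_BIT = 0x02      # assumed position: bit 1
ID_SHIFT = 2        # assumed: ID index occupies the upper six bits
ID_MAX = 0x3F


def pack_label(source_id: int, aes: bool, crypto: bool) -> int:
    """Pack a source ID and the two flag bits into a single byte."""
    if not 0 <= source_id <= ID_MAX:
        raise ValueError("source_id must fit in six bits")
    label = source_id << ID_SHIFT
    if aes:
        label |= AES_BIT
    if crypto:
        label |= CRYPTO_BIT
    return label


def unpack_label(label: int) -> Tuple[int, bool, bool]:
    """Return (source_id, aes_bit, crypto_bit) for a label byte."""
    return label >> ID_SHIFT, bool(label & AES_BIT), bool(label & CRYPTO_BIT)


label = pack_label(1, aes=True, crypto=True)
print(format(label, "08b"), unpack_label(label))
```

With six bits for the ID index, this layout can distinguish at most 63 sources plus the untainted value 0.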

Compose rules
  • Taint-tracking gives three pieces of info
    • Description of network source
    • If data was routed through crypto library
    • If data was derived from AES s-box
  • Can use this to compose policies
Compose rules
  • Same source
    • Allow sensitive files to be copied back to their source
    • Raise alert otherwise
    • Generalize hostnames (e.g., *.google.com)
  • Obfuscation vs. confidentiality
    • Many P2P clients use crypto to obfuscate
    • Aren’t trying to protect data so use weak algorithms
    • (e.g., BitTorrent and LimeWire explicitly do not support AES)
    • If ambiguous port + no AES, then ignore file
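
The rules above can be sketched as a small decision function. This is one reading of the slide, not RedFlag's policy engine: the record fields, the function names, and the hostname generalization (keep the last two DNS labels) are assumptions, and the generalization does not handle IP addresses or suffixes such as co.uk.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class FileRecord:
    path: str
    source_host: str      # DNS name the file was downloaded from
    ambiguous_port: bool  # True if the port alone did not imply SSL
    crypto_bit: bool      # data passed through a crypto library
    aes_bit: bool         # data derived from an AES s-box


def generalize(host: str) -> str:
    """Turn mail.google.com into *.google.com (naive two-label rule)."""
    parts = host.split(".")
    return "*." + ".".join(parts[-2:]) if len(parts) > 2 else host


def is_sensitive(rec: FileRecord) -> bool:
    """Ambiguous port without crypto or without AES: ignore the file."""
    if rec.ambiguous_port:
        return rec.crypto_bit and rec.aes_bit
    return True


def check_transfer(rec: FileRecord, dest_host: str) -> str:
    """Allow copies back to the (generalized) source, alert otherwise."""
    if not is_sensitive(rec):
        return "allow"
    if fnmatch(dest_host, generalize(rec.source_host)):
        return "allow"
    return "alert"


doc = FileRecord("/tmp/attach.pdf", "mail.google.com", False, True, True)
print(check_transfer(doc, "docs.google.com"))
print(check_transfer(doc, "upload.example.org"))
```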
RedFlag implementation
  • Runs on Ubuntu 8.10
  • Modified Jockey for logging/replay
    • Supports multi-threaded programs
    • User-level thread library
  • PIN tool for tainting
    • Based on sequential taint tracker from Speck
    • Modified to allow tainting during replay
    • Implemented s-box search, crypto and AES bits in taint label
Evaluation
  • Accuracy
    • How well can RedFlag identify crypto libraries using s-boxes?
    • How well does RedFlag categorize sensitive files?
  • Performance
    • Will asynchronous taint-tracking fall behind?
Identifying crypto libraries
  • Looked at 10 Ubuntu programs
    • Email: checkgmail, Thunderbird
    • IM: Pidgin
    • P2P: Azureus, LimeWire, Skype, Transmission
    • Web: Firefox, Opera, wget
  • Successfully identified crypto libs in all
    • Including custom implementations, plugins (Flash Player)
    • Interesting case: Opera folds crypto into executable
Categorizing sensitive files
  • Non-sensitive files
    • Used Firefox
    • Loaded 30 most popular websites (Alexa)
    • RedFlag produced no false positives/negatives
  • Sensitive files
    • Downloaded 17 representative sensitive docs
    • Firefox, Thunderbird, Pidgin
Conclusions
  • RedFlag automates policy specification
    • Heuristic-based approach
    • Monitor process behavior, not file content
    • Sensitive files usually downloaded using crypto
    • Deal with ambiguous ports using entropy scores, AES s-boxes
  • Evaluation highlights
    • Automatically identified crypto libraries
    • Correctly categorized files in 45/47 scenarios
    • No false positives, three false negatives
    • Sufficient idle time in long-running process