


UCSD Potemkin Honeyfarm

Jay Chen, Ranjit Jhala, Chris Kanich,

Erin Kenneally, Justin Ma, David Moore, Stefan Savage,

Colleen Shannon, Alex Snoeren, Amin Vahdat, Erik Vandekieft,

George Varghese, Geoff Voelker, Michael Vrable



Network Telescopes

  • Infected host scans for other vulnerable hosts by randomly generating IP addresses

  • Network Telescope: monitor large range of unused IP addresses – will receive scans from infected host

  • Very scalable. UCSD monitors 17M+ addresses (/8 + /16s)
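The telescope's leverage is easy to quantify: each randomly generated scan lands in the monitored range with probability N/2^32. A back-of-the-envelope sketch using the slide's numbers (17M+ monitored addresses is the only figure taken from the talk; the rest is simple probability):

```python
# Probability that a worm's random scan hits a network telescope.
monitored = 17_000_000             # ~ /8 + several /16s, per the slide
p_hit = monitored / 2**32          # chance one random probe lands in the telescope

# Expected number of scans before the telescope first sees an infected host
expected_scans = 1 / p_hit

print(f"p(hit) per scan = {p_hit:.4f}")           # ~0.0040
print(f"scans until first sighting ~ {expected_scans:.0f}")   # ~253
```

So an infected host scanning at even a modest rate is observed within a few hundred probes, which is why a large telescope sees new worms almost immediately.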



Telescopes + Active Responders

  • Problem: Telescopes are passive and can't complete the TCP handshake

    • Is a SYN from a host infected by CodeRed or Welchia? Dunno.

    • What does the worm payload look like? Dunno.

  • Solution: proxy responder

    • Stateless: TCP SYN/ACK (Internet Motion Sensor), per-protocol responders (iSink)

    • Stateful: Honeyd

    • Can differentiate and fingerprint payload
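Stateless responders work because a valid SYN/ACK can be computed entirely from fields of the incoming SYN, so no per-connection state needs to be stored. A minimal sketch of that computation (field names are illustrative, not the actual Internet Motion Sensor code; a real responder would typically derive the initial sequence number from a hash of the connection tuple rather than using a constant):

```python
def synack_for(syn):
    """Build a stateless SYN/ACK reply from an incoming SYN.

    `syn` is a dict of TCP/IP header fields (hypothetical names). The reply
    swaps the endpoints, acknowledges seq+1, and uses a fixed initial
    sequence number so that no per-connection state must be kept.
    """
    return {
        "src": syn["dst"], "dst": syn["src"],   # swap endpoints
        "sport": syn["dport"], "dport": syn["sport"],
        "flags": "SA",                          # SYN+ACK
        "seq": 0,                               # stateless: constant ISN (sketch only)
        "ack": syn["seq"] + 1,                  # acknowledge the SYN
    }

reply = synack_for({"src": "1.2.3.4", "dst": "10.0.0.9",
                    "sport": 4321, "dport": 80, "flags": "S", "seq": 1000})
print(reply["ack"])   # 1001
```

Once the infected host believes the handshake completed, it sends its exploit payload, which is what lets the responder differentiate worms and fingerprint payloads.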



HoneyNets

  • Problem: don’t know what worm/virus would do?

    • No code ever executes after all.

  • Solution: redirect scans to real “infectible” hosts (honeypots)

    • Individual hosts or VM-based: Collapsar, HoneyStat, Symantec

    • Can reduce false positives/negatives with host-analysis (e.g., TaintCheck, Vigilante, Minos) and behavioral/procedural signatures

  • Challenges

    • Scalability

    • Liability (honeywall)

    • Isolation (2000 IP addrs -> 40 physical machines)

    • Detection (VMWare detection code in the wild)


The Scalability/Fidelity Tradeoff

[Spectrum diagram, from most scalable with no fidelity to highest fidelity but least scalable: Network Telescopes (passive) → Telescopes + Responders (iSink, Internet Motion Sensor) → VM-based Honeynet → Live Honeypot]

Potemkin: A Large-Scale, High-Fidelity Honeyfarm

  • Goal: emulate a significant fraction of Internet hosts (10M+)

  • Multiplex a large address space onto a smaller number of servers

    • Temporal & spatial multiplexing

[Architecture diagram: traffic from the global Internet for 64 advertised /16s arrives over GRE tunnels at a gateway (with an attached management host), which dispatches packets to VMs running on the physical honeyfarm servers]

  • Scalability, Fidelity, and Containment in the Potemkin Virtual Honeyfarm, Vrable, Ma, Chen, Moore, Vandekieft, Snoeren, Voelker, and Savage, SOSP 2005



UCSD Honeyfarm Approach

  • Make VMs very, very cheap

    • Create one (or more) VM per packet on demand

  • Deploy many types of VM systems

    • Plethora of OSes, versions, configurations

  • Monitor VM behavior

    • Decide benign or malicious

    • Benign: Quickly terminate, recycle resources

    • Malicious: Track propagation, save for offline analysis, etc.

    • Assumes common case that most traffic is benign

  • Key issues for remainder of talk

    1) Scaling

    2) Containment



Scaling

  • Naïve approach: one machine per IP address

    • 1M addresses = 1M hosts = $2B+ investment

  • However, most of these resources would be wasted

  • Claim: it should be possible to make do with 5-6 orders of magnitude fewer resources
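The claim follows from compounding the two multiplexing gains introduced on the following slides. A back-of-the-envelope check, using the savings factors quoted later in the talk (the exact factors are the talk's; the arithmetic is the only thing added here):

```python
# Rough scaling arithmetic for the honeyfarm.
addresses = 10_000_000        # goal: emulate 10M+ Internet hosts
addr_multiplexing = 1000      # late binding + scan filtering: ~10^3 : 1 savings
vms_per_host = 1000           # delta virtualization: ~1000 VMs per machine

machines = addresses / (addr_multiplexing * vms_per_host)
print(machines)   # 10.0 -> on the order of 10-100 physical machines
```

That is five to six orders of magnitude below the naive one-machine-per-address deployment, consistent with the summary slide's "10M IP addresses → 100 physical machines".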



Resulting philosophy

  • Only commit the minimal resources needed and only when you need them

  • Address space multiplexing

    • Late-bind the assignment of IP addresses to physical machines (on-demand assumption of identity)

  • Physical resource multiplexing

    • Multiple VMs per physical machine

    • Exploit memory coherence

      • Delta virtualization (allows ~1000 VMs per physical machine)

      • Flash cloning (low-latency creation of VMs on demand)



Address space multiplexing

  • For a given unused address range and service time distribution, most addresses are idle

[Plot: concurrently active addresses in a /16 network with a 500 ms service time – but most of these are horizontal port scans!]



The value of scan filtering

  • Heuristic: no more than one (srcip, dstport, protocol) tuple per 60 seconds

[Plot: concurrent VMs required with scan filtering in place, showing max and mean]
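The heuristic above is cheap to enforce: a hash table keyed on the tuple, consulted once per packet. A sketch of such a filter (timestamps are passed explicitly so the logic is easy to test; a real filter would also need to evict stale entries):

```python
import time

class ScanFilter:
    """Admit at most one packet per (src_ip, dst_port, proto) per time window."""

    def __init__(self, window=60.0):
        self.window = window
        self.last_seen = {}          # tuple -> timestamp of last admitted packet

    def admit(self, src_ip, dst_port, proto, now=None):
        now = time.time() if now is None else now
        key = (src_ip, dst_port, proto)
        last = self.last_seen.get(key)
        if last is not None and now - last < self.window:
            return False             # same scanner, same service: drop as a repeat scan
        self.last_seen[key] = now
        return True                  # first sighting in this window: admit

f = ScanFilter()
print(f.admit("1.2.3.4", 445, "tcp", now=0.0))    # True
print(f.admit("1.2.3.4", 445, "tcp", now=30.0))   # False (within 60 s)
print(f.admit("1.2.3.4", 445, "tcp", now=61.0))   # True (window expired)
```

Because horizontal port scans repeat the same (source, port, protocol) tuple across many destinations, this single rule absorbs the bulk of the traffic that would otherwise spawn VMs.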



Implementation

  • Gateway (Click-based) terminates inbound GRE tunnels

  • Maintains an external IP address → type mapping

    • e.g., 132.239.4.8 should be a Windows XP box with IIS version 5, etc.

  • Mapping made concrete when a packet arrives

    • Flow entry created and packet dispatched to a type-compatible physical host

    • VMM on the host creates a new VM with the target IP address

    • VM and flow mapping GC'd after the system determines no state change occurred

  • Bottom line: 3 orders of magnitude savings
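The dispatch path just described reduces to a lookup plus a flow table. A toy sketch (class and field names here are illustrative, not the actual Click configuration):

```python
class Gateway:
    """Toy sketch of the honeyfarm gateway's dispatch path."""

    def __init__(self, ip_to_type, hosts_by_type):
        self.ip_to_type = ip_to_type        # e.g. "132.239.4.8" -> "winxp-iis5"
        self.hosts_by_type = hosts_by_type  # VM type -> type-compatible physical hosts
        self.flows = {}                     # dst_ip -> assigned physical host

    def dispatch(self, dst_ip):
        if dst_ip in self.flows:            # existing flow: keep routing to same host
            return self.flows[dst_ip]
        vm_type = self.ip_to_type[dst_ip]   # identity is late-bound on first packet
        host = self.hosts_by_type[vm_type][0]   # pick a type-compatible server
        self.flows[dst_ip] = host           # flow entry (GC'd later if no state change)
        return host

gw = Gateway({"132.239.4.8": "winxp-iis5"}, {"winxp-iis5": ["server-3"]})
print(gw.dispatch("132.239.4.8"))   # server-3 (new flow, late-bound)
print(gw.dispatch("132.239.4.8"))   # server-3 (existing flow entry)
```

The essential point is that no VM, and no binding of address to machine, exists until the first packet for that address arrives.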



Physical resource multiplexing

  • Can create multiple VMs per host, but it's expensive

    • Memory: separate address spaces for each VM (100s of MB)

      • In principle the limit for VMware is 64 VMs; the practical limit is lower

    • Overhead: initializing a new VM is wasteful

  • Claim: can support hundreds to ~1000 VMs per host by specializing hosts and the VMM

    • Specialize each host to software type

    • Maintain reference image of active system of that type

    • Flash cloning: instantiate new VMs via copying reference image

    • Delta virtualization: share state COW for new VMs (state proportional to difference from reference image)
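Delta virtualization can be illustrated with a toy copy-on-write memory: every clone reads through to the shared reference image and pays only for the pages it dirties. This is a sketch of the idea, not the Xen mechanism itself:

```python
class CowVM:
    """Toy copy-on-write clone: stores only pages that differ from the reference."""

    def __init__(self, reference):
        self.reference = reference   # shared, read-only reference image
        self.delta = {}              # page number -> this clone's private copy

    def read(self, page):
        # Reads fall through to the shared image unless the page was modified
        return self.delta.get(page, self.reference[page])

    def write(self, page, data):
        self.delta[page] = data      # first write creates a private copy

    def unique_pages(self):
        return len(self.delta)       # memory this clone actually owns

ref = {n: b"\x00" * 4096 for n in range(1024)}   # 4 MB reference image
vm = CowVM(ref)
vm.write(7, b"\x01" * 4096)
print(vm.unique_pages())   # 1 -> the clone costs one page, not 4 MB
```

Since a freshly cloned VM differs from the reference image by only a handful of pages, per-VM state stays proportional to that delta, which is what makes ~1000 VMs per physical machine plausible.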



How much unique memory does a VM need?



Potemkin VMM implementation

  • Xen-based using new shadow translate mode

    • New COW architecture being incorporated back into Xen (VT compatible)

  • Clone manager instantiates frozen VM image and keeps it resident in physical memory

    • Flash-clone memory instantiated via eager copy of PTE pages and lazy faulting of data pages (moving to lazy + profile-driven eager pre-copy)

    • RAM disk or Parallax FS for COW disks

  • Overhead: currently takes ~300ms to create new VM

    • Highly unoptimized (e.g., includes a Python invocation)

    • Goal: pre-allocated VMs can be invoked in ~5ms



Containment

  • Key issue: 3rd party liability and contributory damages

    • Honeyfarm = worm accelerator

    • Worse, I knowingly allowed my hosts to be infected (premeditated negligence)

  • Export policy tradeoffs between risk and fidelity

    • Block all outbound packets: no TCP connections

    • Only allow outbound packets to hosts that previously sent us a packet: no outbound DNS, no botnet updates

    • Allow outbound, but “scrub”: is this a best practice?

    • In the end, need fairly flexible policy capabilities

      • Could do whole talk on interaction between technical & legal drivers

  • But it gets more complex…
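The options above can be captured as a per-packet decision function rather than a single fixed rule. A toy sketch (the policy names and the "reflect" outcome are illustrative labels for the behaviors the slide describes, including sending disallowed packets back into the honeyfarm as covered next):

```python
def export_decision(policy, pkt, seen_inbound_from):
    """Decide the fate of an outbound honeyfarm packet (illustrative sketch).

    policy: "block-all", "reply-only", or "scrub"
    seen_inbound_from: set of external IPs that previously sent us packets
    """
    if policy == "block-all":
        return "drop"                  # safest, but no TCP connection ever completes
    if policy == "reply-only":
        # Only respond to hosts that contacted us first:
        # blocks outbound DNS lookups and botnet update fetches
        return "allow" if pkt["dst"] in seen_inbound_from else "reflect"
    if policy == "scrub":
        return "scrub-and-allow"       # rewrite known exploit payloads before release
    raise ValueError(f"unknown policy: {policy}")

print(export_decision("reply-only", {"dst": "8.8.8.8"}, set()))   # reflect
```

Each point in this policy space trades containment risk for fidelity, which is why the gateway needs flexible, per-deployment policy rather than one hard-coded rule.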



Internal reflection

  • If an outbound packet is not permitted to reach the real Internet, it can be sent back through the gateway

    • New VM generated to assume target address (honeyfarm emulates external Internet)

    • Allows causal detection (A->B->C->D) and can dramatically reduce false positives

  • However, creates new problem:

    • Is there only one version of IP address A?

    • Yes, single “universe” inside honeyfarm

      • No isolation between infections

      • Also allows cross contamination (liability rears its head again)

    • No, how are packets routed internally?



Causal address space aliasing

  • A new packet i destined for address t creates a new universe U(i,t)

  • Each VM created by actions rooted at t is said to exist in the same universe and a single export policy is shared

    • In essence, the 32-bit IP address space is augmented with a universe-id that provides aliasing

    • Universes are closed; no leaking

  • What about symbiotic infections? (e.g., Nimda)

    • When a universe is created, it can be opened to multiple outside influences

    • Common use: a fraction of all traffic is directed to a shared universe with draconian export rules
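Mechanically, universe aliasing amounts to keying the honeyfarm's internal routing on (universe-id, IP) instead of the IP alone, so the same external address can exist in many isolated copies at once. A toy sketch (names are illustrative):

```python
import itertools

class UniverseRouter:
    """Toy routing for causal universes: each inbound packet roots a universe."""

    def __init__(self):
        self._ids = itertools.count()
        self.vms = {}                  # (universe_id, ip) -> VM state

    def inbound(self, dst_ip):
        uid = next(self._ids)          # fresh universe per external packet
        self.vms[(uid, dst_ip)] = f"vm-for-{dst_ip}"
        return uid

    def internal(self, uid, dst_ip):
        # Reflected packets stay inside their own universe: no leaking,
        # no cross-contamination between separate infections
        key = (uid, dst_ip)
        if key not in self.vms:
            self.vms[key] = f"vm-for-{dst_ip}"
        return self.vms[key]

r = UniverseRouter()
u1 = r.inbound("10.0.0.5")
u2 = r.inbound("10.0.0.5")     # second attack on the same address: separate universe
print(u1 != u2)                # True: two isolated copies of the address
```

A shared universe for symbiotic infections like Nimda is then just several inbound packets deliberately assigned the same universe-id, with a stricter export policy attached.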



Overall challenges for honeyfarms

  • Depends on worms scanning it

    • What if they don’t scan that range (smart bias)

    • What if they propagate via e-mail, IM? (doable, but privacy issues)

  • Camouflage

    • Honeypot detection software exists… perfect virtualization tough

  • It doesn’t necessary reflect what’s happening on your network (can’t count on it for local protection)

  • Hence, there is a need for both honeyfarm and in-situ approaches



Summary

  • Potemkin: High-fidelity, scalable honeyfarm

    • Fidelity: New virtual host per packet

    • Scalability: 10M IP addresses → 100 physical machines

  • Approach

    • Address multiplexing: late-bind IPs to VMs (10^3:1)

    • Physical multiplexing: VM coherence, state sharing

      • Flash cloning: Clone from reference image (milliseconds)

      • Delta virtualization: Copy-on-write memory, disk (100+ VMs per host)

  • Containment

    • Risk vs. fidelity: Rich space of export policies in gateway

  • Challenges

    • Attracting attacks, camouflage, denial-of-service

