Glavlit: Preventing Exfiltration at Wire Speed

Nabil Schear†*, Carmelo Kintana†, Qing Zhang†, Amin Vahdat†
†Department of Computer Science and Engineering, University of California at San Diego
*Los Alamos National Laboratory
HotNets V - Irvine, CA - November 30, 2006

Information Leaks and Exfiltration
• Exfiltration: a type of information leak; malicious theft of valuable information
• Leaks affect customer confidence, regulatory compliance, profits, etc…
• Leaks are inevitable
  • Targeted attacks, insiders, accidents, etc…
Goal: Minimize leaks NO MATTER how or why they happen.

How Does Data Get Out? Accidentally
• Didn’t know the file was sensitive, or
• An honest mistake
[Diagram: a file leaves the protected network (web servers, email server, user workstations) by email, crossing the boundary to the external network]

What about Malicious Exfiltration?
• Attacker, malware, or insider uses an existing Web server
[Diagram: a file leaves the protected network (web servers, email server, user workstations) through a web server, crossing the boundary to the external network]

More Malicious Leaks
• Attacker uses a hidden channel in the protocol to encode sensitive data
[Diagram: a second file (File2) is carried inside HTTP traffic from the protected network (web servers, email server, user workstations) across the boundary to the external network]

Previous Solutions
• Policy (policies and procedures)
• Private Stand-alone LAN
Drawbacks:
- Expensive
- Granularity too coarse
- Hard to use
- Difficult to enforce
[Diagram: external network, boundary, and protected network with web servers, email server, and user workstations]

Previous Solutions
• Packet Filter (Firewall)
• Passive Monitoring
Drawbacks:
- High speed limits analysis complexity
- Works on packets, not files
- Can’t actively stop leaks in progress
[Diagram: a firewall at the boundary between the external and protected networks, with a separate analysis / audit host; the protected network holds web servers, email server, and user workstations]

Previous Solutions
• Proxies
Drawbacks:
- High overhead
- Difficult and complicated to configure
[Diagram: proxies placed in front of the web servers and email server, between the external network and the protected network with its user workstations]

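Our Solution: Glavlit
Decouple vetting from verification
• Guard (at the network boundary)
  • Transparent
  • High speed
  • Actively stops leaks
• Warden
  • Arbitrary and powerful analysis
  • Off the critical network path
[Diagram: the Guard sits on the boundary between the external and protected networks; the Warden sits inside the protected network with the web servers and user workstations]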

Our Solution: Glavlit
Mitigate covert channels in the application layer protocol
• Prevents a subset of covert channels
• Limits the bandwidth of others
[Diagram: HTTP traffic carrying a hidden second file (File2) passes through the Guard at the boundary between the protected and external networks]

What is Glavlit?
• Prevents unauthorized release from HTTP servers while allowing authorized data to pass unhindered
• Enforces complex exit policy
• Operates at the granularity of whole files
• Covers a wide range of threats
• Does not depend on host security
  • Only trust the Warden and Guard
Key Contributions:
• Ensure that only authorized objects cross the network boundary in payload
• Mitigate a class of covert channels in application layer protocols

Glavlit is NOT…
• Just a firewall
• For outgoing HTTP browser requests
• Designed to prevent leaks from covert channels below layer 7
• Capable of stopping ALL potential covert channels
  • In general this is intractable

Two Complementary Techniques for Mitigating Leaks
• Content Control
  • Hash network content against a known list of good, releasable data
• HTTP Protocol Channel Mitigation
  • Restrict the HTTP RFC and parse the protocol for syntactic correctness
  • Check field values for semantic validity
  • Enforce ordering and normalize timing

Vetting at the Warden
• Vetting: authoritative review to decide if an object (a file) is OK to release
  • Arbitrarily complex and time-consuming
• Warden performs an arbitrary vetting process
[Diagram: the Content Provider submits a file to the Warden; when vetting is complete the file is approved. The Guard is shown alongside the Warden.]

Vetting at the Warden
• Generates signatures
  • Split the file into 1KB chunks
  • Calculate a secure hash of each chunk
  • Collect file metadata
• Share the table of signatures for vetted objects with the Guard
[Diagram: Content Provider, Warden, and Guard; the Warden passes the signatures for the file to the Guard]
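The signature-generation steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slides say only "secure hash", so SHA-256 is an assumption, and the record layout and the names `make_signature` and `build_table` are hypothetical. The lookup key over the first 256 bytes follows the Guard's lookup step described on the verification slide.

```python
import hashlib

CHUNK_SIZE = 1024     # 1 KB chunks, as stated on the slide
LOOKUP_PREFIX = 256   # the Guard looks objects up by a hash of the first 256 bytes


def make_signature(data: bytes, last_modified: str = "") -> dict:
    """Build a signature record for one vetted file (hypothetical layout)."""
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]
    return {
        # SHA-256 is an assumption; the slides do not name the hash function.
        "lookup_key": hashlib.sha256(data[:LOOKUP_PREFIX]).hexdigest(),
        "chunk_hashes": [hashlib.sha256(c).hexdigest() for c in chunks],
        # File metadata the Guard can later compare against HTTP headers.
        "length": len(data),
        "last_modified": last_modified,
    }


def build_table(files: dict) -> dict:
    """Map lookup key -> signature record for every vetted file."""
    table = {}
    for data in files.values():
        sig = make_signature(data)
        table[sig["lookup_key"]] = sig
    return table
```

Keying the table by a hash of the file prefix lets the Guard identify an object from its first bytes on the wire, without needing the file name.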

Verification at the Guard
• Verification: ensure that an object crossing the network boundary is pre-vetted
• Locate the object within the network stream
• Look up the object in the signature table based on a hash of the first 256 bytes of the file
• Verify file content
  • Hash and check each chunk
  • Packets can egress as soon as all their chunks are verified
• Can actively stop invalid data by dropping packets and injecting TCP RESET packets
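A rough sketch of the lookup-then-verify flow above, operating on an already reassembled byte stream. It is illustrative only: SHA-256, the table layout, and the names `sign` and `verify_object` are assumptions, and raising `ValueError` stands in for dropping packets and injecting a TCP RESET.

```python
import hashlib

CHUNK_SIZE = 1024
LOOKUP_PREFIX = 256


def sha(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()  # hash choice is an assumption


def sign(data: bytes) -> dict:
    """Warden side, included only so the example is self-contained."""
    return {"chunk_hashes": [sha(data[i:i + CHUNK_SIZE])
                             for i in range(0, len(data), CHUNK_SIZE)]}


def _check(sig: dict, idx: int, chunk: bytes) -> None:
    hashes = sig["chunk_hashes"]
    if idx >= len(hashes) or sha(chunk) != hashes[idx]:
        raise ValueError("chunk %d does not match vetted content" % idx)


def _lookup(table: dict, prefix: bytes) -> dict:
    sig = table.get(sha(prefix))
    if sig is None:
        raise ValueError("object not in signature table")
    return sig


def verify_object(segments, table):
    """Yield verified chunks of one object; raise ValueError on unvetted data."""
    buf, sig, idx = b"", None, 0
    for seg in segments:
        buf += seg
        if sig is None:
            if len(buf) < LOOKUP_PREFIX:
                continue                      # need more bytes before lookup
            sig = _lookup(table, buf[:LOOKUP_PREFIX])
        while len(buf) >= CHUNK_SIZE:
            chunk, buf = buf[:CHUNK_SIZE], buf[CHUNK_SIZE:]
            _check(sig, idx, chunk)
            idx += 1
            yield chunk                       # safe to release downstream
    if sig is None:                           # object shorter than the prefix
        sig = _lookup(table, buf[:LOOKUP_PREFIX])
    if buf:                                   # trailing partial chunk
        _check(sig, idx, buf)
        idx += 1
        yield buf
    if idx != len(sig["chunk_hashes"]):
        raise ValueError("object has fewer chunks than the vetted file")
```

Because each chunk is released as soon as its hash matches, data never waits for the whole file to arrive, which mirrors the slide's point that packets can egress once their chunks are verified.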

Need an In-order TCP Stream
• How to verify data in lost, retransmitted, or out-of-order packets?
• Keep a sliding window of packet content and a cache for old packets
[Diagram: a buffer divided into packet cache, pending data, and unused buffer space, alongside a packet header queue of TCP/IP headers waiting to be sent]
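One way to picture the in-order requirement is a small reassembly buffer keyed by sequence number. The sketch below is an assumption-laden simplification: the class name `StreamReassembler` is hypothetical, sequence-number wraparound is ignored, and the packet cache for old packets shown on the slide is not modeled (retransmissions of delivered data are simply ignored).

```python
class StreamReassembler:
    """Delivers TCP payload bytes in sequence order (no wraparound handling)."""

    def __init__(self, initial_seq: int):
        self.next_seq = initial_seq   # next byte expected in order
        self.pending = {}             # seq -> payload of out-of-order segments

    def add(self, seq: int, payload: bytes) -> bytes:
        """Accept one segment; return any bytes that are now in order."""
        if seq + len(payload) <= self.next_seq:
            return b""                # retransmission of delivered data
        # Keep the longer payload if two segments start at the same seq.
        old = self.pending.get(seq, b"")
        self.pending[seq] = payload if len(payload) > len(old) else old
        out = bytearray()
        for s in sorted(self.pending):
            if s > self.next_seq:
                break                 # gap: wait for the missing segment
            data = self.pending.pop(s)
            fresh = data[self.next_seq - s:]   # drop bytes already delivered
            out += fresh
            self.next_seq += len(fresh)
        return bytes(out)
```

The bytes returned by `add` form the contiguous stream that chunk hashing needs; segments that arrive early stay in `pending` until the gap before them is filled.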

Protocol Channels
• Protocol Channel
  • Unauthorized communication channel
  • Present in an L7 protocol or its operation
• Channel Carrier
  • Cover data holding the channel
• Types of carriers in protocol channels
  • Structured
  • Unstructured

Structured Protocol Channels
• Attackers can encode data in structured protocol fields in an HTTP response

    HTTP/1.1 200 OK
    Date: Thu, 23 Nov 2006 03:45:23 GMT
    Server: Apache
    Last-Modified: Fri, 10 Mar 2006 05:56:06 GMT
    Accept-Ranges: bytes
    Content-Length: 255
    Connection: close
    Content-Type: text/html; charset=UTF-8

• Examples highlighted on the slide: Content-Length changed from 255 to 254; an added header, Credit-Card-Num: 1234-5678-9012-3456
Key Insight: most fields are verifiable

Verifying Structured Data
• Does it look right? (Syntactic)
  • Check syntax against a restricted RFC specification
  • Pre-specified headers and order
• Does it make sense? (Semantic)
  • Check against the corresponding request
  • Restrict server responses to aid verification
  • Check metadata against Warden info
    • Content-Length, Last-Modified, etc…
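The syntactic and semantic checks above might look something like the following. This is a sketch under stated assumptions, not the Guard's actual rule set: the allowed status line, the header order in `EXPECTED_ORDER` (taken from the example response on the earlier slide), the `warden_meta` layout, and the function names are all hypothetical.

```python
# Assumed policy: exactly these headers, in exactly this order.
EXPECTED_ORDER = ["Date", "Server", "Last-Modified", "Accept-Ranges",
                  "Content-Length", "Connection", "Content-Type"]


def parse_head(raw: str):
    """Split a response head into its status line and ordered header pairs."""
    lines = raw.split("\r\n")
    headers = []
    for line in lines[1:]:
        if line == "":
            break                          # blank line ends the header block
        name, sep, value = line.partition(":")
        if not sep or name != name.strip():
            raise ValueError("malformed header line: %r" % line)
        headers.append((name, value.strip()))
    return lines[0], headers


def check_response(raw: str, warden_meta: dict) -> list:
    """Return a list of policy violations; an empty list means the head passes."""
    problems = []
    status, headers = parse_head(raw)
    # Syntactic: restricted status line, pre-specified headers and order.
    if status != "HTTP/1.1 200 OK":
        problems.append("unexpected status line")
    if [name for name, _ in headers] != EXPECTED_ORDER:
        problems.append("headers missing, extra, or out of order")
    # Semantic: field values must agree with the Warden's file metadata.
    values = dict(headers)
    if values.get("Content-Length") != str(warden_meta["length"]):
        problems.append("Content-Length disagrees with vetted file size")
    if values.get("Last-Modified") != warden_meta["last_modified"]:
        problems.append("Last-Modified disagrees with vetted metadata")
    return problems
```

With this policy, a response whose Content-Length is 254 for a 255-byte vetted file, or one that carries an extra Credit-Card-Num header, would each produce a violation.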

Unstructured Carriers
• Attackers can also encode information in network order or timing
• Correlate request/response pairs to enforce ordering
• Actively alter timing behavior by delaying server responses
• Model server response behavior and block deviations
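As a simple illustration of enforcing order and delaying responses, the sketch below releases responses in request order and only at multiples of a fixed time quantum. Quantizing release times is one possible normalization strategy chosen for illustration; the slides do not say how Glavlit delays responses, and the function names and the 50 ms quantum are assumptions.

```python
def release_time_ms(ready_ms: int, quantum_ms: int = 50) -> int:
    """Round a response's ready time up to the next multiple of the quantum."""
    return -(-ready_ms // quantum_ms) * quantum_ms   # integer ceiling


def schedule(responses, quantum_ms: int = 50):
    """Plan release times for (request_id, ready_ms) pairs.

    request_id is assumed to reflect the order in which requests arrived.
    Responses are released in request order, never before they are ready,
    and only on quantum boundaries.
    """
    plan = []
    last = 0
    for request_id, ready_ms in sorted(responses):
        t = max(release_time_ms(ready_ms, quantum_ms), last)
        plan.append((request_id, t))
        last = t
    return plan
```

Here a server that finishes response 2 before response 1 cannot signal anything by that reordering, and fine-grained timing differences are coarsened to the quantum, which limits the bandwidth of a timing channel rather than eliminating it.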

Evaluation Setup
• How fast is Glavlit verification relative to
  • Direct connection
  • Linux software bridge
  • Glavlit Guard with verification off
    • No hashing or protocol parsing
    • TCP reassembly and packet forwarding only
[Diagram: an Apache 2.2.2 Web server and a custom HTTP client, each connected by Gigabit Ethernet to a Linux host running the Guard at the network boundary]

System Throughput

Evaluation Discussion
• Guard and Web server both pay the price for more connections on small files
• Per-connection overhead reduces performance for small files (~50%)
  • Parsing
  • TCP connection/stream/state allocation
  • pcap and libnet kernel switching overhead
• For common Web files (~10KB+), performance is comparable to direct connect and the Linux kernel bridge
• Total request latency NOT affected

Conclusions
• Content control prevents information that is not explicitly allowed from exiting
  • Prevents inadvertent disclosure
• Protocol Channel Mitigation prevents many channels and limits others
• Raises the bar for attackers wanting to steal valuable data
• Performance overhead is acceptable in an untuned prototype
• FIRST system to actively limit application layer covert channels

Author Contact Info
{nschear, ckintana, qzhang, vahdat}@cs.ucsd.edu
Thank you
QUESTIONS?

Guard CPU Usage

Guard No-Verify CPU Usage

Verifying Dynamically Generated Content
• Goal: Leverage static content verification as much as possible
• Rolling checksum (à la rsync)
• Rabin fingerprints for variable-sized chunks
• High-speed analysis engine for mismatch regions
• Self-describing templates
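To make the rolling-checksum idea concrete, here is a small sketch of an rsync-style weak checksum that slides one byte at a time to find a vetted block at an arbitrary offset inside generated content. It illustrates the general technique only; how Glavlit would use it is not described on the slide, and `find_block` is a hypothetical helper.

```python
M = 1 << 16   # both checksum halves are kept modulo 2^16, as in rsync


def weak_checksum(block: bytes):
    """rsync-style weak checksum: (sum of bytes, position-weighted sum)."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * x for i, x in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int):
    """Slide an n-byte window one byte: drop out_byte, append in_byte."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def find_block(data: bytes, block: bytes) -> int:
    """Return the first offset where block occurs in data, or -1."""
    n = len(block)
    if n == 0 or len(data) < n:
        return -1
    target = weak_checksum(block)
    a, b = weak_checksum(data[:n])
    for off in range(len(data) - n + 1):
        # The cheap checksum filters candidates; bytes are compared to confirm.
        if (a, b) == target and data[off:off + n] == block:
            return off
        if off + n < len(data):
            a, b = roll(a, b, data[off], data[off + n], n)
    return -1
```

In a real system the confirming byte comparison would presumably be replaced by a strong hash of the window checked against the vetted chunk's signature, so the Guard would not need the original content.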

Related Work
• Content Control
  • Commercial solutions (Entrust, Fidelis, Vontu, PortAuthority)
• Covert Channels
  • Web Tap, Eraser, Infranet
  • Detection of Layer 3 and 4 channels (NUSHU, Loki, etc…)
  • Murdoch et al., Fisk et al., Tumoian et al.
• Vetting Review Tools
  • Wetstone StegoSuite
  • Los Alamos National Lab - File Scrub

Future Work
• Dynamic Content
  • Fuzzy fingerprint matching
  • Self-describing Web language (JWig)
• Support More Protocols
  • SMTP, IM, etc…
• SSL Traffic Support
• More tuning for better performance
  • Possible hardware acceleration?
