1 / 44

ForNet: A Distributed Forensic Network

ForNet: A Distributed Forensic Network. Kulesh Shanmugasundaram http://isis.poly.edu/projects/fornet/. Security Fails…. What Mechanisms Do We Have?. Logging mechanisms and audit trails Alerts/Logs from security components: Logs only perceived security threats Host Logs:

crystal
Download Presentation

ForNet: A Distributed Forensic Network

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. ForNet: A Distributed Forensic Network Kulesh Shanmugasundaram http://isis.poly.edu/projects/fornet/

  2. Security Fails….

  3. What Mechanisms Do We Have? • Logging mechanisms and audit trails • Alerts/Logs from security components: • Logs only perceived security threats • Host Logs: • First thing to get disabled upon intrusion • An insider would rather use a host without any logs • Mobility, wireless networks create new problems • Packet Logs: • Usually at the edges hence blinded easily • Can’t keep data for long… • E.g: Infinistream, NFR, NetWitness, SilentRunner

  4. What do we need? • A reliable mechanism for analysis and attribution • An effective response model • Current response model is mostly manual • Response time is in days or weeks • Digital evidence disappears quickly • Tools that complement security components • Need some safety net to fall back • Forensics is that net

  5. Our Solution… • Securely collect, store, disseminate, and process synopsis of network traffic. • In other words a device analogous to a surveillance camera for the network. Even better – a co-operating network of such surveillance cameras. • Goal of Project ForNet: development of tools, techniques, and infrastructure to aid rapid investigation and identification of cyber crimes

  6. ForNet: A Unified, End-to-End Approach to Network Forensics ForNet Domain: A domain covered by single monitoring and privacy policies. Forensic Server: Responsible for archiving synopses, query processing & routing, enforcing monitoring, security policies, for the domain. SynApp:equipped routers or hosts. Primary function is to create synopses of network traffic. May have limited query processing and storage component as well.

  7. Design Goals and Trade-Offs • “Complete” & “Correct” Evidence • Longevity • Succinctness • Accessibility of Evidence • Security & Privacy of Evidence • Ubiquity • Incremental Deployment • Modular • Scaleable Design

  8. Legal Issues Systems Security Networking , Distributed Systems Theory Research Challenges • Legally admissible evidence • Chain of custody • Privacy • High-speed packet capture • Storage systems • Databases • GUI • Secure architecture • Attacks on ForNet • Use of ForNet • Identification of events • Architecture of ForNet • Query Routing • Fault Tolerance • Design of Synopses • Quantitative Analysis

  9. SynApp Monitoring Policy Query Language Panorama Forensic Server Privacy Policy Components of ForNet • Architectural components can be grouped under following functions: • Data Collection • Data Retention • Data Dissemination

  10. Network Stream Synopses Sent ToForensic Server Network Filter HBF Neoflow MonitoringPolicy Synopses Library Synopsis Engine Configuration Manager Histograms Buffer Manager Synopsis Controller Architecture of a SynApp

  11. Synopses in ForNet * Currently being implemented

  12. Query From An Analyst Response To The Query Security Manager Archive Manager Configuration Manager QueryParser PrivacyManager QueryPlanner XMLGenerator Scan Detection Find Zombies Steppingstones HBF Neoflow DNS Records Histogram BGP Tables Architecture of Forensic Server

  13. Forensic Server • Forensic Server has the following functions: • Data archiving • Query processing • Policy management and enforcement • Data Archiving • Periodically receives data from SynApps & archives them on disk.

  14. Forensic Server (cont…) • Query Processor: • XML-based query language • No relational databases (uses Berkeley DB) • Does coalescence of synopses • Given a query can locate appropriate synopses • Generate and execute a query plan to answer the query • Distributed query processing • Currently being implemented to talk to nearby forensic servers to answer/corroborate queries/evidence • Policy Enforcer: • Two types of policies in ForNet: • Monitoring Policy– informs user of what is being monitored • Privacy Policy – informs user of how/with whom data is shared etc.

  15. A user interface for: Building queries Data mining Visualization Chart Generator Network Graphs 3D Link Maps Summary Views Queries, Various Commands to Forensic Server Data Cleansing Miners Communication Manager User Interface Manager XML Parser Plug-in Manager Architecture of Panorama

  16. Capturing Evidence on The Internet • Internet is a gigantic state machine: • To support forensics we need to know its precise state at any given time! • Therefore, we need to keep track of state transitions • State transitions can be characterized by: • Links/connections between nodes • Link content • Protocol mappings and aliases • Aggregates generated by state transitions Archiving raw traffic for this information is an overkill!

  17. What is a Synopsis? Data structures and algorithms for representing a set of elements succinctly with predefined loss in information and has the ability to recall the original set of elements with a preset accuracy.

  18. Properties of a Good Synopsis • Contains enough data to answer certain classes of queries • Who sent payload “xyz”? • What did host bug.poly.edu send? • Contains enough data to quantify confidence of its answers • I’m 99.37% sure bug.poly.edu sent “XYZ” • Have small memory footprint and easy to update • Need 20GB/day to keep 1TB/day of raw network data • Need to compute two hashes per packet • Resource requirements are tunable • Can only afford 3GB/day, adjust the accuracy to accommodate this.

  19. Advantages of Using Synopses Can retain potential evidence for months! • Succinct representation of raw data makes it possible to transfer network data to disks • Sharing/transferring raw data over network is impossible but synopsis can be moved to remote sites • Query processing would be expensive with raw data • What’s the frequency of traffic to port 80 in the past week? (raw data vs. a histogram) • Easily adaptable to various resource requirements • Can adopt the size, processing requirements of a Bloom Filter based on various hardware resources and network load Allows for cascading different techniques!

  20. Core (40Gbps) Enterprise(100Mbps) Edge(1Gbps) Subnet(50Mbps) SCBF Simple CR Topology maps Payload Sampling MAC-layer maps Neoflow HBF DNS mappings CR-BF Topology maps Cascading of Synopses

  21. Packet Digests & Bloom Filters • Snoeren et. al. used it successfully in SPIE for single packet traceback (“Hash-Based IP Traceback”) • Space Efficient: • 16-bits per packet (m/n=16) and 8 hashes (k=8) false positive (FP) = 5.74 x 10-4 • No false negatives! • However, suppose we don’t have packets. • We only have some excerpts of payload • Don’t know where the excerpt was aligned in the packet Extend Bloom Filters to support excerpt/substring matching

  22. P = s create s-byte blocks of payload P = p0 p1 p2 … … p(n/s) append blocks’ Offset (in payload) P = (p0|0) (p1|s) (p2|2s) … … (p(n/s)|n/s) Block-based Bloom Filter Insert each block into a Bloom Filter

  23. Q = s Try all possible offsets create s-byte blocks of query string (q0|0) Q = q0 q1 q2 … … q(l/s) (q0|s) (q1|2s) (q2|3s) q2=p3 q0=p1 q1=p2 (p0|0) (p1|s) (p2|2s) (p2|3s) … (p(n/s)|n/s) BBF = H1(q0|0)=1 H2(q0|0)=1 H3(q0|0)=0 X “q0q1q2” was seen in a payload at offset ‘s’

  24. P1 = A B R A C A P2 = C D A B R A (A|0) (B|s) (R|2s) (A|3s) (C|4s) (A|5s) BBF = (C|0) (D|s) (A|2s) (B|3s) (R|4s) (A|5s) (A|0) (B|s) (R|2s) (A|3s) (C|4s) (A|5s) (C|0) (D|s) (A|2s) (B|3s) (R|4s) (A|5s) “Offset Collisions” For query strings: “AD”, “CB”, “DR”, “AA” etc. BBF falsely identifies them as seen in the payload! Because BBF cannot distinguish between P1 and P2

  25. Hierarchical Bloom Filter (p0|0) (p1|s) (p2|2s) … … (p(n/s)|n/s) P = level-2 (p0p1p2p3|0) level-0 level-1 (p0p1|0) (p2p3|2s) (p(n/s-1)p(n/s)|(n/s-1)) Hierarchical Bloom Filter • An HBF is basically a set of BBF for geometrically increasing sizes of blocks.

  26. (q0|0) (q1|s) (q2|2s) (q3|3s) … (q(n/s)|l/s) Q = level-2 (q0q1q2q3|0) level-0 level-1 (q0q1|0) (q2q3|2s) (q(l/s-1)q(l/s)|(l/s-1)) Hierarchical Bloom Filter • Querying is similar to BBF. • Matches at each level can be confirmed a level above.

  27. Adapting an HBF for ForNet • So far an HBF can attest for the presence of a bit-string in payloads • We need to tie this bit-string to a source and/or destination hosts • Our Approach: • Similar to tying an offset to a block/bit-string • In addition to inserting (block||offset) also insert (block||offset||hostid) • Hostid could be (srcIP||dstIP)

  28. Connectionrecord/Neoflow Give all connections between time T1 and T2? Who sent “DEADBEEF” between time T1 and T2? (source, destination) IPs of connectionsbetween time T1 and T2? QueryProcessor Did (source, destination) IP send“DEADBEEF” between time T1 and T2? HBF YES/NO Coalescence of Synopses • HBF requires: • Source IP, destination IP, excerpt • But where do we get Source IP, Destination IP • Connection Record/Neoflow • Given two time intervals can give us list of source, destination IPs • Coalesce these synopses

  29. ForNet in Intranet Usage • Investigation based on payload characteristics • Determine victims of worm, trojans and other malware. • Detection of potential victims of phising and spyware • Source of IP theft • Investigation based on connection characteristics • Detection of zombies • Detection of malware based on connection pattern • Investigation based on aggregate characteristics • Insider abuse • Network resource abuse • Etc Etc …

  30. ForNet Deployed on Internet • Investigation based on payload characteristics • Traceback based on partial content of single packets • Source of malware, worms, etc. • Investigation based on connection characteristics • Stepping stone detection • Investigation based on aggregate characteristics • Attack attribution

  31. Current Status • Implemented a PC based SynApp device for placement within an intranet. • Connection records, HBF, DNS, NewFlow, Mappings • Implemented Forensics Server with simple querying capabilities. • Current Forensic Server has 1.3TB of storage with over 3 months worth of data from the edge-router and two subnets • Normal bandwidth consumption of network is about a 1 – 2 TB/day • Synopses reduces this traffic to about 20GB/day • A 4TB Forensic Server will take over operations in July • Implemented Panorama (GUI Client)

  32. Tracking MyDoom • Recorded all email traffic for a week • Using HBF and raw traffic • Was not aware of MyDoom during this collection • When signatures became available we used them to query the system • To find hosts that are infected in our network • How the hosts were infected • Some statistics: • 679 hosts originated at least one copy of the virus • 52 of which were in our network • These hosts sent out copies of the virus to 2011 hosts outside our network boundary

  33. Analyzing MyDoom Infections…

  34. MyDoom’s Weekly Progress…

  35. Neoflow-NG • Keeps track of everything Neoflow has plus: • Individual packet sizes • Be able to recall the packet sizes in the same sequence • Inter-arrival times of packets • Be able to recall the arrival times of individual packets • Content type of flow • Be able to recall types of content carried by flows • Audio, Video, Plain-Text, Encrypted, Compressed etc.

  36. Investigating Network Resource Abuse

  37. Total Network Bandwidth Composition

  38. Detection of Network Proxies

  39. Zombie Detection • Are they any zombies in Poly network? And if so, how do we identify them? • Criterion: Hosts that portscan and use IRC at likely zombies. • Tabulate the amount of IRC activity for each host. • Tabulate the amount of portscan activity for each host. • For each host: ZOMBIE_LIKELIHOOD = PORTSCANS * IRC_FLOWS • Sort all hosts by their ZOMBIE_LIKELIHOOD. • List the top 200 hosts.

  40. Work currently in progress… • Identification of useful network events • Network is the virtual crime scene that holds evidence in the form of network events • Developing efficient synopses • Handling connection oriented & connectionless traffic • Techniques for payload attribution (esp. IRC/IM) • Automatic cascading of various synopsis techniques • Analysis and mining techniques • Zombie detection • Stepping stone detection

  41. Work currently in progress… • Integration of information from synopses across networks • Development of a protocol for secure communication of various ForNet components • Query processing and storage of synopses • A query language transparent of various underlying synopsis techniques • A query manage system to interpret the language for the underlying database • Various storage and garbage collection strategies for collected-SynApps • Storage and query processing infrastructure for Forensics Servers

  42. Research Challenges • Integration of information from synopses across networks • Real power of ForNet is realized when information from SynApps is fused to answer queries • Development of a protocol for secure communication of various ForNet components • Query processing and storage of synopses • A query language transparent of various underlying synopsis techniques • A query management system to interpret the language for the underlying database • Analysis • A frame-work to compare effectiveness/trade-offs of synopses

  43. Attacks on ForNet • Resource Exhaustion: • Flood the network with random bits of data • Malicious Transformation: • Create packets of length (blocksize – 1) • Stuffing: • Stuff every other block with application dependent escape characters • For smaller blocks we can try to guess for larger blocks it is not possible! • Exploiting Collisions: • Hash collisions • Very unlikely for strong hash functions • We use a random seed for every HBF so it makes it more difficult • Packet collisions • A possibility in Block-based Bloom Filters but not in HBFs • Streaming transformations: • Encryption, compression

  44. For more information visit: http://isis.poly.edu/projects/fornet/

More Related