
Traceback





  1. Traceback Pat Burke, Yanos Saravanos

  2. Agenda • Introduction • Problem Definition • Benchmarks and Metrics • Traceback Methods • Packet Marking • Hash-based • Conclusion • References

  3. Why Use Traceback? • General Network Monitoring • Check users on FTP server • Network Threats • SPAM • DoS • Insider attacks

  4. Why Use Traceback? • Network Threats • Worms / Viruses • Code Red (2001) spreading at 8 hosts/sec • Slammer Worm (2003) spreading at 125 hosts/sec • Illegal file sharing

  5. Why Use Traceback? • Currently very difficult to find spammers and virus authors • Easy to spoof IPs • No inherent tracing mechanism in IP • Blaster virus author left clues in code and was eventually caught • What if we could trace packets back to their point of origin?

  6. Packet Tracing

  7. Packet Tracing • Monitoring applications currently exist • Ethereal, tcpdump, ngrep, etc. • Only work with untampered packets • Worms, viruses, and spam are sent with spoofed IPs from compromised computers • Need solutions that trace all packets

  8. Preliminary Solutions • Routers add identifiers to the packet as it moves along the Internet • Packet size increases with every hop • Effective throughput decreases very quickly • Routers keep a log of all the packets that have been routed • Large overhead required of all routers • Huge database containing packet information • When should you clear packet information?

  9. Benchmarks • Effect on throughput • Amount of overhead added to the packets • False positive rate • Percentage of paths traced back to the incorrect source • Computational intensity • Time required to trace an attack • Amount of data required to trace an attack • CPU/memory usage on router

  10. Benchmarks • Traceback’s effect on network • Does it flood? • Susceptibility to spoofing • Collisions • For hash-based traceback methods

  11. Some Assumptions • Attackers can create/spoof any packet • Packets from an attack may take different routes to victim • Attacker-victim routes are stable • Routers are not compromised

  12. Packet Marking

  13. Packet Marking • Add information to the packets so that paths can be retraced to original source • Methods for marking packets • Probabilistic • Node Marking • Edge Marking • Deterministic

  14. Probabilistic Packet Marking (PPM) • Using probability, router marks a packet • With router IP address (node marking) • With edge of paths (edge marking) • Node marking • 95% accuracy, requires ~300,000 packets • Edge marking • More state information required, converges much faster

  15. PPM Nodes • Each router writes its address in a 32-bit field only with probability p • Address field can be overwritten by routers closer to the victim • Probability of seeing the mark of a router d hops away is p(1-p)^(d-1) • Need many packets before we see a mark from a distant router
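The node-marking scheme above can be sketched in a few lines of Python. This is a simulation under the slide's assumptions; the router names and the dict standing in for a header field are illustrative, not part of any real implementation:

```python
import random

def ppm_node_mark(path, p=0.04):
    """Simulate PPM node marking along a router path.

    Each router overwrites the single 32-bit mark field with
    probability p; routers closer to the victim appear later in
    the path, so they can overwrite earlier marks.
    """
    mark = None
    for router in path:          # ordered attacker -> victim
        if random.random() < p:
            mark = router
    return mark

def p_mark_survives(p, d):
    """Probability the surviving mark belongs to the router d hops
    from the victim: it must mark (p), and none of the d-1 closer
    routers may overwrite it ((1-p)^(d-1))."""
    return p * (1 - p) ** (d - 1)

# R10 is farthest from the victim (10 hops), R1 is adjacent
path = [f"R{i}" for i in range(10, 0, -1)]
print(p_mark_survives(0.04, 10))  # ~0.0277: distant marks are rare
```

Running `ppm_node_mark` many times and counting how often the farthest router's mark survives converges to `p_mark_survives(p, d)`, which is why so many packets are needed before a distant router is observed.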

  16. PPM Nodes – Pros • Not every packet is marked • Lower overhead on routers • Higher throughput (packet size remains small) • Fixed space is required for the packets • Packet size + 32 bits

  17. PPM Nodes - Cons • Large number of false positives • DDoS with 25 hosts requires several days and has thousands of false positives • Slow convergence rate • For 95% success, we need 300,000 packets • Attacker can still inject modified packets into PPM network (mark spoofing) • This is only for a single attacker

  18. PPM Edge Sampling • Reserve distance field and two 32-bit address fields (“start” and “end”) • If router decides to mark a packet, writes its address in “start” field and zeroes the distance field • When a router sees a zero in the distance field, it writes its address in the “end” field • If a router decides not to mark a packet, increments distance field • Must use saturating addition (distance field has limit)
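The per-router edge-sampling step described above can be sketched as follows. This is a minimal sketch: a dict stands in for the overloaded header fields, and the 5-bit distance width is an assumption, not taken from the slides:

```python
import random

SATURATE = 31  # assumed 5-bit distance field saturates here

def edge_sample(packet, router_addr, p=0.04):
    """One router's PPM edge-sampling step.

    With probability p the router starts a new edge; otherwise it
    completes an edge if it sees distance 0, and increments the
    distance with saturating addition either way.
    """
    if random.random() < p:
        packet["start"] = router_addr      # begin a new edge
        packet["distance"] = 0
    else:
        if packet["distance"] == 0:
            packet["end"] = router_addr    # complete the edge
        # saturating addition so the distance field cannot wrap
        packet["distance"] = min(packet["distance"] + 1, SATURATE)
    return packet
```

The victim groups the received (start, end, distance) triples by distance to recover each edge of the attack path, then chains the edges together into the full route.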

  19. PPM Edge Sampling • Max packets to reconstruct an attack is ln(d) / (p(1-p)^(d-1)) • Requires fewer packets than when marking nodes • Edge sampling allows reconstruction of the whole attack tree • Packets have additional overhead • Encoding start, end, and distance eliminates compatibility with networks not using PPM
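Plugging representative values into the bound above gives a feel for the convergence speed; p = 0.04 and a 10-hop path are illustrative choices, not figures from the slides:

```python
import math

def expected_packets(p, d):
    """Bound on packets the victim needs to reconstruct a path of
    length d under edge sampling: ln(d) / (p * (1-p)^(d-1))."""
    return math.log(d) / (p * (1 - p) ** (d - 1))

# For a marking probability p = 0.04 and a 10-hop path:
print(round(expected_packets(0.04, 10)))  # ~83 packets
```

This is orders of magnitude below the ~300,000 packets quoted for node marking at 95% accuracy, which is the sense in which edge sampling "converges much faster".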

  20. Deterministic Packet Marking (DPM) • Every packet is marked • Spoofed marks are overwritten with correct marks

  21. DPM • Incoming packets are marked • Outgoing packets are unaltered • Requires more overhead than PPM • Less computation required • Probability of generating ingress IP address (1-p)^(d-1)

  22. DPM • The 32-bit ingress address is split into two 16-bit halves (bits 0-15 and 16-31) plus a flag • Each packet carries one of the two halves, chosen with probability 0.5 • Flag is set to 1 when the high-order half is used • Only part of the address is visible to the attacker in any one packet • Can be made more secure by using non-uniform probability distributions
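The half-address marking and the victim-side recovery can be sketched as below. Field names and the dict packet representation are illustrative:

```python
import random

def dpm_mark(packet, ingress_ip, flip=None):
    """DPM at the ingress edge router: every packet carries one
    16-bit half of the ingress address plus a 1-bit flag saying
    which half it is (flag = 1 for the high-order bits)."""
    if flip is None:
        flip = random.random() < 0.5
    if flip:
        packet["mark"] = (ingress_ip >> 16) & 0xFFFF  # high half
        packet["flag"] = 1
    else:
        packet["mark"] = ingress_ip & 0xFFFF          # low half
        packet["flag"] = 0
    return packet

def dpm_recover(marks):
    """Victim side: one packet of each kind rebuilds the address."""
    hi = next(m["mark"] for m in marks if m["flag"] == 1)
    lo = next(m["mark"] for m in marks if m["flag"] == 0)
    return (hi << 16) | lo
```

Because the ingress router overwrites the mark field on every packet, spoofed marks injected by the attacker never reach the victim, which is the basis for the zero-false-positive claim.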

  23. DPM • Claimed to have 0 false positives • Claimed to converge very quickly • 99% probability of success with 7 packets • 99.9% probability of success with only 10 packets • Has not been tested on large networks • Cannot deal with NAT
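Assuming a single ingress router, the convergence claim can be sanity-checked with a two-coupon model: the victim needs at least one packet carrying each half, which occurs with probability 1 - 2(0.5)^n after n marked packets. This simplified model is an assumption of this sketch, not the paper's exact analysis:

```python
def p_both_halves(n):
    """Probability that n independently marked packets include at
    least one low-half and one high-half mark, each half being
    chosen with probability 0.5 (single ingress router assumed)."""
    return 1 - 2 * 0.5 ** n

print(p_both_halves(7))   # ~0.984
print(p_both_halves(10))  # ~0.998
```

The results are in line with the quoted figures of 99% success with 7 packets and 99.9% with 10.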

  24. HASH-BASED TRACEBACK Source Path Isolation Engine (SPIE)

  25. SPIE - Overview • Each router along a packet’s transmission path computes a set of hash-codes (digests) associated with each packet • The time-tagged digests are stored in router memory for some time period • Limited by available router resources • Traceback is initiated only by “authenticated agent requests” to the SPIE Traceback Manager (STM) • Executed by means of a broadcast message • Results in the construction of a complete attack graph within the STM

  26. SPIE - Assumptions • Packets may be addressed to multiple destinations • Attackers are aware they are being traced • Routers may be subverted, but not often • Routing within the network may be unstable • Traceback must deal with divergent paths • Packet size should not grow as a result of traceback • 1 byte increase in size = 1% increase in resource use • Very controversial … self-enabling assumption • End hosts may be resource constrained • Traceback is an infrequent operation • Broadcast messages can have a significant impact on internet performance • Traceback should return entire path, not just source

  27. SPIE - Architecture • DGA (Data Generation Agent): resident in SPIE-enhanced routers to produce digests and store them in time-stamped digest tables. Implemented as software agents, interface cards, or dedicated aux boxes • STM (SPIE Traceback Manager): controls the SPIE system. Verifies the authenticity of a traceback request, dispatches the request to the appropriate SCARs, gathers the regional attack graphs, and assembles the complete attack graph • SCAR (SPIE Collection and Reduction Agent): data concentration point for a regional area. When traceback is requested, SCARs initiate a broadcast request for traceback and produce regional attack graphs based upon data from their constituent DGAs

  28. SPIE - Hashing • Multiple hash-codes (different groupings of fields) are calculated for each packet based on 24 relatively invariant fields within the first 32 bytes of the packet • Masked fields are not used in the hash-code calculation • A packet is declared received only if all hashes are positive • Hash functions can be simple (no cryptographic hardness required) and relatively fast • Observed digest collision rates: 0.139% on a LAN trace, 0.00092% on a WAN trace
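A digest table of this kind is essentially a Bloom filter. The sketch below uses a salted SHA-256 standing in for SPIE's hash family (any uniform hash works, since no cryptographic hardness is required); the table parameters and the raw-bytes input are illustrative assumptions:

```python
import hashlib

class DigestTable:
    """Sketch of a per-router SPIE digest table as a Bloom filter.

    The caller passes the packet's invariant bytes (in SPIE, the
    invariant header fields plus a prefix of the payload); the
    table records k bit positions per packet.
    """
    def __init__(self, m_bits=1 << 16, k=3):
        self.bits = bytearray(m_bits // 8)
        self.m = m_bits
        self.k = k

    def _indices(self, invariant_bytes):
        # k salted hashes of the same input -> k bit positions
        for i in range(self.k):
            h = hashlib.sha256(bytes([i]) + invariant_bytes).digest()
            yield int.from_bytes(h[:4], "big") % self.m

    def insert(self, invariant_bytes):
        for idx in self._indices(invariant_bytes):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def seen(self, invariant_bytes):
        # "Packet was received" only if ALL k bits are set;
        # false positives are possible, false negatives are not
        return all(self.bits[i // 8] >> (i % 8) & 1
                   for i in self._indices(invariant_bytes))
```

Only bit positions are stored, never the payload, which is why storage stays low and the scheme does not aid eavesdropping; the trade-off is the small false-positive (collision) rate quoted above.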

  29. SPIE – Implementation Issues • PRO • Single-packet tracing is feasible • Automated processing by SPIE-enhanced routers makes spoofing difficult at best • Relatively low storage required • Only digests and timestamps are stored • Does not aid in eavesdropping of payload data • Payload is not stored • CON • Requires specially configured (SPIE-enhanced) routers • Probability of detection is directly related to the number of available SPIE-enhanced routers in the network in question • Storage in routers limits the window of time in which a packet may be successfully traced • Some filtering of the packets to be digested may be needed • May have the appearance of a loss of anonymity across the Internet

  30. Conclusions • DoS, worms, viruses continuously becoming more dangerous • Attacks must be shut down quickly and be traceable • Integrating traceback into next generation Internet is critical

  31. Conclusions • Probabilistic Packet Marking • Keeps low packet overhead • Not 100% accurate, traceback is slow • Deterministic Packet Marking • No false positives • Much higher packet overhead, needs more testing • Hash-based Traceback • No packet overhead • New, more capable routers

  32. Conclusions • Cooperation is required • Routers must be built to handle new tracing protocols • ISPs must provide compliance with protocols • Internet is no longer anonymous • Some issues must still be solved • NATs • Collisions

  33. References • Belenky, A., Ansari, N. “IP Traceback with Deterministic Packet Marking”. IEEE Communications Letters, April 2003. • Savage, S., et al. “Practical Network Support for IP Traceback”. Department of Computer Science, University of Washington. • Snoeren, A., Partridge, C., et al. “Single-Packet IP Traceback”. IEEE/ACM Transactions on Networking, December 2002.
