
An Effective Defense Against Spam Laundering

An Effective Defense Against Spam Laundering. Mengjun Xie, Heng Yin, Haining Wang Presented by Dustin Christmann March 4, 2009. Outline. Introduction Spam Laundering Anti-Spam Techniques Proxy-Based Spam Behavior DBSpam DBSpam Evaluation Potential Evasions. Introduction. What is spam?


Presentation Transcript


  1. An Effective Defense Against Spam Laundering Mengjun Xie, Heng Yin, Haining Wang Presented by Dustin Christmann March 4, 2009

  2. Outline • Introduction • Spam Laundering • Anti-Spam Techniques • Proxy-Based Spam Behavior • DBSpam • DBSpam Evaluation • Potential Evasions

  3. Introduction What is spam? Classic definition: a canned precooked meat product made by the Hormel Foods Corporation, introduced in 1937. “SPAM” stands for “SPiced hAM.” Modern definition: the abuse of electronic messaging systems to send unsolicited bulk messages indiscriminately.

  4. Introduction So how did we get from one definition to the other? A 1970 Monty Python sketch, entitled “Spam.”

  5. Spam Laundering [Figure: two laundering paths, labeled Email relay to MTA and Proxy to MTA]

  6. Anti-Spam Techniques Three main categories: • Recipient-oriented techniques • Sender-oriented techniques • HoneySpam

  7. Recipient-oriented Techniques Two main categories: • Content-based techniques • Non-content-based techniques

  8. Content-Based Techniques • Email address filters • Heuristic filters • Machine-learning based filters

  9. Non-content-based Techniques • DNSBLs • MARID • Challenge-Response • Tempfailing • Delaying • Sender Behavior Analysis
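
To make the first item concrete: a DNSBL is queried over ordinary DNS by reversing the octets of the sender's IPv4 address and appending the list's zone. The sketch below only builds that query name; it does not perform the lookup. The zone `dnsbl.example.org` is a placeholder, not a real list, and the function name is made up for illustration.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to look up an IPv4 address in a DNSBL zone."""
    # Validate the input; raises ValueError for anything that is not IPv4.
    addr = ipaddress.IPv4Address(ip)
    # 192.0.2.99 is looked up as 99.2.0.192.<zone>.
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # Placeholder zone and a documentation-range address.
    print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A receiving MTA would then resolve this name; by convention, an answer means the address is listed and a name-not-found response means it is not.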

  10. Sender-oriented Techniques • Usage regulation • Cost-based approaches

  11. HoneySpam • Based on honeyd • Set up • Fake web servers • Fake open proxies • Fake relays • Log the users of these fake servers as spam sources

  12. Proxy-based Spam Behavior: Normal email transmission [Figure labels: MTA, Corporate / campus / home network, Router]

  13. Proxy-based Spam Behavior: Proxy-based spam [Figure labels: MTA, Router, Corporate / campus / home network]

  14. Connection Correlation • With a spam proxy, one-to-one mapping between upstream and downstream connections • In normal email transmission, there is only one connection, so there is nothing to map • Problems • Upstream encryption • Overhead • Timing

  15. Packet Symmetry • Message symmetry • SMTP message from downstream connection results in TCP message to upstream connection • Packet symmetry • One packet from downstream connection results in one packet to upstream connection • Exceptions
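
The packet symmetry check can be sketched as follows. This is a minimal illustration, not the presenters' or authors' implementation: it assumes packet timestamps have already been extracted for one candidate connection pair, and the matching window of 0.5 seconds and the function name are assumptions made for the example.

```python
from bisect import bisect_right


def symmetry_observations(reply_times, upstream_times, window=0.5):
    """Return one 0/1 observation per SMTP reply packet.

    An observation is 1 when exactly one upstream TCP packet is seen
    within `window` seconds after the reply (1:1 symmetry), else 0.
    """
    upstream = sorted(upstream_times)
    observations = []
    for t in reply_times:
        # Count upstream packets with timestamps in (t, t + window].
        first = bisect_right(upstream, t)
        last = bisect_right(upstream, t + window)
        observations.append(1 if last - first == 1 else 0)
    return observations


if __name__ == "__main__":
    replies = [1.0, 2.0, 3.0]            # SMTP replies arriving from the MTA
    upstream = [1.05, 2.02, 2.03, 9.0]   # packets leaving toward the remote host
    print(symmetry_observations(replies, upstream))
```

Requiring exactly one matching packet is a deliberate choice here: it mirrors the 1:1 definition on the slide, and it also makes clear why fragmenting a reply into two packets would defeat this particular check.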

  16. TCP Correlation Example

  17. DBSpam Goals: • Fast detection of spam laundering with high accuracy • Breaking spam laundering via throttling or blocking after detection • Support for spammer tracking and law enforcement • Support for spam message fingerprinting • Support for global forensic analysis

  18. Deployment of DBSpam • At a network vantage point where it can monitor the bi-directional traffic Single-homed network:

  19. Deployment of DBSpam Multi-homed network

  20. Design of Spam Laundering Detection • With proxy-based spam transmission, number of incoming SMTP reply packets = number of outgoing TCP packets • Possible for this to occur with normal traffic, but very seldom • Sequential Probability Ratio Test (SPRT) is used

  21. SPRT • Can be viewed as a one-dimensional “random walk” starting between two boundaries • One boundary defines “spam connection” • Other boundary defines “not a spam connection”

  22. SPRT • Each observation pushes the walk in one direction or the other • Observation of correlated SMTP-TCP packets pushes walk toward “spam connection” • Observation of no correlation pushes walk toward “not a spam connection” • When the walk hits either boundary, test ends
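
The random walk described on these slides can be sketched with Wald's standard SPRT for 0/1 observations. This is an illustrative sketch, not the detection algorithm from the presentation: the boundaries use Wald's usual approximations, and the function name and the values of `theta0` and `alpha` in the usage example are assumptions (only θ1 = 0.99 and β* = 0.01 appear on the slides).

```python
import math


def sprt(observations, theta0, theta1, alpha, beta):
    """Run Wald's SPRT over 0/1 observations.

    theta0 / theta1: probability of a correlated observation under
    "not spam" (H0) / "spam" (H1). alpha / beta: desired false positive
    and false negative probabilities.
    Returns (decision, number_of_observations_used).
    """
    upper = math.log((1 - beta) / alpha)   # "spam connection" boundary
    lower = math.log(beta / (1 - alpha))   # "not a spam connection" boundary
    step_up = math.log(theta1 / theta0)                # taken on a correlation
    step_down = math.log((1 - theta1) / (1 - theta0))  # taken on no correlation
    walk = 0.0
    for n, x in enumerate(observations, start=1):
        walk += step_up if x else step_down
        if walk >= upper:
            return "spam", n
        if walk <= lower:
            return "not spam", n
    return "undecided", len(observations)


if __name__ == "__main__":
    # theta0=0.1 and alpha=0.005 are assumed values for illustration.
    print(sprt([1, 1, 1, 1], theta0=0.1, theta1=0.99, alpha=0.005, beta=0.01))
    print(sprt([0, 0, 0], theta0=0.1, theta1=0.99, alpha=0.005, beta=0.01))
```

With these assumed parameters, a handful of consistent observations is enough to reach either boundary, which is the property that makes the test attractive for fast detection.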

  23. SPRT • Average number of required observations to reach a determination depends on four variables: • α* (the desired probability of false positives) • β* (the desired probability of false negatives) • θ1 (the probability of observing a correlation when the connection is spam, hypothesis H1) • θ0 (the probability of observing a correlation when the connection is not spam, hypothesis H0)
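
How the four variables combine can be illustrated with Wald's textbook approximation for the expected number of observations when the connection really is spam. This is the standard formula for a 0/1 SPRT, offered as a sketch; the slides do not state which expression was used for the plot, and the function name and the `theta0` and `alpha` values below are assumptions.

```python
import math


def expected_observations_h1(theta0, theta1, alpha, beta):
    """Wald's approximation of E[N | H1] for a 0/1 SPRT."""
    upper = math.log((1 - beta) / alpha)
    lower = math.log(beta / (1 - alpha))
    # Average step of the walk per observation when H1 is true.
    drift = (theta1 * math.log(theta1 / theta0)
             + (1 - theta1) * math.log((1 - theta1) / (1 - theta0)))
    # Under H1 the walk ends at the upper boundary with probability 1 - beta.
    return ((1 - beta) * upper + beta * lower) / drift


if __name__ == "__main__":
    for theta0 in (0.1, 0.3, 0.5):  # assumed values for illustration
        n = expected_observations_h1(theta0, theta1=0.99, alpha=0.005, beta=0.01)
        print(f"theta0={theta0}: about {n:.1f} observations")
```

The closer θ0 is to θ1, the smaller each step of the walk and the more observations are needed, which is the trend a plot of E[N|H1] against θ0 would show.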

  24. SPRT E[N|H1] vs. θ0 and α* (θ1 = 0.99, β* = 0.01)

  25. SPRT Detection Algorithm

  26. Noise Reduction • Maintain a set of external IP addresses that appear in each time window • In the consecutive M time windows, single out the external IP addresses that appear at least K times • Can further reduce the incidence of false positives dramatically, depending on the selection of M and K
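
The K-of-M rule above can be sketched with a sliding window of per-window IP sets. This is a minimal illustration under assumptions: the class name is made up, the addresses are documentation-range placeholders, and the slides do not say how the original system stores its windows.

```python
from collections import Counter, deque


class NoiseFilter:
    """Report external IPs flagged in at least k of the last m time windows."""

    def __init__(self, m: int, k: int):
        self.k = k
        # A deque with maxlen=m silently drops the oldest window.
        self.windows = deque(maxlen=m)

    def add_window(self, flagged_ips):
        """Record one window's flagged IPs and return those passing the rule."""
        self.windows.append(set(flagged_ips))
        counts = Counter(ip for window in self.windows for ip in window)
        return {ip for ip, seen in counts.items() if seen >= self.k}


if __name__ == "__main__":
    noise_filter = NoiseFilter(m=3, k=2)
    print(noise_filter.add_window({"203.0.113.5", "203.0.113.9"}))
    print(noise_filter.add_window({"203.0.113.5"}))
    print(noise_filter.add_window({"198.51.100.7"}))
```

An address that is flagged once by chance never reaches the threshold, while a persistent spam source keeps reappearing and is reported.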

  27. Noise reduction

  28. DBSpam Evaluation • Evaluation at College of William & Mary • Two off-campus PCs as spam sources • Two PCs in different campus subnets running SOCKS and HTTP proxies • Spam “sink” in dark net • Traces run in two different months • N-* traces include no spam traffic • S-*-C traces contain encrypted spam; S-*-A and S-*-B traces contain unencrypted spam

  29. DBSpam Evaluation SPRT Detection Time

  30. DBSpam Evaluation Distribution of N|H0

  31. DBSpam Evaluation CDF of Detection Time for SPRT

  32. DBSpam Evaluation Accuracy of SPRT

  33. DBSpam Evaluation Accuracy of SPRT after noise reduction

  34. DBSpam Evaluation Resource Consumption

  35. Potential Evasions • Fragmenting SMTP replies at the proxy • Change the 1:1 packet symmetry into 1:2 or 1:3 • Inserting random delays at the proxy • Randomly change the 1:1 packet symmetry into 1:0 or 1:2
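
One way to reason about these evasions is to ask how often a proxy must break packet symmetry before an SPRT-style walk stops drifting toward the spam boundary. The sketch below computes the average step of the walk when the real correlation rate is `p`. It assumes Wald's standard 0/1 SPRT, and the function names and the value θ0 = 0.1 are illustrative assumptions, not figures from the presentation.

```python
import math


def walk_drift(p, theta0, theta1):
    """Average SPRT step per observation when correlations occur with rate p."""
    step_up = math.log(theta1 / theta0)
    step_down = math.log((1 - theta1) / (1 - theta0))
    return p * step_up + (1 - p) * step_down


def break_even_rate(theta0, theta1):
    """Correlation rate at which the walk has zero average drift."""
    step_up = math.log(theta1 / theta0)
    step_down = math.log((1 - theta1) / (1 - theta0))
    return -step_down / (step_up - step_down)


if __name__ == "__main__":
    theta0, theta1 = 0.1, 0.99  # theta0 is an assumed value
    print(f"break-even correlation rate: {break_even_rate(theta0, theta1):.2f}")
    for p in (0.99, 0.8, 0.5):
        print(f"p={p}: drift {walk_drift(p, theta0, theta1):+.2f}")
```

Below the break-even rate the walk tends toward “not a spam connection” even though the traffic is laundered spam, which is why randomly breaking symmetry is listed as an evasion.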

  36. Strengths • Simple to implement • Moves spam detection closer to source, reducing network traffic • Thwarts encryption • Detects proxy-based spam quickly • Few false positives

  37. Weaknesses • Easy to evade by breaking packet symmetry • Can be thwarted by short SMTP dialogs • Must be installed at ISP edge • Too resource intensive for embedded systems
