
Detection and Identification of Network Anomalies Using Sketch Subspaces

Detection and Identification of Network Anomalies Using Sketch Subspaces. Xin Li, Fang Bian, Mark Crovella, Christophe Diot, Ramesh Govindan, Gianluca Iannaccone, and Anukool Lakhina. ACM Internet Measurement Conference 2006. Speaker: Chang Huan Wu, 2009/5/1.


Presentation Transcript


  1. Detection and Identification of Network Anomalies Using Sketch Subspaces Xin Li, Fang Bian, Mark Crovella, Christophe Diot, Ramesh Govindan, Gianluca Iannaccone, and Anukool Lakhina ACM Internet Measurement Conference 2006 Speaker: Chang Huan Wu 2009/5/1

  2. Outline • Introduction • Previous Approach • Defeat • Evaluation • Conclusions

  3. Introduction (1/3) • Unusual traffic patterns arise from network abuse as well as from legitimate activity • These traffic anomalies are often difficult to detect at a single link and require scrutiny of the entire network

  4. Introduction (2/3) • Characterizing “normal” traffic using an IP-flow representation is intractable • The dimension is too high • Reduce the dimension, then identify anomalies

  5. Introduction (3/3) • Previous work aggregates NetFlow records into origin-destination (OD) flows • This work modifies that approach to increase the detection rate, reduce false alarms, and identify the IP flows responsible for an anomaly [Figure: network of Points of Presence (PoPs) connected by links]

  6. Previous Approach • Reference: Anukool Lakhina, Mark Crovella, Christophe Diot, "Mining Anomalies Using Traffic Feature Distributions," In ACM SIGCOMM 2005

  7. Volume vs. Traffic Feature Distribution • Volume-based detection schemes have been successful in isolating large traffic changes • But a large number of anomalies do NOT cause detectable disruptions in traffic volume • Using traffic feature distributions • Augments volume-based anomaly detection • Traffic distributions can reveal valuable information about the structure of anomalies

  8. Port scan anomalies viewed in terms of traffic volume and in terms of entropy Port scan dwarfed in volume metrics… But stands out in feature entropy

  9. Traffic Feature Distributions • Anomalies can be detected and distinguished by inspecting traffic features: • 4-tuple: SrcIP, SrcPort, DstIP, DstPort

  10. Entropy-based scheme • In the volume-based scheme, the variable is the number of packets or bytes per time slot. • In the entropy-based scheme, the variable is the entropy of every traffic feature in every time slot. • This gives us a three-way data matrix H. • H(t, p, k) denotes the entropy of traffic feature k of OD flow p at time t. • To apply the subspace method, we need to unfold H into a single-way representation.
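
The per-slot entropy can be illustrated with a small sketch. It assumes the empirical Shannon entropy of the observed feature values, which the slide does not spell out; `sample_entropy` and the port lists are hypothetical names and data used only for illustration.

```python
import math
from collections import Counter

def sample_entropy(values):
    """Empirical Shannon entropy (in bits) of a list of feature values."""
    counts = Counter(values)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical destination ports seen in one time slot, one entry per packet.
normal_ports = [80, 80, 80, 443, 80, 443, 80, 80]
scan_ports = list(range(1000, 1008))  # a port scan touches many distinct ports

print(sample_entropy(normal_ports))
print(sample_entropy(scan_ports))
```

Both slots carry the same number of packets, so a volume metric cannot tell them apart, while the dispersed port distribution of the scan yields a higher entropy.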

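  11. Subspace Decomposition • Normal subspace $S$: the first k principal components • Anomalous subspace $\tilde{S}$: the remaining principal components • Then decompose the traffic on all links by projecting onto $S$ and $\tilde{S}$ to obtain $\mathbf{y} = \hat{\mathbf{y}} + \tilde{\mathbf{y}}$, where $\mathbf{y}$ is the traffic vector at a particular point in time, $\hat{\mathbf{y}}$ is the normal traffic vector, and $\tilde{\mathbf{y}}$ is the residual traffic vector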

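  12. Geometric illustration • In general, anomalous traffic results in a large value of the residual norm $\|\tilde{\mathbf{y}}\|$ • Use $\|\tilde{\mathbf{y}}\|^2$ to identify whether the traffic is anomalous [Figure: traffic vector y plotted against traffic on link 1 and traffic on link 2]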
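
The subspace decomposition can be sketched with NumPy as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function name `residual_energy`, the choice k = 1, and the injected spike are all assumptions, and the statistical threshold on the residual is not shown.

```python
import numpy as np

def residual_energy(Y, k):
    """Squared norm of the residual of each row of Y (time x links)."""
    Yc = Y - Y.mean(axis=0)                    # center each column
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    P = Vt[:k].T                               # first k principal directions
    Y_hat = Yc @ P @ P.T                       # part in the normal subspace
    Y_res = Yc - Y_hat                         # part in the anomalous subspace
    return np.sum(Y_res ** 2, axis=1)

# Synthetic traffic: four links sharing one periodic trend plus small noise.
rng = np.random.default_rng(0)
t = np.arange(200)
trend = np.sin(2 * np.pi * t / 50)
Y = np.outer(trend, [1.0, 2.0, 0.5, 1.5]) + 0.01 * rng.standard_normal((200, 4))
Y[120, 2] += 1.0                               # hypothetical anomaly at t = 120

spe = residual_energy(Y, k=1)
print(int(np.argmax(spe)))                     # time bin with the largest residual
```

The shared trend is captured by the first principal component, so the spike shows up almost entirely in the residual.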

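  13. Multiway Subspace Method (multi-way to single-way) • Decompose H into a single-way matrix • Now apply the usual subspace decomposition (PCA) • Every row $\mathbf{h}$ of the matrix will be decomposed into $\mathbf{h} = \hat{\mathbf{h}} + \tilde{\mathbf{h}}$ (normal part plus residual part)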
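
One way to unfold the three-way matrix is to place the per-feature time-by-flow matrices side by side. The sketch below assumes that layout and an array indexed as H[t, p, k]; `unfold` is a hypothetical helper, and any per-feature normalization is omitted.

```python
import numpy as np

def unfold(H):
    """Unfold H[t, p, k] into a t x (k*p) matrix, one block of p columns per feature."""
    t, p, k = H.shape
    return np.concatenate([H[:, :, j] for j in range(k)], axis=1)

# Toy example: 2 time slots, 3 OD flows, 4 traffic features.
H = np.arange(2 * 3 * 4).reshape(2, 3, 4)
U = unfold(H)
print(U.shape)
```

With this layout, the entropy of feature j for OD flow q at time t sits in column j*p + q of row t, so each row of the unfolded matrix describes the whole network at one time slot.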

  14. Defeat (1/2) • Use random aggregations of IP flows (sketches) • Hash each IP flow with several different hash functions (h1, h2, …) [Figure: at routers R1 and R2, hash functions h1 to h5 each map flows into s buckets; the SrcIP entropy under each hash function is recorded for time slots t1 … tn]
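
The sketching step can be illustrated as follows. This is a rough sketch under stated assumptions: seeded SHA-256 stands in for the family of hash functions, the flow key is taken to be a (source, destination) pair, and only the SrcIP entropy is computed; `bucket`, `entropy`, and `sketch_entropies` are hypothetical names.

```python
import hashlib
import math
from collections import Counter, defaultdict

def bucket(key, seed, s):
    """Map a flow key to one of s buckets; the seed selects the hash function."""
    digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % s

def entropy(counts):
    """Empirical Shannon entropy (bits) of a Counter; 0.0 if it is empty."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def sketch_entropies(flows, seed, s):
    """flows: iterable of (flow_key, src_ip). Returns one entropy per bucket."""
    per_bucket = defaultdict(Counter)
    for key, src_ip in flows:
        per_bucket[bucket(key, seed, s)][src_ip] += 1
    return [entropy(per_bucket[b]) for b in range(s)]

# Hypothetical flows in one time slot: 20 sources talking to one destination.
flows = [(("10.0.0.%d" % i, "192.0.2.1"), "10.0.0.%d" % i) for i in range(20)]
rows = [sketch_entropies(flows, seed, s=4) for seed in range(5)]  # 5 hash functions
```

Each entry of `rows` is one time slot's row of the entropy matrix for one hash function; repeating this over time slots gives the matrices to which the subspace method is applied.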

  15. Defeat (2/2) • Apply the multiway subspace method to the sketches of each hash function • Across all m hash functions, count how many identify an anomaly • Voting approach [Figure: for hash function h1, a matrix of entropies with rows t1 … tn and columns SrcIP, SrcPort, DstIP, DstPort]
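
The voting step can be sketched as below, assuming each hash function has already produced a set of time slots it flags as anomalous. The function name `vote`, the alarm sets, and the threshold value are illustrative assumptions.

```python
from collections import Counter

def vote(alarms, min_votes):
    """alarms: list of m sets of flagged time slots, one set per hash function.
    Returns the time slots flagged by at least min_votes hash functions."""
    counts = Counter(t for flagged in alarms for t in flagged)
    return sorted(t for t, c in counts.items() if c >= min_votes)

# Hypothetical output of m = 5 hash functions.
alarms = [{3, 7}, {3}, {3, 9}, {3, 7}, {7}]
print(vote(alarms, min_votes=3))
```

Requiring several votes suppresses alarms raised by a single unlucky aggregation, such as time slot 9 here.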

  16. Identify Anomalies • For each hash function, find the element (bucket) that is identified as anomalous • The intersection of the key sets over all hash functions that raised an alarm identifies, with high likelihood, the keys of the IP flows that caused the anomaly [Figure: at time t1, the entropies under h1 to h4, each over s buckets]
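
The identification step amounts to a set intersection. The sketch below assumes that, for every hash function that raised an alarm, the keys of the flows in its anomalous bucket are available as a set; `identify` and the flow labels are hypothetical.

```python
def identify(key_sets):
    """Intersect the flow-key sets of the anomalous buckets of all alarming hash functions."""
    if not key_sets:
        return set()
    return set.intersection(*map(set, key_sets))

# Hypothetical key sets from three hash functions that raised an alarm.
alarmed = [
    {"f1", "f4", "f9"},
    {"f4", "f7"},
    {"f2", "f4", "f8"},
]
print(identify(alarmed))
```

Because each hash function groups flows differently, an innocent flow is unlikely to share the anomalous bucket under every hash function, which is why the intersection narrows down to the responsible keys.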

  17. Evaluation (1/2)

  18. Evaluation (2/2) • 5 or 6 hash functions are enough • If m is the number of hash functions, m−2 or more votes may be enough

  19. Conclusion • Uses multiple random traffic projections to robustly detect anomalies • Higher detection rate and fewer false alarms • Able to automatically infer the IP flows responsible for an anomaly

  20. Comments • Can only handle offline data • Can other fields in the packet header be used for anomaly detection?
