Lightweight Application Classification for Network Management

Hongbo Jiang, Case Western Reserve University. Andrew W. Moore, University of Cambridge. Zihui Ge, Adverplex Inc. Shudong Jin, Case Western Reserve University. Jia Wang, AT&T Labs - Research.

Presentation Transcript


  1. Lightweight Application Classification for Network Management • Hongbo Jiang, Case Western Reserve University • Andrew W. Moore, University of Cambridge • Zihui Ge, Adverplex Inc. • Shudong Jin, Case Western Reserve University • Jia Wang, AT&T Labs - Research • ACM SIGCOMM Workshop on Internet Network Management (INM), Kyoto, Japan, August 31, 2007

  2. Why do Network Traffic Classification? • Network planning • Traffic engineering • Accounting and billing • Security profiling • …

  3. Our Contribution • A lightweight application classification scheme based on NetFlow data • Evaluation & Sensitivity Analysis • Trivial features • Derivative features • Training-set size • Packet sampling

  4. Flow-level Traffic Classification • Previous traffic classification methods use features derived from streams of packets • They can achieve good accuracy (e.g., 95%) • They have high complexity and cost • Flow-level statistics are commonly available (Cisco NetFlow, Juniper cflowd, Huawei NetStream, …) • Sampling further reduces the cost

  5. Probabilistic Method Example [Figure: in training, a probability box is built from object characteristics, the class of membership, and a prior; in use, the probability box takes object characteristics of unknown class (?) and a prior, and outputs a probability of membership (an estimate of membership). Values shown in the figure: Pr = .15, Pr = .33, Pr = .97.]
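The train-then-use flow in the figure can be sketched with a small naive Bayes classifier. This is only an illustration: the slide says "probabilistic method" without naming the estimator, so the Gaussian likelihoods, the feature choice (average packet size, duration), and the toy flows below are all assumptions rather than the authors' setup.

```python
import math
from collections import defaultdict

def train(samples):
    """Fit per-class prior, feature means and variances from (features, label) pairs."""
    by_class = defaultdict(list)
    for x, y in samples:
        by_class[y].append(x)
    total = len(samples)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        cols = list(zip(*rows))
        means = [sum(col) / n for col in cols]
        # Small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                     for col, m in zip(cols, means)]
        model[label] = (n / total, means, variances)
    return model

def posterior(model, x):
    """Return {label: P(label | x)}, treating features as independent given the class."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        s = math.log(prior)
        for v, m, var in zip(x, means, variances):
            s += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_scores[label] = s
    top = max(log_scores.values())  # subtract the max for numerical stability
    unnorm = {k: math.exp(v - top) for k, v in log_scores.items()}
    z = sum(unnorm.values())
    return {k: v / z for k, v in unnorm.items()}

# Hypothetical training flows: [average packet size in bytes, duration in seconds]
training = [
    ([1200.0, 2.0], "web"), ([1100.0, 3.0], "web"), ([1300.0, 2.5], "web"),
    ([300.0, 20.0], "mail"), ([350.0, 25.0], "mail"), ([280.0, 22.0], "mail"),
]
model = train(training)
probs = posterior(model, [1150.0, 2.2])   # "in use": class is unknown
predicted = max(probs, key=probs.get)
print(predicted, probs)
```

The output of `posterior` plays the role of the "probability of membership" in the figure; the class with the highest value is taken as the prediction.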

  6. Our Approach (cont.) • Features ranked by importance • Use Symmetric Uncertainty (based on entropy; see the paper and references therein for details) • Ranked features allow for • sensitivity analysis, and • removal of irrelevant and redundant features
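As a rough illustration of entropy-based ranking, the sketch below uses the standard definition SU(X, Y) = 2 · I(X; Y) / (H(X) + H(Y)) for discrete values. The feature names and toy flows are hypothetical, and continuous features would need to be discretized first; the paper should be consulted for the exact procedure the authors used.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetric_uncertainty(xs, ys):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); 0 means independent, 1 means fully predictive."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    hxy = entropy(list(zip(xs, ys)))   # joint entropy H(X, Y)
    mutual_info = hx + hy - hxy        # I(X; Y)
    return 2.0 * mutual_info / (hx + hy)

# Hypothetical toy flows: class labels and three discrete features.
labels = ["web", "web", "mail", "mail", "ftp", "ftp"]
features = {
    "dst_port":   [80, 80, 25, 25, 21, 21],          # determines the class
    "tos":        [0, 0, 0, 0, 0, 0],                # constant, carries no information
    "pkt_bucket": ["s", "l", "s", "l", "s", "l"],    # independent of the class
}
scores = {name: symmetric_uncertainty(vals, labels) for name, vals in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking[0], scores)
```

Features near the bottom of such a ranking are candidates for removal, which is what makes the sensitivity analysis on later slides possible.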

  7. Evaluation • Dataset (not from AT&T!) • Full-duplex 1 Gbps access link; 1000 researchers • Data was hand-classified into a number of application classes, e.g. web browsing, email, FTP, attack, P2P, … • Focused on TCP/IP flows only • 800,000 simplex TCP/IP application-level flows (97% of traffic by byte volume) • NetFlow generation • Software simulation of a Cisco NetFlow v5 engine • Independent training and test sets • Flows randomly assigned to each

  8. Baseline and Derivative Features • Comparison: port-based: 50-70%, packet-based: 95%
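To make "derivative features" concrete, here is a minimal sketch that computes a few simple quantities from fields carried in a NetFlow v5 flow record (`srcport`, `dstport`, `dPkts`, `dOctets`, `first`, `last`). The transcript does not list which derived features the authors used, so the four below are illustrative assumptions, as are the `FlowRecord` class and the example values.

```python
from dataclasses import dataclass

@dataclass
class FlowRecord:
    """Subset of NetFlow v5 record fields (hypothetical container for this sketch)."""
    srcport: int
    dstport: int
    prot: int        # IP protocol number, 6 = TCP
    tcp_flags: int   # cumulative OR of TCP flags seen in the flow
    dPkts: int       # packets in the flow
    dOctets: int     # bytes in the flow
    first: int       # uptime in ms at the first packet
    last: int        # uptime in ms at the last packet

def derive_features(r):
    """Compute simple derived features; the selection is illustrative only."""
    duration_ms = max(r.last - r.first, 0)
    return {
        "low_port": min(r.srcport, r.dstport),
        "avg_pkt_size": r.dOctets / r.dPkts if r.dPkts else 0.0,
        "duration_ms": duration_ms,
        "bytes_per_sec": r.dOctets * 1000.0 / duration_ms if duration_ms else 0.0,
    }

flow = FlowRecord(srcport=51234, dstport=80, prot=6, tcp_flags=0x1B,
                  dPkts=10, dOctets=8000, first=1000, last=3000)
feats = derive_features(flow)
print(feats)
```

Each derived value costs only a comparison or a division per flow, which is why such features stay cheap compared with per-packet inspection.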

  9. Highly Relevant Features • Refers to specific privileged services and protocols • Differentiate email and FTP from web browsing • Compact features

  10. Reducing Feature Complexity • Runtime: 600x (s) • Runtime: 1x (s) • Accuracy remains high even after removing irrelevant and redundant features

  11. Reducing Training Set Size • More features may lead to noise (insufficiently representative)

  12. Impact of Packet Sampling • NetFlow characteristic: the observed flow count decreases as the sampling rate decreases • Packet sampling has little impact on accuracy
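The drop in observed flow count can be reasoned about with a small model. The sketch below assumes each packet is sampled independently with probability equal to the sampling rate, so a flow of n packets is seen with probability 1 - (1 - rate)^n. Real NetFlow deployments may instead sample 1-in-N deterministically, and the flow-size mix here is invented for illustration.

```python
def p_flow_observed(n_packets, sampling_rate):
    """Probability that at least one packet of the flow is sampled."""
    return 1.0 - (1.0 - sampling_rate) ** n_packets

def expected_observed_flows(packet_counts, sampling_rate):
    """Expected number of flows that still appear in the sampled export."""
    return sum(p_flow_observed(n, sampling_rate) for n in packet_counts)

# Hypothetical mix: many short flows, fewer long ones.
packet_counts = [1] * 700 + [10] * 250 + [1000] * 50

expected = {rate: expected_observed_flows(packet_counts, rate)
            for rate in (1.0, 0.1, 0.01)}
for rate, count in expected.items():
    print(f"sampling rate {rate}: about {count:.0f} of {len(packet_counts)} flows observed")
```

Under this model short flows disappear first, so the observed set shifts toward long flows as the rate falls.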

  13. Conclusion & Future Work • Conclusion • Application classification can be done with flow-level (NetFlow) information • Trivially derived features improve accuracy • Packet sampling has minimal impact • Future work • NetFlow v9? • Other machine-learning methods?

  14. Thanks