
Monitoring Abnormal Traffic Based on Dynamic Flow Model and Systematic Method

Xiaohong Guan, Tao Qin and Wei Li, Xi'an Jiaotong / Tsinghua University. APNG 2007.


Presentation Transcript


  1. Monitoring Abnormal Traffic Based on Dynamic Flow Model and Systematic Method
     Xiaohong Guan, Tao Qin and Wei Li
     Xi'an Jiaotong / Tsinghua University
     APNG 2007

  2. Outlines
     • Introduction: motivations and network traffic characteristics
     • Mathematical model for network traffic
     • ICA based solution method
     • Experiments and testing results
     • Conclusions and future work

  3. Motivations
     • Outbreaks of large-scale network security incidents, such as worms and DDoS attacks, pose serious threats to normal Internet operation
     • Traffic flow is one of the most important information sources for monitoring network anomalies
     • Anomalies in network traffic can be discovered through the behavior changes they cause

  4. Related Work (1)
     Methods based on pattern changes in traffic volume, mostly using time series models
     • Worm detection
       • ICMP packets (G. Bakos 2002)
       • Destination and source correlation (Xing Qin 2004)
       • Kalman filter (Cliff Zou 2005)
     • DoS detection
       • AR (autoregressive) model (B. Ravichandran 2000)
       • EWMA (exponentially weighted moving average) model (Nong Ye 2003)
       • Wavelet analysis (P. Barford 2002)
       • LRD (long-range dependence, or self-similarity) model (Xin Wang 2005)
       • Other signal processing and statistical methods (Yin Zhang 2005)

  5. Related Work (2)
     Methods based on NetFlow (K. C. Claffy 1995)
     • NetFlow: traffic from one host (port) to another host (port)
     • NetFlow parameters: Dst_IP, Src_IP, ports, packets and bytes
     • Detection: establish a statistical model and compare the “normal” flow forecast by the model with the actual flow

  6. Related Work (3)
     Methods based on PCA (principal component analysis) (A. Lakhina 2005)
     • Decompose traffic flows into eigenflows and classify the eigenflows according to three traffic patterns
       • d-eigenflows → periodic or cyclic pattern
       • s-eigenflows → abrupt changes
       • n-eigenflows → noise
     • Establish a threshold for normal (abnormal) traffic based on the norms of the eigenflow vectors

  7. Related Work (4)
     Methods based on user behavior (Kuai Xu 2005 and G. Tan 2003)
     • Connection patterns associated with user behavior: Dst_IP, Dst_Port
     • Determine abnormal behavior from an “abrupt change” in the connection pattern, such as the range of destinations

  8. Challenges
     • Methods based on monitoring “abrupt changes” in traffic flows or behaviors are difficult to apply because abruptness is hard to measure
     • Methods based on time series and statistical models need a training process to determine the model parameters, but obtaining labeled training samples for network security problems is extremely difficult
     • The PCA method produces many eigenflows that may be difficult to separate, causing a high false alarm rate

  9. Framework of a new ICA based method
     • Define observable traffic variables, including flows, packets, bytes, connection degree, destination port distribution degree and flow interval
     • Decompose traffic patterns into two behavior features and convert the traffic monitoring problem into a blind signal separation (BSS) problem
     • A FastICA (fast independent component analysis) based method is developed to solve the BSS problem

  10. Outlines
     • Introduction: motivations and network traffic characteristics
     • Mathematical model for network traffic
     • ICA based solution method
     • Experiment and testing results
     • Conclusions and Future work

  11. Four variables describing statistical characteristics of network traffic
     • S-flow (SF): a group of packets with an identical triple of source IP, destination IP and destination port
     • Packets of SF: total number of packets in one S-flow
     • Bytes of SF: number of bytes in one S-flow
     • Connection degree (CD): number of destination hosts that one host connects to

  12. Three variables describing host connection characteristics
     • Destination port distribution degree (DPDD): number of different destination ports that one host connects to
     • S-flow interval (SFI): time interval between different S-flows that a host generates
     • Connection degree (CD): number of destination hosts that one host connects to
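
The variables defined on slides 11 and 12 can be computed from raw packet records roughly as follows. This is a minimal sketch, not the authors' implementation: the record layout, the sample addresses, the function names, and the reading of SFI as the gap between the start times of successive S-flows are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical packet records: (timestamp_s, src_ip, dst_ip, dst_port, size_bytes)
packets = [
    (0.0, "10.0.0.1", "10.0.0.2", 80, 500),
    (0.4, "10.0.0.1", "10.0.0.2", 80, 700),
    (1.0, "10.0.0.1", "10.0.0.3", 22, 100),
    (3.0, "10.0.0.1", "10.0.0.3", 443, 300),
]

def sflow_stats(packets):
    """Group packets into S-flows keyed by (src IP, dst IP, dst port)."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0, "start": None})
    for ts, src, dst, dport, size in packets:
        flow = flows[(src, dst, dport)]
        flow["packets"] += 1          # packets of SF
        flow["bytes"] += size         # bytes of SF
        if flow["start"] is None or ts < flow["start"]:
            flow["start"] = ts        # first packet seen for this S-flow
    return dict(flows)

def host_stats(flows, host):
    """Connection characteristics of one source host."""
    own = [(key, f) for key, f in flows.items() if key[0] == host]
    cd = len({key[1] for key, _ in own})      # distinct destination hosts
    dpdd = len({key[2] for key, _ in own})    # distinct destination ports
    starts = sorted(f["start"] for _, f in own)
    # Assumed definition: gap between start times of successive S-flows
    sfi = [b - a for a, b in zip(starts, starts[1:])]
    return {"CD": cd, "DPDD": dpdd, "SFI": sfi}

flows = sflow_stats(packets)
stats = host_stats(flows, "10.0.0.1")
print(stats)
```

Sampling these counts over successive time windows would give the time series X1(t) to X4(t) used in the model.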

  13. Behaviors embedded in network traffic
     • Cyclic repetition → habitual or routine behaviors
     • Abrupt changes → abnormal incidents
     • Noise → stochastic nature
     • Two features are needed to represent cyclic routine behaviors and abnormal incidents

  14. Notations of mathematical traffic model
     • Four observable output variables of network traffic
       • Packets X1(t)
       • Bytes X2(t)
       • Connection degree X3(t)
       • S-Flow X4(t)
     • Two feature variables of behaviors
       • Routine behavior S1(t)
       • Abnormal behavior S2(t)

  15. Mathematical traffic model
     Traffic output variables are linearly represented by the behavior feature variables, where aij are weighting factors
       X1(t) = a11 S1(t) + a12 S2(t)
       X2(t) = a21 S1(t) + a22 S2(t)
       X3(t) = a31 S1(t) + a32 S2(t)
       X4(t) = a41 S1(t) + a42 S2(t)
     In matrix form: X(t) = A S(t)
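
Written out in full, the matrix form on slide 15 stacks the four observed variables into X(t), the two behavior features into S(t), and the weighting factors aij into a 4×2 mixing matrix A. The block below only restates the slide's equations in LaTeX and adds nothing new.

```latex
\begin{equation}
\underbrace{\begin{bmatrix} X_1(t) \\ X_2(t) \\ X_3(t) \\ X_4(t) \end{bmatrix}}_{X(t)}
=
\underbrace{\begin{bmatrix}
a_{11} & a_{12} \\
a_{21} & a_{22} \\
a_{31} & a_{32} \\
a_{41} & a_{42}
\end{bmatrix}}_{A}
\underbrace{\begin{bmatrix} S_1(t) \\ S_2(t) \end{bmatrix}}_{S(t)}
\end{equation}
```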

  16. Blind signal separation (BSS) problem
     Given the observed traffic variables Xi(t), estimate the behavior feature variables Si(t) and determine whether there is significant abnormal traffic

  17. Outlines
     • Introduction: motivations and network traffic characteristics
     • Mathematical model for network traffic
     • ICA based solution method
     • Experiment and testing results
     • Conclusions and Future work

  18. Background of BSS Problem: “Cocktail Party”
     • Two microphone outputs in the room as observable variables X1(t) and X2(t)
     • Two speakers in the room as input feature variables S1(t) and S2(t)
     • With
         X1(t) = a11 S1(t) + a12 S2(t)
         X2(t) = a21 S1(t) + a22 S2(t)
     • Obtain aij and Si(t) with only the knowledge of Xi(t)

  19. Basics for solving BSS problem
     • Basic assumption: S1(t) and S2(t) are independent, and S2(t) is non-Gaussian since abnormal behaviors are not “regularly” distributed
     • Measurement of independence: p(S1, S2) = p(S1) p(S2)
     • Measurement of non-Gaussianity: negentropy is used to measure non-Gaussianity

  20. Calculation of negentropy
     • Entropy: a basic concept in information theory. H(X) = -∑ p(X=ai) log p(X=ai) reaches its maximum, among variables of equal variance, if X is a Gaussian variable
     • Negentropy: J(X) = H(Xgauss) - H(X), where Xgauss is a Gaussian variable with the same variance as X, reaches its minimum (zero) if X is a Gaussian variable; it is evaluated with h(), a monotonic non-quadratic function
     • J is maximized to estimate the non-Gaussianity of X
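
As an illustration of how negentropy can be estimated from samples, the sketch below uses an approximation that is common in the FastICA literature, J(y) ≈ (E[G(y)] − E[G(ν)])² with G(u) = log cosh(u) and ν a standard Gaussian variable. The transcript does not preserve the slide's own formula or its function h(), so the choice of G, the function names, and the test signals here are assumptions rather than the authors' exact procedure.

```python
import math
import random

def log_cosh(u):
    """Numerically stable log(cosh(u))."""
    a = abs(u)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def gaussian_expectation(g, lo=-8.0, hi=8.0, steps=4000):
    """E[g(v)] for v ~ N(0, 1), by trapezoidal integration."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        u = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * g(u) * math.exp(-0.5 * u * u)
    return total * h / math.sqrt(2.0 * math.pi)

def negentropy_approx(samples, g=log_cosh):
    """Approximate negentropy of the standardized samples (0 for a Gaussian)."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in samples) / n)
    z = [(x - mean) / std for x in samples]   # zero mean, unit variance
    e_data = sum(g(v) for v in z) / n
    return (e_data - gaussian_expectation(g)) ** 2

rng = random.Random(0)
gaussian = [rng.gauss(0.0, 1.0) for _ in range(20000)]
# Hypothetical "abnormal" signal: mostly zero with occasional large pulses
pulses = [5.0 if rng.random() < 0.02 else 0.0 for _ in range(20000)]

j_gauss = negentropy_approx(gaussian)
j_pulse = negentropy_approx(pulses)
print(j_gauss, j_pulse)
```

The Gaussian sample scores close to zero while the sparse pulse train scores much higher, which is the property the method relies on to single out the abnormal behavior feature.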

  21. FastICA method for solving BSS problem
     Fixed-point iteration algorithm to find an optimal S^T = W^T X that maximizes the non-Gaussianity
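
The fixed-point iteration can be sketched in a few lines for the simplest case of two observed signals and two sources. This is an illustrative one-unit FastICA with a tanh nonlinearity, not the authors' code: the synthetic "routine" sine wave, the sparse "abnormal" pulses, the 2×2 mixing weights, and all function names are assumptions, and the slides' model uses four observed variables rather than two.

```python
import math
import random

def center(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def corr(a, b):
    a, b = center(a), center(b)
    num = sum(p * q for p, q in zip(a, b))
    return num / math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))

def whiten(x1, x2):
    """PCA-whiten two centered signals so their covariance is the identity."""
    n = len(x1)
    c11 = sum(a * a for a in x1) / n
    c22 = sum(b * b for b in x2) / n
    c12 = sum(a * b for a, b in zip(x1, x2)) / n
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)   # rotation that diagonalizes the covariance
    c, s = math.cos(theta), math.sin(theta)
    lam1 = c11 * c * c + 2.0 * c12 * s * c + c22 * s * s
    lam2 = c11 * s * s - 2.0 * c12 * s * c + c22 * c * c
    z1 = [(c * a + s * b) / math.sqrt(lam1) for a, b in zip(x1, x2)]
    z2 = [(-s * a + c * b) / math.sqrt(lam2) for a, b in zip(x1, x2)]
    return z1, z2

def fastica_2d(z1, z2, max_iter=200, tol=1e-10):
    """One-unit fixed-point iteration: w <- E[z g(w.z)] - E[g'(w.z)] w, then normalize."""
    n = len(z1)
    w = (1.0, 0.0)
    for _ in range(max_iter):
        e_zg1 = e_zg2 = e_dg = 0.0
        for a, b in zip(z1, z2):
            t = math.tanh(w[0] * a + w[1] * b)   # g = tanh
            e_zg1 += a * t
            e_zg2 += b * t
            e_dg += 1.0 - t * t                  # g' = 1 - tanh^2
        u1 = (e_zg1 - e_dg * w[0]) / n
        u2 = (e_zg2 - e_dg * w[1]) / n
        norm = math.hypot(u1, u2)
        new_w = (u1 / norm, u2 / norm)
        converged = abs(new_w[0] * w[0] + new_w[1] * w[1]) > 1.0 - tol
        w = new_w
        if converged:
            break
    y1 = [w[0] * a + w[1] * b for a, b in zip(z1, z2)]
    y2 = [-w[1] * a + w[0] * b for a, b in zip(z1, z2)]  # orthogonal direction in 2-D
    return y1, y2

rng = random.Random(0)
n = 2000
s1 = [math.sin(2.0 * math.pi * t / 50.0) for t in range(n)]    # hypothetical routine behavior
s2 = [4.0 if rng.random() < 0.03 else 0.0 for _ in range(n)]   # hypothetical abnormal pulses
x1 = [1.0 * a + 0.5 * b for a, b in zip(s1, s2)]               # assumed mixing weights
x2 = [0.6 * a + 1.0 * b for a, b in zip(s1, s2)]

z1, z2 = whiten(center(x1), center(x2))
y1, y2 = fastica_2d(z1, z2)
match = [[abs(corr(y, s)) for s in (s1, s2)] for y in (y1, y2)]
print(match)
```

The recovered components match the sources only up to order and sign, which is inherent to BSS; this is why the slides later need extra information to decide which component represents routine behavior and which represents abnormal behavior.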

  22. Noise mitigation on the feature variables
     • Stochastic noise is mixed into both the routine and abnormal behavior feature variables
     • The noise frequency is usually higher than that of the feature variables
     • A scale-space filter (SSF) with an adaptive scale factor is applied to maximize noise mitigation

  23. SSF Implementation
     Determining the scale factor
     • First find the maximal noise pulse width in the extracted feature variables
     • If the noise pulse width is larger than 0.8 times the scale factor, the pulse will remain after filtering
     • If the noise pulse width is less than 0.2 times the scale factor, the pulse will be removed after filtering
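
The effect of the scale factor on pulses of different widths can be illustrated with Gaussian smoothing, the usual kernel in scale-space filtering. The slides do not specify the filter's kernel or how the 0.8 and 0.2 bounds are applied, so the kernel, the truncation at three standard deviations, the edge handling, and the pulse widths below are assumptions chosen only to show the qualitative behavior.

```python
import math

def gaussian_kernel(sigma):
    """Normalized Gaussian weights truncated at 3 * sigma."""
    radius = int(math.ceil(3.0 * sigma))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Convolve the signal with a Gaussian kernel of scale sigma."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)   # clamp indices at the edges
            acc += w * signal[j]
        out.append(acc)
    return out

signal = [0.0] * 300
for i in range(60, 62):       # narrow pulse, 2 samples wide (noise-like)
    signal[i] = 1.0
for i in range(180, 220):     # wide pulse, 40 samples wide (feature-like)
    signal[i] = 1.0

filtered = smooth(signal, sigma=10.0)
narrow_peak = max(filtered[40:80])
wide_peak = max(filtered[160:240])
print(narrow_peak, wide_peak)
```

With this scale the narrow pulse is flattened to a small fraction of its height while the wide pulse keeps most of its amplitude, which is why the scale factor is tied to the maximal noise pulse width.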

  24. Abnormal Detection
     • Based on the abnormal behavior variable, denote for each pulse i its amplitude, its width and its area; together these form the abnormal degree vector
     • The threshold is selected based on the maximums of these three parameters among the pulses filtered out as noise
     • If all elements of the abnormal degree vector in the abnormal behavior features exceed the threshold, the pulse is considered abnormal
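
The detection rule can be sketched as follows: extract each pulse's amplitude, width and area, take the per-parameter maximum over the pulses removed as noise as the threshold, and flag a pulse only when all three of its values exceed it. The pulse extraction against a fixed baseline, the sample signal, and the set of noise pulses are hypothetical; the transcript does not preserve the slide's symbols or formulas.

```python
def find_pulses(signal, baseline=0.0):
    """Return (amplitude, width, area) for each run of samples above the baseline."""
    pulses = []
    start = None
    # A trailing baseline sample acts as a sentinel that closes a pulse at the end
    for i, value in enumerate(list(signal) + [baseline]):
        if value > baseline and start is None:
            start = i
        elif value <= baseline and start is not None:
            heights = [v - baseline for v in signal[start:i]]
            pulses.append((max(heights), len(heights), sum(heights)))
            start = None
    return pulses

def thresholds_from_noise(noise_pulses):
    """Per-parameter maximum over the pulses that were filtered out as noise."""
    return tuple(max(column) for column in zip(*noise_pulses))

def is_abnormal(pulse, thresholds):
    """Abnormal only if amplitude, width and area all exceed their thresholds."""
    return all(value > limit for value, limit in zip(pulse, thresholds))

# Hypothetical abnormal behavior feature after noise filtering
signal = [0, 0, 1, 0, 0, 3, 4, 5, 4, 0, 0, 2, 2, 0]
pulses = find_pulses(signal)

# Hypothetical pulses that the noise filter removed
noise_pulses = [(1.0, 1, 1.0), (2.0, 2, 4.0)]
thresholds = thresholds_from_noise(noise_pulses)

flags = [is_abnormal(p, thresholds) for p in pulses]
print(pulses, thresholds, flags)
```

Requiring all three parameters to exceed the noise maxima keeps a tall but very brief spike, or a long but shallow bump, from being reported on its own.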

  25. Abnormal behavior classification
     • Three observable variables describing the connection characteristics
       • Connection degree
       • Dst-port distribution degree
       • Flow interval
     • The detected abnormal behaviors are classified based on these variables

  26. Outlines
     • Introduction: motivations and network traffic characteristics
     • Mathematical model for network traffic
     • ICA based solution method
     • Experiment and testing results
     • Conclusions and Future work

  27. Experiment system for flow data acquisition

  28. Interface of the monitoring system

  29. Observed traffic variables

  30. Extracted behavior features

  31. Noise embedded in the extracted behavior features

  32. Find the noise pulse width for selecting filtering factor s

  33. Behavior features before filtering noises

  34. Behavior features after filtering noises

  35. Results on anomaly detection
     • 40 anomalous behaviors embedded in the data
     • With the ICA based method, 41 anomalous pulses are detected
       • 17 network scanning behaviors
       • 20 connection distribution behaviors
     • The false negative rate is 7.5%
     • The false positive rate is 9.7%
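
The two reported rates can be reproduced from the counts on slide 35 under one reading of the numbers: that the 17 + 20 classified behaviors are the correct detections, that misses are counted against the 40 embedded anomalies, and that false alarms are counted against the 41 detected pulses. That reading is an assumption, although it matches both figures.

```latex
\begin{align}
\text{correct detections} &= 17 + 20 = 37 \\
\text{false negative rate} &= \frac{40 - 37}{40} = \frac{3}{40} = 7.5\% \\
\text{false positive rate} &= \frac{41 - 37}{41} = \frac{4}{41} \approx 9.76\%
\end{align}
```

The last value agrees with the 9.7% on the slide if the figure was truncated rather than rounded.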

  36. Abnormal behavior classification

  37. Abnormal behavior classification

  38. Conclusions
     • Abnormal behavior detection in network traffic monitoring can be converted into a BSS problem and effectively solved by the ICA based method
     • Abnormal behaviors are independent of the user’s routine behavior
     • The SSF with an adaptively selected scale factor is effective for noise filtering
     • The experiments and testing results show that the ICA based method with simple abnormal classification is effective at discovering abnormal behaviors

  39. Ongoing and Future work
     • Improve the ICA based method and compare it with other methods
       • Determine the nature of the behavior (routine or abnormal) that a particular component represents, using other information
       • Compare with other methods through extensive testing
       • Select different traffic features and compare
     • Worm modeling and detection for IPv6 networks
     • Topic discovery and group dynamics via the Internet with certain social network structures
     • Backbone network flow processing, storing and replaying

  40. Thank you
