This paper presents a wavelet-based methodology for detecting performance problems in network traffic. Traditional active probing methods can disrupt network operation, whereas our non-intrusive approach leverages passive measurements for real-time problem detection. We introduce WIND, a detection tool capable of identifying common symptoms such as delays and drops through energy plots derived from passive data. The paper details the underlying mechanisms, theoretical frameworks, and validation of the methodology, offering an efficient solution for network performance monitoring.
A Non-intrusive, Wavelet-based Approach To Detecting Network Performance Problems • Polly Huang, ETH Zurich • Anja Feldmann, U. Saarbruecken • Walter Willinger, AT&T Labs-Research
Road Map • Motivation and rationale • Mechanism details • Conclusion and outlook
Performance Problem • [Figure: protocol stacks (Web, TCP, Network, Link/Physical) at the client, the proxy, and the server (Google.com), connected across the Internet; candidate problem sources: server, proxy, congestion, routing, else]
Current State • Active probing • Ex: traceroute, ping • Disturbing - injects unnecessary traffic • Biasing - distorts metrics of interest • ‘Heisenberg’ effects • Passive measurements • Ex: Cisco NetFlow, IP Accounting, other packet-level measurements • Give much information • Do not infer problems inside the network
What Would Be Cool • Passive • Trigger alerts in real time • For problems due to • Server load • Congestion • Routing error • Common Symptoms • Delay and drop
TCP’s Closed-loop Control • Delays/drops reflected in RTT/RTO estimations • RTT: round trip time • RTO: retransmission timeout • Quality of Network Path • Values of RTT/RTO estimations • Amounts of RTT/RTO samples • Can be measured passively
Detailed Estimation • Methodology • A hash table of all data packets observed • One RTT sample per data-ack pair • One RTO sample per data-data pair • Slow • Cost grows with #packets per observation period • Especially with high data rate connections (the likely troublemakers)
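The hash-table methodology above can be sketched as follows. This is a minimal illustration, not WIND's actual off-line code: the `Packet` record, its field names, and the choice to skip RTT samples for retransmitted segments (their acks are ambiguous) are all assumptions made for the example. `flow` is assumed to identify the connection in both directions.

```python
from collections import namedtuple

# Hypothetical packet record; the field names are assumptions for this sketch.
Packet = namedtuple("Packet", "time flow seq length ack is_data")

def estimate_samples(packets):
    """Return (rtt_samples, rto_samples) from a time-ordered packet trace."""
    pending = {}     # (flow, expected ack number) -> time the data packet was seen
    last_sent = {}   # (flow, seq) -> time the segment was last seen
    rtt, rto = [], []
    for p in packets:
        if p.is_data:
            key = (p.flow, p.seq)
            if key in last_sent:
                # Same segment seen again: a data-data pair gives one RTO sample.
                rto.append(p.time - last_sent[key])
                # Assumption: drop the pending RTT entry, the ack would be ambiguous.
                pending.pop((p.flow, p.seq + p.length), None)
            else:
                pending[(p.flow, p.seq + p.length)] = p.time
            last_sent[key] = p.time
        else:
            # An ack that matches a pending data packet gives one RTT sample.
            sent = pending.pop((p.flow, p.ack), None)
            if sent is not None:
                rtt.append(p.time - sent)
    return rtt, rto
```

Every packet touches the hash tables, which is why the cost scales with the number of packets per observation period.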
Objectives • Passive measurement • Non-intrusive • Infer quality of network paths • Detecting network performance problems • Efficiently (so it can be done in real time) • Wavelet-based technique
Road Map • Motivation and rationale • Mechanism details • Conclusion and outlook
Wavelet-based Technique • Theoretical ground • Wavelet transform • Energy plots (or scaling plots) • Interpreting energy plots • WIND, the problem detection tool • Features & examples • Detection methodology • Validation effort
Theoretical Ground • FFT • Frequency decomposition • $f_j$, Fourier coefficient • Amount of the signal in frequency $j$ • WT: wavelet transform • Frequency (scale) and time decomposition • $d_{j,k}$, wavelet coefficient • Amount of the signal in frequency $j$, time $k$
Wavelet Example • Signal: 00 00 00 00 11 11 11 11 • s1: 0 0 0 0 2 2 2 2, d1: 0 0 0 0 0 0 0 0 • s2: 0 0 4 4, d2: 0 0 0 0 • s3: 0 8, d3: 0 0 • s4: 8, d4: 8 • [Figure: wavelet shape, axis values 1, 0, -1]
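The example decomposition can be reproduced with a short Haar pyramid. This is a sketch under assumptions: it uses unnormalized pairwise sums and differences (the textbook Haar transform scales each level by 1/√2), it takes the difference as first minus second, so signs may differ from the slide, and it expects a signal whose length is a power of two.

```python
def haar_decompose(signal):
    """Unnormalized Haar pyramid: returns (s_levels, d_levels), one list per scale."""
    s_levels, d_levels = [], []
    current = list(signal)
    while len(current) > 1:
        pairs = list(zip(current[0::2], current[1::2]))
        s = [a + b for a, b in pairs]   # scaling coefficients: pairwise sums
        d = [a - b for a, b in pairs]   # wavelet coefficients: pairwise differences
        s_levels.append(s)
        d_levels.append(d)
        current = s                     # the next scale works on the sums
    return s_levels, d_levels

# Eight zeros followed by eight ones, as in the example signal.
signal = [0] * 8 + [1] * 8
s_levels, d_levels = haar_decompose(signal)
for j, (s, d) in enumerate(zip(s_levels, d_levels), start=1):
    print(f"scale {j}: s = {s}, d = {d}")
```

Only the coarsest scale has a nonzero wavelet coefficient, because the single step in the signal is invisible to the finer scales.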
Self-similarity • Energy function • $E_j = \sum_k d_{j,k}^2 / N_j$ • Self-similar process • $E_j = 2^{j(2H-1)} C$ <- the magic!! • $\log_2 E_j = (2H-1) j + \log_2 C$ • Linear relationship between $\log_2 E_j$ and $j$
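The linear relationship suggests a simple way to read off H: fit a line to log2 of the energy against the scale index and solve slope = 2H - 1. The sketch below assumes an ordinary least-squares fit over all scales and strictly positive energies. The function names are illustrative, and the input is synthetic, generated from the formula itself rather than from traffic.

```python
import math

def energy(d_j):
    """E_j: mean squared wavelet coefficient at one scale."""
    return sum(d * d for d in d_j) / len(d_j)

def hurst_from_energies(energies):
    """Least-squares slope of log2(E_j) against j, then H = (slope + 1) / 2."""
    js = list(range(1, len(energies) + 1))
    ys = [math.log2(e) for e in energies]   # requires every E_j > 0
    j_mean = sum(js) / len(js)
    y_mean = sum(ys) / len(ys)
    num = sum((j - j_mean) * (y - y_mean) for j, y in zip(js, ys))
    den = sum((j - j_mean) ** 2 for j in js)
    return (num / den + 1) / 2

# Synthetic energies that follow the self-similar law exactly (H = 0.8, C = 3).
H, C = 0.8, 3.0
energies = [C * 2 ** (j * (2 * H - 1)) for j in range(1, 9)]
print(hurst_from_energies(energies))   # close to 0.8
```

On measured traffic the points would not fall exactly on a line, and the range of scales used for the fit becomes a judgment call.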
Effect of Periodicity • [Figure: energy plots comparing a self-similar process with Internet traffic]
Adding Periodicity • Signal: 10 00 00 00 10 00 00 00 • s1: 1 0 0 0 1 0 0 0, d1: 1 0 0 0 1 0 0 0 • s2: 1 0 1 0, d2: 1 0 1 0 • s3: 1 1, d3: 1 1 • s4: 2, d4: 0 • Packets arrive periodically, 1 pkt per $2^3$ msec • Coefficients cancel out at scale 4
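The cancellation can be checked numerically. The sketch assumes one packet every 2^3 = 8 time bins over 16 bins, and the same unnormalized Haar differences as in a plain sum/difference pyramid. It prints the energy per scale.

```python
def haar_details(signal):
    """Wavelet coefficients per scale from unnormalized pairwise sums/differences."""
    details, current = [], list(signal)
    while len(current) > 1:
        pairs = list(zip(current[0::2], current[1::2]))
        details.append([a - b for a, b in pairs])
        current = [a + b for a, b in pairs]
    return details

period = 8   # assumed: one packet every 2^3 time bins
signal = [1 if t % period == 0 else 0 for t in range(16)]
energies = [sum(d * d for d in dj) / len(dj) for dj in haar_details(signal)]
for j, e in enumerate(energies, start=1):
    print(f"scale {j}: E_j = {e}")
```

The energy grows up to scale 3 and then collapses at scale 4, where each coefficient compares two windows that contain exactly one packet each. That abrupt change is the kind of knee an energy plot shows at the period's time scale.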
Interpreting Energy Functions • Abrupt knees at • RTT time scale • RTO time scale • Knee shifts • RTT/RTO time changes • Low energy level (after normalization) • congestion • low traffic volume
WIND - The Detection Tool • Wavelet-based Inference for Network Detection • Based on libpcap and tcpdump • On-line mode (efficient) • Per packet: compute $d_{j,k}$ • Per observation period: output $E_j$ • On a subnet basis • Off-line mode • Detailed RTT/RTO estimation
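One way the on-line mode could stay cheap is a streaming Haar pyramid that keeps a single unpaired value per scale, so memory depends on the number of scales and not on the number of packets. The sketch below is an assumption about how such a computation might look, not WIND's implementation. The class name is hypothetical, and packets are assumed to have been counted into fixed-width time bins before being pushed.

```python
class OnlineHaarEnergy:
    """Streaming Haar energies: O(levels) memory, amortized O(1) work per bin."""

    def __init__(self, levels):
        self.levels = levels
        self._reset()

    def _reset(self):
        self.pending = [None] * self.levels   # unpaired value waiting at each scale
        self.sum_sq = [0.0] * self.levels     # running sum of squared coefficients
        self.count = [0] * self.levels        # number of coefficients seen per scale

    def push(self, value):
        """Feed one time bin, e.g. the packet count observed in that bin."""
        for j in range(self.levels):
            if self.pending[j] is None:
                self.pending[j] = value       # wait for the partner value
                return
            first, self.pending[j] = self.pending[j], None
            d = first - value                 # wavelet coefficient at this scale
            self.sum_sq[j] += d * d
            self.count[j] += 1
            value = first + value             # scaling coefficient, passed one scale up

    def flush(self):
        """End of an observation period: return E_j per scale, then reset."""
        energies = [s / n if n else None for s, n in zip(self.sum_sq, self.count)]
        self._reset()
        return energies

tracker = OnlineHaarEnergy(levels=4)
for t in range(16):
    tracker.push(1 if t % 8 == 0 else 0)
print(tracker.flush())
```

Running one tracker per subnet would match the "on a subnet basis" bullet, again as an assumption about the design.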
Detection Methodology • Reference function • Smoothed average • Difference • Area below the reference function • Weighted sum by scale • Flagged interesting • Top 10% deviations
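The heuristic can be sketched as follows. Several details are assumptions, since the slides do not specify them: the reference is an exponentially smoothed average with factor `alpha`, the "area below the reference" is the per-scale shortfall of the current energy function, the per-scale weights default to uniform, and each period is scored before it updates the reference. Inputs are per-period energy functions, for example lists of log2 energies.

```python
def deviation_score(energies, reference, weights=None):
    """Weighted area by which the energy function falls below the reference."""
    if weights is None:
        weights = [1.0] * len(energies)   # assumed uniform; actual weights unknown
    return sum(w * max(r - e, 0.0) for w, e, r in zip(weights, energies, reference))

def flag_periods(energy_functions, alpha=0.2, top_fraction=0.1):
    """Return indices of the periods whose score is in the top `top_fraction`."""
    reference = list(energy_functions[0])
    scores = []
    for energies in energy_functions:
        scores.append(deviation_score(energies, reference))
        # Assumed smoother: exponentially weighted average of past energy functions.
        reference = [(1 - alpha) * r + alpha * e for r, e in zip(reference, energies)]
    n_flag = max(1, round(top_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_flag])

# Ten periods with flat energy functions; period 7 has unusually low energy.
periods = [[5.0, 5.0, 5.0] for _ in range(10)]
periods[7] = [2.0, 2.0, 2.0]
print(flag_periods(periods))
```

A score of this kind reacts to low energy levels. Detecting knee shifts would need a two-sided difference, which is left out here.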
Validation By • WIND off-line mode • Detailed RTT/RTO estimations • Volume • Similar heuristics (area difference) • CCDF of RTT/RTO • Ratio of RTO/RTT • Volume
Validate periods 26, 30, 31 • CCDF of RTT: picks out periods 29, 30, 31 • CCDF of RTO: picks out periods 23, 26, 31 • 80-90% are validated interesting
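For reference, an empirical CCDF of RTT or RTO samples can be computed as below. This is a generic sketch, not the validation code behind the slides. It returns, for each distinct sample value x, the fraction of samples strictly greater than x.

```python
def ccdf(samples):
    """Return sorted (x, P[X > x]) pairs for the distinct sample values."""
    ordered = sorted(samples)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered):
        # Emit one point per distinct value, at its last occurrence.
        if i + 1 == n or ordered[i + 1] != x:
            points.append((x, (n - i - 1) / n))
    return points

print(ccdf([1, 2, 2, 4]))
```

A period with a heavier tail than its neighbours, meaning more mass at large RTT or RTO values, is the kind of period such a comparison would pick out.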
Road Map • Motivation and rationale • Mechanism details • Conclusion and outlook
Summary • Detect problems using energy plots • If self-similar, clean linear relationship • If periodic, knees appear • If problems, knee shifts or low energy level • WIND: the online/offline analysis tool • Passive • Efficient
Outlook • Full-fledged diagnosing tool • More sophisticated heuristics • Use of traceroute data • Illustrative examples • Using the tool (beta release) • Using the methodology
Questions? • http://www.tik.ee.ethz.ch/~huang