On the efficient detection of elephant flows in aggregated network traffic
Javier Rivillo Lopez, Jose Alberto Hernandez, Iain W. Phillips
Networks and Control Group
Research School of Informatics
Loughborough University
J.Rivillo-Lopez@lboro.ac.uk, J.A.Hernandez@lboro.ac.uk, I.W.Phillips@lboro.ac.uk
LCS, 2005
Outline
• Motivation
• Flow analysis
• Detection method
• Experiments and results
• Conclusions
Motivation (I): Definitions
• Flow: a unidirectional set of packets of the same transport protocol that share the same source and destination IP addresses and ports.
• Elephant flow: a stream of packets that contributes substantially more to the network load than the rest of the flows.
• The threshold that separates elephants from the other flows must be defined by the network managers/administrators.
• The threshold value depends on the network size.
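The flow definition above can be sketched in code. This is only an illustration: the `Packet` record, its field names, and the byte-count threshold are assumptions made for the example, not something the slides specify.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet record; the field names are illustrative."""
    timestamp: float  # seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    size: int  # bytes


def flow_key(pkt):
    # Unidirectional: source and destination are not swapped or sorted,
    # so the reverse direction of a connection is a different flow.
    return (pkt.protocol, pkt.src_ip, pkt.src_port, pkt.dst_ip, pkt.dst_port)


def bytes_per_flow(packets):
    """Aggregate packet sizes by flow key."""
    totals = defaultdict(int)
    for pkt in packets:
        totals[flow_key(pkt)] += pkt.size
    return dict(totals)


def elephants_by_volume(totals, threshold_bytes):
    # Assumption: the administrator-defined threshold is expressed in bytes.
    return {key for key, total in totals.items() if total >= threshold_bytes}
```

Because the key keeps the source/destination order, a request and its reply are counted as two separate flows, which matches the "unidirectional" wording of the definition.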
Motivation (II): Elephant and mice phenomenon
Usually a few flows carry most of the data.
Trace example from NLANR router. In future, MASTS.
0.1% of the flows carry nearly 83% of the total traffic.
Figure 1: A flow aggregation view of network traffic
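A statistic of this kind (the share of traffic carried by the largest flows) could be computed from per-flow byte counts roughly as follows. The function name and the rounding rule for the number of "top" flows are assumptions made for this sketch; the slides do not say how the figure was obtained.

```python
import math


def top_flow_share(flow_bytes, fraction):
    """Share of total bytes carried by the largest `fraction` of flows.

    flow_bytes: list of per-flow byte counts.
    fraction:   e.g. 0.001 for the top 0.1% of flows.
    """
    total = sum(flow_bytes)
    if total == 0:
        return 0.0
    sizes = sorted(flow_bytes, reverse=True)
    # Assumption: round up, and always include at least one flow.
    top_n = max(1, math.ceil(fraction * len(sizes)))
    return sum(sizes[:top_n]) / total
```

For example, `top_flow_share([80, 10, 5, 5], 0.25)` considers only the single largest flow out of four and returns its share of the total bytes.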
Motivation (III): Elephant and mice phenomenon
• Elephant flows
  • Low-priority applications
  • Large data transfer transactions and peer-to-peer file sharing
• Mice flows
  • Sensitive to delay, jitter and high loss rates
  • Voice over IP, online gaming, small HTTP requests
• Under this phenomenon, the Internet's best-effort delivery is not suitable.
• The performance of the network can be improved by detecting the elephant flows and applying traffic engineering solutions.
Flow analysis (II): Flow duration
• Most of the mice flows (92%) have a very short duration (< 2 sec).
• Most elephants are long-duration flows: heavy-tail behaviour.
Figure 2: Flow duration histogram
Flow analysis (III): Mean interarrival time
• Elephants have a very low average packet interarrival time.
• So, a flow with a high average packet rate and a long duration is very likely to be an elephant.
Figure 3: Flow mean interarrival histogram
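The two per-flow quantities discussed in these slides, duration and mean packet interarrival time, can be derived from a flow's packet timestamps. A minimal sketch, assuming timestamps in seconds and a hypothetical function name:

```python
def flow_stats(timestamps):
    """Return (duration, mean_interarrival) in seconds for one flow.

    A flow with fewer than two packets has no interarrival time,
    so the second element is None in that case.
    """
    ts = sorted(timestamps)
    if len(ts) < 2:
        return 0.0, None
    duration = ts[-1] - ts[0]
    # The n - 1 gaps between consecutive packets sum to the duration,
    # so their mean is duration / (n - 1).
    return duration, duration / (len(ts) - 1)
```

A long duration combined with a small mean interarrival time is the signature the slides associate with elephants.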
Detection method (I): Sampling
• Requirement: low computational cost.
• Continuous monitoring is not suitable:
  • It requires a huge amount of resources.
  • It is not scalable.
• Sampling is required.
• Random sampling is not suitable because it loses information about packet interarrival times and timing.
• Solution: windowing.
  • Example: monitoring the network for 20 ms every 2 seconds.
  • Monitoring 1% of the time; sampling factor = 2 sec / 20 ms = 100.
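Windowed sampling can be illustrated with a small filter over packet timestamps. The sketch assumes that each sampling window opens at the start of its period, which the slides do not state; the function names are hypothetical, and the defaults correspond to the 20 ms / 2 s example.

```python
def in_sampling_window(timestamp, w=0.020, T=2.0):
    """True if `timestamp` (seconds) falls in the first `w` seconds
    of its sampling period of length `T` (assumed window placement)."""
    return (timestamp % T) < w


def window_index(timestamp, T=2.0):
    """Index of the sampling period that contains `timestamp`."""
    return int(timestamp // T)


def sample(timestamps, w=0.020, T=2.0):
    """Keep only the timestamps observed while a window is open."""
    return [t for t in timestamps if in_sampling_window(t, w, T)]


def sampling_factor(w=0.020, T=2.0):
    """Ratio of period to window length (100 for 20 ms every 2 s)."""
    return T / w
```

Unlike random per-packet sampling, every packet inside an open window is kept, so the spacing between consecutive packets of a flow is preserved within that window.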
Detection method (II): Elephant detection algorithm
• Objective: identify flows with a low packet interarrival time (high packet rate) and a long duration: elephants.
• Step 1. A flow has a high packet rate when it has at least Np packets in a sampling window.
• Step 2. A flow is considered an elephant when it has been identified as a high-packet-rate flow in Nw different sampling windows.
• Parameters:
  • Np, Nw, w (sampling window length) and T (sampling period)
• In future, the algorithm will be adaptive: the parameters will be calculated automatically by the system.
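The two steps can be sketched as follows. This is one reading of the slide, not the authors' implementation: it assumes packets arrive as `(timestamp, flow_id)` pairs, that each window opens at the start of its period, and that "in Nw different sampling windows" means "in at least Nw windows".

```python
from collections import Counter


def detect_elephants(packets, Np=2, Nw=2, w=0.020, T=2.0):
    """Return the set of flow ids flagged as elephants.

    packets: iterable of (timestamp_seconds, flow_id) pairs (assumed format).
    """
    # Count packets per (window, flow), ignoring packets that arrive
    # while no sampling window is open.
    per_window = Counter()
    for timestamp, flow_id in packets:
        if timestamp % T < w:
            per_window[(int(timestamp // T), flow_id)] += 1

    # Step 1: a flow is high-rate in a window if it has at least Np packets there.
    high_rate_windows = Counter()
    for (_window, flow_id), count in per_window.items():
        if count >= Np:
            high_rate_windows[flow_id] += 1

    # Step 2: a flow is an elephant once it was high-rate in at least Nw windows.
    return {flow for flow, n in high_rate_windows.items() if n >= Nw}
```

The state kept is one counter per flow seen inside a window, which is far smaller than per-packet state for continuous monitoring.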
Experiments and Results
• Results with w = 20 ms and T = 2 sec:
  • Np = 2, Nw = 2: 80% of the elephant flows are correctly identified; they carry 89% of the total traffic, and 0.12% of the mice flows are misidentified as elephants.
• Increasing Np and Nw gives more Precision but less Recall.
Figure 4: Flows identified as elephant traffic
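Precision and Recall for a detector like this can be computed from the set of flagged flows and a ground-truth set of elephants. The sketch below uses the standard definitions; the function name and the set-based inputs are assumptions, and it does not reproduce the figures reported on the slide.

```python
def precision_recall(detected, actual):
    """Standard precision and recall for set-valued detection results.

    detected: flow ids flagged as elephants by the detector.
    actual:   flow ids that really are elephants (ground truth).
    """
    detected, actual = set(detected), set(actual)
    true_positives = len(detected & actual)
    # Precision: fraction of flagged flows that are real elephants.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: fraction of real elephants that were flagged.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall
```

Stricter parameters flag fewer flows, which tends to remove false positives (higher precision) while also missing some real elephants (lower recall).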
Conclusions
• Identifying elephant flows for traffic engineering solutions can improve network performance.
• The properties of elephant and mice flows have been obtained by studying real traffic data.
• The long-tail behaviour and high packet transmission rate shown by the elephants have been used in the elephant detection method explained.
• This scalable, low-computational-cost method uses a high sampling rate for early detection of elephant flows.
• The results show that it is a valid method and that its parameters may be adjusted for a tradeoff between Precision and Recall in identifying the elephant flows.