A General Introduction to Tomography & Link Delay Inference with EM Algorithm

Presented by Joe, Wenjie Jiang, 21/02/2004

Outline of Talk
• Why tomography?
• Introduction to tomography
• Internal link delay inference
• Basic EM
• A simple example to infer internal link delay using the EM algorithm
• Conclusion

Terminology: “Tomography”
• Brain tomography: access is difficult!
• Network tomography: access is difficult! (Vardi 1996)

Why tomography?
What is the:
• Bandwidth?
• Loss rate?
• Link delay?
• Traffic demands?
• Connectivity of links in the network? (Topology inference)
Path: a connection between two end nodes, consisting of several links.
Link: a direct connection with no intermediate routers/hosts.

Motivation
• Identify congestion points and performance bottlenecks
• Dynamic routing
• Optimized service provisioning
• Security: detection of anomalous/malicious behavior
• Capacity planning

Why tomography - Difficulty
• The internal network is decentralized, heterogeneous and unregulated.
• Individuals have no incentive to collect and distribute this information freely.
• Collecting all statistics imposes an impracticable overhead expense.
• ISPs regard the statistics as highly confidential.
• Relaying measurements to the decision-making point consumes bandwidth.

Why tomography - Solution
• Widespread internal network monitoring is expensive and infeasible
• Edge-based measurement and statistical analysis is practical and scalable

Brain Tomography

Network Tomography


Introduction to tomography
• Use a limited number of measurements to infer network (link) performance parameters, assuming a prior model and using:
-- Maximum Likelihood Estimation
-- Expectation-Maximization
-- Bayesian inference
• Categories of problems:
-- Link-level parameter estimation
-- Sender-receiver traffic intensity
-- Topology inference

Introduction to tomography (2)
• Two forms of network tomography:
-- Link-level metric estimation based on end-to-end traffic measurements (counts of sent/received packets, time delays between sent/received packets)
-- Path-level (sender-receiver path) traffic intensity estimation based on link-level measurements (counts of packets through nodes)
• Passive or active measurements?
• Multicast or unicast?

Problem Description
• To solve the linear system: $y = A\theta + \epsilon$
• A, θ and ε have special structures.
• Goal: to maximize the likelihood function

Problem Description (2)
• A = routing matrix (graph)
• θ = packet queuing delays for each link
• y = packet delays measured at the edge
• ε = noise, inherent randomness in traffic measurements
Statistical likelihood function: $l(\theta) = p(y \mid \theta)$

Problem Description (3)
[Figure: a virtual multicast tree with links l1 to l7 and four receivers Y1 to Y4]
Example: Y1 = X1 + X2 + X4
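The relation Y1 = X1 + X2 + X4 is one row of the linear system. A minimal Python sketch of the routing matrix follows, under an assumed layout: link 1 leaves the root, links 2 and 3 branch from it, and links 4 to 7 reach the four receivers. Only the Y1 row is given in the slides; the other three paths and the sample link delays are hypothetical, and noise is ignored.

```python
# Assumed logical tree (only the Y1 row is stated in the slides):
# link 1 feeds links 2 and 3; link 2 feeds 4 and 5; link 3 feeds 6 and 7.
PATHS = {
    "Y1": [1, 2, 4],
    "Y2": [1, 2, 5],
    "Y3": [1, 3, 6],
    "Y4": [1, 3, 7],
}
NUM_LINKS = 7


def routing_matrix(paths, num_links):
    """Row r has a 1 in column i-1 when link i lies on the path to receiver r."""
    return [[1 if link in links else 0 for link in range(1, num_links + 1)]
            for links in paths.values()]


def path_delays(A, x):
    """Noise-free end-to-end delays y = A x for per-link delays x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = routing_matrix(PATHS, NUM_LINKS)
x = [2, 1, 0, 3, 0, 1, 2]  # hypothetical per-link delays X1..X7
y = path_delays(A, x)
print(A[0])  # row for Y1: links 1, 2 and 4
print(y)     # end-to-end delays Y1..Y4
```

Each row of A is a 0/1 indicator of the links on one sender-receiver path, which is the special structure the inference exploits.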

Physical Topology Measure end-to-end (from sender to receiver) delays

Logical Topology Logical topology is formed by considering only the branching points in the physical topology Infer the logical link-level queuing delay distributions!

The basic idea of internal link delay tomography
• Send a back-to-back packet pair from a sender, each packet heading to a different receiver.
• Use the fact that delays are highly correlated on shared links.
• The queuing delay difference between the two receivers can be attributed to the unshared links.

Delay Estimation
• Measure the end-to-end delay of packet pairs.
• Both packets experience the same delay on link 1.
• d2 = dmin = 0 and d3 > 0: extra delay on link 3!

Packet-pair measurements
• Key assumptions:
-- Fixed, known routes
-- Temporal independence
-- Spatial independence
-- Packet-pair delays are identical on shared links.
• N delay measurements in all

Parameters
αi = parameter of delay pmf on link i
[Figure: tree with links labeled α1 to α9]

Link delay model
• αi = delay pmf on link i
• The link delay model could be multinomial
• Quantized delay model: delay ∈ {0, 1, 2, 3, …, L, ∞}
• αi = {αi0, αi1, αi2, …, αiL, αi∞}
• αij = P{ delay(link i) = j }
• αi0 + αi1 + αi2 + … + αiL + αi∞ = 1

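Goal
• $P_\alpha[y^{(n)}]$ is the probability of the event of the n-th measurement
• $L(\alpha) = \prod_{n=1}^{N} P_\alpha[y^{(n)}]$ is the probability of the event of all measurements
• Our goal: find $\hat{\alpha} = \arg\max_\alpha L(\alpha)$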


Review of MLE (Maximum Likelihood Estimation)
• The basic idea of MLE: the event that was actually observed is taken to be the one with the largest probability.
-- The MLE of θ is the value that makes the observed sample most likely.
• Note that we assume X = {x1, …, xN} to be i.i.d.
• The solution can be easy or hard depending on the form of p(X|θ).
• E.g. if p(x|θ) is a single Gaussian with θ = (μ, σ²), we can set the derivative of log L(θ|X) to zero and solve it directly.
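The single-Gaussian case can be sketched in a few lines of Python. Setting the derivative of the log-likelihood to zero gives the sample mean and the (biased) sample variance. The data values below are hypothetical and the function names are illustrative only.

```python
import math


def gaussian_mle(samples):
    """Closed-form MLE for a single Gaussian: (mean, variance)."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n  # MLE divides by n, not n - 1
    return mu, var


def log_likelihood(samples, mu, var):
    """log L(theta | X) for i.i.d. Gaussian samples with theta = (mu, var)."""
    n = len(samples)
    return (-0.5 * n * math.log(2 * math.pi * var)
            - sum((x - mu) ** 2 for x in samples) / (2 * var))


data = [2.1, 1.9, 2.4, 2.0, 1.6]  # hypothetical i.i.d. sample
mu_hat, var_hat = gaussian_mle(data)
print(mu_hat, var_hat)
print(log_likelihood(data, mu_hat, var_hat))
```

Evaluating `log_likelihood` at any other (μ, σ²) gives a value no larger than at the estimate, which is a quick way to sanity-check the closed form.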

Complete Data
• The sample X = {x1, …, xN} together with the missing (or latent) data Y is called the complete data.
• The complete likelihood is $L(\theta \mid X, Y) = p(X, Y \mid \theta)$, where p(x, y|θ) is the joint density of X and Y given the parameter θ.
• The complete log-likelihood is $\log L(\theta \mid X, Y) = \log p(X, Y \mid \theta)$

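Complete MLE
• By the definition of conditional density, $p(x, y \mid \theta) = p(y \mid x, \theta)\, p(x \mid \theta)$, where p(y|x,θ) is the conditional density of Y given X = x and θ
• The complete MLE: $\hat{\theta} = \arg\max_\theta \log p(X, Y \mid \theta)$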

Basic idea of EM
• Given X = x and θ = θ^{t-1}, where θ^{t-1} is the current estimate of the unknown parameters
• log p(x, Y|θ) is a function of Y whose unique best mean squared error (MSE) predictor is $Q(\theta, \theta^{t-1}) = E[\log p(x, Y \mid \theta) \mid X = x, \theta^{t-1}]$

EM steps

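The magic of EM
• The direct MLE of θ from log p(x|θ) is relatively hard to solve
• But the MLE of the complete log-likelihood log p(x, y|θ) is relatively easier to obtain
• Since log p(x, y|θ) is a function of x and y (y is hidden), we use its expectation over y given x and θ^{t-1}
• So:
-- E-step: compute $Q(\theta, \theta^{t-1}) = E[\log p(x, Y \mid \theta) \mid X = x, \theta^{t-1}]$
-- M-step: $\theta^{t} = \arg\max_\theta Q(\theta, \theta^{t-1})$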

EM in link delay inference
Note that here x and y have the opposite meaning from the x, y of the preceding EM review: x denotes the unobserved link delays and y the observed end-to-end delays.
[Figure: tree with link delays x1 to x9 and parameters α1 to α9]

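EM in link delay inference (2)
• Complete data Z = (X, Y)
• The complete-data log-likelihood: $\log P_\alpha[X, Y] = \log P_\alpha[Y \mid X] + \log P_\alpha[X] = \log P_\alpha[Y \mid X] + \sum_i \sum_j m_{i,j} \log \alpha_{i,j}$
• Pα[Y|X] does not depend on α
• mi,j is the total number of packets that experience a delay j on link i over the N measurements.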

EM in link delay inference (3)
The MLE of α would be $\hat{\alpha}_{i,j} = m_{i,j} / \sum_k m_{i,k}$

EM in link delay inference (4)
• The MLE $\hat{\alpha}_i = m_i / \sum_k m_k$ is the relative frequency of event i.
• A simple example: we toss a die, with P(the result is i) = αi (i = 1, 2, …, 6) and mi = how many times we see result i.
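The die-toss analogy translates directly into code. The sketch below estimates each αi as the fraction of tosses showing face i. The toss sequence is hypothetical and `multinomial_mle` is an illustrative name.

```python
from collections import Counter


def multinomial_mle(outcomes, faces=range(1, 7)):
    """MLE of P(result = i): the count m_i divided by the number of tosses."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return {face: counts[face] / total for face in faces}


tosses = [1, 3, 3, 6, 2, 3, 6, 1, 5, 3]  # hypothetical observations
alpha_hat = multinomial_mle(tosses)
for face, p in alpha_hat.items():
    print(face, p)
```

The same normalization is what the M-step applies per link, with observed counts replaced by expected counts.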

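EM in link delay inference (5)
• We notice that $Q(\alpha, \alpha^{t-1})$ is similar to the complete-data log-likelihood; the only difference is that $m_{i,j}$ is replaced by its conditional expectation $\hat{m}_{i,j} = E_{\alpha^{t-1}}[m_{i,j} \mid Y]$
• So the MLE is $\alpha^{t}_{i,j} = \hat{m}_{i,j} / \sum_k \hat{m}_{i,k}$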

EM in link delay inference (6) Probability Propagation

A simple example
• The delay on each link falls into {0, 1, 2, 3}
• αij = P{ delay(link i) = j }
[Figure: tree with nodes 0, 1, 2, 3, link delays x1, x2, x3 and end-to-end delays y1, y2]

A simple example (2)
Suppose there are 5 measurements: { (3,2), (4,2), (6,5), (0,0), (4,1) }
[Figure: tree with nodes 0, 1, 2, 3, link delays x1, x2, x3 and end-to-end delays y1, y2]

A simple example (3)-(5)
E-step: apply Bayes' formula to each measurement on the tree with link delays x1, x2, x3 and end-to-end delays y1, y2, and similarly for the remaining links.

A simple example (6) mi,j computed in the first iteration.

A simple example (7)
The physical meaning of αi,0: the number of packets that experience delay 0 on link i divided by the total number of packets that travel through link i.

A simple example (8) αi,j computed in the first iteration

A simple example (9)
• Iteration: repeat the E-step and M-step until some termination criterion is satisfied.
• After 6 iterations, αi,j converges to a fixed value.
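The whole example can be sketched in Python under some assumptions the slides do not spell out: link 1 is shared, so y1 = x1 + x2 and y2 = x1 + x3; the links are independent; and the starting pmf is uniform. With a different start or termination rule the iteration count may differ from the one quoted above, so this is an illustration of the E-step/M-step loop rather than a reproduction of the slide's numbers.

```python
import math

DELAYS = range(4)  # quantized link delays {0, 1, 2, 3}
MEASUREMENTS = [(3, 2), (4, 2), (6, 5), (0, 0), (4, 1)]  # (y1, y2) pairs from the slides


def joint_terms(alpha, y1, y2):
    """P[x1, x2 = y1 - x1, x3 = y2 - x1] for each x1 consistent with (y1, y2).

    Assumes y1 = x1 + x2 and y2 = x1 + x3 with independent links.
    """
    terms = {}
    for x1 in DELAYS:
        x2, x3 = y1 - x1, y2 - x1
        if x2 in DELAYS and x3 in DELAYS:
            terms[x1] = alpha[0][x1] * alpha[1][x2] * alpha[2][x3]
    return terms


def log_likelihood(alpha, data):
    """Log-probability of all measurements under the current pmfs."""
    return sum(math.log(sum(joint_terms(alpha, y1, y2).values()))
               for y1, y2 in data)


def em_step(alpha, data):
    # E-step: expected counts m[i][j] of delay j on link i, via Bayes' formula.
    m = [[0.0] * len(DELAYS) for _ in range(3)]
    for y1, y2 in data:
        terms = joint_terms(alpha, y1, y2)
        total = sum(terms.values())
        for x1, p in terms.items():
            w = p / total  # posterior P[x1 | y1, y2]
            m[0][x1] += w
            m[1][y1 - x1] += w
            m[2][y2 - x1] += w
    # M-step: normalize the expected counts on each link.
    return [[m_ij / sum(row) for m_ij in row] for row in m]


def run_em(data, iterations=20):
    # Assumed uniform starting point; the slides do not state the initial value.
    alpha = [[1.0 / len(DELAYS)] * len(DELAYS) for _ in range(3)]
    history = [log_likelihood(alpha, data)]
    for _ in range(iterations):
        alpha = em_step(alpha, data)
        history.append(log_likelihood(alpha, data))
    return alpha, history


alpha, history = run_em(MEASUREMENTS)
for i, row in enumerate(alpha, start=1):
    print(f"link {i}:", [round(p, 3) for p in row])
```

The `history` list holds the log-likelihood after each iteration; it should never decrease, which is a convenient check that the E-step and M-step are wired together correctly.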

A simple example (9)
Measurements: { (3,2), (4,2), (6,5), (0,0), (4,1) }

Complexity
