
# Towards Unbiased End-to-End Diagnosis - PowerPoint PPT Presentation

Towards Unbiased End-to-End Diagnosis. Yao Zhao 1 , Yan Chen 1 , David Bindel 2. Lab for Internet & Security Tech, Northwestern Univ EECS department, UC Berkeley. Outline. Background and Motivation MILS in Undirected Graph MILS in Directed Graph Evaluation Conclusions.

Presentation Transcript

### Towards Unbiased End-to-End Diagnosis

Yao Zhao¹, Yan Chen¹, David Bindel²

¹ Lab for Internet & Security Tech, Northwestern Univ

² EECS department, UC Berkeley

• Background and Motivation

• MILS in Undirected Graph

• MILS in Directed Graph

• Evaluation

• Conclusions

93 hours?

Path loss rate p_i, link loss rate l_j:

[Figure: example topology with nodes A, B, C, D, links 1, 2, 3, and measured paths p1, p2]

Linear Algebraic Model

Usually an underconstrained system
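The equations from this slide did not survive in the transcript. A formulation commonly used in loss tomography, assumed here rather than recovered from the slide, treats link losses as independent, so that path success rates multiply along the links. The symbols G, b, and x below are labels chosen for this sketch (G matches the matrix name used later in the slides):

```latex
1 - p_i = \prod_{j} (1 - l_j)^{G_{ij}}
\quad\Longrightarrow\quad
\log(1 - p_i) = \sum_{j} G_{ij}\,\log(1 - l_j),
\qquad\text{i.e.}\qquad b = G\,x,
\quad b_i = \log(1 - p_i),\; x_j = \log(1 - l_j),
```

where G_{ij} = 1 if path i traverses link j and 0 otherwise. When the measured paths give fewer independent rows than there are links, rank(G) is smaller than the number of unknowns in x, which is what makes the system underconstrained.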

• Vectors That Are Linear Combinations of Row Vectors of G Are Identifiable

• The property of a link (or link sequence) can be computed from the linear system if and only if the corresponding vector is identifiable

• Otherwise, Unidentifiable
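The identifiability condition above can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the two measured paths have row vectors (1,1,0) and (1,1,1), as the vector labels on a later slide suggest, and the helper name `is_identifiable` is made up for illustration.

```python
import numpy as np

def is_identifiable(G, v, tol=1e-9):
    """Return True if v is a linear combination of the rows of G."""
    G = np.atleast_2d(np.asarray(G, dtype=float))
    v = np.asarray(v, dtype=float)
    # v lies in the row space of G exactly when appending it as an
    # extra row does not increase the rank.
    rank_G = np.linalg.matrix_rank(G, tol=tol)
    rank_Gv = np.linalg.matrix_rank(np.vstack([G, v]), tol=tol)
    return rank_Gv == rank_G

# Assumed path matrix: one row per path, one column per link.
G = np.array([[1, 1, 0],
              [1, 1, 1]])

print(is_identifiable(G, [0, 0, 1]))  # True: row 2 minus row 1
print(is_identifiable(G, [1, 0, 0]))  # False: link 1 alone is not identifiable
```

Under this assumed G, link 3 is identifiable on its own, while links 1 and 2 can only be identified together as the sequence (1,1,0).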

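[Figure: the example topology (nodes A, B, C, D; links 1, 2, 3; paths p1, p2) annotated with the vector [0 0 1] and the values 0 and 0.1]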

Motivation

• Biased statistical assumptions are introduced to infer unidentifiable links

Loss rate?

Loss = 0 if unicast tomography & RED

Loss rate = 0.1 if linear optimization

• Basic Assumptions

• End-to-end measurement can infer the end-to-end properties accurately

• Link level properties are independent

• Problem Formulation

• Given end-to-end measurements, what is the finest granularity of link properties that we can achieve under the basic assumptions?

[Figure: spectrum running from basic assumptions to more and stronger statistical assumptions, labeled "Better accuracy" and "Diagnosis granularity?"]

• Contributions

• Define the minimal identifiable unit under basic assumptions (MILS)

• Prove that only E2E paths are MILS with a directed graph topology (e.g., the Internet)

• Propose the good path algorithm (incorporating measurement path properties) for finer MILSes



• Definition of MILS

• The smallest path segments with loss rates that can be uniquely identified through end-to-end path measurements

• Related to the sparse basis problem

• NP-hard Problem

• Properties of MILS

• A MILS is a consecutive sequence of links

• A MILS cannot be split into MILSes (minimal)

• MILSes may be linearly dependent, or some MILSes may contain other MILSes

[Figure: real links (solid) and all of the overlay paths (dotted) traversing them, with the resulting MILSes; nodes a to e, links 1 to 5, MILSes 1' to 4']


• Preparation

• Active or passive end-to-end path measurement

• Optimization

• Measure O(n log n) paths and infer the n(n-1) end-to-end paths [SIGCOMM04]

• Preparation

• Identify MILSes

• Enumerate each link sequence to see if it is identifiable

• Computational complexity: O(r × k × l²)

• r: the number of paths (O(n²))

• k: the rank of G (O(n log n))

• l: the length of the paths

• Only takes 4.2 seconds for the network with 135 PlanetLab hosts and 18,090 Internet paths
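The enumeration step can be sketched as below. This is an illustrative brute-force version, not the authors' implementation, and it makes two assumptions: paths are given as ordered lists of link indices in an undirected model, and "cannot be split" is read as "no proper prefix of the sequence is itself identifiable" (if a prefix were identifiable, the remaining suffix would be too). The names `link_vector` and `find_mils` are hypothetical.

```python
import numpy as np

def link_vector(links, num_links):
    """0/1 indicator vector of a link sequence."""
    v = np.zeros(num_links)
    v[list(links)] = 1.0
    return v

def find_mils(paths, num_links, tol=1e-9):
    """Enumerate contiguous link sequences of each path and keep the
    identifiable ones that have no identifiable proper prefix."""
    G = np.array([link_vector(p, num_links) for p in paths])
    # Orthonormal basis Q of the row space of G (via SVD in this sketch).
    _, s, vt = np.linalg.svd(G, full_matrices=False)
    Q = vt[s > tol].T

    def identifiable(links):
        v = link_vector(links, num_links)
        # v is in the row space iff projecting onto Q preserves its norm.
        return abs(np.linalg.norm(v) - np.linalg.norm(Q.T @ v)) <= tol

    mils = set()
    for path in paths:
        for start in range(len(path)):
            for end in range(start + 1, len(path) + 1):
                segment = tuple(path[start:end])
                if identifiable(segment):
                    # Shortest identifiable segment from this start;
                    # longer ones would split at this point.
                    mils.add(segment)
                    break
    return sorted(mils)

# Assumed toy input: path 0 uses links 0,1; path 1 uses links 0,1,2.
print(find_mils([[0, 1], [0, 1, 2]], num_links=3))  # [(0, 1), (2,)]
```

In the toy input, links 0 and 1 always appear together, so they form a single MILS, while link 2 is identifiable by itself.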

[Figure: annotations Sum=1 (four times) and Sum=0]

• Directed graphs are essentially different from undirected graphs

[1 0 0 0 0 0] ?

Theorem: In a directed graph, no end-to-end path contains an identifiable subpath if only topology information is considered

• Consider Only Topology

• Works for undirected graph

• Incorporate Measurement Path Property

• Most paths have no loss

• PlanetLab experiments show 50% of paths in the Internet have no loss

• All the links in a path of no loss are good links (Good Path Algorithm)

• Symmetric Property is broken when using good path algorithm
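The good path idea above can be sketched as a matrix reduction: every link on a loss-free path is marked good and dropped from the unknowns, and the loss-free paths themselves are dropped from the system. This is a minimal sketch under assumptions, not the authors' code; the function name `good_path_reduce` and the `loss_threshold` parameter are hypothetical, and a real deployment would need a threshold that tolerates measurement noise.

```python
import numpy as np

def good_path_reduce(G, path_loss, loss_threshold=0.0):
    """Remove loss-free paths and every link they traverse.

    G: 0/1 matrix, one row per path, one column per link.
    path_loss: measured loss rate of each path.
    Returns the reduced matrix plus the original indices of the
    remaining paths and links.
    """
    G = np.asarray(G)
    path_loss = np.asarray(path_loss, dtype=float)
    good_paths = path_loss <= loss_threshold
    # A link is good if it appears on at least one good path.
    good_links = G[good_paths].any(axis=0)
    keep_paths = ~good_paths
    keep_links = ~good_links
    reduced = G[np.ix_(keep_paths, keep_links)]
    return reduced, np.flatnonzero(keep_paths), np.flatnonzero(keep_links)

# Assumed toy input: path 0 uses links 0,1 and is loss-free;
# path 1 uses links 0,2 and loses 10% of packets.
G = np.array([[1, 1, 0],
              [1, 0, 1]])
reduced, paths, links = good_path_reduce(G, [0.0, 0.1])
print(reduced.tolist(), paths.tolist(), links.tolist())  # [[1]] [1] [2]
```

After the reduction, the lossy path covers only link 2, so that link becomes identifiable even though it was not identifiable from topology alone.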

• Dynamic Update for Topology and Link Property Changes

• End hosts join or leave, routing changes or path property changes

• Incremental update algorithms are very efficient

• Combine with Statistical Diagnosis

• Inference with MILSes is equivalent to inference with the whole end-to-end paths

• Reduce computational complexity because MILSes are shorter than paths

• Example: applying statistical tomography methods in [Infocom03] on MILSes is 5x faster than on paths


• Diagnosis Granularity

• Average length of all the lossy MILSes in lossy paths

• Accuracy

• Simulations

• Absolute error and relative error

• Internet experiments

• Cross validation

• IP spoof based consistency check

• Speed

• Running time for finding all MILSes and loss rate inference

• PlanetLab Testbed

• 135 end hosts, each from a different institute

• 18,090 end-to-end paths

• Topology Measured by Traceroute

• Avg path length is 15.2

• Path Loss Rate by Active UDP Probing with Small Overhead

• Most MILSes are pretty short

• Some MILSes are longer than 10 hops

• Some paths do not overlap with any other paths


• MILS to AS Mapping

• 33.6% of lossy MILSes comprise only one physical link

• 81.8% of them connect two ASes

• Accuracy

• Cross validation (99.0%)

• IP spoof based consistency check (93.5%)

• Speed

• 4.2 seconds for MILS computations

• 109.3 seconds for setup of scalable active monitoring [SIGCOMM04]

• Link-level property inference in directed graphs is completely different from that in undirected graphs

• With the least biased assumptions, LEND uses the good path algorithm to infer link-level loss rates, achieving

• Good inference accuracy

• Acceptable diagnosis granularity in practice

• Online monitoring and diagnosis

• Continuous monitoring and diagnosis services on PlanetLab under construction

http://list.cs.northwestern.edu/lend/

Questions?


Motivation

• End-to-End Network Diagnosis

• Under-constrained Linear System

To simplify presentation, assume undirected graph model

• Vectors That Are Linear Combinations of Row Vectors of G Are Identifiable

• Otherwise, Unidentifiable

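[Figure: row (path) space (identifiable) drawn on axes x1, x2, x3 with the vectors (1,1,0), (1,1,1), and (0,0,1), next to the example topology with nodes A, B, C, D, links 1, 2, 3, and paths p1, p2]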

Examples of MILSes in Undirected Graph

[Figure: three example topologies with Rank(G)=1, Rank(G)=3, and Rank(G)=4, showing real links (solid) and all of the overlay paths (dotted) traversing them, with the resulting MILSes]

Identify MILSes in Undirected Graphs

• Preparation

• Identify MILSes

• Compute Q as the orthonormal basis of R(Gᵀ) (saved by preparation step)

• For a vector v in R(Gᵀ), ||v|| = ||Qᵀv||
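The norm test above can be sketched as follows. This is an illustration under assumptions, not the authors' code: it builds Q from an SVD rather than whatever factorization the preparation step actually uses, the path matrix is an assumed toy example, and the helper names are hypothetical.

```python
import numpy as np

def row_space_basis(G, tol=1e-9):
    """Orthonormal basis Q (as columns) of R(G^T), the row space of G."""
    G = np.asarray(G, dtype=float)
    _, s, vt = np.linalg.svd(G, full_matrices=False)
    # Right singular vectors with nonzero singular values span R(G^T).
    return vt[s > tol].T

def in_row_space(Q, v, tol=1e-9):
    """v is in R(G^T) iff projecting onto Q leaves its norm unchanged."""
    v = np.asarray(v, dtype=float)
    return abs(np.linalg.norm(v) - np.linalg.norm(Q.T @ v)) <= tol

# Assumed path matrix: rows (1,1,0) and (1,1,1).
Q = row_space_basis([[1, 1, 0],
                     [1, 1, 1]])
print(in_row_space(Q, [0, 0, 1]))  # True
print(in_row_space(Q, [1, 0, 0]))  # False: projection shrinks the norm
```

Computing Q once and reusing it keeps each identifiability check down to one matrix-vector product, which is presumably why the slide notes that Q is saved by the preparation step.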


• Step 1

• Monitor O(n log n) paths that can fully describe all O(n²) paths [SIGCOMM04]

• Or passive monitoring

• Step 2

• Apply the good path algorithm before identifying MILSes, as in the undirected graph case

[Figure: system workflow in two stages. Stage 1: set up scalable monitoring system for diagnosis. Stage 2: online update of the measurements and diagnosis. Steps shown: measure topology to get G; active or passive monitoring; good path algorithm on G; iteratively check all possible MILSes; compute loss rates of MILSes]
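The final step, computing the loss rate of a MILS from path measurements, can be sketched as below. This is an assumption-laden illustration, not the authors' code: it assumes independent link losses, so that log(1 - p) adds up along a path, and it expresses the MILS vector v as a combination of the rows of G via least squares. The function name `mils_loss_rate` and the toy numbers are hypothetical.

```python
import numpy as np

def mils_loss_rate(G, path_loss, v):
    """Loss rate of the link sequence with indicator vector v.

    Assumes v is identifiable (in the row space of G) and that
    link losses are independent.
    """
    G = np.asarray(G, dtype=float)
    # b_i = log(1 - p_i) for each measured path.
    b = np.log1p(-np.asarray(path_loss, dtype=float))
    # Find weights w with G^T w = v, i.e. v as a combination of path rows.
    w, *_ = np.linalg.lstsq(G.T, np.asarray(v, dtype=float), rcond=None)
    # The same combination of the b_i gives log(1 - loss of the MILS).
    return 1.0 - np.exp(w @ b)

# Assumed toy input: path 0 uses links 0,1 (5% loss); path 1 uses
# links 0,1,2 (14.5% loss, i.e. success 0.95 * 0.90).
G = np.array([[1, 1, 0],
              [1, 1, 1]])
loss = mils_loss_rate(G, [0.05, 0.145], [0, 0, 1])
print(round(loss, 6))  # 0.1
```

If v is not identifiable, the least-squares fit still returns a number, so a real pipeline should check identifiability first.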

• Metrics

• Diagnosis granularity

• Average length of all the lossy MILSes in lossy path (in the unit of link or virtual link)

• Accuracy

• Absolute error: |p − p′|

• Relative error

• Topology type

• Three types of BRITE router-level topologies

• Mercator topology

• Topology size

• 1000 ~ 20000 or 284k nodes

• Number of end hosts on the overlay network

• 50 ~ 300

• LLRD1 and LLRD2 models

• Loss model

• Bernoulli and Gilbert

• Mercator (284k nodes) with Gilbert loss model and LLRD1 loss distribution

• Pure End-to-End Approaches

• Internet Tomography

• Multicast or unicast with loss correlation

• Uncorrelated end-to-end schemes

• Router Response Based Approach

• Tulip and Cing

• IP-to-AS mapping constructed from BGP routing tables

• Consider the short MILSes with length 1 or 2

• These make up about 44% of all lossy MILSes

• Most lossy links connect two different ASes

• Cross Validation (99.0% consistent)

• IP Spoof based Consistency Checking.

• UDP: Src: A, Dst: B, TTL=255

• UDP: Src: C, Dst: B, TTL=2

• ICMP: Src: R3, Dst: C, TTL=255

• UDP: Src: A, Dst: C, TTL=255

[Figure: hosts A, B, C and routers R1, R2, R3 used in the IP spoof based consistency check]

IP Spoof based Consistency: 93.5%