
Towards Unbiased End-to-End Diagnosis

Yao Zhao (1), Yan Chen (1), David Bindel (2)

  • (1) Lab for Internet & Security Tech, Northwestern Univ

  • (2) EECS department, UC Berkeley


Outline

  • Background and Motivation

  • MILS in Undirected Graph

  • MILS in Directed Graph

  • Evaluation

  • Conclusions



Linear Algebraic Model

Path loss rate p_i, link loss rate l_j

[Figure: example topology with nodes A, B, C, D, links 1, 2, 3, and paths p1, p2]

Usually an underconstrained system
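One common way to write this model, assuming independent link losses, is sketched below. The symbols x_j and b_i and the dimensions r (paths) and s (links) are notational assumptions introduced here, not taken from the slide.

```latex
1 - p_i = \prod_{j=1}^{s} (1 - l_j)^{G_{ij}},
\qquad
G_{ij} =
\begin{cases}
1 & \text{if path } i \text{ traverses link } j \\
0 & \text{otherwise}
\end{cases}

% Taking logarithms, with x_j = \log(1 - l_j) and b_i = \log(1 - p_i):
\sum_{j=1}^{s} G_{ij}\, x_j = b_i \quad (i = 1, \dots, r),
\qquad \text{i.e.} \qquad G x = b
```

When rank(G) is smaller than the number of links s, the system has infinitely many solutions for x, which is the sense in which it is underconstrained.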


Unidentifiable Links

  • Vectors That Are Linear Combinations of Row Vectors of G Are Identifiable

    • The property of a link (or link sequence) can be computed from the linear system if and only if the corresponding vector is identifiable

  • Otherwise, Unidentifiable

[Figure: the example topology (nodes A, B, C, D; links 1, 2, 3; paths p1, p2) annotated with the link vectors [ 0 0 1 ] and [ 1 0 0 ] ?]
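The identifiability test can be sketched as a rank check: a vector lies in the row space of G exactly when appending it to G leaves the rank unchanged. The routing matrix below is an assumption (p1 over links 1 and 2, p2 over links 1, 2 and 3), and the helper names `rank` and `is_identifiable` are hypothetical. Exact rational elimination is used here for simplicity.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        # Look for a pivot in this column at or below row r.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_identifiable(G, v):
    """v is identifiable iff it is a linear combination of the rows of G."""
    return rank(G + [list(v)]) == rank(G)


# Assumed routing matrix: one row per path, one column per link.
G = [[1, 1, 0],   # p1 traverses links 1 and 2
     [1, 1, 1]]   # p2 traverses links 1, 2 and 3

print(is_identifiable(G, [0, 0, 1]))  # True: link 3 = p2 - p1
print(is_identifiable(G, [1, 0, 0]))  # False: link 1 alone
print(is_identifiable(G, [1, 1, 0]))  # True: links 1 and 2 together
```

Under this assumed G, links 1 and 2 can only be diagnosed as a pair, while link 3 can be isolated.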


Motivation

  • Biased statistical assumptions are introduced to infer unidentifiable links

[Figure: a virtual link with the question "Loss rate?"; labels 0.1, 0, 0.1]

  • Loss = 0 if unicast tomography & RED

  • Loss rate = 0.1 if linear optimization


Least-biased End-to-end Network Diagnosis (LEND)

  • Basic Assumptions

    • End-to-end measurement can infer the end-to-end properties accurately

    • Link level properties are independent

  • Problem Formulation

    • Given end-to-end measurements, what is the finest granularity of link properties that we can achieve under the basic assumptions?

[Figure: spectrum from basic assumptions to more and stronger statistical assumptions; labels: Better accuracy, Diagnosis granularity?, Virtual link]


Least-biased End-to-end Network Diagnosis (LEND)

  • Contributions

    • Define the minimal identifiable unit under basic assumptions (MILS)

    • Prove that only E2E paths are MILSes with a directed graph topology (e.g., the Internet)

    • Propose the good path algorithm (incorporating measurement path properties) for finer MILSes



Outline

  • Background and Motivation

  • MILS in Undirected Graph

  • MILS in Directed Graph

  • Evaluation

  • Conclusions


Minimal Identifiable Link Sequence

  • Definition of MILS

    • The smallest path segments with loss rates that can be uniquely identified through end-to-end path measurements

    • Related to the sparse basis problem

      • NP-hard Problem

  • Properties of MILS

    • A MILS is a consecutive sequence of links

    • A MILS cannot be split into MILSes (minimal)

    • MILSes may be linearly dependent, or some MILSes may contain other MILSes


Examples of MILSes in Undirected Graph

[Figure: real links (solid) and all of the overlay paths (dotted) traversing them, with the resulting MILSes; labels 1 to 5, 1’ to 4’, a to e]

  • 3’+2’-1’-4’ → link 3


Outline

  • Background and Motivation

  • MILS in Undirected Graph

  • MILS in Directed Graph

  • Evaluation

  • Conclusions


Identify MILSes in Undirected Graphs

  • Preparation

    • Active or passive end-to-end path measurement

    • Optimization

      • Measure O(nlogn) paths and infer the n(n-1) end-to-end paths [SIGCOMM04]


Identify MILSes in Undirected Graphs

  • Preparation

  • Identify MILSes

    • Enumerate each link sequence to see if it is identifiable

    • Computational complexity: O(r × k × l^2)

      • r: the number of paths (O(n^2))

      • k: the rank of G (O(n log n))

      • l: the length of the paths

    • Takes only 4.2 seconds for the network with 135 PlanetLab hosts and 18,090 Internet paths
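The enumeration step can be sketched as follows. This is a naive illustration, not the O(r × k × l^2) procedure from the slides: it rebuilds the rank for every candidate segment. It reads "minimal" as "cannot be split into two identifiable pieces", which is one interpretation of the MILS properties; the function names and the two-path input are assumptions. Links are indexed from 0.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def find_milses(paths, num_links):
    """Return identifiable contiguous segments that cannot be split further.

    paths: list of link-index sequences, each in traversal order.
    """
    G = [[1 if j in p else 0 for j in range(num_links)] for p in paths]
    base = rank(G)

    def ident(segment):
        v = [1 if j in segment else 0 for j in range(num_links)]
        return rank(G + [v]) == base

    milses = set()
    for path in paths:
        for length in range(1, len(path) + 1):
            for start in range(len(path) - length + 1):
                seg = tuple(path[start:start + length])
                if not ident(seg):
                    continue
                # A split into several identifiable pieces implies a split
                # into two, so checking every single cut point is enough.
                if not any(ident(seg[:k]) and ident(seg[k:])
                           for k in range(1, length)):
                    milses.add(seg)
    return milses


# Assumed example: p1 over links 0, 1 and p2 over links 0, 1, 2.
paths = [[0, 1], [0, 1, 2]]
print(sorted(find_milses(paths, 3)))  # [(0, 1), (2,)]
```

In this sketch the whole path p2 is identifiable but is not reported, because it splits into the identifiable pieces (0, 1) and (2,).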


What about Directed Graphs?

  • Directed Graphs Are Essentially Different from Undirected Graphs

[Figure: directed example annotated with "Sum=1" (five times) and "Sum=0" (once), and the link vector [1 0 0 0 0 0] ?]

Theorem: In a directed graph, no end-to-end path contains an identifiable subpath if only topology information is considered
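A small numeric illustration of the theorem, under an assumed topology: three hosts A, B, C attached to one router R. With undirected links every access link is identifiable; with directed links none is. The topology, link numbering and helper names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def identifiable_links(paths, num_links):
    """For each single link, report whether its unit vector is identifiable."""
    G = [[1 if j in p else 0 for j in range(num_links)] for p in paths]
    base = rank(G)
    unit = lambda k: [1 if j == k else 0 for j in range(num_links)]
    return base, [rank(G + [unit(k)]) == base for k in range(num_links)]


# Undirected star. Links: 0 = A-R, 1 = B-R, 2 = C-R.
undirected = [[0, 1], [0, 2], [1, 2]]             # A-B, A-C, B-C

# Directed star. Links: 0 A->R, 1 R->A, 2 B->R, 3 R->B, 4 C->R, 5 R->C.
directed = [[0, 3], [0, 5], [2, 1], [2, 5], [4, 1], [4, 3]]

print(identifiable_links(undirected, 3))  # (3, [True, True, True])
print(identifiable_links(directed, 6))
# (5, [False, False, False, False, False, False])
```

In the directed case every path uses exactly one uplink and one downlink, so the vector with +1 on uplinks and -1 on downlinks is orthogonal to every row of G but not to any single-link vector.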


Good Path Algorithm

  • Consider Only Topology

    • Works for undirected graph

  • Incorporate Measurement Path Property

    • Most paths have no loss

      • PlanetLab experiments show 50% of paths in the Internet have no loss

    • All the links in a path of no loss are good links (Good Path Algorithm)
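A minimal sketch of the good path step, assuming the same hypothetical directed star (hosts A, B, C around router R) and assuming the path A->B is measured as lossless. Links on a lossless path are marked good and their columns are zeroed, since a loss-free link contributes nothing to the linear system. The function names are illustrative.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col] / m[r][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def good_path_reduce(G, lossless_rows):
    """Mark links on lossless paths as good and remove them from G."""
    good = {j for i in lossless_rows for j, g in enumerate(G[i]) if g}
    reduced = [[0 if j in good else g for j, g in enumerate(row)]
               for i, row in enumerate(G) if i not in lossless_rows]
    return good, reduced


# Directed star. Links: 0 A->R, 1 R->A, 2 B->R, 3 R->B, 4 C->R, 5 R->C.
paths = [[0, 3], [0, 5], [2, 1], [2, 5], [4, 1], [4, 3]]
G = [[1 if j in p else 0 for j in range(6)] for p in paths]

# Assumption for this example: path A->B (row 0) shows no loss.
good, reduced = good_path_reduce(G, lossless_rows={0})
base = rank(reduced)
remaining = [k for k in range(6) if k not in good]
status = {k: rank(reduced + [[1 if j == k else 0 for j in range(6)]]) == base
          for k in remaining}

print(sorted(good))  # [0, 3]
print(status)        # {1: True, 2: True, 4: True, 5: True}
```

With topology alone no link in this directed star is identifiable; a single lossless path is enough here to make every remaining link identifiable.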


Good Path Algorithm

  • Symmetric Property is broken when using good path algorithm


Other Features of LEND

  • Dynamic Update for Topology and Link Property Changes

    • End hosts join or leave, routing changes or path property changes

    • Incremental update algorithms are very efficient

  • Combine with Statistical Diagnosis

    • Inference with MILSes is equivalent to inference with the whole end-to-end paths

    • Reduce computational complexity because MILSes are shorter than paths

      • Example: applying statistical tomography methods in [Infocom03] on MILSes is 5x faster than on paths


Outline

  • Motivation

  • MILS in Undirected Graph

  • MILS in Directed Graph

  • Evaluation

  • Conclusions


Evaluation Metrics

  • Diagnosis Granularity

    • Average length of all the lossy MILSes in lossy path

  • Accuracy

    • Simulations

      • Absolute error and relative error

    • Internet experiments

      • Cross validation

      • IP spoof based consistency check

  • Speed

    • Running time for finding all MILSes and loss rate inference


Methodology

  • PlanetLab Testbed

    • 135 end hosts, each from a different institution

    • 18,090 end-to-end paths

  • Topology Measured by Traceroute

    • Avg path length is 15.2

  • Path Loss Rate by Active UDP Probing with Small Overhead



Distribution of Length of MILSes

  • Most MILSes are pretty short

  • Some MILSes are longer than 10 hops

    • Some paths do not overlap with any other paths

[Figure: distribution of MILS length; annotations: most MILSes are short, a few MILSes are very long]


Other Results

  • MILS to AS Mapping

    • 33.6% of lossy MILSes comprise only one physical link

      • 81.8% of them connect two ASes

  • Accuracy

    • Cross validation (99.0%)

    • IP spoof based consistency check (93.5%)

  • Speed

    • 4.2 seconds for MILS computations

    • 109.3 seconds for setup of scalable active monitoring [SIGCOMM04]


Conclusion

  • Link-level property inference in directed graphs is completely different from that in undirected graphs

  • With the least biased assumptions, LEND uses the good path algorithm to infer link-level loss rates, achieving

    • Good inference accuracy

    • Acceptable diagnosis granularity in practice

    • Online monitoring and diagnosis

  • Continuous monitoring and diagnosis services on PlanetLab are under construction


Thank You!

For more info:

http://list.cs.northwestern.edu/lend/

Questions?


Motivation

  • End-to-End Network Diagnosis

  • Under-constrained Linear System

    • Unidentifiable Links exist

[Figure: example network with nodes A, B and R]

To simplify presentation, assume undirected graph model



Identifiable and Unidentifiable

  • Vectors That Are Linear Combinations of Row Vectors of G Are Identifiable

  • Otherwise, Unidentifiable

[Figure: the example topology (nodes A, B, C, D; links 1, 2, 3; paths p1, p2) next to a plot on axes x1, x2, x3 of the row (path) space (identifiable), with the vectors (1,1,0), (0,0,1) and (1,1,1)]


Examples of MILSes in Undirected Graph

[Figure: three example topologies showing real links (solid) and all of the overlay paths (dotted) traversing them, with the resulting MILSes; panel labels: Rank(G)=1, Rank(G)=3, Rank(G)=4]

  • 3’+2’-1’-4’ → link 3


Identify MILSes in Undirected Graphs

  • Preparation

  • Identify MILSes

    • Compute Q as the orthonormal basis of R(G^T) (saved by preparation step)

    • For a vector v in R(G^T), ||v|| = ||Q^T v||

[Figure: vectors v1 and v2 on axes x1, x2, x3]
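The norm test can be sketched in a few lines. Gram-Schmidt is used here as a stand-in for whatever factorization the preparation step actually produces; the routing matrix, tolerance and function names are assumptions made for illustration.

```python
import math


def orthonormal_basis(rows, tol=1e-9):
    """Gram-Schmidt on the rows of G: an orthonormal basis Q of R(G^T)."""
    basis = []
    for row in rows:
        w = [float(x) for x in row]
        for q in basis:
            c = sum(a * b for a, b in zip(w, q))
            w = [a - c * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > tol:  # skip rows that depend on earlier ones
            basis.append([a / norm for a in w])
    return basis


def is_identifiable(Q, v, tol=1e-9):
    """v lies in R(G^T) iff projecting onto Q preserves its norm."""
    coeffs = [sum(a * b for a, b in zip(q, v)) for q in Q]  # Q^T v
    norm_proj = math.sqrt(sum(c * c for c in coeffs))
    norm_v = math.sqrt(sum(a * a for a in v))
    return abs(norm_v - norm_proj) < tol


# Assumed routing matrix: p1 over links 1, 2 and p2 over links 1, 2, 3.
G = [[1, 1, 0],
     [1, 1, 1]]
Q = orthonormal_basis(G)  # computed once, reused for every candidate vector

print(is_identifiable(Q, [0, 0, 1]))  # True
print(is_identifiable(Q, [1, 0, 0]))  # False
```

Computing Q once and reusing it means each candidate link sequence costs only one small matrix-vector product.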


Flowchart of LEND System

  • Step 1

    • Monitors O(n·log n) paths that can fully describe all the O(n^2) paths [SIGCOMM04]

    • Or passive monitoring

  • Step 2

    • Apply good path algorithm before identifying MILSes as in undirected graph

[Figure: flowchart with boxes "Measure topology to get G", "Active or passive monitoring", "Good path algorithm on G", "Iteratively check all possible MILSes" and "Compute loss rates of MILSes", grouped into Stage 1 (set up scalable monitoring system for diagnosis) and Stage 2 (online update of the measurements and diagnosis)]


Evaluation with Simulation

  • Metrics

    • Diagnosis granularity

      • Average length of all the lossy MILSes in lossy path (in the unit of link or virtual link)

    • Accuracy

      • Absolute error |p – p’ |:

      • Relative error


Simulation Methodology

  • Topology type

    • Three types of BRITE router-level topologies

    • Mercator topology

  • Topology size

    • 1000 ~ 20000 or 284k nodes

  • Number of end hosts on the overlay network

    • 50 ~ 300

  • Link loss rate distribution

    • LLRD1 and LLRD2 models

  • Loss model

    • Bernoulli and Gilbert


Sample of Simulation Results

  • Mercator (284k nodes) with Gilbert loss model and LLRD1 loss distribution


Related Works

  • Pure End-to-End Approaches

    • Internet Tomography

      • Multicast or unicast with loss correlation

    • Uncorrelated end-to-end schemes

  • Router Response Based Approach

    • Tulip and Cing


MILS to AS Mapping

  • IP-to-AS mapping constructed from BGP routing tables

  • Consider the short MILSes with length 1 or 2

    • Constitute about 44% of all lossy MILSes

    • Most lossy links connect two different ASes


Accuracy Validation

  • Cross Validation (99.0% consistent)

  • IP Spoof based Consistency Checking.

  • UDP: Src: A, Dst: B, TTL=255

  • UDP: Src: C, Dst: B, TTL=2

  • ICMP: Src: R3, Dst: C, TTL=255

  • UDP: Src: A, Dst: C, TTL=255

[Figure: example topology with hosts A, B, C and routers R1, R2, R3]

IP Spoof based Consistency: 93.5%