
AS Relationship Inference


ryann

Presentation Transcript


  1. AS Relationship Inference • AS Graph and AS Relationship Inference • Gao’s Degree-based Heuristics • SARK’s “Multiple Vantage Point” Approach • Computation/Optimization based Approach • Summary and Discussion CSci8211: AS Relation Inference

  2. AS Relationship Inference • AS graph as a (simple) model for Internet structure • nodes: ASes; edges: BGP connections between ASes • not the same as “physical” topology • “connectivity” in AS graph does not mean “reachability” • BGP policy based • Need to augment “edges” in AS graph with types of relationships • AS relationship inference problem [Figure: AS 1 and AS 2 joined by an edge labeled “Relationship?”]

  3. Applications of AS Relationships Some examples • Construct Internet distance map • Place proxy or mirror site servers • Potentially avoid route divergence • Help ISPs or domain administrators to achieve load balancing and congestion avoidance • Help ISPs or companies to plan for future contractual agreements • Help ISPs to reduce effect of misconfiguration and to debug router configuration files

  4. Internet AS Graph (from www.caida.org)

  5. Data Sources • How to obtain an Internet graph at AS level? • BGP routing tables (AS PATH) • Active probes (traceroute) • BGP routing tables (BGP views) • Route Views Project (www.routeviews.org/) • RIPE (www.ripe.net/) • UMN (www.cs.umn.edu/research/networking/BGP/traces/) • …. • Traceroute data (need to map IPs to ASNs) • CAIDA • Route servers via telnet (www.traceroute.org/#Route%20Servers) • iPlane • ….

  6. Caveats Challenge: Can we get a complete Internet AS graph? If not, why? • Impact of partial BGP views • Where are the vantage points? • What is likely to be missed? • Impact of (partial) traceroute data • … • Beware of sampling bias

  7. AS Relationship Inference Problem Basic Assumptions • Most common AS relationships • provider-customer • peer-to-peer • (some may be sibling-sibling) • Common BGP routing practices • Prefer customer routes over peer/provider routes • Prefer peer routes over provider routes • Do not export peer routes to providers • Do not export provider routes to peers • (most) AS paths in BGP routing table entries are valley-free

  8. Valley-Free Property • An AS path (u1, u2, …, un) is valley-free if and only if a provider-to-customer (p-c) or peer-to-peer (p-p) edge is followed only by p-c or sibling-to-sibling (s-s) edges [Figure: example AS paths over ASes 1 to 7 with edges labeled P-C, C-P and P-P; several edges are marked “possibly missing”]
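The valley-free rule on this slide can be checked mechanically once every hop of a path carries a relationship label. The following is a minimal sketch, assuming the labels are given in the direction of travel; the label strings and the function name `is_valley_free` are illustrative choices, not part of the slides.

```python
from typing import Sequence

# Hypothetical edge labels, read in the direction the path is traversed.
C2P, P2C, P2P, S2S = "c2p", "p2c", "p2p", "s2s"

def is_valley_free(edge_types: Sequence[str]) -> bool:
    """Return True if every p2c or p2p hop is followed only by p2c or s2s hops."""
    descending = False  # set once a p2c or p2p hop has been seen
    for t in edge_types:
        if descending and t in (C2P, P2P):
            return False  # climbing (or peering) again after the top: a valley
        if t in (P2C, P2P):
            descending = True
    return True
```

A sibling hop leaves the state unchanged, so a path such as p2c, s2s, c2p is still rejected.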

  9. Abstract Model: ToR Graph • Given a graph G=(V,E) • Edges are either “directed” or “undirected” (but their orientations are unknown) • directed edge <u,v>: u is the customer, v the provider • undirected edge (u,v): u and v are peers • We are given a set of paths P (in G) • AS relationship inference problem: • ToR Edge Orientation Problem: orient (some) edges in G so as to minimize the # of paths in P that are “invalid”, i.e., non-valley-free

  10. AS1 AS7 AS6 AS2 AS3 customer-to-provider edge AS4 AS5 peer-peer edge sibling-sibling edge AS ToR Graph

  11. A “Simplified” Problem • For any “valley-free” path p=(u1, …, um) with a “flat top” (ui, ui+1), i.e., an undirected edge, we can orient this edge either way and it is still a “valley-free” path! • Given any solution to the ToR edge orientation problem with some undirected edges, we can orient these edges without increasing the # of invalid paths in P! • A simpler version of the problem: • ToR-Simple: orient all edges in G so as to minimize the # of paths in P that are “invalid”, i.e., non-valley-free • A general two-step process: • solve ToR-Simple first (i.e., customer-provider edges); • then figure out “peer-to-peer” edges • this points to the difficulty of inferring “peer-to-peer” edges • some “arbitrariness” is involved

  12. Gao’s Algorithm • A degree-based approach • Intuitively, ASes with high degrees (# of AS neighbors) are likely providers • Two-step process • First, try to orient all edges to minimize invalid paths • Intuition for step 1: • given a path p=(u1, …, um), pick uk, where degree(uk) is the largest among all ui, and make uk the top provider on the path • there may be many paths involving uk • need to figure out which paths and which nodes to start with • Then pick some “customer-provider” edges and “flatten” them out as “peer-to-peer” edges • Heuristic: nodes with similar degrees are likely peers • Full mesh at the top of the hierarchy • Also address “sibling” edges

  13. Basic Algorithm • Heuristics: • Top provider has largest degree • Based on patterns on BGP routing table entries • Consecutive AS pairs on the left of top provider are customer-to-provider or sibling-sibling edges • Consecutive AS pairs on the right of top provider are provider-to-customer or sibling-sibling edges
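The heuristic on this slide can be sketched in a few lines. This is an illustrative reading rather than Gao's published code: the function name `basic_inference`, the label strings, and the choice to treat a pair seen in both directions as siblings are assumptions, and ties for the highest degree are broken by taking the first such AS on the path.

```python
from collections import defaultdict

def basic_inference(paths):
    """Label adjacent AS pairs using the top-provider (highest degree) heuristic."""
    neighbors = defaultdict(set)
    for path in paths:
        for a, b in zip(path, path[1:]):
            neighbors[a].add(b)
            neighbors[b].add(a)
    degree = {asn: len(n) for asn, n in neighbors.items()}

    transit = set()  # (a, b) in transit: b is inferred to provide transit for a
    for path in paths:
        if len(path) < 2:
            continue
        top = max(range(len(path)), key=lambda i: degree[path[i]])
        for i in range(len(path) - 1):
            a, b = path[i], path[i + 1]
            # Left of the top provider the path climbs; right of it, it descends.
            transit.add((a, b) if i < top else (b, a))

    edge = {}
    for a, b in transit:
        if (b, a) in transit:
            edge[(a, b)] = "sibling-to-sibling"
        else:
            edge[(a, b)] = "customer-to-provider"
            edge[(b, a)] = "provider-to-customer"
    return edge
```

For example, with paths `[[1, 2, 3], [4, 2, 3], [5, 3]]`, AS 2 has the highest degree, so ASes 1, 3 and 4 are labeled as its customers.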

  14. Basic Algorithm ... • Computational complexity • O(N), where N is the total number of consecutive AS pairs in the routing table • Problem • BGP misconfiguration: some BGP-speaking routers do not conform to the selective export rule • Example: u and v are providers of w; w announces the route learned from v to u, and we get the path (u, w, v). Suppose d(v) is the max; the pair (u, w) then lies left of the top provider, so we infer Edge[u,w] = c2p, i.e., w as a provider of u • Consequence: incorrect inference of AS relationships • Solution: Refined algorithm

  15. Refined Algorithm • 1. Compute the degree for each AS • Degree[u] = |neighbor[u]| • 2. Count the # of paths that imply an AS pair has a customer-provider (transit) relationship (e.g., 217 57 11537 10466 55; 217 57 3908 19092 209) • For each AS path (u1, u2, …, un), find j such that degree[uj] is the maximum • For i = 1 to j – 1: transit[ui, ui+1] = transit[ui, ui+1] + 1 • For i = j to n – 1: transit[ui+1, ui] = transit[ui+1, ui] + 1

  16. Refined Algorithm (cont.) • 3. Assign relationships to AS pairs • For each AS path (u1, u2, …, un) • For i = 1, …, n-1 • If (transit[ui, ui+1] > L and transit[ui+1, ui] > L) or (0 < transit[ui, ui+1] <= L and 0 < transit[ui+1, ui] <= L) • Edge[ui, ui+1] = sibling-to-sibling • Else if transit[ui+1, ui] > L or transit[ui, ui+1] = 0 • Edge[ui, ui+1] = provider-to-customer • Else if transit[ui, ui+1] > L or transit[ui+1, ui] = 0 • Edge[ui, ui+1] = customer-to-provider • L: a small constant
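Steps 1 to 3 of the refined algorithm translate fairly directly into code. The sketch below follows the slide's counting and threshold rules; the function name `refined_inference`, the label strings, and the default `L=1` (the value quoted with the inference results) are choices made for illustration.

```python
from collections import Counter, defaultdict

def refined_inference(paths, L=1):
    """Count transit evidence per AS pair, then label pairs using threshold L."""
    neighbors = defaultdict(set)
    for path in paths:
        for a, b in zip(path, path[1:]):
            neighbors[a].add(b)
            neighbors[b].add(a)
    degree = {asn: len(n) for asn, n in neighbors.items()}

    transit = Counter()  # transit[(a, b)]: # of paths implying b transits for a
    for path in paths:
        if len(path) < 2:
            continue
        top = max(range(len(path)), key=lambda i: degree[path[i]])
        for i in range(len(path) - 1):
            a, b = path[i], path[i + 1]
            transit[(a, b) if i < top else (b, a)] += 1

    edge = {}
    for path in paths:
        for a, b in zip(path, path[1:]):
            up, down = transit[(a, b)], transit[(b, a)]
            if (up > L and down > L) or (0 < up <= L and 0 < down <= L):
                edge[(a, b)] = "sibling-to-sibling"
            elif down > L or up == 0:
                edge[(a, b)] = "provider-to-customer"
            elif up > L or down == 0:
                edge[(a, b)] = "customer-to-provider"
    return edge
```

Compared with the basic version, a single stray path in the opposite direction no longer flips a pair that many other paths agree on, because counts at or below `L` are outvoted by counts above it.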

  17. Final Algorithm • Phase 1: use either the basic or the refined algorithm to coarsely classify AS pairs into provider-customer or sibling relationships • Phase 2: identify AS pairs that cannot have a peering relationship • For each AS path (u1, u2, …, un) • find the AS uj such that degree[uj] is the maximum of degree[ui] over 1 <= i <= n • for i = 1, …, j-2 • notpeering[ui, ui+1] = 1 • for i = j+1, …, n-1 • notpeering[ui, ui+1] = 1 • if edge[uj-1, uj] <> sibling-to-sibling and edge[uj, uj+1] <> sibling-to-sibling • if degree[uj-1] > degree[uj+1] • notpeering[uj, uj+1] = 1 • else • notpeering[uj-1, uj] = 1

  18. Final Algorithm (cont.) • Phase 3: assign peering relationships to AS pairs • For each AS path (u1, u2, …, un) • For i = 1, …, n-1 • If notpeering[ui, ui+1] <> 1 and notpeering[ui+1, ui] <> 1 and degree[ui] / degree[ui+1] < R and degree[ui] / degree[ui+1] > 1/R • edge[ui, ui+1] = peer-to-peer • R: a sensitive constant, very difficult to set properly
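Phases 2 and 3 can be sketched as a post-processing step over the Phase 1 labels. This is a hedged illustration: the function name `infer_peering`, the dictionary-based inputs, and the decision to relabel both directions of a peering pair are assumptions, and the top-provider check is applied only when the top provider has a neighbor on both sides of the path.

```python
SIBLING = "sibling-to-sibling"
PEER = "peer-to-peer"

def infer_peering(paths, degree, edge, R=60.0):
    """Return a copy of `edge` in which likely peering pairs are relabeled."""
    not_peering = set()
    for path in paths:
        n = len(path)
        if n < 2:
            continue
        j = max(range(n), key=lambda i: degree[path[i]])  # index of top provider
        # Phase 2: pairs that do not touch the top provider cannot be peers.
        for i in range(n - 1):
            if i < j - 1 or i > j:
                not_peering.add((path[i], path[i + 1]))
        # Of the two pairs touching the top provider, rule out the one whose
        # other endpoint has the lower degree (assumes both neighbors exist).
        if 0 < j < n - 1:
            left, top, right = path[j - 1], path[j], path[j + 1]
            if edge.get((left, top)) != SIBLING and edge.get((top, right)) != SIBLING:
                if degree[left] > degree[right]:
                    not_peering.add((top, right))
                else:
                    not_peering.add((left, top))

    # Phase 3: remaining pairs with comparable degrees become peers.
    result = dict(edge)
    for path in paths:
        for a, b in zip(path, path[1:]):
            if (a, b) in not_peering or (b, a) in not_peering:
                continue
            if 1 / R < degree[a] / degree[b] < R:
                result[(a, b)] = result[(b, a)] = PEER
    return result
```

The sensitivity to `R` shows up directly: a pair with degree ratio 10/9 is relabeled as peers under `R=60` but keeps its Phase 1 label under `R=1.05`.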

  19. Inference Results • 13,800 AS pairs (2000/3/9) [R = 60, L = 1] • Relationship | # AS pairs | Percentage • P-C / C-P | 12930 | 93.7% • P-P | 713 | 5.7% • S-S | 157 | 1.6%

  20. Verification of Inferred Relationships by AT&T • Comparing inference results from Basic and Final (R = ∞) with AT&T internal information

  21. Issues with AS Degree based Approach • Challenges • Can’t obtain accurate degree for all ASes from a single BGP view • Assumptions may not always hold • (1) top provider has the highest degree, and • (2) highest degree ASes’ peers have higher degree than their customers • Impact of BGP configuration errors • Router configuration typo. (e.g., 7018 3561 7057 7075 7057) • Mis-configuration of small ISPs, e.g., (1239 11 116 701 7018) • Unusual AS relationships • (1239 3561 2856 701 702 1849 9090)

  22. Multiple Vantage Points Approach Main Idea of SARK approach: • exploit the structure of partial views of AS graph as seen from multiple vantage points • assign a rank to each AS for each of the partial views • infer the relationships between neighboring ASes by comparing their vectors of ranks

  23. Computing AS Ranks • Each BGP vantage point has a partial view (sub-graph) of the global AS graph • A tree (or DAG) rooted at the vantage point • In general, “leaves” of trees are likely customer ASes • Combine all views together to rank each AS • map each AS into an N-dimensional vector (ri1, ri2, …, riN) • rij is the rank of AS i from vantage point j • ASes far away from most vantage points (i.e., with smaller ranks) are likely customers of other ASes • A reverse pruning algorithm computes the rank from each vantage point

  24. Reverse Pruning Algorithm • Let X denote the source AS of a particular view of the AS graph • Let P(X) denote the set of AS paths seen from X • Let v(Gx) denote the set of all vertices in Gx, the graph formed from P(X) • Pseudocode: G = Gx; r = 1; while (leaves(G) ≠ ∅) { for all u ∈ leaves(G): rank(u) = r; v’ = v(G) − leaves(G); r = r + 1; G = Gv’ (the subgraph induced by v’) }; for all u ∈ v(G): rank(u) = r
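The pruning loop can be sketched as follows. The slides do not define leaves(G) precisely, so this sketch assumes the view is a directed graph with edges pointing away from the vantage point along each AS path, and that a leaf is a node with no outgoing edge to a node that is still present; the function name `reverse_prune` is likewise illustrative.

```python
def reverse_prune(paths):
    """Rank the ASes of one vantage point's view by repeatedly stripping leaves."""
    succ = {}  # AS -> set of next-hop ASes seen after it on some path
    for path in paths:
        for a in path:
            succ.setdefault(a, set())
        for a, b in zip(path, path[1:]):
            succ[a].add(b)

    rank, r = {}, 1
    remaining = set(succ)
    while True:
        # Assumed leaf definition: no successor left in the current graph.
        leaves = {u for u in remaining if not (succ[u] & remaining)}
        if not leaves:
            break
        for u in leaves:
            rank[u] = r
        remaining -= leaves
        r += 1
    for u in remaining:  # nodes stuck on a cycle share the final rank
        rank[u] = r
    return rank
```

With paths `[[1, 2, 3], [1, 2, 4], [1, 5]]` seen from AS 1, the stub ASes 3, 4 and 5 get rank 1, AS 2 gets rank 2, and the vantage point itself gets rank 3.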

  25. Inferring AS Relationships • N vantage points • l(i, j): number of coordinates k where rik > rjk • e(i, j): number of coordinates k where rik = rjk • For an adjacent AS pair (i, j): • Provider-customer: AS i is a provider of AS j if l(i, j) >= N/2 and l(j, i) = 0 (i “dominates” j) • Peer-to-peer: AS i and AS j are peers if e(i, j) > N/2 (i is “equivalent” to j)
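The dominance and equivalence rules compare two rank vectors coordinate by coordinate. A minimal sketch, assuming each AS is represented by a list of N ranks (one per vantage point); the function name `classify_pair` and the returned label strings are illustrative.

```python
def classify_pair(rank_i, rank_j):
    """Apply the dominance and equivalence rules to two rank vectors."""
    n = len(rank_i)
    l_ij = sum(a > b for a, b in zip(rank_i, rank_j))   # views where i ranks higher
    l_ji = sum(b > a for a, b in zip(rank_i, rank_j))   # views where j ranks higher
    e_ij = sum(a == b for a, b in zip(rank_i, rank_j))  # views where they tie
    if l_ij >= n / 2 and l_ji == 0:
        return "i is provider of j"
    if l_ji >= n / 2 and l_ij == 0:
        return "j is provider of i"
    if e_ij > n / 2:
        return "peer-to-peer"
    return "unknown"
```

The two rules cannot both fire for the same pair, since l(i, j) >= N/2 and e(i, j) > N/2 would together exceed N coordinates.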

  26. “Probabilistic” Version • Probabilistic Dominance: • if l(i, j)/l(j, i) > δ0 for a high value of δ0, then i probably dominates j • AS i is then (more likely) a provider of AS j • Probabilistic Equivalence: • two ASes are probably equivalent if 1/δ1 <= l(i, j)/l(j, i) <= δ1 for a δ1 close to 1 • This rule is used to infer peering relationships between ASes when visibility is poor across the partial views • N = 10, δ = 2 in the experiments • No sibling edges inferred
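The ratio tests can be sketched as below. The threshold names `delta_dom` and `delta_eq` and their default values are placeholders chosen for illustration, and the handling of a zero denominator (treating the ratio as infinite, or as 1 when both counts are equal) is an assumption the slides do not spell out.

```python
import math

def probabilistic_relation(rank_i, rank_j, delta_dom=2.0, delta_eq=1.2):
    """Classify a pair from the ratio l(i, j) / l(j, i) of its rank vectors."""
    l_ij = sum(a > b for a, b in zip(rank_i, rank_j))
    l_ji = sum(b > a for a, b in zip(rank_i, rank_j))
    if l_ij == l_ji:
        ratio = 1.0          # also covers the 0/0 case (assumption)
    elif l_ji == 0:
        ratio = math.inf     # i ranks higher wherever the two differ
    else:
        ratio = l_ij / l_ji
    if ratio > delta_dom:
        return "i is probably provider of j"
    if ratio < 1 / delta_dom:
        return "j is probably provider of i"
    if 1 / delta_eq <= ratio <= delta_eq:
        return "probably peers"
    return "unknown"
```

Ratios that fall between the two thresholds are left unclassified in this sketch.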

  27. “Hardness” of ToR Inference Problem • ToR edge orientation problem is NP-complete! • Original formulation is a minimization problem • ToR-D problem, a decision problem: • Given a graph G, a set of paths P, and an integer k, test if it is possible to give an orientation to some of the edges of G so that the number of invalid paths is at most k • ToR-D-Simple problem: • Given a graph G, a set of paths P, and an integer k, test if it is possible to give an orientation to all edges of G so that the number of invalid paths is at most k • ToR-D admits a solution iff ToR-D-Simple admits one

  28. How Hard is ToR-D-Simple? • We can map ToR-D-Simple to MAX2SAT • MAX2SAT is an NP-complete problem • However, when k=0, ToR-D-Simple is equivalent to 2SAT, which is solvable in linear time • find the strongly connected components of a directed graph G2SAT, and verify that none of them contains both a variable and its negation • Heuristics for solving the ToR-Simple problem • find the maximum subset of paths that can all be made valid by removing paths involved in “cycles” in the directed graph G2SAT

  29. Mapping to MAX2SAT

  30. Solving 2SAT Problem • All clauses can be satisfied (all paths can be made valid) if there is no variable xi belonging with its negation to the same SCC in G2SAT (conflict variable/edge) • SCC (strongly connected component) is a set of mutually reachable nodes in a directed graph • Proper direction of non-conflict edges can be done via topological sorting in G2SAT (if the variable negation is before the variable itself, then the variable is true, and vice versa) • Topological sorting is a natural ordering of nodes in directed acyclic graphs
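For the k = 0 case, the whole pipeline (build clauses, find SCCs, read off an orientation) fits in a short sketch. The encoding here is one plausible reading of the mapping, not necessarily the one in the cited paper: each edge gets one Boolean variable, and every pair of consecutive hops on a path contributes the clause "first hop goes up OR second hop goes down", which forbids a valley. Kosaraju's algorithm is used for the SCCs; the function name `orient_edges` is illustrative.

```python
def orient_edges(paths):
    """Return a set of (customer, provider) pairs making every path valley-free,
    or None if no such orientation of all edges exists."""
    var = {}  # canonical edge (min, max) -> variable index

    def up(u, v):
        """Literal for 'u is a customer of v' (hop u -> v goes uphill)."""
        k = var.setdefault((min(u, v), max(u, v)), len(var))
        return 2 * k if u < v else 2 * k + 1  # literal ^ 1 is its negation

    clauses = []
    for path in paths:
        hops = [up(u, v) for u, v in zip(path, path[1:])]
        clauses += [(h1, h2 ^ 1) for h1, h2 in zip(hops, hops[1:])]

    n = 2 * len(var)
    graph = [[] for _ in range(n)]
    rev = [[] for _ in range(n)]
    for a, b in clauses:  # (a or b) yields not-a -> b and not-b -> a
        for src, dst in ((a ^ 1, b), (b ^ 1, a)):
            graph[src].append(dst)
            rev[dst].append(src)

    # Kosaraju pass 1: iterative DFS recording vertices by completion time.
    order, seen = [], [False] * n
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        stack = [(s, iter(graph[s]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if not seen[w]), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen[nxt] = True
                stack.append((nxt, iter(graph[nxt])))

    # Pass 2: components on the reversed graph, numbered in topological order.
    comp, c = [-1] * n, 0
    for s in reversed(order):
        if comp[s] != -1:
            continue
        comp[s] = c
        stack = [s]
        while stack:
            node = stack.pop()
            for w in rev[node]:
                if comp[w] == -1:
                    comp[w] = c
                    stack.append(w)
        c += 1

    if any(comp[2 * k] == comp[2 * k + 1] for k in range(len(var))):
        return None  # a variable shares an SCC with its negation
    # A variable is true when its negation comes earlier in topological order.
    return {(a, b) if comp[2 * k] > comp[2 * k + 1] else (b, a)
            for (a, b), k in var.items()}
```

When the function returns `None`, at least one path must stay invalid, which is where the MAX2SAT-style heuristics of the previous slides take over.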

  31. Summary • All three approaches have their own advantages and disadvantages • Applying them to the same data can yield very different AS relationships • Example: running Gao’s and SARK’s algorithms on the same dataset (2003/01/09) (www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy) • Results | Gao | SARK | Common • P-C | 29446 | 29320 | 27852 • P-P | 1015 | 1495 | 262 • S-S | 339 | Not considered | N/A • Inferring peering relationships is a hard problem! • There are some further improvements upon these algorithms

  32. Causes of Some Problems and Possible Resolutions • Case 1: some edges can be directed either way without causing invalid paths • Fix: introduce an additional incentive to direct the edge along the node degree gradient • Case 2: trying to infer sibling links leads to a proliferation of errors • Fix: try to discover sibling links using the WHOIS database [Figure: example topologies involving ASes 701, 616, 617, 618 and 8043, with edges labeled “sibling”, “cust-prov” and “?”]

  33. Discussion • No perfect solution yet (will there ever be one?) • A complete global Internet topology is unlikely to be obtainable from BGP routing tables • How can we get more satisfying results from such partial datasets? • BGP misconfiguration complicates inference • Inferring p-p is much more difficult than inferring p-c • Given two interconnected ASes, can we develop some “AS attributes” to distinguish them? • How can we validate AS relationships without internal information?

  34. Main References • L. Gao, “On Inferring Autonomous System Relationships in the Internet,” IEEE/ACM Trans. Networking, Dec. 2001. • L. Subramanian et al., “Characterizing the Internet Hierarchy from Multiple Vantage Points,” INFOCOM, Jun. 2002. • G. Di Battista et al., “Computing the Types of the Relationships between Autonomous Systems,” INFOCOM, Apr. 2003.
