Explore how traceroutes reveal inter-ISP router policies, early/late exits, and more. Can machine learning enhance policy understanding?
Learning ISP Policies from Traceroute Data
Michael Cafarella, Daniel Lowd
December 8, 2004
Rocketfuel (SMWA, ’02) • Maps networks with traceroutes
Path Inflation (SMA, ’03)
• After learning topology, use traces to determine inter-ISP router policies:
  • Early exit
  • Late exit
  • Idiosyncratic engineered routes
  • Others?
• Path Inflation (PI) compares true distance against measured time in ISPs
• Can we use ML to describe policies in greater detail?
Data
• Each trace creates a record of the machines encountered and the RTTs to reach them
• Data is filthy

    Probe: 1:ner-routes.bbnplanet.net:ha
    Network: 216.138.90.0/24
    Target: 216.138.90.94
    Reasons: updown
    Time: Thu Dec 27 20:17:23 2001
    TargetAS: 3356
    [AS1:ner-routes.bbnplanet.net:ha]
    1 e0-1-5.burlma1-ops1.bbnplanet.net (4.2.34.162) 0 msec 4 msec 0 msec
    2 s7-0-4.bstnma1-cr7.bbnplanet.net (4.0.3.181) 0 msec 0 msec 4 msec
    3 so-3-1-0.bstnma1-nbr1.bbnplanet.net (4.24.4.225) 4 msec 4 msec 0 msec
    …
Data Cleaning
• Unknown DNS
• Missing hops
• ms/msec
• Inconsistent spacing
• Missing RTTs
• Inconsistent formatting
• Etc, etc, etc

A complete trace:

    [AS1:ner-routes.bbnplanet.net:ha]
    1 e0-1-5.burlma1-ops1.bbnplanet.net (4.2.34.162) 4 msec 0 msec 4 msec
    2 s7-0-4.bstnma1-cr7.bbnplanet.net (4.0.3.181) 4 msec 4 msec 4 msec
    3 so-3-1-0.bstnma1-nbr1.bbnplanet.net (4.24.4.225) 4 msec 0 msec 4 msec
    4 so-7-0-0.bstnma1-nbr2.bbnplanet.net (4.24.10.218) 4 msec 4 msec 4 msec
    5 p9-0.nycmny1-nbr2.bbnplanet.net (4.24.6.50) 8 msec 12 msec 12 msec
    6 p1-0.nycmny1-cr10.bbnplanet.net (4.24.8.170) 8 msec 8 msec 8 msec
    7 pos4-10.core1.NewYork1.Level3.net (209.244.160.141) [AS 3356] 12 msec 12 msec 8 msec
    8 ae0-53.mp1.NewYork1.Level3.net (64.159.17.65) [AS 3356] 12 msec 8 msec 12 msec
    ...
    11 gige6-0.ipcolo2.SanFrancisco1.Level3.net (209.244.14.46) [AS 3356] 88 msec 84 msec 84 msec
    12 unknown.Level3.net (209.246.20.6) [AS 3356] 88 msec 88 msec 88 msec
    13 64.63.16.3 [AS 3356] 88 msec 84 msec *
    [END]

The same trace with the defects listed above:

    [AS1:ner-routes.bbnplanet.net:ha]
    1 e0-1-5.burlma1-ops1.bbnplanet.net 4.2.34.162 0 ms 4 ms
    2 s7-0-4.bstnma1-cr7.bbnplanet.net 4.0.3.181 4 ms 4 ms
    3 (4.24.4.225)so-3-1-0.bstnma1-nbr1.bbnplanet.net 4 ms
    4 ***
    5 ***
    6 p1-0.nycmny1-cr10.bbnplanet.net [4.24.8.170] 8 ms
    7 pos4-10.core1.NewYork1.Level3.net (209.244.160.141) [AS 3356] 12 ms 12 ms 8 ms
    8 64.159.17.65 12 ms
    ...
    11 gige6-0.ipcolo2.SanFrancisco1.Level3.net (209.244.14.46) [AS 3356] 88 ms 84 ms 84 ms
    12 unknown.Level3.net (209.246.20.6) [AS 3356] 88 ms 88 ms
    13 64.63.16.3 [AS 3356] 88 ms 84 ms *
    [END]
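The defects listed on this slide can be handled with a tolerant hop parser. The following is a minimal Python sketch, not the authors' cleaning code: the `Hop` record, the `parse_hop` name, and the regular expressions are all assumptions made for illustration. It accepts `ms` or `msec`, any bracketing around the IP address, missing DNS names, missing RTT samples, and unanswered hops.

```python
import re
from typing import List, NamedTuple, Optional


class Hop(NamedTuple):
    """One cleaned traceroute hop (hypothetical record layout)."""
    index: int
    host: Optional[str]   # None when the DNS name is unknown
    ip: Optional[str]     # None when the hop did not answer
    rtts_ms: List[int]    # may hold fewer than three samples


IP_RE = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")
# A number followed by "ms" or "msec"; the lookbehind keeps digits that
# sit inside a hostname or an IP address from being read as RTTs.
RTT_RE = re.compile(r"(?<![\w.\-])(\d+)\s*(?:msec|ms)\b")
AS_RE = re.compile(r"\[AS\s*\d+\]")


def parse_hop(line: str) -> Optional[Hop]:
    """Parse one hop line; return None if it does not start with a hop number."""
    line = " ".join(line.split())            # collapse inconsistent spacing
    m = re.match(r"(\d+)\s*(.*)", line)
    if not m:
        return None
    index, rest = int(m.group(1)), m.group(2)
    rest = AS_RE.sub(" ", rest)              # drop the "[AS 3356]" annotation
    if rest.strip("* ") == "":               # "4 ***": the hop never answered
        return Hop(index, None, None, [])
    rtts = [int(x) for x in RTT_RE.findall(rest)]
    rest = RTT_RE.sub(" ", rest)
    ip_match = IP_RE.search(rest)
    ip = ip_match.group(0) if ip_match else None
    if ip:
        rest = rest.replace(ip, " ")
    # Whatever is left once brackets and timeout stars are gone is the name.
    host = re.sub(r"[()\[\]*]", " ", rest).strip() or None
    return Hop(index, host, ip, rtts)
```

For example, `parse_hop("3 (4.24.4.225)so-3-1-0.bstnma1-nbr1.bbnplanet.net 4 ms")` separates the address from the name even though the two are fused and out of order, and `parse_hop("4 ***")` yields a hop with no address and no RTTs.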
Methodology
• Split data (at random) into sets A and B
• Find the 10 most popular ISP pairs. For each pair:
  • Use set A to build a topology model, with RTTs on links
  • Use set B to generate info on each border-router decision
  • Analyze decisions to deduce ISP policy

Record format for each decision:

    <numBorderRouters> <routerChosen>
    [rtr-1] <shortestPath(src, rtr-1)> <shortestPath(rtr-1, dst)>
    [rtr-2] <shortestPath(src, rtr-2)> <shortestPath(rtr-2, dst)>
    ...
    [rtr-numBorderRouters:] ...
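The pipeline above can be sketched in Python. This is an illustration under stated assumptions rather than the authors' implementation: a trace is taken to be a list of `(router, cumulative_rtt_ms)` pairs, a link's RTT is estimated as the smallest observed difference between consecutive hops, links are treated as symmetric, and all function names are hypothetical.

```python
import heapq
import random
from collections import defaultdict


def split_traces(traces, seed=0):
    """Randomly split traces into two halves, set A and set B."""
    shuffled = list(traces)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def build_topology(traces):
    """Build an undirected graph whose edge weights are estimated link RTTs."""
    graph = defaultdict(dict)
    for trace in traces:
        for (a, rtt_a), (b, rtt_b) in zip(trace, trace[1:]):
            cost = max(rtt_b - rtt_a, 0.0)   # noisy RTTs can go backwards
            best = min(cost, graph[a].get(b, float("inf")))
            graph[a][b] = graph[b][a] = best
    return graph


def shortest_path(graph, src, dst):
    """Dijkstra's algorithm; returns the total RTT, or inf if unreachable."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            return d
        if d > dist.get(node, float("inf")):
            continue                          # stale heap entry
        for nxt, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return float("inf")


def decision_record(graph, src, dst, border_routers, chosen):
    """One border-router decision, mirroring the record format on the slide."""
    rows = [(r, shortest_path(graph, src, r), shortest_path(graph, r, dst))
            for r in border_routers]
    return {"numBorderRouters": len(border_routers),
            "routerChosen": chosen,
            "routers": rows}
```

The topology would come from `build_topology(set_a)`, and each trace in set B that crosses the ISP pair would yield one `decision_record`, with `chosen` being the border router the trace actually used.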
Modeling policies • Want to learn functions from data that approximate each peering policy. • Good: most likely router. • Examples: perceptron, neural net, SVM, decision trees, rule sets, nearest neighbor • Better: probability distribution over all routers. • Examples: naïve Bayes, Bayesian network, maxent/logistic regression, MRF, CRF, kernel methods
Logistic regression • Probabilistic • Continuous response (no discretization) • Discriminatively trained • But: only 2 classes (in this form)
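For reference, the standard two-class form of logistic regression is given below. The slide's own formula did not survive extraction, so the symbols are assumptions: $x_1, \dots, x_n$ are the input features and $\lambda_0, \dots, \lambda_n$ the learned weights.

```latex
P(y = 1 \mid x) = \frac{1}{1 + \exp\!\left(-\left(\lambda_0 + \sum_{i=1}^{n} \lambda_i x_i\right)\right)},
\qquad
P(y = 0 \mid x) = 1 - P(y = 1 \mid x)
```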
Generalizing logit • Each border router has a probability: • Normalize over all border routers:
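The two formulas on this slide were lost in extraction. One reading consistent with the next slide (two weights, one per latency) is the usual multiclass generalization sketched here. The notation is an assumption: $\ell(a, b)$ is the shortest-path latency from $a$ to $b$, $s$ and $d$ are the source and destination, and $B$ is the set of border routers between the two ISPs.

```latex
% Unnormalized score of border router r
\tilde{P}(r \mid s, d) = \exp\!\bigl(\lambda_1\, \ell(s, r) + \lambda_2\, \ell(r, d)\bigr)

% Normalized over all border routers
P(r \mid s, d) =
  \frac{\exp\!\bigl(\lambda_1\, \ell(s, r) + \lambda_2\, \ell(r, d)\bigr)}
       {\sum_{r' \in B} \exp\!\bigl(\lambda_1\, \ell(s, r') + \lambda_2\, \ell(r', d)\bigr)}
```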
Deciphering lambdas
• Weights determine the relative cost of source→router latency vs. router→destination latency.
• The router with the largest weighted sum of latencies is always most likely.
• Scale of weights determines how skewed this distribution is.
• Adding a fixed value to all RTTs has no effect.
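These properties can be checked numerically. The Python sketch below assumes the model is a softmax over a weighted sum of the two latencies; `router_probs` and its parameter names are hypothetical, and the latencies are made-up values.

```python
import math


def router_probs(features, lam_src, lam_dst):
    """Probability of each border router under the assumed softmax model.

    features: one (src_to_router_ms, router_to_dst_ms) pair per border router.
    """
    scores = [lam_src * a + lam_dst * b for a, b in features]
    top = max(scores)                       # subtract the max for stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three hypothetical border routers: near the source, midway, near the destination.
features = [(4.0, 80.0), (40.0, 40.0), (80.0, 4.0)]

sharp = router_probs(features, -1.0, 0.0)    # large-magnitude weights
soft = router_probs(features, -0.01, 0.0)    # same direction, smaller scale

# Adding the same constant to every latency shifts all scores equally,
# so the normalized probabilities are unchanged.
shifted = router_probs([(a + 10.0, b + 10.0) for a, b in features], -1.0, 0.0)
```

Here `sharp` puts nearly all its mass on the first router, `soft` ranks the routers the same way but spreads the mass far more evenly, and `shifted` matches `sharp` up to rounding.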
Special cases • Random: • Early-exit: • Late-exit: • “Optimal”-exit: These will act as baselines in our experiments.
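The formula for each special case is missing from the extracted slide. One plausible reading, offered as an assumption and not as the authors' definition, is that each baseline is a fixed weight setting of the same model: zero weights for random, a large negative weight on source→router latency for early-exit, on router→destination latency for late-exit, and on both equally for "optimal"-exit. The Python sketch below uses hypothetical names and a finite `BIG` as a stand-in for "very large".

```python
import math

BIG = 1000.0  # assumed stand-in for an arbitrarily large weight magnitude

# (weight on source->router latency, weight on router->destination latency)
BASELINES = {
    "random": (0.0, 0.0),
    "early-exit": (-BIG, 0.0),
    "late-exit": (0.0, -BIG),
    "optimal-exit": (-BIG, -BIG),
}


def router_probs(features, lam_src, lam_dst):
    """Softmax over weighted latencies, one probability per border router."""
    scores = [lam_src * a + lam_dst * b for a, b in features]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def expected_accuracy(decisions, lam_src, lam_dst):
    """Mean probability assigned to the router that was actually chosen.

    decisions: list of (features, chosen_index) pairs.
    """
    hits = [router_probs(f, lam_src, lam_dst)[chosen] for f, chosen in decisions]
    return sum(hits) / len(hits)


# One made-up decision in which the trace left at the router nearest the source.
decisions = [([(4.0, 80.0), (40.0, 40.0), (80.0, 4.0)], 0)]
results = {name: expected_accuracy(decisions, *w) for name, w in BASELINES.items()}
```

On this single made-up decision, early-exit scores close to 1, random scores one third, and late-exit and optimal-exit score close to 0, since they favor the third and second router respectively.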
Case study: AT&T → Above.net • Our accuracy: 98.55% • Next best: 75.57% (opt-exit) • Learned weights: -6,000 and -4,000
Case study: Ebone → Level3 • Our accuracy: 44.37% • Next best: 32.07% (late-exit) • Learned weights: +750 and -1,400
Conclusion • Learning peering policies is hard • Each ISP pair can have a different policy • Policies may be complex, or arbitrary • Simple weight-based models solve some of this problem • More flexible than early/late-exit • Offer insight into routing tradeoffs • Don’t always work
Future work • Additional features: • Geographical distance • Bandwidth • Markov logic networks (MLNs; Richardson & Domingos 2004): • Represent every router in the graph • Learn local and global policies at once • Can learn engineered routes as well