
Internet Tomography and Geography

Explore the field of Internet Tomography and Geography, including related work, main research papers, and how WEBMAPPER can help analyze and visualize the geographic location of clients and servers.

bbarker


Presentation Transcript


  1. Internet Tomography and Geography • What is this area all about? • Related work in the area • Main Paper • WEBMAPPER’s features • How WEBMAPPER works • WEBMAPPER results summarized

  2. Internet Tomography / Geography • Where, geographically, are my clients (or servers) located? • What’s the best server for me to get content from? • Given two IPs, what’s the latency between them? • Which IPs come from the same geographic location? • Generally, show me a complete and accurate map of the Internet

  3. Related Work • B. Krishnamurthy and J. Wang, “On network-aware clustering of web clients,” in Proceedings of SIGCOMM ’00, August 2000 • Problem: Which IPs belong together? • Does BGP-table based clustering • Groups IPs by common administrative control • Passive • S. Jamin, C. Jin, Y. Jin, D. Raz, Y. Shavitt, and L. Zhang, “On the placement of Internet instrumentation,” in Proceedings of INFOCOM ’00, March 2000, pp. 295–304. • Problem: What’s the latency between 2 IPs? • Clusters the Internet based on BGP address prefixes • Places “Tracers” throughout the Internet that actively ping each other

  4. Related work • An Investigation of Geographic Mapping Techniques for Internet Hosts • Padmanabhan, Subramanian, SIGCOMM 2001 • Problem: geo-locate from IP • 3 solutions • GeoTrack: infer location by DNS / traceroute • GeoPing: probe target from known locations, triangulate [active] • GeoCluster: do passive BGP, IP-prefix based clustering

  5. Related work • Predicting Internet Network Distance with Coordinate-Based Approaches • Ng, Zhang, CMU, INFOCOM 2002 • Problem: Given 2 IPs, what’s the latency between them? • Active network of landmarks, pinging each other • Better math model (GNP) for interpolating distance than IDMaps • Proprietary Solutions (Akamai, etc.) • Problem: What’s the closest server to an IP?

  6. Clustering and Server Selection Using Passive Monitoring • M. Andrews, B. Shepherd, A. Srinivasan, P. Winkler, F. Zane (Bell Labs), INFOCOM 2002

  7. Problem: Server Selection • Given a client, and a set of servers, all with identical content, tell me the “best” server • For the client, “best” means lowest latency or highest throughput • For the whole system, it may mean something else

  8. What is Passive Monitoring? • Content Servers don’t ping each other • Content Servers don’t ping clients • No pinging is done: no additional network traffic is introduced • Instead, servers record the Round-Trip Time of TCP handshake from a client
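
The passive measurement on this slide can be sketched as follows: the server notes when it sends its SYN-ACK and when the client's final ACK arrives, and the difference is one RTT sample. This is only an illustration. The event tuples, the `synack_sent` / `ack_received` labels, and the function name `handshake_rtts` are assumptions made for the example, not WEBMAPPER's actual logging format.

```python
def handshake_rtts(events):
    """Derive per-client RTT samples from server-side handshake events.

    events: iterable of (timestamp_ms, client, kind), where kind is the
    hypothetical label "synack_sent" or "ack_received".
    Returns a list of (client, rtt_ms) in the order handshakes complete.
    """
    pending = {}  # client -> time the SYN-ACK left the server
    rtts = []
    for ts_ms, client, kind in events:
        if kind == "synack_sent":
            pending[client] = ts_ms
        elif kind == "ack_received" and client in pending:
            # One round trip: SYN-ACK out, ACK back. No probe traffic added.
            rtts.append((client, ts_ms - pending.pop(client)))
    return rtts


events = [
    (0, "192.0.2.10:5000", "synack_sent"),
    (5, "198.51.100.7:6000", "synack_sent"),
    (40, "192.0.2.10:5000", "ack_received"),
    (95, "198.51.100.7:6000", "ack_received"),
]
samples = handshake_rtts(events)
```

Every sample comes from a connection the client opened anyway, which is why the approach adds no network traffic.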

  9. WEBMAPPER

  10. WEBMAPPER: Clusterer • Clusterer uses the client-server latency pairs reported by the servers • Determines which address prefixes correspond to the same network location, and which don’t • Does more than just find the closest server to each cluster; assigns probabilities to each to balance the network flow

  11. WEBMAPPER Output

  12. WEBMAPPER: Big Tree • Giant binary tree of all IP addresses • Well, not all… assume the last 8 bits are always clustered together • Root of tree: 0.0.0.0/0 • Children: 0.0.0.0/1, 128.0.0.0/1 • Leaves are /24 prefixes, e.g. 123.123.123.123 falls in leaf 123.123.123.0/24
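
To make the tree shape concrete, here is a minimal sketch that lists the prefixes on the path from the root down to the /24 leaf containing a given address. It uses Python's standard `ipaddress` module; the function name `prefix_path` and the `leaf_bits` parameter are made up for this illustration.

```python
import ipaddress


def prefix_path(ip, leaf_bits=24):
    """Return the prefixes from the root (/0) to the /leaf_bits leaf holding ip."""
    path = []
    for length in range(leaf_bits + 1):
        # strict=False masks off the host bits, giving the enclosing prefix.
        network = ipaddress.ip_network(f"{ip}/{length}", strict=False)
        path.append(str(network))
    return path


path = prefix_path("123.123.123.123")
print(path[0], path[1], path[-1])
```

Each step down the tree fixes one more leading bit of the address, so an address with a first octet below 128 goes to the 0.0.0.0/1 child and everything else goes to 128.0.0.0/1.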

  13. WEBMAPPER: Big Tree • Leaves also store • Sum of recorded distances (per server) • Squared sum of recorded distances • Number of recorded distances • Leaf data is periodically aged exponentially (I = I * 0.9)
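
The three running totals are enough to recover a mean and a variance without keeping individual samples. The sketch below shows one way a leaf's per-server record might look. The class name `LeafStats`, the choice to age all three totals by the same factor, and the use of the population variance are assumptions for illustration; the slides only give the aging rule I = I * 0.9.

```python
from dataclasses import dataclass


@dataclass
class LeafStats:
    total: float = 0.0     # sum of recorded distances
    total_sq: float = 0.0  # sum of squared recorded distances
    count: float = 0.0     # number of recorded distances (fractional after aging)

    def record(self, distance):
        self.total += distance
        self.total_sq += distance * distance
        self.count += 1

    def age(self, factor=0.9):
        # Assumed: every total is scaled alike, so old samples fade
        # while the mean of the surviving weight is unchanged.
        self.total *= factor
        self.total_sq *= factor
        self.count *= factor

    def mean(self):
        return self.total / self.count

    def variance(self):
        # Population variance E[x^2] - E[x]^2, clamped against rounding error.
        return max(0.0, self.total_sq / self.count - self.mean() ** 2)


stats = LeafStats()
for rtt in (10.0, 20.0, 30.0):
    stats.record(rtt)
stats.age()
stats.record(60.0)  # a fresh sample now outweighs each aged one
```

After aging, the three old samples together carry a weight of 2.7 rather than 3, so the new 60 ms sample pulls the mean further than it would in a plain average.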

  14. WEBMAPPER: Small Tree • Clusters are formed from big tree by folding children into their parents when they’re “similar” to their siblings • Uses statistical test to determine “similar” (two-sample t-test) • Threshold based
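
The "similar" test can be sketched from summary statistics alone. The slides do not say which form of the two-sample t-test is used, so this example assumes the unequal-variance (Welch) form. The threshold of 2.0 and the names `t_statistic` and `similar` are illustrative choices, not values from the paper.

```python
import math


def t_statistic(n1, mean1, var1, n2, mean2, var2):
    """Two-sample t statistic (Welch form) from counts, means, and variances."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    if standard_error == 0:
        return 0.0 if mean1 == mean2 else math.inf
    return (mean1 - mean2) / standard_error


def similar(a, b, threshold=2.0):
    """a and b are (count, mean, variance) tuples for two sibling subtrees."""
    return abs(t_statistic(*a, *b)) < threshold


left = (50, 40.0, 25.0)   # 50 samples, mean 40 ms
right = (60, 41.0, 30.0)  # 60 samples, mean 41 ms
far = (60, 90.0, 30.0)    # 60 samples, mean 90 ms

print(similar(left, right))  # siblings close together: candidates for folding
print(similar(left, far))    # clearly different latency: keep separate
```

Raising the threshold folds more aggressively and yields fewer, larger clusters; lowering it keeps more prefixes separate.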

  15. Assigning Clusters to Servers • Assigning servers to clusters is complicated • 1. Testing Index • Futzes with the probability of a cluster being assigned to a server • Based on multi-armed bandit solution

  16. Assigning Clusters to Servers

  17. Assigning Clusters to Servers • Calculate server capacity • Each cluster assignment has a latency cost • Try to minimize the total cost without breaking any server’s capacity • Graph theory to the rescue: min-cost flow
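
The min-cost flow formulation can be sketched on a toy instance: source → cluster edges carry each cluster's load, cluster → server edges cost the measured latency per unit of load, and server → sink edges enforce capacity. The solver below is a plain successive-shortest-paths routine with Bellman-Ford, chosen for brevity and not necessarily the algorithm used in the paper. The function name, the integer loads, and the example numbers are all assumptions.

```python
import math


def min_cost_assignment(demand, capacity, latency):
    """demand: {cluster: load}, capacity: {server: max load},
    latency: {(cluster, server): cost per unit of load}.
    Returns ({(cluster, server): load sent}, total cost)."""
    nodes = ["source", "sink"]
    nodes += [("cluster", c) for c in demand] + [("server", s) for s in capacity]
    idx = {node: i for i, node in enumerate(nodes)}
    n = len(nodes)
    graph = [[] for _ in range(n)]  # edge: [to, residual cap, cost, reverse index]

    def add_edge(u, v, cap, cost):
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])

    for c, load in demand.items():
        add_edge(idx["source"], idx[("cluster", c)], load, 0)
    for s, cap in capacity.items():
        add_edge(idx[("server", s)], idx["sink"], cap, 0)
    pair_edges = {}
    for (c, s), cost in latency.items():
        u = idx[("cluster", c)]
        pair_edges[(c, s)] = (u, len(graph[u]))
        add_edge(u, idx[("server", s)], demand[c], cost)

    src, dst = idx["source"], idx["sink"]
    total_cost = 0
    while True:
        # Bellman-Ford: cheapest augmenting path in the residual graph.
        dist = [math.inf] * n
        prev = [None] * n
        dist[src] = 0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == math.inf:
                    continue
                for ei, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v]:
                        dist[v] = dist[u] + cost
                        prev[v] = (u, ei)
                        changed = True
            if not changed:
                break
        if dist[dst] == math.inf:
            break  # no more load can be routed
        push, v = math.inf, dst
        while v != src:  # find the bottleneck along the path
            u, ei = prev[v]
            push = min(push, graph[u][ei][1])
            v = u
        v = dst
        while v != src:  # apply the flow and update reverse edges
            u, ei = prev[v]
            graph[u][ei][1] -= push
            graph[v][graph[u][ei][3]][1] += push
            v = u
        total_cost += push * dist[dst]

    flows = {}
    for (c, s), (u, ei) in pair_edges.items():
        sent = demand[c] - graph[u][ei][1]
        if sent > 0:
            flows[(c, s)] = sent
    return flows, total_cost


flows, cost = min_cost_assignment(
    demand={"A": 3, "B": 2},
    capacity={"east": 3, "west": 3},
    latency={("A", "east"): 10, ("A", "west"): 50,
             ("B", "east"): 20, ("B", "west"): 30},
)
```

Both clusters are closer to the east server, but it can only take 3 units of load. The flow gives east to cluster A, which would suffer most from being moved, and sends B west. If total demand exceeds total capacity, this sketch simply leaves the excess unassigned.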

  18. Results • Experiment 1: • 28 day log of busy web traffic • Recorded client IP, time, RTT • Clustering produced 17,270 clusters • Here are 12 example clusters for high traffic days

  19. Results • Experiment 2: • Set up a west-coast and an east-coast server • Force clients to download something from each (to get actual measurements) • Have clients download something from the “either-or” server (WEBMAPPER powered)

  20. Opinions • Statistics and weighting formulae are cool • What’s a good way to tell if the clustering is any good aside from eyeing samples? • A two-server test is all we can get out of Bell Labs?
