
Internet Iso-bar: A Scalable Overlay Distance Monitoring System


Presentation Transcript


  1. Internet Iso-bar: A Scalable Overlay Distance Monitoring System Yan Chen, Lili Qiu, Chris Overton and Randy H. Katz

  2. Motivations
     Applications of end-to-end (E2E) distance monitoring/estimation
     • Overlay routing/location
     • Peer-to-peer systems
     • VPN management/provisioning
     • Service redirection/placement
     • Cache-infrastructure configuration
     Requirements for an E2E distance monitoring system
     • Scalable: a small amount of probing traffic and system load
     • Accurate: captures congestion/failures in addition to estimating latency
     • Fast: little computation, so estimates are available in real time
     • Incrementally deployable
     • Easy to use
     Benefits to applications
     • Application-driven measurement
     • Inference techniques for troubleshooting and root-cause analysis
     • Improved application performance and reliability

  3. E2E Estimation/Monitoring Systems Comparison

  4. E2E Estimation/Monitoring Systems Comparison

  5. E2E Estimation/Monitoring Systems Comparison

  6. E2E Estimation/Monitoring Systems Comparison

  7. Problem Formulation
     • Given N end hosts, how can we select a subset of them as monitors and build a scalable overlay distance monitoring service without knowing the underlying topology?
     • Distance information desired: report congestion/failure if it occurs, otherwise report latency

  8. E2E Congestion/Failures Analysis
     • Based on the National Laboratory for Applied Network Research (NLANR) AMP data set
     • 104 sites in the US (including Alaska and Hawaii) and Australia; every host pings all other hosts every minute
     • Sliding window of 10 samples; the minimum RTT in the window is used as the latency sample
     • 105M measurements, 6/25/01 – 7/1/01
     • Congestion/failures (uniformly denoted as congestion) are defined as measurement “loss” or (latency > geo mean × geo stdev)
     • Congestion is not common: only 0.96% of samples
     • A few congested links dominate the E2E congestion
     • Apart from congestion at the last mile, E2E congestion exhibits strong spatial correlation
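The congestion definition on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a lost probe is represented as `None`, a window counts as lost only when every probe in it was lost, and the geometric standard deviation uses the population form. The slide does not specify any of these details.

```python
import math

def geo_mean_and_stdev(samples):
    """Geometric mean and geometric standard deviation of positive latencies."""
    logs = [math.log(s) for s in samples]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)  # population variance (assumption)
    return math.exp(mu), math.exp(math.sqrt(var))

def windowed_min(rtts, window=10):
    """Minimum RTT over each sliding window; None marks a lost probe."""
    out = []
    for k in range(window, len(rtts) + 1):
        ok = [r for r in rtts[k - window:k] if r is not None]
        out.append(min(ok) if ok else None)  # every probe lost -> treat as loss (assumption)
    return out

def is_congested(latency, g_mean, g_stdev):
    """Congestion = measurement loss, or latency above geo mean x geo stdev."""
    return latency is None or latency > g_mean * g_stdev
```

With a geometric mean of 10 ms and a geometric standard deviation of 1.5, the threshold is 15 ms, so a 16 ms sample would be flagged and a 12 ms sample would not.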

  9. NLANR AMP Sites

  10. Internet Iso-bar
     Procedures
     • Cluster hosts that perceive similar performance to a small set of sites (landmarks)
     • For each cluster, select a monitor for active and continuous probing
     • Estimate distance between any pair of hosts using inter- and intra-cluster distance

  11. Internet Iso-bar (I): Host Clustering
     • Define a correlation distance between each pair of hosts
     • Existing work uses network proximity: cor_dist(i,j) = net_dist(i,j) (denoted p_ij)
     • Iso-bar uses the network distance vector (k landmarks, for clustering only): netV_i = [p_i1, p_i2, …, p_ik]^T
     • Euclidean distance based
     • Cosine vector similarity based
     • Apply generic clustering methods
     • Optimize the worst case: minimize the maximum radius of all clusters (limit_num_minRmax)
     • Optimize the average case: minimize the sum of total host-monitor distance (limit_num_minDistSum)
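The slide's two formulas did not survive in the transcript, so the sketch below uses the textbook forms of Euclidean distance and cosine dissimilarity between two network distance vectors; these may differ from the authors' exact definitions. The clustering step is likewise a stand-in: a greedy farthest-point heuristic for the "minimize the maximum radius" goal, not the limit_num_minRmax algorithm itself. All function names are hypothetical.

```python
import math

def euclidean_cor_dist(v_i, v_j):
    """Euclidean distance between two network distance vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v_i, v_j)))

def cosine_cor_dist(v_i, v_j):
    """1 - cosine similarity; 0 when the vectors point the same way."""
    dot = sum(a * b for a, b in zip(v_i, v_j))
    norm = math.sqrt(sum(a * a for a in v_i)) * math.sqrt(sum(b * b for b in v_j))
    return 1.0 - dot / norm

def farthest_point_clusters(vectors, k, dist):
    """Greedy k-center heuristic (assumed stand-in for the worst-case objective).

    Returns the indices of the chosen centers and, for every host,
    the index of the center it is assigned to.
    """
    centers = [0]  # start from an arbitrary host
    while len(centers) < k:
        # pick the host farthest from its nearest existing center
        nxt = max(range(len(vectors)),
                  key=lambda h: min(dist(vectors[h], vectors[c]) for c in centers))
        centers.append(nxt)
    assign = [min(centers, key=lambda c: dist(vectors[h], vectors[c]))
              for h in range(len(vectors))]
    return centers, assign
```

For example, four hosts with landmark latency vectors `[10, 50]`, `[12, 52]`, `[80, 20]`, `[82, 22]` and `k = 2` split into two clusters of two hosts each under the Euclidean distance.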

  12. Diagram of Internet Iso-bar (figure: clusters A, B, and C; legend: landmark, end host)

  13. Diagram of Internet Iso-bar (figure: clusters A, B, and C; distance probes from each monitor to its hosts and among monitors; legend: landmark, monitor, end host)

  14. Internet Iso-bar (II): Distance Estimation
     Intra-cluster estimation (hosts i and j share monitor m)
     • If path(m, i) or path(m, j) is congested, report path(i, j) as congested
     • Otherwise pDist(i,j) = (mDist(m, i) + mDist(m, j)) / 2
     Inter-cluster estimation (host i has monitor m_i, host j has monitor m_j)
     • If path(m_i, i), path(m_i, m_j) or path(m_j, j) is congested, report path(i, j) as congested
     • Otherwise pDist(i,j) = mDist(m_i, m_j)
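The two estimation rules translate directly into code. The sketch below is illustrative only: the names `estimate_distance`, `monitor_of`, and `measured` are hypothetical, congestion is represented as `None`, measurements are treated as symmetric, and a missing measurement is treated the same as a congested path. None of those conventions come from the slides.

```python
def estimate_distance(i, j, monitor_of, measured):
    """Predict the latency between hosts i and j, or None for congestion.

    monitor_of: host -> monitor of its cluster.
    measured:   (a, b) -> measured latency, or None if that path is congested.
    """
    def m_dist(a, b):
        if a == b:
            return 0.0  # a monitor is at distance 0 from itself
        # symmetric lookup; a missing entry counts as congested (assumption)
        return measured.get((a, b), measured.get((b, a)))

    m_i, m_j = monitor_of[i], monitor_of[j]
    if m_i == m_j:
        # intra-cluster: average of the two monitor-to-host distances
        legs = [m_dist(m_i, i), m_dist(m_i, j)]
        if any(d is None for d in legs):
            return None
        return sum(legs) / 2
    # inter-cluster: use the monitor-to-monitor distance
    legs = [m_dist(m_i, i), m_dist(m_i, m_j), m_dist(m_j, j)]
    if any(d is None for d in legs):
        return None
    return legs[1]
```

Only monitor-to-host and monitor-to-monitor paths are ever probed, which is where the reduction in probing traffic relative to all-pairs measurement comes from.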

  15. Evaluation Methodology
     Internet measurement data
     • NLANR AMP data set
       • Clustering with geometric mean of training date
       • Estimation dates: 6/25/01 – 7/24/01, 12/06/01
     • Keynote CDN measurement data
       • 63 agents covering all major ISPs in the US, Europe, Asia and Australia
       • 2 targets (CDN re-directors) in Boston and Texas
       • Measure TCP connection time (2/3 of handshake) from each agent to each target every minute
       • Training date: 10/21/2002
       • Estimation dates: 10/21/2002 – 11/25/2002
     • Latency estimation results are similar for both data sets; the NLANR results are presented

  16. Evaluation Methodology (II)
     Estimation metrics
     • Relative accuracy error for un-congested latency
     • Stability
     • For dynamic monitoring systems, the amount of congestion captured and the false positive ratio
     Internet distance estimation techniques evaluated
     • Omniscient: use g-mean data of (source, dest) on the training date
     • Global Network Positioning (GNP)
     • Clustering with network distance vector (Iso-bar)
     • Clustering with network proximity
     • 15 clusters vs. 15 landmarks of GNP

  17. Latency Prediction Accuracy & Stability • Training date: 06/25/01 • Estimation dates: 06/25/01 - 12/06/01 • Summary of the 90th percentile relative error for various distance estimation methods

  18. Distance Estimation Results
     Latency estimation when un-congested
     • Omniscient is the most accurate, but unscalable
     • GNP and Iso-bar are second best
       • Both have good accuracy and stability for distance estimation
       • GNP is a static approach and unscalable for online monitoring
     • Iso-bar outperforms proximity-based clustering by 50%
     • 90th-percentile relative error < 0.5; e.g., for a 60 ms latency, 45 ms < prediction < 90 ms
     Congestion/failures estimation
     • 6/25/01 – 7/01/01: on average 148K congested measurements per day
     • Iso-bar captures 78% of them, with a 32% false positive ratio
     • Only 3% of the monitoring overhead of RON

  19. Conclusions
     • Propose Internet Iso-bar
       • Cluster hosts based on network similarity
       • Inter- and intra-cluster latency estimation with a first-step heuristic for congestion/failure detection
     • Preliminary results are promising
       • High accuracy and stability for normal latency estimation
       • Simple congestion-estimation heuristics capture 78% of congestion, with a 32% false positive ratio and only 3% of the monitoring overhead of RON

  20. Ongoing Work
     • Current focus is switching from latency estimation to congestion/failures estimation
       • Apply topology information, e.g. lossy link detection with network tomography
       • Cluster and choose monitors based on the lossy links
     • Benefit applications
       • Dynamic node join/leave for P2P systems
         • A joining client pings the landmark sites to get its distance vector, compares it with those of the monitors, and joins the closest one
         • Split/merge clusters
       • Multi-path selection
     • More comprehensive evaluation
       • Simulate with a large network
       • Deploy on PlanetLab, and operate at a finer level
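The node-join step described on this slide could look roughly like the following. It is a sketch under assumptions: "closest" is taken to mean the smallest Euclidean distance between landmark distance vectors, and `choose_monitor` and its arguments are hypothetical names. The slide does not say which correlation distance the comparison would use.

```python
import math

def choose_monitor(client_vec, monitor_vecs):
    """Return the monitor whose landmark distance vector is closest to the client's.

    client_vec:   latencies from the joining client to the k landmarks.
    monitor_vecs: monitor id -> that monitor's latencies to the same k landmarks.
    """
    # Euclidean comparison is an assumption; math.dist requires Python 3.8+
    return min(monitor_vecs, key=lambda m: math.dist(client_vec, monitor_vecs[m]))
```

A client measuring `[11, 49]` to two landmarks would join a monitor with vector `[10, 50]` rather than one with `[80, 20]`.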

  21. Internet Iso-bar
     Problem formulation
     • Given N end hosts, how can we select a subset of them as monitors and build a scalable overlay distance monitoring service without knowing the underlying topology?
     • Distance information desired: report congestion/failure if it occurs, otherwise report latency
     Our approach
     • Cluster hosts that perceive similar performance to a small set of sites (landmarks)
     • For each cluster, select a monitor for active and continuous probing
     • Estimate distance between any pair of hosts using inter- and intra-cluster distance
     Performance evaluation
     • Using real Internet measurement data
     • Compared with other distance estimation services: GNP, RON
     • Performance metrics: accuracy and stability

  22. Internet Iso-bar (II): Distance Estimation
     Congestion/failures analysis
     • Congestion/failures (uniformly denoted as congestion) are not common
       • Defined as measurement “loss” or (latency > geo mean × geo stdev)
       • Only 0.96% of 105M NLANR ping measurements over a week
     • This suggests that a few congested links dominate the E2E congestion
     • Apart from congestion at the last mile, E2E congestion exhibits strong spatial correlation
     Estimation algorithms
     • Intra-cluster estimation (i and j use the same monitor m)
       • If path(m, i) or path(m, j) is congested, report path(i, j) as congested
       • Otherwise predictedDist(i,j) = (measuredDist(m, i) + measuredDist(m, j)) / 2
     • Inter-cluster distance estimation
       • If path(monitor_i, i), path(monitor_i, monitor_j) or path(monitor_j, j) is congested, report path(i, j) as congested
       • Otherwise predictedDist(i,j) = measuredDist(monitor_i, monitor_j)
     • Self-diagnostics of monitors: check for last-mile congestion
