The Internet’s Physical Topology (or, Will the Internet ever measure itself?) ‏



Scott Kirkpatrick, School of Engineering, Hebrew University of Jerusalem EVERGROW and OneLab2 Collaborators (thanks, not blame…) ‏ Yuval Shavitt, Eran Shir, Udi Weinsberg, Shai Carmi, Shlomo Havlin, Avishalom Shalit, Daqing Li.

Scott Kirkpatrick,

School of Engineering, Hebrew University of Jerusalem

EVERGROW and OneLab2

Collaborators (thanks, not blame…)‏

Yuval Shavitt, Eran Shir, Udi Weinsberg,

Shai Carmi, Shlomo Havlin, Avishalom Shalit, Daqing Li

The Internet’s Physical Topology (or, Will the Internet ever measure itself?)
The Internet is our most distributed work of Engineering

Federated initially from military and commercial networks, some of which involved highly proprietary and gratuitously different platforms.

Arpanet, DECnet, PC-based systems, IBM’s SNA, BitNet, Euronet…

As a result, there are two distinct layers: BGP and above (inter-AS), and intra-AS (OSPF shortest path, MPLS, ATM, …)‏

BGP information is exchanged by sharing recommended routes, exposing only those for which an AS will be properly compensated.

Engineering the Internet has always been distributed, using a formal model (IETF RFCs, etc.) similar to international standards formation, yet with less commercial involvement than typical ISO practice, and a US center of gravity for the deliberations.

Layering of communications protocols has permitted a high degree of refinement, but now seems to be in stasis.

There are no global databases, many local databases, poor data quality.

Measuring and monitoring the Internet

Has undergone a revolution

Traceroute – an old hack → a basic tool in wide use

Active monitors – hardware intensive → distributed software

DIMES (“[email protected]”) an example, not the only one now

Many enhancements under consideration, as the problems in traceroute become very evident

Ultimately, we expect every router (or what they become in the future internet) will participate in distributed active monitoring.

The payoff comes with interactive and distributed services that can achieve greater performance at greatly decreased overhead

History of TraceRoute active measurement

Jacobson, “traceroute” from LBL, February 1989

And this is something that can be rewritten for special situations, such as cellphones

Single machine traces to many destinations – Lucent, 1990s (Burch and Cheswick)‏

Great pictures, but interpretation not clear, demonstrate need for more analytic visualization techniques

But excellent for magazine covers, t-shirts…

First attempt to determine the time evolution of the Internet

First experience in operating under the “network radar”

History of Internet Measurement, ctd.

Skitter and subsequent projects at CAIDA (SDSC)‏

15-50 machines (typically <25), at academic sites around the world

RIPE and NLANR, 1-200 machines, commercial networks and telco backbones, information is proprietary

DIMES (>10,000 software agents) represents the next step

Current statistics:

8298 users

19,597 agents registered (in 115 countries)‏

Have seen

29,404 ASes and 204,204 AS-AS links

6.6 B measurements saved since 9/2004

DIMES data available for general use
  • Monthly data files currently available from 1/2007
  • Weekly data files available by web request from 9/2004
  • Data sets
    • AS nodes
    • AS edges
    • Routers
    • City Edges
    • POPs (tested, but not released yet)
DIMES documentation
  • File entries are explained; otherwise, caveat emptor.
Traceroute is more than a piece of string

A flood of feigned suicide packets (with TTL values t=1 to about 30 hops), each sent more than one time.

Ideal situation: each packet dies at step t, and the router returns an ICMP Time Exceeded message, “so sorry, your packet died at IP address I, time T.”

Non ideal situations must be filtered to avoid data corruption:

Errors – router inserts destination address for I

Non-response is common

Multiple interfaces for a single (complex) router

Route flaps, load balancing create false links

Route instabilities can be reduced with careful header management (requires guessing router tricks)‏

Resulting links must be resolved – to ASes, to routers, to POPs
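
The filtering described on this slide can be sketched in a few lines of Python. This is only an illustration, not the DIMES pipeline: the function names `clean_hops` and `links_from_trace` are hypothetical, a trace is assumed to be a list whose entry t-1 holds the address that answered the probe with TTL t (or `None` for no response), and the only error handled is a router reporting the destination address too early. Route flaps, load balancing and multiple interfaces are not addressed here.

```python
from typing import Optional

def clean_hops(hops: list[Optional[str]], destination: str) -> list[Optional[str]]:
    """Blank out hops where the destination address shows up before the path ends.

    Assumption: if the destination address appears at hop i but a different
    address answers at a later hop, hop i is an error, not a real arrival.
    """
    cleaned = list(hops)
    for i, addr in enumerate(hops):
        later_other = any(h is not None and h != destination for h in hops[i + 1:])
        if addr == destination and later_other:
            cleaned[i] = None
    return cleaned

def links_from_trace(hops: list[Optional[str]], destination: str) -> list[tuple[str, str]]:
    """Keep a link only between two consecutive hops that both responded."""
    cleaned = clean_hops(hops, destination)
    links = []
    for a, b in zip(cleaned, cleaned[1:]):
        # A silent hop (None) breaks the chain, so no link is drawn across it.
        if a is not None and b is not None and a != b:
            links.append((a, b))
    return links

# Hypothetical trace: hop 3 is silent, hop 4 wrongly reports the destination.
trace = ["10.0.0.1", "10.0.1.1", None, "192.0.2.9", "10.0.4.1", "192.0.2.9"]
links = links_from_trace(trace, destination="192.0.2.9")
```

Dropping a link whenever either end is missing loses some real links, but it avoids inventing an edge between routers that were never observed as neighbours.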

Models of the Internet are highly contentious
  • Practitioner preferences – start with points in 2D
    • Impose physical constraints of actual routers
      • Finite number of connections
      • Low connectivity/high bandwidth (core)
      • High connectivity/low bandwidth (edge)
    • Introduce randomness through distance-dependent probability of interconnection
    • Reorganize net locally in ways thought to reflect engineering practice
    • Ignore existence of extended entities (large ASes)
  • Results in strong resistance to scale-free models
Use a new analytical tool – k-pruning

Prune by grouping sites into “shells” of common connectivity, working further into the Internet at each step: all sites with connectivity 1 are removed (recursively) and placed in the “1-shell,” leaving a “2-core”; removing the 2-shell then leaves a 3-core, and so forth.

The union of shells 1 through k is called the “k-crust”

At some point, kmax, pruning runs to completion.

Identify nucleus as kmax-core

This is a natural, robust definition, and should apply to other large networks of interest in economics and biology.

Cluster analysis finds interesting structure in the k-crusts
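
As a concrete illustration of the pruning just described, here is a minimal Python sketch, assuming the graph is given as an undirected edge list. The names `build_adjacency` and `k_shells` and the toy graph are hypothetical. One small departure from the slide: nodes with no links at all are put in a 0-shell, following the usual core-number convention.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def k_shells(adjacency):
    """Return {node: shell index} by recursive pruning."""
    remaining = {n: set(nbrs) for n, nbrs in adjacency.items()}  # working copy
    shell = {}
    k = 0
    while remaining:
        while True:
            # Removing nodes lowers their neighbours' connectivity, so repeat
            # until no remaining node has connectivity k or less.
            doomed = [n for n, nbrs in remaining.items() if len(nbrs) <= k]
            if not doomed:
                break
            for n in doomed:
                shell[n] = k
                for m in remaining.pop(n):
                    if m in remaining:
                        remaining[m].discard(n)
        k += 1  # what is left at this point is the (k+1)-core
    return shell

# Toy graph: a 4-clique (a, b, c, d), e hanging off a and b, f hanging off e.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("e", "a"), ("e", "b"), ("f", "e")]
shell = k_shells(build_adjacency(edges))
kmax = max(shell.values())
nucleus = {n for n, k in shell.items() if k == kmax}    # the kmax-core
crust_2 = {n for n, k in shell.items() if k <= 2}       # the 2-crust
```

In the toy graph f falls in the 1-shell, e in the 2-shell, and the clique survives as the nucleus with kmax = 3.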

K-crusts show percolation threshold

These are the hanging tentacles of our (Red Sea) Jellyfish

For subsequent analysis, we distinguish three components:

Core, Connected, Isolated

Largest cluster in each shell

Data from 01.04.2005

Meduza (מדוזה) model

This picture has been stable from January 2005 (kmax = 30) to present day, with little change in the nucleus composition. The precise definition of the tendrils: those sites and clusters isolated from the largest cluster in all the crusts – they connect only through the core.
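
The three-way split can be sketched as follows, assuming the shell index of every node is already known (here it is written out by hand for a toy graph). The function name `medusa_components` is hypothetical, and the sketch takes the tendrils to be everything outside the nucleus that is not in the largest cluster of the (kmax-1)-crust, which is one reading of the definition above.

```python
from collections import deque

def medusa_components(adjacency, shell):
    """Split nodes into (nucleus, connected, isolated)."""
    kmax = max(shell.values())
    nucleus = {n for n, k in shell.items() if k == kmax}
    crust = set(adjacency) - nucleus          # the (kmax-1)-crust
    seen, clusters = set(), []
    for start in crust:
        if start in seen:
            continue
        # Breadth-first search that never steps into the nucleus.
        cluster, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for nbr in adjacency[node]:
                if nbr in crust and nbr not in seen:
                    seen.add(nbr)
                    cluster.add(nbr)
                    queue.append(nbr)
        clusters.append(cluster)
    connected = max(clusters, key=len) if clusters else set()
    isolated = crust - connected              # reach the rest only via the nucleus
    return nucleus, connected, isolated

# Toy graph: 4-clique nucleus; chain e-f-g attached at a and b; h hangs off c.
adjacency = {
    "a": {"b", "c", "d", "e"}, "b": {"a", "c", "d", "g"},
    "c": {"a", "b", "d", "h"}, "d": {"a", "b", "c"},
    "e": {"a", "f"}, "f": {"e", "g"}, "g": {"f", "b"}, "h": {"c"},
}
shell = {"a": 3, "b": 3, "c": 3, "d": 3, "e": 2, "f": 2, "g": 2, "h": 1}
nucleus, connected, isolated = medusa_components(adjacency, shell)
```

Here e, f and g form the connected component, while h is a tendril: its only link leads into the nucleus.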

What about the error bars, the bias, etc.?

Need to address the specifics of the “network discoveries”

How frequently observed?

How sensitive are the observations to the number of observers?

How do the measurements depend on the time of observation?

The extensive literature on the subject consists mostly of straw-man counterexamples, which show that bias from this class of observation can be serious in graphs of known structure, but do not address how to estimate structure from actual measurements.

Filtering the masses of data

Current efforts (me, Weinsberg, Carmi) are studying how the Meduza model and other observations are affected by removal of the less-reliable data:

Infrequently seen links

Less than three days presence in a week

Some things seen only once

Stuff seen by rogue agents

Is it intentional? Probably not.

So far all the basic observations are proving robust.
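
One of the filters listed above, dropping links present on fewer than three days in a week, might look like this in Python. It is a sketch under assumptions: the observation format (a link paired with the date it was seen) and the name `reliable_links` are hypothetical, not the DIMES file format.

```python
from collections import defaultdict
from datetime import date

def reliable_links(observations, min_days=3):
    """Keep links seen on at least `min_days` distinct days.

    `observations` is an iterable of (link, day) pairs for one week;
    repeated sightings on the same day count once.
    """
    days_seen = defaultdict(set)
    for link, day in observations:
        days_seen[link].add(day)
    return {link for link, days in days_seen.items() if len(days) >= min_days}

# Hypothetical week of sightings for three AS-AS links.
observations = [
    (("AS1", "AS2"), date(2008, 3, 3)),
    (("AS1", "AS2"), date(2008, 3, 4)),
    (("AS1", "AS2"), date(2008, 3, 6)),
    (("AS2", "AS3"), date(2008, 3, 3)),
    (("AS2", "AS3"), date(2008, 3, 3)),   # same day twice: still one day
    (("AS3", "AS4"), date(2008, 3, 5)),   # seen only once
]
kept = reliable_links(observations)
```

Only the AS1-AS2 link survives. Rerunning the k-pruning analysis on the kept links is then a way to check whether the observed structure depends on the infrequently seen ones.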

How does the city data differ from the AS-graph information?
  • Cities are local, ASes may be highly extended (ATT, Level 3, Global Xing, Google)
  • About 4000 cities identified, cf. 25,000 ASes
  • But similar features are seen
    • Wide spread of small-k shells
    • Distinct nucleus with high path redundancy
    • Many central sites participate with nucleus
    • A less strong Medusa structure
Is BGP routing wasting capacity?

Peer-connected component (PCC) is capable of long-range communications as well as local

We've used “betweenness” to test alternate routings which ignore “Tier One” links.

Betweenness is essentially a traffic model.

Each node in a set sends one packet to each other node in the set. (Example: all 1-shell and 2-shell nodes.)

Compare maximum betweenness with and without the nucleus ASes.
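
The traffic model above can be sketched in Python on a toy graph. This is a simplification rather than the authors' computation: each packet follows a single breadth-first shortest path instead of being split over all shortest paths, and the names `shortest_path` and `max_load` and the nine-node ring with one hub are hypothetical. The resulting numbers say nothing about the real Internet.

```python
from collections import deque
from itertools import permutations

def shortest_path(adjacency, src, dst, excluded=frozenset()):
    """One shortest path from src to dst that avoids `excluded` nodes, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in sorted(adjacency[node]):   # sorted for a repeatable tie-break
            if nbr not in parent and nbr not in excluded:
                parent[nbr] = node
                queue.append(nbr)
    return None

def max_load(adjacency, senders, excluded=frozenset()):
    """Each sender sends one packet to every other sender.

    Returns (largest number of packets relayed by any one node,
    number of packets that found no route).
    """
    load, undelivered = {}, 0
    for src, dst in permutations(senders, 2):
        path = shortest_path(adjacency, src, dst, excluded)
        if path is None:
            undelivered += 1
            continue
        for node in path[1:-1]:               # intermediate nodes only
            load[node] = load.get(node, 0) + 1
    return max(load.values(), default=0), undelivered

# Toy graph: nine nodes in a peering ring, all also linked to one hub "n".
ring = "abcdefghi"
adjacency = {x: set() for x in ring + "n"}
for i, x in enumerate(ring):
    y = ring[(i + 1) % len(ring)]
    adjacency[x].add(y); adjacency[y].add(x)
    adjacency[x].add("n"); adjacency["n"].add(x)

senders = ["a", "d", "g"]
with_hub = max_load(adjacency, senders)
without_hub = max_load(adjacency, senders, excluded={"n"})
```

With the hub available, all six packets pass through it; with the hub excluded, they still arrive over the ring, on longer paths and with at most two packets on any one node.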

Conclusions – will the Internet use this information?
  • Undisclosed transverse capacity in peering links can provide a global backup or reserve
  • Unless business relationships evolve, it is not adequate to also carry long-distance traffic
  • One hop (in AS-graph) of extra disclosure will probably suffice to make this viable for regional traffic.