Optimization based approaches to understanding and modeling internet topology


Optimization-Based Approaches to Understanding and Modeling Internet Topology

David Alderson

California Institute of Technology

INFORMS Telecom Conference 2004

March 8, 2004


Acknowledgments

  • This talk represents joint work with John Doyle (Caltech), Lun Li (Caltech), and Walter Willinger (AT&T—Research)

  • Many others have contributed to this story

    • Reiko Tanaka (Caltech)

    • Steven Low (Caltech)

    • Ramesh Govindan (USC)

    • Neil Spring (U. Washington)

    • Stanislav Shalunov (Abilene)

    • Heather Sherman (CENIC)

    • John Dundas (Caltech)


Today’s Agenda

  • Review recent work of empiricists and theoreticians to understand the router-level topology of the Internet

  • Understand the causes and implications of “heavy tails” in the complex structure of the Internet

  • Illustrate how recently popular “scale-free” models of Internet topology are not just wrong, but wildly so

  • Describe the importance of optimization in the development of explanatory models of Internet topology

  • Present the HOT framework as an alternative means to understanding the “robust, yet fragile” structure of the Internet and other complex engineering systems

  • Highlight some open research problems and areas where contributions can be made by the OR/MS community


First Question:

Why should we care about modeling the topology of the Internet?


Understanding the topology of the current (and future) Internet is important for many reasons

  • Design and evaluation of networking protocols

    • Topology affects performance, not correctness

  • Understanding large-scale network behavior

    • Closed-loop feedback: topology design vs. protocols

      • Is the current design a result of the dominant routing protocols?

      • Or are the presently used routing protocols the result of some prevailing network design principles?

    • Ability to study what-if scenarios

      • Operating policy shifts

      • Economic changes in the ISP market

  • Implications for tomorrow’s networks

    • Provisioning requirements, Traffic engineering, Operating and management policies


Performance Evaluation via Simulation

[Diagram: network topology info, traffic demand info (application-specific), and protocol info feed a network simulator (e.g. ns-2), which produces performance measures.]

  • (Traditional) topology generators provide only connectivity information, not bandwidths and delays!


Performance Evaluation via Simulation

[Diagram: the same simulation pipeline, but the topology input is now an annotated network graph: connectivity plus bandwidths and delays.]


The Internet as an Infrastructure

As the Internet has grown in capability and importance, we have become increasingly dependent on it.

  • Directly: communication (email, instant messenger, VOIP), information and entertainment, e-commerce

  • Indirectly: business, education, government have (permanently) replaced physical/manual methods with electronic processes, many of which rely on the Internet.

    The central importance, open architecture, and evolving technology landscape make the Internet an attractive target for asymmetric attack.


The Internet as a Case Study

The Internet is a great starting point for the study of other highly engineered complex network systems:

  • To the user, it creates the illusion of a simple, robust, homogeneous resource enabling endless varieties and types of technologies, physical infrastructures, virtual networks, and applications (heterogeneous).

  • Its complexity is starting to approach that of simple biological systems

  • Our understanding of the underlying technology together with the ability to perform detailed measurements means that most conjectures about its large-scale properties can be unambiguously resolved, though often not without substantial effort.


Question Two:

Why is research on Internet topology interesting/difficult?


A Challenging Problem

Since the decommissioning of the NSFNet in 1995, it has been difficult to obtain comprehensive knowledge about the topology of the Internet

  • The network has grown dramatically (number of hosts, amount of traffic, number of ISPs, etc.)

  • There have been economic incentives for ISPs to maintain secrecy about their topologies

  • Direct inspection usually not allowed


Qwest US Fiber Map – June 2001

[Map of Qwest's US fiber network, with links to Europe, Japan, and Hawaii. Source: www.qwest.com]


Sprint Backbone

[Map of the Sprint backbone network.]


Measuring Topology

  • Marketing documents not very helpful

  • The task of “discovering” the network has been left to experimentalists

    • Must develop sophisticated methods to infer this topology from appropriate network measurements.

    • Many possible measurements that can be made.


Router-Level Topology

  • Nodes are machines (routers or hosts) running IP protocol

  • Measurements taken from traceroute experiments that infer topology from traffic sent over network

  • Subject to sampling errors and bias

  • Requires careful interpretation

[Diagram: router-level graph, with routers in the interior and hosts at the edge.]


AS Topology

[Diagram: four autonomous systems (AS1-AS4) connected by peering links.]

  • Nodes are entire networks (ASes)

  • Links = peering relationships between ASes

  • Relationships inferred from Border Gateway Protocol (BGP) information

  • Really a measure of business relationships, not network structure


Measuring Topology

  • Marketing documents not very helpful

  • The task of “discovering” the network has been left to experimentalists

    • Must develop sophisticated methods to infer this topology from appropriate network measurements.

    • Many possible measurements that can be made.

    • Each type of measurement has its own strengths, weaknesses, and idiosyncrasies, and results in a distinct view of the network topology.

  • Hard to know what “matters”…


“Standard” Approach

  • Choose a sequence of well-understood metrics or observed features of interest, such as

    • hierarchy

    • node-degree distributions

    • clustering coefficients

  • Develop a method that matches these metrics

Pros:

  • Always possible to obtain a good “fit” on a chosen metric.

Cons:

  • Hard to choose the “right” metric. What is “right” is apt to vary, depending on the intended use of the topology.

  • A method that does a good job of matching the chosen metric often does not fit other metrics well.

  • No predictive power.

We call this approach descriptive (evocative) modeling.


An Alternative Approach

  • Identify the causal forces at work in the design and evolution of real topologies.

  • Develop methods that generate and evolve topologies in a manner consistent with these forces.

Pros:

  • Ability to generate (and design!) topologies at different levels of hierarchy

  • More realistic topology

  • Greater predictive power

  • Possibly reveal some relationship with routing protocols

Cons:

  • Difficult to identify the causal forces

  • Requires careful development and diligent validation

We call this approach explanatory modeling.


Our Approach: Focus on the ISP

  • Capture and represent realistic drivers of Internet deployment and operation at the level of the single ISP

  • Many important networking issues are relevant at the level of the ISP (e.g. configuration, management, pricing, provisioning)

  • Common topologies represented in terms of ISPs

    • Router-level graphs → connectivity within the ISP

    • AS graphs → connectivity between ISPs

  • First-Order Objective: the ability to generate a “realistic, but fictitious” ISP topology at different levels of hierarchy


ISP Driving Forces

  • Economic Factors:

    • Cost of procuring, installing, and maintaining the necessary facilities and equipment

    • Limited budget for capital expenditures

    • Need to balance expenditures with revenue streams

    • Need to leverage investment in existing infrastructure

    • Location of customers

  • Technological Factors:

    • Hardware constraints (e.g. router speeds, limited # interfaces or line cards per router)

    • Layer 2 technologies (SONET, ATM, WDM)

    • Existing legacy infrastructure

    • Location and availability of dark fiber


Mathematical Framework

  • Use combinatorial optimization to represent the problem and its constraints

    • Objectives (min cost, max profitability, satisfy demand)

    • Constraints (equipment costs/capacities, legacy infrastructure)

    • Parameters (pricing, provisioning, facility location)

  • Study and explore how this framework allows for a range of ISP behavior

    • The effects of objectives and constraints are the most important

  • Part of a more general framework

    • Highly Optimized Tolerance (HOT), Carlson and Doyle, 1999
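The talk stays at the formulation level, but the flavor of such a combinatorial model can be sketched with a toy version: attach access routers to gateways at minimum link cost, subject to a per-gateway port (degree) limit. The cost matrix, port limit, and greedy heuristic below are illustrative assumptions, not part of the talk.

```python
def assign_access_routers(costs, max_ports):
    """Greedy heuristic for a toy ISP design problem.

    costs[i][j] = cost of linking access router i to gateway j.
    Each access router takes its cheapest gateway that still has a free port.
    """
    n_gateways = len(costs[0])
    load = [0] * n_gateways          # ports used per gateway (constraint)
    assignment = []
    for row in costs:
        options = sorted(range(n_gateways), key=lambda j: row[j])
        for j in options:
            if load[j] < max_ports:  # technology constraint: port limit
                load[j] += 1
                assignment.append(j)
                break
        else:
            raise ValueError("no feasible assignment under port limits")
    return assignment

costs = [
    [1, 4],
    [2, 3],
    [5, 1],
    [4, 2],
]
print(assign_access_routers(costs, max_ports=2))  # [0, 0, 1, 1]
```

Note how the port constraint forces the third and fourth routers onto gateway 1 even though a bigger budget of ports on gateway 0 might look cheaper for some of them; objectives and constraints jointly shape the topology.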


Question Three:

How is research on Internet topology different from what OR/MS researchers are used to doing on other complex engineering networks?


A Different Type of Problem

  • Researchers in network optimization are used to problems in which the objectives and constraints are well-defined

  • Here, we are trying to uncover the “most significant” drivers of topology evolution so that we can create “fictitious, yet realistic” network counterparts, and also so that we can design and build improved networks

  • We will need to iterate between modeling, measurement, and analysis to get it right

  • In this sense, this is a bit more like biology

  • Recent progress has given hope that a comprehensive theory for the Internet may be possible


Heterogeneity of the Internet

  • Full of “high variability”

    • Link bandwidth: Kbps – Gbps

    • File sizes: a few bytes – Mega/Gigabytes

    • Flows: a few packets – 100,000+ packets

    • In/out-degree (Web graph): 1 – 100,000+

    • Delay: Milliseconds – seconds and beyond

  • Diversity in the technologies that comprise the physical and link layers

  • Diversity in the applications and services that are supported

  • This heterogeneity has evolved organically from an architecture that was designed to be robust to changes (failures or innovation), and it is here to stay


The Internet hourglass

[The hourglass: many applications (Web, FTP, Mail, News, Video, Audio, ping, Kazaa) run over a few transport protocols (TCP, SCTP, UDP, ICMP), which all run over IP; IP in turn runs over many link technologies (Ethernet, 802.11, power lines, ATM, optical, satellite, Bluetooth).]


The Internet hourglass

[The same hourglass, narrowed at the transport layer: the applications run over TCP, TCP over IP, and IP over the many link technologies.]


The Internet hourglass

[The hourglass's waist: “everything on IP” above (applications and transport), “IP on everything” below (the many link technologies).]



A Theory for the Internet?

  • Vertical decomposition of the protocol stack allows for the treatment of layers in isolation (a separation theorem)

  • Assume that layers not considered perform in a near-optimal manner

  • Use an engineering design-based perspective in two ways:

    • Analysis: explain the complex structure that is observed

    • Synthesis: suggest changes or improvements to the current design

[Diagram of the protocol stack: Applications, TCP/AQM, IP, Link.]



A Theory for the Internet?

How to design an application that “performs well” in meeting user demands subject to the resources/constraints made available by TCP/IP?

[Diagram of the protocol stack: Applications, TCP/AQM, IP, Link.]


A Theory for the Internet?

How to design a network that “performs well” and satisfies traffic demands subject to the physical resources/constraints?

[Diagram of the protocol stack: Applications, TCP/AQM, IP, Link.]


A Theory for the Internet?

If TCP/AQM is the answer, what is the question?

[Diagram: what optimization problem gives rise to TCP/AQM, and what gives rise to IP?]

Primal/dual model of TCP/AQM congestion control…


A Theory for the Internet?

[Diagram: the protocol stack with a question mark below IP, at the physical topology.]

If the current topology of the Internet is the answer, what is the question?


Next Question:

What’s been done to try and understand the large-scale structure of the Internet?

How should we think about the Internet’s router-level topology?


Trends in Topology Modeling

  • Approach: Random graph models (Waxman, 1988) generate connectivity-only topologies.

  • Observation: Long-range links are expensive, and real networks are not random but have obvious hierarchy (router-level). Approach: Structural models (GT-ITM, Calvert/Zegura, 1996) generate connectivity-only topologies with inherent hierarchy.

  • Observation: Router-level and AS graphs exhibit heavy-tailed distributions (power laws) in characteristics such as node degree. Approach: Degree-based models (including popular “scale-free” models) generate connectivity-only topologies with inherent power laws in the node-degree distribution.


Power Laws and Internet Topology

Observed scaling in node degree and other statistics:

  • Autonomous System (AS) graph

  • Router-level graph

How to account for high variability in node degree?

[Rank-degree plots: a few nodes have lots of connections; most nodes have few connections. Source: Faloutsos et al. (1999)]


Power Laws in Topology Modeling

  • Recent emphasis has been on whether or not a given topology model/generator can reproduce the same types of macroscopic statistics, especially power law-type degree distributions

  • Lots of degree-based models have been proposed

    • All of them are based on random graphs, usually with some form of preferential attachment

    • All of them are connectivity-only models and tend to ignore engineering-specific system details

  • Examples: BRITE, INET, Barabasi-Albert, GLP, PLRG, CMU-generator
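For concreteness, the generic preferential-attachment process underlying many of these generators can be sketched in a few lines; the seed-core construction and parameters below are illustrative simplifications, not the recipe of any specific generator named above.

```python
import random

def preferential_attachment_degrees(n, m, seed=0):
    """Degree sequence from a Barabasi-Albert-style process: each new node
    attaches m links, picking existing targets with probability proportional
    to their current degree (via a node-repeated 'targets' list)."""
    rng = random.Random(seed)
    degrees = [m] * m                              # tiny seed core, degree m each
    targets = [i for i in range(m) for _ in range(m)]
    for new in range(m, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(targets))        # degree-proportional pick
        degrees.append(m)
        for t in chosen:
            degrees[t] += 1
            targets.append(t)                      # t now one unit more likely
        targets.extend([new] * m)
    return degrees

degs = preferential_attachment_degrees(200, 2)
degs_sorted = sorted(degs, reverse=True)           # rank-ordered, for a rank plot
```

Plotting `degs_sorted` on log-log axes gives the roughly straight rank-degree line these models are built to reproduce; as the slide notes, this is a connectivity-only construction with no engineering detail.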


Models of Internet Topology

  • These topology models are merely descriptive

    • Measure some feature of interest (connectivity)

    • Develop a model that replicates that feature

    • Make claims about the similarity between the real system and the model

    • A type of “curve fitting”?

  • Unfortunately, by focusing exclusively on the node-degree distribution, these models get the story wrong

  • We seek something that is explanatory

    • Consistent with the drivers of topology design and deployment

    • Consistent with the engineering-related details

    • Can be verified through the measurement of appropriate system-specific details


Our Perspective

  • Must consider the explicit design of the Internet

    • Protocol layers on top of a physical infrastructure

    • Physical infrastructure constrained by technological and economic limitations

    • Emphasis on network performance

    • Critical role of feedback at all levels

  • Consider the ability to match large scale statistics (e.g. power laws) as secondary evidence of having accounted for key factors affecting design


Trends in Topology Modeling

  • Approach: Random graph models (Waxman, 1988) generate connectivity-only topologies.

  • Observation: Long-range links are expensive, and real networks are not random but have obvious hierarchy (router-level). Approach: Structural models (GT-ITM, Calvert/Zegura, 1996) generate connectivity-only topologies with inherent hierarchy.

  • Observation: Router-level and AS graphs exhibit heavy-tailed distributions (power laws) in characteristics such as node degree. Approach: Degree-based models (including popular “scale-free” models) generate connectivity-only topologies with inherent power laws in node degree.

  • Observation: Physical networks have hard technological (and economic) constraints. Approach: Optimization-driven models generate annotated topologies consistent with the design tradeoffs of network engineers.


HOT: Highly Optimized Tolerance (also read as Heavily Optimized Tradeoffs, or Heuristically Organized Tradeoffs)

  • Based on ideas of Carlson and Doyle (1999)

  • The complex structure (including power laws) of highly engineered technological (and biological) systems is viewed as the natural by-product of tradeoffs between system-specific objectives and constraints

  • Non-generic, highly engineered configurations are extremely unlikely to occur by chance

  • The result is “robust, yet fragile” system behavior


Heuristic Network Design

What factors dominate network design?

  • Economic constraints

    • User demands

    • Link costs

    • Equipment costs

  • Technology constraints

    • Router capacity

    • Link capacity


Internet End-User Bandwidths

[Rank plot of connection speed (Mbps) vs. number of users: a few users have very high-speed connections (high-performance computing on POS/Ethernet at 1-10 Gbps; academic and corporate users on Ethernet at 10-100 Mbps), while most users have low-speed connections (residential and small-business broadband cable/DSL at ~500 Kbps; dial-up at ~56 Kbps).]

How to build a network that satisfies these end-user demands?


Economic Constraints

  • Network operators have a limited budget to construct and maintain their networks

  • Links are tremendously expensive

  • Tremendous drive to operate network so that traffic shares the same links

    • Enabling technology: multiplexing

    • Resulting feature: traffic aggregation at edges

    • Diversity of technologies at network edge (Ethernet, DSL, broadband cable, wireless) is evidence of the drive to provide connectivity and aggregation using many media types


Heuristically Optimal Network

[Diagram: hosts attach to high-degree nodes at the edges, which aggregate into a mesh-like core of fast, low-degree routers.]

Heuristically Optimal Network

Claim: economic considerations alone suggest a structure having

  • Mesh-like core of high-speed, low degree routers

  • High degree, low-speed nodes at the edge

  • Is this consistent with technology capability?

  • Is this consistent with real network design?


Cisco 12000 Series Routers

  • Modular in design, creating flexibility in configuration.

  • Router capacity is constrained by the number and speed of line cards inserted in each slot.

Source: www.cisco.com


Cisco 12000 Series Routers

Technology constrains the number and capacity of line cards that can be installed, creating a feasible region.
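A crude sketch of this feasible region: assume (for illustration only) a router with 160 Gbps of total switching capacity and 10 Gbps line cards, loosely matching the 160 Gb / 10 Gb figures that appear later in the talk. Then per-port bandwidth is card-limited at low degree and capacity-limited at high degree.

```python
def per_port_bandwidth(degree, switch_capacity_mbps=160_000,
                       fastest_card_mbps=10_000):
    """Per-port bandwidth achievable at a given degree: at low degree the
    fastest line card is the binding constraint; at high degree the total
    switching capacity must be shared across all ports."""
    if degree < 1:
        raise ValueError("degree must be at least 1")
    return min(fastest_card_mbps, switch_capacity_mbps / degree)

# Low-degree routers can run every port at full card speed; high-degree
# routers must trade connectivity for per-port bandwidth.
print(per_port_bandwidth(4))    # 10000 (card-limited)
print(per_port_bandwidth(256))  # 625.0 (capacity-limited)
```

Sweeping `degree` from 1 upward traces the bandwidth-degree frontier: the "feasible region" is everything on or below that curve.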


Cisco 12000 Series Routers

Pricing info: State of Washington Master Contract, June 2002
(http://techmall.dis.wa.gov/master_contracts/intranet/routers_switches.asp)

[Chart of configuration prices: $128,500; $212,400; $381,500; $560,500; $602,500; $932,400; $1,667,500; $2,762,500.]


[Log-log chart of bandwidth vs. degree: per-port bandwidths from 155 Mbps to 160 Gbps, degrees from 1 to 256. The technically feasible region lies below a frontier that moves outward with technological advance.]

Technologically Feasible Region

[Log-log chart of bandwidth (Mbps) vs. degree for real equipment: core backbone routers (Cisco 12416, 12410, 12406, 12404) at high bandwidth and low degree; high-end gateways (Cisco 7500, 7200); edge equipment (Cisco 3600/3700, 2600, Linksys 4-port router); and shared-media aggregation devices for LAN, DSL, cable, wireless, and dial-up, such as the uBR7246 CMTS (cable), Cisco 6260 DSLAM (DSL), and Cisco AS5850 (dial-up), supporting very high degree at low per-port bandwidth. Older/cheaper technology sits lower on the chart.]

Abilene Backbone Physical Connectivity (as of December 16, 2003)

[Map of the Abilene backbone: a sparse core of routers at Seattle, Sunnyvale, Los Angeles, Denver, Kansas City, Houston, Indianapolis, Chicago, Atlanta, Washington D.C., and New York, with attached regional GigaPoPs, universities, and peer networks (e.g. CENIC, ESnet, GEANT, TransPAC/APAN, SURFNet, WIDE). Link speeds range from OC-3 (155 Mb/s), OC-12 (622 Mb/s), and GE (1 Gb/s) toward the edge up to OC-48 (2.5 Gb/s) and OC-192/10GE (10 Gb/s).]

CENIC Backbone (as of January 2004)

The Corporation for Education Network Initiatives in California (CENIC) runs the educational backbone for the State of California.

Backbone topologies for both Abilene and CENIC are built as a mesh of high-speed, low-degree routers. As one moves from the core out toward the edge, connectivity gets higher and speeds get lower.

[Map of the CENIC backbone: PoPs (COR, OAK, SAC, FRG, SVL, FRE, SOL, BAK, SLO, LAX, TUS, SDG) built from Cisco 750X, 12008, and 12410 routers and linked at speeds from OC-3 (155 Mb/s) to 10GE (10 Gb/s), with connections to Abilene at Los Angeles and Sunnyvale.]


CENIC Backbone for Southern California

[Map detail: backbone nodes BAK, SLO, LAX, TUS, and SDG (with links to Sunnyvale, Fremont, Soledad, and Sacramento) serve a high-degree edge: dozens of community colleges and school districts, Caltech, UCLA, UC Irvine, UC Santa Barbara, UC Riverside, UC San Diego, SDSC, county offices of education, and peers such as Abilene, CUDI, and ESnet.]


Heuristically Optimal Network

  • Mesh-like core of high-speed, low-degree routers

  • High-degree, low-speed nodes at the edge

  • Claim: consistent with drivers of topology design

    • Economic considerations (traffic aggregation)

    • End-user demands

  • Claim: consistent with technology constraints

  • Claim: consistent with real observed networks

Question: How could anyone imagine anything else?


Two Opposite Views of Complexity

Physics:

  • Pattern formation by reaction/diffusion
  • Edge-of-chaos
  • Order for free
  • Self-organized criticality
  • Phase transitions
  • Scale-free networks
  • Equilibrium, linear
  • Nonlinear, heavy tails as exotica

Engineering and math:

  • Constraints
  • Tradeoffs
  • Structure
  • Organization
  • Optimality
  • Robustness/fragility
  • Verification
  • Far from equilibrium
  • Nonlinear, heavy tails as tool

Principal difference: Random vs. Designed


Models of Internet Topology

  • To physicists, scaling relationships are suggestive of critical phenomena and phase transitions

  • The starting assumption is that of randomness, and one looks for emergent behaviors

Random Networks

Two methods for generating random networks having power-law distributions in node degree:

  • Preferential attachment (“scale-free” networks)

    • Inspired by statistical physics

    • Barabasi et al., 1999

  • Power Law Random Graph (PLRG)

    • Inspired by graph theory

    • Aiello, Chung, and Lu, 2000
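The PLRG construction is simple enough to sketch directly: give each node as many "stubs" as its prescribed degree, shuffle, and pair stubs up. The function name and the tiny degree sequence below are illustrative.

```python
import random

def plrg_edges(degrees, seed=0):
    """Stub-matching sketch of the Power Law Random Graph construction:
    node i gets degrees[i] stubs; all stubs are shuffled and paired.
    (Self-loops and parallel edges can occur, as in the PLRG model.)"""
    if sum(degrees) % 2:
        raise ValueError("degree sum must be even")
    rng = random.Random(seed)
    stubs = [i for i, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    return list(zip(stubs[0::2], stubs[1::2]))

# A tiny, hand-picked degree sequence (illustrative, not power-law at this size)
edges = plrg_edges([4, 2, 1, 1])
```

By construction every node ends up with exactly its prescribed degree, which is the point: the degree sequence is an input, and everything else about the graph is random.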


    Summary of “Scale-Free” Story

    • Fact: “Scale-free” networks have roughly power law degree distributions

    • Claim:

      • If the Internet has power law degree distribution

      • Then it must be “scale-free” (oops)

      • Therefore, it has the properties of a “scale-free” network

    • Characteristic features of “scale free” networks

      • High degree central “hubs”

      • Network connectivity is robust to loss of random nodes, but fragile to attack on central hubs

      • Highly likely to result from various random constructions


One of the most-read papers ever on the Internet!


    Scientists spot Achilles heel of the Internet

    • "The reason this is so is because there are a couple of very big nodes and all messages are going through them. But if someone maliciously takes down the biggest nodes you can harm the system in incredible ways. You can very easily destroy the function of the Internet."

    • These scientists compared the structure of the Internet to the airline network of the United States.


    Key Points

    • The scale-free story is based critically on the implied relationship between power laws and a network structure that has highly connected “central hubs”

    • Not all networks with power law degree distributions have properties of scale free networks. (The Internet is just one example!)

    • Building a model to replicate power law data is no more than curve fitting (descriptive, not explanatory)

    • The ubiquity of heavy-tailed (power-law) relationships in highly variable phenomena is to be expected for statistical reasons alone and requires no “special” or “exotic” explanation


    End Result

    The “scale-free” claims of the Internet are not merely wrong, they suggest properties that are the opposite of the real thing.

    Fundamental difference:

    random vs. designed


Internet Topologies: “scale-rich” vs. “scale-free”

[Two router-level topologies (nodes = routers, edges = links; 25 interior routers, 818 end systems) with identical power-law degree sequences: one has a high-degree, hub-like core, the other a low-degree, mesh-like core.]

How to characterize / compare these two networks?


Network Performance

Given realistic technology constraints on routers, how well is the network able to carry traffic?

  • Step 1: Constrain the network to be feasible (router bandwidth/degree combinations within the abstracted technologically feasible region).

  • Step 2: Compute the traffic demand xij between hosts with bandwidths Bi and Bj.

  • Step 3: Compute the max flow.
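These steps can be sketched for the simplest possible case. Assuming (purely for illustration) gravity-style demands x_ij proportional to B_i * B_j and a single shared core bottleneck of capacity C, the performance measure reduces to the largest uniform demand scaling rho the network can carry.

```python
def gravity_demand(bw, rho=1.0):
    """Step 2 sketch: demand x_ij = rho * B_i * B_j between distinct hosts."""
    n = len(bw)
    return [[rho * bw[i] * bw[j] if i != j else 0.0 for j in range(n)]
            for i in range(n)]

def max_demand_scale(bw, core_capacity):
    """Step 3 sketch (one-bottleneck special case): the largest rho such
    that all host pairs' flows, each crossing one shared core link, fit
    within that link's capacity."""
    total = sum(bw[i] * bw[j]
                for i in range(len(bw)) for j in range(i + 1, len(bw)))
    return core_capacity / total

print(max_demand_scale([1.0, 2.0, 3.0], core_capacity=22.0))  # 2.0
```

The real computation replaces the single bottleneck with a max-flow over the whole annotated graph, but the structure of the measure is the same: scale the gravity demands until some technology constraint binds.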


Network Likelihood

How likely is a particular graph (having a given node-degree distribution) to be constructed?

  • The notion of likelihood depends on defining an appropriate probability space for random graphs.

  • Many methods (all based on probabilistic preferential attachment) exist for randomly generating graphs having power-law degree distributions:

    • Power Law Random Graph (PLRG) [Aiello et al.]

    • Random rewiring (Markov chains)

In both cases, a log-likelihood (LLH) can be defined and compared across graphs.
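One concrete way to compare such likelihoods, used in the authors' related work on scale-free graphs, is the s-metric: s(g) is the sum of d_i * d_j over all edges. Among graphs with the same degree sequence, degree-preserving rewiring drifts toward high-s (hub-connected) configurations, so high-s graphs are the likely ones. A minimal sketch follows; the two example graphs are hand-made for illustration.

```python
def s_metric(edges):
    """s(g) = sum over edges (i, j) of d_i * d_j, where d is node degree.
    For a fixed degree sequence, higher s means the high-degree nodes
    connect to each other (hub-like), which random rewiring favors."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sum(deg[u] * deg[v] for u, v in edges)

# Two graphs with the identical degree sequence [3, 2, 2, 2, 1]:
hub_wired  = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]  # hub joins high-degree nodes
leaf_wired = [(0, 1), (0, 2), (0, 4), (1, 3), (2, 3)]  # hub joins the leaf

print(s_metric(hub_wired), s_metric(leaf_wired))  # 24 23
```

Even at this toy scale, wiring the hub to other well-connected nodes raises s; at Internet scale the gap between the likely (hub-cored) and the engineered (mesh-cored) graphs is enormous.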


Why such striking differences with the same node-degree distribution?

[Chart: the two topologies plotted against axes of performance (slow to fast) and likelihood (low to high).]


[Chart: performance (slow to fast) vs. likelihood (low to high), with bandwidth-vs-degree insets: the high-performance network has a fast core of low-degree routers and a slower, high-degree edge; the high-likelihood network has a slow core.]


    “Scale-free”

    • Core: hub-like, high degree

    • Edge: low degree

    • Robust to random failures

    • Fragile to “attack”

    • Low performance

    • High link costs

    • Highly likely, generic

    • Preserved by rewiring

    • Opposite of real Internet

    HOT “scale-rich” (+ objectives and constraints)

    • Core: mesh-like, low degree

    • Edge: high degree

    • Robust to random failures

    • Robust to “attack”

    • High performance

    • Low link costs

    • Unlikely, rare, designed

    • Destroyed by rewiring

    • Similar to real Internet
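The "preserved vs. destroyed by rewiring" contrast refers to degree-preserving rewiring: repeatedly swapping the endpoints of two random edges leaves every node's degree unchanged but can completely scramble an engineered wiring. A sketch of the swap operation (implementation details, including the rejection of self-loops and duplicate edges, are our assumptions):

```python
import random

def rewire(edges, steps, seed=0):
    """Degree-preserving rewiring: swap two random edges
    (a,b),(c,d) -> (a,d),(c,b), rejecting swaps that would create
    a self-loop or a duplicate edge. Degrees never change."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = set(frozenset(e) for e in edges)
    for _ in range(steps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        if len({a, b, c, d}) < 4:
            continue  # swap would create a self-loop or multi-edge
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # swap would duplicate an existing edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def degrees(edges):
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

g = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 4), (3, 5)]
g2 = rewire(g, 100)
# degree sequence is invariant, even though the wiring may differ
assert sorted(degrees(g).values()) == sorted(degrees(g2).values())
```

A HOT topology's performance degrades under such swaps because its careful core/edge placement is lost, while a generic random graph's statistics are unaffected.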


    [Figure: performance vs. likelihood, with HOT networks at low likelihood but high performance, random graphs at low likelihood and low performance, and hierarchical scale-free (HSF) graphs as the most likely]


    Optimization based approaches to understanding and modeling internet topology

    Key Points

    • High performance networks are extremely unlikely to be found by a random process.

    • Models that focus on “highly likely” constructions produce poorly performing graphs that are not representative of highly engineered networks.



    Recap

    • We do not claim that our “heuristically optimal topology” is an accurate representation of the real Internet, simply that it captures some basic features observed in real networks.

    • But it goes a long way toward dispelling much of the confusion about heavy-tailed node degree distributions (i.e., “scale-free” models are fundamentally inconsistent with engineering design despite their ability to match macro-statistics).

    • “Scale-free” models may be good representations of other systems, simply not the router-level of the Internet.

    • This highlights the importance of “random vs. designed”.

    • It is remarkable how even simple models based on fundamental technological and economic tradeoffs can go a long way to explaining large-scale network features.

    • These models are only a starting point for a more detailed investigation of Internet topology.



    Final Question:

    What still needs to be done to understand the large-scale structure of the Internet?

    How can researchers in OR/MS help to solve this problem?



    (to be done)

    • Understand the relationship between optimization drivers and topology

      • Example: Papadimitriou’s HOT

    • Description of more detailed optimization models that account for real economic and technological considerations (?)

    • Tie-in to WDM network optimization currently in vogue

    • (Need help thinking this through)



    Today’s Agenda

    • Review recent work of empiricists and theoreticians to understand the router-level topology of the Internet

    • Understand the causes and implications of “heavy tails” in the complex structure of the Internet

    • Illustrate how recently popular “scale-free” models of Internet topology are not just wrong, but wildly so

    • Describe the importance of optimization in the development of explanatory models of Internet topology

    • Present the HOT framework as an alternative means to understanding the “robust, yet fragile” structure of the Internet and other complex engineering systems

    • Highlight some open research problems and areas where contributions can be made by the OR/MS community



    Thank You

    [email protected]



    Appendix



    Recent Measurement Experiments

    • R. Govindan and H. Tangmunarunkit. Heuristics for Internet Map Discovery, Proceedings of IEEE INFOCOM (2000) [often known as the Mercator Project]

    • L. Gao. On inferring autonomous system relationships in the Internet, Proc. IEEE Global Internet Symposium (2000)

    • Route Views, University of Oregon Route Views Project. Available at http://www.antc.uoregon.edu/route-views/

    • A. Broido and k. Claffy. Internet Topology: Connectivity of IP Graphs, Proceedings of SPIE ITCom WWW Conf. (2001) [often known as the Skitter Project]

    • N. Spring, R. Mahajan, and D. Wetherall. Measuring ISP Topologies with Rocketfuel, Proc. ACM SIGCOMM (2002)

    • L. Subramanian, S. Agarwal, J. Rexford, and R. Katz. Characterizing the Internet Hierarchy from Multiple Vantage Points, Proc. IEEE INFOCOM (2002)

    • H. Chang, R. Govindan, S. Jamin, S. Shenker, and W. Willinger. Towards Capturing Representative AS-Level Internet Topologies, Proc. of ACM SIGMETRICS (2002)



    Models based on preferential attachment:

    • INET: uses both curve fitting and preferential attachment. It first calculates the frequency-degree and rank-degree distributions, then assigns degrees to each node according to these distributions, and finally matches these degrees according to the linear preferential model.

    • BRITE: incorporates preferential attachment together with observations of skewed node placement and locality in network connections on the Internet.

    • BA: preferential attachment.

    • AB: preferential attachment plus a third rewiring operation: choose links at random and rewire each end according to the same linear preference rule used in the BA generator.

    • GLP (Generalized Linear Preference): changes the preferential probability from d_i / Σ d_i to (d_i − β) / Σ (d_i − β), where β is a tunable parameter in (−∞, 1). With some probability it adds links preferentially, and with some other probability it adds nodes preferentially.

      Other models:

    • PLRG: given N nodes with an expected degree distribution, assign links among nodes with probability proportional to the product of the expected degrees of the two endpoints.

    • CMU power-law generator: works in two ways: (1) assign a power-law degree distribution to nodes and then place links in the adjacency matrix so that every node obtains its assigned degree; (2) recursively: define a probability distribution function that randomly selects a pair of nodes, and use it to produce a network graph with real-valued edge weights.
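As a reference point for the generators above, here is a minimal sketch of plain BA preferential attachment (our own illustration; seed-clique initialization and parameter names are assumptions, and real generators like BRITE or INET add many refinements):

```python
import random

def ba_graph(n, m, seed=0):
    """Barabasi-Albert preferential attachment: start from a small
    clique, then each new node attaches m edges to existing nodes
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # seed clique on m + 1 nodes so every early node has degree >= m
    edges = [(i, j) for i in range(m) for j in range(i + 1, m + 1)]
    # 'stubs' lists each node once per incident edge, so a uniform
    # draw from it is a degree-proportional draw over nodes
    stubs = [u for e in edges for u in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs += [new, t]
    return edges

edges = ba_graph(200, 2)
deg = {}
for u, v in edges:
    deg[u] = deg.get(u, 0) + 1
    deg[v] = deg.get(v, 0) + 1
# heavy tail: a few early nodes grab a disproportionate share of edges
print(max(deg.values()), min(deg.values()))
```

The resulting degree distribution is heavy-tailed, which is exactly why such generators match the measured macro-statistics while, as argued above, getting the core/edge structure of the router-level Internet backwards.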

