## Alex Kesselman, MPI


### Internet Routing Algorithms

### Peer-to-Peer Networks: Chord

### Switch Scheduling Algorithms

### Competitive Analysis: Theory and Applications in Networking

### Non-Preemptive Scheduling of Optical Switches

Algorithms for Networks

- Networking provides a rich new context for algorithm design
- algorithms are used everywhere in networks
- at the end-hosts for packet transmission
- in the network: switching, routing, caching, etc.
- many new scenarios
- and very stringent constraints
- high speed of operation
- large-sized systems
- cost of implementation
- require new approaches and techniques

Methods

In the networking context

- we also need to understand the “performance” of an algorithm: How well does a network or a component that uses a particular algorithm perform, as perceived by the user?
- performance analysis is concerned with metrics like delay, throughput, loss rates, etc
- metrics of the designer and of the theoretician not necessarily the same

Recent Algorithm Design Methods

- Motivated by the desire
- for simple implementations
- and for robust performance
- Several methods of algorithm design can be used in the networking context
- randomized algorithms
- approximation algorithms
- online algorithms
- distributed algorithms

In this Mini Course…

- We will consider a number of problems in networking
- Show various methods for algorithm design and for performance analysis

Network Layer Functions

transport packet from sending to receiving hosts

network layer protocols in every host, router

important functions:

path determination: route taken by packets from source to dest.

switching: move packets from router’s input to appropriate router output

[Figure: protocol stacks along a path. The end hosts run application, transport, network, data link, and physical layers; each router runs network, data link, and physical layers.]

Balaji Prabhakar

Routing

Graph abstraction for routing algorithms:

graph nodes are routers

graph edges are physical links

link cost: delay, $ cost, or congestion level

[Figure: example network graph with routers A, B, C, D, E, F and link costs between 1 and 5.]

Routing protocol

Goal: determine a “good” path (sequence of routers) through the network from source to destination.

- “good” path:
- typically means minimum cost path
- other def’s possible

Routing Algorithms Classification

Global or decentralized information?

- Global:
  - all routers have complete topology and link cost information
  - “link state” algorithms
- Decentralized:
  - each router knows its physically connected neighbors and the link costs to them
  - iterative process of information exchange with neighbors
  - “distance vector” algorithms

Static or dynamic?

- Static: routes change slowly over time
- Dynamic: routes change more quickly
  - periodic update
  - in response to link cost changes

Link-State Routing Algorithms: OSPF

Compute least cost paths from a node to all other nodes using Dijkstra’s algorithm.

advertisement carries one entry per neighbor router

advertisements disseminated via flooding

Dijkstra’s algorithm: example

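[Figure: example network graph with routers A, B, C, D, E, F and link costs between 1 and 5.]

| Step | N | D(B),p(B) | D(C),p(C) | D(D),p(D) | D(E),p(E) | D(F),p(F) |
|---|---|---|---|---|---|---|
| 0 | A | 2,A | 5,A | 1,A | ∞ | ∞ |
| 1 | AD | 2,A | 4,D | | 2,D | ∞ |
| 2 | ADE | 2,A | 3,E | | | 4,E |
| 3 | ADEB | | 3,E | | | 4,E |
| 4 | ADEBC | | | | | 4,E |
| 5 | ADEBCF | | | | | |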
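The computation in this example can be reproduced with a short sketch. The link costs below are an assumption: the figure did not survive in the transcript, so the edge assignment was chosen to agree with the D(v), p(v) values listed for the example. The function name `dijkstra` and the dictionary layout are illustrative only.

```python
import heapq

# Assumed link costs for the six-router example (chosen to match the
# D(v), p(v) values of the example; the original figure is not available).
GRAPH = {
    "A": {"B": 2, "C": 5, "D": 1},
    "B": {"A": 2, "C": 3, "D": 2},
    "C": {"A": 5, "B": 3, "D": 3, "E": 1, "F": 5},
    "D": {"A": 1, "B": 2, "C": 3, "E": 1},
    "E": {"C": 1, "D": 1, "F": 2},
    "F": {"C": 5, "E": 2},
}


def dijkstra(graph, source):
    """Return (dist, pred): least cost from source to each node and the
    predecessor of each node on its least-cost path."""
    dist = {source: 0}
    pred = {}
    done = set()              # the set N of nodes whose distance is final
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue          # stale heap entry
        done.add(u)
        for v, cost in graph[u].items():
            nd = d + cost
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                pred[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, pred


dist, pred = dijkstra(GRAPH, "A")
```

Under these assumed costs, `dist` and `pred` end with the same final entries as the example, for instance C reached at cost 3 via E and F at cost 4 via E.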

Route Optimization

Improve user performance and network efficiency by tuning OSPF weights to the prevailing traffic demands.

[Figure: the AT&T backbone connecting customers or peers on both sides.]

Route Optimization

- Traffic engineering
- Predict influence of weight changes on traffic flow
- Minimize objective function (say, of link utilization)
- Inputs
- Network topology: capacitated, directed graph
- Routing configuration: routing weight for each link
- Traffic matrix: offered load between each pair of nodes
- Outputs
- Shortest path(s) for each node pair
- Volume of traffic on each link in the graph
- Value of the objective function

Example

Links AB and BD are overloaded

[Figure: example network with nodes A, B, C, D, E and link weights 1 and 2.]

Change the weight of CD to 1 to improve routing (load balancing)!

References

- Anja Feldmann, Albert Greenberg, Carsten Lund, Nick Reingold, Jennifer Rexford, and Fred True, "Deriving traffic demands for operational IP networks: Methodology and experience," IEEE/ACM Transactions on Networking, pp. 265-279, June 2001.
- Bernard Fortz and Mikkel Thorup, "Internet traffic engineering by optimizing OSPF weights," in Proc. IEEE INFOCOM, pp. 519-528, 2000.

Distance Vector Routing: RIP

Based on the Bellman-Ford algorithm

At node X, the distance to Y is updated by

$$D_X(Y) = \min_{Z \in N(X)} \{\, c(X, Z) + D_Z(Y) \,\}$$

where $D_X(Y)$ denotes the current distance at X from X to Y, $N(X)$ is the set of neighbors of node X, and $c(X, Z)$ is the distance of the direct link from X to Z.

Distance Table: Example

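[Figure: example network with nodes A, B, C, D, E and link costs 1, 7, 2, 8, 1, 2. E's direct link costs are c(E,A) = 1, c(E,B) = 8, c(E,D) = 2.]

Below is just one step! The algorithm repeats forever!

E combines the distance tables received from its neighbors A, B, and D with its own link costs to compute its distance table, then sends the resulting distance vector to its neighbors:

| Destination | Distance from E | Next hop |
|---|---|---|
| A | 1 | A |
| B | 8 | B |
| C | 4 | D |
| D | 2 | D |
| E | 0 | (self) |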
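One update step at node E can be sketched as follows. The link costs c(E,A) = 1, c(E,B) = 8, c(E,D) = 2 come from the example; the neighbors' distance vectors are an assumption, read off the partly garbled figure, and the function name `dv_update` is illustrative.

```python
INF = float("inf")


def dv_update(link_cost, neighbor_vectors):
    """One Bellman-Ford step at a node X.

    link_cost[z]           = c(X, z) for each neighbor z
    neighbor_vectors[z][y] = D_z(y), the distance z last advertised to y
    Returns {y: (distance, next_hop)} using D_X(y) = min_z c(X, z) + D_z(y).
    """
    destinations = set()
    for vector in neighbor_vectors.values():
        destinations.update(vector)
    table = {}
    for y in sorted(destinations):
        best_cost, best_hop = INF, None
        for z, c in link_cost.items():
            candidate = c + neighbor_vectors[z].get(y, INF)
            if candidate < best_cost:   # ties keep the first neighbor found
                best_cost, best_hop = candidate, z
        table[y] = (best_cost, best_hop)
    return table


# Link costs from the example; neighbor vectors are assumed values.
link_cost = {"A": 1, "B": 8, "D": 2}
neighbor_vectors = {
    "A": {"A": 0, "B": 7},
    "B": {"A": 7, "B": 0, "C": 1},
    "D": {"C": 2, "D": 0},
}
table = dv_update(link_cost, neighbor_vectors)
costs = {dest: cost for dest, (cost, _) in table.items()}
```

With these inputs the costs agree with the vector E sends (A: 1, B: 8, C: 4, D: 2). Destination B is a tie between going via A (1 + 7) and going directly (8), so the next hop reported for B depends on the tie-breaking rule.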

Link Failure and Recovery

- Distance vectors: exchanged every 30 sec
- If no advertisement heard after 180 sec --> neighbor/link declared dead
- routes via neighbor invalidated
- new advertisements sent to neighbors
- neighbors in turn send out new advertisements (if tables changed)
- link failure info quickly propagates to entire net

How are these loops caused?

- Observation 1:
- B’s metric increases
- Observation 2:
- C picks B as next hop to A
- But, the implicit path from C to A includes itself!

Solutions

- Split horizon/Poisoned reverse
- B does not advertise route to C or advertises it with infinite distance (16)
- Works for two node loops
- does not work for loops with more nodes

Example where Split Horizon fails

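[Figure: nodes A, B, C, D. A, B, and C are connected to one another, C is connected to D, and every link has cost 1.]

- When the link between C and D breaks, C marks D as unreachable and reports that to A and B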
- Suppose A learns it first. A now thinks best path to D is through B. A reports a route of cost=3 to C.
- C thinks D is reachable through A at cost 4 and reports that to B.
- B reports a cost 5 to A who reports new cost to C.
- etc...

Comparison of LS and DV algorithms

Message complexity

LS: with n nodes, E links, O(nE) msgs sent

DV: exchange between neighbors only

larger msgs

Speed of Convergence

LS: requires O(nE) msgs

may have oscillations

DV: convergence time varies

routing loops

count-to-infinity problem

Robustness: what happens if router malfunctions?

LS:

node can advertise incorrect link cost

each node computes only its own table

DV:

DV node can advertise incorrect path cost

error propagates thru network

Hierarchical Routing

scale: with 50 million destinations:

can’t store all dest’s in routing tables!

routing table exchange would swamp links!

administrative autonomy

internet = network of networks

each network admin may want to control routing in its own network

Our routing study thus far - idealization

- all routers identical
- network “flat”

… not true in practice

Hierarchical Routing

- aggregate routers into regions, “autonomous systems” (AS)
- routers in the same AS run the same routing protocol
  - “intra-AS” routing protocol
- gateway routers: special routers in an AS
  - run the intra-AS routing protocol with all other routers in the AS
  - also responsible for routing to destinations outside the AS
  - run an inter-AS routing protocol with other gateway routers

Intra-AS and Inter-AS routing

[Figure: three autonomous systems A, B, and C with gateway routers such as A.a, A.c, B.a, and C.b. Traffic from host h1 to host h2 uses intra-AS routing within AS A, inter-AS routing between A and B, and intra-AS routing within AS B.]


A peer-to-peer storage problem

- 1000 scattered music enthusiasts
- Willing to store and serve replicas
- How do you find the data?

Centralized lookup (Napster)

[Figure: nodes N1 to N9 around a central database (DB). The publisher at N4 holds Key=“title”, Value=MP3 data and registers it with SetLoc(“title”, N4); the client sends Lookup(“title”) to the DB.]

Simple, but O(N) state and a single point of failure

Flooded queries (Gnutella)

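[Figure: nodes N1 to N9 without a central database. The client's Lookup(“title”) is flooded from node to node until it reaches the publisher at N4, which holds Key=“title”, Value=MP3 data.]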

Robust, but worst case O(N) messages per lookup

Routed queries (Freenet, Chord, etc.)

[Figure: nodes N1 to N9. The client's Lookup(“title”) is forwarded hop by hop along a single route to the publisher, which holds Key=“title”, Value=MP3 data.]

Chord Distinguishing Features

- Simplicity
- Provable Correctness
- Provable Performance

Chord Simplicity

- Resolution entails participation by O(log(N)) nodes
- Resolution is efficient when each node enjoys accurate information about O(log(N)) other nodes

Chord Algorithms

- Basic Lookup
- Node Joins
- Stabilization
- Failures and Replication

Chord Properties

- Efficient: O(log(N)) messages per lookup
- N is the total number of servers
- Scalable: O(log(N)) state per node
- Robust: survives massive failures

Chord IDs

- Key identifier = SHA-1(key)
- Node identifier = SHA-1(IP address)
- Both are uniformly distributed
- Both exist in the same ID space
- How to map key IDs to node IDs?

Consistent Hashing [Karger 97]

- Target: web page caching
- Like normal hashing, assigns items to buckets so that each bucket receives roughly the same number of items
- Unlike normal hashing, a small change in the bucket set does not induce a total remapping of items to buckets

Consistent Hashing [Karger 97]

[Figure: circular 7-bit ID space with nodes N32, N90, N105 (node 105) and keys K5 (key 5), K20, K80.]

A key is stored at its successor: the node with the next higher ID.
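The successor rule can be sketched in a few lines, using the node and key IDs from the figure. The helper name `successor` and the extra key 110 (added to show wrap-around) are illustrative assumptions, not part of the slides.

```python
import bisect


def successor(node_ids, key_id, bits=7):
    """Return the first node ID at or after key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    key_id %= 2 ** bits
    i = bisect.bisect_left(ring, key_id)
    return ring[i % len(ring)]   # past the last node, wrap to the first


nodes = [32, 90, 105]
placement = {key: successor(nodes, key) for key in (5, 20, 80, 110)}
```

Here K5 and K20 land on N32, K80 lands on N90, and the hypothetical key 110 wraps around to N32. Adding or removing one node only changes the owner of keys on the arc next to that node, which is the property that distinguishes this scheme from ordinary hashing.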

Simple lookup algorithm

    Lookup(my-id, key-id)
        n = my successor
        if my-id < n < key-id
            call Lookup(key-id) on node n   // next hop
        else
            return my successor             // done

- Correctness depends only on successors
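A minimal runnable sketch of the successor-only lookup, assuming a 7-bit ring and the three nodes from the consistent hashing figure. The comparison `my-id < n < key-id` has to be read clockwise on the ring; the helper `in_open_interval` and the `succ` dictionary are illustrative names.

```python
def in_open_interval(x, a, b, size):
    """True if x lies strictly between a and b when walking clockwise
    around a ring with `size` identifiers."""
    return 0 < (x - a) % size < (b - a) % size


def simple_lookup(start, key_id, succ, size=128):
    """Follow successor pointers from `start` until the node responsible
    for key_id is found. Returns (responsible_node, hops)."""
    node, hops = start, 0
    while in_open_interval(succ[node], node, key_id, size):
        node = succ[node]     # next hop
        hops += 1
    return succ[node], hops   # done


# Successor pointers for the ring N32 -> N90 -> N105 -> N32.
succ = {32: 90, 90: 105, 105: 32}
owner_80, hops_80 = simple_lookup(32, 80, succ)
owner_20, hops_20 = simple_lookup(32, 20, succ)
```

Starting at N32, key 80 resolves immediately to N90, while key 20 needs two hops around the ring before N105 reports N32 as the owner. The number of hops grows linearly with the number of nodes, which motivates the finger table.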

Lookup with fingers

    Lookup(my-id, key-id)
        look in local finger table for
            highest node n s.t. my-id < n < key-id
        if n exists
            call Lookup(key-id) on node n   // next hop
        else
            return my successor             // done
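The finger-based lookup can be sketched as below. This assumes the standard Chord finger definition (finger i of node n is the successor of n + 2^i), a 7-bit ID space, and a made-up set of node IDs; none of these specifics are given on the slide.

```python
import bisect

BITS = 7
SIZE = 2 ** BITS


def between(x, a, b):
    """True if x lies strictly between a and b going clockwise on the ring."""
    return 0 < (x - a) % SIZE < (b - a) % SIZE


def ring_successor(ring, ident):
    """First node at or after ident on the sorted ring, with wrap-around."""
    i = bisect.bisect_left(ring, ident % SIZE)
    return ring[i % len(ring)]


def build_fingers(ring):
    """finger[i] of node n = successor(n + 2^i); finger[0] is n's successor."""
    return {n: [ring_successor(ring, n + 2 ** i) for i in range(BITS)]
            for n in ring}


def finger_lookup(start, key_id, fingers):
    """Return (responsible_node, hops) for key_id, starting at node `start`."""
    node, hops = start, 0
    while True:
        # Fingers that lie strictly between this node and the key.
        candidates = [f for f in fingers[node] if between(f, node, key_id)]
        if not candidates:
            return fingers[node][0], hops          # my successor: done
        # "Highest" such finger: the one farthest clockwise from this node.
        node = max(candidates, key=lambda f: (f - node) % SIZE)
        hops += 1


ring = [5, 20, 32, 47, 60, 75, 90, 105, 118]       # hypothetical node IDs
fingers = build_fingers(ring)
owner, hops = finger_lookup(5, 100, fingers)
```

Each hop at least halves the remaining clockwise distance to the key, which is where the O(log N) hop count comes from. As in the pseudocode, a key whose ID equals the starting node's own ID is not treated specially.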

Node Join (4)

4. Set N25’s successor pointer

[Figure: ring segment with nodes N25, N36, N40 and keys K30, K38.]

Update finger pointers in the background

Correct successors produce correct lookups

Stabilization

- Case 1: finger tables are reasonably fresh
- Case 2: successor pointers are correct; fingers are inaccurate
- Case 3: successor pointers are inaccurate or key migration is incomplete
- Stabilization algorithm periodically verifies and refreshes node knowledge
- Successor pointers
- Predecessor pointers
- Finger tables

Failures and Replication

[Figure: ring with nodes N10, N80, N85, N102, N113, N120; a Lookup(90) is issued at N80.]

N80 doesn’t know correct successor, so incorrect lookup

Solution: successor lists

- Each node knows r immediate successors
- After failure, will know first live successor
- Correct successors guarantee correct lookups
- Guarantee is with some probability

Choosing the successor list length

- Assume 1/2 of nodes fail
- P(successor list all dead) = $(1/2)^r$
- I.e. P(this node breaks the Chord ring)
- Depends on independent failure
- P(no broken nodes) = $(1 - (1/2)^r)^N$
- $r = 2\log(N)$ makes prob. $\approx 1 - 1/N$
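A short check of the last bullet, assuming the logarithm is base 2 and that nodes fail independently (both are assumptions needed for the arithmetic to work out):

$$
\left(\tfrac{1}{2}\right)^{r} = \left(\tfrac{1}{2}\right)^{2\log_2 N} = \frac{1}{N^{2}},
\qquad
P(\text{no broken nodes}) = \left(1 - \frac{1}{N^{2}}\right)^{N} \ge 1 - N \cdot \frac{1}{N^{2}} = 1 - \frac{1}{N}.
$$

The inequality is Bernoulli's inequality, $(1 - x)^N \ge 1 - Nx$ for $0 \le x \le 1$, so $1 - 1/N$ is best read as a lower bound on the probability that no node breaks the ring.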

References

Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M. Frans Kaashoek, Frank Dabek, Hari Balakrishnan, "Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications," IEEE/ACM Transactions on Networking, Vol. 11, No. 1, pp. 17-32, February 2003.


Basic Architectural Components

1. Forwarding decision (routing table lookup at each input)
2. Interconnect
3. Output scheduling

Switching Fabrics

[Figure: a Batcher sorter followed by a self-routing network, sorting cells by their 3-bit output addresses (000 to 111).]

- Input queued
- Output queued
- Combined input and output queued
- Multistage
- Parallel packet switches

Background

- [Karol et al. 1987] Throughput is limited to $2 - \sqrt{2} \approx 58.6\%$ by head-of-line blocking for Bernoulli IID uniform traffic.
- [Tamir 1989] Observed that with “Virtual Output Queues” (VOQs) Head-of-Line blocking is reduced and throughput goes up.

Background: Scheduling via Matching

- [Anderson et al. 1993] Observed analogy to maximum size matching in a bipartite graph – $O(N^{2.5})$.
- [McKeown et al. 1995] (a) Maximum size match cannot guarantee 100% throughput. (b) But maximum weight match can – $O(N^3)$.

Background: Speedup

- [Chuang, Goel et al. 1997] Precise emulation of a central shared memory switch is possible with a speedup of two and a “stable marriage” scheduling algorithm.
- [Prabhakar and Dai 2000] 100% throughput possible for maximal matching with a speedup of two.

Scheduling algorithms to achieve 100% throughput

- Basic switch model.
- When traffic is uniform (Many algorithms…)
- When traffic is non-uniform.
- Technique: Birkhoff-von Neumann decomposition.
- Load balancing.
- Technique: 2-stage switch.
- Technique: Parallel Packet Switch.

Some possible performance goals

When traffic is admissible


Algorithms that give 100% throughput for uniform traffic

- Quite a few algorithms give 100% throughput when traffic is uniform
- “Uniform”: the destination of each cell is picked independently and uniformly at random (uar) from the set of all outputs.

Maximum size bipartite match

- Intuition: maximizes instantaneous throughput
- Gives 100% throughput for uniform traffic.

[Figure: the “request” graph, a bipartite graph with an edge from input i to output j whenever $L_{ij}(n) > 0$, and a maximum size bipartite match on it.]

Some Observations

- A maximum size match (MSM) maximizes instantaneous throughput.
- But an MSM is complex – $O(N^{2.5})$.
- In general, maximal matching is much simpler to implement, and has a much faster running time.
- A maximal matching is at least half the size of a maximum size matching.
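The gap between maximal and maximum matchings can be seen on a two-input request graph. The sketch below uses a greedy scan for the maximal matching and a simple augmenting-path search for the maximum matching; the latter is a stand-in chosen for brevity, not the $O(N^{2.5})$ algorithm mentioned above, and all names are illustrative.

```python
def maximal_match(requests):
    """Greedy maximal matching: give each input the first free output it
    requests. No edge can be added afterwards, but it may not be maximum."""
    used_outputs, match = set(), {}
    for i in sorted(requests):
        for j in sorted(requests[i]):
            if j not in used_outputs:
                match[i] = j
                used_outputs.add(j)
                break
    return match


def maximum_match(requests):
    """Maximum size matching via augmenting paths (simple, not the fastest)."""
    owner = {}   # output -> input currently matched to it

    def try_assign(i, seen):
        for j in requests[i]:
            if j in seen:
                continue
            seen.add(j)
            # Take j if it is free, or if its current owner can move elsewhere.
            if j not in owner or try_assign(owner[j], seen):
                owner[j] = i
                return True
        return False

    for i in requests:
        try_assign(i, set())
    return {i: j for j, i in owner.items()}


# Input 0 has cells for outputs 0 and 1; input 1 only for output 0.
requests = {0: [0, 1], 1: [0]}
greedy = maximal_match(requests)
best = maximum_match(requests)
```

The greedy scan matches input 0 to output 0 and leaves input 1 idle (size 1), while the maximum matching serves both inputs (size 2), which is exactly the factor of two allowed by the bound.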

TDM Scheduling Algorithm

If arriving traffic is i.i.d with destinations picked uar across outputs, then a “TDM” schedule gives 100% throughput.

[Figure: a 4×4 switch with inputs A, B, C, D and outputs 1, 2, 3, 4 stepping through a sequence of permutations, one per time slot.]

Permutations are picked uar from the set of N! permutations.
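One simple TDM schedule is the cyclic one sketched below, where the permutation is shifted by one output every slot. This particular choice is an assumption for illustration; the random choice among all N! permutations mentioned above serves each input-output pair at the same long-run rate of 1/N.

```python
def tdm_schedule(n_ports, slot):
    """Permutation used in time slot `slot`: input i is connected to
    output (i + slot) mod N."""
    return [(i + slot) % n_ports for i in range(n_ports)]


N = 4
# One frame of N consecutive slots.
frame = [tdm_schedule(N, t) for t in range(N)]
```

Within a frame every input is connected to every output exactly once, so each virtual output queue is served at rate 1/N. Under uniform traffic each queue receives cells at a rate below 1/N whenever the input load is below 1, which is why this fixed schedule suffices.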

Why doesn’t maximizing instantaneous throughput give 100% throughput for non-uniform traffic?

Three possible

matches, S(n):


Example: With random arrivals, but known traffic matrix

- Assume we know the traffic matrix, and the arrival pattern is random:
- Then we can simply choose:

Birkhoff - von Neumann Decomposition

It turns out that any doubly stochastic rate matrix Λ can be decomposed into a convex combination of permutation matrices $(M_1, \ldots, M_r)$, by the Birkhoff-von Neumann theorem.
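A minimal sketch of the decomposition, assuming the input matrix is doubly stochastic (all row and column sums equal to 1) and given as exact fractions. It repeatedly finds a perfect matching on the positive entries and peels off the largest possible multiple of that permutation. Function names are illustrative, and the matching routine is a plain augmenting-path search.

```python
from fractions import Fraction as F


def perfect_matching(m):
    """Perfect matching on the positive entries of square matrix m.
    Returns perm with perm[i] = column matched to row i."""
    n = len(m)
    col_owner = [None] * n

    def assign(i, seen):
        for j in range(n):
            if m[i][j] > 0 and j not in seen:
                seen.add(j)
                if col_owner[j] is None or assign(col_owner[j], seen):
                    col_owner[j] = i
                    return True
        return False

    for i in range(n):
        if not assign(i, set()):
            raise ValueError("matrix is not doubly stochastic")
    perm = [None] * n
    for j, i in enumerate(col_owner):
        perm[i] = j
    return perm


def bvn_decompose(matrix):
    """Return [(weight, perm), ...] with sum(weight * P_perm) == matrix."""
    m = [[F(x) for x in row] for row in matrix]
    terms = []
    while any(x > 0 for row in m for x in row):
        perm = perfect_matching(m)
        weight = min(m[i][perm[i]] for i in range(len(m)))
        for i in range(len(m)):
            m[i][perm[i]] -= weight   # at least one entry becomes zero
        terms.append((weight, perm))
    return terms


rates = [[F(1, 2), F(1, 2), F(0)],
         [F(1, 4), F(1, 4), F(1, 2)],
         [F(1, 4), F(1, 4), F(1, 2)]]
terms = bvn_decompose(rates)
```

A scheduler can then serve permutation $M_k$ for a fraction of time equal to its weight. Each round zeroes at least one entry, so the number of permutations is bounded by the number of nonzero entries of the matrix.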

In practice…

- Unfortunately, we usually don’t know the traffic matrix Λ a priori, so we can:
- measure or estimate Λ, or
- use the current queue occupancies.


2-stage Switch

Motivation:

- If traffic is uniformly distributed, then even a simple TDM schedule gives 100% throughput.
- So why not force non-uniform traffic to be uniformly distributed?

2-stage Switch

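[Figure: two N×N stages in series. A bufferless load-balancing stage with permutation $S_1(n)$ takes arrivals $A_1(n), \ldots, A_N(n)$ and produces $A'_1(n), \ldots, A'_N(n)$; a buffered switching stage with queues $L_{11}(n), \ldots, L_{NN}(n)$ and permutation $S_2(n)$ produces departures $D_1(n), \ldots, D_N(n)$.]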

Parallel Packet Switches

Definition:

A PPS is comprised of multiple identical lower-speed packet-switches operating independently and in parallel. An incoming stream of packets is spread, packet-by-packet, by a demultiplexor across the slower packet-switches, then recombined by a multiplexor at the output.

We call this “parallel packet switching”

Architecture of a PPS

[Figure: a PPS with N = 4 external ports and k = 3 OQ switches. Each input has a demultiplexor and each output a multiplexor; external lines run at rate R and the internal links to each OQ switch run at rate sR/k.]

Parallel Packet Switches: Results

[Iyer et al.] If the speedup S ≥ 2, then a PPS can precisely emulate a FIFO output queued switch for all traffic patterns, and hence achieves 100% throughput.

References

- C.-S. Chang, W.-J. Chen, and H.-Y. Huang, "Birkhoff-von Neumann input buffered crossbar switches," in Proceedings of IEEE INFOCOM '00, Tel Aviv, Israel, 2000, pp. 1614 – 1623.
- N. McKeown, A. Mekkittikul, V. Anantharam, and J. Walrand. Achieving 100% Throughput in an Input-Queued Switch. IEEE Transactions on Communications, 47(8), Aug 1999.
- A. Mekkittikul and N. W. McKeown, "A practical algorithm to achieve 100% throughput in input-queued switches," in Proceedings of IEEE INFOCOM '98, March 1998.
- L. Tassiulas, “Linear complexity algorithms for maximum throughput in radio networks and input queued switches,” in Proc. IEEE INFOCOM ‘98, San Francisco CA, April 1998.
- C.-S. Chang, D.-S. Lee, Y.-S. Jou, “Load balanced Birkhoff-von Neumann switches,” Proceedings of IEEE HPSR ‘01, May 2001, Dallas, Texas.
- S. Iyer, N. McKeown, "Making parallel packet switches practical," in Proc. IEEE INFOCOM `01, April 2001, Alaska.


Decision Making Under Uncertainty: Online Algorithms and Competitive Analysis

- Online Algorithm:
- Inputs arrive online (one by one)
- Algorithm must process each input as it arrives
- Lack of knowledge of future arrivals results in inefficiency
- Malicious, All-powerful Adversary:
- Omniscient: monitors the algorithm
- Generates “worst-case” inputs
- Competitive Ratio:
- Worst ratio of the “cost” of online algorithm to the “cost” of optimum algorithm

Competitive Analysis: Discussion

- Very Harsh Model
- All powerful adversary
- But..
- Can often still prove good competitive ratios
- Really tough Testing-Ground for Algorithms
- Often leads to good rules of thumb which can be validated by other analyses
- Distribution independent: doesn’t matter whether traffic is heavy-tailed or Poisson or Bernoulli

Competitive Analysis in Networking: Outline

- Shared Memory Switches
- Multicast Trees
- The Greedy Strategy
- Routing and Admission Control
- The Exponential Metric
- More Restricted Adversaries
- Adversarial Queueing Theory
- Congestion Control

Buffer Model

- We consider an N×N switch
- Shared memory able to hold M bytes
- Packets may be either:
- accepted/rejected
- preempted
- All packets have the same size

Competitive Analysis

Aim: maximize the total number of packets transmitted

For each packet sequence S denote,

- $V_{OPT}(S)$: value of the best possible solution,
- $V_A(S)$: value obtained by algorithm A

Throughput-Competitive Ratio: $\max_S \{ V_{OPT}(S) / V_A(S) \}$

Uniform performance guarantee

Longest Queue Drop Policy

When a packet arrives:

- Always accept if the buffer is not full
- Otherwise we accept the packet and drop a packet from the tail of the longest queue
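The policy can be sketched as a small buffer class. This is a minimal sketch under stated assumptions: the capacity is counted in equal-size packets, the arriving packet is enqueued before the longest queue is chosen (so it may itself be the one dropped), and ties between equally long queues go to the lowest port number. The class and method names are illustrative.

```python
from collections import deque


class LQDBuffer:
    """Shared memory holding at most `capacity` equal-size packets,
    organized as one FIFO queue per output port."""

    def __init__(self, n_ports, capacity):
        self.queues = [deque() for _ in range(n_ports)]
        self.capacity = capacity

    def occupancy(self):
        return sum(len(q) for q in self.queues)

    def arrive(self, port, packet):
        """Accept the packet. If the memory overflows, drop the packet at
        the tail of the longest queue. Returns the dropped packet or None."""
        self.queues[port].append(packet)
        if self.occupancy() <= self.capacity:
            return None
        longest = max(self.queues, key=len)   # first longest queue on ties
        return longest.pop()

    def transmit(self):
        """One time slot: every non-empty port sends its head-of-line packet."""
        return [q.popleft() if q else None for q in self.queues]


buf = LQDBuffer(n_ports=2, capacity=4)
drops = [buf.arrive(0, f"a{i}") for i in range(4)]   # fill the memory via port 0
dropped = buf.arrive(1, "b0")                        # overflow: port 0 loses its tail
```

After the burst to port 0 fills the memory, the arrival for port 1 is admitted and the last packet of port 0's queue is preempted, so both ports have something to send in the next slot.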

LQD Policy Analysis

Theorem 1 (UB): The competitive ratio of the LQD policy is at most 2.

Theorem 2 (LB): The competitive ratio of the LQD policy is at least $\sqrt{2}$.

Theorem 3 (LB): The competitive ratio of any online policy is at least 4/3.

Proof Outline (UB)

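[Figure: the packets sent by OPT compared with those sent by LQD; the surplus sent by OPT is labelled EXTRA.]

Definition: An OPT packet p sent at time t is an extra packet if the corresponding LQD output port is idle at time t.

Claim: There exists a matching of each packet in EXTRA to a packet in LQD.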

Matching Construction

- For each unmatched OPT packet p in a higher position than the LQD queue length:
- When p arrives and it is accepted by both OPT and LQD then match p to itself
- Otherwise, match p to any unmatched packet in LQD
- If a matched LQD packet p is preempted, then the preempting packet replaces p.

Proof Outline (UB)

Lemma:The matching process never fails.

- Notice: V(OPT) ≤ V(LQD) + V(EXTRA)
- Existence of the matching implies: V(EXTRA) ≤ V(LQD)
- We obtain that: V(OPT) ≤ 2 V(LQD)

Proof Outline (LB)

- Scenario (active ports 1 & 2):
- At t = 0 two bursts of M packets to 1 & 2 arrive.
- The online retains at most M/2, say 1’s packets.
- During the following M time slots one packet destined to 2 arrives.
- The scenario is repeated.

Proof Outline (LB-LQD)

- Scenario:
- the switch memory $M = A^2/2 + A$
- the number of output ports N = 3A

[Figure: the 3A output ports are split into three groups of A ports each: active, overloaded, and idle.]

Proof Outline (LB-LQD)

- Active output ports:
- have an average load of 1 with period A
- the bursts to successive ports are evenly staggered in time
- Overloaded output ports:
- receive exactly 2 packets every time slot

Proof Outline (LB-LQD)

- OPT ensures that both the active and overloaded output ports are completely utilized.
- At the same time the throughput of the active output ports in LQD is $(\sqrt{2} - 1)A$.

Other Policies

Complete Partition: N-competitive

- Allocate to each output port M/N buffer space

Complete Sharing: N-competitive

- Admit packets into the buffer if there is some free space

Other Policies Cont.

Static Threshold: N-competitive

- Set the threshold for a queue length to M/N
- A packet is admitted if the threshold is not violated and there is a free space

Dynamic Threshold: open problem

- Set the threshold for a queue length to the amount of the free buffer space
- All packets above the threshold are rejected


Steiner Tree Problem

Objective: find a minimum cost tree connecting S.

KMB Algorithm (Offline), due to [Kou, Markowsky and Berman ’81]

- Step 1: Construct the complete undirected distance graph G1=(V1,E1,c1) on the terminal set S, where each edge cost is the shortest-path distance in G.
- Step 2: Find the min spanning tree T1 of G1.
- Step 3: Construct a subgraph GS of G by replacing each edge in T1 by its corresponding shortest path in G.
- Step 4: Find the min spanning tree TS of GS.
- Step 5: Construct a Steiner tree TH from TS by deleting edges in TS if necessary, so that all the leaves in TH are Steiner points.

KMB Algorithm Cont.

Worst case time complexity $O(|S||V|^2)$.

Cost no more than $2(1 - 1/l)$ times the optimal cost, where l = number of leaves in the Steiner tree.

KMB Example

[Figure: KMB worked example on a graph with nodes A to I, distinguishing destination nodes from intermediate nodes and showing the graphs produced by the successive steps.]

Incremental Construction of Multicast Trees

- Fixed Multicast Source s
- K receivers arrive one by one
- Must adapt multicast tree to each new arrival without rerouting existing receivers
- Malicious adversary generates bad requests
- Objective: Minimize total size of multicast tree

[Figure: example with source s, receiver r1, and nodes a and b, giving a competitive ratio C = 3/2. Worse sequences can be created.]

Dynamic Steiner Tree (DST)

- G=(V,E) weighted, undirected, connected graph.
- $S_i \subseteq V$ is the set of terminal nodes to be connected at step i.

Two Classes of Online Algorithms

- Shortest Path Algorithm
- Each receiver connects using shortest path to source (or to a core)
- DVMRP [Waitzman, Partridge, Deering ’88]
- CBT [Ballardie, Francis, Crowcroft ‘93]
- PIM [Deering et al. ’96]
- Greedy Algorithm [Imase and Waxman ‘91]
- Each receiver connects to the closest point on the existing tree
- Independently known to the Systems community
- The “naive” algorithm [Doar and Leslie ‘92]
- End-system multicasting [Faloutsos, Banerjea, Pankaj ’98; Francis ‘99]

Shortest Path Algorithm: Competitive Ratio

- Optimum Cost K + N
- If N is large, the competitive ratio is K

[Figure: source s and receivers r1, r2, r3, …, rK.]

Greedy Algorithm

- Theorem 1: For the greedy algorithm, competitive ratio = O(log K)
- Theorem 2: No algorithm can achieve a competitive ratio better than log K

[Imase and Waxman ’91]

Greedy algorithm is the optimum strategy

Proof of Theorem 1

[Alon and Azar ’93]

- L = Size of the optimum multicast tree
- pi = amount paid by online algorithm for ri
- i.e. the increase in size of the greedy multicast tree as a result of adding receiver ri
- Lemma 1: The greedy algorithm pays 2L/j or more for at most j receivers
- Assume the lemma
- Total cost ≤ 2L (1 + 1/2 + 1/3 + … + 1/K) ≈ 2L log K

Proof of Lemma 1

Suppose towards a contradiction that there are more than j receivers for which the greedy algorithm paid more than 2L/j

- Let these be r1, r2, … , rm, for m larger than j
- Each of these receivers is at least 2L/j away from each other and from the source

Tours and Trees

[Figure: a tour through s, r1, r2, r3, r4, …, rm, shown next to the optimum tree on the same nodes.]

- Each segment ≥ 2L/j, so tour cost > 2L
- One can construct a tour from the tree by repeating edges at most twice, so tour cost ≤ 2L


The Exponential Cost Metric

- Consider a resource with capacity C
- Assume that a fraction l of the resource has been consumed
- Exponential cost “rule of thumb”: The cost of the resource is given by $m^{l}$ for appropriately chosen m
- Intuition: Cost increases steeply with l
- Bottleneck resources become expensive

[Figure: cost plotted against l.]

Applications of Exponential Costs

- Exponential cost “rule of thumb” applies to
- Online Routing
- Online Call Admission Control
- Stochastic arrivals
- Stale Information
- Power aware routing

The Online Routing Problem

- Connection establishment requests arrive online in a VPN (Virtual Private Network)
- Must assign a route to each connection and reserve bandwidth along that route
- PVCs in ATM networks
- MPLS + RSVP in IP networks
- Oversubscribing is allowed
- Congestion = the worst oversubscribing on a link
- Goal: Assign routes to minimize congestion
- Assume all connections have identical b/w requirement, all links have identical capacity

Online Algorithm for Routing

- $l_L$ = fraction of the bandwidth of link L that has already been reserved
- m = N, the size of the network
- The Exponential Cost Algorithm:
- Each link L costs $m^{l_L}$
- Route each incoming connection on the current cheapest path from src to dst
- Reserve bandwidth along this path

[Aspnes et al. ‘93]
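The rule of thumb can be sketched as an online routing loop. This is a sketch under stated assumptions: every connection needs one unit of bandwidth, every link has the same capacity, each link costs m raised to its reserved fraction, and every destination is reachable. It follows the slide's simplified rule rather than the exact weights used in the cited paper, and the four-node network is made up for illustration.

```python
import heapq


def cheapest_path(adj, cost, src, dst):
    """Dijkstra over directed links; cost[(u, v)] is the current link cost.
    Assumes dst is reachable from src."""
    dist, pred, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry
        for v in adj[u]:
            nd = d + cost[(u, v)]
            if nd < dist.get(v, float("inf")):
                dist[v], pred[v] = nd, u
                heapq.heappush(heap, (nd, v))
    path, node = [dst], dst
    while node != src:
        node = pred[node]
        path.append(node)
    return path[::-1]


def route_online(adj, capacity, requests, m):
    """Route unit-bandwidth requests one by one on the cheapest path
    under exponential link costs m ** (load / capacity)."""
    load = {(u, v): 0 for u in adj for v in adj[u]}
    routes = []
    for src, dst in requests:
        cost = {e: m ** (load[e] / capacity) for e in load}
        path = cheapest_path(adj, cost, src, dst)
        for e in zip(path, path[1:]):
            load[e] += 1                  # reserve bandwidth along the path
        routes.append(path)
    return routes, load


# Hypothetical network: two parallel two-hop paths from s to t.
adj = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": []}
routes, load = route_online(adj, capacity=2, requests=[("s", "t")] * 4, m=4)
```

Because a loaded link quickly becomes expensive, consecutive requests alternate between the two paths and all four links end with the same load.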

Online Algorithm for Routing

- Theorem 1: The exponential cost algorithm achieves a competitive ratio of O(log N) for congestion
- Theorem 2: No algorithm can achieve competitive ratio better than log N in asymmetric networks

This simple strategy is optimum!


Online Admission Control and Routing

- Connection establishment requests arrive online
- Must assign a route to each connection and reserve bandwidth along that route
- Oversubscribing is not allowed
- Must perform admission control
- Goal: Admit and route connections to maximize total number of accepted connections (throughput)

Exponential Metric and Admission Control

- When a connection arrives, compute the cheapest path under current exponential costs
- If the cost of the path is less than m then accept the connection; else reject

[Awerbuch, Azar, Plotkin ’93]

- Theorem: This simple algorithm admits at least an Ω(1/log N) fraction of the calls admitted by the optimum


Assume Stochastic Arrivals

- Connection arrivals are Poisson, durations are Memory-less
- Assume fat links (Capacity >> log N)
- Theorem: The exponential cost algorithm results in
- Near-optimum congestion for routing problem
- Near-optimum throughput for admission problem

[Kamath, Palmon, Plotkin ’96]

Near-optimum: competitive ratio = (1 + ε) for ε close to 0

Versatility of Exponential Costs

- Guarantees of log N for Competitive ratio against malicious adversary
- Near-optimum for stochastic arrivals
- Near-optimum given fixed traffic matrix

[Young ’95; Garg and Konemann ’98]


Exponential Metrics and Stale Information

- Exponential metrics continue to work well if
- Link states are a little stale
- Shortest paths are reused over small intervals rather than recomputed for each connection
- No centralized agent

[Goel, Meyerson, Plotkin ’01]

- Caveat: Still pretty hard to implement


Power Aware Routing

- Consider a group of small mobile nodes, e.g. sensors, which form an ad hoc network
- Bottleneck resource: battery
- Goal: Maximize the time until the network partitions
- Assign a cost to each mobile node which is $m^{l}$, where l = fraction of battery consumed

- Send packets over the cheapest path under this cost measure
- O(log n) competitive against an adversary
- Near-optimum for stochastic/fixed traffic


Adversarial Queueing Theory: Motivation

[Figure: a network path from source s to receiver r.]

- Malicious, all-knowing adversary
- Injects packets into the network
- Each packet must travel over a specified route
- Suppose adversary injects 3 packets per second from s to r
- Link capacities are one packet per second
- No matter what we do, we will have unbounded queues and unbounded delays
- Need to temper our definition of adversaries

Adversarial Queueing Theory: Bounded Adversaries

- Given a window size W, and a rate r < 1
- For any link L, and during any interval of duration T > W, the adversary can inject at most rT packets which have link L in their route
- Adversary can’t set an impossible task!!
- More gentle than competitive analysis
- Will study packet scheduling strategies
- Which packet to forward if more than one packet is waiting to cross a link?

Some Interesting Scheduling Policies

- FIFO: First In First Out
- LIFO: Last In First Out
- NTG: Nearest To Go
- Forward a packet which is closest to destination
- FTG: Furthest To Go
- Forward a packet which is furthest from its destination
- LIS: Longest In System
- Forward the packet that got injected the earliest
- Global FIFO
- SIS: Shortest In System
- Forward the packet that got injected the last
- Global LIFO

Stability in the Adversarial Model

- Consider a scheduling policy (e.g. FIFO, LIFO, etc.)
- The policy is universally stable if, for all networks and all “bounded adversaries”, the packet delays and queue sizes remain bounded
- FIFO, LIFO, NTG are not universally stable [Borodin et al. ’96]
- LIS, SIS, FTG are universally stable [Andrews et al. ’96]

Adversarial Queueing Model: Routing Using the Exponential Cost Metric

- Adversary injects packets into the network but gives only the src, dst
- The correct routes are hidden
- Need to compute routes
- Again, use the exponential cost metric
- Reset the cost periodically to zero
- Use any stable scheduling policy
- Theorem: The combined routing and scheduling policy is universally stable [Andrews et al. ’01]

The Problem

- What rates should the users use to send their data?
- How to keep the network efficient and fair?
- Goal: match the available bandwidth!

[Figure: sources sending data across the network to sinks]

Model Description

- Model
- Time divided into steps
- Oblivious Adversary
- Source selects x_i
- Severe cost function

[Figure: timeline in which, at each step, the adversary chooses the available bandwidth b_i and the algorithm picks and sends x_i]

Competitive Ratio

- An Algorithm achieves
- Optimal (offline) achieves
- Seek to minimize
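
The quantities on this slide can be illustrated with a small Python sketch. It rests on an assumption about the severe cost function that the transcript does not spell out: the algorithm gains x_i when x_i ≤ b_i and nothing otherwise, while the offline optimum gains b_i at every step. The names `throughput` and `competitive_ratio` are hypothetical.

```python
def throughput(rates, bandwidths):
    """Total data delivered when a step delivers x only if x <= b (assumed severe cost)."""
    return sum(x for x, b in zip(rates, bandwidths) if x <= b)

def competitive_ratio(rates, bandwidths):
    """Offline optimum (sum of b_i) divided by what the algorithm delivered."""
    alg = throughput(rates, bandwidths)
    opt = sum(bandwidths)
    return float("inf") if alg == 0 else opt / alg

# The second step overshoots the available bandwidth and delivers nothing.
rates = [2, 4, 1]
bandwidths = [3, 3, 2]
ratio = competitive_ratio(rates, bandwidths)
```

Under this assumption, overshooting by any amount is as bad as sending nothing, which is what makes the adversary so powerful.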

Adversary Model

- Unrestricted Adversary
- Has too much power
- Fixed Range Adversary
- µ-multiplicative adversary
- {α,β}-additive adversary

Fixed Range Model

- Adversary selects any value b_i in [c, d]
- Deterministic algorithm
- The best deterministic algorithm would never select a rate > c
- If it does, the adversary can select c, causing the algorithm to send 0
- So the best deterministic algorithm selects c
- In that case, the adversary selects d
- Competitive ratio is d/c

Fixed range – Randomized Algorithm

- No randomized algorithm can achieve competitive ratio better than 1+ln(d/c) in the fixed range model with range [c,d]
- Proof:
- Yao’s minimax principle
- Consider a randomized adversary against deterministic algorithms
- Adversary draws b_i with density g(y) = c/y^2 on [c, d)
- With the remaining probability c/d, it chooses d

Proof continued

- If the algorithm picks x_i = x, its expected gain is x · Pr[b_i ≥ x] = x · (c/x) = c
- The expected optimal is at most c + c ln(d/c) = c(1 + ln(d/c))
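
The two expectations behind the 1 + ln(d/c) bound can be checked numerically. The Python sketch below integrates the adversary's distribution (density c/y^2 on [c, d) plus a point mass c/d at d) with a midpoint rule; the function names and the values c = 2, d = 10 are hypothetical choices for illustration.

```python
import math

def midpoint_integral(f, lo, hi, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))

def expected_gain(x, c, d):
    """Expected amount sent by an algorithm that always picks rate x in [c, d]."""
    # The rate x succeeds exactly when the drawn bandwidth is at least x.
    prob_success = midpoint_integral(lambda y: c / y**2, x, d) + c / d
    return x * prob_success

def expected_optimum(c, d):
    """Expected bandwidth, i.e. what an offline optimum collects per step."""
    return midpoint_integral(lambda y: y * c / y**2, c, d) + d * (c / d)

c, d = 2.0, 10.0
ratio = expected_optimum(c, d) / expected_gain(3.0, c, d)
bound = 1 + math.log(d / c)
```

Whatever rate x the deterministic algorithm picks, its expected gain evaluates to c, so the ratio matches 1 + ln(d/c) up to integration error.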

µ-multiplicative model – Randomized Algorithm

- No randomized algorithm can achieve competitive ratio better than ln(µ) + 1
- Proof:
- Adversary can always choose b_i within a fixed range [b, µb], which reduces to the fixed range model with d/c = µ

Randomized Algorithm: 4 log(µ) + 12

- Assumptions (relaxed later)
- µ is a power of 2
- b_1 is in the range [1, 2µ)
- Algorithm (MIMD: multiplicative increase, multiplicative decrease)
- At step 1, pick x_1 at random among the powers of 2 between 1 and 2µ
- On failure, x_{i+1} = x_i/2
- On success, x_{i+1} = 2µ·x_i
- Claim:
- Competitive ratio of 4 log(µ) + 12
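
A minimal Python sketch of the MIMD rule follows. It assumes that a step succeeds, and delivers x_i, exactly when x_i ≤ b_i, and that µ is a power of 2; the names `mimd_run`, `start_rates` and `expected_throughput` are hypothetical.

```python
import math

def start_rates(mu):
    """The log(2*mu) + 1 powers of 2 between 1 and 2*mu (mu assumed a power of 2)."""
    return [2 ** j for j in range(int(math.log2(2 * mu)) + 1)]

def mimd_run(bandwidths, mu, x1):
    """Total delivered by the deterministic MIMD rule started at rate x1."""
    x, total = x1, 0
    for b in bandwidths:
        if x <= b:            # success: the whole rate gets through
            total += x
            x = 2 * mu * x    # multiplicative increase
        else:                 # failure: nothing gets through
            x = x / 2         # multiplicative decrease
    return total

def expected_throughput(bandwidths, mu):
    """Average over the uniformly random choice of the starting rate."""
    rates = start_rates(mu)
    return sum(mimd_run(bandwidths, mu, x1) for x1 in rates) / len(rates)

# Constant bandwidth 3 with mu = 2: the starting rates are 1, 2 and 4.
totals = [mimd_run([3, 3, 3], 2, x1) for x1 in start_rates(2)]
average = expected_throughput([3, 3, 3], 2)
```

In this small run the offline optimum is 9 and the randomized algorithm delivers 7/3 in expectation, comfortably inside the claimed ratio.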

Proof outline

- Think about choosing one deterministic algorithm from log(2µ) + 1 choices
- Think about the algorithms as an ensemble running in parallel
- Will show that the ensemble manages to send at least opt/4. [A bit of work]
- Once this is done, picking one algorithm at random gives opt/(4(log(µ) + 2)) in expectation

Proof (1/3)

- Algorithms pick a consecutive sequence of powers of 2
- Ensemble is successful
- b_i falls in the picked range
- e_i: largest value sent by any algorithm
- b_i < 2e_i
- At the next step, if the bandwidth increases or stays constant, the ensemble will succeed
- b_i < 2e_i, b_{i+1} < µb_i => b_{i+1} < 2µe_i
- Bandwidth lies in the range covered by the ensemble

Proof (2/3)

- Need to worry about decreasing bandwidth
- May decrease very fast
- Ensemble achieved e_i at step i
- Now it was unsuccessful at step i+1
- Could not have been more than e_i available
- At step i+2, they all divide their rates by 2
- Could not have been more than e_i/2 available
- By induction, one can show that the bandwidth missed while the ensemble keeps failing is at most:
- e_i + e_i/2 + e_i/4 + … = 2e_i

Proof (3/3)

- Optimal algorithm could have achieved at most 4e_i
- Up to 2e_i at step i, because it is not constrained to choose a power of 2
- Up to 2e_i over the steps when the ensemble was not successful
- Summing over all time steps, we can transmit at least opt/4
- Relaxing the assumption on µ: round µ up to the next power of 2, which results in log(µ) + 3 algorithms

References

- N. Alon and Y. Azar. On-line Steiner trees in the Euclidean plane. Discrete and Computational Geometry, 10(2), 113-121, 1993.
- M. Andrews, B. Awerbuch, A. Fernandez, J. Kleinberg, T. Leighton, and Z. Liu. Universal stability results for greedy contention-resolution protocols. Proceedings of the 37th IEEE Conference on Foundations of Computer Science, 1996.
- M. Andrews, A. Fernandez, A. Goel, and L. Zhang. Source Routing and Scheduling in Packet Networks. To appear in the proceedings of the 42nd IEEE Foundations of Computer Science, 2001.
- J. Aspnes, Y. Azar, A. Fiat, S. Plotkin, and O. Waarts. On-line load balancing with applications to machine scheduling and virtual circuit routing. Proceedings of the 25th ACM Symposium on Theory of Computing, 1993.
- B. Awerbuch, Y. Azar, and S. Plotkin. Throughput competitive online routing. Proceedings of the 34th IEEE symposium on Foundations of Computer Science, 1993.
- A. Ballardie, P. Francis, and J. Crowcroft. Core Based Trees (CBT) - An architecture for scalable inter-domain multicast routing. Proceedings of the ACM SIGCOMM, 1993.

References [Contd.]

- A. Borodin, J. Kleinberg, P. Raghavan, M. Sudan, and D. Williamson. Adversarial queueing theory. Proceedings of the 28th ACM Symposium on Theory of Computing, 1996.
- S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei. The PIM architecture for wide-area multicast routing. IEEE/ACM Transactions on Networking, 4(2), 153-162, 1996.
- M. Doar and I. Leslie. How bad is Naïve Multicast Routing? IEEE INFOCOM, 82-89, 1992.
- M. Faloutsos, A. Banerjea, and R. Pankaj. QoSMIC: quality of service sensitive multicast Internet protocol. Computer Communication Review, 28(4), 144-53, 1998.
- P. Francis. Yoid: Extending the Internet Multicast Architecture. Unrefereed report, http://www.isi.edu/div7/yoid/docs/index.html .
- N. Garg and J. Konemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. Proceedings of the 39th IEEE Foundations of Computer Science, 1998.

References [Contd.]

- A. Goel, A. Meyerson, and S. Plotkin. Distributed Admission Control, Scheduling, and Routing with Stale Information. Proceedings of the 12th ACM-SIAM Symposium on Discrete Algorithms, 2001.
- A. Goel and K. Munagala. Extending Greedy Multicast Routing to Delay Sensitive Applications. Short abstract in proceedings of the 11th ACM-SIAM Symposium on Discrete Algorithms, 2000. Long version to appear in Algorithmica.
- M. Imase and B. Waxman. Dynamic Steiner tree problem. SIAM J. Discrete Math., 4(3), 369-384, 1991.
- C. Intanagonwiwat, R. Govindan, and D. Estrin. Directed diffusion: A scalable and robust communication paradigm for sensor networks. Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (MobiCOM), 2000.
- A. Kamath, O. Palmon, and S. Plotkin. Routing and admission control in general topology networks with Poisson arrivals. Proceedings of the 7th ACM-SIAM Symposium on Discrete Algorithms, 1996.
- D. Waitzman, C. Partridge, and S. Deering. Distance Vector Multicast Routing Protocol. Internet RFC 1075, 1988.
- N. Young. Randomized rounding without solving the linear program. Proceedings of the 6th ACM-SIAM Symposium on Discrete Algorithms, 1995.

References [Contd.]

- R. Karp, E. Koutsoupias, C. Papadimitriou, and S. Shenker. Optimization problems in congestion control. Proceedings of the 41st Annual IEEE Symposium on Foundations of Computer Science.
- S. Arora and B. Brinkman. A Randomized Online Algorithm for Bandwidth Utilization.

Balaji Prabhakar

Optical Fabric

[Figure: tunable lasers connected through the optical fabric to receivers]

- Switching is achieved by tuning lasers to different wavelengths
- The time to tune the lasers can be much longer than the duration of a cell

- Input-queued switch.
- Scheduler picks a new configuration (matching).
- There is a configuration delay C.
- Then the configuration is held for a pre-defined period of time.

The Bipartite Scheduling Problem

- The makespan of the schedule:
- total holding time +
- the configuration overhead.
- Goal: minimize the makespan.
- Preemptive: cells from a single queue can be scheduled in different configurations.
- Non-preemptive: all cells from a single queue are scheduled in just one configuration.

Non-Preemptive Scheduling

- Minimizes the number of reconfigurations.
- Allows the design of low-complexity schedulers, which can operate at high speeds.
- Handles variable-size packets efficiently: no need to keep packet reassembly buffers.

The weight of each edge is the occupancy of the corresponding input queue.

1. Create a new matching.
2. Go over uncovered edges in order of non-increasing weight. Add the edge to the matching if possible, marking it as covered.
3. If there are uncovered edges, go to Step 1.
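
The Greedy steps can be sketched in Python as follows. The sketch scans edges heaviest first, and it assumes that a configuration is held for as long as its heaviest queue needs, so that the makespan is the sum of those holding times plus C per configuration. The names `greedy_schedule` and `makespan` and the demand matrix are hypothetical.

```python
def greedy_schedule(demand):
    """Split a demand matrix into matchings; demand[i][j] is the occupancy of VOQij."""
    uncovered = sorted(
        ((w, i, j) for i, row in enumerate(demand) for j, w in enumerate(row) if w > 0),
        key=lambda e: -e[0],  # heaviest edges first
    )
    matchings = []
    while uncovered:
        used_inputs, used_outputs = set(), set()
        matching, remaining = [], []
        for w, i, j in uncovered:
            if i not in used_inputs and j not in used_outputs:
                matching.append((i, j, w))  # edge is now covered
                used_inputs.add(i)
                used_outputs.add(j)
            else:
                remaining.append((w, i, j))
        matchings.append(matching)
        uncovered = remaining
    return matchings

def makespan(matchings, config_delay):
    """Total holding time plus the configuration overhead."""
    # Assumption: each configuration is held for its heaviest edge's weight.
    holding = sum(max(w for _, _, w in m) for m in matchings)
    return holding + config_delay * len(matchings)

demand = [[5, 3],
          [2, 4]]
schedule = greedy_schedule(demand)
total = makespan(schedule, config_delay=1)
```

For this 2x2 example Greedy produces two configurations, {(0,0), (1,1)} and {(0,1), (1,0)}, with holding times 5 and 3.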

Analysis of Greedy: Complexity

Theorem 1: Greedy needs at most 2N-1 configurations.

Proof outline:

- Fix a queue VOQij and consider all queues VOQi* (same input) and all queues VOQ*j (same output)
- There can be at most 2N-1 such queues
- At each iteration, at least one of the corresponding edges is covered
- Thus, after 2N-1 iterations VOQij must be served.

Analysis of Greedy: Makespan

Theorem 2 (UB):

Greedy achieves an approximation factor of at most 2 for all values of C.

Theorem 3 (Greedy-LB):

Greedy achieves an approximation factor of at least 2 for C=.

Proof of Theorem 2

Consider the k-th matching and let (i,j) be the heaviest edge of weight w.

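Lemma 1: There are at least k/2 edges of weight ≥ w incident to either input i or output j.

Proof outline:

In each of the iterations 1,...,k-1, Greedy chose an edge of weight ≥ w incident to i or j; otherwise (i,j) would have been added to an earlier matching. Together with (i,j) itself this gives k such edges, so at least k/2 of them share the same port.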

Proof of Theorem 2 Cont.

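Observation 1: OPT’s schedule contains at least k/2 configurations.

Observation 2: The k/2-th largest holding time in OPT’s schedule is at least w.

The theorem follows: charging the k-th configuration of Greedy (holding time w plus delay C) to the k/2-th largest configuration of OPT charges each configuration of OPT at most twice.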

Hardness Results

Theorem 4 (General-LB):

The NPBS problem is NP-hard for all values of C and hard to approximate within a factor better than 7/6.

Proof outline: [GW85, CDP01]

- Reduction from the Restricted Timetable Design problem: assignment of teachers for 3 hours.
- Encoding as a demand matrix, C=.
- There is an optimal non-preemptive schedule that contains 3 matchings.
- Works for all values of C!

- We considered Greedy in the offline case
- What if packets constantly arrive?
- We use the idea of batch scheduling
- Avoids starvation since all queued cells are included in the next batch

- We have shown that the makespan of Greedy is at most twice that of OPT
- A moderate speedup of 2 will allow us to provide strict delay guarantees for any admissible traffic

Open Problems

- Close the gap between the upper and the lower bound (2 vs. 7/6)
- Consider packet-mode scheduling

Literature

- Preemptive scheduling:
- [Inukai79] Inukai. An Efficient SS/TDMA Time Slot Assignment Algorithm. IEEE Trans. on Communication, 27:1449-1455, 1979.
- [GW85] Gopal and Wong. Minimizing the Number of Switchings in a SS/TDMA System. IEEE Trans. Communication, 33:497-501, 1985.
- [BBB87] Bertossi, Bongiovanni and Bonuccelli. Time Slot Assignment in SS/TDMA systems with intersatellite links. IEEE Trans. on Communication, 35:602-608, 1987.
- [BGW91] Bonuccelli, Gopal and Wong. Incremental Time Slot Assignment in SS/TDMA satellite systems. IEEE Trans. on Communication, 39:1147-1156, 1991.
- [GG92] Ganz and Gao. Efficient Algorithms for SS/TDMA scheduling. IEEE Trans. on Communication, 38:1367-1374, 1992.
- [CDP01] Crescenzi, Deng and Papadimitriou. On Approximating a Scheduling Problem. Journal of Combinatorial Optimization, 5:287-297, 2001.
- [TD02] Towles and Dally. Guaranteed Scheduling for Switches with Configuration Overhead. Proc. of INFOCOM'02.
- [LH03] Li and Hamdi. λ-Adjust Algorithm for Optical Switches with Reconfiguration Delay. Proc. of ICC'03.
- ... many others
- Non-preemptive scheduling:
- [PR00] Prais and Ribeiro. Reactive GRASP: An Application to a Matrix Decomposition Problem in TDMA Traffic Assignment. INFORMS Journal on Computing, 12:164-176, 2000.
