Adaptive Routing

Reinforcement Learning Approaches

Contents
  • Routing Protocols
  • Reinforcement Learning
  • Q-Routing
  • PQ-Routing
  • Ant Routing
  • Summary
Routing Classification

Centralized:

  • A main controller updates all nodes' routing tables.
  • Suitable for small networks.

Distributed:

  • Fault tolerant.
  • Route computation is shared among nodes by exchanging information.
  • Widely used.
Routing Classification…

Static:

  • Routing is based only on source and destination.
  • The current network state is ignored.

Adaptive:

  • Adapts the policy to time and traffic.
  • More attractive.
  • Oscillations in path are possible.
Routing Classification Based on Optimization

  • Shortest Path
Shortfalls of Static Routing

  • Dynamic networks are subject to the following changes:
    • Topologies change, as nodes are added and removed
    • Traffic patterns change cyclically
    • Overall network load changes
  • So routing algorithms that assume the network is static don't work in this setting
Tackling Dynamic Networks
  • Periodic Updates?
  • Routing traffic?
  • When to update?

Adaptive Routing’s the Answer?

Reinforcement Learning

An agent playing against an opponent - e.g. Chess or Tic-Tac-Toe

Learning a Value Function

Learning Value Function
  • Temporal Difference
  • V(e) = V(e) + K [ V(g) – V(e) ]

For K = 0.4, V(e) = 0.5 and V(g) = 1 we have:

V(e) = 0.5 + 0.4 (1 – 0.5) = 0.5 + 0.2 = 0.7

  • Exploration vs. Exploitation
  • e and e*
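The temporal-difference update above can be sketched in a few lines. Taking V(g) = 1 (the value of a winning successor state) is an assumption, chosen because it makes the worked arithmetic on the slide come out:

```python
# One-step temporal-difference update: V(e) <- V(e) + K * (V(g) - V(e)).
# V(g) = 1 (a winning successor state) is an assumption consistent with
# the numbers in the worked example.

def td_update(v_e, v_g, k):
    """Move the estimate V(e) a fraction k of the way toward V(g)."""
    return v_e + k * (v_g - v_e)

print(td_update(0.5, 1.0, 0.4))  # 0.7
```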



Reinforcement Learning - Networks


V(s) = V(s) + K [ V(s’) – V(s) ]








Q-Routing
  • Qx(d, y) is the time that node x estimates it will take to deliver a packet to node d through its neighbor y
  • When y receives the packet, it sends back a message (to node x), containing its (i.e. y’s) best estimate of the time remaining to get the packet to d, i.e.
    • t = min(Qy(d, z)) over all z ∈ neighbors(y)
  • x then updates Qx(d, y) by:
    • [Qx(d, y)]NEW = [Qx(d, y)]OLD + K·(s + q + t – [Qx(d, y)]OLD)


    • s = RTT from x to y
    • q = time the packet spent in the queue at x
    • t = new best estimate reported by y
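A minimal sketch of this update, assuming a per-node table keyed by (destination, neighbour); the sample numbers are illustrative:

```python
# Sketch of the Q-Routing update above. The table layout (a dict keyed by
# (destination, neighbour)) and the sample numbers are assumptions.

def q_routing_update(Q, d, y, s, q, t, K):
    """Update this node's estimated delivery time to d via neighbour y.

    s: RTT to y; q: time the packet waited in this node's queue;
    t: y's best remaining estimate, min over its neighbours z of Qy(d, z).
    """
    old = Q[(d, y)]
    Q[(d, y)] = old + K * (s + q + t - old)
    return Q[(d, y)]

Q = {("d", "y"): 20.0}  # current estimate: 20 time units to d via y
print(q_routing_update(Q, "d", "y", s=11, q=0, t=17, K=0.25))  # 22.0
```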


Q-Routing: Example

[Diagram: node x forwards a message to d via neighbour y]

Qy(d, z1) = 25

Qy(d, z2) = 17

Qy(d, ze) = 70

t = min(Qy(d, zi)) = 17; RTT = s = 11

[Qx(d, y)] += (0.25)·[(11 + 17) – 20]

Shortfalls

  • The shortest-path algorithm performs better than Q-Routing under low load.
  • Q-Routing fails to converge back to shortest paths when network load decreases again.
  • It fails to explore new shortcuts.

Shortfalls…

[Diagram: node x forwards a message to d via w rather than y]

Qy(d, z1) = 25

Qy(d, z2) = 17

Qy(d, ze) = 70

Even if the cost of the route via y decreases later, it never gets used until the route via w gets congested.

Predictive Q-Routing

  • ΔQ = s + q + t – [Qx(d, y)]OLD
  • [Qx(d, y)]NEW = [Qx(d, y)]OLD + K·ΔQ
  • Bx(d, y) = min[Bx(d, y), Qx(d, y)]
  • If ΔQ < 0 //path is improving
    • ΔR = ΔQ / (currentTime – lastUpdatedTime)
    • Rx(d, y) = Rx(d, y) + β·ΔR //R decreases (becomes more negative)
  • Else
    • Rx(d, y) = γ·Rx(d, y) //R decays back toward zero
  • End If
  • lastUpdatedTime = currentTime
PQ-Routing Policy…

Finding neighbour y

  • For each neighbour y of x:
    • Δt = currentTime – lastUpdatedTime
    • Qx-pred(d, y) = Qx(d, y) + Δt·Rx(d, y)
  • Choose the y with minimum Qx-pred(d, y)
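Putting the update and selection rules together, a minimal sketch might look like this. The decay factors β = 0.7 and γ = 0.9 and the table seeding are illustrative assumptions; times are passed in explicitly rather than read from a clock:

```python
# Sketch of PQ-Routing per the pseudocode above; beta, gamma and the
# table initialisation are illustrative assumptions.

class PQTable:
    def __init__(self, beta=0.7, gamma=0.9):
        self.Q = {}      # (d, y) -> current estimate Qx(d, y)
        self.B = {}      # (d, y) -> best estimate seen, Bx(d, y)
        self.R = {}      # (d, y) -> recovery rate Rx(d, y) (<= 0)
        self.last = {}   # (d, y) -> lastUpdatedTime
        self.beta, self.gamma = beta, gamma

    def add_neighbour(self, d, y, q0, now=0.0):
        self.Q[(d, y)] = q0
        self.B[(d, y)] = q0
        self.R[(d, y)] = 0.0
        self.last[(d, y)] = now

    def update(self, d, y, s, q, t, K, now):
        key = (d, y)
        dq = s + q + t - self.Q[key]
        self.Q[key] += K * dq
        self.B[key] = min(self.B[key], self.Q[key])
        if dq < 0:   # path improving: learn a (negative) recovery rate
            self.R[key] += self.beta * dq / (now - self.last[key])
        else:        # path degrading: decay R back toward zero
            self.R[key] *= self.gamma
        self.last[key] = now

    def choose(self, d, neighbours, now):
        # Predict each neighbour's current cost and take the minimum.
        def predicted(y):
            dt = now - self.last[(d, y)]
            return self.Q[(d, y)] + dt * self.R[(d, y)]
        return min(neighbours, key=predicted)

tbl = PQTable()
tbl.add_neighbour("d", "y1", 20.0)
tbl.add_neighbour("d", "y2", 30.0)
tbl.update("d", "y1", s=5, q=0, t=10, K=0.5, now=1.0)  # dq = -5: improving
print(tbl.choose("d", ["y1", "y2"], now=2.0))  # y1
```

Because y1's route improved, its recovery rate is negative and its predicted cost keeps shrinking as Δt grows, so it stays selected without re-probing.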
PQ-Routing Results
  • Performs better than Q-Routing under low, high and varying network loads.
  • Adapts faster if “probing inactive paths” for shortcuts introduced.
  • Under high loads, behaves like Q-Routing.
  • Uses more memory than Q-Routing.
Ant Routing: Stigmergy - Inspirations from Nature…
  • Ants sort brood and food items
  • Explore particular areas for food, and preferentially exploit the richest available food source
  • Cooperate in carrying large items
  • Leave pheromones on their way back
  • Always find the shortest paths to their nests or food sources
  • Are blind, cannot foresee the future, and have very limited memory
  • Each router x in the network maintains for each destination node d a list of the form:
    • <d, <y1, p1>, <y2, p2>, …, <ye, pe>>,
    • where y1, y2, …, ye are the neighbors of x, and
    • p1 + p2 + …+ pe = 1
  • This is a parallel (multi-path) routing scheme
  • This also multiplies the number of degrees of freedom the system has by a factor of |E|
  • Every destination host hd periodically generates an “ant” to a random source host hs
    • An “ant” is a 3-tuple of the form:
      • < hd, hs, cost>
        • cost is a counter of the cost of the path the ant has covered so far
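The structures above can be sketched directly; the node names, probabilities, and host list below are illustrative assumptions:

```python
# Sketch of the ant-routing structures above; node names, probabilities
# and the host list are illustrative assumptions.
import random

# Router x's entry for destination d: neighbours y1..ye with
# probabilities p1..pe summing to 1.
table = {"d": {"y1": 0.5, "y2": 0.3, "y3": 0.2}}
assert abs(sum(table["d"].values()) - 1.0) < 1e-9

# An "ant" is a 3-tuple <hd, hs, cost>; cost starts at 0 and accumulates
# along the path the ant covers.
def make_ant(hd, hosts):
    hs = random.choice([h for h in hosts if h != hd])
    return (hd, hs, 0.0)

print(make_ant("d", ["a", "b", "c", "d"]))
```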
Ant Routing Example






[Diagram: ant < 4, 0, cost > traversing the network; routing table for node 1]
Ants: Update

[Diagram: the selected entry is incremented by p, and the sum of probabilities is then normalised to 1]

When a router x receives an ant < hd, hs, cost > from neighbour yi, it:

  • Updates cost by the cost of traversing the link from x to yi (i.e. the cost of the link in reverse)
  • Updates its entry for host hd (<hd, <y1, p1>, <y2, p2>, …, <ye, pe>>):

p = k / cost, for some constant k

pi = pi + p

for j ≠ i, pj is unchanged; the probabilities are then normalised to sum to 1

Ants: Propagation
  • Two sub-species of ant:
    • Regular Ants:

P( ant sent to yi ) = pi

    • Uniform Ants:

P( ant sent to yi ) = 1 / e, where e is the number of neighbours

  • Regular ants use learned tables to route ants
  • Uniform ants explore randomly
Q-Routing vs. Ants
  • Q-Routing only changes its currently selected route when the cost of that route increases, not when the cost of an alternate route decreases
  • Q-Routing involves overhead linear in the volume of traffic in the network; ants are effectively free in moderate traffic
  • Q-Routing cannot route messages by parallel paths; uniform ants can
Ants with Evaporation

  • Evaporation mirrors real life, where pheromone laid by real ants evaporates.
  • Link usage statistics are used to drive evaporation (E(x)).
  • E(x) is the proportion of ants arriving from node x over the total ants received by the current node.
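The slides do not give the exact evaporation formula, so the sketch below is only one plausible reading: each entry decays more the smaller its usage fraction E(x), with an assumed evaporation factor rho.

```python
# Hypothetical evaporation step: E(x) is the fraction of ants received via
# neighbour x; rho and the exact decay form are assumptions, since the
# slides do not specify the formula.

def evaporate(probs, ants_received, rho=0.1):
    """Decay each neighbour's probability (weaker decay for heavily used
    links), then renormalise. ants_received[i] = ants arrived via i."""
    total_ants = sum(ants_received)
    for i, n in enumerate(ants_received):
        e = n / total_ants                  # E(x): usage fraction of link i
        probs[i] *= 1.0 - rho * (1.0 - e)   # rarely used links evaporate more
    s = sum(probs)
    return [p / s for p in probs]

print(evaporate([0.5, 0.5], [9, 1]))   # heavily used link keeps more mass
```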
Summary

  • Routing algorithms that assume a static network don’t work well in real-world networks, which are dynamic
  • Adaptive routing algorithms avoid these problems, at the cost of a linear increase in the size of the routing tables
  • Q-Routing is a straightforward application of Q-Learning to the routing problem
  • Routing with ants is more flexible than Q-Routing
References

  • Boyan, J., & Littman, M. (1994). Packet routing in dynamically changing networks: A reinforcement learning approach. In Advances in Neural Information Processing Systems 6 (NIPS 6), pp. 671–678. San Francisco, CA: Morgan Kaufmann.
  • Di Caro, G., & Dorigo, M. (1998). Two ant colony algorithms for best-effort routing in datagram networks. In Proceedings of the Tenth IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS’98), pp. 541–546. IASTED/ACTA Press.
  • Choi, S., & Yeung, D.-Y. (1996). Predictive Q-routing: A memory-based reinforcement learning approach to adaptive traffic control. In Advances in Neural Information Processing Systems 8 (NIPS 8), pp. 945–951. MIT Press.
  • Dorigo, M., Maniezzo, V., & Colorni, A. (1996). The ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 26(1), 29–41.