Adaptive routing

Presentation Transcript

Adaptive Routing

Reinforcement Learning Approaches

Contents

  • Routing Protocols

  • Reinforcement Learning

  • Q-Routing

  • PQ-Routing

  • Ant Routing

  • Summary

Routing Classification



Centralized

  • A main controller updates all nodes’ routing tables.

  • Suitable for small networks.

Distributed

  • Route computation is shared among nodes by exchanging routing information.

  • Fault tolerant.

  • Widely used.

Routing Classification…



Static

  • Routing is based only on source and destination.

  • The current network state is ignored.

Adaptive

  • Adapts the routing policy to time and traffic.

  • More attractive.

  • Paths can oscillate.

Routing Classification Based on Optimization




[Figure: classification by the quantity being optimized, e.g. Shortest Path]



Shortfalls of Static Routing

  • Dynamic networks are subject to the following changes:

    • Topologies change as nodes are added and removed

    • Traffic patterns change cyclically

    • Overall network load changes

  • So routing algorithms that assume the network is static do not work in this setting

Tackling Dynamic Networks

  • Periodic Updates?

  • Routing traffic?

  • When to update?

Is Adaptive Routing the Answer?

Reinforcement Learning

An agent playing against an opponent: chess and Tic-Tac-Toe

Learning a Value Function

Learning a Value Function

  • Temporal Difference

  • V(e) = V(e) + K [ V(g) – V(e) ]

For V(e) = 0.5, V(g) = 1.0 and K = 0.4 we have

V(e) = 0.5 + 0.4 (1.0 – 0.5) = 0.5 + 0.2 = 0.7

  • Exploration vs. Exploitation

  • e and e*
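The temporal-difference update above can be sketched in a few lines of Python. This is only an illustration: the dict-based value table and the function name `td_update` are our choices, with the slide’s values V(e) = 0.5, V(g) = 1.0 and K = 0.4.

```python
# Temporal-difference update: move V(state) a fraction K of the way
# toward V(next_state). A minimal sketch; names are illustrative.

def td_update(V, state, next_state, K):
    """V(state) <- V(state) + K * (V(next_state) - V(state))."""
    V[state] = V[state] + K * (V[next_state] - V[state])
    return V[state]

V = {"e": 0.5, "g": 1.0}          # the slide's example values
print(round(td_update(V, "e", "g", K=0.4), 3))  # 0.7
```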



Reinforcement Learning in Networks


V(s) = V(s) + K [ V(s’) – V(s) ]








Q-Routing

  • Qx(d, y) is the time that node x estimates it will take to deliver a packet to node d through its neighbor y

  • When y receives the packet, it sends back a message (to node x), containing its (i.e. y’s) best estimate of the time remaining to get the packet to d, i.e.

    • t = min(Qy(d, z)) over all z in neighbors(y)

  • x then updates Qx(d, y) by:

    • [Qx(d, y)]NEW = [Qx(d, y)]OLD + K.(s+q+t - [Qx(d,y)]OLD)


    • s = RTT from x to y

    • q = time spent in the queue at x

    • t = the new estimate reported by y
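The Q-Routing update can be sketched in Python as follows. The nested-dict table `Q[d][y]` and both function names are our assumptions; the numeric check reuses the numbers from the worked example below (old estimate 20, s = 11, t = 17, K = 0.25, with q = 0).

```python
# Q-Routing at node x: Q[d][y] = estimated time to deliver to d via
# neighbor y. A sketch; the table layout and names are ours.

def q_routing_update(Q, d, y, s, q, t, K):
    """s: RTT x->y, q: queueing delay at x, t: y's best remaining estimate."""
    Q[d][y] += K * ((s + q + t) - Q[d][y])
    return Q[d][y]

def best_estimate(Q, d):
    """The t this node would report upstream: min over its neighbors."""
    return min(Q[d].values())

Qx = {"d": {"y": 20.0, "w": 30.0}}
print(q_routing_update(Qx, "d", "y", s=11, q=0, t=17, K=0.25))  # 22.0
```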


Q-Routing Example

[Figure: node x forwards a message to d via neighbor y. y’s estimates are Qy(d, z1) = 25, Qy(d, z2) = 17, Qy(d, ze) = 70, so y reports back t = min(Qy(d, zi)) = 17. The estimated RTT from x to y is s = 11, and x’s old estimate is [Qx(d, y)] = 20.]

With K = 0.25, x updates:

[Qx(d, y)] += (0.25).[(11 + 17) - 20] = 2

Shortfalls

  • Under low load, a shortest-path algorithm performs better than Q-Routing.

  • Q-Routing fails to converge back to the shortest paths when the network load decreases again.

Failure to explore new shortcuts

Shortfalls…

[Figure: node x sends a message to d via neighbor w; the alternative route via y has Qy(d, z1) = 25, Qy(d, z2) = 17, Qy(d, ze) = 70.]

Even if the route via y improves later, it never gets used until the route via w gets congested.

Predictive Q-Routing

  • ΔQ = s + q + t - [Qx(d,y)]OLD

  • [Qx(d, y)]NEW = [Qx(d, y)]OLD + K.ΔQ

  • Bx(d,y) = MIN[ Bx(d,y), Qx(d,y) ]

  • If ΔQ < 0 // path is improving

    • ΔR = ΔQ / (currentTime – lastUpdatedTime)

    • Rx(d,y) = Rx(d,y) + B.ΔR // decrease in R

  • Else

    • Rx(d,y) = G.Rx(d,y) // increase of R

  • End If

  • lastUpdatedTime = currentTime

PQ-Routing Policy…

Finding neighbour y

  • For each neighbour y of x

  • ΔT = currentTime – lastUpdatedTime

  • Qx-pred(d,y) = Qx(d,y) + ΔT.Rx(d,y)

  • Choose y with MIN[Qx-pred(d,y)]
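The PQ-Routing update and selection rules can be sketched together in Python. This is only a sketch: the per-entry dict fields (Q, B, R, last) and the constants BETA and GAMMA, standing in for the slides’ B and G, are our assumptions, and the entry values below are illustrative.

```python
# Predictive Q-Routing sketch. For each (destination, neighbor) pair we
# keep Q (current estimate), B (best Q seen), R (recovery rate, <= 0),
# and the time of the last update.

BETA, GAMMA = 0.7, 0.9  # illustrative values for the slides' B and G

def pq_update(e, sample, K, now):
    """e: entry dict; sample = s + q + t from the Q-Routing feedback."""
    dq = sample - e["Q"]
    e["Q"] += K * dq
    e["B"] = min(e["B"], e["Q"])
    if dq < 0:                              # path is improving
        dr = dq / (now - e["last"])
        e["R"] += BETA * dr                 # R decreases (dr < 0)
    else:
        e["R"] *= GAMMA                     # R decays back toward 0
    e["last"] = now

def pq_choose(entries, now):
    """Pick the neighbor minimizing the predicted Q: Q + dT * R."""
    def predicted(y):
        e = entries[y]
        return e["Q"] + (now - e["last"]) * e["R"]
    return min(entries, key=predicted)

entries = {
    "y": {"Q": 20.0, "B": 18.0, "R": -0.5, "last": 0.0},
    "w": {"Q": 19.0, "B": 19.0, "R": 0.0, "last": 0.0},
}
print(pq_choose(entries, now=4.0))  # y: predicted 20 + 4*(-0.5) = 18 < 19
```

Note how the prediction term `dT * R` is what lets an improving but currently unused path win back traffic, which plain Q-Routing cannot do.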

PQ-Routing Results

  • Performs better than Q-Routing under low, high and varying network loads.

  • Adapts faster if “probing inactive paths” for shortcuts is introduced.

  • Under high loads, behaves like Q-Routing.

  • Uses more memory than Q-Routing.

Ant Routing: Stigmergy - Inspirations from Nature…

Real ants:

  • Sort brood and food items

  • Explore particular areas for food, and preferentially exploit the richest available food source

  • Cooperate in carrying large items

  • Leave pheromones on their way back

  • Always find the shortest paths to their nest or food source

  • Are blind, cannot foresee the future, and have very limited memory

Ant Routing

  • Each router x in the network maintains for each destination node d a list of the form:

    • <d, <y1, p1>, <y2, p2>, …, <ye, pe>>,

    • where y1, y2, …, ye are the neighbors of x, and

    • p1 + p2 + …+ pe = 1

  • This is a parallel (multi-path) routing scheme

  • This also multiplies the number of degrees of freedom the system has by a factor of |E|
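The per-destination list above maps naturally onto a nested dictionary. A minimal sketch (the table contents are illustrative, and the name `check_table` is ours):

```python
# The slide's <d, <y1, p1>, ..., <ye, pe>> list rendered as a Python dict:
# destination -> (neighbor -> forwarding probability).

table = {
    "d1": {"y1": 0.5, "y2": 0.3, "y3": 0.2},
    "d2": {"y1": 0.1, "y2": 0.1, "y3": 0.8},
}

def check_table(table):
    """Invariant from the slide: each destination's probabilities sum to 1."""
    return all(abs(sum(ps.values()) - 1.0) < 1e-9 for ps in table.values())

print(check_table(table))  # True
```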

Ant Routing…

  • Every destination host hd periodically generates an “ant” to a random source host hs

    • An “ant” is a 3-tuple of the form:

      • < hd, hs, cost>

        • cost is a counter of the cost of the path the ant has covered so far

Ant Routing Example






[Figure: an ant < 4, 0, cost > travels from destination host 4 toward source host 0; the figure shows the routing table for node 1.]

Ants: Update

When a router x receives an ant < hd, hs, cost > from neighbor yi, it:

  • Updates cost by the cost of traversing the link from x to yi (i.e. the cost of the link in reverse)

  • Updates its entry for host hd (<hd, <y1, p1>, <y2, p2>, …, <ye, pe>>):

    p = k / cost, for some constant k

    pi = (pi + p) / (1 + p)

    pj = pj / (1 + p), for all j ≠ i

(dividing by 1 + p renormalizes the probabilities so they again sum to 1)

Ants: Propagation

  • Two sub-species of ant:

    • Regular Ants:

      P( ant sent to yi ) = pi

    • Uniform Ants:

      P( ant sent to yi ) = 1 / e, where e is the number of neighbors

  • Regular ants use learned tables to route ants

  • Uniform ants explore randomly
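The two forwarding rules can be sketched side by side (function names are ours; `random.choices` performs the weighted draw for regular ants):

```python
import random

# Next-hop choice for an ant. Regular ants sample the learned
# probabilities; uniform ants pick any neighbor with probability 1/e.

def next_hop_regular(probs, rng=random):
    neighbors = list(probs)
    weights = [probs[y] for y in neighbors]
    return rng.choices(neighbors, weights=weights)[0]

def next_hop_uniform(probs, rng=random):
    return rng.choice(list(probs))

probs = {"y1": 0.7, "y2": 0.2, "y3": 0.1}
print(next_hop_regular(probs))  # y1 with probability 0.7, etc.
print(next_hop_uniform(probs))  # each neighbor with probability 1/3
```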

Q-Routing vs. Ants

  • Q-Routing only changes its currently selected route when the cost of that route increases, not when the cost of an alternate route decreases

  • Q-Routing involves overhead linear in the volume of traffic in the network; ants are effectively free in moderate traffic

  • Q-Routing cannot route messages by parallel paths; uniform ants can

Ants with Evaporation

  • Evaporation mirrors nature, where the pheromone laid by real ants evaporates over time.

  • Link usage statistics are used to evaporate probabilities, via a factor E(x).

  • E(x) is the proportion of ants received from node x out of the total number of ants received by the current node.
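Computing E(x) is straightforward; a sketch follows (the counter dict and function name are ours, and how E(x) is then applied to decay the probabilities is not spelled out on the slide, so only the factor itself is computed):

```python
# E(x): fraction of ants received from neighbor x out of all ants
# received by this node, per the slide's definition.

def evaporation_factor(ants_from, x):
    total = sum(ants_from.values())
    return ants_from[x] / total

ants_from = {"x": 30, "w": 70}   # illustrative per-neighbor ant counts
print(evaporation_factor(ants_from, "x"))  # 0.3
```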

Summary

  • Routing algorithms that assume a static network don’t work well in real-world networks, which are dynamic

  • Adaptive routing algorithms avoid these problems, at the cost of a linear increase in the size of the routing tables

  • Q-Routing is a straightforward application of Q-Learning to the routing problem

  • Routing with ants is more flexible than Q-Routing

References

  • Boyan, J., & Littman, M. (1994). Packet routing in dynamically changing networks: A reinforcement learning approach. In Advances in Neural Information Processing Systems 6 (NIPS 6), pp. 671-678. San Francisco, CA: Morgan Kaufmann.

  • Di Caro, G., & Dorigo, M. (1998). Two ant colony algorithms for best-effort routing in datagram networks. In Proceedings of the Tenth IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS'98), pp. 541-546. IASTED/ACTA Press.

  • Choi, S., & Yeung, D.-Y. (1996). Predictive Q-routing: A memory-based reinforcement learning approach to adaptive traffic control. In Advances in Neural Information Processing Systems 8 (NIPS 8), pp. 945-951. MIT Press.

  • Dorigo, M., Maniezzo, V., & Colorni, A. (1996). The ant system: Optimization by a colony of cooperating agents. IEEE Transactions on Systems, Man, and Cybernetics-Part B, 26(1), 29-41.