
Relying on Autonomous Multipath Routing to Achieve Inter-Domain Load Balancing in the Internet

Presentation Transcript


  1. Relying on Autonomous Multipath Routing to Achieve Inter-Domain Load Balancing in the Internet Robert Löfman, Ph.D. Student

  2. The Goal of the Research • Provide better end-to-end QoS (not guaranteed) by traffic engineering • Means: multipath routing • Focus: throughput • Can improve performance of • applications insensitive to packet reordering, but also • Video-on-Demand (if there is time to wait for buffering when packets are reordered) • A/V conferencing applications (if the packet reordering rate is acceptable) Multipath routing (MPR) = some or all routers can forward to more than one nexthop

  3. Some Other Traffic Engineering Techniques • Queuing policies (needed if strict QoS is to be guaranteed) • DiffServ: classes of packets receive preferential treatment • IntServ: resources are reserved on demand • Permanent packet scheduling (Priority Queuing, Fair Queuing, Weighted Fair Queuing, …) • Drop schemes (Random Early Detection, sketched below) • Source routing • MPLS • Fast layer-2 forwarding based on virtual circuits • Enables better addressability and discrimination of routes • Overprovisioning • Deploy so much hardware that capacity cannot run out
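
As an illustration of the drop-scheme item above, here is a minimal sketch of Random Early Detection's drop decision; the thresholds and maximum probability are hypothetical values, not taken from the presentation.

```python
import random

def red_drop(avg_queue: float,
             min_th: float = 5.0,    # hypothetical lower threshold (packets)
             max_th: float = 15.0,   # hypothetical upper threshold (packets)
             max_p: float = 0.1) -> bool:
    """Random Early Detection: avg_queue is assumed to be an EWMA of
    the queue length; the drop probability grows linearly between
    the two thresholds."""
    if avg_queue < min_th:
        return False                 # short queue: never drop
    if avg_queue >= max_th:
        return True                  # long queue: always drop
    p = max_p * (avg_queue - min_th) / (max_th - min_th)
    return random.random() < p       # drop early with probability p
```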

  4. Background - The Internet’s Infrastructure • Intra-domain (Autonomous System, AS) • The network of a single organization • Interior routing (RIP, OSPF, EIGRP, …) • Spans a local region • Inter-domain • The routes between ASes • The routing domain is global • Border Gateway Protocol (BGP)

  5. Background - The Internet’s Infrastructure [Diagram: intra-domain (AS) clouds interconnected by inter-domain links]

  6. Intra-Domain Multipath Schemes • EIGRP, proportional routing = traffic is dispersed proportionally to route metrics (sketched below) • OSPF Equal-Cost Multipath, shares load across equal-cost optimal routes • OSPF Optimized Multipath, disseminates load levels and can reroute around congestion • Multipath-extended Link State and Distance Vector protocols, which calculate and address the k best paths • “A survey of multipath routing for traffic engineering”, Gyu Myoung Lee, Jin Seek Choi: • MultiPath Algorithm (MPA) • Discount Shortest Path Algorithm (DSPA) • Capacity Removal Algorithm (CRA) • Multipath Distance Vector Algorithm (MDVA) • Multipath Partial Dissemination Algorithm (MPDA) • Quality Multiple Partial Dissemination Algorithm (QMPDA) • Diffusing Algorithm for Shortest Multipath (DASM) • MPATH
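
The proportional-routing idea in the first bullet can be sketched in a few lines; the inverse-metric weighting and the names are illustrative assumptions, not EIGRP's exact composite-metric arithmetic.

```python
import random

def pick_nexthop(nexthops: dict) -> str:
    """Proportional routing: choose a nexthop with probability
    proportional to the inverse of its route metric, so a route
    with half the metric carries twice the traffic."""
    hops = list(nexthops)
    weights = [1.0 / nexthops[h] for h in hops]
    return random.choices(hops, weights=weights)[0]

# A route with metric 10 receives twice the share of one with metric 20.
print(pick_nexthop({"if0": 10.0, "if1": 20.0}))
```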

  7. Inter-Domain Multipath Routing • In use: • None; only single-path BGP is deployed • BGP installs only a single route • Multihomed ASes can do some basic traffic engineering by policy • Some proposed methods for enabling MPR in BGP: • Dynamic egress-router selection for a multihomed AS • Advertisement of multiple AS paths, and inter-domain source routing by means of hashed IDs derived from the AS paths (sketched below) • Advertisement of the minimum SLA-guaranteed BW that a path supports, so that paths can be discriminated • Overlay networks • Probe for “better” routes to destinations via other overlay nodes
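
One plausible reading of the hashed-ID proposal, as a sketch; the hash function, ID length and encoding are assumptions, not the proposal's actual wire format.

```python
import hashlib

def as_path_id(as_path: list) -> str:
    """Derive a fixed-size ID from an advertised AS path, so a source
    can select one of several paths and routers can forward on the
    compact ID instead of the full AS path."""
    encoded = ",".join(str(asn) for asn in as_path).encode()
    return hashlib.sha256(encoded).hexdigest()[:16]  # assumed 8-byte ID

print(as_path_id([64512, 64513, 3356]))
```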

  8. The Problem • Initial research question: • “Can throughput be expected to improve by letting intermediate routers disperse traffic arbitrarily on possibly joint routes, without cross-domain routing?” • If so, • no changes to routing protocols would be needed to obtain better performance • the hierarchical structure of the Internet may be preserved • Biggest obstacle: intra-domain routes are not visible at the inter-domain level • Proportional routing is impossible • Too much traffic might be dispersed onto a low-BW route • An answer was sought by simulation.

  9. Simulation – Set-up • OPNET simulator • Partial Internet routes • Inter-/intra-domains and LANs (BGP and EIGRP) • Fictional (generated) and real WANs • Full end-to-end path diversity: • Intra-domains may have multiple routes to multiple egress points • Inter-domain routers also use MPR (tweaked BGP) • Paths may have joint links

  10. Simulation – Set-up • Studied: MultiPath Routing with 2 and 3 nexthops (MPR-2, MPR-3) • MPR-2: every router forwards packets to 2 nexthops, if possible • Realized by allowing 2 or 3 sub-optimal routes in EIGRP in each AS (see the variance sketch below) • Compared to: SinglePath Routing (SPR) • Every router forwards to one nexthop
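
A sketch of how allowing sub-optimal routes works in EIGRP terms: the feasibility condition and the variance multiplier below follow EIGRP's documented unequal-cost load-sharing rules, while the data layout is illustrative.

```python
def eligible_nexthops(routes, variance=2):
    """A route may carry traffic if (1) its reported distance is below
    the best route's metric (the feasibility condition, guaranteeing
    loop freedom) and (2) its own metric is within `variance` times
    the best metric. MPR-2 / MPR-3 correspond to tuning the variance
    so that 2 or 3 routes qualify."""
    best = min(r["metric"] for r in routes)
    return [r["nexthop"] for r in routes
            if r["reported_distance"] < best
            and r["metric"] <= variance * best]

routes = [
    {"nexthop": "R1", "metric": 100, "reported_distance": 50},
    {"nexthop": "R2", "metric": 180, "reported_distance": 90},  # sub-optimal, admitted
    {"nexthop": "R3", "metric": 400, "reported_distance": 90},  # beyond the variance
]
print(eligible_nexthops(routes))  # ['R1', 'R2'] -> an MPR-2 router
```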

  11. Simulation – Set-up • Reordered packets can wrongly invoke TCP’s fast retransmit (sketched below) • Avoided by an increased Maximum Segment Size (MSS) of 15000 B • Decreased packet reordering at the TCP layer [Figure: number of false fast retransmissions. Note: no fast retransmissions occurred with TCP Reno when that function was turned off, so all invocations of fast retransmit were false.]
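
A toy model of why reordering alone trips fast retransmit: the three-duplicate-ACK trigger is TCP Reno's actual rule, everything else is simplified.

```python
def dup_acks(arrival_order):
    """Count the duplicate ACKs a receiver generates: every
    out-of-order arrival re-ACKs the last in-order packet, and three
    duplicates invoke fast retransmit even though nothing was lost."""
    received, next_seq, dups = set(), 0, 0
    for seq in arrival_order:
        received.add(seq)
        if seq == next_seq:
            while next_seq in received:  # cumulative ACK advances
                next_seq += 1
        else:
            dups += 1                    # duplicate ACK for next_seq
    return dups

# Packet 0 delayed behind three others on a slower path:
print(dup_acks([1, 2, 3, 0]))  # 3 dup ACKs -> a false fast retransmit
```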

  12. Simulation – Set-up • Large MSS (15000 bytes): • Forces fragmentation • Increases the chance that every segment has fragments that traverse the “slowest” path • Increases buffering at the IP layer as segments “wait” for delayed fragments • Smooths the segment delay deviation (the 15 KB MSS had an order of magnitude less deviation; see the sketch below) • Now TCP over MPR can be examined beyond the fast-retransmit problem • Infinite receive buffers assumed • Last-mile link capacity 1 Gb/s
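
The fragmentation effect can be seen in a small experiment: with a 15000 B segment over a 1500 B MTU, reassembly waits for the slowest fragment, so the segment delay is the maximum over its fragments' delays. The two path delays and the random path choice are illustrative assumptions.

```python
import random, statistics

MTU, MSS = 1500, 15000
N_FRAGS = MSS // MTU          # 10 fragments per segment

def segment_delay(path_delays):
    """Each fragment independently takes one of the paths; the segment
    is usable only once its slowest fragment has arrived."""
    return max(random.choice(path_delays) for _ in range(N_FRAGS))

paths = [0.010, 0.030]        # assumed fast/slow one-way delays (s)
samples = [segment_delay(paths) for _ in range(10_000)]
# With 10 fragments, nearly every segment hits the slow path at least
# once, so delays concentrate near 0.030 s and the deviation shrinks.
print(statistics.mean(samples), statistics.stdev(samples))
```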

  13. Simulation – Set-up • Background traffic • Packet size of 40 B, 570 B or 1500 B, chosen randomly • Measurements taken at 0, 25, 50, 75, 90, 95 and 96 percent background load and averaged • Delay metrics set equal • BW metric set to the true BW of the link • The allowed variance for sub-optimal routes was kept low to avoid loops (loops were detected when TTL values dropped to 0)

  14. Simulation – Results, UDP over MPR • Starting point: UDP, because it does not constrain its transmission [Figure: throughput as a function of background load at a transmission rate of only 200 million pkt/h, 1500 B packets] [Figure: throughput at a transmission rate of 300 million pkt/h; MPR performs better] • Indication: • MPR performs best when the transmission rate is high or the residual BW is low • Sending must be at a rate the shortest path (SP) can’t handle (Trans_Rate / Res_BW_SP > 1; see the sketch below) • Dispersing packets while the SP has enough BW only increases the delay
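
The indication can be condensed into a rule of thumb; this is only a restatement of the inequality above, and the quantities would have to be estimated in practice.

```python
def disperse(trans_rate: float, res_bw_sp: float) -> bool:
    """Disperse over multiple paths only when the offered rate exceeds
    the residual BW of the shortest path (Trans_Rate / Res_BW_SP > 1);
    otherwise dispersion merely adds delay on the longer paths."""
    return trans_rate / res_bw_sp > 1.0

print(disperse(300e6, 200e6))  # True: the SP is saturated, MPR pays off
print(disperse(100e6, 200e6))  # False: the SP suffices, stay on SPR
```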

  15. Simulation – Results, UDP over MPR • Throughput improvement is possible • Good news for transferring data that is used only after complete reception • but • This might not be the throughput perceived by real-time applications, due to packet reordering [Figure: standard deviation of RTT; MPR deviates more]

  16. Simulation – Results, UDP over MPR • In-sequence BW = the rate of packets/s that arrive in order (sketched below) • Out-of-sequence BW = the rate of packets/s that arrive at all • Especially important for real-time applications • If every path in MPR has a BW greater than an nth of the BW of the SP: • it is possible to transmit more than n packets in parallel before n packets are transmitted on the SP • so the in-sequence BW of MPR would also be greater than that of SPR
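
A sketch of the two metrics; counting a packet as in-sequence only if all its predecessors have already arrived is one strict reading of the definition, and the timestamped input format is an assumption.

```python
def bandwidths(arrivals):
    """arrivals: list of (time_s, seq) in arrival order.
    Out-of-sequence BW counts every arriving packet; in-sequence BW
    counts a packet only if all smaller sequence numbers arrived
    before it."""
    times = [t for t, _ in arrivals]
    duration = max(times) - min(times)
    seen, in_order = set(), 0
    for _, seq in arrivals:
        seen.add(seq)
        if all(s in seen for s in range(seq)):
            in_order += 1
    return in_order / duration, len(arrivals) / duration

# Packet 1 is delayed behind packet 2: only 0, 1 and 3 arrive in sequence.
print(bandwidths([(0.0, 0), (0.1, 2), (0.2, 1), (0.3, 3)]))
```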

  17. Simulation – Results, TCP over MPR • MPR always performs worse [Figure: transfer time of a file (seconds); note: no false “fast retransmits”]

  18. Simulation – Results, TCP over MPR • SPR always has faster CWIND growth, and its window grows to a much greater end value • The reason for a slowly growing or small window must be that it takes a long time to transmit the packet bursts that the CWIND allows [Figure: several congestion windows (in thousands) as a function of time (seconds); graphs are clustered by background load (0%, 25%, 50%, 75%, 90%, 95%, 96%), with one SPR and one MPR curve per cluster; x values begin at 0 for every cluster; a representative sample]

  19. Simulation – Results, TCP over MPR • The reason for the small and slowly growing MPR CWIND: • MPR has longer fragment delay with heavy delay deviation • Segments are passed to TCP at a slow rate as they “wait” for delayed fragments [Figures: average fragment delay (seconds); standard deviation of fragment delay]

  20. Simulation – Results, TCP over MPR • MPR transmits more slowly, since the slowly growing CWIND holds the transmission rate back [Figure: sent segment sequence numbers (in thousands) as a function of time (seconds), one SPR and one MPR curve per background-load cluster]

  21. Simulation – Results, TCP over MPR • MPR experiences more pauses in transmission, since the CWIND chokes the transmission more often [Figure: cumulative pause time due to a full CWIND]

  22. Simulation – Interpretation of Results, TCP over MPR • In the beginning, the transmission is very dependent on a small delay (the CWIND parcels the transmission into small bursts) in order to get the CWIND opened quickly • MPR: too many fragments are put on high-delay paths, as there is no proportional forwarding • Segments “wait” for delayed fragments (although the path has a large BW, which isn’t needed at this point) • Burst transmission takes longer • slower CWIND growth • Vicious circle: the future allowed bursts are also smaller than SPR’s (see the toy model below) • SPR: gets the CWIND opened quickly and can therefore transmit large bursts, which allow even larger future bursts • The in-sequence BW needs to be raised for MPR to be successful.
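
A toy model of the vicious circle, assuming only that the window can grow after the whole burst has been delivered; all numbers are illustrative.

```python
def time_to_open(target_cwnd, per_segment_delay, base_rtt=0.01):
    """Slow-start-like growth where the window doubles only once the
    whole burst is through: a longer per-segment delay (segments
    waiting for slow fragments) stretches every burst, and each
    smaller window permits only a smaller next burst."""
    cwnd, t = 1, 0.0
    while cwnd < target_cwnd:
        t += base_rtt + cwnd * per_segment_delay  # burst transmission time
        cwnd *= 2
    return t

print(time_to_open(1024, 0.001))  # SPR-like: short segment delay
print(time_to_open(1024, 0.005))  # MPR-like: segments wait for fragments
```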

  23. Future Work • Goal: provide inter-domain proportional forwarding without cross-domain routing • Possible solution algorithm (sketched below): • At the creation of a new flow, use SPR • Probe for the highest possible throughput • Raise the MPR-X value, to let packets take sub-optimal routes as well • Then probe for a new throughput high, and so on… • Mark the packets of every probe with an ID, and require routers to always disperse packets with the same ID to the same nexthops • the transmission-rate increment from every probe is then always put on the same paths (might not scale, due to the need for per-flow record-keeping)
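
A sketch of the proposed probing loop; every name here is hypothetical, and the throughput table stands in for real ID-marked probe traffic.

```python
import hashlib

# Hypothetical probed throughputs per MPR-X level for one flow (bit/s).
MEASURED = {1: 80e6, 2: 120e6, 3: 115e6}

def probe_throughput(flow, mpr_x, probe_id):
    """Stand-in for probing: mark packets with probe_id so routers pin
    same-ID packets to the same nexthops, then measure the rate."""
    return MEASURED[mpr_x]

def open_flow(flow, max_mpr_x=3):
    """Start on the single path (MPR-1), then raise MPR-X one level at
    a time, keeping a level only while the probed throughput improves.
    The ID pinning keeps each rate increment on the paths it was
    probed on; per-flow state at routers is the noted scaling risk."""
    probe_id = hashlib.sha1(f"{flow}:1".encode()).hexdigest()[:8]
    best_rate, best_x = probe_throughput(flow, 1, probe_id), 1
    for mpr_x in range(2, max_mpr_x + 1):
        probe_id = hashlib.sha1(f"{flow}:{mpr_x}".encode()).hexdigest()[:8]
        rate = probe_throughput(flow, mpr_x, probe_id)
        if rate <= best_rate:
            break                        # more dispersion gives no gain
        best_rate, best_x = rate, mpr_x
    return best_x

print(open_flow("10.0.0.1->10.0.1.1:80"))  # settles on MPR-2 here
```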
