
Distributed Partial Information Management (DPIM) for Survivable Networks

Dahai Xu


Presentation Transcript


  1. Distributed Partial Information Management (DPIM) for Survivable Networks • Dahai Xu

  2. Content • Basic Concepts of Protection & Restoration • Previous Work on Shared Path Protection • Proposed DPIM Schemes • what partial info to maintain and how? • how a connection is routed under distributed control and with partial info? • how distributed signaling is done and bandwidth (BW) allocated/deallocated? • A heuristic based on Potential Backup Cost

  3. Protection • Path Protection • Link Protection • Advantages & Disadvantages

  4. Path Protection • Use more than one path to ensure that the data is delivered successfully • Dedicated Path Protection • Shared Path Protection

  5. Dedicated Path Protection • 1+1 Protection • Point-to-Point Protection & Mesh Network Protection

  6. 1+1 Protection

  7. Mesh Network Protection

  8. Shared Path Protection • 1:N Protection • 1:1 Protection

  9. Link Protection • Use an alternate path if the link fails • Dedicated Link Protection: not practical • Shared Link Protection: practical • May fail when a node fails

  10. Advantages & Disadvantages of Protection • Simple • Quick: requires little extra processing time • Usually can recover only from a single link fault • Inefficient use of resources

  11. Restoration • Path Restoration • Route can be computed after the failure • Link Restoration • Path is discovered at the end nodes of the failed link • More practical than path restoration • Advantages & Disadvantages of Restoration • Usually can recover from multiple element faults • More efficient use of resources • Complex • Slow: requires extra processing time to set up the path and reserve resources

  12. Comparison between Protection & Restoration • Characteristic: Protection -- resources are reserved before the failure and may go unused; Restoration -- resources are reserved and used only after the failure • Route: Protection -- predetermined; Restoration -- can be dynamically computed • Resource Efficiency: Protection -- Low; Restoration -- High

  13. Comparison between Protection & Restoration (Cont’d) • Time used: Protection -- Short; Restoration -- Long • Reliability: Protection -- mainly for a single fault; Restoration -- can survive multiple faults • Implementation: Protection -- Simple; Restoration -- Complex

  14. Offline Routing • Arrange a set of traffic flows • Integer Linear Programming (ILP) to get optimal results • Heuristic Algorithms • Relaxation of ILP • Simulated Annealing: a stochastic hill-climbing heuristic search method that explores a larger area of the search space without being trapped in a local optimum • Genetic Algorithm: evolves the current population of “good solutions” toward optimality by using carefully designed crossover and mutation operators • Tabu search

  15. Online Routing of Bandwidth-Guaranteed Paths • Online routing of a bandwidth-guaranteed path with a simultaneous protection path • Metrics • Unlimited Link Capacity • Bandwidth Consumption • Limited Link Capacity • Connection drop/block probability • Profit / Revenue

  16. Assumption • Two connections whose active paths are completely link disjoint can share backup bandwidth (BBW). • The objective of the algorithm is to exploit this BBW sharing, e.g., to reduce the total amount of bandwidth (TBW) consumed by the connections.
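
The sharing assumption on this slide can be illustrated with a minimal sketch. It assumes single-link failures, and the function name `required_bbw` and the tuple representation of a connection are hypothetical choices made for this illustration, not taken from the slides. The backup bandwidth a link must hold is the largest amount that any one failed link could divert onto it, not the sum over all connections.

```python
from collections import defaultdict

def required_bbw(connections, backup_link):
    """Backup bandwidth needed on backup_link under single-link failures.

    connections: list of (active_links, backup_links, bandwidth) tuples
    (a hypothetical representation chosen for this illustration).
    """
    diverted = defaultdict(int)  # failed active link -> bandwidth rerouted onto backup_link
    for active_links, backup_links, bw in connections:
        if backup_link in backup_links:
            for a in active_links:
                diverted[a] += bw
    # Only one link fails at a time, so the worst single failure decides.
    return max(diverted.values(), default=0)

# Two connections with link-disjoint active paths, both backed up over "e5".
conns = [
    ({"e1", "e2"}, {"e5"}, 3),
    ({"e3", "e4"}, {"e5"}, 2),
]
shared = required_bbw(conns, "e5")
unshared = sum(bw for _, backup_links, bw in conns if "e5" in backup_links)
print(shared, unshared)  # 3 with sharing, 5 without
```

If the two active paths had a link in common, a failure of that link would divert both connections at once and the shared reservation would grow to the sum.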

  17. Information for Routing • The amount of BBW sharing depends on the information available to the routing algorithm. • Three important cases to be considered. • No Information on how existing connections are routed • Complete Per-flow/Aggregate Information • Partial Aggregate Information

  18. No Sharing (NS) • Only the residual (available) bandwidth on each link is known • Residual bandwidth = Link capacity - Reserved active bandwidth (ABW) - Reserved backup bandwidth (BBW) • Can be obtained from OSPF Extensions or IS-IS Extensions • Only the total used bandwidth is known (active + backup) • Cannot share BBW, and thus wastes resources.

  19. Sharing with Complete Information (SCI) • Knows the routes of the active and backup paths of all current connections. • May have too much information to maintain: O(LQ), where L is the average path length and Q is the number of existing connections. • Permits the best sharing and provides a performance upper bound
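
As a small worked instance of the residual bandwidth formula on this slide, the sketch below uses hypothetical function names and made-up numbers. Under NS, a link can carry a new request of bandwidth w only if its residual bandwidth is at least w, regardless of how much of the reserved backup bandwidth could in principle be shared.

```python
def residual_bandwidth(capacity, abw, bbw):
    """Residual bandwidth = link capacity - reserved ABW - reserved BBW."""
    return capacity - abw - bbw

def usable_under_ns(capacity, abw, bbw, w):
    """Under NS, a link is usable for a request of size w only if R_e >= w."""
    return residual_bandwidth(capacity, abw, bbw) >= w

# Illustrative numbers: capacity 10, ABW 4, BBW 3, so 3 units remain.
print(residual_bandwidth(10, 4, 3))
print(usable_under_ns(10, 4, 3, w=4))
```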

  20. Partial Information for Routing • Knows some aggregated information about each link • Two schemes • SPI (Sharing with Partial Information): centralized control, knows BBW and ABW on every link • DPIM (Distributed Partial Information Management): distributed control, each ingress edge (source) node decides the routes.

  21. Notations (I)

  22. Notations (II)

  23. No Sharing (NS) • Remove links with Re < w • Determine two link-disjoint paths for active/backup • Formulation: • standard network flow problem • each link has unit cost and unit capacity • s supplies two units, d demands two units • a minimum cost flow algorithm can be used
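
The formulation on this slide can be sketched as a two-unit minimum cost flow with unit costs and unit capacities. The code below is one possible realization, assuming directed links and using successive shortest paths with Bellman-Ford on the residual graph; the slides do not say which minimum cost flow algorithm was used, and the function name `two_disjoint_paths` and the dictionary input format are hypothetical.

```python
def two_disjoint_paths(links, s, d, w):
    """links: {(u, v): residual_bw} for directed links.
    Returns two link-disjoint s-d paths (as node lists) of minimum total
    hop count, or None if no such pair exists."""
    usable = {e for e, r in links.items() if r >= w}  # drop links with R_e < w
    flow = set()  # links currently carrying one unit of flow
    for _ in range(2):  # s supplies two units
        # Residual arcs: unused links forward at cost +1, used links backward at cost -1.
        arcs = [(u, v, 1) for (u, v) in usable if (u, v) not in flow]
        arcs += [(v, u, -1) for (u, v) in flow]
        nodes = {n for a in arcs for n in a[:2]} | {s}
        dist, prev = {s: 0}, {}
        for _ in range(len(nodes)):  # Bellman-Ford copes with the negative arcs
            for u, v, c in arcs:
                if u in dist and dist[u] + c < dist.get(v, float("inf")):
                    dist[v], prev[v] = dist[u] + c, u
        if d not in dist:
            return None
        v = d
        while v != s:  # augment one unit along the shortest residual path
            u = prev[v]
            if (v, u) in flow:
                flow.remove((v, u))  # backward arc: cancel earlier flow
            else:
                flow.add((u, v))
            v = u
    paths = []
    for _ in range(2):  # decompose the flow into two paths
        path, node = [s], s
        while node != d:
            nxt = next(v for (u, v) in flow if u == node)
            flow.remove((node, nxt))
            path.append(nxt)
            node = nxt
        paths.append(path)
    return paths

links = {("s", "a"): 10, ("a", "d"): 10, ("s", "b"): 10, ("b", "d"): 10, ("s", "d"): 1}
print(sorted(two_disjoint_paths(links, "s", "d", w=2)))
```

In this example the direct link s->d is removed because its residual bandwidth is below w, so the two paths go through a and b. Which of the two serves as the active path is left to the caller.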

  24. Linear Programming for SCI (I) • For new request (s, d, w), the least cost of using a on AP and b on BP • The cost of using e on BP (1)

  25. Linear Programming for SCI (II) • Objective • Constraints

  26. SPI • In SCI, can be calculated from per-flow information. This requires maintaining per-flow information, which is not scalable. • In SPI, is not known, only is known • Same objective and constraints as in SCI • Further improvement to be discussed in DPIM

  27. Survivable Routing (SR) • Distributed control with complete but aggregated information. • Every edge node essentially maintains a matrix of for all links a and b • Uses the active path first (APF) heuristic instead of the ILP formulation • Temporarily remove links with Re < w • Find a shortest path as the AP • Put back the temporarily removed links, remove the AP links, and calculate the backup cost using Eq. (1) • Find a shortest (cheapest) path as the BP
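
The four APF steps on this slide can be sketched as follows. Because Eq. (1) is not preserved in this transcript, the backup cost is passed in as a caller-supplied function; `backup_cost`, `extra_bbw`, `shareable`, and the data layout are all hypothetical stand-ins, not the authors' definitions.

```python
import heapq

def shortest_path(links, cost, s, d):
    """Dijkstra over directed links; cost(e) is a non-negative weight,
    or None to exclude link e. Returns the path as a list of links."""
    adj = {}
    for (u, v) in links:
        adj.setdefault(u, []).append(v)
    dist, prev, heap = {s: 0}, {}, [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)
        if u == d:
            break
        if du > dist[u]:
            continue  # stale heap entry
        for v in adj.get(u, []):
            c = cost((u, v))
            if c is not None and du + c < dist.get(v, float("inf")):
                dist[v], prev[v] = du + c, u
                heapq.heappush(heap, (du + c, v))
    if d not in dist:
        return None
    nodes = [d]
    while nodes[-1] != s:
        nodes.append(prev[nodes[-1]])
    nodes.reverse()
    return list(zip(nodes, nodes[1:]))

def active_path_first(residual, s, d, w, backup_cost):
    # Steps 1-2: ignore links with R_e < w and take a min-hop path as the AP.
    ap = shortest_path(residual, lambda e: 1 if residual[e] >= w else None, s, d)
    if ap is None:
        return None
    ap_links = set(ap)
    # Steps 3-4: all links except the AP links are candidates again,
    # priced by the supplied backup cost; the cheapest path is the BP.
    bp = shortest_path(
        residual, lambda e: None if e in ap_links else backup_cost(ap, e, w), s, d)
    return None if bp is None else (ap, bp)

residual = {("s", "a"): 10, ("a", "d"): 10, ("s", "b"): 10, ("b", "d"): 10, ("s", "d"): 1}
shareable = {("s", "d"): 2}  # hypothetical: BBW on s->d that this request could share

def extra_bbw(ap, e, w):
    """Placeholder cost: additional BBW needed on e, or None if it does not fit."""
    extra = max(0, w - shareable.get(e, 0))
    return extra if extra <= residual[e] else None

ap, bp = active_path_first(residual, "s", "d", 2, extra_bbw)
print(ap, bp)
```

In this example the link s->d is too full to carry the active path, but once it is put back it is the cheapest backup link, because the request can reuse backup bandwidth that is already reserved there.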

  28. Successive SR (SSR) • After is updated as a result of setting up a new connection, some existing BPs may change (route and the amount of additional BBW reserved) • Such changes may in turn trigger changes to other existing BPs until an equilibrium state is reached • Achieve a better BBW sharing, but with a high signaling and control overhead

  29. RAFT • RAFT: Resource Aggregation for Fault Tolerance • Each node maintains a fault management table (FMT), which lists the AP and BP flows on each link e. The FMT must be updated each time a request is initiated or terminated • The AP and BP routes are node-disjoint and are found with a shortest path algorithm • A request is accepted only if the required bandwidth is available on all the links of its AP and BP; otherwise it is rejected.

  30. Doshi’s • Each node maintains a link capacity control table (LCCT) for each local link • Source nodes use a Content-lock mechanism to avoid deadlock among multiple demands. • BP route search: distributed breadth-first search (BFS) over a residual network • The BFS first queries the residual spare capacity in the LCCT and uses a link only if it has sufficient capacity • If a route is found, the source node stores it as the restoration route for the demand. • If no BP route is found, the capacity optimization procedure is activated, which changes previous BP routes
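
The capacity-filtered BFS described on this slide can be sketched in centralized form. This is a simplification: the scheme on the slide runs the search in a distributed fashion, with each node consulting its own LCCT, whereas the sketch below reads all spare capacities from one dictionary. The name `bfs_restoration_route` and the `spare` table layout are hypothetical.

```python
from collections import deque

def bfs_restoration_route(spare, s, d, demand, excluded=frozenset()):
    """spare: {(u, v): residual spare capacity} for directed links.
    Returns a min-hop route (node list) that uses only links with enough
    spare capacity and avoids the links in excluded, or None."""
    adj = {}
    for (u, v), cap in spare.items():
        if cap >= demand and (u, v) not in excluded:  # capacity check before use
            adj.setdefault(u, []).append(v)
    prev, queue = {s: None}, deque([s])
    while queue:
        u = queue.popleft()
        if u == d:
            route = []
            while u is not None:  # walk predecessors back to the source
                route.append(u)
                u = prev[u]
            return route[::-1]
        for v in adj.get(u, []):
            if v not in prev:
                prev[v] = u
                queue.append(v)
    return None

spare = {("s", "a"): 1, ("a", "d"): 5, ("s", "b"): 4, ("b", "c"): 4, ("c", "d"): 4}
print(bfs_restoration_route(spare, "s", "d", demand=3))
```

With a demand of 3, the two-hop route through a is rejected because s->a has only 1 unit spare, so the search returns the longer route through b and c. The `excluded` argument is where the links of the active path would be passed.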

  31. Su’s • Each node maintains “bucket”-based link state (equivalent to ) • The amount of link state is proportional to the number of failures/links, not the number of lightpaths • AP and BP are optimized separately. APs are assumed to use minimum-hop paths; BPs are optimized to reduce the wavelength redundancy • The “width” of link l with respect to a failure event k* is defined as the normalized difference between the maximum bucket height and the height of the bucket corresponding to link failure k*; it indicates the sharing capacity of the link.

  32. Su’s (Cont’d) • The Bellman-Ford algorithm is used to identify the widest path between the end nodes of the protected link, i.e., the path that offers the most sharing. • If there is more than one such candidate path, the one that traverses the least number of links with width 0 is selected
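
A widest-path search of the kind named on this slide can be sketched with a Bellman-Ford style relaxation that maximizes the bottleneck width instead of minimizing a sum. The sketch takes the link widths as given input, since the normalization used for the bucket heights is not preserved in this transcript, and it omits the tie-breaking rule on zero-width links. The function name `widest_path` is hypothetical.

```python
def widest_path(width, s, d):
    """width: {(u, v): width} for directed links.
    Returns (node list, bottleneck width) of a path from s to d whose
    narrowest link is as wide as possible, or None if d is unreachable."""
    nodes = {n for link in width for n in link} | {s, d}
    best = {n: float("-inf") for n in nodes}  # best bottleneck found so far
    best[s] = float("inf")
    prev = {}
    for _ in range(len(nodes) - 1):  # Bellman-Ford style passes
        changed = False
        for (u, v), wd in width.items():
            cand = min(best[u], wd)  # bottleneck of the path extended by (u, v)
            if cand > best[v]:
                best[v], prev[v], changed = cand, u, True
        if not changed:
            break
    if best[d] == float("-inf"):
        return None
    path = [d]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1], best[d]

width = {("s", "a"): 3, ("a", "d"): 1, ("s", "b"): 2, ("b", "d"): 2}
print(widest_path(width, "s", "d"))
```

Here the route through a starts on a wider link but is limited by the width-1 link a->d, so the route through b, with bottleneck 2, is preferred.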

  33. DPIM-SAM • Distributed Partial Information Management • Edge node maintains (and exchanges) non-local information: for each link e. (O(E) information) • Each node also maintains profiles of ABW and BBW for each local link e. (O(E) information)

  34. Path Determination • This estimated BBW may not be minimal • Using ILP, or APF to find AP and BP • DPIM-M-A: APF with Minimal BBW Allocation

  35. Distributed Signaling • Minimal BBW Allocation • Maintaining Partial Information on AP and BP • Send AP Set-up packet containing BP to the nodes along AP, each node having an outgoing link e in AP updates • Similar way to update

  36. Minimal BBW allocation

  37. Connection Release • Can’t be done efficiently in SPI • AP Tear-Down and BBW Deallocation. Update PBe and release bw.

  38. Network Topology

  39. Performance Evaluation • Traffic Types • Incremental traffic (Established connection lasts forever) • Dynamic traffic (with connection durations) • Performance Metrics • Unlimited Link Capacity • Bandwidth Saving (Ratio): upper bound 50% • Limited Link Capacity • Connection drop/block probability • Total Earning (Ratio) : Earning Rate matrix (independent of traffic load)

  40. Simulation Results • Average Bandwidth Saving Ratio • Total Earning Ratio

  41. Active Path First with Potential Backup Cost (APF-PBC) • Challenges • Integer Linear Programming (ILP) based approaches are notoriously time consuming • Guarantee minimal allocation of TBW for each request, but do not guarantee an optimal result for all requests. • Active path first (APF) can only achieve sub-optimal results: • Does not consider the potential cost along the BP when selecting the AP

  42. Main idea of APF-PBC • Also uses Active Path First • In selecting Active Path, Each capable link a will be assigned a cost • We use as the potential backup cost (and try to minimize TBW). • Intuition: PBC increases with w and • Can apply to SCI and DPIM-SAM (which determine backup cost and BP differently)

  43. Potential Backup Cost - Derivation • is derived based on the statistical analysis of experimental data (SCI-ILP for the 15-node network, infinite link capacity) • Challenge: we do not know which link b will be used to back up link a, let alone Bb and • Solution: guess the (weighted average) value of Bb (call it x) and (call it s)

  44. Derivation based on statistical analysis of Bb • Distribution of Bb/M • (w,s,M) is the expected value of a(w) when s is fixed. • Guess the distribution of and calculated the weighted average value of (w,s,M) over all s to obtain a(w)

  45. Distribution of Bb/M

  46. Graph of (w,s,M) & approximation • Integral (curves) from adaptive Lobatto quadrature • Approximation (line-fitting Y=c1X+c2)

  47. Cumulative distribution function of

  48. Graph of

  49. Approximation of a(w) • Distribution of • Effect of constants c and on performance of APF-PBC

  50. Distribution of
