Distributed Mesh Span Restoration


Presentation Transcript


  1. Distributed Mesh Span Restoration

  2. Centralized vs. Distributed Network Configuration
     • Network configuration:
       • Establishment of working paths (provisioning)
       • Restoration
       • Traffic adaptation
     • Centralized approach:
       • A central command point that has knowledge of the complete network state (full network map, link states, available capacity, …) makes routing decisions and sends commands to the nodes.
     • Distributed approach (self-organizing):
       • Nodes apply simple rules in an autonomous and asynchronous manner to handle the problems of demand routing, restoration or traffic adaptation.

  3. Centralized vs. Distributed Network Configuration
     • Advantages of centralized approach:
       • The complete knowledge of the network state makes it possible to find an optimal network configuration
     • Drawbacks of centralized approach:
       • Diverse telemetry network required for collection of information about network state
       • Constant verification of database integrity is needed
       • Very slow due to the time needed for command download and validation
       • Can easily lead to the …
     Centralized control is unable to achieve the 2 second restoration time widely recognized as a target for the transport network.

  4. Centralized vs. Distributed Network Configuration
     • Advantages of a (distributed) self-organizing solution:
       • Simple (avoids the “Software mountain” problem)
       • Speed
       • Accuracy
       • Robustness
       • Resource usage efficiency
     • Drawback:
       • Not necessarily optimal
     The rest of the lecture presents a self-organizing solution to the problem of span restoration: the self-healing network (SHN) protocol.

  5. Objectives of the self-healing protocol (for the problem of span restoration)
     • Find the maximum number of replacement paths between the end-nodes of a failed span with the least possible amount of spare capacity
     The corresponding mathematical problem is single-commodity maximum flow between the two end-nodes of the failed span.
     Remark: A solution to this formulation does not tell us what cross connections should be made in the nodes.
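
The maximum-flow objective can be illustrated with a small centralized computation. This is only a sketch of the target quantity, not of the SHN protocol itself: it uses the standard Edmonds-Karp algorithm, treats each span as an undirected edge whose capacity is its number of spare links, and runs on a made-up four-node topology (the node names `T1`, `T2` and all capacities are assumptions for illustration).

```python
from collections import deque


def max_restoration_paths(spare, source, sink):
    """Edmonds-Karp maximum flow over undirected spans.

    spare maps (node_a, node_b) to the number of spare links on that span.
    Returns the largest number of link-disjoint replacement paths.
    """
    # Residual capacities: an undirected span is usable in either direction.
    residual = {}
    for (a, b), cap in spare.items():
        residual.setdefault(a, {})
        residual.setdefault(b, {})
        residual[a][b] = residual[a].get(b, 0) + cap
        residual[b][a] = residual[b].get(a, 0) + cap
    if source not in residual or sink not in residual:
        return 0

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow

        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck


# Hypothetical spare capacities around a failed span Q-P.
example_spare = {
    ("Q", "T1"): 3,
    ("T1", "P"): 2,
    ("Q", "T2"): 2,
    ("T2", "P"): 2,
    ("T1", "T2"): 1,
}
feasible = max_restoration_paths(example_spare, "Q", "P")
```

As the remark on the slide notes, the returned number says how many paths are feasible but not which cross-connections each node must make; that is what the distributed protocol has to work out.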

  6. Objectives of the self-healing protocol (2)
     • Achieve fast restoration (below 2 sec)
     • Avoid the complexity of centralized restoration and the “software mountain” phenomenon

  7. Distributed Restoration: what about flooding?
     • Flooding is a simple solution to find restoration routes
     However …
     • Flooding does not help decide the number of restoration paths to establish on each route
     • The formation of paths must be simultaneously coordinated to achieve maximum flow
     • Flooding can, however, be used for the determination of minimum-delay routes or the broadcast of topology updates
     (Remember the trap topology)

  8. SHN Protocol: Node Interactions using Statelets
     • Statelets are attributes of the transmission links
     Statelets: (NID, Sender, Chooser, index, repeat count)
     • NID: Node identification
     • Sender, Chooser: Pair of nodes at the two ends of the failed span (how the sender and chooser roles are assigned will be explained later)
     • Index: Unique number assigned to each statelet emitted by the sender
     • Repeat count: Number of hops since the statelet was emitted by the sender
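
The five statelet fields map naturally onto a small immutable record. The sketch below is one possible representation, not the protocol's actual encoding: the class name, the field names and the use of `None` for the `0` placeholders of the null statelet are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statelet:
    """One statelet as listed on the slide: (NID, Sender, Chooser, index, repeat count)."""
    nid: str                 # node that placed this statelet on the link
    sender: Optional[str]    # sender end-node of the failed span (None when null)
    chooser: Optional[str]   # chooser end-node of the failed span (None when null)
    index: int               # number assigned by the sender; 0 in a null statelet
    repeat_count: int        # hops travelled since the sender emitted it

    @property
    def is_null(self) -> bool:
        # Assumption: a statelet with index 0 is treated as the null statelet.
        return self.index == 0


def null_statelet(node_id: str) -> Statelet:
    """The pre-failure statelet, written (Q,0,0,0,0) for node Q on the slides."""
    return Statelet(node_id, None, None, 0, 0)


first_from_q = Statelet("Q", "Q", "P", 1, 1)
idle_from_q = null_statelet("Q")
```

Making the record frozen reflects that a statelet is a value attached to a link: a node that relays it builds a new one rather than mutating the one it received.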

  9. SHN Protocol: Different node states
     The SHN protocol is an event-driven finite state machine (FSM)
     • Nodes can only be in a finite number of different states:
       • Pre-failure state
       • Sender state
       • Chooser state
       • “Tandem node” state
     • The transition between these states depends on the changes of incoming statelets (events)
     • A change in an incoming statelet is called a receive statelet (RS) event
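
The four states can be written down as a table-driven state machine. This is a minimal sketch under assumptions: the event names are invented here, and the three transitions out of the pre-failure state are inferred from the activation and tandem-node slides rather than taken from a full protocol specification.

```python
from enum import Enum, auto


class NodeState(Enum):
    PRE_FAILURE = auto()
    SENDER = auto()
    CHOOSER = auto()
    TANDEM = auto()


class Event(Enum):
    # Hypothetical event names; the slides only speak of receive statelet (RS) events.
    ELECTED_SENDER = auto()      # span end-node that wins the sender role
    ELECTED_CHOOSER = auto()     # span end-node that takes the chooser role
    NON_NULL_STATELET = auto()   # any other node seeing a first non-null statelet


# Assumed transition table: only the pre-failure state reacts to these events.
TRANSITIONS = {
    (NodeState.PRE_FAILURE, Event.ELECTED_SENDER): NodeState.SENDER,
    (NodeState.PRE_FAILURE, Event.ELECTED_CHOOSER): NodeState.CHOOSER,
    (NodeState.PRE_FAILURE, Event.NON_NULL_STATELET): NodeState.TANDEM,
}


def next_state(state: NodeState, event: Event) -> NodeState:
    """Return the new state, or the unchanged state if no transition is defined."""
    return TRANSITIONS.get((state, event), state)
```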

  10. SHN Protocol: Pre-failure Node State
      In pre-failure state, nodes send null statelets on all working and spare links.
      [Figure: node Q in pre-failure state; every link carries a null statelet in both directions, where (null) = (Q,0,0,0,0).]

  11. SHN Protocol: Activation
      • A span failure is detected by the span end-nodes first.
      • One of the nodes becomes the “Sender node”.
      • Statelets are sent on each available spare link, up to a maximum of min(w, si) in each span (w: working links lost on the failed span; si: spare links on span i).
      [Figure: after the span cut, node Q sees RS.NID = P, leaves the pre-failure state and, after state determination, enters the sender state. It emits statelets (Q,Q,P,1,1) through (Q,Q,P,8,1) on its spare links while the incoming statelets are still (null).]
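
The min(w, si) limit on the sender's initial flood can be sketched as follows. The function name, the span labels and the spare-link counts are assumptions chosen so that the example produces eight statelets, as in the figure; statelets are shown as plain tuples in the slide's field order.

```python
def sender_statelets(sender, chooser, lost_working, spare_per_span):
    """Initial flood from the sender node.

    lost_working is w, the number of working links lost on the failed span.
    spare_per_span maps each surviving span to its number of spare links (si).
    Returns {span: [(NID, Sender, Chooser, index, repeat_count), ...]}.
    """
    flood = {}
    index = 1
    for span, spare in spare_per_span.items():
        flood[span] = []
        # Never emit more statelets on a span than there are paths to restore.
        for _ in range(min(lost_working, spare)):
            flood[span].append((sender, sender, chooser, index, 1))
            index += 1
    return flood


# Hypothetical node Q: 3 working links lost, three surviving spans.
flood = sender_statelets("Q", "P", 3, {"A": 4, "B": 2, "C": 3})
```

With these numbers span A is capped by w (3 of its 4 spare links are used), span B is capped by its own 2 spare links, and every statelet carries a distinct index with repeat count 1.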

  12. SHN Protocol: Basic view of Tandem Nodes role
      (Complete tandem node rules to be explained later)
      • Node T1 in Tandem node state:
        • Selective re-broadcast
        • Update of statelets:
          • NID: Q replaced by Tandem node ID
          • Repeat count increased by 1
      [Figure: node T1 leaves the pre-failure state when it receives (Q,Q,P,1,1), (Q,Q,P,2,1) and (Q,Q,P,3,1), and re-broadcasts them as (T1,Q,P,1,2), (T1,Q,P,2,2) and (T1,Q,P,3,2) on its other spare links.]
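
The statelet update performed by a tandem node is small enough to show directly. In this sketch the function name `relay` is an assumption, and statelets are plain tuples in the slide's order (NID, Sender, Chooser, index, repeat count).

```python
def relay(statelet, tandem_id):
    """Copy of a received statelet as a tandem node would re-broadcast it."""
    _nid, sender, chooser, index, repeat_count = statelet
    # The NID becomes the relaying node; the repeat count grows by one hop.
    return (tandem_id, sender, chooser, index, repeat_count + 1)


received = ("Q", "Q", "P", 1, 1)
rebroadcast = relay(received, "T1")
```

Sender, chooser and index are left untouched, which is what lets every node along the way recognize which failure and which path attempt a statelet belongs to.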

  13. SHN Protocol: Initiation of Reverse Linking
      • Node P receives the first statelet (for index 2 in the example): a complementary statelet is sent
      [Figure: node P in chooser state receives (Ti,Q,P,2,n) and answers with the complement statelet (P,Q,P,2,1)R.]

  14. SHN Protocol: Reverse Linking process
      • Node T1 receives the reverse-linking statelet and copies it to the port going to the precursor
      • Request for local cross connection
      [Figure: node T1, which holds precursors (Q,Q,P,1,1), (Q,Q,P,2,1) and (Q,Q,P,3,1) and re-broadcasts (T1,Q,P,1,2), (T1,Q,P,2,2) and (T1,Q,P,3,2); the reverse-linking statelets are labelled (T2,Q,P,2,n-1)R and (T,Q,P,2,n)R.]

  15. More details on SHN: Sender-Chooser Arbitration
      • Each end-node of the failed span compares its own ID with the NID of the statelet received from the other end-node:
        • Node Q (RS.NID = P): “How am I ranked compared to P?” “I will be the sender node”
        • Node P (RS.NID = Q): “How am I ranked compared to Q?” “I will be the chooser node”
      • Other possibility: The node with the lowest number of surviving spare links becomes the sender (to minimize the volume of statelets generated)
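
The ID-ranking arbitration can be sketched in a few lines. The slide does not say which rank wins, so the rule used here (the higher-ranked ID becomes the sender, with plain string comparison as the ranking) is an assumption chosen to match the Q/P example; the function name is also hypothetical.

```python
def arbitrate(my_id, rs_nid):
    """Role of a span end-node, given the NID seen on the failed span.

    Assumption: the node whose ID ranks higher takes the sender role.
    """
    return "sender" if my_id > rs_nid else "chooser"


role_of_q = arbitrate("Q", "P")
role_of_p = arbitrate("P", "Q")
```

Because both end-nodes apply the same comparison to the same pair of IDs, they reach complementary roles without exchanging any further messages.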

  16. More details on SHN: Tandem Nodes Rules
      1) Keep a list of ports where precursor statelets are presently found and sort statelets by:
         • increasing repeat count
         • increasing number of the port where they appear
      2) Replace precursors by better ones when better ones appear
      3) Try as much as possible to re-broadcast statelets to all other spans
      3a) When full re-broadcast is not possible, consider statelets in order of repeat count, starting with the lowest values
      4) When a complement statelet is received, it is copied to the port of the precursor, all re-broadcast of forward flooding statelets for the corresponding index is stopped, and any subsequent appearance of a reverse-linking statelet with that index is ignored

  17. Tandem Node Rules: Index ranking
      • Keep a list of ports where precursor statelets are presently found and sort statelets by:
        • increasing repeat count
        • increasing number of the port where they appear
      [Figure: a tandem node with five precursor statelets; those with repeat counts r = 1, 2, 3, 5 and 6 hold ranks 1, 2, 3, 4 and 5 in the list.]

  18. Tandem Node Rules: Selective re-broadcast
      • Try as much as possible to re-broadcast statelets to all other spans
      • When full re-broadcast is not possible, consider statelets in order of repeat count, starting with the lowest values
      [Figure: the same tandem node; precursors with repeat counts r = 1, 2, 3, 5 and 6 are served in rank order 1, 2, 3, 4 and 5.]
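
Ranking and selective re-broadcast can be sketched together as a greedy allocation. This is a simplified illustration, not the full tandem logic: it assumes one precursor per index has already been selected, counts free outgoing links per span, and uses invented names (`plan_rebroadcast`, the dictionary keys, spans A to C) and invented numbers that reuse the repeat counts 1, 2, 3, 5 and 6 from the figure.

```python
def plan_rebroadcast(precursors, free_links):
    """Decide on which spans each index is re-broadcast.

    precursors: list of dicts with keys "index", "repeat", "port", "span".
    free_links: {span: number of free outgoing spare links}.
    Returns {index: [spans that receive a copy]}.
    """
    # Rank by increasing repeat count, then by increasing port number.
    ranked = sorted(precursors, key=lambda p: (p["repeat"], p["port"]))
    remaining = dict(free_links)
    plan = {}
    for p in ranked:
        plan[p["index"]] = []
        for span in sorted(remaining):
            # One copy per span, never back onto the precursor's own span.
            if span != p["span"] and remaining[span] > 0:
                remaining[span] -= 1
                plan[p["index"]].append(span)
    return plan


# Hypothetical tandem node with three spans and two free links on each.
precursors = [
    {"index": 1, "repeat": 3, "port": 1, "span": "A"},
    {"index": 2, "repeat": 1, "port": 2, "span": "A"},
    {"index": 3, "repeat": 2, "port": 3, "span": "B"},
    {"index": 4, "repeat": 6, "port": 4, "span": "B"},
    {"index": 5, "repeat": 5, "port": 5, "span": "C"},
]
plan = plan_rebroadcast(precursors, {"A": 2, "B": 2, "C": 2})
```

In this example the two best-ranked indexes get a full re-broadcast, the next two get a partial one, and the index with the highest repeat count gets no outgoing link at all, which is the starvation effect the low-level-effects slides describe.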

  19. Tandem Node Rules: Reverse Linking
      Node T1 obeys rule 4:
      • Set the status of both ports as “working”
      • The complement statelet is sent to the precursor
      • Re-broadcast for that index is stopped
      • Cross connection is requested
      [Figure: node T1 with complement statelets (T2,Q,P,2,n-1)R and (T,Q,P,2,n)R; the two ports used by index 2 change from spare (s) to working (w) and a local cross connection is requested.]
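
The four actions of rule 4 can be sketched as a single event handler. The class, its attribute names and the port numbers are assumptions for illustration; the handler only records what a real node would do (send a statelet, request a cross-connection) instead of driving hardware.

```python
class TandemNode:
    """Minimal bookkeeping for reverse linking at a tandem node (rule 4)."""

    def __init__(self, precursor_port, rebroadcast_ports):
        self.precursor_port = dict(precursor_port)   # index -> port of its precursor
        self.rebroadcast_ports = {i: set(p) for i, p in rebroadcast_ports.items()}
        self.working_ports = set()    # ports whose status was set to "working"
        self.complements_sent = []    # (port, index) complement statelets emitted
        self.cross_connects = []      # (arrival_port, precursor_port) requests
        self.linked = set()           # indexes already reverse-linked

    def on_complement(self, index, arrival_port):
        """Handle a complement statelet; return False if it is ignored."""
        if index in self.linked or index not in self.precursor_port:
            return False              # later complements for this index are ignored
        precursor = self.precursor_port[index]
        self.working_ports.update({arrival_port, precursor})
        self.complements_sent.append((precursor, index))
        self.rebroadcast_ports[index] = set()   # stop forward flooding of this index
        self.cross_connects.append((arrival_port, precursor))
        self.linked.add(index)
        return True


# Hypothetical node: index 2 arrived on port 7 and was re-broadcast on ports 1, 3, 4.
node = TandemNode({2: 7}, {2: {1, 3, 4}})
first = node.on_complement(2, arrival_port=3)
second = node.on_complement(2, arrival_port=4)
```

Clearing the re-broadcast set is what frees outgoing links for other indexes, so a real node would revise its re-broadcast pattern right after this handler runs.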

  20. SHN Protocol: Frequent low-level effects
      • The application of rule 3 and the effects of reverse linking result in several frequent low-level effects:
        • The precursor location for an index shifts
        • A new index appears at the node
        • A precursor disappears
        • Links are freed for more re-broadcast after reverse linking
      • After any of these events, the re-broadcast pattern is revised to follow rule 3

  21. SHN Protocol: Frequent low-level effects (2)
      • The precursor location for an index shifts
      [Figure: statelets of index i at a tandem node before and after the precursor shift, with repeat counts r-1, r and r+1 labelled on the ports.]

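  22. SHN Protocol: Frequent low-level effects (3)
      • A new index appears at a node
      • Index 6 is not re-broadcast anymore
      [Figure: a new statelet with r = 4 arrives at the node of the previous ranking example and takes rank 4; the statelets with r = 5 and r = 6 move from ranks 4 and 5 to ranks 5 and 6.]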

  23. SHN Protocol: High Level Behaviour
      • At the high level what we see is:
        • Index trees expanding
        • Some trees are stopped because no re-broadcast is possible
        • Some trees reach the chooser node
        • Reverse linking makes successful trees collapse
        • Freed capacity allows revision of re-broadcast patterns
      • Eventually:
        • 100% restoration is achieved
        • or … no reverse linking events occur and the sender suspends statelet flooding after some time (time-out)

  24. SHN Protocol: Performance
      • Finding the maximum number of restoration paths
        • In no test case, across more than 15 test networks derived from the real world, did the SHN process yield fewer than the maximum feasible number of paths in the given network
      • Achieving fast restoration
        • Speed of restoration depends on the implementation
        • But on-demand (adaptive) restoration in less than 2 seconds is easily realized
        • Or, it can be used for Distributed Preplanning: pre-failure self-development of fast-acting protection pre-plans (~100 msec or less reaction upon failure)
      [Figure: complete restoration times for one of the implementations tested in [1].]

  25. Self-Organizing Networks: other applications
      • “Capacity scavenging” for:
        • Automated service path provisioning (“broad-band dial-up”)
        • Network audit (advance detection of restorability limitations and/or locations where capacity will soon be exhausted)
        • Improved restorability to complete node failures
      • “Distributed Pre-planning”
      • For more details, see:
        [1] W. Grover, “Self-organizing broad-band transport networks,” Proceedings of the IEEE, vol. 85, no. 10, October 1997.
        [2] W. D. Grover, “Distributed Restoration of the Transport Network,” Chapter 11 in Telecommunications Network Management into the 21st Century: Techniques, Standards, Technologies and Applications, S. Aidarous, T. Plevyak (editors), IEEE Press, 1994, pp. 337-417.