Ethernet Automatic Protection Switching (EAPS)

Ethernet Automatic Protection Switching (EAPS): a brief comparison with Ethernet Ring Protection Switching (ERPS)

Introduction • EAPS is a protocol designed to increase the availability of Ethernet rings • Developed by Extreme Networks (RFC 3619, 2003) • Objective: provide a resilience level comparable to SONET rings • The current version (v1.3, 2011) adds some enhancements over version 1 (RFC 3619, 2003)

Motivation • Ethernet is widely used in Local Area Networks (LANs) and Metropolitan Area Networks (MANs) • These networks typically have a ring topology • MAN operators want to reduce recovery time • Spanning Tree Protocol (STP) can take 30 to 60 seconds to recover • Rapid Spanning Tree Protocol (RSTP) is faster, but its convergence time depends on the number of nodes • Both STP and RSTP limit the number of nodes • EAPS recovers in less than 1 second (100 ms) • EAPS does not limit the number of nodes

Basic Considerations (I) • A ring is made up of two or more switches • Each switch has two ports connected to the ring • An EAPS domain exists on a single Ethernet ring • A domain protects a group of VLANs • A domain has a unique control VLAN • Multiple EAPS domains can coexist on the same ring, each with its own control VLAN

Basic Considerations (II) • For each EAPS domain: • One of the nodes is the Master (S1) • One port is designated as the Primary port (P) • The other is the Secondary Port (S) • All other nodes (S2-S6) are known as Transit nodes
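The domain concepts on these two slides can be captured in a small data model. This is only a sketch: the class name `EapsDomain`, the helper `control_vlan_conflicts`, the VLAN IDs, and the port labels are all illustrative assumptions, not part of EAPS or of any vendor's configuration syntax. It records what a domain consists of and checks the rule that each domain on a ring needs its own control VLAN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EapsDomain:
    name: str
    control_vlan: int           # carries EAPS control frames only
    protected_vlans: frozenset  # data VLANs protected by this domain
    master: str                 # node acting as Master, e.g. "S1"
    primary_port: str           # Master's Primary ring port
    secondary_port: str         # Master's Secondary ring port


def control_vlan_conflicts(domains):
    """Return the control VLAN IDs claimed by more than one domain."""
    owners = {}
    for d in domains:
        owners.setdefault(d.control_vlan, []).append(d.name)
    return sorted(v for v, names in owners.items() if len(names) > 1)


# Two hypothetical domains sharing one physical ring.
ring = [
    EapsDomain("d1", 10, frozenset({100, 101}), "S1", "1:1", "1:2"),
    EapsDomain("d2", 20, frozenset({200}), "S4", "1:1", "1:2"),
]
print(control_vlan_conflicts(ring))  # prints []
```

Giving each domain a different Master, as in this example, is one way two domains on the same ring could block different links.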

Normal Operation • The Master node blocks its secondary port -> avoids loops • Only non-control traffic is blocked (the control VLAN is NOT blocked) • Master is in COMPLETE state • Transit nodes are in LINKS-UP state • The Master sends health-check frames (HEALTH-CHECK-PDU) periodically (Hello timer) • From primary port to secondary port • Control frames are consumed by the Master -> NOT forwarded

Fault Operation • When a fault is detected: • The Master changes to FAILED state • Unblocks its secondary port • Flushes its bridging table • The Master orders the other nodes to flush their tables • Sends a RING-DOWN-FLUSH-FDB-PDU frame • Transit nodes learn the new topology

Fault Detection (I) • 2 ways of detecting a failure • Link Down Alert • Ring Polling • Link Down Alert • A Transit node detects a link-down • The Transit node detecting the failure changes to LINKS-DOWN state • The Transit node sends a LINK-DOWN-PDU frame to the Master • Master changes to FAILED state • Master unblocks secondary port • ...

Fault Detection (II) • Ring Polling (version 1 – RFC3619) • Master sends HEALTH-CHECK-PDU frames periodically • From primary to secondary port • Master has a Fail-period timer • If health check frame received before timer expires -> reset timer • If health check frame NOT received before timer expires • Master changes to FAILED state • Master unblocks secondary port • ...

Fault Detection (III) • Ring Polling (version 1.3) • 2 options if the Fail-period timer expires (configurable) • «Open Secondary Port» -> as on the previous slide • «Send-Alert» • Master does NOT unblock its secondary port yet • Master sends a QUERY-LINK-STATUS-PDU frame out of both ports • Transit nodes with a link failure reply with a LINK-DOWN-PDU frame • Master changes to FAILED state • ... • Why? «Send-Alert» prevents false failures • Health frames may fail to return to the Master even if the ring is complete, because of: • Control VLAN misconfigurations • Too much traffic • Master node’s CPU busy
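The difference between the two configurable options can be sketched as a pair of decision functions. The names below (`Decision`, `on_fail_timer_expired`, `on_link_down_pdu`, and the mode strings) are hypothetical labels chosen for illustration. The sketch also assumes that in «Send-Alert» mode the Master simply keeps its current state until a LINK-DOWN-PDU reply arrives, which the slide does not spell out.

```python
from dataclasses import dataclass, field

OPEN_SECONDARY = "open-secondary-port"  # hypothetical mode labels
SEND_ALERT = "send-alert"


@dataclass
class Decision:
    new_state: str
    unblock_secondary: bool
    frames: list = field(default_factory=list)


def on_fail_timer_expired(mode, state="COMPLETE"):
    """What the Master does when the Fail-period timer expires."""
    if mode == OPEN_SECONDARY:
        # Version 1 behaviour: assume the ring is broken right away.
        return Decision("FAILED", True, ["RING-DOWN-FLUSH-FDB-PDU"])
    if mode == SEND_ALERT:
        # Keep the secondary port blocked and query both ring ports first.
        # Assumption: the Master's state is left unchanged at this point.
        return Decision(state, False, ["QUERY-LINK-STATUS-PDU"] * 2)
    raise ValueError(f"unknown mode: {mode!r}")


def on_link_down_pdu():
    """A Transit node confirmed a real link failure."""
    return Decision("FAILED", True, ["RING-DOWN-FLUSH-FDB-PDU"])


first = on_fail_timer_expired(SEND_ALERT)
print(first.new_state, first.unblock_secondary)  # prints COMPLETE False
```

With «Send-Alert», a lost health frame caused by congestion or a busy CPU never opens the secondary port by itself, so it cannot create a loop on a ring that is actually intact.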

Fault Restoration (I) • Master in FAILED state -> continues sending HEALTH-CHECK-PDU frames • Ring restored -> Master’s secondary port receives a health frame • Master changes to COMPLETE state • Blocks non-control frames on secondary port • Flushes its bridge table • Orders the other nodes to flush their tables • Sends a RING-UP-FLUSH-FDB-PDU frame • Transit nodes re-learn the topology
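The version 1 Master behaviour described so far (normal operation, the two detection methods, and restoration) can be put together as a small state machine. This is a minimal sketch under stated assumptions: the class `Master`, its method names, and the idea of counting the Fail-period timer in Hello intervals (`tick`) are illustrative choices, not something taken from RFC 3619. No real frames are sent; the `sent` list only records what the Master would transmit.

```python
from enum import Enum


class MasterState(Enum):
    COMPLETE = "COMPLETE"
    FAILED = "FAILED"


class Master:
    def __init__(self, fail_period=3):
        self.state = MasterState.COMPLETE
        self.secondary_blocked = True    # blocks non-control traffic only
        self.fail_period = fail_period   # assumed unit: Hello intervals
        self.fail_timer = fail_period
        self.fdb_flushes = 0
        self.sent = []

    def _ring_down(self):
        if self.state is MasterState.FAILED:
            return
        self.state = MasterState.FAILED
        self.secondary_blocked = False
        self.fdb_flushes += 1
        self.sent.append("RING-DOWN-FLUSH-FDB-PDU")

    def tick(self):
        """One Hello interval: poll the ring and age the Fail-period timer."""
        self.sent.append("HEALTH-CHECK-PDU")  # sent in both states
        self.fail_timer -= 1
        if self.fail_timer <= 0:
            self.fail_timer = self.fail_period
            self._ring_down()                 # Ring Polling detection

    def on_link_down_pdu(self):
        self._ring_down()                     # Link Down Alert detection

    def on_health_check(self):
        """A health frame came back on the secondary port."""
        self.fail_timer = self.fail_period
        if self.state is MasterState.FAILED:  # ring restored
            self.state = MasterState.COMPLETE
            self.secondary_blocked = True
            self.fdb_flushes += 1
            self.sent.append("RING-UP-FLUSH-FDB-PDU")


m = Master(fail_period=3)
for _ in range(3):      # three Hello intervals with no frame returning
    m.tick()
print(m.state.value)    # prints FAILED
m.on_health_check()
print(m.state.value)    # prints COMPLETE
```

Both detection paths end in the same `_ring_down` step, which mirrors the slides: whichever method notices the fault first, the Master reacts identically.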

Fault Restoration (II) – PREFORWARDING State • In the time between the Transit node detecting that its link is restored and the Master detecting that the ring is restored, the Master’s secondary port is still unblocked -> possible temporary loop • When a Transit node detects its link is restored • Changes to PREFORWARDING state and starts the Preforwarding timer • Protected VLANs on that port are temporarily blocked • Waits until a RING-UP-FLUSH-FDB-PDU is received • Changes to LINKS-UP state • Unblocks the previously blocked VLANs • Flushes its bridge table and stops the Preforwarding timer • Re-learns the topology

Fault Restoration (III) – PREFORWARDING State • The Preforwarding timer deals with: • A lost RING-UP-FLUSH-FDB-PDU from the Master • Another break in the ring • If the Transit node remained in PREFORWARDING state indefinitely -> disconnected network • The Preforwarding timer is derived from the Hello timer for HEALTH-CHECK-PDU frames
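The Transit node side of restoration can be sketched in the same way. The class `TransitNode`, its method names, and the tick-based timer are assumptions made for illustration. The sketch also assumes that when the Preforwarding timer expires the node simply moves to LINKS-UP and unblocks the port, which is the behaviour the slide implies by warning against staying in PREFORWARDING indefinitely.

```python
class TransitNode:
    def __init__(self, preforwarding_ticks=6):
        self.state = "LINKS-UP"
        self.preforwarding_ticks = preforwarding_ticks  # hypothetical value
        self.timer = None         # None means the timer is not running
        self.blocked_port = None  # port whose protected VLANs are blocked
        self.fdb_flushes = 0
        self.sent = []

    def link_down(self, port):
        self.state = "LINKS-DOWN"
        self.timer = None
        self.blocked_port = None  # the port is down, nothing to block
        self.sent.append("LINK-DOWN-PDU")

    def link_up(self, port):
        if self.state != "LINKS-DOWN":
            return
        # Keep protected VLANs blocked until the Master closes its side.
        self.state = "PREFORWARDING"
        self.blocked_port = port
        self.timer = self.preforwarding_ticks

    def _forward(self):
        self.state = "LINKS-UP"
        self.blocked_port = None
        self.timer = None

    def on_ring_up_flush(self):
        self.fdb_flushes += 1     # every Transit node flushes on this frame
        if self.state == "PREFORWARDING":
            self._forward()

    def tick(self):
        if self.timer is None:
            return
        self.timer -= 1
        if self.timer <= 0:       # PDU lost, or another break in the ring
            self._forward()


n = TransitNode(preforwarding_ticks=2)
n.link_down("p1")
n.link_up("p1")
print(n.state, n.blocked_port)  # prints PREFORWARDING p1
n.tick()
n.tick()
print(n.state, n.blocked_port)  # prints LINKS-UP None
```

The timer path is the safety net: it trades a small risk of a temporary loop for the guarantee that the node eventually forwards again.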

Enhancements of version 1.3 • «Send-Alert» configuration for the Ring Polling fault detection method • INIT state • Master comes up for the first time and its ports are up • Master does not know if the ring is up • Master starts in INIT state -> blocks secondary port • When the first health frame is received -> changes to COMPLETE state • Helps spot misconfigurations in the control VLAN • LINK-UP-PDU • A Transit node detects that a link comes up -> sends LINK-UP-PDU to the Master • Timestamp used for troubleshooting • If the Master never changes to COMPLETE state • Allows use of EAPS Shared-Ports

VLANs in Multiple EAPS domains (Multiple Rings) (I) • EAPS can handle a simple configuration • Each ring has an EAPS domain, a Master node and a Control VLAN • A VLAN spanning both rings is added as a protected VLAN in both EAPS domains

VLANs in Multiple EAPS domains (Multiple Rings) (II) • Topologies with a common link can be problematic • If the common link fails • Both Masters open their secondary ports • Protected VLANs spanning both rings will have a loop • S1-S2-S3-S4-S5-S6-S7-S8-S9-S10-S1 • EAPS Shared-Ports deals with it • Out of scope for this presentation
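The common-link problem can be checked with a short loop test on the set of forwarding links. The topology here is an assumption, since the slide's figure is not available: ten nodes S1 to S10 form the outer path named on the slide, a hypothetical common link joins S5 and S10, and the two Masters are assumed to block the links S1-S2 and S6-S7. The helper names `link` and `has_loop` are illustrative; `has_loop` uses a standard union-find cycle check.

```python
def link(a, b):
    """Undirected link, stored with its endpoints in a fixed order."""
    return tuple(sorted((a, b)))


def has_loop(links):
    """Return True if the forwarding links contain a cycle (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra == rb:
            return True   # a and b were already connected
        parent[ra] = rb
    return False


# Outer path S1-S2-...-S10-S1 plus an assumed common link S5-S10.
outer = {link(f"S{i}", f"S{i % 10 + 1}") for i in range(1, 11)}
common = link("S5", "S10")
blocked_a = link("S1", "S2")  # assumed block set by the Master of ring A
blocked_b = link("S6", "S7")  # assumed block set by the Master of ring B

normal = (outer | {common}) - {blocked_a, blocked_b}
after_common_failure = (outer | {common}) - {common}  # both blocks opened

print(has_loop(normal))                # prints False
print(has_loop(after_common_failure))  # prints True
```

Each Master reacts correctly for its own ring, yet the combination leaves the whole outer path forwarding for any VLAN protected by both domains.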

States and Control Frames • [Table comparing the states and control frames of version 1 (RFC 3619) and version 1.3; not reproduced in this transcript]

Ethernet Ring Protection Switching (I) • Ethernet Ring Protection Switching (ERPS) is defined by ITU-T G.8032 -> aims at sub-50 ms recovery times in rings • Basic considerations: • One link is designated as the Ring Protection Link (RPL) -> blocked to prevent loops • The node setting the block is the RPL Owner (the Master in EAPS) • Nodes monitor link failures using Ethernet Continuity Check (ETH-CC) messages • Four defined local events: • Local Signal Failure (local SF) -> detection of link failure • Local clear Signal Failure (local clear SF) -> detection of link restoration • Wait-To-Restore Expire (WTR-Expire) -> timer expiration • Wait-To-Restore Running (WTR-Running) -> timer running

Ethernet Ring Protection Switching (II) • Basic considerations (cont.): • The protocol uses Ring Automatic Protection Switching (R-APS) messages: • R-APS(SF): sent by the node detecting a link failure (gets local SF) • R-APS(NR): sent by the node detecting link restoration (gets local clear SF) • R-APS(NR,RB): sent by the RPL Owner, indicating that the RPL is blocked • Two important timers • Wait-To-Restore (WTR) Timer: used by the RPL Owner to verify that the ring has stabilized before blocking the RPL after a failure • Guard Timer: used by nodes detecting link restoration to avoid acting on outdated R-APS messages • Three states for nodes • Initialization: when the node is first defined • Idle: normal state, RPL blocked, all nodes/ports working • Protecting: protection switching is in effect

Ethernet Ring Protection Switching (III) • Basic considerations (cont.): • An R-APS channel is configured using a VLAN -> used to transmit R-APS messages

ERPS Principle of Operation (I) • In normal operation (nodes in state Idle): RPL is blocked • Link failure (local SF): nodes detecting it block the failed port, send R-APS(SF) and flush the filtering database (FDB) • Nodes receiving R-APS(SF) flush FDBs • RPL Owner receives R-APS(SF): flushes FDB, unblocks RPL • Link restoration (local clear SF): detecting nodes send R-APS(NR) periodically and start the Guard Timer • RPL Owner receives R-APS(NR): starts WTR Timer • WTR Timer expires: RPL Owner blocks RPL, sends R-APS(NR,RB) and flushes FDB • Nodes receiving R-APS(NR,RB) flush FDBs • Nodes detecting link restoration unblock recovered ports, stop sending R-APS(NR) and flush FDBs
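The RPL Owner's part of this sequence can be sketched as a state machine. The class `RplOwner`, its method names, and the tick-based WTR timer are illustrative assumptions, not the state machine defined in ITU-T G.8032, which has more states, events, and priorities. Two simplifications are also assumed: a new R-APS(SF) cancels a running WTR timer, and every received R-APS(SF) triggers a flush.

```python
class RplOwner:
    def __init__(self, wtr_ticks=5):
        self.state = "Idle"
        self.rpl_blocked = True
        self.wtr_ticks = wtr_ticks  # hypothetical WTR duration, in ticks
        self.wtr = None             # None means WTR is not running
        self.fdb_flushes = 0
        self.sent = []

    def on_raps_sf(self):
        """Some node reported a link failure."""
        self.state = "Protecting"
        self.rpl_blocked = False
        self.wtr = None             # assumption: a new failure cancels WTR
        self.fdb_flushes += 1

    def on_raps_nr(self):
        """The failed link is reported as restored."""
        if self.state == "Protecting" and self.wtr is None:
            self.wtr = self.wtr_ticks   # R-APS(NR) repeats; start WTR once

    def tick(self):
        if self.wtr is None:
            return
        self.wtr -= 1
        if self.wtr <= 0:           # WTR expired: revert to normal operation
            self.wtr = None
            self.rpl_blocked = True
            self.sent.append("R-APS(NR,RB)")
            self.fdb_flushes += 1
            self.state = "Idle"


o = RplOwner(wtr_ticks=2)
o.on_raps_sf()
print(o.state, o.rpl_blocked)  # prints Protecting False
o.on_raps_nr()
o.tick()
o.tick()
print(o.state, o.rpl_blocked)  # prints Idle True
```

Unlike the EAPS Master, which re-blocks as soon as a health frame returns, the RPL Owner waits for the WTR timer, so a flapping link does not cause repeated switching.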

ERPS Principle of Operation (II)

EAPS vs. ERPS • Same basic idea: break the loop in the ring by blocking one port • In case of failure, unblock the blocked port and keep connectivity • EAPS: • Both the Master and Transit nodes can detect a failure • Only the Master detects that the ring is restored • ERPS: • Only the nodes adjacent to a failed link detect failures and restoration

References • S. Shah, M. Yip, «RFC 3619: Extreme Networks’ Ethernet Automatic Protection Switching (EAPS) Version 1», Network Working Group, October 2003. • A. Lim, S. Blake, S. Shah, «Extreme Networks’ Ethernet Protection Switching (EAPS), Version 1.3», Internet-Draft, July 2011. • Extreme Networks Whitepaper «Ethernet Automatic Protection Switching (EAPS)», Extreme Networks, Inc., 2006. • J. D. Ryoo, H. Long, Y. Yang, M. Holness, Z. Ahmad, J. K. Rhee, «Ethernet Ring Protection for Carrier Ethernet Networks», IEEE Communications Magazine, September 2008.
