
Multiservice Switch Architecture


honora

Presentation Transcript


  1. Multiservice Switch Architecture

  2. Scope • Discuss only distributed architecture • Focus on data path functions

  3. Outline • Architecture Overview • Data Path Processing • Data path functions • Fast or slow path processing • Control and Data Plane partitioning • High Availability

  4. Architecture Overview: Logic View [Figure: a Control Module (Routing Engine, MPLS Control, QoS/DiffServ) and ATM and IP NICs, each with a QoS/DiffServ-enabled Forwarding Engine, interconnected by a Cell / Packet Switch Fabric]

  5. Architecture Overview: Forwarding Paths: Fast and Slow [Figure: the same logic view (Control Module, ATM NIC, IP NICs, Cell / Packet Switch Fabric) annotated with the fast and slow forwarding paths]

  6. Architecture Overview: Interfaces Source: Agilent Technologies

  7. Physical View Source: “Network Processors and Coprocessors for Broadband Network Applications,” T. A. Chu, ACORN Networks

  8. Card Level Paths: Fast and Slow Source: “Network Processors and Coprocessors for Broadband Network Applications,” T. A. Chu, ACORN Networks

  9. Architecture Overview: Coprocessors Source: “Network Processor Based Systems: Design Issues and Challenges,” I. Jeyasubramanian, et al., HCL Technologies Limited

  10. Architecture Overview: Coprocessors Source: “Network Processor Based Systems: Design Issues and Challenges,” I. Jeyasubramanian, et al., HCL Technologies Limited

  11. Architecture Overview: Software Architecture [Figure: the Control Plane (EMS, Policy, Routing, MPLS, FIB) sits above a Forwarding/Control Interface; the Forwarding Engine (FE) contains the Pre-IP Processing, IP Header Validation, Classifier, Policer, Action, IP/MPLS Header Processing, Mapper, Buffer Mgt., and Scheduler modules; the paths between them are numbered (1), (2), (2’), (2’’), (3), (4), (4’), (4’’), (5), (6), (7), (8), (9), (10)]

  12. Data Path Processing • (1) An ingress Ethernet frame from an input port, or a frame from the switching fabric, is validated and decapsulated. • (2’) For non-IP frames, such as PPP and IS-IS, Pre-IP Processing forwards the frame PDU directly to the control card. • (2’’) For an MPLS labeled packet that is to be forwarded by label swapping, the label swap table is looked up in the Pre-IP Processing module, and the labeled packet is sent to the IP/MPLS Header Processing module for further processing. • (2) The IP packet header information is validated. • (3) In the Classifier, firewall/policy-based classification and the IP forwarding table lookup are performed. • (4) For DiffServ-based filtering, classified packet flows are policed and marked/remarked. • (4’) For a non-DiffServ router, or a DiffServ router in the core, the Policer module may be bypassed, and the packet is acted upon based on the outcome of the Classifier.

  13. Data Path Processing • (4’’) IP-based control protocol packets, e.g., OSPF and RSVP-TE packets, are sent to the control card for further processing. • (5) The marked packet from the Policer is sent to the Action module to be rate limited. One or more thresholds can be used to decide whether the packet should be dropped, based on the current traffic rate and the color of the packet (DiffServ only). • (6) The packet header is processed, including TTL update, fragmentation, checksum update, and encapsulation. • (7) The Mapper maps the packet to one of the eight output queues based on the IP precedence subfield, the DSCP, or even the input interface ID or circuit ID the packet came from. • (8) The Buffer Manager then sends the packet to the appropriate queue. • (9) The Scheduler schedules the packet out to the circuit.
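
As an illustration of the Mapper in step (7), here is a minimal Python sketch that picks one of eight output queues from the DSCP. The mapping rule (use the top three DSCP bits, which coincide with the legacy IP precedence subfield) and the names `map_to_queue` and `dscp_from_tos` are assumptions made for this example; the slides do not say how the real Mapper assigns queues.

```python
NUM_QUEUES = 8  # the slides mention eight output queues


def dscp_from_tos(tos: int) -> int:
    """Extract the 6-bit DSCP from the 8-bit IPv4 TOS / DS field."""
    return (tos >> 2) & 0x3F


def map_to_queue(dscp: int) -> int:
    """Map a 6-bit DSCP to a queue index in 0..7.

    Assumption: the queue is chosen from the three most significant
    DSCP bits, i.e. the same bits as the IP precedence subfield.
    """
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp >> 3


# Expedited Forwarding (DSCP 46) is carried in TOS byte 0xB8.
print(map_to_queue(dscp_from_tos(0xB8)))
```

A real Mapper could equally use a configurable lookup table keyed by DSCP, input interface ID, or circuit ID.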

  14. Protocol Stack Overview

  15. ISO LLC • Logical Link Control (LLC) is specified in ISO 11802. The LLC header consists of three fields: a destination SAP address, a source SAP address, and a control field. Multiple protocols encapsulated over LLC are distinguished by protocol identification. • ISO provides its own scheme for network layer protocol identification (NLPID), specified in ISO 9577, and assigns the LLC SAP address 0xFE for use with the ISO 9577 NLPID scheme. IEEE 802.1a provides a separate scheme for network layer protocol identification (SNAP), for which ISO assigns the LLC SAP address 0xAA. • LLC encapsulation therefore comes in two formats: one based on the ISO NLPID (Network Layer Protocol Identifier) format, and the other based on the IEEE 802.1a SubNetwork Attachment Point (SNAP) format, also called the LLC/SNAP format.

  16. ISO LLC • The LLC header value 0xFE-FE-03 must be used to identify a routed PDU in the ISO NLPID format (e.g., PPP, IS-IS). • For the LLC/SNAP format, the LLC header is three octets long and has the value 0xAA-AA-03, which indicates the presence of a SNAP header. Note: The LLC/SNAP format must be used for IP datagram encapsulation. • The SNAP header consists of a three-octet Organizationally Unique Identifier (OUI) followed by a two-octet PID. The SNAP header uniquely identifies a routed or bridged protocol. The OUI value 0x00-00-00 indicates that the PID is an EtherType.

  17. ISO LLC Examples: Note: AppleTalk: LLC = 0xAA-AA-03; OUI = 0x08-00-07; PID = 0x809B
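
The LLC and SNAP layout described in the ISO LLC slides can be sketched in Python as below. This is an illustrative parser only: the names `LlcInfo` and `parse_llc` are hypothetical, and the sketch recognises just the two header values named in the slides (0xFE-FE-03 and 0xAA-AA-03).

```python
from typing import NamedTuple, Optional

LLC_NLPID = bytes([0xFE, 0xFE, 0x03])  # routed PDU, ISO NLPID format
LLC_SNAP = bytes([0xAA, 0xAA, 0x03])   # a SNAP header follows


class LlcInfo(NamedTuple):
    kind: str              # "nlpid", "snap", or "other"
    oui: Optional[int]     # 3-octet OUI (SNAP only)
    pid: Optional[int]     # 2-octet PID (SNAP only)
    payload_offset: int    # where the encapsulated PDU starts


def parse_llc(data: bytes) -> LlcInfo:
    """Parse the 3-octet LLC header and, if present, the 5-octet SNAP header."""
    if len(data) < 3:
        raise ValueError("truncated LLC header")
    llc = data[:3]
    if llc == LLC_SNAP:
        if len(data) < 8:
            raise ValueError("truncated SNAP header")
        oui = int.from_bytes(data[3:6], "big")
        pid = int.from_bytes(data[6:8], "big")
        return LlcInfo("snap", oui, pid, 8)
    if llc == LLC_NLPID:
        return LlcInfo("nlpid", None, None, 3)
    return LlcInfo("other", None, None, 3)


# AppleTalk example from the slides: OUI 0x080007, PID 0x809B.
info = parse_llc(bytes.fromhex("aaaa03080007809b"))
print(info.kind, hex(info.oui), hex(info.pid))
```

When the OUI is 0x000000 the PID can be read as an EtherType, so an IPv4 datagram appears as `aa aa 03 00 00 00 08 00`.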

  18. Frame Format

  19. Ethernet Frame Format

  20. Pre-IP Processing: Ingress [Flowchart. Input: frame. Decision nodes: IEEE 802.3 or Ethernet? (read the value of the Length/Type field; for IEEE 802.3, read the value of the Type field in SNAP); What protocol type? (IP, MPLS, PPP, IS-IS, others); What PPP type? (IP, MPLS, PPP control message, others); Label swapping? (yes/no). Outcomes: discard (others); send to the control card (PPP control message or IS-IS); send to the IP Header Validation module; send to the IP/MPLS Header Processing module]
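
The ingress decision flow of Pre-IP Processing can be approximated by the Python sketch below. It is one possible reading of the flowchart, not the presenter's implementation. Assumptions: frames are untagged (no VLAN tag); a Length/Type value of 0x0600 or more is an EtherType and a value of 1500 or less is an IEEE 802.3 length; every NLPID-format frame (0xFE-FE-03) goes to the control card; the PPP sub-type branch is omitted; and MPLS frames are simply handed to a label lookup step. The function name and the returned strings are hypothetical.

```python
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_MPLS_UNICAST = 0x8847
ETHERTYPE_MPLS_MULTICAST = 0x8848


def classify_frame(frame: bytes) -> str:
    """Return the next processing step for an untagged Ethernet/802.3 frame."""
    if len(frame) < 14:
        return "discard"
    length_or_type = int.from_bytes(frame[12:14], "big")

    if length_or_type >= 0x0600:
        # Ethernet (DIX) framing: the field is an EtherType.
        ethertype = length_or_type
    elif length_or_type <= 1500:
        # IEEE 802.3 framing: the field is a length and an LLC header follows.
        llc = frame[14:17]
        if llc == b"\xfe\xfe\x03":
            return "control_card"  # NLPID routed PDU, e.g. IS-IS
        if (llc == b"\xaa\xaa\x03" and len(frame) >= 22
                and frame[17:20] == b"\x00\x00\x00"):
            # SNAP with OUI 0x000000: the PID is an EtherType.
            ethertype = int.from_bytes(frame[20:22], "big")
        else:
            return "discard"
    else:
        return "discard"  # 1501..1535 is neither a length nor a type

    if ethertype == ETHERTYPE_IPV4:
        return "ip_header_validation"
    if ethertype in (ETHERTYPE_MPLS_UNICAST, ETHERTYPE_MPLS_MULTICAST):
        return "mpls_label_lookup"
    return "discard"  # "others" in the flowchart


addresses = bytes(12)  # placeholder destination and source MAC addresses
print(classify_frame(addresses + b"\x08\x00" + bytes(20)))
```

Discarding every other protocol type follows the flowchart literally; a production router would normally handle further types, such as ARP, before this point.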

  21. Pre-IP Processing: Generic MPLS Label Swapping

  22. ATM-LSR Label Swapping

  23. IP Header Format

  24. TCP Header Format

  25. UDP Header Format

  26. IP Header Validation

  27. Search Key and Filter Rule

  28. Packet Classification

  29. LER

  30. Action Types • Accept • Discard • Reject • Routing instance • Alert • Count • Log • DSCP set • Rate limit

  31. IP/MPLS Header Processing: TTL Update

  32. MTU Check and Fragmentation

  33. Fragmentation at a LSR

  34. A Fragmentation Algorithm • FO -- Fragment Offset in units of 8 octets • IHL -- Internet Header Length in units of 4 octets • DF -- Don’t Fragment flag • MF -- More Fragments flag • TL -- Total Length in octets • OFO -- Old Fragment Offset • OIHL -- Old Internet Header Length • OMF -- Old More Fragments flag • OTL -- Old Total Length • NFB -- Number of Fragment Blocks (block size = 8 octets) • MTU -- Maximum Transmission Unit in octets

  35. A Fragmentation Algorithm
      IF TL =< MTU THEN
          submit this datagram to the next step in datagram processing
      ELSE IF DF = 1 THEN
          discard the datagram; an ICMP Destination Unreachable message may be sent back to the source (see Section 6.2.2)
      ELSE
          To produce the first fragment:
            i.    Copy the original internet header;
            ii.   OIHL <- IHL; OTL <- TL; OFO <- FO; OMF <- MF;
            iii.  NFB <- (MTU - IHL*4)/8;
            iv.   Attach the first NFB*8 data octets;
            v.    Correct the header: MF <- 1; TL <- (IHL*4) + (NFB*8); recompute checksum;
            vi.   Submit this fragment to the next step in datagram processing;
          To produce the second fragment:
            vii.  Selectively copy the internet header (some options are not copied, see Section 6.2.1.4);
            viii. Append the remaining data;
            ix.   Correct the header: IHL <- {[(OIHL*4) - (length of options not copied)] + 3}/4; TL <- OTL - NFB*8 - (OIHL - IHL)*4; FO <- OFO + NFB; MF <- OMF; recompute checksum;
            x.    Submit this fragment to the fragmentation test;
      DONE.
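
The fragmentation procedure can be sketched in Python as follows. This is a simplified model, not a byte-level implementation. Assumptions: the header is represented by a small hypothetical `Datagram` record rather than raw octets, all options are copied into every fragment (so IHL never changes), checksum recomputation is left out, and the "submit to the fragmentation test" step is unrolled into a loop.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Datagram:
    ihl: int     # Internet Header Length, in 4-octet units
    df: bool     # Don't Fragment flag
    mf: bool     # More Fragments flag
    fo: int      # Fragment Offset, in 8-octet units
    data: bytes  # payload following the header

    @property
    def tl(self) -> int:
        """Total Length in octets (header plus payload)."""
        return self.ihl * 4 + len(self.data)


def fragment(d: Datagram, mtu: int) -> List[Datagram]:
    """Return the datagrams to transmit; an empty list means 'discard'."""
    if d.tl <= mtu:
        return [d]
    if d.df:
        return []  # caller may send ICMP Destination Unreachable
    nfb = (mtu - d.ihl * 4) // 8  # Number of Fragment Blocks per fragment
    if nfb < 1:
        raise ValueError("MTU too small to carry any data")
    frags = []
    rest = d
    while rest.tl > mtu:
        # First fragment: first NFB*8 data octets, MF forced to 1.
        frags.append(Datagram(rest.ihl, False, True, rest.fo, rest.data[:nfb * 8]))
        # Second fragment: remaining data, FO advanced, old MF preserved.
        rest = Datagram(rest.ihl, False, rest.mf, rest.fo + nfb, rest.data[nfb * 8:])
    frags.append(rest)
    return frags


for f in fragment(Datagram(5, False, False, 0, bytes(1000)), mtu=576):
    print(f.fo, f.mf, f.tl)
```

With a 576-octet MTU and a 20-octet header, NFB is 69, so the first fragment carries 552 data octets and the second starts at fragment offset 69.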

  36. Checksum Update
      HC: old checksum in header
      HC’: new checksum in header
      M: old value of a 16-bit field
      M’: new value of a 16-bit field
      Then the algorithm is as follows:
      IF M - M’ = 1 THEN HC’ = HC - 0xFFFE (with borrow)
      ELSE HC’ = HC - ~M - M’ (with borrow)
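
The incremental update can be checked against a full recomputation with a short Python sketch. It uses the equivalent one's complement form HC’ = ~(~HC + ~M + M’) from RFC 1624 instead of the subtraction with borrow shown on the slide. The function names are chosen for this example, and the sample header is an arbitrary 20-octet IPv4 header with its checksum field zeroed.

```python
def ones_add(a: int, b: int) -> int:
    """Add two 16-bit values with end-around carry (one's complement add)."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def checksum(header: bytes) -> int:
    """Full checksum over a header whose checksum field is set to zero."""
    total = 0
    for i in range(0, len(header), 2):
        total = ones_add(total, int.from_bytes(header[i:i + 2], "big"))
    return ~total & 0xFFFF


def update_checksum(hc: int, m: int, m_new: int) -> int:
    """Incremental update: HC' = ~(~HC + ~M + M')."""
    s = ones_add(~hc & 0xFFFF, ~m & 0xFFFF)
    s = ones_add(s, m_new)
    return ~s & 0xFFFF


# Sample header: TTL = 0x40, protocol = 0x11, checksum field zeroed.
old = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
hc = checksum(old)

# Decrement the TTL: the 16-bit word holding TTL and protocol changes.
new = bytearray(old)
new[8] -= 1
m = int.from_bytes(old[8:10], "big")
m_new = int.from_bytes(new[8:10], "big")

print(hex(hc), hex(update_checksum(hc, m, m_new)), hex(checksum(bytes(new))))
```

Because the TTL occupies the high octet of its 16-bit word, decrementing it by one changes M by 0x0100 rather than by 1.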

  37. Fast or Slow Path Forwarding • Some gray areas: • ICMP • Options field • Packet fragmentation

  38. ICMP

  39. ICMP

  40. ICMP

  41. ICMP • Different ICMP message types may be handled differently • Informational ICMP messages may be handled by the control card, e.g., Timestamp/Timestamp Reply and Echo/Echo Reply • ICMP messages relevant to data forwarding may be handled by the network processor itself, e.g., Destination Unreachable, Source Quench (obsolete), Redirect, Time Exceeded, and Parameter Problem • ICMP packets sent to the central control card should be rate limited to prevent ICMP DoS attacks
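
A minimal Python sketch of this split is shown below. It routes ICMP messages by type and applies a token-bucket rate limit to anything sent to the control card. The token bucket, its parameters, the names `TokenBucket` and `dispatch_icmp`, and the choice to send unlisted types to the control card are all assumptions for illustration; the slides only say that rate limiting should be enforced.

```python
# ICMP type numbers for the messages named in the slide.
NETWORK_PROCESSOR_TYPES = {
    3,   # Destination Unreachable
    4,   # Source Quench (obsolete)
    5,   # Redirect
    11,  # Time Exceeded
    12,  # Parameter Problem
}
# Informational types (Echo Reply 0, Echo 8, Timestamp 13, Timestamp Reply 14)
# and anything unlisted go to the control card in this sketch.


class TokenBucket:
    """Packet-based token bucket; time is passed in explicitly (seconds)."""

    def __init__(self, rate_pps: float, burst: int) -> None:
        self.rate = rate_pps
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def dispatch_icmp(icmp_type: int, now: float, limiter: TokenBucket) -> str:
    if icmp_type in NETWORK_PROCESSOR_TYPES:
        return "network_processor"
    return "control_card" if limiter.allow(now) else "drop"


limiter = TokenBucket(rate_pps=10, burst=2)
print([dispatch_icmp(8, 0.0, limiter) for _ in range(3)])
```

With a burst of two, the third Echo request arriving at the same instant is dropped instead of reaching the control card.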

  42. Options Field • Options processing needs to be done by either the central control card or the local CPU, preferably the central control card

  43. Fragmentation • About 3% of Internet traffic needs fragmentation • Slow path forwarding can be problematic • An example: for an OC-192 interface (about 10 Gbps), 3% means the CPU has to handle about 300 Mbps of traffic!

  44. Fragmentation • Concept of wire-speed forwarding. Assumptions: • A network processor running at a 200 MHz clock rate, i.e., 5 ns per clock cycle • One instruction per clock cycle • There are 8 threads working in a pipeline • Minimum frame size is 60 bytes • Line rate = 1 Gbps. Per-frame time = 60 × 8 bits / 1 Gbps = 480 ns. Instruction budget = 480 ns / 5 ns = 96 instructions per packet. Latency budget = 480 ns × 8 = 3840 ns. Wire-speed: as long as the network processor is work conserving and the instruction budget is not exceeded, wire-speed forwarding is maintained
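
The wire-speed budget arithmetic can be reproduced with a few lines of Python. The function name `wire_speed_budget` and its parameters are hypothetical; the input values are the assumptions listed on the slide. Integer division is used, which is exact for these inputs but truncates for values that do not divide evenly.

```python
def wire_speed_budget(clock_hz: int, frame_bytes: int,
                      line_rate_bps: int, threads: int) -> dict:
    """Per-packet budgets, assuming one instruction per clock cycle."""
    ns_per_s = 10**9
    cycle_ns = ns_per_s // clock_hz
    frame_time_ns = frame_bytes * 8 * ns_per_s // line_rate_bps
    return {
        "cycle_ns": cycle_ns,
        "frame_time_ns": frame_time_ns,
        "instruction_budget": frame_time_ns // cycle_ns,
        "latency_budget_ns": frame_time_ns * threads,
    }


budget = wire_speed_budget(clock_hz=200_000_000, frame_bytes=60,
                           line_rate_bps=1_000_000_000, threads=8)
print(budget)
```

Changing `line_rate_bps` to 10 Gbps leaves 48 ns per minimum-size frame, which is fewer than ten instructions per packet at the same clock rate.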

  45. Fragmentation • Traditional perception: “Fragmentation should not be done by the network processor because it consumes too many clock cycles or instructions” • The traditional perception could be wrong, and the truth might be: “Care needs to be taken with the load and store of the IP header information being updated, to avoid long latency for packet fragmentation” • The instruction budget is not an issue, because it is calculated from the clock cycles available for a minimum-sized packet

  46. Function Partitioning • Why is it important? • Distributed or centralized? • Ideally, local information should be handled by local components; however, the need for information exchange between components sometimes calls for a centralized approach • The components involved are mainly the control card/central CPU, the NICs, and the local CPUs

  47. Function Partitioning • Examples: • Framing at ingress or egress NIC? • ARP and/or PPP running on local CPU or central CPU? • Control plane functions running on local CPU or central CPU?

  48. Framing at Ingress or Egress • Definitions: • Ingress framing: do the layer 2 framing for the outgoing packet at the ingress NIC • Egress framing: do the layer 2 framing for the outgoing packet at the egress NIC • Which one’s better? • Ingress framing requires globalization of local information, e.g., ARP tables and interface MTUs, which takes more memory space • Egress framing requires more processing on the same packet, e.g., another IP forwarding table lookup to find the next hop IP address, or more overhead to carry the next hop IP address from ingress to egress • Prioritizing ingress processing versus egress processing in the network processor may favor one solution over the other

  49. Router ARP Scope • Within an IP subnet • A physical interface may support multiple IP subnets

  50. ARP • Design choices: • Distributed solution: run ARP locally on the local CPUs in the NICs • Centralized solution: run ARP on the central control processor • Hybrid solution: run ARP locally, but centralize the ARP tables • Impact of the different design choices: • The distributed solution is good when packet framing is done at the egress NICs • If packet framing is done at the ingress NICs, the centralized solution may be better • The hybrid solution can be a good choice when central control processor power is constrained while packet framing needs to be done at the ingress NICs
