
OpenFlow Switch Limitations


marci



Presentation Transcript


  1. OpenFlow Switch Limitations

  2. Background: Current Applications • Traffic engineering applications (performance) • Fine-grained rules and short time scales • Coarse-grained rules and long time scales • Middlebox provisioning (performance + security) • Fine-grained rules and long time scales • Network services • Load balancer: fine-grained, short time scales • Firewall: fine-grained, long time scales • Cloud services • Fine-grained rules and long time scales

  3. Background: Switch Design [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link, Hash Table and TCAM at 250 Gbps]

  4. OpenFlow Background: Flow Table Entries • OpenFlow rules match on 14 fields • Usually stored in TCAM (TCAM is much smaller) • Generally 1K-10K entries • Normal switches • 100K-1000K entries • Only match on 1-2 fields

  5. Background: Switch Design [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link, Hash Table and TCAM at 250 Gbps]

  6. OpenFlow Background: Network Events • Packet_In (flow-table miss: packet matches no rule) • Asynchronous, from switch to controller • Flow_mod (insert flow table entries) • Asynchronous, from controller to switch • Flow_timeout (flow was removed due to timeout) • Asynchronous, from switch to controller • Get flow statistics (information about current flows) • Synchronous between switch and controller • Controller sends request, switch replies

  7. Background: Switch Design • 1. Switch checks the flow table; if there is no match, it informs the CPU • 2. CPU creates a packet-in event and sends it to the controller • 3. Controller runs code to process the event [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link; packet From: Theo To: Bruce]

  8. Background: Switch Design • 4. Controller creates a flow_mod event and sends it to the switch • 5. CPU processes the flow_mod and inserts the rule into TCAM [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link; rule: From: theo, to: bruce, send on port 1, Timeout: 10 secs, count: 0; packet From: Theo To: Bruce]

  9. Background: Switch Design • 4. Controller creates a flow_mod event and sends it to the switch • 5. CPU processes the flow_mod and inserts the rule into TCAM [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link; rule: From: theo, to: bruce, send on port 1, Timeout: 10 secs, count: 1; packet From: Theo To: Bruce]

  10. Background: Switch Design • Check flow table • Found matching rule • Forward packet • Update the count [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link; rule: From: theo, to: bruce, send on port 1, Timeout: 10 secs, count: 1; packet From: Theo To: Bruce]

  11. Background: Switch Design • Check flow table • No matching rule • Now we must talk to the controller [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link; rule: From: theo, to: bruce, send on port 1, Timeout: 10 secs, count: 1; packet From: Theo To: John]

  12. Background: Switch Design • Check flow table • Found matching rule • Forward packet • Update the count [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link; rule: From: theo, to: ***, send on port 1, Timeout: 10 secs, count: 1; packet From: Theo To: John]

  13. Background: Switch Design • Check flow table • Found matching rule • Forward packet • Update the count [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link; rule: From: theo, to: ***, send on port 1, Timeout: 10 secs, count: 1; packet From: Theo To: Cathy]

  14. Background: Switch Design • Problem with wildcard rules • Too general • Can’t find details of individual flows • Hard to do anything fine-grained [Diagram: Switch CPU+Mem, 35 Mbps link; rule: From: theo, to: ***, send on port 1, Timeout: 10 secs, count: 1]
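
The walkthrough on slides 7 to 14 can be sketched as a toy simulation. This is only an illustration under stated assumptions: the `Switch` and `Rule` classes and the two controller callbacks are hypothetical names, not part of OpenFlow, and timeouts are stored but never enforced. It compares exact-match rules (one packet-in per destination) with a wildcard rule (one packet-in in total, but a single aggregated count).

```python
from dataclasses import dataclass

WILDCARD = "***"  # wildcard marker, written as on the slides


@dataclass
class Rule:
    src: str
    dst: str            # a host name or WILDCARD
    out_port: int
    timeout_s: int = 10  # kept for illustration, not enforced here
    count: int = 0

    def matches(self, src: str, dst: str) -> bool:
        return self.src == src and self.dst in (WILDCARD, dst)


class Switch:
    """Toy switch: the flow table is a list, the controller a callback."""

    def __init__(self, controller):
        self.table = []
        self.controller = controller   # hypothetical: (src, dst) -> Rule
        self.packet_ins = 0

    def receive(self, src: str, dst: str) -> int:
        for rule in self.table:
            if rule.matches(src, dst):
                rule.count += 1        # found matching rule: update the count
                return rule.out_port   # and forward
        self.packet_ins += 1           # no match: packet-in to the controller
        rule = self.controller(src, dst)
        self.table.append(rule)        # flow_mod: insert the new rule
        rule.count += 1
        return rule.out_port


def exact_controller(src, dst):
    return Rule(src, dst, out_port=1)


def wildcard_controller(src, dst):
    return Rule(src, WILDCARD, out_port=1)


exact = Switch(exact_controller)
wild = Switch(wildcard_controller)
for dst in ("bruce", "john", "cathy"):
    exact.receive("theo", dst)
    wild.receive("theo", dst)

print(exact.packet_ins, len(exact.table))    # 3 3
print(wild.packet_ins, wild.table[0].count)  # 1 3
```

The wildcard switch contacts the controller once, but its single counter can no longer tell the three flows apart, which is the problem slide 14 names.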

  15. Background: Switch Design • Doing fine-grained things • Think Hedera • Find all elephant flows • Put elephant flows on a different path • How to do this? • Controller sends a get-stats request • Switch responds with all stats • Controller goes through each flow's stats • Installs special paths [Diagram: Switch CPU+Mem, 35 Mbps link; rules: From: theo, to: bruce, send on port 1, Timeout: 1 secs, count: 1K; From: theo, to: john, send on port 1, Timeout: 10 secs, count: 1]

  16. Background: Switch Design • Doing fine-grained things • Think Hedera • Find all elephant flows • Put elephant flows on a different path • How to do this? • Controller sends a get-stats request • Switch responds with all stats • Controller goes through each flow's stats • Installs special paths [Diagram: Switch CPU+Mem, 35 Mbps link; rules: From: theo, to: bruce, send on port 3, Timeout: 1 secs, count: 1K; From: theo, to: john, send on port 1, Timeout: 10 secs, count: 1]
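
The polling loop on slides 15 and 16 can be sketched as follows. The table layout, the function names, and the 1000-packet threshold are assumptions made for illustration; the slides do not say how an elephant flow is defined, and this is not Hedera's actual algorithm.

```python
ELEPHANT_THRESHOLD = 1000  # hypothetical cutoff; the slides give no number


def find_elephants(stats, threshold=ELEPHANT_THRESHOLD):
    """Return the flows whose packet count reaches the threshold."""
    return [flow for flow, count in stats.items() if count >= threshold]


def reroute(table, flows, alt_port):
    """Stand-in for the flow_mods that move flows onto another path."""
    for flow in flows:
        table[flow]["port"] = alt_port


# Flow table as drawn on slide 15 (count "1K" taken as 1000).
table = {
    ("theo", "bruce"): {"port": 1, "count": 1000},
    ("theo", "john"): {"port": 1, "count": 1},
}

# Get-stats reply: the switch returns every entry, large or small.
stats = {flow: entry["count"] for flow, entry in table.items()}

elephants = find_elephants(stats)
reroute(table, elephants, alt_port=3)

print(elephants)                         # [('theo', 'bruce')]
print(table[("theo", "bruce")]["port"])  # 3
```

Note that the controller has to receive and scan the stats for every flow just to find the one that matters.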

  17. Problems with Switches • TCAM is very small and can only support a small number of rules • Only 1K per switch; end hosts generate many more flows • Having the controller install an entry for each flow increases latency • Takes about 10 ms to install new rules • So the flow must wait! • Can install at a rate of 13 Mbps, but traffic arrives at 250 Gbps • Controller getting stats for all flows takes a lot of resources • For about 1K, you need about MB • If you request every 5 seconds then you total:

  18. Background: Switch Design [Diagram: Network Controller, 13 Mbps link, Switch CPU+Mem, 35 Mbps link, Hash Table and TCAM at 250 Gbps]

  19. Problems with Switches • TCAM is very small and can only support a small number of rules • Only 1K per switch; end hosts generate many more flows • Having the controller install an entry for each flow increases latency • Takes about 10 ms to install new rules • So the flow must wait! • Can install at a rate of 13 Mbps, but traffic arrives at 250 Gbps • Controller getting stats for all flows takes a lot of resources • For about 1K, you need about MB • If you request every 5 seconds then you total:
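
A quick back-of-the-envelope check of the latency and bandwidth figures above, as a sketch only. It assumes that "13Mbs" and "250Gbp" mean 13 Mbit/s and 250 Gbit/s, and that rules are installed one at a time; neither assumption is stated on the slides.

```python
INSTALL_LATENCY_MS = 10   # "about 10ms to install new rules"
CONTROL_RATE_BPS = 13e6   # "13Mbs", read here as 13 Mbit/s (assumption)
DATA_RATE_BPS = 250e9     # "250Gbp", read here as 250 Gbit/s (assumption)

# If installs happen strictly one after another, 10 ms each:
installs_per_second = 1000 / INSTALL_LATENCY_MS

# How much faster traffic arrives than the control path can carry it:
rate_gap = DATA_RATE_BPS / CONTROL_RATE_BPS

print(installs_per_second)  # 100.0
print(round(rate_gap))      # 19231
```

Under these assumptions a switch can set up roughly a hundred new flows per second, while the data plane outruns the control path by about four orders of magnitude.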

  20. Getting Around TCAM Limitations • Cloud-centric solutions • Use placement tricks • Data-center-centric solutions • Use an overlay; use placement tricks • General technique: DiFANE • Use detour routing

  21. DiFANE

  22. DiFANE • Creates a hierarchy of switches • Authoritative switches • Lots of memory • Collectively store all the rules • Local switches • Small amount of memory • Store a few rules • For unknown rules, route traffic to an authoritative switch

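  23. Packet Redirection and Rule Caching • First packet: the ingress switch redirects it to the authority switch, which forwards it to the egress switch • Feedback: the authority switch sends cache rules to the ingress switch • Following packets: hit cached rules at the ingress switch and are forwarded to the egress switch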

  24. Three Sets of Rules in TCAM • In ingress switches: reactively installed by authority switches • In authority switches: proactively installed by controller • In every switch: proactively installed by controller

  25. Stage 1 The controller proactively generates the rules and distributes them to authority switches.

  26. Partition and Distribute the Flow Rules [Diagram: the flow space, with accept and reject regions, is partitioned among Authority Switches A, B, and C; the controller distributes the partition information to the ingress, egress, and authority switches]

  27. Stage 2 The authority switches always keep packets in the data plane and reactively cache rules.

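  28. Packet Redirection and Rule Caching • First packet: the ingress switch redirects it to the authority switch, which forwards it to the egress switch • Feedback: the authority switch sends cache rules to the ingress switch • Following packets: hit cached rules at the ingress switch and are forwarded to the egress switch • A slightly longer path in the data plane is faster than going through the control plane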
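
The redirect-and-cache behaviour can be sketched with two small classes. This is a simplified model, not DiFANE's implementation: the class names are hypothetical, rules are exact-match dictionary entries rather than wildcard partitions, and the ingress cache is unbounded.

```python
class AuthoritySwitch:
    """Holds the full rule set for its share of the flow space."""

    def __init__(self, rules):
        self.rules = dict(rules)   # flow -> action, installed by controller
        self.redirected = 0

    def handle_redirect(self, flow):
        self.redirected += 1
        # Forward the packet in the data plane and hand the matching
        # rule back so the ingress switch can cache it.
        return self.rules[flow]


class IngressSwitch:
    """Small switch that caches rules learned from the authority switch."""

    def __init__(self, authority):
        self.authority = authority
        self.cache = {}

    def receive(self, flow):
        if flow in self.cache:
            return self.cache[flow], "cache hit"
        action = self.authority.handle_redirect(flow)   # first packet only
        self.cache[flow] = action                       # feedback: cache rule
        return action, "redirected"


authority = AuthoritySwitch({("theo", "bruce"): "forward to egress"})
ingress = IngressSwitch(authority)

results = [ingress.receive(("theo", "bruce")) for _ in range(3)]
print([how for _, how in results])   # ['redirected', 'cache hit', 'cache hit']
print(authority.redirected)          # 1
```

Only the first packet takes the detour; the controller is never contacted on the packet path.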

  29. Bin-Packing/Overlay

  30. Bin-Packing/Overlay

  31. Virtual Switch • A virtual switch has more memory than a hardware switch • So you can install a lot more rules in virtual switches • Create an overlay between virtual switches • Install fine-grained rules in virtual switches • Install normal OSPF rules in hardware • Can implement everything in the virtual switch • Has overlay drawbacks

  32. Bin-Packing in Data Centers • Insight: traffic is between certain servers • If those servers are placed together, their rules are only inserted in one switch
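
A small sketch of why placement matters. It assumes, purely for illustration, that a rule for a communicating pair has to be installed on the switch attached to each endpoint; the server and switch names are made up.

```python
def rule_entries(pairs, placement):
    """Count rule entries, assuming (hypothetically) that each
    communicating pair needs a rule on the switch of each endpoint."""
    total = 0
    for a, b in pairs:
        total += len({placement[a], placement[b]})  # 1 if co-located, else 2
    return total


pairs = [("web1", "db1"), ("web2", "db2")]

# Communicating servers spread across switches s1 and s2.
scattered = {"web1": "s1", "db1": "s2", "web2": "s2", "db2": "s1"}
# Communicating servers placed under the same switch.
packed = {"web1": "s1", "db1": "s1", "web2": "s2", "db2": "s2"}

print(rule_entries(pairs, scattered))  # 4
print(rule_entries(pairs, packed))     # 2
```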

  33. Getting Around CPU Limitations • Prevent controller from being in flow creation loop • Create clone rules • Prevent controller from being in decision loops • Create forwarding groups

  34. Clone Rules • Insert a special wildcard rule • When a packet arrives, the switch makes a micro-flow rule itself • Micro-flow inherits all properties of the wildcard rule [Diagram: Switch CPU+Mem, 35 Mbps link; rule: From: theo, to: ***, send on port 1, Timeout: 10 secs, count: 1; packet From: Theo To: Bruce]

  35. Clone Rules • Insert a special wildcard rule • When a packet arrives, the switch makes a micro-flow rule itself • Micro-flow inherits all properties of the wildcard rule [Diagram: Switch CPU+Mem, 35 Mbps link; rules: From: theo, to: ***, send on port 1, Timeout: 10 secs, count: 1; From: theo, to: Bruce, send on port 1, Timeout: 10 secs, count: 1; packet From: Theo To: Bruce]
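
One way to picture rule cloning is the sketch below. The `clone` flag, the class names, and the lookup order (exact rules before wildcards) are assumptions for illustration; no real switch API is implied.

```python
from dataclasses import dataclass, replace

WILDCARD = "***"


@dataclass
class Rule:
    src: str
    dst: str
    out_port: int
    timeout_s: int = 10
    count: int = 0
    clone: bool = False   # hypothetical flag marking a "special" wildcard


class Switch:
    def __init__(self):
        self.table = []

    def receive(self, src, dst):
        # Exact-match micro-flow rules are checked first.
        for rule in self.table:
            if rule.src == src and rule.dst == dst:
                rule.count += 1
                return rule.out_port
        # Otherwise a clone-enabled wildcard spawns a micro-flow rule
        # locally, with no packet-in to the controller.
        for rule in self.table:
            if rule.clone and rule.src == src and rule.dst == WILDCARD:
                micro = replace(rule, dst=dst, count=1, clone=False)
                self.table.append(micro)   # inherits port and timeout
                return micro.out_port
        return None   # genuine miss: would need the controller


sw = Switch()
sw.table.append(Rule("theo", WILDCARD, out_port=1, clone=True))
sw.receive("theo", "bruce")
sw.receive("theo", "bruce")

micro = sw.table[1]
print(len(sw.table), micro.dst, micro.out_port, micro.count)  # 2 bruce 1 2
```

Per-flow counters are now available again, yet the controller was involved only once, when it installed the wildcard.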

  36. Forwarding Groups • What happens when there’s a failure? • Port 1 goes down? • Switch must inform the controller • Instead, have backup ports • Each rule also states a backup [Diagram: Switch CPU+Mem, 35 Mbps link; rules: From: theo, to: ***, send on port 1, Timeout: 10 secs, count: 1; From: theo, to: Bruce, send on port 1, Timeout: 10 secs, count: 1]

  37. Forwarding Groups • What happens when there’s a failure? • Port 1 goes down? • Switch must inform the controller • Instead, have backup ports • Each rule also states a backup [Diagram: Switch CPU+Mem, 35 Mbps link; rules: From: theo, to: ***, send on port 1, backup: 2, Timeout: 10 secs, count: 1; From: theo, to: Bruce, send on port 1, backup: 2, Timeout: 10 secs, count: 1]
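
The backup-port idea amounts to a local fallback at forwarding time. A minimal sketch, with a hypothetical rule layout and function name:

```python
def pick_port(rule, ports_up):
    """Choose the output port locally, falling back to the backup."""
    if rule["port"] in ports_up:
        return rule["port"]
    if rule.get("backup") in ports_up:
        return rule["backup"]      # failover without asking the controller
    return None                    # both down: the controller must step in


rule = {"src": "theo", "dst": "***", "port": 1, "backup": 2}

print(pick_port(rule, ports_up={1, 2}))  # 1
print(pick_port(rule, ports_up={2}))     # 2
print(pick_port(rule, ports_up=set()))   # None
```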

  38. How do I do load balancing? • Something like ECMP? • Or server load balancing? • Currently: • Controller installs rules for each flow and does load balancing when installing • Controller can get stats and load balance later [Diagram: Switch CPU+Mem, 35 Mbps link; rules: From: theo, to: ***, send on port 1, Timeout: 10 secs, count: 1; From: theo, to: Bruce, send on port 1, Timeout: 10 secs, count: 1]

  39. Forwarding Groups • Instead, have port groups • Each rule specifies a group of ports to send on • When a micro-rule is created • Switch can assign ports to micro-rules • In a round-robin manner • Or based on probability [Diagram: Switch CPU+Mem, 35 Mbps link; rules: From: theo, to: ***, send on port 1,2,4, Timeout: 10 secs, count: 1; From: theo, to: Bruce, send on port 1, Timeout: 10 secs, count: 1]

  40. Forwarding Groups • Instead, have port groups • Each rule specifies a group of ports to send on • When a micro-rule is created • Switch can assign ports to micro-rules • In a round-robin manner • Or based on probability [Diagram: Switch CPU+Mem, 35 Mbps link; rules: From: theo, to: ***, send on port 1 (10%), 2 (90%), Timeout: 10 secs, count: 1; From: theo, to: Bruce, send on port 2, Timeout: 10 secs, count: 1]
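
The two assignment policies can be sketched with the standard library. The function names are hypothetical; the port group 1, 2, 4 and the 10%/90% split are taken from the slides.

```python
import itertools
import random


def round_robin_assigner(ports):
    """Hand out ports from the group in turn, one per new micro-rule."""
    cycle = itertools.cycle(ports)
    return lambda: next(cycle)


def weighted_assigner(weights, rng):
    """Pick a port for each new micro-rule with the given probabilities."""
    ports = list(weights)
    probs = list(weights.values())
    return lambda: rng.choices(ports, weights=probs, k=1)[0]


next_rr = round_robin_assigner([1, 2, 4])
rr_ports = [next_rr() for _ in range(4)]
print(rr_ports)   # [1, 2, 4, 1]

rng = random.Random(0)   # seeded only to make the run repeatable
next_weighted = weighted_assigner({1: 0.10, 2: 0.90}, rng)
picks = [next_weighted() for _ in range(1000)]
print(picks.count(1), picks.count(2))   # roughly a 10/90 split
```

Either way the switch decides per micro-rule, so the controller does not need to be consulted for each new flow.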

  41. Getting Around CPU Limitations • Prevent controller from polling switches • Introduce triggers: • Each rule has a trigger and sends stats to the controller when the threshold is reached • E.g. if over 20 pkts match flow, • Benefits of triggers: • Reduces the number of entries being returned • Limits the amount of network traffic
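
A trigger can be modelled as a per-rule threshold that pushes stats to the controller instead of waiting to be polled. In this sketch the class name and the report callback are hypothetical, and reporting only once per rule is an assumption; the 20-packet threshold is the example from the slide.

```python
class TriggeredRule:
    """Rule that reports its own stats once a packet threshold is passed."""

    def __init__(self, match, threshold, report):
        self.match = match
        self.threshold = threshold
        self.report = report     # hypothetical callback to the controller
        self.count = 0
        self.reported = False

    def on_packet(self):
        self.count += 1
        if self.count > self.threshold and not self.reported:
            self.reported = True
            self.report(self.match, self.count)


reports = []   # stands in for messages arriving at the controller
busy = TriggeredRule(("theo", "bruce"), 20, lambda m, c: reports.append((m, c)))
quiet = TriggeredRule(("theo", "john"), 20, lambda m, c: reports.append((m, c)))

for _ in range(25):
    busy.on_packet()
quiet.on_packet()

print(reports)   # [(('theo', 'bruce'), 21)]
```

Only the flow that crossed the threshold generates a message; the quiet flow never appears in the controller's input.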

  42. Summary • Switches have several limitations • TCAM space • Switch CPU • Interesting ways to reduce limitations • Place more responsibility in the switch • Introduce triggers • Have switch create micro-flow rules from general rules
