SDN Controller Challenges

PowerPoint Slideshow about 'SDN Controller Challenges' - eliana-allen


Presentation Transcript
The Story Thus Far
  • SDN centralizes the network’s control plane
    • The controller is effectively the brain of the network
    • The controller determines what to do and tells the switches how to do it.
The Story Thus Far

Something happened!

The Story Thus Far

Let’s ask the brain!

The Story Thus Far

Think about what happened…

Maybe come up with a solution

The Story Thus Far
  • Controller runs control function
  • Control function creates switch state
    • F(global network state) → switch state
    • The global network state can be a graph of the network

Tell the network what to do
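
The mapping F(global network state) → switch state can be illustrated with a small sketch. It assumes the global state is an unweighted graph (a dict of adjacency lists) and that the switch state is a next-hop table per switch; the function name `control_function` and the shortest-path policy are illustrative assumptions, not something the slides specify.

```python
from collections import deque


def control_function(graph):
    """Map a global network graph to per-switch forwarding state.

    graph: dict mapping each switch to a list of neighboring switches.
    Returns a dict: switch -> {destination: next hop on a shortest path}.
    """
    switch_state = {}
    for src in graph:
        first_hop = {}          # destination -> neighbor of src to forward to
        visited = {src}
        queue = deque()
        # Seed the BFS with the direct neighbors of src.
        for nbr in graph[src]:
            if nbr not in visited:
                visited.add(nbr)
                first_hop[nbr] = nbr
                queue.append(nbr)
        # Every node discovered later inherits the first hop of its parent.
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in visited:
                    visited.add(nbr)
                    first_hop[nbr] = first_hop[node]
                    queue.append(nbr)
        switch_state[src] = first_hop
    return switch_state


# Hypothetical three-switch line topology: s1 - s2 - s3.
topology = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}
state = control_function(topology)
```

Here the controller would push `state["s1"]` to switch s1, and so on; a real controller would translate these tables into OpenFlow entries.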

Challenges with Centralization
  • Single point of failure
    • Fault tolerance
  • Performance bottleneck
    • Scalability
    • Efficiency (switch-controller latency)
  • Single point for security violations
Motivation for Distributed Controllers
  • Wide-area network
    • Wide distribution of switches: from the USA to Australia.
    • High latency between one controller and all switches
  • Application + Network growth
    • Higher CPU load for controller
    • More memory for storing FIB entries and calculations
  • High availability
Class Outline
  • Fault Tolerance
    • Google’s B4 paper
  • Controller Scalability
    • Ways to scale the controller
    • Distributed controllers: Mesh Versus Hierarchy
    • Implications of controller placement
Google’s B4 Network
  • Provides connectivity between DC sites
  • Uses SDN to control edge switches
  • Goal: high utilization of links
  • Insight: fine-grained control over edge and network can lead to higher utilization
  • Distributed Controllers
    • One set of controllers per data center (site)
Fault Tolerance in B4
  • Each site runs a set of controllers
  • Paxos runs among the controllers in a site to elect the master
Quick Overview of Paxos
  • Given N controllers
    • One acts as leader and the other N-1 act as workers
    • All N controllers maintain the same state
  • Switches interact with the leader
  • A change does not take effect until the group agrees (Paxos requires a majority of the N controllers)
  • Failure of the leader
    • The remaining N-1 controllers elect a new leader
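
The agreement step can be sketched as one round of single-decree Paxos. This is a minimal in-memory sketch under stated assumptions: there is no networking or persistence, failures are modeled by a hypothetical `alive` flag, and the names `Acceptor` and `propose` are illustrative. It is not how B4 implements master election.

```python
class Acceptor:
    """One controller's Paxos acceptor state (in memory only)."""

    def __init__(self, alive=True):
        self.alive = alive
        self.promised = -1       # highest proposal number promised so far
        self.accepted_n = -1     # number of the accepted proposal, if any
        self.accepted_v = None   # value of the accepted proposal, if any

    def prepare(self, n):
        # Phase 1: promise to ignore proposals numbered below n.
        if self.alive and n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_v
        return False, -1, None

    def accept(self, n, value):
        # Phase 2: accept unless a higher-numbered promise was made.
        if self.alive and n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_v = value
            return True
        return False


def propose(acceptors, n, value):
    """Run one proposal round; return the chosen value, or None on failure."""
    majority = len(acceptors) // 2 + 1
    promises = [a.prepare(n) for a in acceptors]
    granted = [(an, av) for ok, an, av in promises if ok]
    if len(granted) < majority:
        return None
    # If any acceptor already accepted a value, adopt the highest-numbered one.
    prior_n, prior_v = max(granted, key=lambda p: p[0])
    if prior_n >= 0:
        value = prior_v
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks >= majority else None
```

With three controllers, `propose(group, 1, "controller-A")` picks controller-A as master; a later `propose(group, 2, "controller-B")` returns controller-A again, because a chosen value is never overwritten. If two of the three acceptors are down, the round returns `None`.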

Figure labels: propagate state changes; network events.

Pros-Cons of Paxos
  • Pros
    • Well understood and studied; gives good fault tolerance
    • Many implementations in the wild
    • E.g. ZooKeeper (strictly, it uses Zab, a Paxos-like protocol)
  • Cons
    • Time to recover
    • Impacts the throughput of the entire system
What limits a controller’s scalability?
  • Number of control messages from switches
    • Depends on the application logic
      • E.g. MicroTE/Hedera periodically query all switches for stats
      • A reactive controller (as evaluated in NOX) requires each switch to send a message for every new flow
    • Packet-in (for reactive apps)
    • Flow stats, flow timeouts
What limits a controller’s scalability?
  • Application processing overhead
  • The controller runs a set of applications
    • Similar to a server running a set of programs
    • CPU and memory constraints limit how the apps run
What limits a controller’s scalability?
  • Distance between controller and the switches

Figure: Controller 1 running the Hedera, L3, and FW applications.

How to Scale the Controller.
  • Obvious: add more controllers.
  • BUT: how about the applications?
    • Synchronization/concurrency problems.
      • Who controls which switch?
      • Who reacts to which events?

Figure: Controller 1, Controller 2, and Controller N each run Hedera, L3, and FW; question marks on the links labeled "Stats + Install OF entries".

Medium Sized Networks
  • Assumption:
    • A controller can’t store all forwarding table entries in memory
    • But it can process all events and run all apps
  • Each controller
    • Receives the same network events and runs the same apps → produces the same output
    • But stores the output for, and configures, only a fraction of the switches
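
One way to picture "same output, but store only a fraction" is a deterministic switch-to-controller mapping. The sketch below assumes a hash-based assignment; the slides do not say how the fraction is chosen, and the names `owner` and `entries_to_keep` are hypothetical.

```python
import hashlib


def owner(switch_id, num_controllers):
    """Deterministically map a switch to the controller that configures it."""
    # hashlib gives the same answer on every controller, unlike Python's
    # built-in hash(), which is randomized per process for strings.
    digest = hashlib.sha256(switch_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_controllers


def entries_to_keep(controller_id, num_controllers, full_output):
    """Keep only the computed entries for the switches this controller owns.

    full_output: dict switch_id -> list of forwarding entries, as computed
    identically by every controller from the same events.
    """
    return {
        sw: rules
        for sw, rules in full_output.items()
        if owner(sw, num_controllers) == controller_id
    }
```

Because every controller evaluates `owner` identically, the kept fractions are disjoint and together cover all switches without any coordination.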

Figure: Controller 1, Controller 2, and Controller N each run Hedera, L3, and FW. Label: Stats + Install OF entries.

Medium Sized Networks: HyperFlow
  • Each controller
    • Pushes its state to every other controller
    • Thinks it is the only controller in the network

Publish-subscribe system
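
The publish-subscribe idea can be sketched with an in-memory event bus. This is an illustrative toy, assuming synchronous delivery and hypothetical `EventBus` and `Controller` classes; it is not HyperFlow's actual implementation.

```python
class EventBus:
    """Toy publish-subscribe channel shared by all controllers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, event):
        # Deliver the event to every subscriber, including the publisher.
        for callback in self.subscribers:
            callback(event)


class Controller:
    def __init__(self, name, bus):
        self.name = name
        self.bus = bus
        self.view = {}           # (switch, port) -> link status
        bus.subscribe(self.apply)

    def local_event(self, switch, port, status):
        # An event seen on this controller's own switches is published
        # rather than applied directly, so all replicas stay identical.
        self.bus.publish((switch, port, status))

    def apply(self, event):
        switch, port, status = event
        self.view[(switch, port)] = status


bus = EventBus()
c1, c2 = Controller("c1", bus), Controller("c2", bus)
c1.local_event("s1", 1, "up")
c2.local_event("s2", 3, "down")
```

After these two events both controllers hold the same network-wide view, which is what lets each application behave as if it ran on a single controller.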

Figure: Controller 1, Controller 2, and Controller N each run Hedera, L3, and FW. Label: Stats + Install OF entries.

Large Sized Networks
  • Assumptions
    • Each controller can’t store all the FIB entries
    • Each controller can’t run the entire application or handle all events
  • Need to partition the application
    • But how?
Application partition 1
  • Approach 1: each controller runs a specific application
    • How do you resolve conflicts in FW entries?
    • Apps can conflict in the rules they install

Figure: the Hedera, L3, and FW applications spread across Controller 1, Controller 2, and Controller N.

Application partition 2
  • Approach 2: all controllers run the same application but for a subset of devices
    • Results in a Distributed Mesh control plane

Figure: Controller 1, Controller 2, and Controller N each run Hedera, L3, and FW and share an abstract network view.

Application Partition 2
  • Controllers exchange abstract views with each other
    • The abstract view reduces the network information used by each controller
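
A rough sketch of such an abstract view: a controller keeps its own switches in full detail and collapses every other domain into a single logical node. The function name `view_for` and the one-node-per-domain abstraction are assumptions for illustration; domain names are assumed not to collide with switch names.

```python
def view_for(own_domain, links, domain_of):
    """Build one controller's view of the network.

    links: iterable of (switch, switch) pairs in the real network.
    domain_of: dict mapping each switch to the domain that controls it.
    Returns a set of undirected links, each a frozenset of two node names.
    """
    def node(switch):
        # Own switches stay visible; foreign ones collapse to their domain.
        if domain_of[switch] == own_domain:
            return switch
        return domain_of[switch]

    view = set()
    for u, v in links:
        a, b = node(u), node(v)
        if a != b:               # links internal to a foreign domain vanish
            view.add(frozenset((a, b)))
    return view


# Hypothetical network: three domains of two switches each, in a chain.
links = [("a1", "a2"), ("a2", "b1"), ("b1", "b2"), ("b2", "c1"), ("c1", "c2")]
domain_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C", "c2": "C"}
view_b = view_for("B", links, domain_of)
```

The controller for domain B sees three links instead of five: the internals of domains A and C are hidden behind the abstraction.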

Figure: the real network versus Controller 2's view of the network, in which the domains of Controller 1 and Controller N appear as abstractions provided by those controllers.

ONIX to the SDN Programmer
  • Controllers synchronize through a DB or DHT
    • So each app needs synchronization code.
    • How do you deal with concurrency?
  • How do you synchronize between domains?
  • How many domains? Or controllers?
  • How many switches in a domain?
Application partition 3
  • Approach 3: divide the application into local and global parts.
    • Results in a hierarchical control plane
  • Global controller and local controllers
    • Applications that do not need network-wide state
      • Can be run locally without communicating with other controllers
Are Hierarchical Controllers Feasible?
  • Examples of local applications:
    • Link discovery, learning switch, local policies
  • Examples of local portions of a global algorithm
    • Data center traffic engineering
      • Elephant flow detection (Hedera)
      • Predictability detection (MicroTE)
  • Local apps/controllers have other benefits
    • High parallelism
    • Can be run closer to the devices.
Kandoo: Hierarchical controllers
  • 2 levels of controllers: global and local
    • Local applications are embarrassingly parallel
    • Local shields global from network events

Figure: a Global Controller running Hedera above Controller 1, Controller 2, and Controller N, each running Hedera, L3, and FW.

Kandoo: Hierarchical controllers
  • Local controllers: run local apps
    • Return an abstract view to the global controller
    • Reduce the number of events sent to the global controller and the size of the network it sees

Kandoo: Hierarchical controllers
  • Global controller
    • Runs global apps, i.e., apps that need network-wide state

Hedera Reminder
  • Goal: reduce network contention
  • Insight: contention happens when elephants share paths.
  • Solution:
    • Detect elephant flows
    • Place elephant flows on different paths
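
The two steps can be sketched as follows. The rate threshold is a free parameter here, and the greedy least-loaded placement is a simple stand-in for illustration, not Hedera's actual placement algorithm; the function names are hypothetical.

```python
def detect_elephants(flow_rates, threshold):
    """Return the flows whose measured rate is at or above the threshold."""
    return {flow: rate for flow, rate in flow_rates.items() if rate >= threshold}


def place(elephants, paths):
    """Greedily assign each elephant flow to the currently least-loaded path."""
    load = {path: 0.0 for path in paths}
    placement = {}
    # Place the largest flows first so they are spread out before small ones.
    for flow, rate in sorted(elephants.items(), key=lambda kv: -kv[1]):
        best = min(paths, key=lambda p: load[p])
        placement[flow] = best
        load[best] += rate
    return placement


# Hypothetical rates in Mbit/s; f3 is a mouse flow and is left alone.
rates = {"f1": 900.0, "f2": 800.0, "f3": 5.0}
elephants = detect_elephants(rates, threshold=100.0)
placement = place(elephants, ["p1", "p2"])
```

The two elephants end up on different paths, which is the contention the slide describes avoiding.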
Implementing Hedera in Onix
  • 2 levels of controllers: global and local
    • Local applications are embarrassingly parallel
    • Local shields global from network events

Figure: Controller 1 and Controller 2 each run Hedera (detection + placement), exchange TM + detection results, collect stats, and install flow table entries.

Implementing Hedera in Kandoo
  • Local controllers: collect stats from the network and detect elephant flows
  • Global controller: decides flow placement and installs flows
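
The local/global split can be sketched with two small classes. This is only an illustration of the division of labor: the class and method names are hypothetical and do not reflect Kandoo's real API, and the least-loaded placement is a placeholder policy.

```python
class GlobalController:
    """Global app: needs network-wide state (the load on every path)."""

    def __init__(self, paths):
        self.load = {path: 0.0 for path in paths}
        self.installed = {}      # flow -> path chosen for it
        self.events_seen = 0

    def on_elephant(self, flow, rate):
        self.events_seen += 1
        path = min(self.load, key=self.load.get)   # least-loaded path
        self.load[path] += rate
        self.installed[flow] = path


class LocalController:
    """Local app: sees every stats reply, forwards only elephants upward."""

    def __init__(self, root, threshold):
        self.root = root
        self.threshold = threshold
        self.stats_seen = 0

    def on_stats(self, flow, rate):
        self.stats_seen += 1
        if rate >= self.threshold:
            self.root.on_elephant(flow, rate)


root = GlobalController(["p1", "p2"])
local_a = LocalController(root, threshold=100.0)
local_b = LocalController(root, threshold=100.0)
for flow, rate in [("f1", 900.0), ("f2", 3.0), ("f3", 1.0)]:
    local_a.on_stats(flow, rate)
for flow, rate in [("f4", 700.0), ("f5", 2.0)]:
    local_b.on_stats(flow, rate)
```

In this run the local controllers absorb five stats events while the global controller handles only two, which is the shielding effect described for the hierarchy.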

Figure: Controller 1, Controller 2, and Controller N collect stats and run elephant detection; they inform the Global Controller (Hedera: global placement) of elephant flows, and it installs new flow table entries.

Implementing B4 in a Kandoo-like Architecture
  • Local controllers: collect stats from the network and determine demand
  • Global controller: calculates paths for traffic

Figure: a Global Controller (TE + BW allocator, TE DB) installs TE ops; Site Controller, Site Controller 2, and Site Controller N run elephant detection, inform the Global Controller of flow demands, collect stats, and install OF entries.

Kandoo to the SDN Programmer
  • Think of what is local and what is global
    • When apps are written, annotate local ones with a local flag
  • Kandoo automatically places local apps on local controllers
    • And global apps on the global controller.
  • Kandoo restricts messages between global and local controllers
    • You can’t send OpenFlow-style messages
    • You must send Kandoo-style messages
Summary
  • Centralization provides simplicity at the cost of reliability and scalability
  • Replication can improve reliability and scalability
  • For reliability, Paxos is an option
  • For scalability, divide and conquer
    • Partition the applications
      • Kandoo: Local apps and global apps
    • Partition the network
      • Onix: each controller controls a subset of switches (Domain)