A clean slate 4d approach to network control and management
A Clean Slate 4D Approach to Network Control and Management. Hui Zhang, Carnegie Mellon University. Joint work with Hemant Gogineni, Albert Greenberg, Gisli Hjalmtysson, David Maltz, Andy Myers, Eugene Ng, Jennifer Rexford, Geoffrey Xie, Hong Yan, Jibin Zhan.

Presentation Transcript

A Clean Slate 4D Approach to Network Control and Management

Hui Zhang

Carnegie Mellon University

Joint work with

Hemant Gogineni, Albert Greenberg, Gisli Hjalmtysson, David Maltz, Andy Myers, Eugene Ng, Jennifer Rexford, Geoffrey Xie, Hong Yan, Jibin Zhan

A Conventional View of a Network


  • Physical topology is a graph of nodes and links

  • Run Dijkstra to find route to each node
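The conventional view above can be sketched in a few lines of Python. This is a minimal illustration only: the four-router topology and its link weights are made up for the example and do not come from the slides.

```python
import heapq

def dijkstra(graph, source):
    """Return the least-cost distance from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

# Hypothetical topology: nodes are routers, edge values are link weights.
TOPOLOGY = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
DISTANCES = dijkstra(TOPOLOGY, "A")
```

The graph alone determines these distances. It carries no information about packet filters, NAT, or tunnels, so it cannot show whether two hosts are actually able to communicate.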










A Conventional View of a Network





Knowing how the routers are connected says little about how two hosts are communicating






[Figure labels: Configuration File; Access Control Table; NAT Table; Tunnel Table; Forwarding Table]

A Study of Operational Production Networks

  • How complicated/simple are real control planes?

    • What is the structure of the distributed system?

  • Use reverse-engineering methodology

    • There are few or no documents

    • The ones that exist are out-of-date

  • Anonymized configuration files for 31 active networks (>8,000 configuration files)

    • 6 Tier-1 and Tier-2 Internet backbone networks

    • 25 enterprise networks

    • Sizes between 10 and 1,200 routers

    • 4 enterprise networks significantly larger than the backbone networks

Router Configuration Files

interface Ethernet0
 ip address
interface Serial1/0.5 point-to-point
 ip address
 ip access-group 143 in
 frame-relay interface-dlci 28
router ospf 64
 redistribute connected subnets
 redistribute bgp 64780 metric 1 subnets
 network area 0
router bgp 64780
 redistribute ospf 64 match route-map 8aTzlvBrbaW
 neighbor remote-as 12762
 neighbor distribute-list 4 in
access-list 143 deny
access-list 143 permit any
route-map 8aTzlvBrbaW deny 10
 match ip address 4
route-map 8aTzlvBrbaW permit 20
 match ip address 7
ip route
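Reverse-engineering a control plane from files like this one largely means following the references between stanzas. The sketch below shows one way to do that and is not the tooling used in the study. It assumes the usual one-space indentation of sub-commands, leaves out the addresses because the transcript does not contain them, and covers only the four kinds of reference visible in the excerpt.

```python
import re

# Excerpt of the configuration above; operands missing from the transcript stay missing.
CONFIG = """\
interface Serial1/0.5 point-to-point
 ip access-group 143 in
router ospf 64
 redistribute bgp 64780 metric 1 subnets
router bgp 64780
 redistribute ospf 64 match route-map 8aTzlvBrbaW
 neighbor distribute-list 4 in
access-list 143 deny
access-list 143 permit any
route-map 8aTzlvBrbaW deny 10
 match ip address 4
route-map 8aTzlvBrbaW permit 20
 match ip address 7
"""

# A top-level line starting with one of these keywords defines a named object.
DEFINITION = re.compile(r"^(access-list|route-map) (\S+)")
# Commands that refer to an object defined elsewhere in the file.
REFERENCES = [
    ("access-list", re.compile(r"\bip access-group (\S+)")),
    ("access-list", re.compile(r"\bdistribute-list (\S+)")),
    ("access-list", re.compile(r"\bmatch ip address (\S+)")),
    ("route-map", re.compile(r"\broute-map (\S+)")),
]

def cross_reference(config):
    """Return (defined, referenced) sets of (kind, name) pairs."""
    defined, referenced = set(), set()
    for line in config.splitlines():
        match = DEFINITION.match(line)
        if match:
            defined.add((match.group(1), match.group(2)))
            continue
        for kind, pattern in REFERENCES:
            match = pattern.search(line)
            if match:
                referenced.add((kind, match.group(1)))
    return defined, referenced

DEFINED, REFERENCED = cross_reference(CONFIG)
# Referenced but not defined in this excerpt (not necessarily in the full file).
DANGLING = sorted(REFERENCED - DEFINED)
```

Even in this short excerpt, an interface filter, an OSPF process, a BGP process, and a route map all depend on one another through names and numbers.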

Limitation of Today’s Control & Management: Complex System of Systems

  • Systems are designed as components to be used in larger systems in different contexts, for different purposes, interacting with different components

    • Example: OSPF and BGP are complex systems in their own right, yet they are also components of a network's routing system, interacting with each other and with packet filters …

  • Complex configuration to enable flexibility

    • The glue has tremendous impact on network performance

    • State of the art: multiple interacting distributed programs written in assembly language

  • Lack of intellectual framework to understand global behavior

Are We Going in the Right Direction?

  • IP Control Plane function overloading

    • Reachability

    • Policy control

    • Resiliency and survivability

    • Traffic Engineering, load balancing

    • VPN

  • Ethernet control plane overloading

    • Spanning Tree, RSTP, MSTP, VLAN, …

  • Control & management complexity works against robustness, dependability, security

Limitation of Today’s Control & Management: Difficult to Implement Interesting Policies

  • Network designers want “simple” things, but achieving them is incredibly hard

Data Center Infrastructure


Limitation of Today’s Control & Management: Difficult to Implement Interesting Policies

  • Difficult for designers to express desired behaviors

Limitation of Today’s Control & Management: Hard to Coordinate Multiple Low-Level Mechanisms to Achieve Network-Wide Goals

  • Lack of higher-level specification of network-wide goals

    • Load balancing objectives vs. per-link OSPF weights

    • Reachability matrix vs. per-interface access control lists

  • Difficult to dynamically coordinate multiple mechanisms

    • Forwarding, access control, NAT, tunnel management

Limitation of Today’s Control & Management: Lack of Robust Management Communication Channel

  • Circular dependencies between management plane, control plane, data plane

    • Management and control traffic travels along the very paths it is meant to maintain

  • Operational networks usually require a separate control/management network

    • What should be the architecture of a converged network?


Fundamental Problem Never Goes Away: The GENI Management Problem

  • How does the GMC communicate with each node?

    • Fundamental architectural issue that affects overall network manageability and security

[Figure labels: Slice Manager; Resource Controller; Auditing Archive; Virtualization SW; Substrate HW]

A Strawman Solution

  • GMC uses the Internet to manage GENI elements

    • GENI elements expose external IP addresses that can be reached from the Internet

    • Management traffic between GMC and GENI elements traverses the Internet

Issues with the Strawman

  • Not consistent with the architectural vision of a single converged network for the future

    • Management depends on a separate network

  • Management plane is susceptible to the security and availability risks prevalent in the current Internet

    • DDoS attacks target external IP interfaces of GENI elements

    • GENI elements only as secure and available as the Internet

Summary: Limitations of Today’s Control & Management Plane

  • High complexity

  • Difficult to implement non-trivial control policies

  • Difficult to dynamically coordinate control logics

  • Lack of robust control communication channel

  • Root cause:

    • Wrong partitioning of functionality in the control/management plane, which leads to lack of flexibility and high complexity

Incremental Approach To Address Problem

[Figure labels, each repeated three times: Configuration File; Access Control; NAT Table; Tunnel Table; Forwarding Table]

  • Challenges of using a sophisticated management plane to manipulate the control plane via the configuration interface

    • Configuration interface too primitive

    • Control plane internal logic too complex

[Figure labels: Convert to; Control plane; Routing Logic; Config commands]

Good Abstractions Reduce Complexity



All decision-making logic is lifted out of the control plane

  • Routers no longer make routing decisions

  • Dissemination plane provides robust communication to/from data plane switches




Control Plane




Data Plane

Data Plane

A Clean-Slate Approach: The 4D Architecture

Generating table entries

Decision Plane

Routing Table

Access Control Table

NAT Table

Tunnel Table


Install table entries

Discovery Plane

Modeled as a set of tables

Data Plane
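The division of labour among the planes can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the 4D implementation: the class and function names, the four-switch ring, and the use of breadth-first search for path selection are all chosen for illustration.

```python
from collections import deque

class Switch:
    """Data plane: nothing but tables that some other plane fills in."""
    def __init__(self, name):
        self.name = name
        self.forwarding_table = {}  # destination -> next hop

def discover(links):
    """Discovery plane: turn observed links into an adjacency map."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def decide(adjacency):
    """Decision plane: compute every switch's forwarding table in one place."""
    tables = {node: {} for node in adjacency}
    for dest in adjacency:
        # BFS outward from dest; the node a switch was reached from is its next hop.
        queue, seen = deque([dest]), {dest}
        while queue:
            node = queue.popleft()
            for neighbor in sorted(adjacency[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    tables[neighbor][dest] = node
                    queue.append(neighbor)
    return tables

def disseminate(tables, switches):
    """Dissemination plane: install the computed entries on each switch."""
    for name, table in tables.items():
        switches[name].forwarding_table = dict(table)

# Hypothetical four-switch ring.
LINKS = [("s1", "s2"), ("s2", "s3"), ("s3", "s4"), ("s1", "s4")]
SWITCHES = {name: Switch(name) for name in ("s1", "s2", "s3", "s4")}
disseminate(decide(discover(LINKS)), SWITCHES)
```

The switches contain no routing logic at all. Replacing `decide` with a different algorithm changes network behaviour without touching the other three functions.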

What is Hard and What is Not

  • Hard

    • Distributed algorithms for sophisticated functions

    • Distributed algorithms and protocols that are extensible

    • Consensus on the right protocol

  • Not Hard

    • Centralized algorithms

    • Basic protocol for Decision Element (DE) recovery

    • Robust and configuration-free protocol that provides reachability

4D Separates Distributed Computing Issues from Networking Issues

  • Distributed computing issues: protocols and network architecture

    • Overhead

    • Resiliency

    • Scalability

  • Networking issues: decision logic

    • Traffic engineering and service provisioning

    • Egress point selection

    • Tunnel management

    • Reachability control (VPNs)

    • Precomputation of backup paths

Example – 4D Approach to Reachability Control

Reachability matrix

Decision Plane

  • Reachability matrix directly expresses intended goal

  • Path computation can jointly balance load and obey reachability constraints

  • Packet filters installed only where needed, and changed when routing changes

Path Computation

Traffic Matrix



Discovery/Dissemination Plane

Load info

Data Plane
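A small Python sketch can show how a decision plane might derive packet filters from a reachability matrix and move them when routing changes. The placement rule used here, filtering on the destination router's interface facing the current path, is an assumption made for the example, as are the router names and the topology.

```python
from collections import deque

def build(links):
    """Adjacency map from a list of undirected links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def shortest_path(adjacency, src, dst):
    """BFS shortest path as a list of routers, or None if dst is unreachable."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

def place_filters(adjacency, reachability):
    """One filter per denied pair: (router, inbound neighbor, blocked source)."""
    filters = set()
    for (src, dst), allowed in reachability.items():
        if allowed:
            continue
        path = shortest_path(adjacency, src, dst)
        if path is not None and len(path) > 1:
            filters.add((dst, path[-2], src))
    return filters

# Hypothetical goal: A must not reach D, C may.
REACHABILITY = {("A", "D"): False, ("C", "D"): True}
LINKS = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "E"), ("E", "D")]
BEFORE = place_filters(build(LINKS), REACHABILITY)
# Link B-D fails: traffic from A now arrives at D via E, so the filter moves.
AFTER = place_filters(build([l for l in LINKS if l != ("B", "D")]), REACHABILITY)
```

The operator states only the matrix. Where the filter sits is recomputed from the current paths, and is no longer configured by hand on each interface.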

Example: 4D Enables Simpler and Better Traffic Engineering


  • OSPF normally calculates a single path to each destination D

  • OSPF allows load-balancing only for equal-cost paths to avoid loops

  • Using ECMP requires careful engineering of link weights


Decision Plane with network-wide view can do more sophisticated optimization
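The equal-cost restriction is easy to see in a sketch. The Python below is not OSPF itself: it only computes, for one source, which neighbors lie on a least-cost path to the destination. The topologies and weights are made up for the example.

```python
import heapq

def make_graph(links):
    """Symmetric weighted graph from (a, b, weight) triples."""
    graph = {}
    for a, b, weight in links:
        graph.setdefault(a, {})[b] = weight
        graph.setdefault(b, {})[a] = weight
    return graph

def distances_to(graph, dest):
    """Dijkstra from dest, giving each node's least cost to reach dest."""
    dist = {dest: 0}
    heap = [(0, dest)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale queue entry
        for neighbor, weight in graph[node].items():
            nd = d + weight
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(heap, (nd, neighbor))
    return dist

def ecmp_next_hops(graph, source, dest):
    """Neighbors of source that lie on some least-cost path to dest."""
    dist = distances_to(graph, dest)
    return sorted(
        neighbor for neighbor, weight in graph[source].items()
        if neighbor in dist and weight + dist[neighbor] == dist[source]
    )

# Two paths S-A-D and S-B-D with identical total weight: both are usable.
BALANCED = make_graph([("S", "A", 1), ("A", "D", 1), ("S", "B", 1), ("B", "D", 1)])
# Raising one link weight by 1 leaves the second path unused.
SKEWED = make_graph([("S", "A", 1), ("A", "D", 1), ("S", "B", 2), ("B", "D", 1)])
```

With `BALANCED`, `ecmp_next_hops(BALANCED, "S", "D")` returns both neighbors. With `SKEWED` it returns only `"A"`, even though the path through `B` still has spare capacity. A decision plane that sees the whole network is free to split traffic across unequal paths.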

Example: Meta-Management for GENI

  • A thin layer of meta-control software pervades the GENI network, boots up GENI elements, connects them to the GMC

    • GENI nodes are able to bootstrap with zero pre-configuration

  • The meta-control layer exposes an IP protocol stack interface to the CM and GMC so that all existing management software can run unmodified

  • The meta-control protocol uses the same GENI physical resources but runs independently from native data protocols

  • The design of meta-control protocol is optimized for high robustness and security

    • Performance is a lesser consideration

    • Can implement stronger security mechanisms such as hop-by-hop authentication to prevent man-in-the-middle attacks

    • Can implement robust protocols such as source routing or gossip

Using the 4D Architecture

  • Install a security key on each device

  • Connect them together

  • Connect Decision Elements

Example network with 49 switches and 5 DEs

Does it work? Yes.

  • 4D designed so performance can be predicted

  • Recovers from single link failure in < 120 ms

    • < 1 s response considered “excellent”

    • Faster forwarding reconvergence possible

  • Survives failure of master Decision Element

    • New DE takes control within 170 ms

    • No disruption unless second fault occurs

  • Gracefully handles complete network partitions

    • Less than 170 ms of outage

    • At no point did two DEs attempt to master the same switch
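The slides do not say how a master Decision Element is chosen, so the Python sketch below uses a stand-in rule to illustrate the "one master per switch" property: within each connected partition, the surviving DE with the lowest name takes control. The rule, the names, and the chain topology are all assumptions.

```python
def build(links):
    """Adjacency map from a list of undirected links."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def partitions(adjacency, failed):
    """Connected components of the network, ignoring failed nodes."""
    seen, parts = set(failed), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, part = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            part.add(node)
            stack.extend(adjacency[node])
        parts.append(part)
    return parts

def assign_masters(adjacency, decision_elements, failed=frozenset()):
    """Map each switch to the single DE that masters it (None if none is reachable)."""
    masters = {}
    for part in partitions(adjacency, failed):
        alive = sorted(de for de in decision_elements if de in part)
        master = alive[0] if alive else None  # assumed rule: lowest name wins
        for node in part:
            if node not in decision_elements:
                masters[node] = master
    return masters

# Hypothetical chain: de1 - s1 - s2 - s3 - de2
LINKS = [("de1", "s1"), ("s1", "s2"), ("s2", "s3"), ("s3", "de2")]
DES = {"de1", "de2"}
NORMAL = assign_masters(build(LINKS), DES)
DE_FAILED = assign_masters(build(LINKS), DES, failed={"de1"})
SPLIT = assign_masters(build([l for l in LINKS if l != ("s1", "s2")]), DES)
```

Because every switch in a partition sees the same set of live DEs, they all reach the same answer, and the mapping gives each switch exactly one master at a time.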

4D Enables Customized Decision Logic

  • Example also illustrates the 4D controlling both L2 and L3 (Ethernet and IP)

One Size Fits All?

State of the Art

Different set of protocols for different data planes

STP for Ethernet

Same protocols (logic) for different environments

Data center, campus, ISP

Hard to customize, hard to extend

Common dissemination (protocol)

IPv4, IPv6, Ethernet

Customizable decision plane (algorithm)

Data center, enterprise, access, metro, backbone

Reachability, traffic engineering, robustness

Control Plane: The Key Leverage Point

  • Great Potential: control plane determines the behavior of the network

    • Reaction to events, reachability, services

  • Great Opportunities

    • A radical clean-slate control plane can be deployed

      • Agnostic to packet format: IPv4/v6, Ethernet

      • No changes to end-system software

    • Control plane is the nexus of network evolution

      • Changing the control plane logic can smooth transitions in network technologies and architectures

Learning from Ethernet Evolution Experience

Ethernet or 802.3

Early Implementations:

  • Bus-based Local Area Network

  • Collision Domain, CSMA/CD

  • Bridges and Repeaters for distance/capacity extension

  • 1-10 Mbps: coax, twisted pair (10BaseT)

Current Implementations:

  • Switched solution

  • Little use for collision domains

  • 80% of traffic leaves the LAN

  • Servers, routers 10 x station speed

  • 10/100/1000 Mbps, 10gig coming: Copper, Fiber

Everything Changed Except Name and Framing