Lecture 3
  • Responsiveness vs. stability
  • Brief refresh on router architectures
  • Protocol implementation
  • Quagga
To read
  • To present
    • Threads vs. events (belegrakis)
    • U-Loop prevention
    • Sub-millisecond convergence (lekakis)
  • REFS
    • Netlink
    • Quagga manual
    • Router architectures
How to be faster
  • Faster SPF
    • Better algorithms
    • Incremental SPF
  • Faster detection
    • Faster HELLOs
    • BFD (Bidirectional Forwarding Detection)
      • Runs in the line card instead of the control plane
      • Many protocols can share it
  • Faster FIB download
    • Download “important” prefixes first
  • Do things faster
    • Trigger SPF immediately
    • Trigger LSA origination immediately
How to be stable
  • SPF may be expensive
    • Cannot run SPF every time something minor changes; it may be better to run one SPF for all the changes
      • Avoid extra FIB downloads
      • Do not overload the CPU
  • Do not want to send too many updates at once
    • Receiver may get overloaded
  • Do not want to trigger updates too quickly
    • Link may be flapping
  • When CPU/links are loaded, ensure that I do not miss important things
    • Do not miss HELLOs; missing them will make things worse
Configuration: Timers
  • Hello timer, dead timer
  • LSA update delay
  • LSA pacing
  • LSA retransmission pacing
  • SPF delay
    • Wait for this time before you do SPF
  • SPF hold-time
    • Do not do another SPF before this time passes
  • Can have dynamic timers
    • Be fast when CPU is idle
    • Be slow when CPU is loaded
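The interaction of SPF delay and SPF hold-time can be sketched as follows. This is a minimal illustration, not any router's actual implementation: the class name `SpfThrottle`, its methods, and the millisecond values are all assumptions made for the example. The first change schedules an SPF after `delay`; later changes ride along with the pending run; a new run never starts before `hold_time` has passed since the previous one.

```python
class SpfThrottle:
    """Hypothetical helper that decides when the next SPF run may start."""

    def __init__(self, delay, hold_time):
        self.delay = delay            # SPF delay: wait this long after the first change
        self.hold_time = hold_time    # SPF hold-time: minimum gap between two runs
        self.last_run = None          # start time of the previous SPF, if any
        self.scheduled_at = None      # start time of the pending SPF, if any
        self.pending_changes = 0      # changes the pending SPF will cover

    def on_change(self, now):
        """Record a topology change and return the time the SPF is scheduled for."""
        self.pending_changes += 1
        if self.scheduled_at is None:
            start = now + self.delay
            if self.last_run is not None:
                # Respect the hold-time since the previous run.
                start = max(start, self.last_run + self.hold_time)
            self.scheduled_at = start
        return self.scheduled_at

    def poll(self, now):
        """Return how many changes the SPF starting now covers, or 0 if none starts."""
        if self.scheduled_at is None or now < self.scheduled_at:
            return 0
        covered = self.pending_changes
        self.pending_changes = 0
        self.last_run = now
        self.scheduled_at = None
        return covered
```

With `delay=200` and `hold_time=1000` (milliseconds, chosen arbitrarily), two changes at t=0 and t=100 are handled by a single SPF at t=200, and a change at t=300 has to wait until t=1200. Dynamic timers could be layered on top by adjusting `delay` and `hold_time` according to CPU load.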
It is difficult
  • Speed and stability are conflicting goals
  • Alternatively: Disconnect convergence from data plane
    • Avoid u-loops
      • See overview
    • Have alternate next-hops pre-computed and switch to them in case of failures
      • We will see this later
Anatomy of a protocol (routing or not)
  • Inputs
    • Static configuration
    • Other protocol instances in the network
    • Other components in my platform
    • Dynamic events on my platform (link state changes, etc.)
  • State
    • Transient (packets, queues)
    • Protocol
  • Computation
    • Triggered (process incoming packet)
    • Periodic (timers) (refresh state, LSA)
  • Outputs
    • To other protocol instances on the network
    • To other components in the platform
    • To FIB
Examples of protocol tasks
  • Receive, send protocol packet(s)
  • Schedule/process timers
  • Perform computations (e.g., SPF)
  • Communicate with other components
  • Download Routes to FIB
  • Process changes in the environment
    • Link state changes
    • Adjacency changes
  • Process configuration and configuration changes
  • Can be complex and long running
    • Download 1,000 routes to FIB
    • Originate 500 LSAs
    • Perform an SPF in a large network
  • Usually protocol runs on one CPU
    • Have to multiplex tasks
  • Scheduling of tasks is what makes or breaks an implementation
  • Liveness
    • Even while I download 100,000 routes to the FIB, I can still receive and process LSAs
  • Stability
    • Prioritize tasks
      • Send the hellos first even under load
      • Never skip important tasks when overloaded
    • Shed excess load so that I do not collapse
      • Queue incoming packets and start dropping if queue becomes too long
      • Slow down the SPFs…
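Prioritizing tasks and shedding load can be sketched roughly as below. Everything here is an assumption made for illustration: the priority constants, the `TaskScheduler` name, and the drop-newest policy when the receive queue is full. The point is only that hellos are picked first and that the input queue is bounded.

```python
import heapq
from collections import deque

# Assumed priority order for the example: lower number runs first.
HELLO, LSA, SPF, FIB = 0, 1, 2, 3


class TaskScheduler:
    """Hypothetical scheduler: priority queue for tasks, bounded queue for packets."""

    def __init__(self, max_rx_queue):
        self.max_rx_queue = max_rx_queue
        self.rx_queue = deque()   # incoming packets waiting to be processed
        self.dropped = 0          # packets shed because the queue was full
        self.tasks = []           # heap of (priority, sequence, name)
        self.seq = 0              # keeps FIFO order among equal priorities

    def enqueue_packet(self, pkt):
        """Queue an incoming packet, or drop it when the queue is too long."""
        if len(self.rx_queue) >= self.max_rx_queue:
            self.dropped += 1
            return False
        self.rx_queue.append(pkt)
        return True

    def post(self, priority, name):
        """Add a task; it runs before any pending task of lower priority."""
        heapq.heappush(self.tasks, (priority, self.seq, name))
        self.seq += 1

    def next_task(self):
        """Return the name of the most important pending task, or None."""
        if not self.tasks:
            return None
        return heapq.heappop(self.tasks)[2]
```

If a FIB chunk, a hello, and an SPF are posted in that order, `next_task()` returns the hello first. A long job such as a 100,000-route download would be posted as many small `FIB` tasks so that hellos and LSAs can run in between.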
The big question
  • How to implement/handle parallelism
  • Events vs. threads
  • Events
    • Trigger event handlers that are essentially function calls
    • Run to completion, i.e., until the function returns
  • Threads
    • Flows of execution with their own local state/stack
    • Can be suspended and resumed
    • With pre-emptive threads, the system may switch to another thread along the way
    • With non-preemptive threads, I have to yield explicitly
How does my protocol look with events
  • Assign events and event handlers
    • Packet receive, packet send, spf etc…
  • Event loop (a.k.a. the big select() loop)
    • Loop waiting for events
      • Incoming packet, timer, signal, or other event
    • Pick the next event to handle
      • According to my own scheduling
    • Call its event handler
  • When I want to initiate an action I post an event
    • Put the packet in a queue
    • Schedule a Packet_send event
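A small event loop in this style might look like the sketch below. It uses Python's standard `selectors` module in place of a raw select() call; the `EventLoop` class, the handler names, and the plain FIFO order for posted events are assumptions for the example (a real implementation would apply its own scheduling when picking the next event).

```python
import selectors
import socket
from collections import deque


class EventLoop:
    """Hypothetical event loop: wait for I/O, then run handlers to completion."""

    def __init__(self):
        self.selector = selectors.DefaultSelector()
        self.posted = deque()   # events waiting for their handler to be called

    def add_reader(self, sock, handler):
        self.selector.register(sock, selectors.EVENT_READ, handler)

    def post(self, handler, *args):
        """Post an event; its handler runs later from the loop."""
        self.posted.append((handler, args))

    def run_once(self, timeout=0):
        # Turn readable sockets into posted events.
        for key, _mask in self.selector.select(timeout):
            self.post(key.data, key.fileobj)
        # Call each handler; it runs until it returns.
        ran = 0
        while self.posted:
            handler, args = self.posted.popleft()
            handler(*args)
            ran += 1
        return ran


def demo():
    loop = EventLoop()
    a, b = socket.socketpair()   # stands in for a protocol socket
    tx_queue = deque()
    log = []

    def packet_send():
        while tx_queue:
            b.send(tx_queue.popleft())
            log.append("sent")

    def packet_receive(sock):
        data = sock.recv(1024)
        log.append(("received", data))
        tx_queue.append(b"ack:" + data)   # put the packet in a queue
        loop.post(packet_send)            # schedule a packet_send event

    loop.add_reader(b, packet_receive)
    a.send(b"hello")
    loop.run_once(timeout=1)
    reply = a.recv(1024)
    loop.selector.close()
    a.close()
    b.close()
    return log, reply
```

Calling `demo()` receives one packet, queues a reply, and sends it from a separate `packet_send` event, which mirrors the "put the packet in a queue, schedule a Packet_send event" pattern.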
How does my protocol look with threads
  • FIG!
  • Assign tasks to threads
    • Packet_rx thread, packet_tx thread, FIB_download thread
  • Thread blocks when there is no work to do
    • Packet_rx on the socket, FIB_download on a cond variable
    • It is unblocked when there is work
  • System handles the scheduling of the threads
    • I may have no control over it
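As a rough sketch of a thread that blocks on a condition variable until there is work, consider the following. The `FibDownloader` class and the dictionary standing in for the FIB are hypothetical; only the standard `threading` module is used.

```python
import threading
from collections import deque


class FibDownloader:
    """Hypothetical FIB_download thread that sleeps on a condition variable."""

    def __init__(self):
        self.cond = threading.Condition()
        self.pending = deque()    # routes waiting to be downloaded
        self.fib = {}             # stand-in for the real FIB
        self.stopping = False
        self.thread = threading.Thread(target=self._run)

    def start(self):
        self.thread.start()

    def submit(self, prefix, next_hop):
        """Called from other threads: add work and wake the downloader."""
        with self.cond:
            self.pending.append((prefix, next_hop))
            self.cond.notify()

    def stop(self):
        """Ask the thread to exit once the pending work is drained."""
        with self.cond:
            self.stopping = True
            self.cond.notify()
        self.thread.join()

    def _run(self):
        while True:
            with self.cond:
                # Block while there is no work to do.
                while not self.pending and not self.stopping:
                    self.cond.wait()
                if not self.pending:
                    return
                prefix, next_hop = self.pending.popleft()
            # Do the "download" outside the lock so submitters are not blocked.
            self.fib[prefix] = next_hop
```

The shared queue is touched only while holding the lock, which is the locking burden that threads bring with them. When the thread actually runs after `notify()` is decided by the system scheduler, not by the protocol code.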
Events vs. threads
  • Events
    • Manage my own state
    • Manage my own scheduling
    • I explicitly handle parallelism by controlling when an event handler terminates
    • If I want to suspend an event handler, I must take care of its state
  • Threads
    • Can arbitrarily suspend/resume a thread
    • State is automatically managed in the thread stack
    • The thread scheduler has control
    • With pre-emptive threads, the system handles parallelism
    • But I have to LOCK

Events vs. threads
  • Events: advantages
    • I have total control of everything and I can do what is best
    • Handle parallelism explicitly; no need for locking, etc.
    • May be more efficient
      • No context switches or state saving
  • Events: drawbacks
    • I have total responsibility for everything; the system does not help me
    • If I want to yield to another handler, I need to take care of the state myself, i.e., stop a long SPF in the middle
  • Threads: advantages
    • Parallelism is handled in a cleaner and more natural way
    • System helps a lot with scheduling and state copying
  • Threads: drawbacks
    • Real parallel programming is hard
      • Locking, etc.
    • State copying can be expensive
    • Thread scheduler may make the wrong scheduling decisions
      • Not application specific
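To make "stop a long SPF in the middle" concrete, here is a minimal sketch of a Dijkstra computation whose state lives in an object, so that an event handler can run a small slice and return. The `ResumableSpf` class, the `budget` parameter, and the dictionary graph format are assumptions for the example; it also assumes the topology does not change between slices, which a real implementation would have to handle.

```python
import heapq


class ResumableSpf:
    """Hypothetical SPF that keeps its own state between event-handler calls."""

    def __init__(self, graph, source):
        self.graph = graph            # node -> {neighbor: link cost}
        self.dist = {source: 0}       # best known distance to each node
        self.heap = [(0, source)]     # candidate nodes ordered by distance
        self.done = set()             # nodes whose distance is final

    def step(self, budget):
        """Finalize at most `budget` nodes; return True when the SPF is complete."""
        while self.heap and budget > 0:
            d, node = heapq.heappop(self.heap)
            if node in self.done:
                continue              # stale heap entry, already finalized
            self.done.add(node)
            budget -= 1
            for nbr, cost in self.graph.get(node, {}).items():
                nd = d + cost
                if nd < self.dist.get(nbr, float("inf")):
                    self.dist[nbr] = nd
                    heapq.heappush(self.heap, (nd, nbr))
        return not self.heap
```

An event handler would call `step()` with a small budget, and if it returns False, post another SPF event and return, so that hellos and incoming packets get a turn. With threads, the same effect comes for free from the thread stack, at the price of locking the shared topology.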

An example: Quagga
  • First some router architecture
    • Forwarding and control plane
  • Forwarding plane has to be fast
    • Network processors (NPs), FPGAs, ASICs; fast but a little inflexible
  • Control plane is usually implemented in a commodity processor
    • Commodity OS, environment and tools
Big and Small Routers
  • How does a large router look?
    • EXAMPLE control vs. forwarding plane
    • line-cards, switch, FIB per-line-card, control processor
  • How does a PC router look?
    • Kernel for the forwarding
    • User space for the control plane
Distributed control planes
  • I want resiliency and minimal fate sharing
  • Break the control plane into components that are independent
    • Processes
    • One process per-protocol
  • It was a novelty 6 years ago, now everybody has it
  • May need to share some state
    • Need to prioritize between multiple routes
    • Redistribution: later
Quagga: a distributed control plane for a PC router
  • Multiple processes
    • One per-protocol
    • Zebra
      • manage all the routes from all protocols
      • send routes to the FIB (kernel)
      • Centralize the management of local interfaces etc…
  • EXAMPLE of system
  • Zebra and the protocols talk to each other through a private control protocol
    • Over a TCP socket
  • Protocols send their packets directly to the interfaces
  • But send their routes to zebra
    • Over a TCP socket
  • Zebra talks to the kernel through netlink
  • Interface down
    • Kernel to zebra through netlink
    • Zebra to protocols through the private protocol
  • Route download
    • Protocol to zebra through the private protocol
    • Zebra to kernel through netlink
  • OSPF Hellos
    • Directly from OSPF to interfaces and back
  • Data packets
    • Never leave the kernel
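The role of zebra as the single place that collects routes from all protocols and decides what goes to the FIB can be sketched as below. This is not Quagga code: the `Rib` class, its method names, and the use of an administrative distance to choose between protocols are assumptions for illustration, and the distance values (110 for OSPF, 120 for RIP) are just commonly used defaults.

```python
class Rib:
    """Hypothetical route manager: keeps every protocol's route, installs the best."""

    def __init__(self):
        self.candidates = {}   # prefix -> {protocol: (distance, next_hop)}
        self.fib = {}          # prefix -> next_hop; stands in for the kernel FIB

    def route_add(self, protocol, prefix, next_hop, distance):
        """A protocol process announces (or updates) its route for a prefix."""
        self.candidates.setdefault(prefix, {})[protocol] = (distance, next_hop)
        self._reselect(prefix)

    def route_delete(self, protocol, prefix):
        """A protocol process withdraws its route for a prefix."""
        self.candidates.get(prefix, {}).pop(protocol, None)
        self._reselect(prefix)

    def _reselect(self, prefix):
        routes = self.candidates.get(prefix)
        if not routes:
            # No protocol offers this prefix any more: remove it from the FIB.
            self.candidates.pop(prefix, None)
            self.fib.pop(prefix, None)
            return
        # Lowest distance wins; this is where a netlink update would be sent.
        _distance, next_hop = min(routes.values())
        self.fib[prefix] = next_hop
```

If RIP and OSPF both offer 10.0.0.0/8, the OSPF next hop is installed; when OSPF withdraws the route, the RIP route takes over without RIP having to resend it.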
Zebra protocol
  • Interface
    • Add, delete, addr-add, addr-delete, up, down
  • Route
    • Ipv4-add, ipv4-del, ipv6-add, ipv6-del
  • Redistribute
    • Add, del
  • Uses a special socket
  • Very powerful
    • Read and change interface state
    • Read and change interface configuration
    • Read and change routing tables
    • And MPLS, scheduling….
  • And efficient
    • Multicast some notifications
Configuration and management
  • Prompt based configuration and management
    • telnet localhost 2601 for zebra
    • telnet localhost 2604 for ospf
  • Directories
    • Zebra, ospf, lib for common functions
  • Event based (but confusingly called threads)
    • Main loop is thread_fetch() in lib/thread.c
      • Considers: sockets, timers, signals
    • Timers are used as a general event mechanism
      • If I want to do something now, I schedule a timer that expires immediately (0 expiration)
  • Netlink interface in zebra/rt_netlink.c
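The idea of using timers as a general event mechanism can be illustrated with the sketch below. It does not reproduce the functions in lib/thread.c; the `TimerQueue` class and its methods are hypothetical. A timer with 0 expiration is due at once, so scheduling one is the same as posting an event.

```python
import heapq
import itertools


class TimerQueue:
    """Hypothetical timer queue that doubles as an event queue."""

    def __init__(self):
        self.heap = []                    # (expiry time, sequence, handler)
        self.counter = itertools.count()  # tie-breaker so handlers are never compared

    def schedule(self, now, delay, handler):
        """Run `handler` once `delay` time units have passed since `now`."""
        heapq.heappush(self.heap, (now + delay, next(self.counter), handler))

    def run_due(self, now):
        """Call every handler whose timer has expired; return how many ran."""
        ran = 0
        while self.heap and self.heap[0][0] <= now:
            _expiry, _seq, handler = heapq.heappop(self.heap)
            handler()
            ran += 1
        return ran
```

Scheduling a hello with a delay of 10 and an SPF with a delay of 0 at the same instant runs the SPF on the next pass of the main loop, while the hello waits for its expiry.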