Network management
1 / 44

Network Management - PowerPoint PPT Presentation

  • Updated On :

Network Management. Richard Mortier Microsoft Research, Cambridge (Guest lecture, Digital Communications II). Overview. Introduction Abstractions IP network components IP network management protocols Pulling it all together An alternative approach. Overview. Introduction

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Network Management' - niabi

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Network management l.jpg

Network Management

Richard Mortier

Microsoft Research, Cambridge

(Guest lecture, Digital Communications II)

Overview l.jpg

  • Introduction

  • Abstractions

  • IP network components

  • IP network management protocols

  • Pulling it all together

  • An alternative approach

Overview3 l.jpg

  • Introduction

    • What’s it all about then?

  • Abstractions

  • IP network components

  • IP network management protocols

  • Pulling it all together

  • An alternative approach

What is network management l.jpg
What is network management?

  • One point-of-view: a large field full of acronyms


    • (Don’t ask me what all of those mean, I don’t care!)

  • From

    • In 1989, a random of the journalistic persuasion asked hacker Paul Boutin “What do you think will be the biggest problem in computing in the 90s?” Paul's straight-faced response: “There are only 17,000 three-letter acronyms.” (To be exact, there are 26^3 = 17,576.)

  • Will ignore most of them 

What is network management5 l.jpg
What is network management?

  • Computer networks are considered to have three operating timescales

    • Data: packet forwarding [ μs, ms ]

    • Control: flows/connections [ secs, mins ]

    • Management: aggregates, networks [ hours,days ]

  • …so we’re concerned with “the network” rather than particular devices

  • Standardization is key!

Overview6 l.jpg

  • Introduction

  • Abstractions


  • IP network components

  • IP network management protocols

  • Pulling it all together

  • An alternative approach

Iso fcaps functional separation l.jpg
ISO FCAPS: functional separation

  • Fault

    • Recognize, isolate, correct, log faults

  • Configuration

    • Collect, store, track configurations

  • Accounting

    • Collect statistics, bill users, enforce quotas

  • Performance

    • Monitor trends, set thresholds, trigger alarms

  • Security

    • Identify, secure, manage risks

Tmn ems administrative separation l.jpg
TMN EMS: administrative separation

  • Telecommunications Management Network

  • Element Management System

  • “...simple but elegant...” (!)

    • (my emphasis)

  • NEL: network elements (switches, transmission systems)

  • EML: element management (devices, links)

  • NML: network management (capacity, congestion)

  • SML: service management (SLAs, time-to-market)

  • BML: business management (RoI, market share, blah)

The b isdn reference model l.jpg
The B-ISDN reference model

  • Asynchronous Transfer Mode “cube”

    • See IAP lectures, maybe 

  • Plane management…

    • The whole network

  • …vs layer management

    • Specific layers

  • Topology

  • Configuration

  • Fault

  • Operations

  • Accounting

  • Performance

management plane

control plane

user plane

higher layers

higher layers

plane management

layer management

ATM adaptation layer

ATM layer

physical layer

Network management10 l.jpg
Network management

  • Models of general communication networks

    • Tend to be quite abstract and exceedingly tedious!

    • Many practitioners still seem excited about OO programming, WIMP interfaces, etc

    • …probably because implementation is hard due to so many excessively long and complex standards!

  • My view: basic “need-to-know” requirements are

    • What should be happening? [ c ]

    • What is happening? [ f, p, a ]

    • What shouldn’t be happening? [ f, s ]

    • What will be happening? [ p, a ]

Network management11 l.jpg
Network management

  • We’ll concentrate on IP networks

    • Still acronym city: ICMP, SNMP, MIB, RFC 

    • Sample size: 102 routers, 105 hosts

  • We’ll concentrate on the network core

    • Routers, not hosts

  • We’ll ignore “service management”

    • DNS, AD, file stores, etc

Overview12 l.jpg

  • Introduction

  • Abstractions

  • IP network components

    • IP primer, router configuration

  • IP network management protocols

  • Pulling it all together

  • An alternative approach

Ip primer you probably know all this l.jpg
IP primer (you probably know all this)

  • Destination-routed packets – no connections

    • Time-to-live field: allow removal of looping packets

  • Routers forward packets based on routeing tables

    • Tables populated by routeing protocols

  • Routers and protocols operate independently

    • …although protocols aim to build consistent state

  • RFCs ~= standards

    • Often much looser semantics than e.g. ISO, ITU standards

    • Compare for example OSPF [RFC2327] and IS-IS [RFC1142, RFC1195], two link-state routeing protocols

So how do you build an ip network l.jpg
So, how do you build an IP network?

  • Buy (lease) routers

  • Buy (lease) fibre

  • Connect them all together

  • Configure routers appropriately

  • Configure end-systems appropriately

  • Assume you’ve done 1–3 and someone else is doing 5…

Router configuration l.jpg
Router configuration

  • Initialization

    • Name the router, setup boot options, setup authentication options

  • Configure interfaces

    • Loopback, ethernet, fibre, ATM

    • Subnet/mask, filters, static routes

    • Shutdown (or not), queueing options, full/half duplex

  • Configure routeing protocols (OSPF, BGP, IS-IS, …)

    • Process number, addresses to accept routes from, networks to advertise

  • Access lists, filters, ...

    • Numeric id, permit/deny, subnet/mask, protocol, port

  • Route-maps, matching routes rather than data traffic

  • Other configuration aspects: traps, syslog, etc

  • Router configuration fragments l.jpg
    Router configuration fragments

    hostname FOOBAR


    boot system flash slot0:a-boot-image.bin

    boot system flash bootflash:

    logging buffered 100000 debugging

    logging console informational

    aaa new-model

    aaa authentication login default tacacs local aaa authentication login consoleport none

    aaa authentication ppp default if-needed tacacs

    aaa authorization network tacacs !

    ip tftp source-interface Loopback0

    no ip domain-lookup

    ip name-server


    ip multicast-routing

    ip dvmrp route-limit 7000

    ip cef distributed

    interface Loopback0


    ip address


    interface FastEthernet0/0/0

    description Link to New York

    ip address

    ip access-group 175 in

    ip helper-address

    ip pim sparse-mode

    ip cgmp

    ip dvmrp accept-filter 98 neighbor-list 99



    interface FastEthernet4/0/0

    no ip address

    ip access-group 183 in

    ip pim sparse-mode

    ip cgmp



    router ospf 2


    passive-interface FastEthernet0/0/0

    passive-interface FastEthernet0/1/0

    passive-interface FastEthernet1/0/0

    passive-interface FastEthernet1/1/0

    passive-interface FastEthernet2/0/0

    passive-interface FastEthernet2/1/0

    passive-interface FastEthernet3/0/0

    network area

    network area

    network area

    access-list 24 remark Mcast ACL

    access-list 24 permit

    access-list 24 permit

    access-list 24 permit

    access-list 24 permit

    access-list 24 permit

    access-list 1011 deny 0000.0000.0000 ffff.ffff.ffff ffff.ffff.ffff 0000.0000.0000 0xD1 2 eq 0x42

    access-list 1011 permit 0000.0000.0000 ffff.ffff.ffff 0000.0000.0000 ffff.ffff.ffff

    tftp-server slot1:some-other-image.bin

    tacacs-server host

    tacacs-server key xxxxxxxx

    rmon event 1 trap Trap1 description "CPU Utilization>75%" owner config

    rmon event 2 trap Trap2 description "CPU Utilization>95%" owner config

    Router configuration17 l.jpg
    Router configuration

    • Lots of quite large and fragile text files

      • 00s/000s routers, 00s/000s lines per config

      • Errors are hard to find and have non-obvious results

      • Router configuration also editable on-line

    • How to keep track of them all?

      • Naming schemes, directory hierarchies, CVS

      • ssh upload and atomic commit to router

      • Perhaps even a database

    • State of the art is pretty basic

      • Few tools to check consistency

      • Generally generate configurations from templates and have human-intensive process to control access to running configs

    • Topic of current research [Feamster et al]

    this counts as quite advanced!

    Overview18 l.jpg

    • Introduction

    • Abstractions

    • IP network components

    • IP network management protocols

      • ICMP, SNMP, Netflow

    • Pulling it all together

    • An alternative approach

    Slide19 l.jpg

    • Internet Control Message Protocol [RFC792]

      • IP protocol #1

      • In-band “control”

    • Variety of message types

      • echo/echo reply [ PING (packet internet groper) ]

      • time exceeded [ TRACEROUTE ]

      • destination unreachable, redirect

      • source quench

    Ping packet internet groper l.jpg
    Ping (Packet INternet Groper)

    • Test for liveness

      • …also used to measure (round-trip) latency

    • Send ICMP echo

    • Valid IP host [RFC1122, RFC1123] must reply with ICMP echo response

    • Subnet PING?

      • Useful but often not available/deprecated

      • “ACK” implosion could be a problem

      • RFCs ~= standards

    Traceroute l.jpg

    • Which route do my packets take to their destination?

      • Send UDP packets with increasing time-to-live values

      • Compliant IP host must respond with ICMP “time exceeded”

      • Triggers each host along path to so respond

    • Not quite that simple

      • One router, many IP addresses: which source address?

        • Router control processor, inbound or outbound interface

      • Routes often asymmetric, so return path != outbound path

      • Routes change

    • Do we want full-mesh host-host routes anyway?!

      • Size of data set, amount of probe traffic

      • This is topology, what about load on links?

    Slide22 l.jpg

    • Protocol to manage information tables at devices

    • Provides get, set, trap, notify operations

      • get, set: read, write values

      • trap: signal a condition (e.g. threshold exceeded)

      • notify: reliable trap

    • Complexity mostly in the MIB design

      • Some standard tables, but many vendor specific

      • Non-critical, so often tables populated incorrectly

      • Many tens of MIBs (thousands of lines) per device

      • Different versions, different data, different semantics

        • Yet another configuration tracking problem

      • Inter-relationships between MIBs

    Ipfix l.jpg

    • IETF working group

      • Export of flow based data out of IP network devices

      • Developing suitable protocol based on Cisco NetFlow™ v9

      • [RFC3954, RFC3955]

    • Statistics reporting

      • Setup template

      • Send data records matching template

    • Many variables

      • Packet/flow counters, rule matches, quite flexible

    Overview24 l.jpg

    • Introduction

    • Abstractions

    • IP network components

    • IP network management protocols

    • Pulling it all together

      • Network mapping, statistics gathering, control

    • An alternative approach

    An hypothetical nms l.jpg
    An hypothetical NMS

    • GUI around ICMP (ping, traceroute), SNMP, etc

    • Recursive host discovery

      • Broadcast ping, ARP, default gateway: start somewhere

      • Recursively SNMP query for known hosts/connected networks

      • Ping known hosts to test liveness

      • Iterate

    • Display topology: allow “drill-down” to particular devices

    • Configure and monitor known devices

      • Trap, Netflow™, syslog message destinations

      • Counter thresholds, CPU utilization threshold, fault reporting

      • Particular faults or fault patterns

      • Interface statistics and graphs

    An hypothetical nms27 l.jpg
    An hypothetical NMS

    • All very straightforward? No, not really

      • A lot of software engineering: corner cases, traceroute interpretation, NATs, etc

      • MIBs may contain rubbish

      • Can only view inside your network anyway

    • Efficiency

      • Rate pacing discovery traffic: ping implosion/explosion

      • SNMP overloading router CPUs

    • Tunnelled, encrypted protocols becoming prevalent

    • Using NMSs also not straightforward

      • How to setup “correct” thresholds?

      • How to decide when something “bad” has happened?

      • How to present (or even interpret) reams and reams of data?

    Overview28 l.jpg

    • Introduction

    • Abstractions

    • IP network components

    • IP network management protocols

    • Pulling it all together

    • An alternative approach

      • From the edges…

    Slide29 l.jpg

    • Edge-based network management platform

      • Collect flow information from hosts, and

      • Combine with topology information from routeing protocols

    • Enable visualization, analysis, simulation, control

    • Avoid problems of not-quite-standard interfaces

      • Management support is typically ‘non-critical’ (i.e. buggy ) and not extensively tested for inter-operability

    • Do the work where resources are plentiful

      • Hosts have lots of cycles and little traffic (relatively)

    • Protocol visibility: see into tunnels, IPSec, etc

    System outline l.jpg




    System outline






    Traffic matrix

    Set of routes







    Where is my traffic going today l.jpg
    Where is my traffic going today?

    • Pictures of current topology and traffic

      • Routes+flows+forwarding rules  BIG PICTURE

    • In fact, where did my traffic go yesterday?

      • Keep historical data for capacity planning, etc

    • A platform for anomaly detection

      • Historical data suggests “normality,” live monitoring allows anomalies to be detected

    Where might my traffic go tomorrow l.jpg
    Where might my traffic go tomorrow?

    • Plug into a simulator back-end

      • Discrete event simulator, flow allocation solver

    • Run multiple ‘what-if’ scenarios

      • …failures

      • …reconfigurations

      • …technology deployments

    • E.g. “What happens if we coalesce all the Exchange servers in one data-centre?”

    Where should my traffic be going l.jpg
    Where should my traffic be going?

    • Close the loop: compute link weights to implement policy goals

      • Recompute on order of hours/days

    • Allows more dynamic policies

      • Modify network configuration to track e.g. time of day load changes

    • Make network more efficient (~cheaper)?

    Where are we now l.jpg
    Where are we now?

    • Three major components

      • Flow collection

      • Route collection

      • Distributed database

    • Building prototypes, simulating system

    Data collection l.jpg
    Data collection

    • Flow collection

      • Hosts track active flows

        • Using low overhead event posting infrastructure, ETW

        • Built prototype device driver provider & user-space consumer

      • Used packet traces for feasibility study on (client, server)

        • Peaks at (165, 5667) live and (39, 567) active flows per sec

    • Route collection

      • OSPF is link-state: passively collect link state adverts

      • Extension of my work at Sprint (for IS-IS and BGP); also been done at AT&T (NSDI’04 paper)

    The distributed database l.jpg
    The distributed database

    • Logically contains

      • Traffic flow matrix (bandwidths), {srcs}×{dsts}

      • …each entry annotated with current route from src to dst

        • N.B. src/dst might be e.g. (IP end-point, application)

    • Large dynamic data set suggests aggregation

    • Related work

      • { distributed, continuous query, temporal } databases

      • Sensor networks

    • Potential starting points: Astrolabe or SDIMS (SIGCOMM’04)

      • Where/what/how much to aggregate?

        • Is data read- or write-dominated?

        • Which is more dynamic, flow or topology data?

        • Can the system successfully self-tune?

    The distributed database37 l.jpg
    The distributed database

    • Construct traffic matrix from flow monitoring

      • Hosts can supply flows they source and sink

      • Only need a subset of this data to get complete traffic matrix

    • Construct topology from route collection

      • OSPF supplies topology → routes

    • Wish to be able to answer queries like

      • “Who are the top-10 traffic generators?”

        • Easy to aggregate, don’t care about topology

      • “What is the load on link l ?”

        • Can aggregate from hosts, but need to know routes

      • “What happens if we remove links {l…m} ?”

        • Interaction between traffic matrix, topology, even flow control

    The distributed database38 l.jpg
    The distributed database

    • Building simulation model

      • OSPF data gives topology, event list, routes

      • Simple load model to start with (load ~ # subnets)

      • Precedence matrix (from SPF) reduces flow-data query set

    • Can we do as well/better than e.g. NetFlow?

      • Accuracy/coverage trade-off

    • How should we distribute the DB?

      • Just OSPF data? Just flow data? A mixture?

    • How many levels of aggregation?

      • How many nodes do queries touch?

    • What sort of API is suitable?

      • Example queries for sample applications

    Summary l.jpg

    • Introduction

      • What is network management?

    • Abstractions


    • IP network components

      • IP, routers, configurations

    • IP network management protocols

      • ICMP, SNMP, etc

    • Pulling it all together

      • Outline of a network management system

    • An alternative approach: from the edges

    The end l.jpg
    The end

    • Questions

    • Answers?






    Backup slides l.jpg
    Backup slides

    • Internet routeing

    • OSPF

    • BGP

    Internet routeing l.jpg
    Internet routeing

    • Q: how to get a packet from node to destination?

    • A1: advertise all reachable destinations and apply a consistent cost function (distance vector)

    • A2: learn network topology and compute consistent shortest paths (link state)

      • Each node (1) discovers and advertises adjacencies; (2) builds link state database; (3) computes shortest paths

    • A1, A2: Forward to next-hop using longest-prefix-match

    Ospf link state routeing l.jpg
    OSPF (~link state routeing)

    • Q: how to route given packet from any node to destination?

    • A: learn network topology; compute shortest paths

    • For each node

      • Discover adjacencies (~immediate neighbours); advertise

      • Build link state database (~network topology)

      • Compute shortest paths to all destination prefixes

      • Forward to next-hop using longest-prefix-match (~most specific route)

    Bgp path vector routeing l.jpg
    BGP (~path vector routeing)

    • Q: how to route given packet from any node to destination?

    • A: neighbours tell you destinations they can reach; pick cheapest option

    • For each node

      • Receive (destination, cost, next-hop) for all destinations known to neighbour

      • Longest-prefix-match among next-hops for given destination

      • Advertise selected (destination, cost+, next-hop') for all known destinations

    • Selection process is complicated

    • Routes can be modified/hidden at all three stages

      • General mechanism for application of policy