Internet Traffic Measurement

CS590F Survey Project

by Vadim Gorbach

[email protected]

Purdue University

December 4, 2000

Why Measurements are Important

  • “If we don’t measure the Internet, we don’t have objective data about how it performs”

  • Essential to understanding Internet’s behavior and growth

  • Essential to identifying and ameliorating network problems

  • The economy relies more and more on the Internet: infrastructure-wide analysis and planning are needed to scale up the Internet efficiently

  • Internet users need reliable means of verifying service guarantees; ISPs need to diversify grades of service to improve revenues

  • Corporations need affordable VPNs instead of pricey PNs: strict SLAs are a must (currently, unclear inter-ISP business mechanics is a hindrance)

  • Many user groups view the Internet as mission-critical: strict QoS guarantees are required as soon as possible (Success story: AIAG network – AutoNet or ANX)

Why Measurements are Difficult

  • To effectively measure the global Internet, wide cooperation is needed. However, ISPs are reluctant to coordinate their efforts

  • To be done correctly and accurately, profound understanding and experience (expertise) are required, therefore…

  • Statistics collection is viewed as a luxury (OC48mon = $100,000) - only large ISPs can afford statistics collection and analysis - demand is still dormant

  • Best Effort service, low profit margins for ISPs make operational support difficult – data collection is low priority

  • Traffic volume, high trunk capacity, diversity of protocols, technologies and applications make traffic monitoring and analysis a challenging endeavor

  • Results become obsolete very rapidly: the Internet is under very active development, and traffic, technology and topology change very fast

  • Tremendous growth of Internet – it is difficult to scale measurements

  • Overprovisioning is a widely practiced solution to network congestion

Internet (the Patient) Today

  • Global Internet is growing very fast:

    • as of 11am today, there are 98,160,522 hosts on the Internet, and 344,456,271 users (out of 6,113,303,473; or 5.63%) and counting (future is optimistic)

    • number of hosts doubles every 16-18 months (close to Moore’s law)

    • volume of Internet traffic doubles about every 100 days

    • volume of World-Wide Web content doubles every 50-70 days

  • Diversity (by no means synergetic) of protocols, applications and technologies

  • Concern: growing popularity of streaming applications, which threatens the stability of the network
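To put the doubling times above on a common scale, the following rough sketch converts a doubling period into a yearly growth factor. It assumes steady exponential growth, which the slide does not claim; the function name and the midpoints chosen for the quoted ranges are illustrative assumptions.

```python
def annual_growth_factor(doubling_days: float, days_per_year: float = 365.0) -> float:
    """Growth factor over one year for a quantity that doubles every `doubling_days`."""
    return 2.0 ** (days_per_year / doubling_days)


# Doubling periods from the slide; range midpoints are an assumption.
DOUBLING_DAYS = {
    "hosts (about 17 months)": 17 * 30.4,
    "traffic (about 100 days)": 100.0,
    "Web content (about 60 days)": 60.0,
}

for label, days in DOUBLING_DAYS.items():
    print(f"{label}: grows by a factor of {annual_growth_factor(days):.1f} per year")
```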

Internet Traffic

  • MCI backbone measurements by CAIDA in April 1998:

    • TCP: 95% of the bytes, 90% of the packets, 80% of the flows

    • IPv6, IP-over-IP, ICMP: around 3%

    • UDP: rest of the traffic

    • HTTP: 75% of bytes, 70% of packets

    • SMTP: 5% of bytes, 5% of packets

    • FTP: 5% of bytes, 3% of packets

    • NNTP: 2% of bytes, less than 1% of packets

    • Telnet: 1% of bytes, less than 1% of packets (marked decrease due to popularity of alternative protocols: ssh, kerberos, rlogin)

    • Rest of the traffic is spread over mostly Web-related TCP and UDP ports (81, 443, 3128 – Proxy?, 8000, 8080 - encodings?)

History

  • 1981: RFC 792 (ICMP)

  • September 1990: Advanced Network and Services.

  • Early 1995: NSF relinquishes its control over the Internet, sending it into free flight - Attempts to adequately track and monitor the Internet get more and more problematic (Will the story repeat with Internet2?)

  • 1994-1996: Vern Paxson’s core thesis work

  • October 1996: Internet2 consortium

  • October 1997: Next Generation Internet (NGI) – US government initiative

History - II

  • 1997: IETF IPPM (IP Performance Metrics) working group

    • Common definition of IP Metrics: to develop standard metrics to measure quality, performance, and reliability of Internet data delivery services

    • The goal is to provide basis for unbiased quantitative measures of Internet performance

  • May 1998: RFC 2330 (Framework for IP performance metrics)

  • September 1999: RFCs 2678 – 2681 (connectivity, one-way delay and loss, round-trip delay)

  • NLANR and DOE/LBL currently work on developing a scalable tool set “to measure the global Internet” for IPPM-style metrics (NIMI)

  • Similar projects: Surveyor, IPMA, Felix, ???

Vern Paxson’s Thesis Work

  • Groundbreaking measurement work, first large-scale studies of the end-to-end Internet routing and packet dynamics

  • Network Probe Daemon (NPD) framework, 1994-1995:

    • NPDs cooperate with one another for measurement: both NPDs send and receive the data, tracing packet departure and arrival times

  • Some of the findings:

    • Forward and Reverse directions of the paths between two nodes are often quite different: routes of more than half of all network paths differ in at least one city visited

    • Bottleneck bandwidths and queuing delays are frequently asymmetric

  • This work served as a basis for Surveyor (adopted for Internet2) and PingER architectures; promoted SLA mechanism, which would provide powerful economic incentive for improving QoS in NGI

Measurement

  • Measurement is: data collection, analysis and visualization

  • Traffic data:

    • Network Topology and Mapping (connectivity)

    • Workload (passive or non-intrusive)

    • Performance (active)

    • Routing (BGP routing tables)

  • Active approach

    • Inject traffic and wait for arrival to the destination or reply

  • Passive approach

    • No traffic injected; Measurements are done over a collection of network monitors

Topology and Mapping

  • New physical connections among core Internet backbones occur hourly

  • Myriads of new technologies and applications: streaming audio and video, distance education, entertainment, telephony and video-conferencing, as well as numerous new and still evolving communication protocols

  • Tracking and visualizing Internet topology is clearly a challenge

  • CAIDA: skitter, a tool for dynamically discovering and depicting global Internet topology

    • X-ray tomography techniques: 3D-object from 2D-images

    • collects connectivity, RTT and path data with a number of network monitors across the United States, Europe and Asia

    • sends ICMP packets with progressively longer TTLs, as traceroute does

    • source host is notified of packets whose TTL expires (ICMP Time Exceeded message)

  • There is abysmal lack of geographical mapping data for Internet address space
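The TTL-based path discovery described for skitter can be sketched as a small simulation. No packets are sent here: the `path` list stands in for a real sequence of routers, and the function names and message labels are illustrative assumptions rather than skitter's actual code.

```python
from typing import List, Tuple


def probe(path: List[str], ttl: int) -> Tuple[str, str]:
    """Simulate one probe sent with the given TTL along `path`.

    `path` lists the hop addresses in order, ending with the destination.
    Each router decrements the TTL; the router where it reaches zero
    answers with an ICMP Time Exceeded message.
    """
    if ttl < 1:
        raise ValueError("TTL must be at least 1")
    if ttl < len(path):
        return path[ttl - 1], "time-exceeded"
    return path[-1], "echo-reply"


def discover_path(path: List[str], max_ttl: int = 30) -> List[str]:
    """Recover the hop sequence by probing with TTL = 1, 2, 3, ..."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, message = probe(path, ttl)
        hops.append(responder)
        if message == "echo-reply":  # destination reached, stop probing
            break
    return hops
```

In a real measurement each probe would also be timestamped so that the RTT to every hop is collected along with the path.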

Workload (Passive) Measurement

  • Usually infrastructure-wide measurements

  • NLANR collects traces from major universities

  • Traces suggest very active proliferation of new applications (streaming video and audio, etc.). Also: Non-congestion controlled traffic directly affects infrastructural stability of the networks

  • Challenge: to develop passive measurement techniques. This is difficult because their scope of applicability is limited and still needs to be developed

Workload (Passive) Measurement - II

  • Performed by network monitors: in routers, switches, or standalone devices

  • Used primarily for traffic analysis: composition of traffic by application, packet size distributions, packet inter-arrival times, performance, path length. Essential for engineering next generation internetworking equipment and overall infrastructure.

  • Country-specific flows (geographic information), distributions of packet sizes, flow volume, flow duration; NetFlow capability of Cisco routers

  • Results are commonly produced in the form of traffic matrices: traffic between specific source and destination. Essential for investment decisions

  • Per-packet and per-byte statistics of switching are essential for optimizing hardware and software architecture in switching equipment

  • It is important to see whether the traffic is composed of friendly flows or not. E.g.: streaming multimedia traffic
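A traffic matrix of the kind mentioned above can be built by summing bytes per source and destination pair. The sketch below assumes flow records have already been reduced to `(source, destination, bytes)` tuples; the record layout and the sample values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

FlowRecord = Tuple[str, str, int]  # (source, destination, bytes), a hypothetical layout


def traffic_matrix(flows: Iterable[FlowRecord]) -> Dict[Tuple[str, str], int]:
    """Sum the bytes exchanged for each (source, destination) pair."""
    matrix: Dict[Tuple[str, str], int] = defaultdict(int)
    for source, destination, nbytes in flows:
        matrix[(source, destination)] += nbytes
    return dict(matrix)


# Made-up sample records keyed by country code.
sample_flows = [("US", "NZ", 4000), ("NZ", "US", 1000), ("US", "NZ", 500)]
sample_matrix = traffic_matrix(sample_flows)
```

The same function works whether the endpoints are countries, autonomous systems, or address prefixes, since it only relies on the keys being hashable.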

Workload (Passive) Measurement - III

  • Other applications of passive monitoring: optimizing web caches and proxies; security monitoring; monitoring effectiveness of congestion control; impact of new technologies and protocols such as multicast or IPv6, etc.

  • Current disparity in transmission and measurement technology: OC-192 routers and switches vs. no commodity measurement solution even at OC-3 (late 1999).

  • Focus for near future: support for monitoring at least OC12-OC48 links (this work is being done by Internet2 consortium), different interface types, and encapsulation/framing; performance testing of monitors (whether they keep up with the load); enhanced configuration of what to collect; improving security and manageability

  • OC12mon at iMCI and vBNS

    • vBNS is currently being upgraded to OC-48 trunks

    • from hundreds of thousands to about a million simultaneous flows

Performance (Active) Measurement

  • Most popular for benchmarking end-to-end performance of commercial service providers (Transit, Access, Content hosting, and Caching), analyzing traffic behavior across specific paths, monitoring fulfillment of Service Level Agreements (SLA), diagnosing network problems

  • Commonly measured parameters: delay, packet loss, flow capacity (throughput), availability

  • However, there is no standard metric or measurement methodology that would allow consistent comparison and calibration

  • Unfortunately, active measurement often involves a large number of parameters that are difficult, if not impossible, to model independently

  • “We lack in most cases the ability even to measure traffic at a granularity that would enable infrastructure-level research”

Performance (Active) Measurement - II

  • Proliferation of uncoordinated active measurement initiatives has led to counterproductive actions, such as ISPs turning off ICMP traffic at select routers to limit the visibility (and vulnerability) of their infrastructure

  • Challenge: Active Measurements are effective but invasive

  • Focus for near future:

    • tools that identify critical pieces of the public infrastructure

    • tools that find particular periodic cycles or frequency components in performance data

    • developing a calculus for describing and drawing the difference between two given `snapshots' of network performance

    • finding the topological `center' of the net, techniques for real-time visualization of routing dynamics

    • correlation with passive measurements

  • See for available tools

ICMP

  • Internet Control Message Protocol, RFC 792

  • Integral part of IP

  • RFC 792: ICMP messages are sent in several situations: for example, when a datagram cannot reach its destination, when the gateway does not have the buffering capacity to forward a datagram, and when the gateway can direct the host to send traffic on a shorter route

    • The purpose of these control messages is to provide feedback about problems in the communication environment, not to make IP reliable (IP is not designed to be absolutely reliable).

    • ICMP messages typically report errors in processing of datagrams (No ICMP messages are sent about ICMP messages).

ICMP (continued)

  • ICMP message types:

    • Destination Unreachable (distance = )

    • Time Exceeded (TTL expired)

    • Parameter Problem (incorrect values)

    • Source Quench (buffer is full)

    • Redirect (shorter path found)

    • Echo or Echo Reply (by ID or sequence number)

    • Timestamp or Timestamp Reply (Originate, Receive, Transmit)

    • Information Request or Information Reply (to find out network number)
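As an illustration of the message types listed above, the sketch below builds and parses an ICMP Echo message in memory. The type numbers and the checksum rule (16-bit one's complement of the one's complement sum) follow RFC 792; the function names are illustrative, and nothing is sent on the network, since raw ICMP sockets usually need elevated privileges.

```python
import struct
from typing import Tuple

# ICMP type numbers as assigned in RFC 792.
ICMP_TYPES = {
    0: "Echo Reply", 3: "Destination Unreachable", 4: "Source Quench",
    5: "Redirect", 8: "Echo", 11: "Time Exceeded", 12: "Parameter Problem",
    13: "Timestamp", 14: "Timestamp Reply",
    15: "Information Request", 16: "Information Reply",
}


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Build an Echo message (type 8, code 0) with a valid checksum."""
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


def parse_icmp(packet: bytes) -> Tuple[str, int, bool]:
    """Return (type name, code, checksum_ok) for an ICMP message."""
    icmp_type, code = struct.unpack("!BB", packet[:2])
    # A message with a correct checksum sums to all ones, so this yields 0.
    return ICMP_TYPES.get(icmp_type, "Unknown"), code, internet_checksum(packet) == 0
```

The identifier and sequence number fields are what let a sender match an Echo Reply to the Echo that triggered it.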

Routing (Dynamics) Measurements

  • The reliability and robustness of the Internet highly depend on efficient, stable routing among provider networks

  • Analysis of routing behavior has direct implications for the next generation of networking hardware, software and operational policies

  • Analysis of routing data (BGP, the Border Gateway Protocol) shows actual current traffic paths, but exhaustive measurement that generalizes across providers is difficult

  • Routing dynamics gives the following insights:

    • effects of outages on surrounding ISPs

    • effects of topology changes on Internet performance

    • unintended consequences of new routing policies

    • potential to improve ability to respond to congestion and topology changes

    • infrastructural vulnerabilities caused by critical paths

Routing (Dynamics) Measurements - II

  • A very important area of work is identification of optimal routes given performance results

  • Other high-priority areas:

    • assessing utilization of the IP address space

    • extent of asymmetric routing and route instability as a function of service provider and over time

    • distribution of traffic by network address prefix lengths

    • efficiency of usage of BGP routing table space, e.g., via aggregation

    • favoritism of traffic flow and routing toward a small proportion of the possible addresses/entities

    • degree of incongruity between unicast and multicast routing

    • quantifying effects on connectivity after removing specific ASes

Metrics

  • Utilization

  • Availability

  • Delay (one-way vs. round-trip)

  • Packet Loss (one-way vs. round-trip)

  • Throughput

  • Routing stability

  • No standards and well-understood methodologies developed yet: results can be hard to interpret or impossible to compare between implementations

Traffic Analysis

  • Collected traffic data is of little use without strong ability to analyze that data and predict network behavior

  • Simulation and Modeling give essential insights

  • However, there is currently little consensus on how to model IP traffic. Telephony models (developed at Bell Labs and elsewhere) rely on queuing theory and other techniques that do not transfer readily to the packet-switched Internet. In particular, Erlang distributions, Poisson arrivals, and other means of predicting call-blocking probabilities and other vital telephony service characteristics typically don’t apply to wide area internetworking technologies

Projects: Coral/OC12mon

  • Coral/OC12mon, a passive measurement architecture deployed on iMCI and vBNS backbones

  • Flow-based traffic characterization: flow size by protocol, percentage composition of traffic by protocol and application, distributions of flow sizes, length of packet trains, statistics on IP fragmentation, prefix length distribution, and address space utilization

  • Matrices of traffic flow by country or AS, traffic import and export, routing/address space coverage

  • Non-flow-based analysis: interarrival time behavior, protocol-relevant (TCP retransmissions, packet size distributions), security applications
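Flow-based characterization starts by grouping packets into flows. The sketch below is one simple way to do that, keyed on the usual 5-tuple with an idle timeout; it is not the Coral/OC12mon implementation, and the 64-second default, the record layout, and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

FiveTuple = Tuple[str, str, str, int, int]  # (src, dst, protocol, src port, dst port)
Packet = Tuple[float, FiveTuple, int]       # (timestamp in s, flow key, size in bytes)


def aggregate_flows(packets: Iterable[Packet], idle_timeout: float = 64.0) -> List[dict]:
    """Group time-ordered packets into flows; a gap above `idle_timeout` starts a new flow."""
    active: Dict[FiveTuple, dict] = {}
    finished: List[dict] = []
    for timestamp, key, size in packets:
        flow = active.get(key)
        if flow is not None and timestamp - flow["last"] > idle_timeout:
            finished.append(flow)  # previous flow on this key has expired
            flow = None
        if flow is None:
            flow = {"key": key, "first": timestamp, "last": timestamp,
                    "packets": 0, "bytes": 0}
            active[key] = flow
        flow["last"] = timestamp
        flow["packets"] += 1
        flow["bytes"] += size
    finished.extend(active.values())
    return finished


def byte_share_by_protocol(flows: Iterable[dict]) -> Dict[str, float]:
    """Fraction of all bytes carried by each protocol."""
    totals: Dict[str, int] = defaultdict(int)
    for flow in flows:
        totals[flow["key"][2]] += flow["bytes"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {protocol: total / grand_total for protocol, total in totals.items()}
```

Flow size and duration distributions follow directly from the `bytes`, `first` and `last` fields of each returned flow.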

Projects: NIMI

  • Goal is to develop NIMI infrastructure for a very large (global) network that would comprehensively and consistently:

    • diagnose performance problems

    • measure properties of a wide range of network paths for research purposes

    • provide systematized assessment of ISP performance thus spurring ISPs to optimize their networks

    • facilitate public access to Internet measurements

    • scale to the global Internet

  • Based on Vern Paxson’s original NPDs: a collection of measurement probes cooperatively and actively measures the properties of Internet paths and clouds by exchanging traffic among themselves, with an emphasis on scalability

  • NIMI is targeted as the fundamental measurement platform, with other measurement infrastructures to be built on top of it

NIMI - II

  • Design goals:

    • Work in administratively diverse environment

    • Work in commercial Internet

    • Support a wide range of measurements

    • Conduct active measurements rather than passive (because of commercial Internet)

    • Scale to thousands of measurement platforms (minimizing measurement and control traffic)

    • Give platform owners full administrative and policy control over their platforms…

    • …but make it easy for platform owners to delegate control (and not exercise it)

    • …and provide fine-grained control when needed

    • Build in solid security and authentication from the beginning (system design integrity suffers when security mechanisms are added late in the design process)

    • Require minimal administration, maximal self-configuration (scalability)

NIMI - III

  • NIMI Architecture goals:

    • Measurement requests

    • Credential-based authentication (public-key cryptography)

    • Policy based on ACLs (access control lists, representing NIMI platform’s measurement and control policies) and credentials

    • Security and Privacy (public-key cryptography)

    • Delegating trust (hierarchical with subtables)

    • Autoconfiguration

  • NIMI Architecture conceptually consists of NIMI platforms (perform measurements and record results) and different external components that analyze the measurements and control the platforms

  • Each NIMI platform runs a measurement server whose job is to:

    • authenticate measurement requests as they arrive

    • check requests against platform’s policy table

    • queue them for future execution

    • execute them at the appropriate time

    • bundle and ship the results of the requests to whatever destination the request specified

    • delete results when instructed to
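The measurement server's duties listed above can be sketched as a small request pipeline. This is a toy model, not NIMI code: all class and field names are hypothetical, and a lookup in a set of known credential strings stands in for the public-key authentication that NIMI actually uses. Deleting results on instruction is left out.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass(order=True)
class Request:
    run_at: float                               # scheduled execution time
    credential: str = field(compare=False)      # stand-in for a public-key credential
    tool: str = field(compare=False)            # name of the measurement module
    destination: str = field(compare=False)     # where results should be shipped


class MeasurementServer:
    def __init__(self, known_credentials: Set[str],
                 policy: Dict[str, Set[str]],
                 tools: Dict[str, Callable[[], str]]) -> None:
        self.known_credentials = known_credentials
        self.policy = policy    # credential -> tools it may run (a toy ACL)
        self.tools = tools      # plug-in measurement modules
        self.queue: List[Request] = []
        self.shipped: List[Tuple[str, str]] = []

    def submit(self, request: Request) -> bool:
        """Authenticate, check against the policy table, then queue."""
        if request.credential not in self.known_credentials:
            return False
        if request.tool not in self.policy.get(request.credential, set()):
            return False
        heapq.heappush(self.queue, request)
        return True

    def run_due(self, now: float) -> None:
        """Execute every queued request whose time has come and ship its result."""
        while self.queue and self.queue[0].run_at <= now:
            request = heapq.heappop(self.queue)
            result = self.tools[request.tool]()
            self.shipped.append((request.destination, result))
```

Keeping the tools in a plain dictionary mirrors the idea that the server itself knows nothing about individual measurement modules.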

NIMI - IV

  • Internally, the NIMI probe is divided into two distinct daemons:

    • nimid is responsible for communication with the outside world and performing access control checks

    • scheduled does the actual measurement scheduling, execution and result packaging

  • External elements:

    • CPOC (Configuration Point of Contact), which serves to configure and administer a set of NIMI probes within the CPOC’s sphere, in particular:

      • CPOC provides the initial policies for each distinct NIMI probe, and, over time, provides updates to these policies

      • When needed, CPOC acts as a repository for NIMI public keys and measurement modules

    • MC (Measurement Client), which end users use to access the infrastructure. MC communicates directly with the NIMI probes involved in the measurement (CPOC is not involved in the processing of individual measurement requests)

    • DAC (Data Analysis Client) acts as repository and post-processor of the data returned by NIMI probe(s) upon completion of a measurement

NIMI - V

  • NIMI is modular: it has no knowledge of particular measurement tools, so the tools are standalone plug-in modules produced by third parties

  • Currently (year 2000) the following measurement modules have been deployed: traceroute, mtrace, treno, cap/capd, zing, mflect, traffic/discardd, ftp

  • Two major problems are currently being solved by the NIMI team: how to update the software on the measurement platforms securely, and how to constrain the resources consumed by different measurements

  • Measurements of standardized performance metrics (RFCs by IETF IPPM WG)

  • Other research groups that develop probe platforms for smaller groups of sites: IPMA (Merit Network), Surveyor, Felix (Telcordia)

Projects: Surveyor

  • Surveyor, a measurement infrastructure that measures end-to-end unidirectional delay, packet loss, and route information along Internet paths

  • Deployed in Abilene (Internet2 network) at about 60 higher education and research sites throughout the world, measures over 1500 paths among these sites (almost full mesh), including transatlantic and transpacific paths

  • Features:

    • Techniques for scalable and accurate measurements, tools for analysis, architecture for long-term storage and data access

    • Stress the importance of one-way measurements as opposed to traditional round-trip measurements

  • Goal: to create architecture for consistent Internet measurement to promote accurate common understanding of performance and reliability of the Internet paths

  • Measures one-way delay (RFC 2679) and one-way loss (RFC 2680) over long periods of time, according to metrics specification developed by IETF IPPM workgroup (see RFC 2330, 2678); routing information (modified V.Jacobson’s traceroute)

Surveyor - II

  • Emphasis is on unidirectional properties:

    • many Internet paths are asymmetric (the sequences of routers in the forward and reverse directions differ). In the presence of asymmetric paths, traditional round-trip measurements (e.g., “ping” for latency) measure the combined performance of two different paths

    • even if the path is symmetric, load (and therefore performance) may be radically different in the two directions. Examples: transatlantic and transpacific paths; as a particular example, traffic from USA to New Zealand is roughly 4 times higher than in reverse direction. Web caches in New Zealand take advantage of this asymmetry

    • clock synchronization is necessary for one-way measurements, therefore global positioning system (GPS) hardware is used (precision of synchronization is better than 1 millisecond; in practice, 2 microseconds on the average)

Surveyor - III

  • Dedicated measurement hardware:

    • to ensure that each machine is uniform and runs with a controlled load (unlike general-purpose multi-user workstations, prone to noise in measurements)

    • special hardware to synchronize clocks, which is easier to install and maintain using a dedicated computer

    • to provide a high level of security (to ensure measurements and the measurement instrument are not compromised and are not sources of attacks)

  • Continuous measurement to accurately record traffic fluctuations

  • Long-term performance data for provisioning, capacity planning and overall engineering of networks and network research

  • Real-time access to performance data – for real-time troubleshooting

Surveyor - IV

  • Measurement methodology:

    • Delay and loss are measured using the same stream of active test traffic. A Poisson process on the sending machine schedules test packets (λ = 2, i.e., an average sending rate of 2 packets/s). New Zealand and Swiss sites use λ = 1. 12-byte UDP packets are used (minimal size, for beginning)

    • Fractal (self-similar) nature of Internet traffic: frequent snapshots are desirable. However, the amount of test traffic should not perturb measurements

    • Also, disk space was a limitation: 172,800 measurements a day per path (2 packets/s over 86,400 s) initially required more than 2 Mbytes of disk space per day plus relational database overhead

    • Delay: receiver subtracts timestamp in received packet from current time

    • Loss: a packet does not arrive in 10 seconds (sequence numbers, Poisson process)

    • Route information:

      • Modified traceroute, with 10 (instead of 3) ICMP probes in case of failure, and 1 (instead of 3) – in case of success; traceroutes are Poisson-generated with period 10 minutes on average, with a forced traceroute if interval exceeds 10 minutes
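The delay and loss rules on this slide can be sketched as follows. The code draws Poisson send times from exponentially distributed gaps and applies the 10-second loss rule; it is an illustration under stated assumptions, not Surveyor's implementation, and the function names and the fixed random seed are choices made here. It also assumes sender and receiver clocks are already synchronized, which Surveyor achieves with GPS.

```python
import random
from typing import List, Optional


def poisson_send_times(rate_per_s: float, duration_s: float, seed: int = 0) -> List[float]:
    """Send times of a Poisson process: exponential gaps with mean 1 / rate."""
    rng = random.Random(seed)
    now, times = 0.0, []
    while True:
        now += rng.expovariate(rate_per_s)
        if now >= duration_s:
            return times
        times.append(now)


def one_way_delay(sent_ts: float, received_ts: Optional[float],
                  loss_timeout_s: float = 10.0) -> Optional[float]:
    """Receiver-side delay in seconds, or None if the packet counts as lost.

    A packet is treated as lost when it never arrives or arrives more than
    `loss_timeout_s` after it was sent.
    """
    if received_ts is None or received_ts - sent_ts > loss_timeout_s:
        return None
    return received_ts - sent_ts
```

Poisson scheduling avoids locking the probes onto any periodic behavior in the network, which is why the send times are random rather than evenly spaced.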

Surveyor - V

  • Surveyor infrastructure:

    • Dedicated measurement machines: Dell desktop PCs with 200-400 MHz Pentium processors, NIC (10base-T, 100base-T, FDDI, OC-3), GPS card (TrueTime; ISA, PCI; with antenna and GPS daughter board by Trimble), BSDI OS v.3.1. These machines report to a central database

    • Database (4-processor Silicon Graphics Origin 200 with FibreChannel 600-Gb RAID storage array): catalogs all the performance data from dedicated measurement machines (transferred using ssh)

    • Analysis server: performs analysis (generates summary statistics for each path; produces 24-hour plots) and posts results on the Web (3 daily plots for each path: delay summary, loss summary, histogram of delay values)

  • Current work:

    • wider Abilene deployment: test packets with DiffServ byte set to test QoS-enhanced paths, deployment inside Qbone testbed, (IPv6 and multicast paths?)

    • SNMP alerts about “interesting” paths, more near-real-time access, more analysis enhancements

Other Measurement Efforts

  • IPMA project at Merit Network, Inc.

    • routing protocol collectors to understand dynamic routing behavior in the Internet

  • PingER project at Stanford

    • complex measurements at monitoring sites throughout the high-energy physics research community

  • WAND project in New Zealand

    • passive (!) unidirectional measurements using GPS

  • RIPE Test-Traffic project in Europe

    • IETF IPPM unidirectional metrics, similar to Surveyor

  • Felix project at Telcordia

    • Prototype monitoring infrastructure to track “health” of large networks, without requiring prior knowledge of network topology or routing information

    • Linear Decomposition Algorithms (LDA) for topology discovery and performance evaluation of specific network elements

Skitter tool by CAIDA

  • Macro-level analysis of the Internet

  • Measures forward IP paths from a single source to many destinations (1998: 23,000) using traceroute-like incrementing of the TTL, one hop at a time

  • Key goals:

    • to identify and track routing behavior, e.g. providing indications of low-frequency persistent routing changes

    • to assist in dynamic discovery of network connectivity through probing paths to destinations spread throughout the IPv4 address space

  • A secondary goal is to collect RTTs for the paths to each of these destinations for analysis of general trends in Internet performance

Conclusion: Challenges

  • Less invasive active measurements

  • More effective passive measurements

  • Improving impact of measurements – aggregating, mining and visualizing massive data sets in ways that are useful to many people

  • Mapping IP addresses to more useful entities: specific systems, their geographic location, countries, etc.

  • Both top-down and bottom-up momentum are needed

  • Internet data analysis is no longer justifiable as an isolated activity; the Net has grown too large and is under the auspices of too many independent, uncoordinated entities, so a coordinated effort is greatly needed