
Measuring the Internet: A case study

This case study discusses the measurement of IP performance on a large scale. It covers the setting, packet generation, measurement system and methodology, and drawing conclusions.

lgenthner

Presentation Transcript


  1. Measuring the Internet: A case study, by Bob Mandeville (bob@iometrix.com) and Andrew Corlett (andrew@iometrix.com)

  2. Agenda
     • PART 1: IP Performance Measurement Case Study (What we did)
     • PART 2: Measurement System and Methodology (How we did it)
     • PART 3: Drawing Conclusions (and going on to next steps)

  3. PART 1: IP Performance Measurement Case Study (What we did)

  4. The setting
     • Large-scale test of seven of the world’s biggest ISPs
     • 28 measurement nodes (cNodes) on the backbone core of:
       • Cable & Wireless (C&W)
       • Level 3 Communications
       • Qwest Communications
       • Savvis Communications
       • Sprint Corp.
       • Verio
       • Williams Communications

  5. Measurement packet generation
     • Test ran 30 days; total project took more than a year to complete
     • cNodes generated 4,558,388,076 packets during the month of August 2002
     • All told, we collected 156,050,656 discrete measurements
     • cNodes record more than 70 IP metrics, but in this test we focused on just three: uptime, jitter, and packet loss

  6. Packet types
     • The cNodes generated vectors of both 1,518-byte TCP and 256-byte UDP packets
     • With each cNode sending both packet types to three other cities, there were a total of six vectors per cNode
     • cNodes were configured so that the aggregate transmit rate across all vectors did not exceed 512 kbit/s
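As a rough cross-check of the figures on slides 4 to 6, the arithmetic can be sketched as follows. This is only a back-of-the-envelope estimate: it assumes the load was spread evenly over all 28 cNodes, that TCP and UDP packets were sent in equal numbers, and that "kbit" means 1,000 bits. None of these assumptions is stated on the slides.

```python
# Figures quoted on the slides.
CNODES = 28
DESTINATIONS_PER_CNODE = 3
PACKET_TYPES = 2                 # 1,518-byte TCP and 256-byte UDP
TOTAL_PACKETS = 4_558_388_076
TEST_DAYS = 30
RATE_CAP_BITS_PER_S = 512_000    # assumption: 1 kbit = 1,000 bits

# Vectors: one per (destination, packet type) pair on each cNode.
vectors_per_cnode = DESTINATIONS_PER_CNODE * PACKET_TYPES
total_vectors = CNODES * vectors_per_cnode

# Average send rate per cNode, assuming an even spread over nodes and time.
test_seconds = TEST_DAYS * 24 * 60 * 60
packets_per_cnode_per_s = TOTAL_PACKETS / CNODES / test_seconds

# Packet rate that would exactly fill the cap, assuming a 50/50 packet mix.
mean_packet_bits = (1518 + 256) / 2 * 8
max_packets_per_s_at_cap = RATE_CAP_BITS_PER_S / mean_packet_bits

print(f"vectors per cNode: {vectors_per_cnode}, total: {total_vectors}")
print(f"average rate per cNode: {packets_per_cnode_per_s:.1f} packets/s")
print(f"rate at the 512 kbit/s cap: {max_packets_per_s_at_cap:.1f} packets/s")
```

Under these assumptions the average per-cNode packet rate comes out below the rate implied by the 512 kbit/s cap, which is consistent with the cap being an upper bound.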

  7. A picture tells…

  8. Measured Uptime

  9. Maintenance Windows

  10. Outages by Numbers

  11. Measured Jitter

  12. Measured Packet Loss

  13. PART 2: Measurement System and Methodology (How we did it)

  14. System Architecture [diagram labels: Browser, Database, Service-Daemon, cNode (x3), OSS, Traffic Engineering, Application #2, Application #3]

  15. Service-Daemon
      • Central hub of the Measurement System
      • Configures cNodes for measurements
      • Retrieves results and stores them in the database
      • Sophisticated state machines maintain the measurement system automatically. For example:
        • Downloads results stored in cNodes but not yet stored in the database
        • Configures cNodes that may have been power-cycled
      • cNodes continue to measure and store results internally if connectivity to the Service-Daemon is interrupted
      • CLI/scripting engine allows for external and bulk configuration
      • Runs on Windows, Solaris, and Linux
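The catch-up behaviour described on this slide (downloading results that a cNode holds but the database lacks) amounts to a set difference. The sketch below is only an illustration of that idea; the function name and the `(vector, period)` key format are hypothetical and not taken from the Service-Daemon.

```python
def results_to_download(cnode_periods, database_periods):
    """Return the measurement periods a cNode holds that the database lacks."""
    return sorted(set(cnode_periods) - set(database_periods))


# Hypothetical keys: (vector name, start of the 5-minute measurement period).
on_cnode = [("v1", "12:00"), ("v1", "12:05"), ("v1", "12:10")]
in_database = [("v1", "12:00")]

# After a connectivity gap, only the periods missing from the database
# need to be fetched.
backlog = results_to_download(on_cnode, in_database)
print(backlog)
```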

  16. Terminology
      • Vector
        • Basis of all measurements
        • Defines measurements from one cNode to another cNode
        • All packets are formatted the same (Service-Type)
        • Many different vectors can be executed simultaneously
          • HTTP, VoIP, FTP, etc.
      • Service-Type/Packet-Types
        • Defines the format of measurement packets
        • Example: TELNET, TCP port 23, 1,500-byte packets

  17. Terminology
      • Vector Handler
        • Computes and stores measurement results
        • Located on the destination cNode
      • Measurement Period
        • Interval of time that a set of results represents
        • 5-minute intervals
        • Can be combined to report or alarm on larger intervals
          • 10 min, 15 min, 30 min, 1 hr, 1 day

  18. Service-Type/Packet-Type
      • Optional UDP or TCP headers
        • Port numbers
        • TCP fields: Flags, Window Size, MSS option, Urgent Pointer
      • DSCP settings
      • Packet Length
      • Payload Type (all 0’s, all 1’s, or random)
      • TTL
      • Loose, Strict, and/or Record Route options
      • VLAN tags

  19. Continuous Measurements [diagram labels: Database, cNode (x3), Computed Results, Measurement Period (5 minutes), timeline from 12:00 to 12:25 in 5-minute steps]

  20. Results
      • Every 5 minutes, all of the packets received for a vector are processed through sophisticated algorithms, and a ~1 KB results packet is created representing all of the metrics
      • The results packet is automatically sent to the Service-Daemon and stored in the internal memory of the cNode
      • Results packets can be combined so that reports and alarms can be generated over time periods other than 5-minute intervals, e.g. 1 hour, 1 day, 1 week, or even 1 year
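Combining 5-minute results into a longer interval can be sketched as below. This is a minimal illustration, not the cNode results format: the `PeriodResult` fields are hypothetical. The point it shows is that counters add, maxima take the maximum, and ratios such as loss are recomputed from the combined counters rather than averaged across periods.

```python
from dataclasses import dataclass


@dataclass
class PeriodResult:
    start_minute: int   # minutes since midnight; a multiple of 5 for raw periods
    sent: int           # packets sent in the period
    received: int       # packets received in the period
    max_delay_ns: int   # largest one-way delay seen in the period


def combine(results):
    """Merge consecutive period results into one result for the whole span."""
    if not results:
        raise ValueError("need at least one period result")
    return PeriodResult(
        start_minute=min(r.start_minute for r in results),
        sent=sum(r.sent for r in results),
        received=sum(r.received for r in results),
        max_delay_ns=max(r.max_delay_ns for r in results),
    )


def loss_ratio(result):
    """Fraction of sent packets that never arrived."""
    return (result.sent - result.received) / result.sent if result.sent else 0.0


# Twelve 5-minute periods make one hour; one period has 10% loss.
periods = [PeriodResult(5 * i, 1000, 1000, 40_000_000) for i in range(12)]
periods[3] = PeriodResult(15, 1000, 900, 95_000_000)
hour = combine(periods)
print(hour, loss_ratio(hour))
```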

  21. Measurement Packet [diagram labels: Optional Header (UDP/TCP), Metric Header, Optional IP Header, Payload (zeros/ones/random), Ethernet Header, Ethernet CRC, IP Header, Timestamp]
      • Optional UDP or TCP headers
        • Source/Destination port numbers
        • TCP fields: Flags, Window Size, MSS option, Urgent Pointer
      • DSCP settings
      • Packet Length
      • Payload Type (all 0’s, all 1’s, or random)
      • TTL
      • Loose, Strict, and/or Record Route options
      • VLAN tags

  22. Metric Header
      • Allows measurement packets to be formed as any protocol without interfering with manageability of cNodes
        • E.g. cNodes can measure Telnet traffic while Telnet sessions are in progress on the cNode
      • Header Identifier and Version
      • Hardware Timestamp
        • UTC, 64-bit, 1 ns units
      • Packet ID
        • 64-bit
      • Initial TTL, TOS, and IP Protocol fields
      • Payload Checksum
      • Metric Header Checksum
      • Vector and Measurement Period Identification
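To make the field list concrete, here is a sketch of how such a header could be packed and verified. Only the 64-bit timestamp and 64-bit packet ID widths come from the slide; the field order, all other widths, the `MAGIC` identifier value, and the byte-sum checksum are assumptions chosen for illustration and are not the actual cNode wire format.

```python
import struct

# Assumed layout (big-endian): identifier u16, version u8, initial TTL u8,
# TOS u8, IP protocol u8, payload checksum u16, timestamp u64 (ns),
# packet ID u64, vector ID u32, measurement period ID u32.
_BODY = struct.Struct(">HBBBBHQQII")
_CSUM = struct.Struct(">H")
MAGIC = 0xC0DE  # hypothetical header identifier


def checksum16(data: bytes) -> int:
    """Placeholder checksum: byte sum truncated to 16 bits (an assumption)."""
    return sum(data) & 0xFFFF


def pack_header(timestamp_ns, packet_id, ttl, tos, proto, payload,
                vector_id, period_id, version=1) -> bytes:
    body = _BODY.pack(MAGIC, version, ttl, tos, proto, checksum16(payload),
                      timestamp_ns, packet_id, vector_id, period_id)
    return body + _CSUM.pack(checksum16(body))


def unpack_header(raw: bytes) -> dict:
    body = raw[:_BODY.size]
    (stored,) = _CSUM.unpack(raw[_BODY.size:_BODY.size + _CSUM.size])
    if checksum16(body) != stored:
        raise ValueError("metric header checksum mismatch")
    names = ("identifier", "version", "initial_ttl", "tos", "ip_protocol",
             "payload_checksum", "timestamp_ns", "packet_id",
             "vector_id", "period_id")
    return dict(zip(names, _BODY.unpack(body)))


header = pack_header(1_030_000_000_000_000_000, 42, ttl=64, tos=0, proto=6,
                     payload=b"\x00" * 16, vector_id=7, period_id=3)
fields = unpack_header(header)
print(len(header), fields["timestamp_ns"], fields["packet_id"])
```

Recording the initial TTL in the header is what lets a receiver derive hop count by comparing it with the TTL that actually arrives in the IP header.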

  23. One-Way Measurements
      • Accurate
        • 64-bit hardware timestamps
        • 12.5 ns clock synchronized by GPS (internal), 1 PPS and IRIG-B, and/or NTP
        • All counters are 64-/128-/256-bit
      • Continuous
        • Send active measurements continuously
        • Calculate results every 5 minutes
      • Comprehensive
        • Over 65 IP metrics
        • Delay (latency), jitter, loss, outages
        • Out-of-order, loss patterns, fragmentation, hop count and hop changes, DSCP changes, duplicates, corruptions
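The core one-way metrics named on this slide can be sketched from send and receive timestamps keyed by packet ID. This is a simplified illustration that assumes both clocks are synchronized; the function name is hypothetical, and jitter is taken here as the mean absolute difference between the delays of consecutive received packets, which is one common definition and not necessarily the one the cNodes use.

```python
def one_way_stats(sent, received):
    """Compute delay, jitter, and loss for one vector.

    sent:     dict mapping packet ID -> transmit timestamp in ns
    received: dict mapping packet ID -> receive timestamp in ns
    """
    # One-way delay for every packet that arrived, in send order.
    delays = [received[pid] - sent[pid] for pid in sorted(sent) if pid in received]
    lost = len(sent) - len(delays)
    # Assumed jitter definition: delay change between consecutive arrivals.
    jitter = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return {
        "received": len(delays),
        "lost": lost,
        "loss_ratio": lost / len(sent) if sent else 0.0,
        "mean_delay_ns": sum(delays) / len(delays) if delays else None,
        "mean_jitter_ns": sum(jitter) / len(jitter) if jitter else None,
    }


# Four packets sent 1 microsecond apart; packet 3 never arrives.
tx = {1: 0, 2: 1000, 3: 2000, 4: 3000}
rx = {1: 500, 2: 1700, 4: 3600}
stats = one_way_stats(tx, rx)
print(stats)
```

Because the delay is a difference between timestamps taken on two different machines, any clock offset between them appears directly in the result, which is why the slide stresses clock synchronization.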

  24. One-Way Measurements
      • Scalable
        • Highly distributed system
        • Results computed at cNodes
        • cProt allows minimal communication with cNodes for configuration and data gathering
        • Operationally: system designed to be self-maintaining
      • Scientific
        • Methodology designed from years of test and measurement experience
        • Statistical accuracy: Pullin papers (CalTech)
      • Accountable
        • Event lists account for power failures, link failures, time synchronization changes, etc.
      • Comparable
        • Over time and topology

  25. IP Packet Metrics

  26. IP Packet Metrics

  27. PART 3: Drawing Conclusions (and going on to next steps)

  28. Some conclusions drawn from the experience…
      • We disagree with the NetworkWorld article's conclusion: outages were too significant to qualify providers as ‘telco grade’
      • One-way measurements were hampered by the lack of GPS clock sources at 85% of the sites under test
      • The full set of 70 IP metrics was used successfully to analyze anomalous behavior
      • Currently, the majority of ISPs do not have advanced IP measurement capabilities deployed on their networks

  29. Is there any time left for questions?
