This paper explores a distributed system that allows end users to identify and pinpoint network faults degrading their applications. It introduces Tulip, a diagnostic tool for fault localization, and evaluates its effectiveness. The methodology compares Tulip with other tools to locate loss and delay in the Internet, emphasizing the persistence of faults. Recommendations include path verification and utilizing IP identifiers and router timestamps. The study concludes that Tulip is a practical tool for diagnosing packet reordering, loss, and queuing in the network.
User-level Internet Path Diagnosis Ratul Mahajan, Neil Spring, David Wetherall and Thomas Anderson Designed by Yao Zhao
A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable. L. Lamport
Motivation • Can end users, with no special privileges, identify and pinpoint faults inside the network that degrade the performance of their applications? • Why (unprivileged) end users? • Operators do not share the users’ view of the network • Operators may have no more insight than unprivileged users into problems inside other administrative domains • Users can directly contact the responsible ISP, leading to faster problem resolution • Many techniques are more effective and scalable with fault localization than with blindly trying all possibilities
Outline • Diagnosis architecture • Diagnosis Tool: Tulip • Evaluation • Recommendations • Conclusion
An Ideal Trace-based Solution • Routers log packet activity and make these traces available to users. • The log at each router is recorded for both input and output interfaces. • Impractical to deploy
Packet-based Solutions • Complete Embedding • Each router along the path records information into each packet that it forwards. • Barring two exceptions, the scheme above is equivalent to the path trace. • Reduced Embedding • Remove the step of embedding the complete input packet in the output packet • Constant Space Embedding • Sample TTL • Real Clocks • Unsynchronized clock • Finite precision
Outline • Diagnosis architecture • Diagnosis Tool: Tulip • Evaluation • Recommendations • Conclusion
Internet Approximations • Out-of-band measurement probes • ICMP timestamp requests to access time at the router • IP identifiers instead of per-flow counters
Assumptions for Packet Loss • IP-IDs are consecutive • 80% of the time from over 90% of the routers • Small packets usually have a low loss rate • In over 60% of the cases when any packet in the triplet was lost, only the data packet was lost. • ICMP rate-limiting will not be mistaken for packet loss • One additional check packet
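The triplet and consecutive IP-ID assumptions above can be illustrated with a small sketch. It assumes one plausible reading of the slide: a check packet, a data packet, and a second check packet are sent to the same router, and the IP-IDs of the responses reveal whether a missing data response was lost on the way to the router or on the way back. The function names, the `None` convention for a missing response, and the outcome labels are hypothetical and not taken from Tulip itself.

```python
from typing import Optional

ID_MODULUS = 1 << 16  # the IP identification field is 16 bits wide


def id_gap(first_id: int, second_id: int) -> int:
    """Distance from first_id to second_id, allowing 16-bit wraparound."""
    return (second_id - first_id) % ID_MODULUS


def classify_triplet(check1_id: Optional[int],
                     data_id: Optional[int],
                     check2_id: Optional[int]) -> str:
    """Classify one check/data/check probe triplet.

    Each argument is the IP-ID of the router's response to that probe,
    or None when no response arrived (a convention assumed for this sketch).
    """
    if check1_id is None or check2_id is None:
        # Without both check responses the IP-ID gap cannot be measured.
        return "inconclusive"
    if data_id is not None:
        return "no_loss"
    gap = id_gap(check1_id, check2_id)
    if gap == 1:
        # The router never generated a data response: the probe did not reach it.
        return "forward_loss"
    if gap == 2:
        # The router used one IP-ID in between: a response was sent but lost.
        return "reverse_loss"
    # A larger gap means the router sent other traffic in between.
    return "inconclusive"
```

For example, `classify_triplet(100, None, 101)` yields `"forward_loss"`, while `classify_triplet(100, None, 102)` yields `"reverse_loss"`. The sketch relies on the slide's assumption that the small check packets are rarely lost, which is why a missing check response is treated as inconclusive rather than guessed at.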
Packet Queuing • Similar to cing • Two practical problems: • ICMP generation time • Cable modems and wireless links
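A minimal sketch of how queuing delay might be estimated from router timestamps, assuming the common approach of subtracting the smallest observed one-way delay from each sample. Because the sender and router clocks are unsynchronized, the raw difference includes an unknown constant offset, and that offset cancels when the minimum is subtracted. The function name and the sample layout are hypothetical, and the sketch ignores clock drift, timestamp wraparound at midnight, and the ICMP generation time that the slide lists as a practical problem.

```python
from typing import List, Tuple


def queuing_delays(samples: List[Tuple[int, int]]) -> List[int]:
    """Estimate per-probe queuing delay in milliseconds.

    samples holds (sender_ms, router_ms) pairs: the local send time and the
    timestamp the router reported, each on its own unsynchronized clock.
    """
    if not samples:
        return []
    # Raw difference = true one-way delay + unknown constant clock offset.
    raw = [router_ms - sender_ms for sender_ms, router_ms in samples]
    # The smallest raw value approximates the delay with empty queues.
    baseline = min(raw)
    return [value - baseline for value in raw]
```

With samples `[(0, 1005), (100, 1105), (200, 1225)]` the raw differences are 1005, 1005, and 1025, so the estimated queuing delays are `[0, 0, 20]` ms. The estimate is only as good as the baseline: if every probe in the sample was queued, the delays are underestimated.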
Tulip • Network Load • BL/W • Diagnosis time • 10 ~ 30 min per path • Parallel search vs Binary search • Two or more faults?
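The trade-off between parallel and binary search can be sketched as follows, assuming a hypothetical probe function `is_faulty_up_to(h)` that reports whether the path prefix ending at hop `h` exhibits the fault, and assuming a single persistent fault so that every prefix beyond the faulty hop also reports it. Neither function is taken from Tulip; they only illustrate the two strategies named on the slide.

```python
from typing import Callable, Optional


def locate_fault_binary(path_length: int,
                        is_faulty_up_to: Callable[[int], bool]) -> Optional[int]:
    """Find the first faulty hop (1-based) with about log2(path_length) probes."""
    if path_length < 1 or not is_faulty_up_to(path_length):
        return None  # the full path shows no fault
    low, high = 1, path_length
    while low < high:
        mid = (low + high) // 2
        if is_faulty_up_to(mid):
            high = mid      # fault is at hop mid or earlier
        else:
            low = mid + 1   # fault is after hop mid
    return low


def locate_fault_parallel(path_length: int,
                          is_faulty_up_to: Callable[[int], bool]) -> Optional[int]:
    """Probe every hop (conceptually at the same time) and report the first faulty one."""
    results = [is_faulty_up_to(hop) for hop in range(1, path_length + 1)]
    for hop, faulty in enumerate(results, start=1):
        if faulty:
            return hop
    return None
```

For a 12-hop path with a fault at hop 7, both `locate_fault_binary(12, lambda h: h >= 7)` and `locate_fault_parallel(12, lambda h: h >= 7)` return 7. Binary search sends fewer probes but needs several sequential rounds, while the parallel variant finishes in one round at the cost of more network load. With two or more faults, both sketches report only the first faulty hop, so later faults would need a further search on the remainder of the path.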
Outline • Diagnosis architecture • Diagnosis Tool: Tulip • Evaluation • Recommendations • Conclusion
Methodology • Evaluate applicability • Diagnosis granularity • Three sources: MIT, U Washington and London • Destinations from Skitter • Validation
Validation • IP-IDs and ICMP timestamp vs End-to-end measurement • Tulip vs Sting • Consistency of Tulip’s inferences • Consistency between Tulip and Paths
Two facts • Locating Loss and Delay in the Internet • Persistence of Faults
Outline • Diagnosis architecture • Diagnosis Tool: Tulip • Evaluation • Recommendations • Conclusion
Limitations of Tulip • Out-of-band measurements • Stable routing path • IP-ID counters • Limitations of ICMP timestamps
In-band vs Out-of-band Diagnosis • Priority of protocols • Packet drop • Packet size • Loss rate • Reordering
Other Recommendations • Path Verification • IP Identifiers • Router Timestamps
Related Work • Diagnosis Approaches • Magpie • SPIE • NetFlow • Measurement Primitives • Overlay primitives • IPMP • Measurement Tools • ping, traceroute, pathchar, Sting
Conclusion • Tulip • Practical tool to diagnose packet reordering, loss and queuing • Diagnosis architecture • In-band • Lightweight