Reliable Multicast for Time-Critical Systems

Mahesh Balakrishnan and Ken Birman, Cornell University.


Presentation Transcript


  1. Reliable Multicast for Time-Critical Systems Mahesh Balakrishnan Ken Birman Cornell University

  2. Mission-Critical Datacenters • COTS Datacenters • Online e-tailers, search engines, corporate applications • Web-services • Mission-Critical Apps • Need: Scalability, Availability, Fault-Tolerance … Timeliness!

  3. The Time-Critical Datacenter • Migrating time-critical applications to commodity datacenters… • … conversely, providing datacenter web-services with time-critical performance.

  4. What’s a Time-Critical System? • Not ‘real time’, but ‘real fast’! • Financial calculators, military command and control… air traffic control (ATC) • … foobooks.com! • Technology Gap: Real-Time focuses on determinism, scale-up architectures

  5. The French ATC System • Mid to Late 90’s • Teams of 3-5 air traffic controllers on a cluster of desktop consoles • 50-200 of these console clusters in an air traffic control center • Why study the French ATC?

  6. ATC Subsystems • Radar Image • Weather Alert • Track Updates • Updates to Flight Plans • Console to Console State Updates • System Management and Monitoring • ATC center to center Updates • Multicast ubiquitous…

  7. Two Kinds of Multicast • Virtually Synchronous Multicast: very reliable, not particularly fast • Unreliable Multicast: very fast, not particularly reliable • Nothing in between!

  8. Two Kinds of Subsystems • Category 1: Complete reliability (virtual synchrony), e.g., routing decisions • Category 2: Careful application design + natural hardware properties + management policies, e.g., radar

  9. Multicast in the French ATC • Engineering Lessons: • Structure application to tolerate partial failures • Exploit natural hardware properties • Can we generalize to modern systems? • Research Direction: Time-Critical Reliability • Can we design communication primitives that encapsulate these lessons?

  10. Anatomy of a Cloned Service

  11. Services • An Amazon web-page is constructed by 100s of co-operating services* • Multicast is used for: • Updating Cloned Services • Publish-Subscribe / Eventing • Datacenter Management/Monitoring * Werner Vogels, CTO of amazon.com, at SOSP 2005

  12. Multicast in the Datacenter • A node is in many multicast groups: • One for each service it hosts • One for each topic it subscribes to • One or more administration groups • Large Numbers of Overlapping Groups!

  13. Service Semantics • Data Store Services: stale data can result in overselling / underselling → loss of real-world dollars • Cache Services: updated periodically by back-end data-stores

  14. The Challenge • Datacenter Blades are failure-prone: • Crash failures • Byzantine behavior • Bursty Packet Loss: End-host kernels drop packets when subjected to traffic spikes.

  15. A New Reliability Model • Rapid delivery is more important than perfect reliability • Probabilistic Timeliness • Graceful Degradation
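
One way to read "probabilistic timeliness" is as a statement about the distribution of delivery latencies rather than a hard bound. The Python sketch below is illustrative only: the function names and the sample latencies are made up for this example and do not come from the talk. It computes the fraction of packets delivered within a deadline and a nearest-rank latency percentile, treating a packet that is never delivered as having infinite latency.

```python
import math

def on_time_fraction(latencies_ms, deadline_ms):
    """Empirical P(latency <= deadline) over the observed packets."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    on_time = sum(1 for x in latencies_ms if x <= deadline_ms)
    return on_time / len(latencies_ms)

def latency_percentile(latencies_ms, p):
    """Nearest-rank percentile: smallest sample covering p% of packets."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical samples in milliseconds; inf marks a packet never delivered.
samples = [5, 8, 12, 20, 70, math.inf]
fraction = on_time_fraction(samples, deadline_ms=70)
median = latency_percentile(samples, 50)
```

Under this reading, graceful degradation means the on-time fraction falls gradually as load or loss increases instead of collapsing at a fixed limit.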

  16. Wanted: a multicast primitive that • Scales to large numbers of arbitrarily overlapping multicast groups • Delivers multicasts quickly • Tolerates datacenter failure modes – bursty packet loss, node failures • Offers probabilistic properties • ‘Gives up’ on lost data after a threshold period
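
The last requirement, giving up on lost data after a threshold period, can be sketched as a small bookkeeping structure. This is a hypothetical illustration, not the authors' implementation: `LossTracker`, its method names, and the 50 ms threshold are assumptions chosen for the example. The caller supplies timestamps, so the sketch makes no assumption about the clock source.

```python
class LossTracker:
    """Tracks packets known to be missing and abandons them after a deadline."""

    def __init__(self, give_up_after):
        self.give_up_after = give_up_after  # seconds to wait before giving up
        self.pending = {}                   # sequence number -> time loss was noticed

    def note_loss(self, seq, now):
        # Record only the first time the loss was noticed.
        self.pending.setdefault(seq, now)

    def note_recovered(self, seq):
        # Recovery arrived in time; stop tracking this packet.
        self.pending.pop(seq, None)

    def expire(self, now):
        """Return (and forget) sequence numbers whose deadline has passed."""
        expired = sorted(s for s, t in self.pending.items()
                         if now - t >= self.give_up_after)
        for s in expired:
            del self.pending[s]
        return expired

tracker = LossTracker(give_up_after=0.05)  # hypothetical 50 ms threshold
tracker.note_loss(7, now=0.0)
tracker.note_loss(8, now=0.04)
abandoned = tracker.expire(now=0.05)       # packet 7 has waited the full 50 ms
```

An application built on such a primitive would be told which packets were abandoned so it can tolerate the gap, in line with the earlier lesson about structuring applications to tolerate partial failures.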

  17. Ricochet: Lateral Error Correction • Receivers exchange error correction XORs of multicast traffic • Works very well with multiple groups: scales up to a thousand groups per node • Probabilistic Timeliness: probability distribution of delivery latencies
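
The core idea of exchanging XORs between receivers can be illustrated with a minimal Python sketch. It shows only the generic XOR repair step, not Ricochet's actual packet format or its policy for choosing which packets to combine and which receivers to send to. The function names are hypothetical, and packets are assumed to be equal length (a real protocol would have to carry lengths separately).

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a, b = a.ljust(n, b"\0"), b.ljust(n, b"\0")
    return bytes(x ^ y for x, y in zip(a, b))

def make_repair(packets, ids):
    """Build a repair packet: the ids covered plus the XOR of their payloads.

    `ids` must be non-empty and every id must be present in `packets`.
    """
    payload = reduce(xor_bytes, (packets[i] for i in ids))
    return list(ids), payload

def recover(received, repair):
    """Recover the single missing packet covered by a repair, if possible.

    Returns (id, payload), or None when zero or more than one of the
    covered packets is missing (XOR can rebuild only one).
    """
    ids, payload = repair
    missing = [i for i in ids if i not in received]
    if len(missing) != 1:
        return None
    for i in ids:
        if i in received:
            payload = xor_bytes(payload, received[i])
    return missing[0], payload

# Receiver A got packets 1-3 and sends an XOR repair to receiver B.
receiver_a = {1: b"aaaa", 2: b"bbbb", 3: b"cccc"}
repair = make_repair(receiver_a, [1, 2, 3])

# Receiver B lost packet 2 and rebuilds it from the repair.
receiver_b = {1: b"aaaa", 3: b"cccc"}
recovered = recover(receiver_b, repair)
```

Because a single XOR can repair only one missing packet among those it covers, how packets are grouped into repairs matters under bursty loss; the sketch leaves that choice open.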

  18. Predictive Total Ordering (Plato) • Delivers messages to applications with no ordering delay in most cases • Orders messages only if there is a high probability of out-of-order delivery across different nodes • Probabilistic Timeliness: probability distribution of ordered delivery latency

  19. Performance • SRM takes seconds to recover lost packets • Ricochet recovers almost all packets within ~70 milliseconds

  20. Conclusion • Move from R/T (real-time) to T/C (time-critical) yields huge benefits! • Ricochet is faster… slashes latency… scalable… • A clean delivery-delay curve is a powerful design tool, replacing traditional hard (but conservative) limits • We’re open for business: • Software and detailed paper available for download • Give it a try… tell us what you think! www.cs.cornell.edu/projects/quicksilver/ricochet.html
