
Pushing group communication to the edge will enable radically new distributed applications

Ken Birman, Cornell University


Presentation Transcript


  1. Pushing group communication to the edge will enable radically new distributed applications Ken Birman Cornell University

  2. First, a premise • An industry driven by disruptive changes • Incremental advances are fine but don’t shake things up very much • Those who quickly seize and fully exploit disruptive technologies thrive while those who miss even a single major event are left behind • Our challenge as researchers? • Notice these opportunities first

  3. Ingredients for disruptive change • Set the stage: • Pent up opportunity to leverage legacy app base • Infrastructure enablers suddenly in place • Potential for radically new / killer applications • Some sort of core problem to solve • Often demands out-of-the-box innovation • Deployable in easily used form • These days, developers demand integrated tools

  4. Why group communication? • Or more specifically… • Why groups as opposed, say, to pub-sub? • Why does the edge of the network represent such a big opportunity? Who would use it, and why? • Can software challenges (finally) be overcome? • How to integrate with existing platforms?

  5. Groups: A common denominator • Not a new idea… dates back 20 years or more • V system (Stanford) and Isis system (Cornell) • A group is a natural distributed abstraction • Can represent replicated data or services, shared keys or other consistent state, leadership or other coordination abstractions, shared memory… • These days would visualize a group as an object • Can store pointer to it in file system name space • Each has an associated type (“endpoint management class”) • A process opens as many groups as it likes (like files)

  6. A very short history of groups • V system offered O/S level groups but stalled • Isis added “strong semantics” and also many end-user presentations, like pub-sub • Strong semantics indistinguishable from a single entity implementing same abstraction • Our name for this model: virtual synchrony • Can be understood as a form of transactional serializability but with processes, groups, messages as components • Weaker than Paxos, which uses a strong consensus model • Pub-sub emerged as the most popular end-user API

  7. Virtual Synchrony Model [Figure: timeline of processes p, q, r, s, t across membership views G0={p,q}, G1={p,q,r,s}, G2={q,r,s}, G3={q,r,s,t}. Events: r and s request to join; r and s added, state transfer; p fails (crash); t requests to join; t added, state transfer.] ... to date, the only widely adopted model for consistency and fault-tolerance in highly available networked applications
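The view sequence on this slide can be replayed with a toy membership tracker. This is only a sketch: the `Group` class and its `install`, `join`, and `fail` methods are hypothetical names invented for illustration, not the Isis or Quicksilver API, and it models membership views only (no message ordering, no actual state transfer).

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """Toy membership tracker: every membership change installs a new view."""
    views: list = field(default_factory=list)

    def install(self, members):
        # Record a new view and return its index (0 for G0, 1 for G1, ...).
        self.views.append(frozenset(members))
        return len(self.views) - 1

    @property
    def current(self):
        return self.views[-1]

    def join(self, *procs):
        # Joiners that arrive together are added in a single new view;
        # a real system would transfer state to them at this point.
        return self.install(self.current | set(procs))

    def fail(self, proc):
        # A crashed member is dropped by installing a view without it.
        return self.install(self.current - {proc})


# Replay the events from the slide.
g = Group()
g.install({"p", "q"})   # G0
g.join("r", "s")        # G1: r and s added
g.fail("p")             # G2: p fails
g.join("t")             # G3: t added

for i, view in enumerate(g.views):
    print(f"G{i} = {sorted(view)}")
```

Each process that survives from one view to the next sees the same sequence of views, which is the part of the model this sketch is meant to make concrete.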

  8. Four models side-by-side • Traditional pub-sub as supported in current products • No guarantees, often scales poorly (1-1 TCP connection). • Versions that use IP multicast are prone to melt-down when stressed • Virtual synchrony • Fastest and most scalable of the “strongly consistent” models • Cheats on apparent synchrony whenever it can, reducing risk of a correlated failure caused by a poison-pill multicast • Paxos • Closest fit to “consensus” agreement and fault-tolerance properties • Performance limited by 2-phase protocol required to achieve this • Virtual synchrony uses Paxos-like protocol to track group membership • Transactions • Most expensive of all: updates touch persistent storage • Execution model too constraining for high-speed communication apps

  9. Successes… and failures • Some successes: • New York Stock Exchange, Swiss Exchange, French Air Traffic Control System, US Navy AEGIS, telephony, factory automation, Microsoft Vista clusters, IBM Websphere. • Paxos popular for small fault-tolerance services • Some failures: • Implementations were often fragile, didn’t scale well, poorly integrated with development environments • Pub-sub users tolerated weak semantics • Bottom line: Market didn’t scale adequately

  10. Why didn’t the market scale? • Mile high perspective: • All existing group communication solutions targeted server platforms, e.g. to replicate data in a clustered application • Pub-sub became a majority solution for sending data (for example stock trades) from data centers to client platforms • But neither market was ultimately all that large • The number of server platforms is tiny compared to the number of client systems… and group communication isn’t the whole story – you always needed “more” technology • Meanwhile, once you license pub-sub to every major trading floor you exhaust the associated revenue opportunity

  11. Why didn’t either displace the other? • Pub-sub systems were best-effort technologies • Compete with systems that just make lots of TCP connections and push data… • Lacking stronger semantics, applications that want security or stronger reliability had to build extra end-to-end logic • But group communication systems had scalability issues of their own • Most platforms focus on processes using a small number of small groups (often just one group of 3-5 members) • Other “positioning” involves relaying through a central service

  12. Is there an answer? • Retarget group communication towards the edge of the network! • Provide it as a direct client-to-client option • Integrate tightly into the dominant client computing platform (.net, web services) • Make it scale much better • Now value is tied to number of client systems, not number of servers…

  13. Potential roles for group communication at the edge of the net? • Gaming systems and VR immersion • Delivery of streaming media, stock quotes, projected pricing for stock and bonds… • Replication of security keys, other system structuring data and management information • Vision: Any client can securely produce and/or consume data streams… servers are present but in a “supporting role”

  14. Recall our list of ingredients… • Pent up demand: developers have lacked a way to do this for decades… • Technology enabler: Availability of ubiquitous broadband connectivity, high bandwidths • Potential for high-value use in legacy apps and potential for new killer apps ... But can we overcome scalability limits?

  15. Quicksilver: Krzys Ostrowski, Birman, Phanishayee, Dolev • Publish-subscribe eventing and notification • Scalable in many dimensions • Number of publishers, subscribers • Number of groups (topics) • Churn, failures, loss, perturbances • High data rates • Reliable • Easy to use, and supporting standard APIs

  16. Quicksilver: Key ideas • Design dissemination, reliability, security, virtual synchrony as concurrently active “stacks” • No need to relay multicasts through any form of centralized service… • Send typical message with a single (or a few) IP or overlay multicasts • Meta-protocols aggregate work across groups for efficiency • System is extensively optimized to maximize throughput in all respects

  17. Quicksilver is a work in progress… • The basic scalable infrastructure is working today (coded in C#, runs on .net) • We’re currently adding: • Clean integration into .net to make it easy to use in much the same way that “files” are used today • Scalable virtual synchrony protocols, may also offer Paxos for those who want stronger model • Comprehensive, scalable security architecture • Planning a series of free releases from Cornell

  18. Conclusions? • Enablers for a revolution at the edge • Groups that look like a natural part of .net, web services • Incredibly easy to use… much like shared files. • They scale well enough so that they can actually be used in the ways you want… and offer powerful security and consistency guarantees for applications that need them • Will also integrate with a persistency service to capture the history associated with a group if desired. Like a transactional log… but much faster and more flexible! • Shared group with strong properties enable a new generation of trustworthy applications

  19. Extra Slides (provided by Krzys) • What to read? Our OSDI submission: • QuickSilver Scalable Multicast.  Krzysztof Ostrowski, Ken Birman, and Amar Phanishayee.  • On www.cs.cornell.edu/Projects/Quicksilver • QSM itself is available for download today.

  20. Non-Goals • We don’t aim at “real time” guarantees • We always try to deliver messages, even if late • We can sacrifice latency for throughput • Unavoidable trade-off • Buffering, scheduling (at high rates, systems aren’t idle) • But we’re still in 10-30ms range for 1K messages • We don’t do pub-sub filtering • We provide multicast. Cayuga will do the filtering.

  21. Why multiple groups? • Groups are out there in the wild...

  22. Why multiple groups? • Groups are easy to think of... • Why not use them like we would use files? • Separate group for each: • Event • Category of data items, user requests • Stock • Category of products • Type of service • May lead to new, easier ways of programming

  23. Limitations of existing approaches • Existing protocols aren’t enough • Designed to scale in one dimension at a time • Overheads • Bottlenecks • Costly to run (typically CPU-bound) • Example: JGroups • Popular, and considered a solid platform. • Part of JBoss. • Running in managed environment (Java)

  24. Limitations of existing approaches • 1 sender, 1 group, sending as fast as possible • Cluster of PIII 1.3 GHz, 512 MB, 100 Mbps

  25. Limitations of existing approaches • 1 sender sending as fast as possible • All groups have the exact same members

  26. Limitations of existing approaches Lightweight groups: • Overloaded agents • Wasted bandwidth • Filtering on receive • Extra network hops Protocol per group: • ACK/NAK overload

  27. QSM was tested on 110 nodes • 1 group, sending as fast as possible • Rates set manually

  28. QSM was tested on 8192 groups • 1 sender • Groups perfectly overlap

  29. QSM is very cheap to run • 1 group, 110 nodes, 1-2 senders

  30. QSM is very cheap to run... • 1 sender, 1 group, 110 nodes, maximum rate

  31. QSM has an acceptable latency...

  32. ...yet sometimes it needs tuning • The default buffering settings lead to higher latencies for large messages • Lower latencies are achievable by manually tuning its settings

  33. QSM tolerates bursty packet loss • 1 sender, 110 nodes • Once every 10s, a selected node (receiver) drops every incoming packet, including data and control, for a period of 1s, then returns to normal for the remaining 9 seconds

  34. QSM tolerates bursty packet loss • 1 sender, 110 nodes • Loss occurs every 10s as before • Duration of the loss is varying
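The loss pattern used in these two experiments is simple enough to write down. The sketch below is not the actual test harness; the function name `in_loss_burst`, the choice to place the burst at the start of each period, and the 10 ms packet spacing are all assumptions made for illustration. Times are integer milliseconds to avoid floating-point edge cases.

```python
def in_loss_burst(t_ms, period_ms=10_000, burst_ms=1_000):
    """Return True if a packet arriving at time t_ms is dropped.

    Assumption: the burst occupies the first burst_ms of every period.
    Varying burst_ms corresponds to the second experiment, where the
    duration of the loss changes but the 10 s period stays the same.
    """
    return t_ms % period_ms < burst_ms


# Hypothetical workload: one packet every 10 ms for 30 seconds.
arrivals = range(0, 30_000, 10)
dropped = sum(in_loss_burst(t) for t in arrivals)
print(f"{dropped} of {len(arrivals)} packets dropped")
```

With the default 1 s burst, the affected receiver misses one tenth of the traffic in concentrated bursts, all of which has to be recovered afterwards.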

  35. QSM tolerates node crashes...

  36. ...and much worse scenarios • Worst case scenario: a node freezes for 10s in the middle of the run, but then resumes and triggers a substantial amount of loss recovery

  37. Cumulative effect of perturbances • Cumulative delay: how much extra time we need to send the same N messages as a result of the perturbance
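One way to read the cumulative delay metric is as the gap between two runs at the same message count N. The sketch below assumes we have, for each run, the time at which the N-th message finished sending; the function name and the sample numbers are made up for illustration and are not measurements from the talk.

```python
def cumulative_delay(baseline_ms, perturbed_ms):
    """Extra time needed to send the first N messages, for each N.

    baseline_ms[i] and perturbed_ms[i] are the completion times (in ms)
    of message i+1 in an undisturbed run and a perturbed run.
    """
    return [p - b for b, p in zip(baseline_ms, perturbed_ms)]


# Made-up example: a perturbance hits before the third message.
baseline = [10, 20, 30, 40, 50]
perturbed = [10, 20, 45, 55, 65]
delays = cumulative_delay(baseline, perturbed)
print(delays)
```

In this made-up run the delay settles at a constant value after the perturbance, meaning the sender never catches up but also does not fall further behind.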

  38. QSM doesn’t collapse but might oscillate • 2 senders, 110 nodes • Trying to send at a rate that exceeds the maximum of what can be achieved

  39. Key Insight: Regions • G(x) = set of groups node x is a member of • x and y are in the same region if G(x) = G(y) • Interest sharing • Receiving the same messages • Fate sharing • Experiencing the same load, burstiness • Experiencing the same losses • Being similarly affected by churn, crashes etc.
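The region definition above translates directly into a grouping by membership set. The following is a minimal sketch of that definition only; `compute_regions` and the sample node and group names are hypothetical, and nothing here reflects how QSM computes or maintains regions internally.

```python
from collections import defaultdict


def compute_regions(membership):
    """Partition nodes into regions.

    membership maps each node x to G(x), the set of groups it belongs to.
    Nodes x and y land in the same region exactly when G(x) == G(y).
    Returns a dict from frozenset(groups) to a sorted list of nodes.
    """
    regions = defaultdict(list)
    for node, groups in membership.items():
        regions[frozenset(groups)].append(node)
    return {signature: sorted(nodes) for signature, nodes in regions.items()}


# Hypothetical membership: five nodes, three groups.
membership = {
    "a": {"g1", "g2"},
    "b": {"g2", "g1"},
    "c": {"g2"},
    "d": {"g2", "g3"},
    "e": {"g2"},
}
regions = compute_regions(membership)
for signature, nodes in regions.items():
    print(sorted(signature), "->", nodes)
```

Here five nodes collapse into three regions. All nodes in a region receive exactly the same messages, which is the interest-sharing property listed on the slide.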

  40. Key Insights: Regions

  41. Key Insights: Regions

  42. Key Insights: Regions

  43. Key Insights: Internals

  44. As of today... • Already available: • Multicast scalable in multiple dimensions • Simple reliability model (keep trying until ACK’ed) • Simple messaging API • Work in progress: • Extending with strong reliability • Support for the WS-* APIs, “typed endpoints” • Request-Reply communication
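The "keep trying until ACK'ed" reliability model can be illustrated with a small retransmission loop. This is a sketch under stated assumptions, not QSM's protocol: `send_until_acked`, the `deliver` callback (which returns True when a receiver acknowledges), and the round limit are all hypothetical, and a real implementation would use timers and batched acknowledgements rather than synchronous rounds.

```python
def send_until_acked(message, receivers, deliver, max_rounds=100):
    """Retransmit message until every receiver has acknowledged it.

    deliver(receiver, message) is a caller-supplied function that attempts
    one transmission and returns True if an ACK came back.
    Returns the number of rounds that were needed.
    """
    pending = set(receivers)
    for round_no in range(1, max_rounds + 1):
        # Only receivers that have not yet acknowledged are retried.
        acked = {r for r in sorted(pending) if deliver(r, message)}
        pending -= acked
        if not pending:
            return round_no
    raise TimeoutError(f"no ACK from {sorted(pending)} after {max_rounds} rounds")


# Simulated channel that loses the first transmission to each receiver.
attempts = {}

def drop_first_attempt(receiver, message):
    attempts[receiver] = attempts.get(receiver, 0) + 1
    return attempts[receiver] > 1


rounds = send_until_acked("hello", ["x", "y", "z"], drop_first_attempt)
print(f"delivered to all receivers after {rounds} rounds")
```

The model is simple because the sender needs no knowledge of why a message was lost; it only tracks which receivers are still silent.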

  45. Deployment scenarios • Architecture with the local “daemon” • Support multiple processes more smoothly • Support WS-BrokeredNotification • Support WS-Eventing • Support non-.NET applications by linking with a separate thin library • Only needs to talk to the local “daemon” • No need to implement any part of the QSM protocol • Small, might be written in any language! • Currently in progress...

  46. Deployment scenarios

  47. Eventing

  48. Conclusions • QSM currently delivers multicast scalable in multiple dimensions, with basic reliability properties • Future: • We are adding support for WS-* APIs • We are extending the robustness and reliability • Two new dimensions • Request-reply and pub-sub mode of communication • Strong typing of groups (e.g. for security)

  49. Publications • QuickSilver Scalable Multicast. Krzysztof Ostrowski, Ken Birman, and Amar Phanishayee. • Extensible Web Services Architecture for Notification in Large-Scale Systems. Krzysztof Ostrowski and Ken Birman. • http://www.cs.cornell.edu/projects/quicksilver/pubs.html
