1 / 46

Dr. Multicast for Data Center Communication Scalability

Dr. Multicast for Data Center Communication Scalability. Ymir Vigfusson    Hussam Abu-Libdeh   Mahesh Balakrishnan   Ken Birman Cornell University Yoav Tock IBM Research Haifa. HotNets , October 5, 2008. IP Multicast in Data Centers. IPMC is not used in data centers.

vanya
Download Presentation

Dr. Multicast for Data Center Communication Scalability

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Dr. Multicast for Data Center Communication Scalability Ymir Vigfusson   Hussam Abu-Libdeh   Mahesh Balakrishnan   Ken Birman Cornell University Yoav Tock IBM Research Haifa HotNets, October 5, 2008

  2. IP Multicast in Data Centers • IPMC is not used in data centers

  3. IP Multicast in Data Centers • IPMC is not used in data centers • Would speed up products that use multicast

  4. IP Multicast in Data Centers • Why is IP multicast rarely used?

  5. IP Multicast in Data Centers • Why is IP multicast rarely used? • Limited IPMC scalability on switches/routers and NICs

  6. IP Multicast in Data Centers • Why is IP multicast rarely used? • Limited IPMC scalability on switches/routers and NICs • Broadcast storms: Loss triggers a horde of NACKs, which triggers more loss, etc.  • Disruptive even to non-IPMC applications.

  7. IP Multicast in Data Centers • IP multicast has a bad reputation

  8. IP Multicast in Data Centers • IP multicast has a bad reputation • Works great up to a point,                                 after which it breaks                                         catastrophically

  9. IP Multicast in Data Centers • Bottom line: • Administrators have no control over multicast use ... • Without control, they opt for never.

  10. Dr. Multicast

  11. Dr. Multicast (MCMD) • Policy: Permits data center operators to selectively enable and control IPMC • Transparency: Standard IPMC interface, system calls are overloaded. • Performance: Uses IPMC when possible, otherwise point-to-point unicast • Robustness: Distributed, fault-tolerant service

  12. Terminology • Process: Application that joins logical IPMC groups • Logical IPMC group: A virtualized abstraction • Physical IPMC group: As usual • UDP multi-send: New kernel-level system-call  • Collection: Set of logical IPMC groups with identical membership

  13. Acceptable Use Policy • Assume a higher-level network management tool compiles policy into primitives • Explicitly allow a process to use IPMC groups • allow-join(process,logical IPMC) • allow-send(process,logical IPMC) • UDP multi-send always permitted • Additional restraints • max-groups(process,limit) • force-udp(process,logical IPMC)

  14. Overview • Library module • Mapping module • Gossip layer • Optimization questions • Results

  15. MCMD Library Module • Transparent. Overloads the IPMC functions • setsockopt(),send(), etc. • Translation. Logical IPMC map to a set of P-IPMC/unicast addresses. • Two extremes

  16. MCMD Mapping Role • MCMD Agent runs on each machine • Contacted by the library modules  • Provides a mapping • One agent elected to be a leader: • Allocates IPMC resources according to the current policy

  17. MCMD Mapping Role • Allocating IPMC resources: An optimization problem Procs Collections L-IPMC Procs L-IPMC This box intentionally left BLACK

  18. MCMD Gossip Layer • Runs system-wide as part of the agent • Automatic failure detection  • Group membership fully replicated via gossip • Node reports its own state • Future: Replicate more selectively • Leader runs optimization algorithm on data and reports the mapping

  19. MCMD Gossip Layer • But gossip is slow... • Implications: • Slow propagation of group membership • Slow propagation of new maps • We assume a low rate of membership churn • Remedy: Broadcast module • Leader broadcasts urgent messages  • Bounded bandwidth of urgent channel • Trade-off between latency and scalability

  20. Overview • Library module • Mapping module • Gossip layer • Optimization questions • Results

  21. Optimization Questions Collections BLACK Procs   L-IPMC Procs    L-IPMC • First step: compress logical IPMC groups

  22. Optimization Questions • How compressible are subscriptions? • Multi-objective optimization:  • Minimize number of collections • Minimize bandwidth overhead on network • Thm: The general problem is NP-complete • Thm: In uniform random allocation, "little" compression opportunity.  • Social preferences • Lots of duplicates due to replication (e.g. for load balancing) klk;l

  23. Optimization Questions • Which collections get an IPMC address? • Thm: Ordered by decreasing traffic*size,  assign P-IPMC addresses greedily, we minimize bandwidth. • Tiling heuristic: • Sort L-IPMC by traffic*size • Greedily collapse identical groups • Assign IPMC to collections in reverse order of traffic*size, UDP-multisend to the rest • Building tilings incrementally klk;l

  24. Experimental Results klk;l

  25. klk;l Overhead (max. throughput) • Insignificant overhead when mapping L-IPMC to P-IPMC.

  26. klk;l Overhead (CPU utilization) • Insignificant overhead when mapping L-IPMC to P-IPMC.

  27. Network Overhead • Gossip Layer uses constant background bandwidth, urgent channel behaves well klk;l

  28. Latency • Latency of propagation of joins/leaves and new maps

  29. klk;l Policy control • A malfunctioning node bombards an existing IPMC group. • MCMD policy prevents ill-effects <Traffic starts <New policy

  30. Conclusion • IPMC has been a bad citizen...

  31. Conclusion • IPMC has been a bad citizen... • Dr. Multicast has the cure! • Opportunity for big performance enhancements and policy control.

  32. Thank you!

  33. Thank you!

  34. klk;l Overhead • Insignificant overhead when mapping L-IPMC to P-IPMC.

  35. klk;l Policy control • A malfunctioning node bombards an existing IPMC group. • MCMD policy prevents ill-effects

  36. klk;l Policy control • A malfunctioning node bombards an existing IPMC group. • MCMD policy prevents ill-effects

  37. klk;l Overhead • Linux kernel module increases UDP-multisend throughput by 17% (compared to user-space UDP-multisend)

  38. Latency of events • Gossip: 99% of nodes aware of change within 9 epochs (now 1 sec) klk;l

  39. Conclusions • Policy: Allows data center operators to        enable and control IPMC • Transparency: Standard IPMC interface, system calls are overloaded. • Performance: Uses IPMC when possible, otherwise point-to-point UDP • Robustness: Distributed, fault-tolerant service

  40. Results • Library Module • Insignificant slowdown • Linux Kernel module provides 17% speed-up for UDP multi-send klk;l

  41. Optimization questions Users Topics This box intentionally left BLACK Users Groups Topics klk;l • Multi-objective:  • Minimize number of groups • Minimize bandwidth overhead on network • Thm: This problem is NP-complete • Reduction to Minimum Normal Set Basis

  42. MCMD Library Layer • Overloads the IPMC functions • setsockopt(),send(), etc. • Translates logical IPMC addresses to physical IPMC, or point-to-point UDP packets depending on policy • Notifies MCMD immediately about joins/leaves • Learns about new mappings from MCMD • Keeps statistics about group traffic rates

  43. MCMD Library Layer • Overloads the IPMC functions • setsockopt(),send(), etc. • Translates logical IPMC addresses to physical IPMC, or point-to-point UDP packets depending on policy • Caches translation maps • Maintains a connection to MCMD for updates

  44. Overview • Library module • Mapping module • Gossip layer • Optimization questions • Results

More Related