
Porcupine: A Highly Available Cluster-based Mail Service


Presentation Transcript


  1. Porcupine: A Highly Available Cluster-based Mail Service. Y. Saito, B. Bershad, H. Levy (U. Washington), SOSP 1999. Presented by: Fabián E. Bustamante

  2. Porcupine – goals & requirements
  Use commodity hardware to build a large, scalable mail service. Main goal – scalability in terms of:
  • Manageability – large but easy to manage
    • Self-configures with respect to load and data distribution
    • Self-heals with respect to failure and recovery
  • Availability – survives failures gracefully
    • A failure may prevent some users from accessing email
  • Performance – scales linearly with cluster size
    • Target – 100s of machines, ~ billions of mail msgs/day

  3. Key Techniques and Relationships
  • Framework: functional homogeneity – "any node can perform any task"
  • Techniques: automatic reconfiguration, dynamic scheduling, replication
  • Goals: manageability, performance, availability

  4. Why Email?
  • Mail is important
    • Real demand – Saito now works for Google
  • Mail is hard
    • Write intensive
    • Low locality
  • Mail is easy
    • Well-defined API
    • Large parallelism
    • Weak consistency

  5. Conventional Mail Solution: static partitioning
  • Performance problems: no dynamic load balancing
  • Manageability problems: manual data partitioning
  • Availability problems: limited fault tolerance
  (Diagram: SMTP/IMAP/POP front ends; each user's mbox – Luca's, Jeanine's, Joe's, Suzy's – statically assigned to an NFS server.)

  6. Porcupine Architecture
  (Diagram: every node – Node A, Node B, ... Node Z – runs the same set of components: SMTP server, POP server, IMAP server, load balancer, user map, membership manager, RPC, replication manager, mail map, mailbox storage, and user profile storage.)

  7. Porcupine Operations (protocol handling, user lookup, load balancing, message store)
  1. "Send mail to luca" arrives from the Internet; DNS round-robin selects a front-end node (A)
  2. Who manages luca? → A
  3. "Verify luca"
  4. "OK, luca has msgs on C and D"
  5. Pick the best node to store the new msg → C
  6. "Store msg"
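
  As a rough illustration only (the function names and RPC stubs below are invented, not Porcupine's actual interfaces), the six steps on this slide can be read as one delivery function:

```cpp
// Hypothetical sketch of the delivery path; the stub functions stand in for
// Porcupine's real RPC layer and all names are invented for illustration.
#include <iostream>
#include <string>
#include <vector>

using Node = std::string;

// Stubs standing in for remote calls to other cluster nodes.
Node lookup_manager(const std::string&) { return "A"; }                  // via user map
bool verify_user(const Node&, const std::string&) { return true; }       // via user profile
std::vector<Node> fetch_mail_map(const Node&, const std::string&) { return {"C", "D"}; }
Node pick_store_node(const std::vector<Node>& candidates) { return candidates.front(); }
void store_message(const Node& n, const std::string& msg) {
    std::cout << "stored on node " << n << ": " << msg << "\n";
}

void deliver(const std::string& user, const std::string& msg) {
    // 1. DNS round-robin already directed the SMTP client to this node.
    Node mgr = lookup_manager(user);              // 2. who manages luca? -> A
    if (!verify_user(mgr, user)) return;          // 3. "verify luca"
    auto nodes = fetch_mail_map(mgr, user);       // 4. "luca has msgs on C and D"
    Node target = pick_store_node(nodes);         // 5. pick the best node (really load-based)
    store_message(target, msg);                   // 6. "store msg"
}

int main() { deliver("luca", "hello"); }
```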

  8. Basic Data Structures
  • User map: apply a hash function to the user name (e.g. "luca") to find the node managing that user; the same small table is held on every node
  • Mail map / user info: per-user list of the nodes holding mailbox fragments – luca: {A,C}, suzy: {A,C}, joe: {B}, ann: {B}
  • Mailbox storage: each node (A, B, C) stores fragments of several users' messages (Luca's, Suzy's, Joe's, Ann's, Bob's MSGs)
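
  A minimal sketch of these two structures, assuming a fixed 256-bucket user map and single-character node names; the bucket count and types are illustrative, not Porcupine's actual definitions:

```cpp
#include <array>
#include <functional>
#include <iostream>
#include <map>
#include <set>
#include <string>

constexpr std::size_t kBuckets = 256;           // bucket count is an assumption
using Node = char;                              // nodes 'A', 'B', 'C' as on the slide

std::array<Node, kBuckets> user_map;            // hash bucket -> managing node (held on every node)
std::map<std::string, std::set<Node>> mail_map; // user -> nodes holding mailbox fragments

Node manager_of(const std::string& user) {
    return user_map[std::hash<std::string>{}(user) % kBuckets];
}

int main() {
    for (std::size_t i = 0; i < kBuckets; ++i)  // toy assignment; the real map is
        user_map[i] = "ABC"[i % 3];             // rebalanced by the membership protocol
    mail_map["luca"] = {'A', 'C'};
    mail_map["suzy"] = {'A', 'C'};
    mail_map["joe"]  = {'B'};
    mail_map["ann"]  = {'B'};
    std::cout << "luca is managed by node " << manager_of("luca") << "\n";
}
```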

  9. Porcupine Advantages
  • Advantages:
    • Optimal resource utilization
    • Automatic reconfiguration and task re-distribution upon node failure/recovery
    • Fine-grain load balancing
  • Results:
    • Better availability
    • Better manageability
    • Better performance

  10. Performance
  • Goals: scale performance linearly with cluster size
  • Strategy: avoid creating hot spots
    • Partition data uniformly among nodes
    • Fine-grain data partition

  11. Measurement Environment
  • 30-node cluster of not-quite-all-identical PCs
  • 100 Mb/s Ethernet + 1 Gb/s hubs
  • Linux 2.2.7
  • 42,000 lines of C++ code
  • Synthetic load
  • Compare to sendmail+popd

  12. How Does Performance Scale?
  (Graph: throughput vs. cluster size – Porcupine reaches about 68 million msgs/day on the full cluster, vs. about 25 million msgs/day for the sendmail+popd baseline.)

  13. Availability
  • Goals:
    • Maintain function after failures
    • React quickly to changes regardless of cluster size
    • Graceful performance degradation / improvement
  • Strategy: two complementary mechanisms
    • Hard state (email messages, user profiles) → optimistic fine-grain replication
    • Soft state (user map, mail map) → reconstruction after membership change

  14. Soft-state Reconstruction
  (Timeline across nodes A, B, C: 1. the membership protocol detects the configuration change; 2. the user map is recomputed; 3. a distributed disk scan rebuilds the mail map entries – e.g. luca: {A,C}, suzy: {A,B}, joe: {C}, ann: {B}.)
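
  The timeline's steps could look roughly like the following, with invented helper names and a toy round-robin bucket reassignment standing in for Porcupine's real recomputation:

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

using Node = std::string;

// Step 2: reassign user-map buckets across the surviving nodes.
std::vector<Node> recompute_user_map(const std::set<Node>& live, std::size_t buckets) {
    std::vector<Node> nodes(live.begin(), live.end());
    std::vector<Node> map(buckets);
    for (std::size_t b = 0; b < buckets; ++b)
        map[b] = nodes[b % nodes.size()];        // toy round-robin reassignment
    return map;
}

// Step 3: merge each node's local disk-scan report into a fresh mail map.
std::map<std::string, std::set<Node>> rebuild_mail_map(
        const std::map<Node, std::set<std::string>>& local_scans) {
    std::map<std::string, std::set<Node>> mail_map;
    for (const auto& [node, users] : local_scans)
        for (const auto& user : users)
            mail_map[user].insert(node);
    return mail_map;
}

int main() {
    std::set<Node> live = {"A", "C"};            // step 1: membership protocol says B is gone
    auto user_map = recompute_user_map(live, 8);
    auto mail_map = rebuild_mail_map({{"A", {"luca", "suzy"}}, {"C", {"luca", "joe"}}});
    std::cout << "bucket 0 now managed by " << user_map[0] << "\n";
    std::cout << "luca's fragments:";
    for (const auto& n : mail_map["luca"]) std::cout << " " << n;
    std::cout << "\n";
}
```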

  15. Reaction to Configuration Changes

  16. Hard-state Replication
  • Goals:
    • Keep serving hard state after failures
    • Handle unusual failure modes
  • Strategy: exploit Internet semantics
    • Optimistic, eventually consistent replication
    • Per-message, per-user-profile replication
    • Efficient during normal operation
    • Small window of inconsistency
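
  One way to picture "optimistic, eventually consistent" per-object replication is a newest-update-wins rule, sketched below with invented names; the real protocol additionally logs each update and retries pushes until every replica acknowledges it:

```cpp
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

struct Update {
    std::uint64_t timestamp;   // logical timestamp ordering concurrent updates
    std::string   object;      // per-message or per-user-profile object id
    std::string   value;
};

struct Replica {
    std::map<std::string, Update> state;
    // Newest-timestamp-wins: applying the same set of updates in any order
    // leaves every replica in the same state (eventual consistency).
    void apply(const Update& u) {
        auto it = state.find(u.object);
        if (it == state.end() || it->second.timestamp < u.timestamp)
            state[u.object] = u;
    }
};

int main() {
    Replica a, c;                              // two replicas of one mailbox fragment
    Update u1{1, "msg:42", "body v1"};
    Update u2{2, "msg:42", "deleted"};         // e.g. a later delete of the same message
    a.apply(u1); a.apply(u2);                  // coordinator applies and logs locally first
    c.apply(u2); c.apply(u1);                  // pushes arrive late and out of order at the peer
    std::cout << (a.state["msg:42"].value == c.state["msg:42"].value
                      ? "replicas converged\n" : "replicas diverged\n");
}
```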

  17. Replication Efficiency
  (Graph: about 68 million msgs/day without replication vs. about 24 million msgs/day with replication.)

  18. Replication Efficiency
  (Same graph with a third curve: about 33 million msgs/day when "pretending" – removing disk flushing from the disk logging routines – compared with 68 million non-replicated and 24 million fully replicated.)

  19. Load Balancing: Storing Messages
  • Goals:
    • Handle skewed workload well
    • Support hardware heterogeneity
    • No voodoo parameter tuning
  • Strategy: spread-based load balancing
    • Spread: soft limit on the number of nodes per mailbox
    • Large spread → better load balance; small spread → better affinity
    • Load balanced within the spread
    • Number of pending I/O requests used as the load measure
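
  A sketch of spread-based selection under the assumptions above (all names invented): widen the user's current fragment set with hash-derived candidates up to the spread limit, then pick the candidate with the fewest pending disk I/Os.

```cpp
#include <algorithm>
#include <functional>
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <vector>

using Node = std::string;

Node pick_store_node(const std::string& user,
                     const std::set<Node>& current_fragments,   // nodes already holding msgs
                     const std::vector<Node>& all_nodes,
                     const std::map<Node, int>& pending_io,     // load = queued disk I/Os
                     std::size_t spread) {
    std::vector<Node> candidates(current_fragments.begin(), current_fragments.end());
    // Widen the candidate set deterministically (hash of user + index) until it
    // reaches the spread limit; small spreads therefore preserve mailbox affinity.
    for (std::size_t i = 0; candidates.size() < spread && i < all_nodes.size(); ++i) {
        Node n = all_nodes[std::hash<std::string>{}(user + std::to_string(i)) % all_nodes.size()];
        if (std::find(candidates.begin(), candidates.end(), n) == candidates.end())
            candidates.push_back(n);
    }
    // Least-loaded candidate wins.
    return *std::min_element(candidates.begin(), candidates.end(),
                             [&](const Node& x, const Node& y) {
                                 return pending_io.at(x) < pending_io.at(y);
                             });
}

int main() {
    std::map<Node, int> load = {{"A", 12}, {"B", 3}, {"C", 7}};
    std::cout << pick_store_node("luca", {"A", "C"}, {"A", "B", "C"}, load, 2) << "\n";
}
```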

  20. Support of Heterogeneous Clusters
  (Graph: relative performance improvement vs. node heterogeneity – 0% means all nodes run at roughly the same speed; 3%, 7%, and 10% give the fraction of nodes with very fast disks. The improvement shown ranges from +0.5m msgs/day (+0.8%) up to +16.8m msgs/day (+25%).)

  21. Conclusions
  • Fast, available, and manageable clusters can be built for write-intensive services
  • Key ideas can be extended beyond mail:
    • Functional homogeneity
    • Automatic reconfiguration
    • Replication
    • Load balancing
  • Ongoing work:
    • More efficient membership protocol
    • Extending Porcupine beyond mail: Usenet, calendar, etc.
    • More generic replication mechanism
