
Manageability, availability and performance in Porcupine: scalable, cluster-based mail service


Presentation Transcript


  1. Manageability, availability and performance in Porcupine: scalable, cluster-based mail service Presented By Abhinav Bondalapati

  2. Email vs. … • Write Intensive • Weak Consistency • Inherent Parallelism

  3. Pre-Porcupine Era • Static partitioning of users across servers • Manageability – manual repartitioning on every configuration change • Availability – no fault tolerance; a failed server takes its users' mail offline • Performance – no load balancing (Figure: users Homer, Bart, Marge, Lisa, and Maggie statically assigned to servers)

  4. Goals • Provide a highly scalable email service using readily available commodity hardware • How to achieve scalability?

  5. Scalability • Requirements: manageability, availability, performance • Techniques: functional homogeneity, automatic reconfiguration, replication, load balancing

  6. Porcupine Architecture • Every node runs an identical set of components: SMTP server, POP server, IMAP server, load balancer, membership manager, replication manager, RPC layer, user map, mail map, mailbox storage, and user profiles • Nodes (Node A, Node B, …, Node Z) cooperate as functional equals

  7. Mail Operations 1. SEND MAIL TO BOB: a message for Bob arrives from the Internet at some node A (reached via DNS) 2. WHO MANAGES BOB? A consults the user map 3. VERIFY BOB: the manager node B checks Bob's user profile 4. OK, BOB HAS MSGS ON C AND D: B returns Bob's mail map 5. PICK THE BEST NODE: A load-balances among candidate nodes 6. STORE MSG: the message is written to the chosen node
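
The flow above can be summarized in code. The following is a minimal sketch of the delivery path; the structures (user_map, profiles, mail_maps) and the pick_best_node/store helpers are assumptions made for illustration, not Porcupine's actual API.

```python
# Minimal sketch of the slide-7 delivery path; all names are illustrative.

BUCKETS = 16  # the real user map is a fixed-size table held on every node

def manager_of(user, user_map):
    """Step 2: hash the user name into the user map to find the manager."""
    return user_map[hash(user) % BUCKETS]

def deliver(msg, user, user_map, profiles, mail_maps, pick_best_node, store):
    manager = manager_of(user, user_map)      # step 2: who manages Bob?
    if user not in profiles[manager]:         # step 3: verify the user exists
        raise LookupError(f"unknown user: {user}")
    fragments = mail_maps[manager][user]      # step 4: nodes holding his mail
    target = pick_best_node(fragments)        # step 5: pick the best node
    store(target, user, msg)                  # step 6: store the message
```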

  8. Data Structures • User map: a replicated table that applies a hash function to each user name and maps the resulting bucket to a managing node (A, B, or C) • Mail map: for each user, the set of nodes holding that user's mailbox fragments, e.g. bob: {A,C}, suzy: {A,C}, joe: {B}, ann: {B} • Mailbox fragments: Bob's and Suzy's messages are spread across nodes A and C; Joe's and Ann's are on B
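
A rough Python rendering of these two structures, with an assumed bucket count and dictionary layout, might look like this:

```python
# Illustrative versions of the slide-8 soft state; bucket count and the
# dict layout are assumptions for this sketch.

NODES = ["A", "B", "C"]
BUCKETS = 12

# User map: hash bucket -> managing node; identical copy on every node.
user_map = [NODES[b % len(NODES)] for b in range(BUCKETS)]

def manager(user):
    """Apply the hash function to route a user name to its manager."""
    return user_map[hash(user) % BUCKETS]

# Mail map (held by each user's manager): nodes storing mailbox fragments.
mail_map = {
    "bob":  {"A", "C"},
    "suzy": {"A", "C"},
    "joe":  {"B"},
    "ann":  {"B"},
}
```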

  9. Advantages • A user always has access to their mail • Dynamic load balancing • Automatic reconfiguration

  10. Manageability • Adapts to changes such as node failures, recoveries, and additions via the Three Round Membership (TRM) protocol • Soft-state reconstruction rebuilds the user map and mail maps afterward

  11. Three Round Membership Protocol • Round 1: the detecting node becomes "coordinator" and broadcasts a proposed epoch ID for the new group ("Epoch ID = X")

  12. Three Round Membership Protocol • Round 2: nodes receiving the new epoch ID reply; the coordinator waits out a timeout period

  13. Three Round Membership Protocol • Round 3: the coordinator broadcasts the new membership and epoch ID ("Epoch ID = X, Group = {A, B, C, D}")
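
Taken together, the three rounds could be sketched as follows from the coordinator's side; the message formats, the send/recv primitives, and the timeout value are assumptions for illustration, not the protocol's wire format.

```python
# Sketch of the coordinator's side of the Three Round Membership protocol
# (slides 11-13); send/recv are caller-supplied transport stand-ins.

import time

def run_trm(my_id, peers, send, recv, timeout=2.0):
    epoch = (my_id, time.monotonic())            # round 1: propose epoch ID
    for p in peers:
        send(p, ("NEW_EPOCH", epoch))

    members = {my_id}                            # round 2: gather replies
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        msg = recv(deadline - time.monotonic())  # blocking recv with timeout
        if msg is not None and msg[1] == ("ACK", epoch):
            members.add(msg[0])                  # msg = (sender, payload)

    group = frozenset(members)                   # round 3: announce result
    for p in members - {my_id}:
        send(p, ("MEMBERSHIP", epoch, group))
    return epoch, group
```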

  14. Soft-state Reconstruction • Timeline: (1) the membership protocol runs and the user map is recomputed, reassigning hash buckets from failed nodes to live ones; (2) each node performs a distributed disk scan and reports the fragments it holds, rebuilding mail map entries (e.g. bob: {A,C}, suzy: {A,B}, joe: {C}, ann: {B})
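
A sketch of the two reconstruction steps, under an assumed bucket-reassignment rule and hypothetical scan_local_disk/report helpers:

```python
# Sketch of slide-14 soft-state reconstruction; the reassignment rule and
# the scan_local_disk/report helpers are assumptions for this sketch.

def recompute_user_map(old_map, live_nodes):
    """Reassign user-map buckets owned by failed nodes to live ones."""
    live = sorted(live_nodes)
    return [owner if owner in live else live[b % len(live)]
            for b, owner in enumerate(old_map)]

def rebuild_mail_maps(my_id, user_map, scan_local_disk, report):
    """Distributed disk scan: each node reports its local fragments to the
    (possibly new) manager of every user it stores mail for."""
    for user in scan_local_disk():               # users with fragments here
        mgr = user_map[hash(user) % len(user_map)]
        report(mgr, user, my_id)                 # "user has mail on my_id"
```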

  15. How does Porcupine React to Configuration Changes?

  16. Availability • Maintain service after failures • Optimistic, eventually consistent replication strategy
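
One way to picture an optimistic, eventually consistent scheme is a timestamped last-writer-wins replica, as in this toy model; the synchronous peer loop and the timestamp rule are simplifications for illustration, not Porcupine's actual update protocol.

```python
# Toy model of optimistic, eventually consistent replication (slide 16).

import time

class Replica:
    def __init__(self):
        self.peers = []            # other Replica instances
        self.state = {}            # key -> (timestamp, value)

    def update(self, key, value):
        """Accept the write locally first, then push it to the peers."""
        stamped = (time.time(), value)
        self.apply(key, stamped)
        for peer in self.peers:    # asynchronous in a real system
            peer.apply(key, stamped)

    def apply(self, key, stamped):
        """Last-writer-wins: a newer timestamp supersedes an older one,
        so replicas converge regardless of delivery order."""
        if key not in self.state or stamped[0] > self.state[key][0]:
            self.state[key] = stamped
```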

  17. How Efficient is Replication? (Graph: throughput falls from 68m/day without replication to 24m/day with replication)

  18. How Efficient is Replication? (Graph, extended: a third configuration reaches 33m/day, between the 68m/day unreplicated and 24m/day replicated cases)

  19. Performance • Performance scales with cluster size • Dynamic load balancing at the level of individual messages

  20. Load balancing: deciding where to store messages • Goals: handle skewed workloads well, support hardware heterogeneity, no voodoo parameter tuning • Strategy: spread-based load balancing (sketched below) • Spread: a soft limit on the number of nodes per mailbox • Large spread → better load balance; small spread → better affinity • Load is balanced within the spread, using the number of pending I/O requests as the load measure
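
A compact sketch of the strategy, with a caller-supplied pending_io probe standing in for the slide's "number of pending I/O requests" measure:

```python
# Sketch of spread-based load balancing (slide 20); pending_io is an
# assumed probe returning a node's count of pending I/O requests.

SPREAD = 2  # soft limit on the number of nodes per mailbox

def pick_store_node(fragments, all_nodes, pending_io):
    """Prefer nodes already holding the mailbox (affinity); widen to the
    whole cluster only while the mailbox is under its spread limit."""
    candidates = set(fragments)
    if len(candidates) < SPREAD:
        candidates = set(all_nodes)
    return min(candidates, key=pending_io)       # least-loaded node wins
```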

  21. How does Performance Scale? (Graph: throughput vs. cluster size; the two curves reach 68m/day and 25m/day)

  22. Positives • Key Ideas: • Functional Homogeneity • Automatic Reconfiguration • Replication • Load Balancing

  23. Negatives • Communication overhead • Membership protocol – the flood of round-2 replies might overwhelm the coordinator
