Database Replication in WAN

Yi Lin
Supervised by: Prof. Kemme
April 8, 2005

Contents
• Introduction
• Centralized Snapshot Isolation Replication (SIR) protocol
• Decentralized SIR protocol for WAN
• Experiments
• Further optimizations
• Related work
• Conclusions and milestones

Introduction: What, Why, How?
[Figure: without replication vs. with replication; sites Toronto, Montreal, and Ottawa connected over a WAN, with replica control]
Benefits: Performance, Fault Tolerance

Introduction, challenge
[Figure: w(x) operations on the copies of x]
General correctness criterion: 1-copy-serializability

1. Introduction, 1-copy-serializability
• 1-copy-serializability
  • The replicated system behaves as one database providing serializability
• Serializability
  • Highest txn isolation level
  • Isolation level: the extent to which txns interfere with each other
  • The result is the same as executing the txns serially.
  • Conflict: read/write and write/write
[Figure: timeline of T0: w(x); T1: r(x) w(x) w(y); T2: r(x) w(x); T3: r(z) w(z)]

1. Introduction, 1-copy-SI
• Snapshot Isolation (SI):
  • Conflict: only write/write
  • A txn reads from a snapshot of the committed data as of the time it starts.
  • If 2 concurrent txns write the same data item and one commits, the other aborts.
  • Very popular (Oracle, PostgreSQL)
• 1-copy-SI
  • The replicated system behaves as one database providing SI
[Figure: timeline of T0: w(x), commit; T1: r(x) w(x) w(y), abort; T2: r(x) w(x)]
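The SI rules on this slide can be illustrated with a small in-memory store. This is only a sketch under stated assumptions: the class names `SIDatabase` and `Txn`, the use of a commit counter as timestamp, and the first-committer-wins check are illustrative choices, not how Oracle or PostgreSQL implement SI.

```python
class SIDatabase:
    """Toy single-node store with snapshot isolation (hypothetical sketch)."""

    def __init__(self):
        self.commit_counter = 0   # number of txns committed so far
        self.versions = {}        # key -> list of (commit_ts, value), in commit order

    def begin(self):
        # The snapshot is identified by the commit counter at start time.
        return Txn(self, self.commit_counter)

    def read_as_of(self, key, ts):
        # Newest committed version with commit_ts <= ts, or None.
        visible = [v for (c, v) in self.versions.get(key, []) if c <= ts]
        return visible[-1] if visible else None


class Txn:
    def __init__(self, db, start_ts):
        self.db = db
        self.start_ts = start_ts
        self.writes = {}          # private, uncommitted writes

    def read(self, key):
        if key in self.writes:    # a txn sees its own writes
            return self.writes[key]
        return self.db.read_as_of(key, self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Write/write conflict: a concurrent txn already committed one of our keys.
        for key in self.writes:
            for commit_ts, _ in self.db.versions.get(key, []):
                if commit_ts > self.start_ts:
                    return False  # abort
        self.db.commit_counter += 1
        for key, value in self.writes.items():
            self.db.versions.setdefault(key, []).append(
                (self.db.commit_counter, value))
        return True


db = SIDatabase()
t0 = db.begin(); t0.write("x", 1); t0.commit()
t1 = db.begin(); t2 = db.begin()        # T1 and T2 are concurrent
t1.read("x"); t1.write("x", 10); t1.write("y", 5)
t2.read("x"); t2.write("x", 20)
t2_ok = t2.commit()                     # T2 commits first
t1_ok = t1.commit()                     # T1 wrote x concurrently, so it aborts
print(t2_ok, t1_ok)
```

Reads never block or abort a txn here; only overlapping writes by concurrent txns do, which is the "only write/write" conflict rule.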

2. Centralized Snapshot Isolation Replication (SIR) Protocol
• Challenge:
  • How to detect concurrent conflicting txns?
[Figure: protocol flow: txns execute r(x) w(x) and w(x); at commit the writeset is extracted and validated; validation succeed: apply ws, commit; validation fail]

2. Centralized SIR Protocol
• How to detect that two txns are conflicting?
  • A writeset contains the modified tuples and their corresponding primary keys.
  • If two writesets share some primary keys, they conflict.
  • Note: Snapshot Isolation only cares about write/write conflicts.
[Figure: T1 and T2 with Key=1]
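The conflict test above amounts to a set intersection over primary keys. A minimal sketch, assuming a writeset is keyed by (table, primary key) pairs; the `WriteSet` class, the `conflicts` function, and the table name `item` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass, field


@dataclass
class WriteSet:
    txn_id: str
    # (table, primary key) -> new tuple contents
    changes: dict = field(default_factory=dict)

    def keys(self):
        return set(self.changes)


def conflicts(ws_a, ws_b):
    """Two writesets conflict if they share at least one primary key."""
    return not ws_a.keys().isdisjoint(ws_b.keys())


# T1 and T2 both modify the tuple with Key=1; T3 touches a different tuple.
ws1 = WriteSet("T1", {("item", 1): {"stock": 9}})
ws2 = WriteSet("T2", {("item", 1): {"stock": 7}})
ws3 = WriteSet("T3", {("item", 2): {"stock": 4}})
print(conflicts(ws1, ws2), conflicts(ws1, ws3))
```

Read sets are never consulted, in line with the note that SI only cares about write/write conflicts.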

2. Centralized SIR Protocol
• How to detect that two txns are concurrent?
  • A counter for each database, increased upon committing a txn
  • Record the start time and end time of txns
  • T0.end ≤ T1.start || T1.end ≤ T0.start ⇒ T0 and T1 are not concurrent.
[Figure: counter timeline of T0, T1, and T2; labels: start=0, end=1, start=1, end=2, start=1]
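The counter-based concurrency test can be combined with the primary-key conflict test into one validation step. The following is a hedged sketch, not the protocol's actual implementation: `Interval`, `concurrent`, and `CentralValidator` are hypothetical names, and keeping every committed writeset in an unbounded list is a simplification.

```python
from collections import namedtuple

Interval = namedtuple("Interval", ["start", "end"])


def concurrent(a, b):
    """a and b are NOT concurrent iff a.end <= b.start or b.end <= a.start."""
    return not (a.end <= b.start or b.end <= a.start)


class CentralValidator:
    def __init__(self):
        self.counter = 0        # increased upon committing a txn
        self.committed = []     # (end, frozenset of primary keys)

    def begin(self):
        return self.counter     # start time of a new txn

    def validate(self, start, keys):
        """Return the end time on success, or None if the txn must abort."""
        keys = frozenset(keys)
        for end, committed_keys in self.committed:
            # A committed txn always started before the validating txn ends,
            # so the two are concurrent exactly when end > start.
            if end > start and keys & committed_keys:
                return None
        self.counter += 1
        self.committed.append((self.counter, keys))
        return self.counter


v = CentralValidator()
t0_start = v.begin()
t0_end = v.validate(t0_start, {"x"})
t1_start = v.begin()
t2_start = v.begin()
t1_end = v.validate(t1_start, {"x", "y"})   # T0 ended before T1 started: passes
t2_end = v.validate(t2_start, {"x"})        # concurrent with T1 on x: fails
print(t0_end, t1_end, t2_end)
print(concurrent(Interval(t0_start, t0_end), Interval(t1_start, t1_end)))
```

The timestamps are loosely modeled on the slide's timeline: T0 runs from 0 to 1, and T1 and T2 both start at 1.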

3. Decentralized SIR Protocol for WANs
• Centralized approach is not good for WANs
[Figure: Centralized Architecture (middleware, DBs, WAN) vs. Decentralized Architecture (middleware replicas, DBs, LAN, WAN)]

3. Decentralized SIR Protocol for WANs
• Challenge:
  • Validation same as centralized approach
  • Total order ⇒ all middleware components make the same decision
[Figure: the writesets of T1 and T2 are extracted and sent via group communication in total order; at every middleware replica, T1's validation succeeds (apply ws, commit) and T2's validation fails (abort)]
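Why total order is enough can be sketched as follows: if every middleware replica runs the same deterministic validation over the same sequence of writesets, all replicas reach the same commit/abort decisions without further coordination. This is a simplified illustration; `MiddlewareReplica` and `total_order_multicast` are hypothetical names, the loop merely stands in for a real group communication system, and the assumption that each message carries a start time expressed in the shared validation counter is mine.

```python
class MiddlewareReplica:
    def __init__(self, name):
        self.name = name
        self.counter = 0      # number of txns validated successfully so far
        self.committed = []   # (end, frozenset of primary keys)
        self.decisions = {}   # txn_id -> "commit" or "abort"

    def deliver(self, msg):
        """Validate one writeset; called in the same order at every replica."""
        txn_id, start, keys = msg
        ok = all(end <= start or keys.isdisjoint(committed_keys)
                 for end, committed_keys in self.committed)
        if ok:
            self.counter += 1
            self.committed.append((self.counter, keys))
        self.decisions[txn_id] = "commit" if ok else "abort"


def total_order_multicast(replicas, messages):
    # Stand-in for the GCS: every replica receives every message, same order.
    for msg in messages:
        for replica in replicas:
            replica.deliver(msg)


replicas = [MiddlewareReplica(n) for n in ("Toronto", "Montreal", "Ottawa")]
messages = [
    ("T1", 0, frozenset({"x"})),   # T1 and T2 both started at time 0
    ("T2", 0, frozenset({"x"})),   # and both wrote x
]
total_order_multicast(replicas, messages)
for r in replicas:
    print(r.name, r.decisions)
```

Swapping the two messages would make T2 commit and T1 abort, but still identically at all three sites, which is the property the protocol relies on.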

4. Experiments
Fig. TPC-W benchmark, 5 sites, 50% update txns

5. Some optimizations
• With GCS
  • Disadvantage:
    • Total order is expensive
    • Large response time
  • Advantage:
    • Uniform reliable delivery for failover
• Without GCS, but with a sequencer
  • Advantage:
    • Less communication overhead
  • Disadvantage:
    • Complicated failover
[Figure: with GCS, the writesets of T1 and T2 are extracted and sent via group communication in total order; with a sequencer, txns r(x) w(x) send their commit to the sequencer, which reports validation succeed or validation fail]

6. Related work
• Kernel-based replica control
• Middleware-based replica control
  • Advantages
    • Heterogeneous DBs
    • Easy to implement
  • Disadvantages
    • No access to concurrency control in the kernel
[Figure labels: Oracle, PostgreSQL]

6. Related work
• Many have a centralized component [Ganymed, Conflict-Aware]
  • Does not work well in WANs
• Some are primary/secondary approaches [Ganymed]
  • Updates must always be performed on the primary copy
  • Need to mark read-only txns in advance
• Some need to know all operations in advance [Conflict-Aware]
• Some use table-based locking [Middle-R, Conflict-Aware]
• Nearly all only look at 1-copy-serializability [Conflict-Aware, GlobData, Middle-R, State Machine]

7. Conclusions
• Works well in WANs
  • Only 1 multicast msg
• No restrictions such as
  • Marking read-only txns in advance
  • Knowing all operations in advance
• Tuple-based locking
• 1-copy-SI

7. Milestones
• Currently
  • 1-copy-SI
  • Centralized and decentralized protocols formalized and implemented
• Sep. 2005:
  • Failover (coordinated with a Master's project)
• Dec. 2005:
  • Further optimizations proposed in report
• May 2006:
  • Recovery

References
• [SIR] Y. Lin, B. Kemme, R. Jiménez-Peris, and M. Patiño-Martínez. Middleware based data replication providing snapshot isolation. In SIGMOD, June 2005.
• [Ganymed] C. Plattner and G. Alonso. Ganymed: Scalable replication for transactional web applications. In Middleware, 2004.
• [GlobData] L. Rodrigues, H. Miranda, R. Almeida, J. Martins, and P. Vicente. Strong Replication in the GlobData Middleware. In Workshop on Dependable Middleware-Based Systems, 2002.
• [Middle-R] R. Jiménez-Peris, M. Patiño-Martínez, B. Kemme, and G. Alonso. Improving Scalability of Fault Tolerant Database Clusters. In ICDCS, 2002.
• [Conflict-Aware] C. Amza, A. L. Cox, and W. Zwaenepoel. Conflict-Aware Scheduling for Dynamic Content Applications. In USENIX Symp. on Internet Tech. and Sys., 2003.
• [Postgres-R] S. Wu and B. Kemme. Postgres-R(SI): Combining replica control with concurrency control based on snapshot isolation. In ICDE, Tokyo, Japan, 2005.
• [State Machine] F. Pedone, R. Guerraoui, and A. Schiper. The Database State Machine Approach. Distributed and Parallel Databases, 14:71-98, 2003.
