
Outline

tulia

  1. Outline • Introduction • Background • Distributed DBMS Architecture • Distributed Database Design • Semantic Data Control • Distributed Query Processing • Distributed Transaction Management • Transaction Concepts and Models • Distributed Concurrency Control • Distributed Reliability • Parallel Database Systems • Distributed Object DBMS • Database Interoperability • Concluding Remarks

  2. Transaction A transaction is a collection of actions that make consistent transformations of system states while preserving system consistency. • concurrency transparency • failure transparency (Figure: timeline from Begin Transaction through Execution of Transaction to End Transaction. The database is in a consistent state at the beginning and at the end, and may be temporarily in an inconsistent state during execution.)

  3. Transaction Example – A Simple SQL Query
     Transaction BUDGET_UPDATE
     begin
       EXEC SQL UPDATE PROJ
                SET BUDGET = BUDGET * 1.1
                WHERE PNAME = "CAD/CAM"
     end.

  4. Example Database Consider an airline reservation example with the relations:
     FLIGHT(FNO, DATE, SRC, DEST, STSOLD, CAP)
     CUST(CNAME, ADDR, BAL)
     FC(FNO, DATE, CNAME, SPECIAL)

  5. Example Transaction – SQL Version
     Begin_transaction Reservation
     begin
       input(flight_no, date, customer_name);
       EXEC SQL UPDATE FLIGHT
                SET STSOLD = STSOLD + 1
                WHERE FNO = flight_no AND DATE = date;
       EXEC SQL INSERT
                INTO FC(FNO, DATE, CNAME, SPECIAL)
                VALUES (flight_no, date, customer_name, null);
       output("reservation completed")
     end. {Reservation}

  6. Termination of Transactions
     Begin_transaction Reservation
     begin
       input(flight_no, date, customer_name);
       EXEC SQL SELECT STSOLD, CAP
                INTO temp1, temp2
                FROM FLIGHT
                WHERE FNO = flight_no AND DATE = date;
       if temp1 = temp2 then
         output("no free seats");
         Abort
       else
         EXEC SQL UPDATE FLIGHT
                  SET STSOLD = STSOLD + 1
                  WHERE FNO = flight_no AND DATE = date;
         EXEC SQL INSERT
                  INTO FC(FNO, DATE, CNAME, SPECIAL)
                  VALUES (flight_no, date, customer_name, null);
         Commit
         output("reservation completed")
       endif
     end. {Reservation}
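
The commit/abort logic of the Reservation transaction can be tried out with Python's built-in `sqlite3` module. This is only a sketch under stated assumptions: the column types, the in-memory database, the sample flight `F1` and the helper name `reserve` are illustrative and do not come from the slides. An unknown flight is treated as an abort, which the slide does not specify.

```python
import sqlite3

def reserve(conn, flight_no, date, customer_name):
    """Book one seat; return True on commit, False on abort."""
    cur = conn.cursor()
    try:
        cur.execute(
            "SELECT STSOLD, CAP FROM FLIGHT WHERE FNO = ? AND DATE = ?",
            (flight_no, date),
        )
        row = cur.fetchone()
        # Assumption: an unknown flight aborts just like a full flight.
        if row is None or row[0] == row[1]:
            conn.rollback()              # Abort
            return False
        cur.execute(
            "UPDATE FLIGHT SET STSOLD = STSOLD + 1 WHERE FNO = ? AND DATE = ?",
            (flight_no, date),
        )
        cur.execute(
            "INSERT INTO FC (FNO, DATE, CNAME, SPECIAL) VALUES (?, ?, ?, NULL)",
            (flight_no, date, customer_name),
        )
        conn.commit()                    # Commit: both changes take effect together
        return True
    except sqlite3.Error:
        conn.rollback()                  # any SQL failure undoes partial work
        return False

# Hypothetical sample data: one flight with a single seat.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FLIGHT (FNO TEXT, DATE TEXT, SRC TEXT, DEST TEXT,"
             " STSOLD INTEGER, CAP INTEGER)")
conn.execute("CREATE TABLE FC (FNO TEXT, DATE TEXT, CNAME TEXT, SPECIAL TEXT)")
conn.execute("INSERT INTO FLIGHT VALUES ('F1', '2024-01-01', 'A', 'B', 0, 1)")
conn.commit()

first = reserve(conn, "F1", "2024-01-01", "Ann")    # a seat is free
second = reserve(conn, "F1", "2024-01-01", "Bob")   # the flight is now full
```

The second call aborts without touching either table, so the seat count and the FC relation stay consistent with each other.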

  7. Example Transaction – Reads & Writes
     Begin_transaction Reservation
     begin
       input(flight_no, date, customer_name);
       temp ← Read(flight_no(date).stsold);
       if temp = flight(date).cap then
       begin
         output("no free seats");
         Abort
       end
       else begin
         Write(flight(date).stsold, temp + 1);
         Write(flight(date).cname, customer_name);
         Write(flight(date).special, null);
         Commit;
         output("reservation completed")
       end
     end. {Reservation}

  8. Characterization • Read set (RS) • The set of data items that are read by a transaction • Write set (WS) • The set of data items whose values are changed by this transaction • Base set (BS) • BS = RS ∪ WS
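
As a small illustration, the three sets can be computed from a transaction's list of operations. The `(kind, item)` encoding and the function name `characterize` are assumptions made for this sketch; the item names follow the Reads & Writes version of the Reservation transaction.

```python
def characterize(ops):
    """ops: list of (kind, item) pairs, where kind is 'R' or 'W'."""
    read_set = {item for kind, item in ops if kind == "R"}
    write_set = {item for kind, item in ops if kind == "W"}
    base_set = read_set | write_set      # BS is the union of RS and WS
    return read_set, write_set, base_set

# Operations of Reservation on the successful path (hypothetical encoding).
reservation = [
    ("R", "stsold"), ("R", "cap"),
    ("W", "stsold"), ("W", "cname"), ("W", "special"),
]
rs, ws, bs = characterize(reservation)
```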

  9. Formalization Let • $O_{ij}(x)$ be some operation $O_j$ of transaction $T_i$ operating on entity $x$, where $O_j \in \{\text{read}, \text{write}\}$ and $O_j$ is atomic • $OS_i = \bigcup_j O_{ij}$ • $N_i \in \{\text{abort}, \text{commit}\}$ Transaction $T_i$ is a partial order $T_i = \{\Sigma_i, <_i\}$ where • $\Sigma_i = OS_i \cup \{N_i\}$ • For any two operations $O_{ij}, O_{ik} \in OS_i$, if $O_{ij} = R(x)$ and $O_{ik} = W(x)$ for any data item $x$, then either $O_{ij} <_i O_{ik}$ or $O_{ik} <_i O_{ij}$ • $\forall O_{ij} \in OS_i$, $O_{ij} <_i N_i$

  10. Example Consider a transaction T:
      Read(x)
      Read(y)
      x ← x + y
      Write(x)
      Commit
      Then $\Sigma = \{R(x), R(y), W(x), C\}$ and $< \, = \{(R(x), W(x)), (R(y), W(x)), (W(x), C), (R(x), C), (R(y), C)\}$

  11. DAG Representation Assume < = {(R(x),W(x)), (R(y),W(x)), (R(x), C), (R(y), C), (W(x), C)} (Figure: directed acyclic graph over the nodes R(x), R(y), W(x) and C, with one edge for each ordered pair in <.)

  12. Properties of Transactions ATOMICITY • all or nothing CONSISTENCY • no violation of integrity constraints ISOLATION • concurrent changes invisible ⇒ serializable DURABILITY • committed updates persist

  13. Atomicity • Either all or none of the transaction's operations are performed. • Atomicity requires that if a transaction is interrupted by a failure, its partial results must be undone. • The activity of preserving the transaction's atomicity in the presence of transaction aborts due to input errors, system overloads, or deadlocks is called transaction recovery. • The activity of ensuring atomicity in the presence of system crashes is called crash recovery.

  14. Consistency • Internal consistency • A transaction which executes alone against a consistent database leaves it in a consistent state. • Transactions do not violate database integrity constraints. • Transactions are correct programs

  15. Consistency Degrees • Degree 0 • Transaction T does not overwrite dirty data of other transactions • Dirty data refers to data values that have been updated by a transaction prior to its commitment • Degree 1 • T does not overwrite dirty data of other transactions • T does not commit any writes before EOT

  16. Consistency Degrees (cont’d) • Degree 2 • T does not overwrite dirty data of other transactions • T does not commit any writes before EOT • T does not read dirty data from other transactions • Degree 3 • T does not overwrite dirty data of other transactions • T does not commit any writes before EOT • T does not read dirty data from other transactions • Other transactions do not dirty any data read by T before T completes.

  17. Isolation • Serializability • If several transactions are executed concurrently, the results must be the same as if they were executed serially in some order. • Incomplete results • An incomplete transaction cannot reveal its results to other transactions before its commitment. • Necessary to avoid cascading aborts.

  18. Isolation Example • Consider the following two transactions:
      T1: Read(x)        T2: Read(x)
          x ← x + 1          x ← x + 1
          Write(x)           Write(x)
          Commit             Commit
      • Possible execution sequences:
      Sequence 1         Sequence 2
      T1: Read(x)        T1: Read(x)
      T1: x ← x + 1      T1: x ← x + 1
      T1: Write(x)       T2: Read(x)
      T1: Commit         T1: Write(x)
      T2: Read(x)        T2: x ← x + 1
      T2: x ← x + 1      T2: Write(x)
      T2: Write(x)       T1: Commit
      T2: Commit         T2: Commit
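
The difference between the two execution sequences can be seen by replaying them against a single item. This is a toy model, assuming both transactions increment x by 1 and that each transaction works on a private copy between its Read and its Write; the step names `read`, `incr`, `write` and `commit` are illustrative.

```python
def run(schedule, x=0):
    """Replay (transaction, action) steps against one data item x."""
    db = {"x": x}
    local = {}                       # each transaction's private copy of x
    for txn, action in schedule:
        if action == "read":
            local[txn] = db["x"]
        elif action == "incr":
            local[txn] += 1
        elif action == "write":
            db["x"] = local[txn]
        # "commit" needs no work in this toy model
    return db["x"]

serial = [("T1", "read"), ("T1", "incr"), ("T1", "write"), ("T1", "commit"),
          ("T2", "read"), ("T2", "incr"), ("T2", "write"), ("T2", "commit")]
interleaved = [("T1", "read"), ("T1", "incr"), ("T2", "read"), ("T1", "write"),
               ("T2", "incr"), ("T2", "write"), ("T1", "commit"), ("T2", "commit")]

serial_result = run(serial)            # both increments are applied
interleaved_result = run(interleaved)  # T2 read x before T1 wrote it
```

In the interleaved sequence T2 overwrites T1's result using a stale value, so one increment is lost.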

  19. SQL-92 Isolation Levels Phenomena: • Dirty read • T1 modifies x, which is then read by T2 before T1 terminates; if T1 aborts, T2 has read a value that never existed in the database. • Non-repeatable (fuzzy) read • T1 reads x; T2 then modifies or deletes x and commits. T1 tries to read x again but reads a different value or can't find it. • Phantom • T1 searches the database according to a predicate while T2 inserts new tuples that satisfy the predicate.

  20. SQL-92 Isolation Levels (cont’d) • Read Uncommitted • For transactions operating at this level, all three phenomena are possible. • Read Committed • Fuzzy reads and phantoms are possible, but dirty reads are not. • Repeatable Read • Only phantoms possible. • Anomaly Serializable • None of the phenomena are possible.
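
The four levels can be summarized as a lookup from level to the phenomena it still permits. The dictionary layout and the helper `weakest_level_preventing` are assumptions made for this sketch; the contents restate the list above.

```python
# Phenomena each level still permits, ordered from weakest to strongest level.
POSSIBLE = {
    "read uncommitted": {"dirty read", "fuzzy read", "phantom"},
    "read committed": {"fuzzy read", "phantom"},
    "repeatable read": {"phantom"},
    "anomaly serializable": set(),
}

def weakest_level_preventing(unwanted):
    """Return the first (weakest) level at which none of `unwanted` can occur."""
    for level, possible in POSSIBLE.items():
        if not possible & set(unwanted):
            return level
    raise ValueError("no level prevents the requested phenomena")

no_dirty = weakest_level_preventing({"dirty read"})
no_fuzzy = weakest_level_preventing({"fuzzy read"})
no_phantom = weakest_level_preventing({"phantom"})
```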

  21. Durability • Once a transaction commits, the system must guarantee that the results of its operations will never be lost, in spite of subsequent failures. • Database recovery

  22. Characterization of Transactions Based on • Application areas • non-distributed vs. distributed • compensating transactions • heterogeneous transactions • Timing • on-line (short-life) vs batch (long-life) • Organization of read and write actions • two-step • restricted • action model • Structure • flat (or simple) transactions • nested transactions • workflows

  23. Transaction Structure • Flat transaction • Consists of a sequence of primitive operations enclosed between begin and end markers.
      Begin_transaction Reservation
        …
      end.
      • Nested transaction • The operations of a transaction may themselves be transactions.
      Begin_transaction Reservation
        …
        Begin_transaction Airline
          …
        end. {Airline}
        Begin_transaction Hotel
          …
        end. {Hotel}
      end. {Reservation}

  24. Nested Transactions • Have the same properties as their parents ⇒ may themselves have other nested transactions. • Introduces concurrency control and recovery concepts within the transaction. • Types • Closed nesting • Subtransactions begin after their parents and finish before them. • Commitment of a subtransaction is conditional upon the commitment of the parent (commitment through the root). • Open nesting • Subtransactions can execute and commit independently. • Compensation may be necessary.

  25. Workflows • “A collection of tasks organized to accomplish some business process.” [D. Georgakopoulos] • Types • Human-oriented workflows • Involve humans in performing the tasks. • System support for collaboration and coordination; but no system-wide consistency definition • System-oriented workflows • Computation-intensive & specialized tasks that can be executed by a computer • System support for concurrency control and recovery, automatic task execution, notification, etc. • Transactional workflows • In between the previous two; may involve humans, require access to heterogeneous, autonomous and/or distributed systems, and support selective use of ACID properties

  26. Workflow Example • T1: Customer request obtained • T2: Airline reservation performed • T3: Hotel reservation performed • T4: Auto reservation performed • T5: Bill generated (Figure: workflow graph over tasks T1 to T5, with tasks accessing the Customer Database.)

  27. Transactions Provide… • Atomic and reliable execution in the presence of failures • Correct execution in the presence of multiple user accesses • Correct management of replicas (if they support it)

  28. Transaction Processing Issues • Transaction structure (usually called transaction model) • Flat (simple), nested • Internal database consistency • Semantic data control (integrity enforcement) algorithms • Reliability protocols • Atomicity & Durability • Local recovery protocols • Global commit protocols

  29. Transaction Processing Issues • Concurrency control algorithms • How to synchronize concurrent transaction executions (correctness criterion) • Intra-transaction consistency, Isolation • Replica control protocols • How to control the mutual consistency of replicated data • One copy equivalence and ROWA

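  30. Architecture Revisited (Figure: Distributed Execution Monitor. The Transaction Manager (TM) receives Begin_transaction, Read, Write, Commit, Abort and returns Results; it communicates with other TMs and sends Scheduling/Descheduling Requests to the Scheduler (SC). The SC communicates with other SCs and passes operations to the data processor.)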

  31. Centralized Transaction Execution (Figure: User Applications send Begin_Transaction, Read, Write, Abort, EOT to the Transaction Manager (TM) and receive Results & User Notifications. The TM passes Read, Write, Abort, EOT to the Scheduler (SC) and receives Results. The SC passes Scheduled Operations to the Recovery Manager (RM) and receives Results.)

  32. Distributed Transaction Execution (Figure: distributed transaction execution model. The user application sends Begin_transaction, Read, Write, EOT, Abort to a TM and receives Results & User notifications. The TMs interact through the Replica Control Protocol and pass Read, Write, EOT, Abort to the SCs; the SCs interact through the Distributed Concurrency Control Protocol; the RMs run the Local Recovery Protocol.)

  33. Concurrency Control • The problem of synchronizing concurrent transactions such that the consistency of the database is maintained while, at the same time, maximum degree of concurrency is achieved. • Anomalies: • Lost updates • The effects of some transactions are not reflected on the database. • Inconsistent retrievals • A transaction, if it reads the same data item more than once, should always read the same value.

  34. Execution Schedule (or History) • An order in which the operations of a set of transactions are executed. • A schedule (history) can be defined as a partial order over the operations of a set of transactions.
      T1: Read(x)    T2: Write(x)    T3: Read(x)
          Write(x)       Write(y)        Read(y)
          Commit         Read(z)         Read(z)
                         Commit          Commit
      H1 = {W2(x), R1(x), R3(x), W1(x), C1, W2(y), R3(y), R2(z), C2, R3(z), C3}

  35. Formalization of Schedule A complete schedule $SC(T)$ over a set of transactions $T = \{T_1, \ldots, T_n\}$ is a partial order $SC(T) = \{\Sigma_T, <_T\}$ where • $\Sigma_T = \bigcup_i \Sigma_i$, for $i = 1, 2, \ldots, n$ • $<_T \supseteq \bigcup_i <_i$, for $i = 1, 2, \ldots, n$ • For any two conflicting operations $O_{ij}, O_{kl} \in \Sigma_T$, either $O_{ij} <_T O_{kl}$ or $O_{kl} <_T O_{ij}$

  36. Complete Schedule – Example Given three transactions
      T1: Read(x)    T2: Write(x)    T3: Read(x)
          Write(x)       Write(y)        Read(y)
          Commit         Read(z)         Read(z)
                         Commit          Commit
      a possible complete schedule is given as a DAG. (Figure: DAG over the operations R1(x), W1(x), C1 of T1; W2(x), W2(y), R2(z), C2 of T2; and R3(x), R3(y), R3(z), C3 of T3.)

  37. Schedule Definition A schedule is a prefix of a complete schedule such that only some of the operations and only some of the ordering relationships are included.
      T1: Read(x)    T2: Write(x)    T3: Read(x)
          Write(x)       Write(y)        Read(y)
          Commit         Read(z)         Read(z)
                         Commit          Commit
      (Figure: the complete schedule DAG shown beside a prefix of it that contains only R1(x), W2(x), R3(x), W2(y), R3(y), R2(z) and R3(z).)

  38. Serial History • All the actions of a transaction occur consecutively. • No interleaving of transaction operations. • If each transaction is consistent (obeys integrity rules), then the database is guaranteed to be consistent at the end of executing a serial history.
      T1: Read(x)    T2: Write(x)    T3: Read(x)
          Write(x)       Write(y)        Read(y)
          Commit         Read(z)         Read(z)
                         Commit          Commit
      Hs = {W2(x), W2(y), R2(z), C2, R1(x), W1(x), C1, R3(x), R3(y), R3(z), C3}
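
Whether a history is serial can be checked mechanically: no transaction may resume after another one has started. The sketch below assumes histories are written as lists of strings such as `W2(x)` and `C2`; the helper names `txn_of` and `is_serial` are illustrative.

```python
import re

def txn_of(op):
    """Return the transaction number in an operation such as 'W2(x)' or 'C2'."""
    return int(re.match(r"[RWCA](\d+)", op).group(1))

def is_serial(history):
    """True if each transaction's operations form one contiguous run."""
    finished = set()
    current = None
    for op in history:
        t = txn_of(op)
        if t != current:
            if t in finished:
                return False         # t resumes after another transaction ran
            if current is not None:
                finished.add(current)
            current = t
    return True

Hs = "W2(x) W2(y) R2(z) C2 R1(x) W1(x) C1 R3(x) R3(y) R3(z) C3".split()
H1 = "W2(x) R1(x) R3(x) W1(x) C1 W2(y) R3(y) R2(z) C2 R3(z) C3".split()
hs_serial = is_serial(Hs)
h1_serial = is_serial(H1)
```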

  39. Serializable History • Transactions execute concurrently, but the net effect of the resulting history upon the database is equivalent to some serial history. • Equivalent with respect to what? • Conflict equivalence: the relative order of execution of the conflicting operations belonging to unaborted transactions is the same in the two histories. • Conflicting operations: two incompatible operations (e.g., Read and Write) conflict if they both access the same data item. • Incompatible operations of each transaction are assumed to conflict; their execution order is not changed. • If two operations from two different transactions conflict, the corresponding transactions are also said to conflict.

  40. Serializable History
      T1: Read(x)    T2: Write(x)    T3: Read(x)
          Write(x)       Write(y)        Read(y)
          Commit         Read(z)         Read(z)
                         Commit          Commit
      The following are not conflict equivalent:
      Hs = {W2(x), W2(y), R2(z), C2, R1(x), W1(x), C1, R3(x), R3(y), R3(z), C3}
      H1 = {W2(x), R1(x), R3(x), W1(x), C1, W2(y), R3(y), R2(z), C2, R3(z), C3}
      The following are conflict equivalent; therefore H2 is serializable:
      Hs = {W2(x), W2(y), R2(z), C2, R1(x), W1(x), C1, R3(x), R3(y), R3(z), C3}
      H2 = {W2(x), R1(x), W1(x), C1, R3(x), W2(y), R3(y), R2(z), C2, R3(z), C3}
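
The two claims above can be verified by comparing the ordered pairs of conflicting operations in each history. This is a minimal sketch, assuming no aborted transactions and that every operation string occurs at most once per history; the helper names are illustrative.

```python
import re
from itertools import combinations

def parse(op):
    """'W2(x)' -> ('W', 2, 'x'); a commit such as 'C2' -> ('C', 2, None)."""
    kind, txn, item = re.match(r"([RWCA])(\d+)(?:\((\w+)\))?", op).groups()
    return kind, int(txn), item

def conflict_pairs(history):
    """Ordered pairs (a, b) of conflicting operations, a preceding b."""
    ops = [op for op in history if parse(op)[0] in "RW"]
    pairs = set()
    for a, b in combinations(ops, 2):    # combinations keeps history order
        (ka, ta, xa), (kb, tb, xb) = parse(a), parse(b)
        # Conflict: different transactions, same item, at least one write.
        if ta != tb and xa == xb and "W" in (ka, kb):
            pairs.add((a, b))
    return pairs

def conflict_equivalent(h1, h2):
    """Same operations, and every conflicting pair in the same relative order."""
    return sorted(h1) == sorted(h2) and conflict_pairs(h1) == conflict_pairs(h2)

Hs = "W2(x) W2(y) R2(z) C2 R1(x) W1(x) C1 R3(x) R3(y) R3(z) C3".split()
H1 = "W2(x) R1(x) R3(x) W1(x) C1 W2(y) R3(y) R2(z) C2 R3(z) C3".split()
H2 = "W2(x) R1(x) W1(x) C1 R3(x) W2(y) R3(y) R2(z) C2 R3(z) C3".split()
```

Hs and H1 differ only in the pair R3(x), W1(x): H1 executes the read first, Hs executes the write first. H2 orders that pair as Hs does.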

  41. Serializability in Distributed DBMS • Somewhat more involved. Two histories have to be considered: • local histories • global history • For global transactions (i.e., global history) to be serializable, two conditions are necessary: • Each local history should be serializable. • Two conflicting operations should be in the same relative order in all of the local histories where they appear together.

  42. Global Non-serializability
      T1: Read(x)        T2: Read(x)
          x ← x + 5          x ← x * 15
          Write(x)           Write(x)
          Commit             Commit
      The following two local histories are individually serializable (in fact serial), but the two transactions are not globally serializable.
      LH1 = {R1(x), W1(x), C1, R2(x), W2(x), C2}
      LH2 = {R2(x), W2(x), C2, R1(x), W1(x), C1}
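
One way to see the problem is to build a precedence graph per local history and then take their union: each local graph is acyclic, but the combined graph contains a cycle between T1 and T2. The sketch below assumes x denotes the same logical item at both sites; the helper names are illustrative.

```python
import re
from itertools import combinations

def parse(op):
    """'W2(x)' -> ('W', 2, 'x'); a commit such as 'C2' -> ('C', 2, None)."""
    kind, txn, item = re.match(r"([RWCA])(\d+)(?:\((\w+)\))?", op).groups()
    return kind, int(txn), item

def precedence_edges(history):
    """Edges (Ti, Tj) for each conflicting pair where Ti's operation comes first."""
    ops = [parse(op) for op in history if op[0] in "RW"]
    return {(ta, tb)
            for (ka, ta, xa), (kb, tb, xb) in combinations(ops, 2)
            if ta != tb and xa == xb and "W" in (ka, kb)}

def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True                  # back edge: cycle found
        visiting.add(node)
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in list(graph))

LH1 = "R1(x) W1(x) C1 R2(x) W2(x) C2".split()
LH2 = "R2(x) W2(x) C2 R1(x) W1(x) C1".split()
local_ok = [not has_cycle(precedence_edges(h)) for h in (LH1, LH2)]
global_ok = not has_cycle(precedence_edges(LH1) | precedence_edges(LH2))
```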

  43. Concurrency Control Algorithms
      • Pessimistic
        • Two-Phase Locking-based (2PL)
          • Centralized (primary site) 2PL
          • Primary copy 2PL
          • Distributed 2PL
        • Timestamp Ordering (TO)
          • Basic TO
          • Multiversion TO
          • Conservative TO
        • Hybrid
      • Optimistic
        • Locking-based
        • Timestamp ordering-based

  44. Locking-Based Algorithms • Transactions indicate their intentions by requesting locks from the scheduler (called lock manager). • Locks are either read lock (rl) [also called shared lock] or write lock (wl) [also called exclusive lock] • Read locks and write locks conflict (because Read and Write operations are incompatible):
           rl    wl
      rl   yes   no
      wl   no    no
      • Locking works nicely to allow concurrent processing of transactions.
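
The compatibility matrix translates directly into a lookup table used by a lock manager. This is a minimal sketch: the class `LockManager`, its methods, and the choice to return `False` instead of queueing the waiting transaction are assumptions made for illustration.

```python
# Compatibility of a held lock mode (first) with a requested mode (second).
COMPATIBLE = {
    ("rl", "rl"): True,  ("rl", "wl"): False,
    ("wl", "rl"): False, ("wl", "wl"): False,
}

class LockManager:
    def __init__(self):
        self.held = {}               # item -> list of (transaction, mode)

    def request(self, txn, item, mode):
        """Grant the lock unless another transaction holds a conflicting one."""
        holders = self.held.setdefault(item, [])
        for other, other_mode in holders:
            if other != txn and not COMPATIBLE[(other_mode, mode)]:
                return False         # the requester would have to wait
        holders.append((txn, mode))
        return True

    def release_all(self, txn):
        """Drop every lock held by txn."""
        for item in self.held:
            self.held[item] = [(t, m) for t, m in self.held[item] if t != txn]

lm = LockManager()
shared_1 = lm.request("T1", "x", "rl")
shared_2 = lm.request("T2", "x", "rl")     # read locks are compatible
blocked = lm.request("T3", "x", "wl")      # conflicts with both read locks
lm.release_all("T1")
lm.release_all("T2")
granted = lm.request("T3", "x", "wl")
```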

  45. Two-Phase Locking (2PL) • A transaction locks an object before using it. • When an object is locked by another transaction, the requesting transaction must wait. • When a transaction releases a lock, it may not request another lock. (Figure: number of locks held over time from BEGIN to END. Locks are obtained during Phase 1 up to the lock point and released during Phase 2.)
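
The third rule is what makes the protocol two-phase, and it is easy to check for a single transaction's lock trace. The `('lock', item)` / `('unlock', item)` encoding and the function name are assumptions made for this sketch.

```python
def is_two_phase(actions):
    """actions: one transaction's sequence of ('lock', item) / ('unlock', item)."""
    shrinking = False
    for kind, _item in actions:
        if kind == "unlock":
            shrinking = True         # the first release starts phase 2
        elif kind == "lock" and shrinking:
            return False             # a lock requested after a release
    return True

two_phase = [("lock", "x"), ("lock", "y"), ("unlock", "x"), ("unlock", "y")]
not_two_phase = [("lock", "x"), ("unlock", "x"), ("lock", "y"), ("unlock", "y")]
ok = is_two_phase(two_phase)
bad = is_two_phase(not_two_phase)
```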

  46. Strict 2PL Hold locks until the end. (Figure: locks held over the transaction duration from BEGIN to END. Locks are obtained during the period of data item use and are all released at the end.)

  47. Centralized 2PL • There is only one 2PL scheduler in the distributed system. • Lock requests are issued to the central scheduler. (Figure: message flow among the Coordinating TM, the Central Site LM and the Data Processors at participating sites, in order: Lock Request, Lock Granted, Operation, End of Operation, Release Locks.)

  48. Distributed 2PL • 2PL schedulers are placed at each site. Each scheduler handles lock requests for data at that site. • A transaction may read any of the replicated copies of item x, by obtaining a read lock on one of the copies of x. Writing into x requires obtaining write locks for all copies of x.
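
The read-one/write-all locking rule for replicated items can be sketched as a function that lists the lock requests an operation needs. The `copies` mapping, the site names, and the choice of the first listed copy for reads are assumptions made for illustration.

```python
def lock_requests(op, item, copies):
    """Return the (site, mode) lock requests needed for one operation.

    copies maps each item to the list of sites holding a replica of it.
    """
    sites = copies[item]
    if op == "read":
        # Any single copy suffices; picking the first is an arbitrary choice.
        return [(sites[0], "rl")]
    if op == "write":
        # Every copy must be write-locked.
        return [(site, "wl") for site in sites]
    raise ValueError(f"unknown operation: {op}")

copies = {"x": ["site1", "site2", "site3"]}   # hypothetical replica placement
read_locks = lock_requests("read", "x", copies)
write_locks = lock_requests("write", "x", copies)
```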

  49. Distributed 2PL Execution (Figure: message flow among the Coordinating TM, the Participating LMs and the Participating DPs, in order: Lock Request, Operation, End of Operation, Release Locks.)

  50. Timestamp Ordering • Transaction (Ti) is assigned a globally unique timestamp ts(Ti). • Transaction manager attaches the timestamp to all operations issued by the transaction. • Each data item is assigned a write timestamp (wts) and a read timestamp (rts): • rts(x) = largest timestamp of any read on x • wts(x) = largest timestamp of any write on x • Conflicting operations are resolved by timestamp order.
      Basic T/O:
      for Ri(x): if ts(Ti) < wts(x) then reject Ri(x) else accept Ri(x) and set rts(x) ← max(rts(x), ts(Ti))
      for Wi(x): if ts(Ti) < rts(x) or ts(Ti) < wts(x) then reject Wi(x) else accept Wi(x) and set wts(x) ← ts(Ti)
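
The basic T/O rules can be sketched as a small scheduler that tracks rts and wts per item. Two assumptions are worth flagging: this sketch rejects a write when ts(Ti) is smaller than either rts(x) or wts(x), which is the usual statement of the rule, and it updates rts(x) with a maximum so that it remains the largest read timestamp. The class name `BasicTO` and the default timestamp 0 for untouched items are illustrative.

```python
class BasicTO:
    def __init__(self):
        self.rts = {}    # item -> largest timestamp of any accepted read
        self.wts = {}    # item -> largest timestamp of any accepted write

    def read(self, ts, item):
        """Accept Ri(x) unless a younger transaction has already written x."""
        if ts < self.wts.get(item, 0):
            return False
        self.rts[item] = max(self.rts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        """Accept Wi(x) unless a younger transaction has read or written x."""
        if ts < self.rts.get(item, 0) or ts < self.wts.get(item, 0):
            return False
        self.wts[item] = ts
        return True

sched = BasicTO()
w5 = sched.write(5, "x")     # first write on x
r3 = sched.read(3, "x")      # too old: x was written at timestamp 5
r7 = sched.read(7, "x")
w6 = sched.write(6, "x")     # too old: x was read at timestamp 7
w8 = sched.write(8, "x")
```

A rejected operation would normally cause its transaction to abort and restart with a new timestamp; that step is left out here.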
