1 / 26

Transactions

Transactions. What is it?. Transaction - a logical unit of database processing Motivation - want consistent change of state in data Transactions developed in 1950's banking activities Idea of transaction formalized in 1970's Transactions in clouds in 2010’s?. Why?.

acharlton
Download Presentation

Transactions

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Transactions

  2. What is it? • Transaction - a logical unit of database processing • Motivation - want consistent change of state in data • Transactions developed in 1950's • banking activities • Idea of transaction formalized in 1970's • Transactions in clouds in 2010’s?

  3. Why? Want to interleave operations- • increases throughput of the system (number of transactions that can finish in any given period) • interleave I/Os and CPU time

  4. Problems 1) Inconsistent result if crash in middle of transaction crash due to: hardware failure, system error, exception condition 2) Can have error if concurrent execution 3) Uncertainty when changes permanent. -- Write to disk every time? Concurrency control and Recovery from failures are needed to solve these problems

  5. ACID properties – Jim Gray • ACID properties • Atomicity - transaction is indivisible, • Consistency - correct execution takes DB from one consistent state to another • Isolation – transactions affect each other as if not concurrent • Durability - once a transaction completes (commits), changes made to data are permanent and transaction is recoverable

  6. Systems guarantees 4 properties • ACID properties • Atomicity • all or nothing, performs transaction in entirety • solves 1) Inconsistent result if crash in middle of transaction • Consistency • a logical property based on some consistency rule, implied by isolation • If isolation is implemented properly • solves 2) correct execution takes DB from one consistent state to another • Isolation • equivalent to serial schedule (serializability) • T2 -> T1 or T1 -> T2 • takes care of 2) • Durability • changes are never lost due to subsequent failures – • solves 3) Uncertainty when changes permanent

  7. ACID properties • Atomicity ensures recovery • Consistency ensures programmer and DBMS enforce consistency • Isolation ensures concurrency control is utilized • Durability ensures recovery

  8. Operations of transactions • Read and Write of data items • Granularity can be  rows of a table or a table (Granularity can affect concurrency) • Each transaction has an identifier i assigned to it (Ti) • Transaction commit statement - causes transaction to end (commit) successfully (Ci) • Transaction abort (rollback) statement - all writes undone (Ai) • System or User can specify commit and abort

  9. R’s and W’s Notation: R1(X) - transaction 1 reads data item X     W2(Y) - transaction 2 writes to data item Y     C1 - transaction 1 commits BL1 – transaction 1 blocks     A2 - transaction 2 aborts, rollback • A series of R's and W's is a schedule (history) • Allow multiple users to execute simultaneously to access tables in a common DB • Concurrent access - concurrency

  10. Lost Update Problem – Dirty Write Dirty write - some value of DB is incorrect T1 T2     where A=10                     R1(A)                                          R2(A) (A=10) A=A-10                     W1(A)                                         A=A+20                                         W2(A) R1(A)R2(A)W1(A)C1W2(A)C2 The value of A is 30 when T1 and T2 are done, but it should be 20                  

  11. Dirty Read Problem Transaction updates DB item, then fails T1 T2     where A=10                     R1(A)                     A=A-10                     W1(A)                     R1(C)                                        R2(A)     (A=0, reads value T1 wrote)                     A1 - T1 fails                                         R2(B)                                         B=B+A                                         W2(B) R1(A)W1(A)R1(C)R2(A)W2(A)A1 R2(B)W2(B)C2 When T1 is aborted, A is reset to value of 10, values B                         written by T2 are incorrect

  12. Unrepeatable Read Transaction reads data item twice, in between values change T1 T2     where A=10                     R1(A)                     R1(B)                     B=B+A                     W1(B)                                        R2(A)                                        A=A+20                                        W2(A)                     R1(A)                     R1(C)                     C=C+A                     W1(C) R1(A)R1(B)W1(B)R2(A)W2(A)C2R1(A)R1(C)W1(C)C1 When T1 first reads A it has a value of 10, then a value of 30

  13. Degrees of Isolation DBs try to achieve the following: degree 0 - doesn't overwrite data updated (dirty data) by other transactions with degree at least 1 degree 1 - no lost updates (no dirty writes) degree 2 - no lost updates and no dirty reads degree 3 - degree 2 plus repeatable reads degree 4 - serializability

  14. Serializable A transaction history is serializable if it is equivalent to some serial schedule         (does not mean it is a serial schedule)    The important word here is ‘some’         Example of two serial schedules: R1(X)W1(X)C1  R2(X)W2(X)R2(Y)W2(Y)C2       T1<< T2 R2(X)W2(X)R2(Y)W2(Y)C2  R1(X)W1(X)C1       T2 << T1

  15. Equivalence • Is a given schedule equivalent to some serial schedule? R2(X)W2(X)R1(X)W1(X)R2(Y)W2(Y)C1C2 • How do we answer this question? • What is equivalence? • Need same number of operations • How to define equivalence

  16. Result equivalence Result equivalent if produce same final state R1(X) (X*2)W1(X)C1 R2(X)(X+Xmod10)W2(X)C2       if X=10 result is X=20 R1(X)R2(X)(T2:X+Xmod10)W2(X)C2(T1: X*2)W1(X) if X=10 result is X=20 Same results if X=10, but next try X= 7: (results are 18 and 14)

  17. Conflict equivalence First define conflicting operations R and W conflict if:    1.  from 2 different transactions    2.  reference same data item    3.  at least one is a write

  18. Conflicting operations The conflicting operations are:    Ri(A) << Wj(A)   read followed by a write    Wi(A) << Rj(A)   write followed by a read    Wi(A) << Wj(A)  write followed by a write Ri(A) << Rj(A)  don't conflict    Ri(A) << Wj(B)  don't conflict

  19. Conflict equivalence Conflict equivalent if order of any 2 conflicting operations is the same in both schedules R1(A)W1(A)C1  R2(A)W2(A)C2  or in reverse order R1(A)<<W2(A)                R2(A)<<W1(A)  W1(A)<<R2(A)                W2(A)<<R1(A) W1(A)<<W2(A)               W2(A)<<W1(A)

  20. Conflict equivalence R1(A)R2(A)W1(A)C1W2(A)C2                     R1(A)<<W2(A)                     R2(A)<<W1(A) W1(A)<<W2(A) NOT serializable - example of lost update

  21. Example Is this schedule serializable?          R2(X)W2(X)R1(X)W1(X)R2(Y)W2(Y)C1C2               R2(X)<<W1(X) W2(X)<<R1(X)          W2(X)<<W1(X) T2<<T1 Is this schedule serializable?    R2(X)R1(X)W2(X)R2(Y)W1(X)C1W2(Y)C2 R2(X)<<W1(X)          R1(X)<<W2(X) W2(X)<<W1(X) not equivalent

  22. Strategy • There is a simple algorithm for determining conflict serializability of a schedule • What is the algorithm?

  23. Testing for Equivalence Construct a precedence graph (aka serialization graph) 1.  For each Ti in S, create node Ti in graph   2.  For all Rj(X) after Wi(X) create an edge Ti->Tj   3.  For all Wj(X) after Ri(X) create an edge Ti->Tj   4.  For all Wj(X) after Wi(X) create an edge Ti-> Tj    5.  Serializable iff no cycles     If a cycle in the graph, no equivalent serial history

  24. Example Is this the following schedule serializable? R1(A)R2(A)W1(A)W2(A)C1C2 T1  T2 

  25. Serializability • If a schedule is serializable, we can say it is correct • Serializable does not mean it is serial • Difficult to test for conflict serializability - interleaving of operations is determined by the operating system scheduler • Concurrency control methods don't test for it. • Instead, protocols developed that guarantee a schedule is serializable.

  26. Concurrency Control Techniques Protocols - set of rules that guarantee serializability • locking • Timestamps • multiversion • validation or certification - optimistic

More Related