Transactions and Wrap-Up - PowerPoint PPT Presentation

Transactions and wrap up
Download
1 / 25

Transactions and Wrap-Up. Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 5, 2007. Some slide content derived from Ramakrishnan & Gehrke. Reminders. Please be sure you’re signed up for a project demo Due by 12/14: 5-10 page report describing:

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

Transactions and Wrap-Up

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Transactions and wrap up

Transactions and Wrap-Up

Zachary G. Ives

University of Pennsylvania

CIS 550 – Database & Information Systems

December 5, 2007

Some slide content derived from Ramakrishnan & Gehrke


Reminders

Reminders

Please be sure you’re signed up for a project demo

  • Due by 12/14: 5-10 page report describing:

    • What your project goals were

    • What you implemented

    • Basic architecture and design

    • Division of labor

  • And the final code!

  • Also: please email me an assessment of your group; group members’ contributions; your contributions

  • Final examination will be available on Monday @ noon (unless otherwise announced)

    • Cheryl Hickey, Levine 502

    • You’ll sign when you pick it up and return it


  • Acid semantics

    ACID Semantics

    • Atomicity: each operation looks atomic to the user

    • Consistency: each operation in isolation keeps the database in a consistent state (this is the responsibility of the user)

    • Isolation: should be able to understand what’s going on by considering each separate transaction independently

    • Durability: updates stay in the DBMS!!!

    • Achieved through transactional storage


    What is a transaction

    What is a Transaction?

    A transaction is a sequence of read and write operations on data items that logically functions as one unit of work:

    • should either be done entirely or not at all

    • if it succeeds, the effects of write operations persist (commit); if it fails, no effects of write operations persist (abort)

    • these guarantees are made despite concurrent activity in the system, and despite failures that may occur


    How things can go awry

    How Things Can Go Awry

    • Suppose we have a table of bank accounts which contains the balance of the account

    • An ATM deposit of $50 to account # 1234 would be written as:

    • This reads and writes the account’s balance

    • What if two accountholders make deposits simultaneously from two ATMs?

    update Accounts

    set balance = balance + $50

    where account#= ‘1234’;


    Concurrent deposits

    Concurrent Deposits

    This SQL update code is represented as a sequence of read and write operations on “data items” (which for now should be thought of as individual accounts):

    where X is the data item representing the account with account# 1234.

    Deposit 1 Deposit 2

    read(X.bal) read(X.bal)

    X.bal := X.bal + $50 X.bal:= X.bal + $10

    write(X.bal) write(X.bal)


    A bad concurrent execution

    BAD!

    time

    A “Bad” Concurrent Execution

    Only one “action” (e.g. a read or a write) can actually happen at a time, and we can interleave deposit operations in many ways:

    Deposit 1 Deposit 2

    read(X.bal)

    read(X.bal)

    X.bal := X.bal + $50

    X.bal:= X.bal + $10

    write(X.bal)

    write(X.bal)


    A good execution

    GOOD!

    time

    A “Good” Execution

    • Previous execution would have been fine if the accounts were different (i.e. one were X and one were Y), i.e., transactions were independent

    • The following execution is a serial execution, and executes one transaction after the other:

    Deposit 1 Deposit 2

    read(X.bal)

    X.bal := X.bal + $50

    write(X.bal)

    read(X.bal)

    X.bal:= X.bal + $10

    write(X.bal)


    Good executions

    Good Executions

    An execution is “good” if it is serial (transactions are executed atomically and consecutively) or serializable (i.e. equivalent to some serial execution)

    Equivalent to executing Deposit 1 then 3, or vice versa

    • Why would we want to do this instead?

    Deposit 1 Deposit 3

    read(X.bal)

    read(Y.bal)

    X.bal := X.bal + $50

    Y.bal:= Y.bal + $10

    write(X.bal)

    write(Y.bal)


    Atomicity

    Transfer

    read(X.bal)

    read(Y.bal)

    X.bal= X.bal-$100

    Y.bal= Y.bal+$100

    CRASH

    Atomicity

    Problems can also occur if a crash occurs in the middle of executing a transaction:

    Need to guarantee that the write to X does not persist (ABORT)

    • Default assumption if a transaction doesn’t commit


    Transactions in sql

    Transactions in SQL

    • A transaction begins when any SQL statement that queries the db begins.

    • To end a transaction, the user issues a COMMIT or ROLLBACK statement.

    Transfer

    UPDATE Accounts

    SET balance = balance - $100

    WHERE account#= ‘1234’;

    UPDATE Accounts

    SET balance = balance + $100

    WHERE account#= ‘5678’;

    COMMIT;


    Read only vs read write transactions

    Read-Only vs. Read-Write Transactions

    • We can tell the DBMS that we won’t be performing any updates:

    • If we are going to modify the DBMS, we need:

    SET TRANSACTION READ ONLY;

    SELECT * FROM Accounts

    WHERE account#=‘1234’;

    SET TRANSACTION READ WRITE;

    UPDATE Accounts

    SET balance = balance - $100

    WHERE account#= ‘1234’; ...


    Dirty reads

    Dirty Reads

    • Dirty data is data written by an uncommitted transaction; a dirty read is a read of dirty data (WR conflict)

    • Sometimes we can tolerate dirty reads; other times we cannot:

      e.g., if we wished to ensure balances never went negative in the transfer example, we should test that there is enough money first!


    Bad dirty read

    “Bad” Dirty Read

    EXEC SQLselect balance into :bal

    from Accounts

    where account#=‘1234’;

    if (bal > 100) {

    EXEC SQLupdate Accounts

    set balance = balance - $100

    where account#= ‘1234’;

    EXEC SQLupdate Accounts

    set balance = balance + $100

    where account#= ‘5678’;

    }

    EXEC SQLCOMMIT;

    If the initial read (italics) were dirty, the balance

    could become negative!


    Acceptable dirty read

    Acceptable Dirty Read

    If we are just checking availability of an airline seat, a dirty read might be fine! (Why is that?)

    Reservation transaction:

    EXEC SQL select occupied into :occ

    from Flights

    where Num= ‘123’ and date=11-03-99

    and seat=‘23f’;

    if (!occ) {EXEC SQL

    update Flights

    set occupied=true

    where Num= ‘123’ and date=11-03-99

    and seat=‘23f’;}

    else {notify user that seat is unavailable}


    Other undesirable phenomena

    Other Undesirable Phenomena

    • Unrepeatable read: a transaction reads the same data item twice and gets different values (RW conflict)

      • Why? Someone changed the tuple

    • Phantom problem: a transaction retrieves a collection of tuples twice and sees different results

      • Why? Someone added or removed a tuple


    Isolation

    Isolation

    • The problems we’ve seen are all related to isolation

    • General rules of thumb w.r.t. isolation:

      • Fully serializable isolation is expensive

        • We can’t do as many things concurrently (or we have to undo them frequently)

      • For performance, the DBMS lets you relax the isolation levelif your application can tolerate it, e.g:

        SET TRANSACTION READ WRITE

        ISOLATION LEVEL READ UNCOMMITTED;


    Implementing isolation levels

    Implementing Isolation Levels

    One approach – use locking at some level:

    • each data item is either locked (in some mode, e.g. shared or exclusive) or is available (no lock)

    • an action on a data item can be executed if the transaction holds an appropriate lock

    • consider granularity of locks – how big of an item to lock

      • Larger granularity = fewer locking operations but more contention!

      • tuple, page, table, etc.

        Appropriate locks:

    • Before a read, a shared lock must be acquired

    • Before a write, an exclusive lock must be acquired


    Locks prevent bad execution

    Locks Prevent “Bad” Execution

    If the system used locking, the first “bad” execution could have been avoided:

    Deposit 1 Deposit 2

    xlock(X)

    read(X.bal)

    {xlock(X) is not granted}

    X.bal := X.bal + $50

    write(X.bal)

    release(X)

    xlock(X)

    read(X.bal)

    X.bal:= X.bal + $10

    write(X.bal)

    release(X)


    Locking and serializability

    Locking and Serializability

    • A transaction must hold all locks until it terminates (a condition called strict locking)

    • It turns out that this is crucial to guarantee serializability

      • Note that the first (bad) example could have been produced if transactions acquired and immediately released locks.


    Well formed two phased transactions

    Well-Formed, Two-Phased Transactions

    • A transaction is well-formed if it acquires at least a shared lock on Q before reading Q or an exclusive lock on Q before writing Q and doesn’t release the lock until the action is performed

      • Locks are also released by the end of the transaction

    • A transaction is two-phased if it never acquires a lock after unlocking one

      • i.e., there are two phases: a growing phase in which the transaction acquires locks, and a shrinking phase in which locks are released


    Two phased locking theorem

    Two-Phased Locking Theorem

    • If all transactions are well-formed and two-phase, then any schedule in which conflicting locks are never granted ensures serializability

      • i.e., there is a very simple scheduler!

    • However, if some transaction is not well-formed or two-phase, then there is some schedule in which conflicting locks are never granted but which fails to be serializable

      • i.e., one bad apple spoils the bunch


    Summary

    Summary

    • Transactions are all-or-nothing units of work guaranteed despite concurrency or failures in the system

    • Theoretically, the “correct” execution of transactions is serializable (i.e. equivalent to some serial execution)

    • Practically, this may adversely affect throughput  isolation levels

    • With isolation levels, users can specify the level of “incorrectness” they are willing to tolerate


    What to look for down the road

    What to Look for Down the Road

    Well, no one really knows the answer to this…

    But here are some current directions:

    • Sensors, networks, and streaming data

    • Peer-to-peer meets databases and data integration

      • “The Semantic Web”

    • Security and privacy – especially as integration becomes more commonplace

    • Uncertainty and ranked retrieval

    • … We have lots of research projects at Penn relating to these and other topics


    A plug for next year

    A Plug for Next Year

    CIS 455/555 (Spring): Internet and Web Systems

    • Focus: building and interconnecting scalable Web servers and services; information retrieval; integration

    • Emphasis on implementation, programming-in-the-large, experimentation – need substantial coding experience

    • Consider it 1.5CU – but you will get out of the course what you put into it

      CIS 650 (Fall): Building Data Management Systems

      Meanwhile… Best of luck on your projects and exams – and have a wonderful break!

    • I hope you learned a lot in this course and that it was enjoyable!


  • Login