
CSIS 7102 Spring 2004 Lecture 4 : Issues in locking techniques



Presentation Transcript


  1. CSIS 7102 Spring 2004 Lecture 4 : Issues in locking techniques Dr. King-Ip Lin

  2. Table of contents • Implementation of locks • The phantom problem • Multiple granularity locks • Isolation levels

  3. The story so far • Two-phase locking (2PL) as a protocol to ensure conflict serializability • Once a transaction starts releasing locks, it cannot obtain new locks • Ensures that conflicts cannot go in both directions • 2PL does not guarantee recoverability • Strict 2PL ensures recoverability • Rigorous 2PL ensures strict schedules • Deadlock handling in locking • Detection • Time-out • Cycles in wait-for graph • Avoidance • Wait-die vs. wound-wait • Prevention • Conservative 2PL • Ordering of resources

  4. Locks -- implementation • Various kinds of support are needed to implement locking • OS support – lock(X) must be an atomic operation at the OS level • i.e. support for critical sections • Implementation of read(X)/write(X) – automatically add code for locking • Lock manager – module to handle and keep track of locks

  5. Locks -- implementation • A transaction Ti issues the standard read/write instructions, without explicit locking calls. • The operation read(D) is processed as:
  if Ti has a lock on D
    then read(D)
  else begin
    if necessary, wait until no other transaction has an X-lock on D (or apply any deadlock-avoidance rules)
    grant Ti an S-lock on D;
    read(D)
  end

  6. Locks -- implementation • write(D) is processed as:
  if Ti has an X-lock on D
    then write(D)
  else begin
    if necessary, wait until no other transaction has any lock on D (or apply any deadlock-avoidance rules)
    if Ti has an S-lock on D
      then upgrade the lock on D to an X-lock
      else grant Ti an X-lock on D
    write(D)
  end
  • All locks are released after commit or abort
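The implicit-locking logic on these two slides can be sketched in Python. This is a single-threaded illustration under stated assumptions: the LockManager class, LockError, and the in-memory db dict are invented for the sketch, blocking is replaced by raising LockError, and deadlock handling is omitted.

```python
class LockError(Exception):
    """Raised instead of blocking; a real system would wait or run deadlock avoidance."""

class LockManager:
    def __init__(self):
        self.locks = {}                          # item -> {txn_id: "S" or "X"}

    def _holders(self, item):
        return self.locks.setdefault(item, {})

    def acquire_S(self, txn, item):
        holders = self._holders(item)
        if txn in holders:                       # already holds S or X on item
            return
        if any(m == "X" for t, m in holders.items() if t != txn):
            raise LockError(f"{txn} must wait: X-lock held on {item}")
        holders[txn] = "S"

    def acquire_X(self, txn, item):
        holders = self._holders(item)
        if holders.get(txn) == "X":
            return
        if any(t != txn for t in holders):       # any other transaction's lock blocks X
            raise LockError(f"{txn} must wait: {item} is locked")
        holders[txn] = "X"                       # grants a new X or upgrades S -> X

    def release_all(self, txn):                  # called at commit/abort (strict 2PL)
        for holders in self.locks.values():
            holders.pop(txn, None)

db = {"D": 10}                                   # toy data store for the sketch
lm = LockManager()

def read(txn, item):
    lm.acquire_S(txn, item)                      # no-op if txn already holds a lock on item
    return db[item]

def write(txn, item, value):
    lm.acquire_X(txn, item)                      # upgrades an S-lock to X if needed
    db[item] = value
```

For example, after T1 calls read("T1", "D") and write("T1", "D", 11), a read("T2", "D") raises LockError until T1 releases its locks at commit.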

  7. Locks – Lock manager • A lock manager can be implemented as a separate process to which transactions send lock and unlock requests • The lock manager replies to a lock request by sending a lock grant message (or a message asking the transaction to roll back, in case of a deadlock) • The requesting transaction waits until its request is answered • The lock manager maintains a data structure called a lock table to record granted locks and pending requests • The lock table is usually implemented as an in-memory hash table indexed on the name of the data item being locked

  8. Locks – lock manager • Black rectangles indicate granted locks, white ones indicate waiting requests • The lock table also records the type of lock granted or requested • A new request is added to the end of the queue of requests for the data item, and granted if it is compatible with all earlier locks • Unlock requests result in the request being deleted, and later requests are checked to see if they can now be granted • If a transaction aborts, all waiting or granted requests of the transaction are deleted • The lock manager may keep a list of locks held by each transaction, to implement this efficiently
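The lock-table description above can be sketched as a hash table of per-item request queues. The Request/LockTable names and the S/X-only compatibility map are assumptions for illustration; a real lock manager would also block waiters, support more lock modes, and detect deadlocks.

```python
from collections import defaultdict

# held-mode -> requested-mode compatibility (S/X only, as on the earlier slides)
COMPATIBLE = {("S", "S"): True, ("S", "X"): False,
              ("X", "S"): False, ("X", "X"): False}

class Request:
    def __init__(self, txn, mode):
        self.txn, self.mode, self.granted = txn, mode, False

class LockTable:
    def __init__(self):
        self.table = defaultdict(list)            # item name -> queue of Requests
        self.by_txn = defaultdict(list)           # txn -> items it has requests on

    def request(self, txn, item, mode):
        """Append to the item's queue; grant iff compatible with all earlier requests."""
        req = Request(txn, mode)
        queue = self.table[item]
        req.granted = all(COMPATIBLE[(r.mode, mode)] for r in queue)
        queue.append(req)
        self.by_txn[txn].append(item)
        return req.granted

    def release(self, txn, item):
        """Delete txn's request, then re-check waiters in FIFO order."""
        self.table[item] = [r for r in self.table[item] if r.txn != txn]
        for i, r in enumerate(self.table[item]):
            if not r.granted:
                r.granted = all(COMPATIBLE[(p.mode, r.mode)]
                                for p in self.table[item][:i])

    def abort(self, txn):
        """Drop all of txn's requests, using the per-transaction list."""
        for item in self.by_txn.pop(txn, []):
            self.release(txn, item)
```

A waiting S-request behind an X-lock becomes granted as soon as the X-holder releases, mirroring the grant-on-unlock behaviour described on the slide.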

  9. Insertion & Deletion • Does delete(X) conflict with read(X)/write(X)? • Yes. The wrong order leads to errors • Thus delete(X) should be treated like a write operation • Request X-locks • The insert(X) operation is similar. • An X-lock is given to the newly created tuple.

  10. The phantom menace • Consider the following two transactions • There does not seem to be a conflict (in terms of tuples) • Assume initially CS faculty have total salary 1,000,000 and Math faculty have total salary 2,000,000 • Then T1 -> T2 implies the select statements return 1,000,000 and 2,000,000 • T2 -> T1 implies the select statements return 1,001,000 and 2,005,000
  T1:
  1. Select sum(salary) From faculty Where dept = “CS”
  2. Select sum(salary) From faculty Where dept = “Math”
  T2:
  a. Insert into faculty values (“Lin”, “CS”, 1000)
  b. Insert into faculty values (“Lam”, “Math”, 5000)

  11. The phantom menace • But consider the following schedule • The output will be 1,001,000 and 2,000,000 • Not conflict serializable!
  T2: a. Insert into faculty values (“Lin”, “CS”, 1000)
  T1: 1. Select sum(salary) From faculty Where dept = “CS”
  T1: 2. Select sum(salary) From faculty Where dept = “Math”
  T2: b. Insert into faculty values (“Lam”, “Math”, 5000)

  12. The phantom problem • This is known as the phantom problem • Why does it occur? • No tuples are in conflict • However, a conflict occurs over the tuples that satisfy a certain condition (dept = “CS”, dept = “Math”) • T1 requires access to ALL tuples satisfying the condition • However, T2 changes the set of tuples satisfying the condition • No quick solution: index-locking as a possibility (later)
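The phantom example can be checked with a small simulation (plain Python, no SQL engine; the existing CS and Math rows are collapsed into single tuples carrying the slide's totals): the interleaved schedule's output matches neither serial order.

```python
def sum_dept(faculty, dept):
    """SELECT sum(salary) FROM faculty WHERE dept = ?"""
    return sum(sal for (_, d, sal) in faculty if d == dept)

# One tuple per department standing in for the slide's starting totals
base = [("cs_rows", "CS", 1_000_000), ("math_rows", "Math", 2_000_000)]

# Serial T1 -> T2: both SELECTs run before any insert
f = list(base)
serial_t1_first = (sum_dept(f, "CS"), sum_dept(f, "Math"))

# Serial T2 -> T1: both inserts happen first
f = list(base)
f.append(("Lin", "CS", 1000))
f.append(("Lam", "Math", 5000))
serial_t2_first = (sum_dept(f, "CS"), sum_dept(f, "Math"))

# Interleaved: insert Lin, then both SELECTs, then insert Lam
f = list(base)
f.append(("Lin", "CS", 1000))
interleaved = (sum_dept(f, "CS"), sum_dept(f, "Math"))
f.append(("Lam", "Math", 5000))

# The phantom: the interleaving is equivalent to neither serial order
assert interleaved not in (serial_t1_first, serial_t2_first)
```

No individual tuple is read and written by both transactions, yet the schedule is not serializable, because the inserts change which tuples satisfy T1's predicates.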

  13. Multiple granularity locking • Another implementation issue: what to lock? • Options • Whole database • Table • Pages (assume no page holds tuples from multiple tables) • Tuples • Part of a tuple (e.g. single attributes) • Pros and cons?

  14. Multiple granularity locking • Example 1: select * from person where age > 30 • How many tuples need to be examined? (Suppose we do not have any index) • What kind of lock do you need to issue? • At what level do you want to lock?

  15. Multiple granularity locking • Example 2: select * from person where id# = “A1234567” • Now suppose we have an index (say, a B+-tree) on the person table • How many tuples need to be examined? • What kind of lock do you need to issue? • At what level do you want to lock?

  16. Multiple granularity locking • Example 3: update person set salary = salary * 1.1 • How many tuples need to be examined? • What kind of lock do you need to issue? • At what level do you want to lock?

  17. Multiple granularity locking • Example 4: update person set salary = salary * 1.1 where age > 40 • How many tuples need to be examined? (Assume no index) • What kind of lock do you need to issue? • At what level do you want to lock?

  18. Multiple granularity locking • In short • Allowing locking only at the tuple level leads to significant inefficiency for many queries • How about allowing only table locks? • Less concurrency • Can we allow both? • How do we implement that?

  19. Multiple granularity locking • Assume we allow both • i.e. for S-lock(X) and X-lock(X), X can be a table, a page, or a tuple

  20. Multiple granularity locking • Example 1: (no index) select * from person where age > 30 • S-lock(Person) • Example 2: (with index) select * from person where id# = “A1234567” • S-lock(the tuple in Person with id# = “A1234567”)

  21. Multiple granularity locking • Example 3: update person set salary = salary * 1.1 • X-lock(Person) • Example 4: update person set salary = salary * 1.1 where age > 40 • S-lock(Person), X-lock(tuples with age > 40)

  22. Multiple granularity locking • How to implement? • Entries in the lock table specify which type of object is locked. • To request a lock • Check whether a lock is already placed on the object • How to check?

  23. Multiple granularity locking • Suppose X-lock(Person) • What do we need to check? • Any lock placed on the Person table • Check if there is an entry for the Person table in the lock table • Is that enough? • What if a lock is placed on a tuple of Person? • However, checking the lock table for every tuple of the Person table is inefficient

  24. Multiple granularity locking • Suppose X-lock(Person) • Alternative solution: • When placing an X-lock on a tuple of Person, place an X-lock on the Person table also • This defeats the purpose of multiple granularity! • What can we do?

  25. Multiple granularity locking • Goal: • Ensure correctness: • Example: if a tuple in Table A is X-locked, then no other transaction can S-lock/X-lock Table A • Ensure concurrency: • Example: if a tuple in Table A is X-locked, then a transaction should be allowed to X-lock another tuple in Table A, provided no transaction S-locks/X-locks the whole table • Ensure efficiency: • Example: one should avoid scanning the whole lock table to determine whether locking is allowed

  26. Multiple granularity locking • Solution: • Correctness: hierarchy • The resources are organized in a hierarchical fashion • E.g. DB = database, A = table, F = page, r = tuple • Locking must be done from top to bottom • Unlocking must be done from bottom to top • Locking any node with an (S, X)-lock implies locking every node below it with the same lock

  27. Multiple granularity locking • Solution: • Concurrency: intention locks (I-locks) • Not locking the object itself, but declaring an intention to lock part of the object below • Three types • IS lock: intention to S-lock a portion of the object below • IX lock: intention to X-lock a portion of the object below • SIX lock: S-lock the object with an intention to X-lock part of the object below

  28. Multiple granularity locking • Solution: • Efficiency: separate lock table for each type • Lock tables for tables, pages, tuples, etc. • With intention locks, one only needs to check one entry for each table

  29. Multiple granularity locking • Example: S-Lock a tuple in Person table • IS-lock the whole database • IS-lock the Person table • S-lock the required tuple • Example: X-lock a tuple in Person table • IX-lock the whole database • IX-lock the Person table • X-lock the required tuple

  30. Multiple granularity locking • Example: X-lock the whole Person table • IX-lock the whole database • X-lock the Person table

  31. Multiple granularity locking • Revisiting earlier examples • Example 1: (no index) select * from person where age > 30 • IS-lock(Database) • S-lock(Person) • Example 2: (with index) select * from person where id# = “A1234567” • IS-lock(Database) • IS-lock(Person) • S-lock(the tuple in Person with id# = “A1234567”)

  32. Multiple granularity locking • Example 3: update person set salary = salary * 1.1 • IX-lock(Database) • X-lock(Person) • Example 4: update person set salary = salary * 1.1 where age > 40 • IX-lock(Database) • SIX-lock(Person) • X-lock(tuples with age > 40) • If the system is smart, it can estimate how many tuples have age > 40, then decide whether to place individual X-locks on those tuples or to upgrade to X-lock(Person)

  33. Multiple granularity locking • Lock compatibility table (row: mode T2 holds; column: mode T1 requests; yes = compatible)
        IS    IX    S     SIX   X
  IS    yes   yes   yes   yes   no
  IX    yes   yes   no    no    no
  S     yes   no    yes   no    no
  SIX   yes   no    no    no    no
  X     no    no    no    no    no


  35. Multiple granularity locking • Transaction Ti can lock a node Q using the following rules: 1. The lock compatibility matrix must be observed. 2. The root of the tree must be locked first, and may be locked in any mode. 3. A node Q can be locked by Ti in S or IS mode only if the parent of Q is currently locked by Ti in either IX or IS mode. 4. A node Q can be locked by Ti in X, SIX, or IX mode only if the parent of Q is currently locked by Ti in either IX or SIX mode. 5. Ti can lock a node only if it has not previously unlocked any node (that is, Ti is two-phase). 6. Ti can unlock a node Q only if none of the children of Q are currently locked by Ti.
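Rules 1-4 above can be sketched as a checker. The compatibility matrix is the standard IS/IX/S/SIX/X table; the MGLock class and the tiny database -> table -> tuple hierarchy are illustrative assumptions (rules 5 and 6, the two-phase and bottom-up-unlock conditions, are omitted for brevity).

```python
# COMPAT[held][requested]: can another transaction's `requested` lock coexist?
COMPAT = {
    "IS":  {"IS": True,  "IX": True,  "S": True,  "SIX": True,  "X": False},
    "IX":  {"IS": True,  "IX": True,  "S": False, "SIX": False, "X": False},
    "S":   {"IS": True,  "IX": False, "S": True,  "SIX": False, "X": False},
    "SIX": {"IS": True,  "IX": False, "S": False, "SIX": False, "X": False},
    "X":   {"IS": False, "IX": False, "S": False, "SIX": False, "X": False},
}

# Rules 3 and 4: required parent modes for each requested mode
PARENT_OK = {"S": {"IS", "IX"}, "IS": {"IS", "IX"},
             "X": {"IX", "SIX"}, "SIX": {"IX", "SIX"}, "IX": {"IX", "SIX"}}

class MGLock:
    def __init__(self, parent_of):
        self.parent_of = parent_of            # node -> parent node (root maps to None)
        self.held = {}                        # (txn, node) -> mode

    def lock(self, txn, node, mode):
        parent = self.parent_of[node]
        # Rule 2: the root may be locked in any mode; rules 3/4 otherwise
        if parent is not None:
            pmode = self.held.get((txn, parent))
            if pmode not in PARENT_OK[mode]:
                raise ValueError(f"{txn} needs one of {PARENT_OK[mode]} on {parent} "
                                 f"before taking {mode} on {node}")
        # Rule 1: compatibility with other transactions' locks on this node
        for (t, n), m in self.held.items():
            if n == node and t != txn and not COMPAT[m][mode]:
                raise ValueError(f"{txn}: {mode} on {node} conflicts with {t}'s {m}")
        self.held[(txn, node)] = mode

# X-locking a tuple, as in the earlier example: IX down the path, then X
tree = {"DB": None, "Person": "DB", "tuple1": "Person"}
mgl = MGLock(tree)
mgl.lock("T1", "DB", "IX")
mgl.lock("T1", "Person", "IX")
mgl.lock("T1", "tuple1", "X")
```

With T1 holding IX on the database, another transaction can still take IS on the database and work on other tables, but an S-lock on the whole database is refused, matching the compatibility table.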

  36. Isolation levels • The goal of locking is to ensure serializability • To ensure serializability, we require 2PL. • No new locks are acquired once lock releasing begins • Locks need to be held for a long time (long locks) • One cannot acquire a lock, finish the work with it, and then release it immediately • Adv: ensures serializability • Dis: less concurrency • Is serializability (long locks / two-phase locking) necessary in all cases?

  37. Isolation levels • Example • Consider recording the temperature from an electronic thermometer • We record the temperature every minute at a new location on the disk • We have variables that store the last hour’s mean and variance; they need to be updated hourly

  38. Isolation levels • Example • Consider the following four transactions • T0: Record the current temperature onto the disk • T1: Calculate the mean and variance of the temperature over the past hour and store them on the disk • T2: Read the mean from the disk, do two calculations, and store the results in two locations (to be used by other transactions) • T3: Read the mean and variance, do two calculations, and store the results in two locations (to be used by other transactions)

  39. Isolation levels • T0: Record the current temperature onto the disk • It just records the latest temperature • Need S-locks? • No • Need X-locks? • Yes • Need long X-locks? • No; it can release the lock immediately and commit, since we are only recording the “current” temperature and there is no need to check any other consistency

  40. Isolation levels • T1: Calculate the mean • It reads the most recent values • It writes to two fixed locations • Need S-locks? • No, because the values read are never going to be updated/overwritten/invalidated • That means it reads without requesting any locks • Need X-locks? • Yes • Need long X-locks? (i.e. can we do the following: X-lock(mean); Write(mean); Unlock(mean); X-lock(variance); Write(variance); Unlock(variance)?) • We cannot allow that (i.e. we must have long X-locks); otherwise, a transaction may read the mean and variance in between, and the values would not be consistent with one another

  41. Isolation levels • T2: Read the mean, do two calculations, and store the results in two locations • It needs to read a valid value for the mean • It writes the calculated results into two separate locations • Need S-locks? • Yes. If we read without obtaining a lock, we may read a value written by a T1 that may abort, thus reading an invalid value of the mean • Need long S-locks? • No. It only reads the mean once, and once it obtains the lock, the mean should be a valid one • Need long X-locks? • Yes, by a similar argument as before

  42. Isolation levels • T3: Read the mean and variance, do two calculations, and store the results in two locations • It needs to read valid values for the mean and variance • It writes the calculated results into two separate locations • Need S-locks? • Yes, by a similar argument as before • Need long S-locks? • Yes. Otherwise, the following schedule may happen, leaving T4 with inconsistent values of the mean and variance • Need long X-locks? • Yes, by a similar argument
  T4: S-lock(mean); Read(mean); Unlock(mean) … S-lock(variance); Read(variance); Unlock(variance)
  T1 (interleaved between T4’s two reads): Read(readings); X-lock(mean); Write(mean); Unlock(mean); X-lock(variance); Write(variance); Unlock(variance)
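The danger of releasing the S-lock on the mean before reading the variance can be shown with a small simulation (the store dict, values, and function names are illustrative assumptions): with short S-locks the reader sees a mixed pair; with long S-locks it sees a consistent snapshot.

```python
store = {"mean": 20.0, "var": 4.0}

def writer_update(new_mean, new_var):
    """The mean/variance writer; assumed to run under long X-locks."""
    store["mean"], store["var"] = new_mean, new_var

# Short S-locks: read mean, release, writer slips in, then read variance
m = store["mean"]                 # S-lock(mean); Read(mean); Unlock(mean)
writer_update(25.0, 9.0)          # writer runs between the two reads
v = store["var"]                  # S-lock(variance); Read(variance); Unlock(variance)
short_lock_pair = (m, v)          # old mean paired with new variance

# Long S-locks: both reads complete before the writer can get its X-locks
store = {"mean": 20.0, "var": 4.0}
m, v = store["mean"], store["var"]   # both S-locks held across both reads
long_lock_pair = (m, v)              # a consistent snapshot
writer_update(25.0, 9.0)             # writer proceeds only after the S-locks are released

assert short_lock_pair == (20.0, 9.0)   # inconsistent: mean and variance mismatch
assert long_lock_pair == (20.0, 4.0)    # consistent pair
```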

  43. Isolation levels • Thus we see that in many cases, 2PL is not necessary • We need to decide when it is necessary, and when it is not

  44. Isolation levels • T1: Calculate the mean • No need for S-locks • i.e. it can read data that is uncommitted • But in real life this won’t happen anyway • Because it reads values written by T0, which are always valid • It needs long X-locks • To ensure the values written are consistent • In other words, it must ensure all writes commit together (at the end)

  45. Isolation levels • T2: Read the mean to calculate • Needs S-locks • To avoid reading a mean that may not be valid • Because when T1 has finished writing the mean but not the variance, T1 still holds an X-lock on the mean, so T2 cannot obtain an S-lock on it • Thus T2 cannot read data that is uncommitted (dirty data)

  46. Isolation levels • T3: Read the mean and variance to calculate • Needs long S-locks • To ensure the mean and variance are both valid and consistent with one another • i.e. once T3 has read the mean but before it reads the variance, T3 holds an S-lock on the mean, so T1 cannot update the mean and variance and thus invalidate the value of the mean read by T3 • Thus no transaction can overwrite (dirty) data that T3 has read before T3 terminates • This implies that if a transaction reads the same item from the disk twice (without updating it), it is guaranteed to see the same value (repeatable read)

  47. Isolation levels • To formalize the notion, we introduce isolation levels (levels of consistency) • For each transaction T, four requirements: • T does not overwrite dirty data of other transactions (levels 0, 1, 2, 3) • T does not commit any writes before the end of the transaction (levels 1, 2, 3) • T does not read dirty data of other transactions (levels 2, 3) • Other transactions do not dirty any data read by T before T commits (level 3) • A transaction is said to be at a certain level if the corresponding requirements are satisfied.

  48. Isolation levels • To ensure the appropriate level of isolation, one needs to apply the appropriate locking protocol

  49. Isolation levels • A third way of describing isolation levels uses the ideas of schedules and dependencies: • Given two transactions T1 and T2, suppose T1 has an operation O1[X] and T2 follows with another operation O2[X]; we define: • T1 <3 T2 if at least one of O1, O2 is a write (RW, WR, WW conflicts) • T1 <2 T2 if O1 is a write (WR, WW conflicts) • T1 <1 T2 if both O1 and O2 are writes (WW conflicts) • Define graphs G1, G2, G3 similar to the serializability graph (but with <1, <2, <3 as edges respectively) • Theorem: a schedule observes isolation level 1 (2, 3) if the corresponding graph G1 (G2, G3) is acyclic
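The <1 / <2 / <3 definitions and the acyclicity test can be sketched directly: build the edge set for a given level from a schedule of (transaction, operation, item) steps, then check the graph for cycles. The schedule encoding and function names are assumptions for illustration.

```python
def edges(schedule, level):
    """Edges Ti -> Tj for each conflicting pair of ops (Ti's op first) at a level."""
    out = set()
    for i, (t1, o1, x1) in enumerate(schedule):
        for t2, o2, x2 in schedule[i + 1:]:
            if t1 == t2 or x1 != x2:
                continue
            if level == 3 and "W" in (o1, o2):            # <3: RW, WR, WW
                out.add((t1, t2))
            elif level == 2 and o1 == "W":                # <2: WR, WW
                out.add((t1, t2))
            elif level == 1 and o1 == "W" and o2 == "W":  # <1: WW only
                out.add((t1, t2))
    return out

def acyclic(edge_set):
    """Depth-first search cycle check over the dependency graph."""
    graph = {}
    for a, b in edge_set:
        graph.setdefault(a, set()).add(b)
    visited, stack = set(), set()
    def dfs(n):
        if n in stack:
            return False          # back edge: cycle found
        if n in visited:
            return True
        visited.add(n); stack.add(n)
        ok = all(dfs(m) for m in graph.get(n, ()))
        stack.discard(n)
        return ok
    return all(dfs(n) for n in list(graph))

# Classic non-serializable interleaving: T1 reads X, T2 writes X, T1 writes X
s = [("T1", "R", "X"), ("T2", "W", "X"), ("T1", "W", "X")]
g3_ok = acyclic(edges(s, 3))   # False: the RW and WW conflicts form a cycle T1 <-> T2
g1_ok = acyclic(edges(s, 1))   # True: only the single WW edge T2 -> T1
```

So this schedule satisfies level 1 but not level 3, matching the theorem's graded notion of isolation.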

  50. Isolation levels in SQL • SQL defines four isolation levels: READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE
