Cs 347 parallel and distributed data management notes05 concurrency control
This presentation is the property of its rightful owner.
Sponsored Links
1 / 51

CS 347: Parallel and Distributed Data Management Notes05: Concurrency Control PowerPoint PPT Presentation


  • 88 Views
  • Uploaded on
  • Presentation posted in: General

CS 347: Parallel and Distributed Data Management Notes05: Concurrency Control. Hector Garcia-Molina. Overview. Concurrency Control Schedules and Serializability Locking Timestamp Control Failure Recovery next set of notes. Schedule.

Download Presentation

CS 347: Parallel and Distributed Data Management Notes05: Concurrency Control

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Cs 347 parallel and distributed data management notes05 concurrency control

CS 347: Parallel and DistributedData ManagementNotes05: Concurrency Control

Hector Garcia-Molina

Notes05


Overview

Overview

  • Concurrency Control

    • Schedules and Serializability

    • Locking

    • Timestamp Control

  • Failure Recovery

    • next set of notes...

Notes05


Schedule

Schedule

  • Just like in a centralized system, a schedule represents how a set of transactions were executed

  • Schedules may be “good” or “bad”

    (preserve constraints)

Notes05


Example

Example

constraint: X=Y

X

Y

Node 1

Node 2

T1T2

1(T1) a  X 5 (T2) c  X

2(T1) X  a+100 6 (T2) X  2c

3(T1) b  Y 7 (T2) d  Y

4(T1) Y  b+100 8 (T2) Y  2d

Precedence relation

Notes05


Schedule s1

Schedule S1

Precedence: intra-transactioninter-transaction

(node X)(node Y)

1(T1) a  X

2(T1) X  a+100

5 (T2) c  X 3 (T1) b  Y

6 (T2) X  2c 4 (T1) Y  b+100

7 (T2) d  Y

8 (T2) Y  2d

If X=Y=0 initially, X=Y=200 at end (always good?)

Notes05


Definition of schedule

Definition of Schedule

Let T= {T1,T2,Tn} be a set of transactions.

A schedule S over T is a partial order with

ordering relation <S where:

(1) S = Ti

(2) <S <i

(3) for any two conflicting operations p,q S,

either p <S q or q <S p

N

i=1

N

i=1

Notes05


Example1

Example

(T1)r1[X]  W1[X]

(T2)r2[X]  W2[Y]  W2[X]

(T3)r3[X]  W3[X]  W3[Y]  W3[Z]

r2[X]  W2[Y]  W2[X]

S1:r3[Y]  W3[X]  W3[Y]  W3[Z]

r1[X]  W1[X]

Notes05


Definition of p s

Definition of P(S)

  • The precedence graph for schedule S, P(S), is a directed graph where

    • nodes: the transactions in S

    • edges: Ti Tj is an edge IFF p  Ti, q  Tj such thatp, q conflict and p <S q

Notes05


Example2

Example

r3[X]  w3[X]

S1: r1[X]  w1[X]  w1[Y]

r2[X]  w2[Y]

P(S1): T2 T1  T3

Notes05


Serializability theorem

Serializability Theorem

Theorem: A schedule S is serializable IFF P(S) is acyclic.

Notes05


Enforcing serializability

Enforcing Serializability

  • Locking

  • Timestamps

Notes05


Locking

Locking

  • Just like in a centralized system…

  • But with multiple lock managers

scheduler 1

scheduler 2

...

locks

for

D1

locks

for

D2

D1

D2

node 1

node 2

Notes05


Locking1

access &

lock data

access &

lock data

T

(release all locks at end)

Locking

  • Just like in a centralized system…

  • But with multiple lock managers

scheduler 1

scheduler 2

...

locks

for

D1

locks

for

D2

D1

D2

node 1

node 2

Notes05


Locking rules

Locking Rules

  • Well-formed transactions

  • Legal schedulers

  • Two-phase transactions

  • These rules guarantee serializable schedules

Notes05


What about

What about

  • Locking in a shared-memory architecture?

  • Locking in a shared-disks architecture?

Notes05


Locking with shared memory

Locking with Shared Memory

  • Where does lock table live?

  • How do we avoid race conditions?

P

P

P

...

M

Notes05


Locking with shared disks

P

P

P

M

Locking with Shared Disks

  • Where does lock table live?

  • How do we avoid race conditions?

...

M

M

Notes05


Shared disk locking cases to discuss

Shared Disk Locking, Cases to Discuss

  • Control of Data Partitioned;fixed partition function known by everyone

  • Dynamic partition of control

    • For each DB object i we need

      • LT(i): lock table entry for i (transaction that has lock, waiting transactions, lock mode, etc...)

      • W(i): at what processor is LT(i) currently?

    • Replicate W(i) at all processors (why?)

    • Need replicated data management scheme!

Notes05


Overview1

Overview

  • Concurrency Control

    • Schedules and Serializability

    • Locking

    • Timestamp Control

  • Failure Recovery

    • next set of notes...

next

Notes05


Timestamp ordering schedulers

Timestamp Ordering Schedulers

  • Basic idea:

    - assign timestamp as transaction begins

    - if ts(T1) < ts(T2) … < ts(Tn), then scheduler produces history equivalent to T1,T2,T3,T4, ... Tn

Notes05


To rule

TO Rule

If pi[x] and qj[x] are conflicting operations, then pi[x] is executed before qj[x]

(pi[x] <S qj[x])

IFF ts(Ti) < ts(Tj)

Notes05


Example a non serializable schedule s 2

Example: a non-serializable schedule S2

(Node X)(Node Y)

(T1) a  X (T2) d  Y

(T1) X  a+100 (T2) Y  2d

(T2) c  X (T1) b  Y

(T2) X  2c (T1) Y  b+100

Notes05


Example a non serializable schedule s 21

ts(T1) < ts(T2)

reject!

abort T1

Example: a non-serializable schedule S2

(Node X)(Node Y)

(T1) a  X (T2) d  Y

(T1) X  a+100 (T2) Y  2d

(T2) c  X (T1) b  Y

(T2) X  2c (T1) Y  b+100

Notes05


Example a non serializable schedule s 22

ts(T1) < ts(T2)

reject!

abort T1

abort T1

abort T2

abort T2

Example: a non-serializable schedule S2

(Node X)(Node Y)

(T1) a  X (T2) d  Y

(T1) X  a+100 (T2) Y  2d

(T2) c  X (T1) b  Y

(T2) X  2c (T1) Y  b+100

Notes05


Strict t o

Strict T.O.

  • Lock written items until it is certain that writing transaction has been successful (avoid cascading rollbacks)

Notes05


Example revisited

ts(T1) < ts(T2)

Example Revisited

(Node X)(Node Y)

(T1) a  X (T2) d  Y

(T1) X  a+100 (T2) Y  2d

(T2) c  X (T1) b  Y

reject!

delay

abort T1

Notes05


Example revisited1

abort T1

(T2) c  X

(T2) X  2c

ts(T1) < ts(T2)

Example Revisited

(Node X)(Node Y)

(T1) a  X (T2) d  Y

(T1) X  a+100 (T2) Y  2d

(T2) c  X (T1) b  Y

reject!

delay

abort T1

Notes05


Enforcing t o

Enforcing T.O.

  • For each data item X:

    MAX_R[X]: maximum timestamp of a transaction that read X

    MAX_W[X]: maximum timestamp of a transaction that wrote X

    rL[X]: # of transactions currently reading X (0,1,2,…)

    wL[X]: # of transactions currently writing X (0 or 1)

Notes05


T o scheduler part 1

T.O. Scheduler - Part 1

ri[X] arrives

IF ts(Ti) < MAX_W[X] THEN ABORT Ti

ELSE [ IF ts(Ti) > MAX_R[X] THEN MAX_R[X]  ts(Ti);

IF queue is empty AND wL[X] = 0 THEN

[rL[X]  rL[X]+1; START READ OF X]

ELSE add (r, Ti) to queue]

Notes05


T o scheduler part 2

T.O. Scheduler - Part 2

Wi[X] arrives

IF ts(Ti) < MAX_W[X] OR ts(Ti) < MAX_R[X] THEN ABORT Ti

ELSE [ MAX_W[X]  ts(Ti);

IF queue is empty AND wL[X]=0 AND rL[X]=0

THEN [wL[X]  1; WRITE X; WAIT FOR Ti TO FINISH ]

ELSE add (w, Ti) to queue ]

Notes05


T o scheduler part 3

T.O. Scheduler - Part 3

When o finishes (o is r or w) on X

oL[X]  oL[X] - 1; NDONE  TRUE

WHILE NDONE DO

[ let head of queue be (q, Tj); (smallest timestamp)

IF q=w AND rL[X]=0 AND wL[X]=0 THEN

[remove (q,Tj); wL[X]  1;

WRITE X AND WAIT FOR Tj TO FINISH ]

ELSE IF q=r AND wL[X]=0 THEN

[remove (q,Tj); rL[X]  rL[X] +1; START READ OF X]

ELSE NDONE  FALSE ]

Notes05


Note about the code

Note about the code

for reads:[rL[X]  rL[X]+1; START READ OF X]

for writes:[wL[X]  1; WRITE X; WAIT FOR Ti TO FINISH ]

Meaning: In Part 3, the end of a write is

only processed when all writes for its

transaction have completed.

Notes05


Cs 347 parallel and distributed data management notes05 concurrency control

 If a transaction is aborted, it must be retired with a new, larger timestamp

MAX_R[X]=10T ts(T)=8

MAX_W[X]=9

read X

.

.

.

X

.

.

.

Notes05


Cs 347 parallel and distributed data management notes05 concurrency control

ts(T)=11

 Starvation possible

 If a transaction is aborted, it must be retired with a new, larger timestamp

MAX_R[X]=10T ts(T)=8

MAX_W[X]=9

read X

.

.

.

X

.

.

.

Notes05


Thomas write rule

Thomas Write Rule

MAX_R[X]MAX_W[X]

ts(Ti)

Ti wants to write X

Notes05


Change in t o scheduler

Change in T.O. Scheduler:

MAX_R[X] MAX_W[X]

When Wi[X] arrives

IF ts(Ti)<MAX_R[X] THEN ABORT Ti

ELSE IF ts(Ti)<MAX_W[X] THEN

[IGNORE* THIS WRITE (tell Ti it was OK)]

ELSE [process write as before…

MAX_W[X]  ts(Ti);

IF queue is empty AND wL[X]=0 AND rL[X] =0 THEN

[wL[X]  1; WRITE X and WAIT FOR Ti TO FINISH]

ELSE add (W, Ti) to queue]

ts(Ti)

Ti wants to write X

* Ignore if transaction that wrote MAX_X[X] did not abort

Notes05


Question

Question

MAX_R[X]

MAX_W[X]

ts(Ti)

Ti wants to write

  • Why can’t we let Ti go ahead?

Notes05


Question1

Tj read

MAX_R[X] is only the latest read;

there could be a Tj read as shown...

Question

MAX_R[X]

MAX_W[X]

ts(Ti)

Ti wants to write

  • Why can’t we let Ti go ahead?

Notes05


Optimizations

Optimizations

  • Update MAX_R, MAX_W when action executed, not when action put on queue

    Example: MAX_W[X]=9 or 7?

X:

W, ts=9

W, ts=8

W, ts=7

active write

Notes05


Optimizations1

Optimizations

  • Use multiple versions of data

X:

Value written with ts=9

ri[x] ts(Ti)=8

Value written with ts=7

.

.

.

Notes05


2pl to

2PL  TO

T1: w1[Y]

T2: r2[X] r2[Y] w2[Z] ts(T1)<ts(T2)<ts(T3)

T3: w3[X]

S: r2[X] w3[X] w1 [Y] r2[Y] w2[Z]

 S could be produced with T.O. but not with 2PL

Notes05


Are all 2pl schedules t o

Are all 2PL schedules T.O.?

any

examples

here??

T.O. schedules

2PL schedules

previous example

Notes05


Theorem

Theorem

If S is a schedule representing an execution by a T.O. scheduler, then S is serializable

Proof:

1) say Ti Tj in P(S)

  conflicting pi[x], qj[x] in S,

such that pi[x] <S qj[x] Then by T.O. rule, ts(Ti)< ts(Tj)

Notes05


Proof continued

Proof - Continued

2) Say there is a cycle T1 T2 ...Tn T1 in P(S) then:ts(T1)<ts(T2) <ts(T3)…<ts(Tn)<ts(T1)

A contradiction!

3) So P(S) is acyclic

 S is serializable

Notes05


Timestamp management

Timestamp management

Item data MAX_R MAX_W

X1

X2- too much space!

.- more IO

.

.

Xn

.

.

.

.

.

.

.

.

.

Notes05


Timestamp cache

Timestamp cache

Item MAX_R MAX_W

X

tsMIN

Y

.

.

.

Z

  • If transaction reads or writes X, make entry in cache for X (add row if not in)

  • Periodically purge all items X withMAX_R[X] < tsMIN, MAX_W[X] < tsMINand remember tsMIN (choose tsMIN  current time - d)

Notes05


Timestamp cache1

Timestamp cache

  • To enforce T.O. rule for pi[X]

    • if X entry in cache, use MAX_R, MAX_W values in cache

    • else assume MAX_R[X]=tsMIN, MAX_W[X]=tsMIN

  • Use hashing (on items) for cache

    (same as lock table)

Notes05


Distributed t o scheduler

Distributed T.O. Scheduler

scheduler 1

scheduler 2

...

D1

ts

cache

D2

ts

cache

D1

D2

node 1

node 2

  • Each scheduler is “independent”

  • At end of transaction, signal all schedulers involved to release all wL[X] locks

Notes05


Distributed t o scheduler1

access data

access data

T

Distributed T.O. Scheduler

scheduler 1

scheduler 2

...

D1

ts

cache

D2

ts

cache

D1

D2

node 1

node 2

  • Each scheduler is “independent”

  • At end of transaction, signal all schedulers involved to release all wL[X] locks

Notes05


Summary

Summary

  • 2PL - the most popular - useful in a distributed system- deadlocks possible- several variations

  • T.O.- good for multiple versions

    - aborts more likely- no deadlocks- useful in a distributed sytem

Notes05


Cs 347 parallel and distributed data management notes05 concurrency control

  • Others concurrency control schemes

    e.g., Certifiers, serialization graph testing

    hard to not very practical

    implement inneed global data

    a distributedstructure

    system

Notes05


  • Login