Computer Science 425Distributed Systems

Lecture 20

Distributed Transactions

Chapter 14



Distributed Transactions

  • A transaction (flat or nested) that invokes operations in several servers.

[Figure: two example distributed transactions. Left, a nested distributed transaction: T has children T1 and T2; T1 has children T11 and T12; T2 has children T21 and T22, spread over several servers. Right, a flat distributed transaction: T invokes objects A, B, C, D held on servers X, Y, and Z.]

Nested Distributed Transaction / Flat Distributed Transaction



Coordination in Distributed Transactions

Each server has a special participant process. The coordinator process (leader) resides in one of the servers and talks to the client's transaction and to the participants.

[Figure: the coordination process. The client calls openTransaction at the coordinator and receives a TID; as the transaction invokes objects A, B, C, D on servers X, Y, and Z, each server's participant calls join(TID, ref) at the coordinator; the client finally calls closeTransaction or abortTransaction.]

Coordinator & Participants: The Coordination Process


Distributed Banking Transaction

T = openTransaction
      a.withdraw(4);
      c.deposit(4);
      b.withdraw(3);
      d.deposit(3);
    closeTransaction

[Figure: the client's transaction T spans three servers: BranchX holds account A, BranchY holds accounts B and C, and BranchZ holds account D. As T invokes a.withdraw(4), b.withdraw(3), c.deposit(4), and d.deposit(3), each branch's participant calls join at the coordinator.]

Note: the coordinator is in one of the servers, e.g., BranchX.



I. Atomic Commit Problem

  • Atomicity principle requires that either all the distributed operations of a transaction complete or all abort.

  • At some stage, client executes closeTransaction(). Now, atomicity requires that either all participants and the coordinator commit or all abort.

  • What problem statement is this?



Atomic Commit Protocols

  • This is the consensus problem, but consensus is impossible in asynchronous networks!

  • So, a real-life implementation needs to at least ensure the safety property.

  • In a one-phase commit protocol, the coordinator communicates either commit or abort to all participants, repeating until all acknowledge.

    • Doesn’t work when a participant crashes before receiving this message (partial transaction results are lost).

    • Does not allow participant to abort the transaction, e.g., under deadlock.

  • In a two-phase protocol

    • In the first phase, the coordinator collects a commit or abort vote from each participant (which stores its partial results in permanent storage before voting).

    • If all participants want to commit and no one has crashed, coordinator multicasts commit message

    • If any participant has crashed or aborted, coordinator multicasts abort message to all participants



Operations for Two-Phase Commit Protocol

  • canCommit?(trans)-> Yes / No

    • Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote.

  • doCommit(trans)

    • Call from coordinator to participant to tell participant to commit its part of a transaction.

  • doAbort(trans)

    • Call from coordinator to participant to tell participant to abort its part of a transaction.

  • haveCommitted(trans, participant)

    • Call from participant to coordinator to confirm that it has committed the transaction. (May not be required if getDecision() is used – see below)

  • getDecision(trans) -> Yes / No

    • Call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still had no reply after some delay. Used to recover from server crash or delayed messages.
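As a sketch (not the book's code), the voting and completion phases can be assembled from the operations above; the participant objects and their Python method names are hypothetical stand-ins for the remote calls canCommit?, doCommit, and doAbort:

```python
# Sketch of the coordinator's side of two-phase commit.

def run_two_phase_commit(trans, participants):
    # Phase 1 (voting): collect a Yes/No vote from every participant.
    votes = []
    for p in participants:
        try:
            votes.append(p.canCommit(trans))
        except TimeoutError:
            votes.append(False)   # a lost vote is treated as No (pessimistic)

    # Phase 2 (completion): commit only if every vote was Yes.
    if all(votes):
        for p in participants:
            p.doCommit(trans)
        return "committed"
    # Otherwise abort; only Yes-voters are still waiting for a decision.
    for p, voted_yes in zip(participants, votes):
        if voted_yes:
            p.doAbort(trans)
    return "aborted"
```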



The two-phase commit protocol

  • Phase 1 (voting phase):

    • 1. The coordinator sends a canCommit? request to each of the participants in the transaction.

    • 2. When a participant receives a canCommit? request it replies with its vote (Yes or No) to the coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the vote is No, the participant aborts immediately.

  • Phase 2 (completion according to outcome of vote):

    • 3. The coordinator collects the votes (including its own).

      • (a) If there are no failures and all the votes are Yes, the coordinator decides to commit the transaction and sends a doCommit request to each of the participants.

      • (b) Otherwise the coordinator decides to abort the transaction and sends doAbort requests to all participants that voted Yes.

  • 4. Participants that voted Yes are waiting for a doCommit or doAbort request from the coordinator. When a participant receives one of these messages it acts accordingly and in the case of commit, makes a haveCommitted call as confirmation to the coordinator.

(Recall that a server may crash.)


Communication in Two-Phase Commit Protocol

  • Step 1 (coordinator: prepared to commit, waiting for votes): sends canCommit? to the participants.
  • Step 2 (participant: prepared to commit, uncertain): replies Yes.
  • Step 3 (coordinator: committed): sends doCommit.
  • Step 4 (participant: committed): replies haveCommitted; the coordinator is then done.

  • To deal with server crashes

    • Each participant saves tentative updates into permanent storage, right before replying yes/no in first phase. Retrievable after crash recovery.

  • To deal with canCommit? loss

    • The participant may decide to abort unilaterally after a timeout (coordinator will eventually abort)

  • To deal with Yes/No loss, the coordinator aborts the transaction after a timeout (pessimistic!). It must announce doAbort to those who sent in their votes.

  • To deal with doCommit loss

    • The participant may wait for a timeout, send a getDecision request (retries until reply received) – cannot abort after having voted Yes but before receiving doCommit/doAbort!
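The last rule, that a Yes-voter is blocked until it learns the decision, can be sketched as follows (the coordinator object and its getDecision method name are illustrative stand-ins for the remote call):

```python
import time

# Sketch of the participant's uncertainty window: after voting Yes it
# cannot abort on its own, so it keeps calling getDecision until the
# coordinator answers.

def await_decision(coordinator, trans, retry_interval=1.0, max_retries=100):
    for _ in range(max_retries):
        try:
            return coordinator.getDecision(trans)   # "commit" or "abort"
        except TimeoutError:
            time.sleep(retry_interval)   # still uncertain: retry, never abort
    raise RuntimeError("coordinator unreachable; participant remains blocked")
```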


Two-Phase Commit (2PC) Protocol

[Figure: 2PC as two state machines. Coordinator: Execute; on CloseTrans(), send a request to each participant and wait for replies (timeout possible); if all reply YES, take the COMMIT decision and send COMMIT to each participant; on a timeout or any NO, take the ABORT decision and send ABORT to each participant. Participant: Execute; if not ready, abort and send NO to the coordinator; if ready, precommit, send YES to the coordinator, and wait for the decision; on COMMIT, commit and make the transaction visible; on ABORT, abort.]


II. Locks in Distributed Transactions

  • Each server is responsible for applying concurrency control to its objects.

  • Servers are collectively responsible for serial equivalence of operations.

  • Locks are held locally, and cannot be released until all servers involved in a transaction have committed or aborted.

  • Locks are retained during 2PC protocol

  • Since lock managers work independently, deadlocks are (very?) likely.



Distributed Deadlocks

  • The wait-for graph in a distributed set of transactions is held partially by each server

  • To find cycles in a distributed wait-for graph, one option is to use a central coordinator:

    • Each server reports updates of its wait-for graph

    • The coordinator constructs a global graph and checks for cycles

  • Centralized deadlock detection suffers from usual comm. overhead + bottleneck problems.

  • In edge chasing, servers collectively construct the global wait-for graph and detect deadlocks:

    • Servers forward “probe” messages to servers in the edges of wait-for graph, pushing the graph forward, until cycle is found.
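A minimal sketch of probe growth, assuming for illustration that the distributed wait-for edges are visible in a single dict (in the real protocol each server knows only its own edges and forwards the grown probe in a message):

```python
# Sketch of probe growth in edge chasing. A probe is a list of transaction
# IDs forming a path in the wait-for graph; `wait_for` maps each blocked
# transaction to the transaction it waits for.

def forward_probe(probe, wait_for):
    """Extend the probe along wait-for edges; return the cycle if one appears."""
    while True:
        nxt = wait_for.get(probe[-1])
        if nxt is None:
            return None              # the path ends: no deadlock along it
        if nxt in probe:
            return probe + [nxt]     # the probe bit its own tail: deadlock
        probe = probe + [nxt]        # push the wait-for graph forward
```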


Probes Transmitted to Detect Deadlock

[Figure: transactions U, V, and W are blocked on objects A, B, C held at servers X, Y, and Z, so that W waits for U, U waits for V, and V waits for W. Detection is initiated with the probe <W -> U>; the server where U is blocked extends it to <W -> U -> V>, and the next server extends it to <W -> U -> V -> W>, at which point the cycle means deadlock is detected.]



Edge Chasing

  • Initiation: When a server S1 notes that a transaction T starts waiting for another transaction U, where U is waiting to access an object at another server S2, it initiates detection by sending <TU> to S2.

  • Detection: Servers receive probes and decide whether deadlock has occurred and whether to forward the probes.

  • Resolution: When a cycle is detected, a transaction in the cycle is aborted to break the deadlock.

  • Phantom deadlocks = false detection of deadlocks that don't actually exist.

    • Edges may disappear, so the edges of a "detected" cycle may never all have been present in the system at the same time.



Transaction Priority

  • In order to ensure that only one transaction in a cycle is aborted, transactions are given priorities (e.g., inverse of timestamps) in such a way that all transactions are totally ordered.

  • When a deadlock cycle is found, the transaction with the lowest priority is aborted. Even if several different servers detect the same cycle, only one transaction aborts.

  • Transaction priorities can also be used to limit probes: send probe messages only toward lower-priority transactions, and initiate a probe only when a higher-priority transaction starts waiting for a lower-priority one.

    • Caveat: suppose edges were created in order 3->1, 1->2, 2->3. Deadlock never detected.

    • Fix: whenever an edge is created, tell everyone (broadcast) about this edge. May be inefficient.
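The two priority rules can be sketched as follows, assuming each transaction carries a numeric priority (larger = higher); both helper names are illustrative, not from the book:

```python
# Sketch of priority rules for edge chasing.

def should_initiate_probe(waiter_priority, holder_priority):
    # Initiate detection only when a higher-priority transaction starts
    # waiting for a lower-priority one (reduces probe traffic).
    return waiter_priority > holder_priority

def choose_victim(cycle, priority):
    # Abort the lowest-priority transaction in the detected cycle, so every
    # server that finds the same cycle picks the same single victim.
    return min(cycle, key=lambda t: priority[t])
```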



III. 2PC in Nested Transactions

  • Each (sub)transaction has a coordinator

    openSubTransaction(trans-id) -> subTrans-id

    getStatus(trans-id) -> (committed / aborted / provisional)

  • Each sub-transaction starts after its parent starts and finishes before its parent finishes.

  • When a sub-transaction finishes, it makes a decision to abort or provisionally commit. (Note that a provisional commit is not the same as being prepared – it is just a local decision and is not backed up on permanent storage.)

  • When the root subtrans completes, it does a 2PC with its subtrans to decide to commit or abort.


Example: 2PC in Nested Transactions

[Figure: bottom-up decision in 2PC for the nested distributed transaction T (children T1 and T2; T1 with children T11 and T12; T2 with children T21 and T22, spread over several servers). Each sub-transaction finishes with a provisional commit (voting Yes) or an abort (voting No), and the decisions propagate bottom-up to the top-level transaction T.]

An Example of Nested Transaction

  • T1: provisional commit (at X)
    • T11: abort (at M)
    • T12: provisional commit (at N)
  • T2: aborted (at Y)
    • T21: provisional commit (at N)
    • T22: provisional commit (at P)



Information Held by Coordinators of Nested Transactions

  Coordinator of transaction | Child sub-transactions | Participant          | Provisional commit list | Abort list
  T                          | T1, T2                 | yes                  | T1, T12                 | T11, T2
  T1                         | T11, T12               | yes                  | T1, T12                 | T11
  T2                         | T21, T22               | no (aborted)         |                         | T2
  T11                        |                        | no (aborted)         |                         | T11
  T12, T21                   |                        | T12 but not T21      | T21, T12                |
  T22                        |                        | no (parent aborted)  |                         | T22

  • When each sub-transaction was created, it joined its parent sub-transaction.

  • The coordinator of each parent sub-transaction has a list of its child sub-transactions.

  • When a nested transaction provisionally commits, it reports its status and the status of its descendants to its parent.

  • When a nested transaction aborts, it reports abort without giving any information about its descendants.

  • The top-level transaction receives a list of all sub-transactions, together with their status.



Hierarchic Two-Phase Commit Protocol forNested Transactions

  • canCommit?(trans, subTrans) -> Yes / No

    • Call from a coordinator to the coordinator of a child sub-transaction, asking whether it can commit subtransaction subTrans. The first argument trans is the transaction identifier of the top-level transaction. The participant replies with its vote Yes / No.

  • The coordinator of the top-level sub-transaction sends canCommit? to the coordinators of its immediate child sub-transactions. The latter, in turn, pass them onto the coordinators of their child sub-transactions.

  • Each participant collects the replies from its descendants before replying to its parent.

  • T sends canCommit? messages to T1 (but not T2 which has aborted); T1 sends CanCommit? messages to T12 (but not T11).

  • If a coord. finds no subtrans. matching 2nd param, then it must have crashed, so it replies NO

  • Wasteful if coordinators are shared among sub-transactions; may also take a long time, since canCommit? messages travel down the tree level by level.
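The top-down flow of canCommit? can be sketched as a recursion over the sub-transaction tree; `children`, `local_vote`, and the function name are illustrative, and crashed coordinators (which would reply No) are not modeled here:

```python
# Sketch of hierarchic canCommit?: each coordinator ANDs its own vote with
# its children's answers before replying to its parent.

def hierarchic_can_commit(trans, node, children, local_vote):
    # `trans` mirrors the top-level TID in canCommit?(trans, subTrans).
    if not local_vote[node]:
        return False                      # this sub-transaction votes No
    return all(                           # collect replies from descendants
        hierarchic_can_commit(trans, child, children, local_vote)
        for child in children.get(node, [])
    )
```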



Flat Two-Phase Commit Protocol for Nested Transactions

  • canCommit?(trans, abortList) -> Yes / No

    • Call from coordinator to participant to ask whether it can commit a transaction. Participant replies with its vote Yes / No.

  • The coordinator of the top-level sub-transaction sends canCommit? messages to the coordinators of all sub-transactions in the provisional commit list (e.g., T1 and T12).

  • If the participant has any provisionally committed sub-transactions that are descendants of the transaction with TID trans:

    • Check that they do not have any aborted ancestors in the abortList. Then prepare to commit.

    • Those with aborted ancestors are aborted.

    • Send a Yes vote to the coordinator, listing the sub-transactions that can commit

  • If the participant does not have a provisionally committed descendant, it must have failed after it performed a provisional commit. Send a NO vote to the coordinator.
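The participant's ancestor check can be sketched as follows (the data structures and names are illustrative):

```python
# Sketch of the participant's check in the flat protocol: a provisionally
# committed sub-transaction may prepare to commit only if none of its
# ancestors appears in the abortList.

def partition_subtransactions(provisional, ancestors, abort_list):
    """Split provisionally committed sub-transactions into those that can
    commit and those doomed by an aborted ancestor."""
    commit, abort = [], []
    for sub in provisional:
        if any(a in abort_list for a in ancestors[sub]):
            abort.append(sub)    # an aborted ancestor dooms the child
        else:
            commit.append(sub)   # safe: prepare to commit this one
    return commit, abort
```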



IV. Transaction Recovery

  • Recovery is concerned with:

    • Object (data) durability: saved permanently

    • Failure Atomicity: effects are atomic even when servers crash

  • Recovery Manager’s tasks

    • To save objects (data) on permanent storage for committed transactions.

    • To restore server’s objects after a crash

    • To maintain and reorganize a recovery file for an efficient recovery procedure

    • Garbage collection in recovery file



The Recovery File

[Figure: layout of the recovery file. It holds transaction entries, each with a transaction status (e.g., T1: prepared, T2: prepared, T1: committed) and an intention list of <object name, reference> pairs, plus object entries holding the names and values those references point to.]


Example: Recovery File

  • "Logging" = appending
  • Entries are written at prepared-to-commit points and at commit points
  • Writes at commit points are forced (flushed to permanent storage)

Transaction T1:                      Transaction T2:
  balance = b.getBalance()             balance = b.getBalance()
  b.setBalance(balance*1.1)            b.setBalance(balance*1.1)
  a.withdraw(balance*0.1)              c.withdraw(balance*0.1)

Initial values: a: 100, b: 200, c: 300. After T1: a: 80, b: 220. After T2: b: 242, c: 278.

Recovery-file entries (p0 through p7 are positions in the file):

  T1: prepared, intention list <A, p1> <B, p2>   (entry at p0)
  T1: committed                                  (entry at p3)
  T2: prepared, intention list <B, p4> <C, p6>   (entry at p5)



Using the Recovery File after a Crash

  • When a server recovers, it sets default initial values for objects and then hands over to recovery manager.

  • Recovery manager should apply only those updates that are for committed transactions. Prepared-to-commit entries are not used.

  • Recovery manager has two options:

    • Read the recovery file forward and update object values

    • Read the recovery file backwards and update object values

      • Advantage: each object updated exactly once (hopefully)

  • Server may crash during recovery

    • Recovery operations need to be idempotent
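A sketch of the backward pass, with an assumed (not the book's) log-entry format; re-running it after a crash mid-recovery produces the same state, i.e., it is idempotent:

```python
# Sketch of a backward pass over the recovery file. Each log entry is a
# dict with "kind" = "status" or "update" (an assumed format). Only
# committed transactions' updates are applied, newest first, and each
# object is written at most once.

def recover(log):
    committed = {e["trans"] for e in log
                 if e["kind"] == "status" and e["status"] == "committed"}
    state = {}
    for entry in reversed(log):           # read the file backwards
        if entry["kind"] != "update" or entry["trans"] not in committed:
            continue                      # skip uncommitted (prepared) work
        for name, value in entry["objects"].items():
            state.setdefault(name, value) # newest committed write wins
    return state
```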



The Recovery File for 2PC

[Figure: the recovery file extended for 2PC. Besides the transaction entries (status plus intention list of <object, reference> pairs) and object entries shown earlier, it holds coordination entries: "Coordinator: T1, participant list, status" and "Coordinator: T2, participant list, status" for transactions this server coordinates, and "Participant: T1, coordinator, status" for transactions in which it is a participant.]



Summary

  • Distributed Transactions

    • More than one server process (each managing a different set of objects)

    • One server process marked out as coordinator

    • Atomic Commit: 2PC

    • Deadlock detection: Edge chasing

    • 2PC commit for nested trans.

    • Transaction Recovery: Recovery file

  • Reading for this lecture was: Chapter 14

  • Material not Covered from Chapter 14:

    • Shadow versions format of recovery file

    • Reorganization of recovery file

  • Reading for next lecture: Two URL links on website (exciting topic!)

  • MP2 released – due Nov 11.



Optional Slides


A Different Kind of Edge Chasing

[Figure: instead of single probes, servers forward and store entire wait-for paths as local wait-for graphs. First, X records U -> V and forwards it; Y, which holds T -> U, extends it to T -> U -> V and forwards that. In the second scenario the local graphs are X: U -> V, Y: T -> U, Z: V -> T; when Z combines its edge V -> T with the forwarded path T -> U -> V, it obtains T -> U -> V -> T and the deadlock is detected.]

