Chapter 4: Transaction Management

- Title: Efficient Locking for Concurrent Operations on B-Trees
- Authors: Philip L. Lehman, S. Bing Yao
- Pages: 334-354

- Problem
- Problem Statement
- Why is this problem important?
- Why is this problem hard?

- Approaches
- Approach description, key concepts
- Contributions (novelty, improved)
- Assumptions

- Given
- Data on secondary storage devices
- Database index

- Find: Efficient Locking
- Locking mechanisms for search, insertion, and deletion

- Objectives
- The mechanisms remain correct under concurrent operations.

- Constraints
- Many processes are allowed to operate on the data simultaneously.
- Processes do not share primary memory.
- Disk page is the smallest unit of read and write.
- Locks should not prevent other processes from reading the locked page.

- B-tree or B*-tree is widely used as a data structure for storing large files of information on secondary storage devices.
- Most databases are manipulated concurrently by several processes.

- Locking the root may reduce concurrency.
- Operations depend on relationships between nodes
- parent – child
- An insert / split may propagate up many levels.
- A split / insert conflicts with concurrent reads and inserts.

- Concurrent operations on a B*-tree without synchronization can produce errors.

A, B, C: blocks of primary storage

x, y, z: variables in the primary storage
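The kind of error meant here can be replayed deterministically. The sketch below is an illustration only: the page names (`root`, `y`, `y2`), the key values, and the dictionary standing in for the disk are all assumptions, not taken from the slides. A searcher picks a child pointer, an inserter splits that child, and the searcher then reads the stale page and misses a key that is still stored.

```python
import copy

# "Disk": page id -> page contents (made-up values for illustration).
disk = {
    "root": {"keys": [], "children": ["y"]},
    "y": {"keys": [8, 10, 12, 15]},
}

def read(page_id):
    # Each process works on a private copy of a page, never on shared memory.
    return copy.deepcopy(disk[page_id])

# Step 1: the searcher reads the root and picks the child to follow for key 15.
searcher_root = read("root")
target = searcher_root["children"][0]  # pointer to page "y"

# Step 2: an inserter adds 9, which overflows y and splits it.
keys = sorted(read("y")["keys"] + [9])        # [8, 9, 10, 12, 15]
disk["y2"] = {"keys": keys[3:]}               # new right half: [12, 15]
disk["y"] = {"keys": keys[:3]}                # left half: [8, 9, 10]
disk["root"] = {"keys": [10], "children": ["y", "y2"]}

# Step 3: the searcher resumes with its stale pointer and misses key 15.
found = 15 in read(target)["keys"]
stored = any(15 in page["keys"] for pid, page in disk.items() if pid != "root")
print(found, stored)
```

Without a link from `y` to `y2`, the searcher has no way to recover once it holds the old pointer.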

- Related Work
- The naïve approach to the concurrent B-tree problem fails.
- Using semaphores locks the entire subtree affected by an update.
- B*-tree
- Locks are applied mostly in lower sections of tree.

- Contributions
- Uses a small (constant) # of locks at any time
- Locks only prevent multiple update access.

- Add a single ‘link’ pointer field to each node.
- The link provides an additional method for reaching a node.
- The two nodes produced by a split are joined by a link pointer and are functionally equivalent to a single node.
- The link pointer serves as a ‘temporary fix’ that allows correct concurrent operation.
- Additionally, the Blink-tree enables sequential search, i.e., retrieving all nodes on the same level (e.g., retrieving only the leaves).
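A minimal sketch of a node carrying the extra link field, and of the level-wise scan it enables, might look as follows. The class and field names are hypothetical, and the `high_key` field is an assumption not mentioned in these slides; it is the upper bound usually stored alongside the link so a process can tell when it must move right.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    keys: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)  # empty for a leaf
    link: Optional["Node"] = None    # right sibling on the same level
    high_key: Optional[int] = None   # largest key this node may hold; None = unbounded

def scan_level(leftmost: Node) -> List[int]:
    """Collect the keys of one level by following link pointers only."""
    out: List[int] = []
    node: Optional[Node] = leftmost
    while node is not None:
        out.extend(node.keys)
        node = node.link
    return out

# Three leaves chained left to right.
l3 = Node(keys=[5])
l2 = Node(keys=[3, 4], link=l3, high_key=4)
l1 = Node(keys=[1, 2], link=l2, high_key=2)
leaf_keys = scan_level(l1)
```

Here `scan_level(l1)` visits the leaves without ever touching a parent node.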

Reference: A. Guttman, ‘R-trees: A Dynamic Index Structure for Spatial Searching’, 1984

- Search
- If the current node has been split by a concurrent insertion, the search algorithm corrects its position by following the node’s link pointer to the newly created right sibling.
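The "move right" step can be sketched as below. This is an illustration under assumptions, not the paper's pseudocode: the names `Node`, `move_right`, and `search` are hypothetical, each node is assumed to carry a `high_key` bound, and an internal node is assumed to route keys `<= keys[i]` to `children[i]`. The search takes no locks.

```python
from bisect import bisect_left

class Node:
    def __init__(self, keys, children=None, link=None, high_key=None):
        self.keys = keys
        self.children = children or []   # empty list means leaf
        self.link = link                 # right sibling on the same level
        self.high_key = high_key         # None means no upper bound

def move_right(node, key):
    # A key above the node's bound must live in a sibling created by a split.
    while node.high_key is not None and key > node.high_key:
        node = node.link
    return node

def search(root, key):
    node = root
    while node.children:
        node = move_right(node, key)
        node = node.children[bisect_left(node.keys, key)]
    node = move_right(node, key)
    return key in node.keys

# Leaf a was split into a' and b', but the parent has not been updated yet.
b_new = Node([30, 40])
a_new = Node([10, 20], link=b_new, high_key=20)
stale_root = Node([], children=[a_new])
```

With the stale parent above, `search(stale_root, 30)` still reaches `b_new` through the link from `a_new`.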

- Insertion
- An insertion may cause a node to split (= unsafe).

- Lock a node before modification.

Example: Splitting node a into node a’ and b’

- The insertion algorithm holds at most a constant # of locks (three) per process at any time.
- The three-lock case arises when a split forces the inserter to follow link pointers along the level containing the father to find the correct insertion position; the three nodes are locked together only for the duration of one operation.
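A single-level sketch of a locked insertion with a split is given below. It is a simplification under stated assumptions: `CAPACITY`, `Node`, and `insert_into_leaf` are hypothetical names, `high_key` is an assumed per-node bound, and propagating the separator to the father (where the third lock can be needed) is left out. The point shown is the write order: the new right node is built first, then the old node is rewritten with a link to it.

```python
import threading

CAPACITY = 4  # hypothetical maximum number of keys per node

class Node:
    def __init__(self, keys, link=None, high_key=None):
        self.keys = keys
        self.link = link
        self.high_key = high_key
        self.lock = threading.Lock()

def insert_into_leaf(leaf, key):
    """Insert key; return (separator, new_right_node) on a split, else None."""
    leaf.lock.acquire()
    try:
        # Move right with lock coupling: take the sibling's lock, then drop ours.
        while leaf.high_key is not None and key > leaf.high_key:
            right = leaf.link
            right.lock.acquire()
            leaf.lock.release()
            leaf = right
        if key in leaf.keys:
            return None
        keys = sorted(leaf.keys + [key])
        if len(keys) <= CAPACITY:        # safe node: no split needed
            leaf.keys = keys
            return None
        mid = (len(keys) + 1) // 2
        # 1. Build the new right node; it inherits the old link and bound.
        new_right = Node(keys[mid:], link=leaf.link, high_key=leaf.high_key)
        # 2. Rewrite the old node as the left half, linked to the new node.
        leaf.keys, leaf.high_key, leaf.link = keys[:mid], keys[mid - 1], new_right
        return keys[mid - 1], new_right  # the caller would post this to the father
    finally:
        leaf.lock.release()

a = Node([10, 20, 30, 40])
separator, b = insert_into_leaf(a, 25)   # overflows a and splits it
later = insert_into_leaf(a, 35)          # starts at a, moves right to b
```

In this sketch the leaf lock is released on return; a full implementation would keep it until the father has been locked.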

- This type of locking occurs rarely in a Blink-tree
- Extremely small collision probability

Example: Splitting node a into node a’ and b’

- Correctness Proof
- Theorem 1: Deadlock Freedom. The system can’t produce deadlock.
- Impose an order: bottom to top / left to right
- Locks are placed by the inserter according to a well-ordering
- As long as inserters follow the well-ordering, an inserter never places a lock on a node below, or to the left of, a node it has already locked.
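The ordering argument can be checked mechanically on a lock trace. In the sketch below, which is an illustration and not part of the proof, nodes are identified by hypothetical `(level, position)` tuples with leaves at level 0, so Python's tuple comparison yields the bottom-to-top, left-to-right order. If every process acquires locks in strictly increasing order, no cycle of waiting processes can form.

```python
def check_lock_trace(trace):
    """Return the peak number of simultaneously held locks.

    Raises ValueError if a lock is requested on a node that is not strictly
    above or to the right of every node already held.
    """
    held = []
    peak = 0
    for action, node in trace:
        if action == "lock":
            if any(node <= h for h in held):
                raise ValueError(f"order violated when locking {node}")
            held.append(node)
            peak = max(peak, len(held))
        else:  # "unlock"
            held.remove(node)
    return peak

# Worst case for an inserter: split leaf, father, and the father's right sibling.
worst_case = [
    ("lock", (0, 2)), ("lock", (1, 0)), ("lock", (1, 1)),
    ("unlock", (1, 0)), ("unlock", (0, 2)), ("unlock", (1, 1)),
]
# A trace that locks a father and then one of the leaves below it.
bad = [("lock", (1, 0)), ("lock", (0, 2))]

peak = check_lock_trace(worst_case)
```

The `bad` trace is rejected because it requests a node below one that is already locked.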

- Theorem 2: All put operations correctly modify tree structure.
- Classify put operations into three types.
- Prove the correctness of the first case, then show that consecutive put operations are equivalent to a single change.

- Theorem 3: Interaction Theorem. Actions of an insertion process don’t impair correctness of actions of other processes.
- Classify three possible types of insertion.
- Apply lemma 3 to several aspects separately.

- Theorem 1: Deadlock Freedom. The system can’t produce deadlock.
- Livelock: one process runs indefinitely.
- extremely unlikely problem

- How can we resolve the erroneous behavior of B*-tree using Blink-tree?

A, B, C: blocks of primary storage

x, y, z: variables in the primary storage

- Can insert lead to deadlock? Livelock?
- Many nodes have 2 pointers pointing to them:
- One from the parent
- One from the left sibling
- Which one is created first?

- In figure (b), why was the right link created first?

Example: Splitting node a into node a’ and b’

- Paper’s focus
- Blink-tree – implementations and correctness

- Ideas
- Link provides an additional method to reach a node.
- The split two nodes work as a single node by the link.

- Contributions
- Locking scheme is simpler (no read-locks).
- A constant # of nodes are locked.

- Analytical Validation
- Correctness proofs

- Assumptions
- Many processes can operate on data simultaneously.
- A process is allowed to lock and unlock a disk page.

- Rewrite today
- Compare with newer methods
- T-tree

- Experimental evaluation - Simulation
- Measure lock efficiency
