
Chapter 4: Transaction Management

Chapter 4: Transaction Management. Title: Efficient Locking for Concurrent Operations on B-Trees Authors: Philip L. Lehman, S. Bing Yao Pages: 334-354. Efficient Locking for Concurrent Operations on B-Trees. Problem Problem Statement Why is this problem important?


Presentation Transcript


  1. Chapter 4: Transaction Management • Title: Efficient Locking for Concurrent Operations on B-Trees • Authors: Philip L. Lehman, S. Bing Yao • Pages: 334-354

  2. Efficient Locking for Concurrent Operations on B-Trees • Problem • Problem Statement • Why is this problem important? • Why is this problem hard? • Approaches • Approach description, key concepts • Contributions (novelty, improved) • Assumptions

  3. Problem Statement • Given • Data on secondary storage devices • Database index • Find: Efficient locking • Locking mechanisms for search, insertion, and deletion • Objectives • The mechanisms remain correct under concurrent operations • Constraints • Many processes are allowed to operate on the data simultaneously. • Processes do not share primary memory. • A disk page is the smallest unit of read and write. • Locks should not prevent other processes from reading the locked page.

  4. Why is this problem important? • The B-tree and its variant, the B*-tree, are widely used as data structures for storing large files of information on secondary storage devices. • Most databases are manipulated concurrently by several processes.

  5. Why is this problem hard? • Locking the root may reduce concurrency. • Dependencies between nodes • parent-child • An insert / split may propagate up many levels • A split / insert conflicts with concurrent reads and inserts • Naive concurrent operation on a B*-tree can produce erroneous results. (Figure legend: A, B, C: blocks of primary storage; x, y, z: variables in primary storage)

  6. Novelty of Contribution • Related Work • The naive approach to the concurrent B-tree problem fails. • Using semaphores locks the entire subtree affected by an update. • B*-tree • Locks are applied mostly in the lower sections of the tree. • Contributions • Uses a small (constant) number of locks at any time • Locks only prevent concurrent updates; they do not block readers.

  7. Principles of Blink-tree • The Blink-tree (B-link tree) adds a single ‘link’ pointer field to each node. • The link provides an additional method for reaching a node. • The two nodes produced by a split are joined by a link pointer and are functionally equivalent to a single node. • The link pointer serves as a ‘temporary fix’ that allows correct concurrent operation. • Additionally, the Blink-tree enables serial search, i.e., retrieving nodes in the same level (e.g., retrieving only leaves). Reference: A. Guttman, ‘R-trees: a dynamic index structure for spatial searching’, 1984
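The link-pointer idea on this slide can be illustrated with a small sketch. This is not the paper's code: the `Node` class, the `high_key` field name, and the `split_leaf` and `scan_level` helpers are hypothetical names chosen for illustration, and the sketch ignores disk pages and locking entirely. It only shows that after a split the old node and its new right sibling stay chained, so a level can still be read left to right.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A leaf-level node with the extra link pointer (hypothetical layout)."""
    keys: List[int] = field(default_factory=list)
    link: Optional["Node"] = None    # right sibling on the same level
    high_key: Optional[int] = None   # largest key this node may hold; None = unbounded


def split_leaf(node: Node) -> Node:
    """Split `node` in place and return the new right sibling.

    Assumes the node holds at least two keys.
    """
    mid = len(node.keys) // 2
    # Build the new node completely first, including its own link, so the
    # chain to the right is never broken.
    right = Node(keys=node.keys[mid:], link=node.link, high_key=node.high_key)
    # Only then shrink the old node and point it at the new sibling.
    node.keys = node.keys[:mid]
    node.high_key = node.keys[-1]
    node.link = right
    return right


def scan_level(node: Optional[Node]) -> List[int]:
    """Serial search along one level by following link pointers."""
    out: List[int] = []
    while node is not None:
        out.extend(node.keys)
        node = node.link
    return out
```

For example, splitting a node holding `[1, 2, 3, 4]` leaves `[1, 2]` in the old node and `[3, 4]` in the new one, while `scan_level` on the old node still yields all four keys in order.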

  8. Example of Blink-tree

  9. Search, Insertion Algorithms • Search • If the node being read has been split by a concurrent insertion, the search recovers by following the link pointer to the newly split-off node. • Insertion • An insertion may cause a node to split (i.e., the node is unsafe). • Lock a node before modifying it. (Figure: splitting node a into nodes a’ and b’)
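The search rule above can be sketched as follows, assuming each node stores a high key so a reader can tell that the key it wants has moved to the right. The names `Node`, `move_right`, and `search` are hypothetical, and the in-memory tree stands in for disk pages; this is a minimal illustration under those assumptions, not the paper's procedure verbatim.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    keys: List[int]
    children: List["Node"] = field(default_factory=list)  # empty for leaves
    high_key: Optional[int] = None   # None means "no upper bound"
    link: Optional["Node"] = None    # right sibling on the same level


def move_right(node: Node, key: int) -> Node:
    """Follow link pointers while the key lies beyond this node's high key."""
    while (node.high_key is not None and key > node.high_key
           and node.link is not None):
        node = node.link
    return node


def search(root: Node, key: int) -> bool:
    """Descend without taking any locks, correcting for splits via links."""
    node = move_right(root, key)
    while node.children:
        # Child i covers keys <= node.keys[i]; the last child covers the rest.
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        node = move_right(node.children[i], key)
    return key in node.keys
```

In the usage below, leaf `a` has already been split into `a` and `b`, but the parent has not yet been given a pointer to `b`. The search for 40 is routed to `a` by the stale parent and still succeeds by following the link:

```python
c = Node(keys=[60, 70])
b = Node(keys=[30, 40], high_key=50, link=c)
a = Node(keys=[10, 20], high_key=20, link=b)
root = Node(keys=[50], children=[a, c])   # parent does not know about b yet
found = search(root, 40)
```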

  10. Locking Efficiency • The insertion algorithm uses at most a constant number of locks (three) for any process at any time. • Split → chaining across the level of nodes containing the father to find the correct insertion position → three nodes are locked for the duration of one operation. • This type of locking occurs rarely in a Blink-tree • Extremely small collision probability (Figure: splitting node a into nodes a’ and b’)

  11. Validation Methodology • Correctness Proof • Theorem 1: Deadlock Freedom. The system cannot produce deadlock. • Impose an order: bottom to top / left to right • Locks are placed by the inserter according to a well-ordering • As long as an inserter follows the well-ordering, it never places a lock on any node below a locked node, nor on any node to its left. • Theorem 2: All put operations correctly modify the tree structure. • Classify put operations into three types. • Prove the correctness of the first case and show that consecutive put operations are equivalent to one change. • Theorem 3: Interaction Theorem. Actions of an insertion process do not impair the correctness of actions of other processes. • Classify three possible types of insertion. • Apply Lemma 3 to several aspects separately. • Livelock: one process runs indefinitely. • Extremely unlikely in practice
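The well-ordering behind Theorem 1 can be made concrete with a small checker. This is an illustrative sketch, not part of the paper's proof: nodes are identified by a hypothetical pair (height above the leaf level, left-to-right position), and `respects_lock_order` is an assumed helper name. If every inserter requests locks in strictly increasing order under one shared total order, no cycle of waiting processes can form.

```python
from typing import List, Tuple

# Hypothetical node identifier: (height above leaf level, position from the left).
# Tuple comparison gives the order "bottom to top, then left to right".
NodeId = Tuple[int, int]


def respects_lock_order(acquisitions: List[NodeId]) -> bool:
    """True if each lock request is on a node strictly later in the
    well-ordering than the previous request, i.e. never below and never
    to the left on the same level."""
    return all(a < b for a, b in zip(acquisitions, acquisitions[1:]))
```

A sequence such as leaf 3, leaf 4 (moving right), then node 1 on the level above passes the check, while locking a leaf after its parent, or a left sibling after a right one, does not.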

  12. Class Exercise 1/2 • How can we resolve the erroneous behavior of the B*-tree using the Blink-tree? (Figure legend: A, B, C: blocks of primary storage; x, y, z: variables in primary storage)

  13. Class Exercise 2/2 • Can an insert lead to deadlock? Livelock? • Many nodes have two pointers pointing to them: • One from the parent • One from the left sibling • Which one is created first? • In figure (b), why was the right link created first? (Figure: splitting node a into nodes a’ and b’)

  14. Summary • Paper’s focus • Blink-tree: implementation and correctness • Ideas • The link provides an additional method to reach a node. • The two nodes produced by a split work as a single node through the link. • Contributions • The locking scheme is simpler (no read-locks). • Only a constant number of nodes is locked. • Analytical Validation • Correctness proofs

  15. Assumptions, Rewrite today • Assumptions • Many processes can operate on data simultaneously. • A process is allowed to lock and unlock a disk page. • Rewrite today • Compare with newer methods • T-tree • Experimental evaluation - Simulation • Measure lock efficiency
