Secondary Storage Management

13.1 The Memory Hierarchy


The Memory Hierarchy

  • Computer systems have several different components in which data may be stored.

  • Data capacities and access speeds range over at least seven orders of magnitude.

  • Devices with the smallest capacity also offer the fastest access speeds.


Description of Levels

  • 1. Cache

    • A megabyte or more of cache storage.

    • On-board cache: on the same chip as the processor.

    • Level-2 cache: on a separate chip.

    • Cache data is accessed in a few nanoseconds.

    • Data is moved from main memory to the cache when needed by the processor.

    • Volatile


Description of Levels

2. Main Memory

  • 1 GB or more of main memory.

  • Instruction execution and data manipulation involve information resident in main memory.

  • Time to move data from main memory to the processor or cache is in the 10-100 nanosecond range.

  • Volatile

3. Secondary Storage

  • Typically a magnetic disk.

  • Capacity up to 1 TB.

  • One machine can have several disk units.

  • Time to transfer a single byte between disk & main memory is around 10 milliseconds.


Description of Levels

4. Tertiary Storage

  • Holds data volumes measured in terabytes.

  • Significantly higher read/write times.

  • Lower cost per byte.

  • Retrieval takes seconds or minutes, but capacities in the petabyte range are possible.


Transfer of Data Between Levels

  • Data moves between adjacent levels of the hierarchy.

  • Each level is organized to transfer large amounts of data to or from the level below.

  • A key technique for speeding up database operations is to arrange data so that when one piece of a disk block is needed, it is likely that other data on the same block will be needed at about the same time.


Volatile & Non Volatile Storage

  • A volatile device “forgets” what is stored in it when the power goes off.

  • Example: Main Memory

  • A nonvolatile device, on the other hand, is expected to keep its contents intact even for long periods when the device is turned off or there is a power failure.

  • Example: Secondary & Tertiary Storage

    Note: No change to the database can be considered final until it has migrated to nonvolatile, secondary storage.


Virtual Memory

  • Managed by Operating System.

  • Some memory in main memory & rest on disk.

  • Transfer between the two is in units of disk blocks (pages).

  • Not a level of the memory hierarchy.



Secondary Storage Management

Section 13.2

CS-257 Database System Principles

Avinash Anantharamu (102)

008629907


Index

13.2 Disks

13.2.1 Mechanics of Disks

13.2.2 The Disk Controller

13.2.3 Disk Access Characteristics



Mechanics of Disks

A hard drive has two principal moving pieces:

1. Head Assembly

2. Disk Assembly

The disk assembly has one or more circular platters that rotate around a central spindle.

The platters are covered with a thin layer of magnetic material.



Mechanics of Disks

  • Tracks are concentric circles on a platter.

  • Tracks are organized into sectors, which are segments of the circle of a platter.

  • Sectors are indivisible as far as errors are concerned.

  • Blocks are logical data transfer units.


Disk Controller

The disk controller:

Controls the actuator to move the head assembly

Selects the surface from which to read or write

Transfers bits from the desired sector to main memory

Possibly buffers an entire track or more in the local memory of the disk controller, hoping that many sectors of this track will be read soon, so that additional accesses to the disk can be avoided.



Disk Access Characteristics

Seek time

Rotational latency

Transfer time

Latency of the disk: the sum of the seek time, rotational latency, and transfer time (a small illustration follows).
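A minimal sketch of how these pieces combine; the drive parameters below are assumed, illustrative numbers, not figures from the slides.

def disk_latency_ms(seek_ms, rpm, bytes_to_read, transfer_rate_mb_s):
    # Average rotational latency is half a revolution.
    rotational_ms = 0.5 * (60_000 / rpm)
    # Time to move the requested bytes once the head is in place.
    transfer_ms = bytes_to_read / (transfer_rate_mb_s * 1_000_000) * 1000
    return seek_ms + rotational_ms + transfer_ms

# Assumed values: 9 ms average seek, 7200 RPM, one 16 KB block, 100 MB/s.
print(round(disk_latency_ms(9, 7200, 16_384, 100), 2))  # about 13.33 ms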



Secondary Storage Management

13.3 Accelerating Access to Secondary Storage

San Jose State University

Spring 2012


13.3 Accelerating Access to Secondary Storage: Section Overview

  • 13.3.1: The I/O Model of Computation

  • 13.3.2: Organizing Data by Cylinders

  • 13.3.3: Using Multiple Disks

  • 13.3.4: Mirroring Disks

  • 13.3.5: Disk Scheduling and the Elevator Algorithm

  • 13.3.6: Prefetching and Large-Scale Buffering


13.3 Introduction

  • Average block access is ~10ms.

  • Disks may be busy.

  • If requests arrive faster than the disk can serve them, scheduling delays can grow without bound.

  • There are various strategies to increase disk throughput.

  • The “I/O Model” is the correct model for determining the speed of database operations.


13.3 Introduction (Contd.)

  • Actions that improve database access speed:

    • Place blocks closer, within the same cylinder

    • Increase the number of disks

    • Mirror disks

    • Use an improved disk-scheduling algorithm

    • Use prefetching


13.3.1 The I/O Model of Computation

  • If we have a computer running a DBMS that:

    • Is trying to serve a number of users

    • Has 1 processor, 1 disk controller, and 1 disk

    • Each user is accessing different parts of the DB

  • It can be assumed that:

    • Time required for disk access is much larger than access to main memory; and as a result:

    • The number of block accesses is a good approximation of time required by a DB algorithm


13.3.2 Organizing Data by Cylinders

  • It is more efficient to store data that might be accessed together in the same or adjacent cylinder(s).

  • In a relational database, related data should be stored in the same cylinder.


13.3.3 Using Multiple Disks

  • If the disk controller supports the addition of multiple disks and has efficient scheduling, using multiple disks can improve performance significantly

  • By striping a relation across multiple disks, each chunk of data can be retrieved in a parallel fashion, improving performance by up to a factor of n, where n is the total number of disks the data is striped over
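A minimal sketch of the striping idea (illustrative code, not from the slides): block i of the relation is placed on disk i mod n, so n consecutive blocks can be retrieved in parallel.

def disk_for_block(block_number, num_disks):
    # Round-robin striping: consecutive blocks land on different disks.
    return block_number % num_disks

for b in range(8):
    print(f"block {b} -> disk {disk_for_block(b, 4)}")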


13.3.4 Mirroring Disks

  • A drawback of striping data across multiple disks is that you increase your chances of disk failure.

  • To mitigate this risk, some DBMS use a disk mirroring configuration

  • Disk mirroring makes each disk a copy of the other disks, so that if any disk fails, the data is not lost

  • Since all the data is in multiple places, the access speedup for reads can be a factor of n or more, since the disk whose head is closest to the requested block can be chosen.

  • Unfortunately, the writing of disk blocks does not speed up at all. The reason is that the new block must be written to each of the n disks.



13.3.5 Disk Scheduling

  • One way to improve disk throughput is to improve disk scheduling, prioritizing requests such that they are more efficient

    • The elevator algorithm is a simple yet effective disk scheduling algorithm

    • The algorithm makes the heads of a disk oscillate back and forth similar to how an elevator goes up and down

    • The access requests closest to the head's current position are processed first


13.3.5 Disk Scheduling

  • When sweeping outward, the direction of head movement changes only after the largest cylinder request has been processed

  • When sweeping inward, the direction of head movement changes only after the smallest cylinder request has been processed

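A hedged sketch of the elevator algorithm (illustrative code, not the book's): the head serves the closest pending cylinder request in its current direction, and reverses only when nothing remains ahead.

def elevator(head, requests):
    pending = sorted(requests)       # pending cylinder numbers
    order, direction = [], +1        # start sweeping toward larger cylinders
    while pending:
        ahead = [c for c in pending if (c - head) * direction >= 0]
        if not ahead:                # nothing left in this direction: reverse
            direction = -direction
            continue
        head = min(ahead, key=lambda c: abs(c - head))
        order.append(head)
        pending.remove(head)
    return order

print(elevator(50, [10, 60, 35, 90, 20]))  # [60, 90, 35, 20, 10]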


13.3.6 Prefetching and Large-Scale Buffering

  • In some cases we can anticipate what data will be needed

  • We can take advantage of this by prefetching data from the disk before the DBMS requests it

  • Since the data is already in memory, the DBMS receives it instantly



Disk Failures

Presented by Timothy Chen

Spring 2013


Index

13.4 Disk Failures

13.4.1 Intermittent Failures

13.4.2 Checksums

13.4.3 Stable Storage

13.4.4 Error-Handling Capabilities of Stable Storage

13.4.5 Recovery from Disk Crashes

13.4.6 Mirroring as a Redundancy Technique

13.4.7 Parity Blocks

13.4.8 An Improvement: RAID 5

13.4.9 Coping with Multiple Disk Crashes


Intermittent Failures

An intermittent failure occurs if we try to read a sector, but the correct content of that sector is not delivered to the disk controller.

If the controller can tell whether a sector read is good or bad (e.g., via a checksum), it can reissue the read request until the correct content is delivered.

A write can likewise be checked: after writing, the sector is read back to verify that the write was correct.

The good or bad status of a sector is determined by a read operation.


Checksum

A checksum lets a read operation determine the good or bad status of a sector.

If, on reading, we find that the checksum is not proper for the data bits, then we know there is an error in reading.

If the checksum is proper, there is still a small chance that the block was not read correctly, but by using many checksum bits we can make the probability of missing a bad read arbitrarily small.


How Checksums Work

Each sector holds some additional bits, which are set depending on the values of the data bits stored in that sector.

If the data bits do not match the checksum bits on reading, we know there is an error in reading.

A bit sequence with an odd number of 1's has odd parity (e.g., 01101000).

A bit sequence with an even number of 1's has even parity (e.g., 111011100).

A single parity bit detects any error that flips an odd number of bits, since such an error changes the parity of the sector; see the sketch below.
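A minimal sketch, assuming one parity bit per sector, of how a parity checksum catches a bad read:

def parity_bit(data_bits):
    # The parity bit makes the total number of 1's even.
    return "1" if data_bits.count("1") % 2 == 1 else "0"

def read_ok(data_bits, stored_parity):
    return parity_bit(data_bits) == stored_parity

sector = "01101000"                  # three 1's: odd parity
p = parity_bit(sector)               # "1"
print(read_ok(sector, p))            # True: checksum matches
print(read_ok("01101001", p))        # False: a single flipped bit is caught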


Stable Storage

Stable storage deals with disk errors that checksums can detect but not correct.

Sectors are paired; each pair represents one sector-content X, with a left copy XL and a right copy XR.

To read X, we check the parity of the copies: we read XL, and if it is bad we read XR instead (substituting spare sectors for bad ones as needed), until a good value is returned.


Error-Handling Capabilities of Stable Storage

Since we have both XL and XR, if one of them fails we can still read the other.

The chance that both copies fail is very small.

A write failure typically happens during a power outage; because the two copies are written one after the other, at least one of them remains intact.


Recovery from Disk Crashes

The most serious mode of failure for disks is the "head crash," in which data is permanently destroyed.

To recover from a crash, we use one of the RAID schemes.


Mirroring as a Redundancy Technique

This scheme is called RAID 1.

We simply mirror each disk.



Parity Block

This scheme is often called RAID 4.

One redundant disk holds, in each bit position, the modulo-2 sum of the corresponding bits of the data disks.

disk 1: 11110000

disk 2: 10101010

disk 3: 00111000

The redundant disk 4 gets, in each position, 0 if the column has an even number of 1's and 1 if it has an odd number:

disk 4: 01100010



Parity Block: Failure Recovery

The parity scheme can recover from the failure of only one disk.

If two or more disks fail at once, the data cannot be recovered using the modulo-2 sum alone; see the sketch below.
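A hedged sketch of the parity computation and single-disk recovery, using the example blocks from the slides above:

def xor_blocks(blocks):
    # Bitwise modulo-2 sum (XOR) of equal-length bit strings.
    result = [0] * len(blocks[0])
    for block in blocks:
        result = [r ^ int(bit) for r, bit in zip(result, block)]
    return "".join(map(str, result))

d1, d2, d3 = "11110000", "10101010", "00111000"
parity = xor_blocks([d1, d2, d3])
print(parity)                        # 01100010, as on the slide

# If disk 2 crashes, XOR the surviving disks with the parity disk.
print(xor_blocks([d1, d3, parity]))  # 10101010, the lost contents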



Coping with Multiple Disk Crashes

When more than one disk may fail at once, neither RAID 4 nor RAID 5 works.

For this case we need RAID 6, which requires at least two redundant disks.





Secondary Storage Management

13.5 Arranging Data on Disk

Mangesh Dahale

ID-105

CS 257


Outline

Fixed-Length Records

Example of Fixed-Length Records

Packing Fixed-Length Records into Blocks

Example of Packing Fixed-Length Records into Blocks

Details of Block header


Arranging Data on Disk

  • A data element such as a tuple or object is represented by a record, which consists of consecutive bytes in some disk block.


Fixed Length Records

  • The simplest record consists of fixed-length fields.

  • The record begins with a header, a fixed-length region where information about the record itself is kept.

  • A fixed-length record header contains:

    1. A pointer to the record schema.

    2. The length of the record. This information helps us skip over records without consulting the schema.

    3. A timestamp indicating when the record was created, and possibly when it was last modified or last read. This information may be useful for implementing database transactions.


Example

CREATE TABLE employee(

name CHAR(30) PRIMARY KEY,

address VARCHAR(255),

gender CHAR(1),

birthdate DATE

);


Packing Fixed Length Records into Blocks

Records are stored in blocks of the disk and moved into main memory when we need to access or update them.

A block header is written first, and it is followed by a series of records.


Example

Along with the header, we pack as many records as will fit in one block, and any remaining space is unused. A small sketch of this computation follows.
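A minimal sketch of the packing arithmetic; the block and header sizes below are assumptions, and the 297-byte figure is the employee record of the earlier example (30 + 256 + 1 + 10 bytes).

BLOCK_SIZE = 4096            # assumed block size in bytes
BLOCK_HEADER = 24            # assumed block-header size
RECORD_SIZE = 297 + 12       # 297-byte employee record plus an assumed record header

records_per_block = (BLOCK_SIZE - BLOCK_HEADER) // RECORD_SIZE
unused = (BLOCK_SIZE - BLOCK_HEADER) % RECORD_SIZE
print(records_per_block, "records per block,", unused, "bytes unused")  # 13 and 55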


The block header contains the following information:

Links to one or more other blocks that are part of a network of blocks

Information about the role played by this block in such a network

Information about which relation the tuples of this block belong to

A "directory" giving the offset of each record in the block

Timestamp(s) indicating the time of the block's last modification and/or access



Variable Length Data and Records

- Ashwin Kalbhor

Class ID : 107


Agenda

Records with Variable Length Fields

Records with Repeating Fields

Variable Format Records

Records that do not fit in a block


Example of a record

Fixed-length layout of the employee record: name occupies bytes 0-29, address bytes 30-285, gender byte 286, and birth date bytes 287-296, for a record length of 297 bytes.


Records with Variable Length Fields

  • A simple and effective way to represent variable-length records is as follows:

    1. Fixed-length fields are kept ahead of the variable-length fields.

    2. A header is put in front of the record.

    3. The record header contains

    • The length of the record

    • Pointers to the beginning of all variable-length fields except the first one.


Example

Figure: a record with name and address as variable-length fields. The header holds other header information, the record length, and a pointer to the address field; the fixed-length fields gender and birth date follow the header, and then come the variable-length fields name and address.


Records with Repeating Fields

A repeating field is a field F that occurs a variable number of times in a record, each occurrence having the same length L.

All occurrences of field F are grouped together.

A pointer to the first occurrence of F is put in the header.

Based on the length L, the starting offset of any occurrence can be computed.


Example of a record with Repeating Fields

Figure: a movie-star record with "movies" as the repeating field. The header holds other header information, the record length, a pointer to the address field, and a pointer to the movie pointers; the name and address fields are followed by the pointers to movies.


Alternative representation

Record is of fixed length

Variable length fields stored on a separate block.

The record itself keeps track of -

1. Pointers to the place where each repeating field begins, and

2. Either how many repetitions there are, or where the repetitions end.



Representing Null Values

Tuples often have fields that may be NULL. The record format above offers a convenient way to represent NULL values: if a field such as address is null, then we put a null pointer in the place where the pointer to the address goes. We then need no space for the address, except the place for the pointer.

This arrangement can save space on average, even if address is a fixed-length field that frequently has the value NULL.


Variable Format Records

  • Records that do not have a fixed schema

  • Represented by a sequence of tagged fields

  • Each tagged field consists of:

    • Attribute or field name

    • Type of the field

    • Length of the field

    • Value of the field


Variable Format Records

Figure: a variable-format record with two tagged fields. The first field carries the code N (for name), the code S (string type), the length 14, and the value "Clint Eastwood"; the second carries the code R (for restaurant owned), the code S (string type), the length 16, and the value "Hog's Breath Inn".


Records that do not fit in a block

When the length of a record is greater than the block size, the record is divided and placed into two or more blocks.

The portion of the record in each block is referred to as a RECORD FRAGMENT.

A record with two or more fragments is called a SPANNED RECORD.

A record that does not cross a block boundary is called an UNSPANNED RECORD.


Spanned Records

  • Spanned records require the following extra header information:

    • A bit indicating whether the piece is a fragment or a whole record

    • A bit indicating whether it is the first or the last fragment of its record

    • Pointers to the next and/or previous fragment of the same record


Spanned Records

Figure: spanned records across two blocks. Block 1 holds record 1 and record fragment 2-a; block 2 holds record fragment 2-b and record 3. Each block begins with a block header, and each record or fragment has its own record header.



13.8 Record Modifications

CS257

Lok Kei Leong ( 108 )


Outline

Record Insertion

Record Deletion

Record Update


Insertion

When we insert new records into a relation, there are two cases: the records of the relation are kept in no particular order, or they are kept in a fixed order (e.g., sorted by primary key).

A pointer to a record from outside the block is a “structured address”


Fixed Order

There is more of a problem when the tuples must be kept in some fixed order, such as sorted by their primary key (e.g., see Section 14.1.1).

If we need to insert a new record, we first locate the appropriate block for that record.


What If The Block is Full?

  • If we need to insert the record into a particular block, but the block is full, what should we do?

  • We must find room outside the block. There are two solutions:

  • 1. Find space on a nearby block.

  • 2. Create an overflow block.


Insertion (solution 1)

Find space on a "nearby" block.

Suppose block B1 has no space.

If space is available on a nearby block B2, move some records of B1 to B2.

If there are external pointers to the records of B1 that moved to B2, leave a forwarding address in the offset table of B1.


Insertion (solution 2)

Create an overflow block

Each block B has in its header a pointer to an overflow block, where additional records of B can be placed.


Deletion

We may reclaim the space of a deleted record by sliding the remaining records around the block to consolidate the free space.

If we cannot slide records, we maintain an available-space list in the block header to keep track of the space available.

We must avoid pointers that dangle or wind up pointing to a new record that reuses the space.


Tombstone

  • What about pointers to deleted records?

  • A tombstone is placed in place of each deleted record.

  • A tombstone can be a bit placed at the first byte of the deleted record to indicate that the record was deleted (0 – not deleted, 1 – deleted).

  • A tombstone is permanent, so old pointers never dangle.


Update

For fixed-length records, an update has no effect on the storage system.

For variable-length records, updates raise the same problems as insertion and deletion, except that we never need to create a tombstone for the old version of the record.

If the updated record is longer, we must create more space on its block, by sliding records or by creating an overflow block.




Query Execution
Section 15.1

Sweta Shah

CS257: Database Systems
ID: 118


Agenda

  • Query Processor

  • Query compilation

  • Physical Query Plan Operators

    • Scanning Tables

      • Table Scan

      • Index scan

  • Sorting while scanning tables

  • Model of computation for physical operators

  • Parameters for measuring cost

  • Iterators


Query Processor

  • The Query Processor is a group of components of a DBMS that turns user queries and data-modification commands into a sequence of database operations and executes those operations

  • Query processor is responsible for supplying details regarding how the query is to be executed



Query compilation

  • Query compilation itself is a multi-step process consisting of:

    • Parsing: a parse tree representing the query and its structure is constructed

    • Query rewrite: the parse tree is converted to an initial (logical) query plan

    • Physical plan generation: the logical query plan is turned into a physical query plan by selecting algorithms.



Physical Query Plan Operators

  • Physical query plans are built from operators

  • Each operator implements one step of the plan.

  • They are particular implementations for one of the operators of relational algebra.

  • They can also be non relational algebra operators like “scan” which scans tables.


Scanning Tables

  • One of the most basic things in a physical query plan.

  • Necessary when we want to perform join or union of a relation with another relation.



Two basic approaches to locating the tuples of a relation R

  • Table-scan

    • Relation R is stored in secondary memory with its tuples arranged in blocks

    • it is possible to get the blocks one by one

    • This operation is called Table Scan



  • Index-scan

    • If there is an index on some attribute of relation R,

    • we can use this index to get all the tuples of R.

    • This operation is called Index Scan.


Sorting While Scanning Tables

  • Why do we need sorting while scanning?

    • the query could include an ORDER BY clause requiring that a relation be sorted

    • Various algorithms for relational-algebra operations require one or both of their arguments to be sorted relation

    • Sort-scan takes a relation R and a specification of the attributes on which the sort is to be made, and produces R in that sorted order


Model of Computation for Physical Operators

  • Choosing physical plan operators wisely is essential for a good query processor.

  • The cost of an operation is measured in the number of disk I/O operations.

  • If an operator requires the final answer to a query to be written back to the disk, the total cost includes this final write-back, which depends on the length of the answer.


Improvements in cost

  • Major improvements in the cost of physical operators can be achieved by avoiding or reducing the number of disk I/O operations.

  • This can be achieved by passing the answer of one operator to the next in main memory, without writing it to disk.


Parameters for Measuring Costs

  • Parameters that affect the performance of a query

    • Buffer space availability in the main memory at the time of execution of the query

    • Size of input and the size of the output generated

    • The size of a block on disk and the size of main memory also affect performance


Iterators for Implementation of Physical Operators

  • Many physical operators can be implemented as an iterator

  • It is a group of three functions that allows a consumer of the result of the physical operator to get the result one tuple at a time


Iterator

  • The three functions forming the iterator are:

    Open:

  • This function starts the process of getting tuples.

  • It initializes any data structures needed to perform the operation


Iterator

GetNext

  • This function returns the next tuple in the result

  • Adjusts data structures as necessary to allow subsequent tuples to be obtained

  • If there are no more tuples to return, GetNext returns a special value NotFound


Iterator

Close

  • This function ends the iteration after all tuples have been obtained.

  • It calls Close on any arguments of the operator.

  • In summary, the three functions of the iterator are Open(), GetNext(), and Close(); a sketch follows.
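A hedged Python sketch of this Open/GetNext/Close interface for a simple table scan; the class and names here are illustrative, not from the book.

NOT_FOUND = object()                  # stands for the special NotFound value

class TableScan:
    def __init__(self, blocks):
        self.blocks = blocks          # the relation, stored as a list of blocks

    def open(self):
        # Initialize the data structures needed to perform the operation.
        self.block_idx = 0
        self.tuple_idx = 0

    def get_next(self):
        # Return the next tuple, advancing through blocks as needed.
        while self.block_idx < len(self.blocks):
            block = self.blocks[self.block_idx]
            if self.tuple_idx < len(block):
                t = block[self.tuple_idx]
                self.tuple_idx += 1
                return t
            self.block_idx += 1
            self.tuple_idx = 0
        return NOT_FOUND

    def close(self):
        pass                          # nothing to release in this sketch

r = TableScan([[("a", 1), ("b", 2)], [("c", 3)]])
r.open()
while (t := r.get_next()) is not NOT_FOUND:
    print(t)
r.close()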



Query Execution

One-pass algorithm for database operations

Chetan Sharma

008565661


Overview

  • One-Pass Algorithm

  • One-Pass Algorithm Methods:

  • Tuple-at-a-time, unary operations.

  • Full-relation, unary operations.

  • Full-relation, binary operations.


Overview

While many algorithms for operators have been proposed, they largely fall into three classes:

1. Sorting-based methods (Section 15.4).

2. Hash-based methods (Sections 15.5 and 20.1).

3. Index-based methods (Section 15.6).


One-Pass Algorithm

One-pass algorithms read the data only once from disk.

They usually require at least one of the arguments to fit in main memory.


Tuple-at-a-Time

These operations do not require an entire relation, or even a large part of it, in memory at once. Thus, we can read a block at a time, use one main-memory buffer, and produce our output.

Examples: selection and projection


Tuple-at-a-Time

Figure: a selection or projection being performed on a relation R, one block at a time.
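A minimal sketch of such a tuple-at-a-time, one-pass operation (illustrative code): selection and projection over a relation read one block at a time.

def select_project(blocks, condition, columns):
    # One input buffer suffices: process each block, then each tuple in it.
    for block in blocks:
        for t in block:
            if condition(t):
                yield tuple(t[c] for c in columns)

blocks = [[(1, "a", 10), (2, "b", 20)], [(3, "a", 30)]]
print(list(select_project(blocks, lambda t: t[1] == "a", [0, 2])))
# [(1, 10), (3, 30)]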


Full-relation, unary operations

These one-argument operations require seeing all or most of the tuples in memory at once, so one-pass algorithms are limited to relations that are approximately of size M (the number of main-memory buffers available) or less.

Examples: the grouping operator and the duplicate-elimination operator.


Full-relation, unary operations

Figure: managing memory for a one-pass duplicate-elimination.


Grouping

A grouping operation gives us zero or more grouping attributes and presumably one or more aggregated attributes. If we create in main memory one entry for each group — that is, for each value of the grouping attributes — then we can scan the tuples of R, one block at a time.

Examples: MIN(a), MAX(a), COUNT, AVG(a)


Full-relation, binary operations

All other operations are in this class: set and bag versions of union, intersection, difference, joins, and products.

Except for bag union, each of these operations requires at least one argument to be limited to size M, if we are to use a one-pass algorithm


Full-relation, binary operations: examples

Set Union:

- We read S into M - 1 buffers of main memory and build a search structure where the search key is the entire tuple.

- All these tuples are also copied to the output.

- We read each block of R into the Mth buffer, one at a time.

- For each tuple t of R, we see if t is in S, and if not, we copy t to the output. If t is also in S, we skip t.

Set Intersection:

- We read S into M - 1 buffers and build a search structure with full tuples as the search key.

- We read each block of R, and for each tuple t of R, see if t is also in S. If so, copy t to the output, and if not, ignore t.
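A minimal sketch of these one-pass algorithms; block structure is elided for brevity, and S is assumed to fit in M - 1 buffers.

def one_pass_union(r_tuples, s_tuples):
    seen = set(s_tuples)             # S read into memory as a search structure
    yield from seen                  # all tuples of S are copied to the output
    for t in r_tuples:               # stream R one tuple at a time
        if t not in seen:
            yield t                  # output tuples of R not already in S

def one_pass_intersection(r_tuples, s_tuples):
    s_set = set(s_tuples)
    emitted = set()
    for t in r_tuples:
        if t in s_set and t not in emitted:
            emitted.add(t)           # avoid emitting duplicates
            yield t

print(sorted(one_pass_union([1, 2, 3], [2, 4])))         # [1, 2, 3, 4]
print(sorted(one_pass_intersection([1, 2, 3], [2, 4])))  # [2]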



Nested-Loop Joins

Book section: Chapter 15.3

Guide: Prof. Dr. T.Y. LIN

Name: Sanya Valsan

Roll: 120


Topics to be covered

Tuple-Based Nested-Loop Join

An Iterator for Tuple-Based Nested-Loop Join

A Block-Based Nested-Loop Join Algorithm

Analysis of Nested-Loop Join


Introduction

Nested-loop joins are, in a sense, "one-and-a-half" passes, since in each variation one of the two arguments has its tuples read only once, while the other argument is read repeatedly.

Nested-loop joins can be used for relations of any size; it is not necessary that one relation fit in main memory.


15.3.1 Tuple-Based Nested-Loop Join

The simplest variation of nested-loop join has loops that range over individual tuples of the relations involved. In this algorithm, which we call tuple-based nested-loop join, we compute the join R ⋈ S as follows.


Continued

FOR each tuple s in S DO
    FOR each tuple r in R DO
        IF r and s join to make a tuple t THEN
            output t;

    • If we are careless about how to buffer the blocks of relations R and S, then this algorithm could require as many as T(R)T(S) disk I/O’s. There are many situations where this algorithm can be modified to have much lower cost.

      Note: T(R) – number of tuples in relation R.


Continued

One improvement looks much more carefully at the way tuples of R and S are divided among blocks, and uses as much of the memory as it can to reduce the number of disk I/O's as we go through the inner loop. We shall consider this block-based version of nested-loop join.


15.3.2 An Iterator for Tuple-Based Nested-Loop Join

Open() {
    R.Open();
    S.Open();
    s := S.GetNext();
}

GetNext() {
    REPEAT {
        r := R.GetNext();
        IF (r = NotFound) { /* R is exhausted for the current s */
            R.Close();
            s := S.GetNext();
            IF (s = NotFound) RETURN NotFound;
                /* both R and S are exhausted */
            R.Open();
            r := R.GetNext();
        }
    } UNTIL (r and s join);
    RETURN the join of r and s;
}

Close() {
    R.Close();
    S.Close();
}


15.3.3 A Block-Based Nested-Loop Join Algorithm

We can improve the nested-loop join for computing R ⋈ S by:

  • Organizing access to both argument relations by blocks.

  • Using as much main memory as we can to store tuples belonging to the relation S, the relation of the outer loop.


The nested loop join algorithm
The nested-loop join algorithm

FOR each chunk of M-1 blocks of S DO BEGIN

read these blocks into main-memory buffers;

organize their tuples into a search structure whose

search key is the common attributes of R and S;

FOR each block b of R DO BEGIN

read b into main memory;

FOR each tuple t of b DO BEGIN

find the tuples of S in main memory that

join with t ;

output the join of t with each of these tuples;

END ;

END ;

END ;
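A hedged Python sketch of the pseudocode above (names and block layout assumed): each chunk of M - 1 blocks of S is indexed in memory on the shared attribute Y, then R is scanned once per chunk.

from collections import defaultdict

def block_nested_loop_join(r_blocks, s_blocks, m):
    for i in range(0, len(s_blocks), m - 1):      # each chunk of M-1 blocks of S
        index = defaultdict(list)                 # search structure keyed on Y
        for block in s_blocks[i:i + m - 1]:
            for (y, z) in block:
                index[y].append(z)
        for block in r_blocks:                    # one pass over R per chunk
            for (x, y) in block:
                for z in index.get(y, []):
                    yield (x, y, z)

r = [[(1, "a"), (2, "b")], [(3, "a")]]
s = [[("a", 10)], [("b", 20)]]
print(list(block_nested_loop_join(r, s, m=3)))
# [(1, 'a', 10), (2, 'b', 20), (3, 'a', 10)]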


15.3.4 Analysis of Nested-Loop Join

Assuming S is the smaller relation, the number of chunks, or iterations of the outer loop, is B(S)/(M - 1). At each iteration, we read M - 1 blocks of S and B(R) blocks of R. The number of disk I/O's is thus

(B(S)/(M - 1))(M - 1 + B(R)), or B(S) + B(S)B(R)/(M - 1).

Assuming all of M, B(S), and B(R) are large, but M is the smallest of these, an approximation to the above formula is B(S)B(R)/M. That is, the cost is proportional to the product of the sizes of the two relations, divided by the amount of available main memory.


Example

  • B(R) = 1000, B(S) = 500, M = 101

    • Important Aside: 101 buffer blocks is not as unrealistic as it sounds. There may be many queries at the same time, competing for main memory buffers.

  • Outer loop iterates 5 times

  • At each iteration we read M-1 (i.e. 100) blocks of S and all of R (i.e. 1000) blocks.

  • Total time: 5*(100 + 1000) = 5500 I/O’s

  • Question: What if we reversed the roles of R and S?

  • We would iterate 10 times, and in each we would read 100+500 blocks, for a total of 6000 I/O’s.

  • Compare with one-pass join, if it could be done!

  • We would need only B(R) + B(S) = 1500 disk I/O's if B(S) ≤ M - 1, i.e., if S fit in main memory.


Continued

  • The cost of the nested-loop join is not much greater than the cost of a one-pass join, which is 1500 disk I/O's for this example.

  • Nested-loop join is generally not the most efficient join algorithm.


Summary of the topic

In this topic we have learned how nested-loop joins are used in query execution, and the process behind them.



Two Pass Algorithm Based On Sorting

Section 15.4

CS257 Spring 2013

Swapna Vemparala

Class ID : 131


Contents

Two-Pass Algorithms

Two-Phase, Multiway Merge-Sort

Duplicate Elimination Using Sorting

Grouping and Aggregation Using Sorting

A Sort-Based Union Algorithm

Sort-Based Intersection and Difference

A Simple Sort-Based Join Algorithm

A More Efficient Sort-Based Join


Two-Pass Algorithms

Data from operand relation is read into main memory, processed, written out to disk again, and reread from disk to complete the operation.

We concentrate on two-pass algorithms because:

a) Two passes are usually enough, even for very large relations.

b) Generalizing to more than two passes is not hard.

15.4.1 Two-Phase, Multiway Merge-Sort

To sort very large relations in two passes.

Phase 1: Repeatedly fill the M buffers with new tuples from R and sort them, using any main-memory sorting algorithm. Write out each sorted sublist to secondary storage.

Phase 2 : Merge the sorted sublists. For this phase to work, there can be at most M — 1 sorted sublists, which limits the size of R. We allocate one input block to each sorted sublist and one block to the output.


Merging

Find the smallest key among the first remaining elements of the sorted sublists.

Move the smallest element to the first available position of the output block.

If the output block is full, write it to disk and reinitialize the same buffer in main memory to hold the next output block.

If an input block is exhausted of records, read the next block from the same sorted sublist into the same buffer that was used for the block just exhausted.

If no blocks remain, stop. A sketch of the whole algorithm follows.
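A minimal Python sketch of 2PMMS; buffers are simplified to whole in-memory runs, so this is illustrative, not the book's code.

import heapq

def two_phase_mms(blocks, m):
    # Phase 1: sort each group of M blocks into a sorted sublist (run).
    runs = []
    for i in range(0, len(blocks), m):
        chunk = [t for block in blocks[i:i + m] for t in block]
        runs.append(sorted(chunk))
    # Phase 2 requires at most M - 1 runs, one input buffer per run.
    assert len(runs) <= m - 1, "relation too large to sort in two passes"
    return list(heapq.merge(*runs))   # repeatedly emit the smallest front key

blocks = [[9, 3], [7, 1], [4, 8], [2, 6]]
print(two_phase_mms(blocks, m=3))     # [1, 2, 3, 4, 6, 7, 8, 9]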


15.4.2 Duplicate Elimination Using Sorting

The first pass is the same as in 2PMMS: create sorted sublists.

On the second pass, instead of merging into sorted order, repeatedly select the first unconsidered tuple t among all the sorted sublists.

Write one copy of t to the output and eliminate from the input blocks all other occurrences of t.

The output is exactly one copy of every tuple in R.


15.4.3 Grouping and Aggregation Using Sorting

Read the tuples of R into memory, M blocks at a time. Sort the tuples in each set of M blocks, using the grouping attributes of L as the sort key. Write each sorted sublist to disk.

Use one main-memory buffer for each sublist, and initially load the first block of each sublist into its buffer.

Repeatedly find the least value of the sort key present among the first available tuples in the buffers.


15.4.4 A Sort-Based Union Algorithm

In the first phase, create sorted sublists from both R and S.

Use one main-memory buffer for each sublist of R and S. Initialize each with the first block from the corresponding sublist.

Repeatedly find the first remaining tuple t among all the buffers


15.4.5 Sort-Based Intersection and Difference

For both the set and bag versions, the algorithm is the same as that for set union, except in the way we handle the copies of a tuple t at the fronts of the sorted sublists.

For set intersection, output t if it appears in both R and S.

For bag intersection, output t the minimum of the number of times it appears in R and in S.

For set difference, output t if and only if it appears in R but not in S.

For bag difference, output t the number of times it appears in R minus the number of times it appears in S.


15.4.6 A Simple Sort-Based Join Algorithm

Given relations R(X, Y) and S(Y, Z) to join, and given M blocks of main memory for buffers

Sort R, using 2PMMS, with Y as the sort key

Sort S similarly

Merge the sorted R and S, using two buffers: one for the current block of R and one for the current block of S.


15.4.8 A More Efficient Sort-Based Join

  • If we do not have to worry about very large numbers of tuples with a common value for the join attribute(s), then we can save two disk I/O's per block by combining the second phase of the sorts with the join itself.

  • To compute R(X, Y) ⋈ S(Y, Z) using M main-memory buffers:

    • Create sorted sublists of size M, using Y as the sort key, for both R and S.

    • Bring the first block of each sublist into a buffer.


Thank you


Two-Pass Algorithms Based on Hashing

Section 15.5

CS 257

ID 131, Swapna Vemparala


Contents

Introduction

Partitioning Relations by Hashing

A Hash-Based Algorithm for Duplicate Elimination

Hash-Based Grouping and Aggregation

Hash-Based Union, Intersection, and Difference

The Hash-Join Algorithm

Saving Some Disk I/O's

Differences between sort-based and corresponding hash-based algorithms


Introduction

The essential idea behind all these algorithms is as follows:

If the data is too big to store in main-memory buffers, hash all the tuples of the argument or arguments using an appropriate hash key.

For all the common operations, there is a way to select the hash key so that all the tuples that need to be considered together when we perform the operation fall into the same bucket.

We then perform the operation by working on one bucket at a time (or on a pair of buckets with the same hash value, in the case of a binary operation).

In effect, we have reduced the size of the operand(s) by a factor equal to the number of buckets, which is roughly M.


15.5.1 Partitioning Relations by Hashing

Take a relation R and, using M buffers, partition R into M - 1 buckets of roughly equal size.

We assume that h is the hash function, and that h takes complete tuples of R as its argument.

We associate one buffer with each bucket. The last buffer holds blocks of R, one at a time. Each tuple t in the block is hashed to bucket h(t) and copied to the appropriate buffer.

If that buffer is full, we write it out to disk, and initialize another block for the same bucket.

At the end, we write out the last block of each bucket if it is not empty.


Algorithm:
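The original algorithm figure is not preserved in this transcript; the following is a hedged sketch of the partitioning loop just described, with flushing of a full buffer simulated by lists of blocks (partition and block_size are illustrative names).

def partition(tuples, m, block_size=2):
    buckets = [[] for _ in range(m - 1)]     # one in-memory buffer per bucket
    on_disk = [[] for _ in range(m - 1)]     # blocks already written out
    for t in tuples:
        b = hash(t) % (m - 1)                # hash tuple t to bucket h(t)
        buckets[b].append(t)
        if len(buckets[b]) == block_size:    # buffer full: write it to disk
            on_disk[b].append(buckets[b])
            buckets[b] = []
    for b in range(m - 1):                   # write out final nonempty blocks
        if buckets[b]:
            on_disk[b].append(buckets[b])
    return on_disk

for i, blocks in enumerate(partition(range(10), m=4)):
    print("bucket", i, ":", blocks)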


15.5.2 A Hash-Based Algorithm for Duplicate Elimination

We shall now consider the details of hash-based algorithms for the various operations of relational algebra that might need two-pass algorithms.

First, consider duplicate elimination, that is, the operation δ(R). We hash R to M - 1 buckets; two copies of the same tuple t will hash to the same bucket.

Thus, we can examine one bucket at a time, perform δ on that bucket in isolation, and take as the answer the union of δ(Ri), where Ri is the portion of R that hashes to the ith bucket.

The one-pass algorithm eliminates duplicates from each Ri in turn and writes out the resulting unique tuples.


This method will work as long as the individual Ri's are sufficiently small to fit in main memory and thus allow a one-pass algorithm.

Since we may assume the hash function h partitions R into equal-sized buckets, each Ri will be approximately B(R)/(M - 1) blocks in size.

If that number of blocks is no larger than M, i.e., B(R) ≤ M(M - 1), then the two-pass, hash-based algorithm will work.

Thus, a conservative estimate (assuming M and M - 1 are essentially the same) is B(R) ≤ M², exactly as for the sort-based, two-pass algorithm for δ.

The number of disk I/O's is also similar to that of the sort-based algorithm.

We read each block of R once as we hash its tuples, and we write each block of each bucket to disk. We then read each block of each bucket again in the one-pass algorithm that focuses on that bucket.

Thus, the total number of disk I/O's is 3B(R).


15.5.3 Hash-Based Grouping and Aggregation

To perform the γL(R) operation, we again start by hashing all the tuples of R to M - 1 buckets.

However, in order to make sure that all tuples of the same group wind up in the same bucket, we must choose a hash function that depends only on the grouping attributes of the list L.

Having partitioned R into buckets, we can then use the one-pass algorithm for γ to process each bucket in turn.

As for δ, we can process each bucket in main memory provided B(R) ≤ M².

However, on the second pass, we need only one record per group as we process each bucket.

Thus, even if the size of a bucket is larger than M, we can handle the bucket in one pass provided the records for all the groups in the bucket take no more than M buffers.

As a consequence, if groups are large, then we may actually be able to handle much larger relations R than is indicated by the B(R) ≤ M² rule.

On the other hand, if M exceeds the number of groups, then we cannot fill all the buckets.

Thus, the actual limitation on the size of R as a function of M is complex, but B(R) ≤ M² is a conservative estimate.

Finally, we observe that the number of disk I/O's for γ, as for δ, is 3B(R).


15.5.4 Hash-Based Union, Intersection, and Difference

When the operation is binary, we use the same hash function to hash tuples of both arguments. For example, to compute the set union R ∪ S, we hash both R and S to M - 1 buckets each, say R1, R2, ..., RM-1 and S1, S2, ..., SM-1.

We then take the set union of Ri with Si for all i, and output the result. Notice that if a tuple t appears in both R and S, then for some i we shall find t in both Ri and Si.

Thus, when we take the union of these two buckets, we shall output only one copy of t, and there is no possibility of introducing duplicates into the result.


To take the intersection or difference of R and S, we create the 2(M - 1) buckets exactly as for set union and apply the appropriate one-pass algorithm to each pair of corresponding buckets.

Notice that all these one-pass algorithms require B(R) + B(S) disk I/O's.

To this quantity we must add the two disk I/O's per block that are necessary to hash the tuples of the two relations and store the buckets on disk, for a total of 3(B(R) + B(S)) disk I/O's.

In order for the algorithms to work, we must be able to take the one-pass union, intersection, or difference of Ri and Si, whose sizes will be approximately B(R)/(M - 1) and B(S)/(M - 1), respectively.

Recall that the one-pass algorithms for these operations require that the smaller operand occupy at most M - 1 blocks.

Thus, the two-pass, hash-based algorithms require that min(B(R), B(S)) ≤ M², approximately.


15.5.5 The Hash-Join Algorithm

To compute R(X, Y) ⋈ S(Y, Z) using a two-pass, hash-based algorithm, we act almost as for the other binary operations.

The only difference is that we must use as the hash key just the join attributes, Y.

Then we can be sure that if tuples of R and S join, they will wind up in corresponding buckets Ri and Si for some i.

A one-pass join of all pairs of corresponding buckets completes this algorithm, which we call hash-join.
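A minimal sketch of hash-join as just described (illustrative; disk blocks are elided): both relations are partitioned on Y, then corresponding buckets are joined with a one-pass join.

from collections import defaultdict

def hash_join(r, s, num_buckets):
    r_buckets = defaultdict(list)
    s_buckets = defaultdict(list)
    for (x, y) in r:                          # hash R on the join attribute Y
        r_buckets[hash(y) % num_buckets].append((x, y))
    for (y, z) in s:                          # hash S with the same function
        s_buckets[hash(y) % num_buckets].append((y, z))
    for i in range(num_buckets):              # one-pass join per bucket pair
        index = defaultdict(list)
        for (y, z) in s_buckets[i]:
            index[y].append(z)
        for (x, y) in r_buckets[i]:
            for z in index[y]:
                yield (x, y, z)

print(list(hash_join([(1, "a"), (2, "b")], [("a", 10), ("c", 30)], 2)))
# [(1, 'a', 10)]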


15.5.6 Saving Some Disk I/O's

If there is more memory available on the first pass than we need to hold one block per bucket, then we have some opportunities to save disk I/O's.

One option is to use several blocks for each bucket, and write them out as a group, in consecutive blocks of disk.

Strictly speaking, this technique doesn't save disk I/O's, but it makes the I/O's go faster, since we save seek time and rotational latency when we write.

An effective method, called hybrid hash-join, works as follows.

In general, suppose we decide that to join R ⋈ S, with S the smaller relation, we need to create k buckets, where k is much less than M, the available memory. When we hash S, we can choose to keep m of the k buckets entirely in main memory, while keeping only one block for each of the other k - m buckets. We can manage to do so provided the expected size of the buckets in memory, plus one block for each of the other buckets, does not exceed M; that is:

m B(S)/k + k - m ≤ M


The expected size of a bucket is B(S)/k, and there are m buckets in memory.

Now, when we read the tuples of the other relation, R, to hash that relation into buckets, we keep in memory:

1. The m buckets of S that were never written to disk, and

2. One block for each of the k - m buckets of R whose corresponding buckets of S were written to disk.

If a tuple t of R hashes to one of the first m buckets, then we immediately join it with all the tuples of the corresponding S-bucket, as if this were a one-pass hash-join.


It is necessary to organize each of the in-memory buckets of S into an efficient search structure to facilitate this join, just as for the one-pass hash-join.

If t hashes to one of the buckets whose corresponding S-bucket is on disk, then t is sent to the main-memory block for that bucket, and eventually migrates to disk, as for a two-pass, hash-based join.

On the second pass, we join the corresponding buckets of R and S as usual.

However, there is no need to join the pairs of buckets for which the S-bucket was left in memory; these buckets have already been joined.

The savings in disk I/O's is equal to two for every block of the buckets of S that remain in memory, and their corresponding R-buckets.

Since m/k of the buckets are in memory, the savings is 2(m/k)(B(R) + B(S)).


The intuitive justification is that all but k - m of the main-memory buffers can be used to hold tuples of S in main memory, and the more of these tuples, the fewer the disk I/O's.

Thus, we want to minimize k, the total number of buckets.

We do so by making each bucket about as big as can fit in main memory; that is, buckets are of size M, and therefore k = B(S)/M.

If that is the case, then there is only room for one bucket in the extra main memory; i.e., m = 1.

In fact, we really need to make the buckets slightly smaller than B(S)/M, or else we shall not quite have room for one full bucket and one block for each of the other k - 1 buckets in memory at the same time.

Assuming, for simplicity, that k is about B(S)/M and m = 1, the savings in disk I/O's is 2M(B(R) + B(S))/B(S), and the total cost is (3 - 2M/B(S))(B(R) + B(S)).



Differences between sort-based and corresponding hash-based algorithms

1. Hash-based algorithms for binary operations have a size requirement that depends only on the smaller of the two arguments, rather than on the sum of the argument sizes that sort-based algorithms require.

2. Sort-based algorithms sometimes allow us to produce a result in sorted order and take advantage of that sort later. The result might be used in another sort-based algorithm for a subsequent operator, or it could be the answer to a query that is required to be produced in sorted order.

3. Hash-based algorithms depend on the buckets being of equal size. Since there is generally at least a small variation in size, it is not possible to use buckets that, on average, fill all of the available memory; they must be limited to a slightly smaller size.


4. In sort-based algorithms, the sorted sublists may be written to consecutive blocks of the disk.

5. Moreover, if M is much larger than the number of sorted sublists, then we may read in several consecutive blocks at a time from a sorted sublist, again saving some latency and seek time.

6. On the other hand, if we can choose the number of buckets to be less than M in a hash-based algorithm, then we can write out several blocks of a bucket at once.


Thank you


15.6 Index-Based Algorithms

By: Tomas Tupy (123)


Outline

  • Terminology

  • Clustered Indexes

    • Example

  • Non-Clustered Indexes

  • Index Based Selection

  • Joining Using an Index

  • Join Using a Sorted Index



Index-based algorithms are especially useful for the selection operator, but algorithms for join and other binary operators also use indexes to very good advantage.


What is an Index?

A data structure which improves the speed of data-retrieval operations on a relation, at the cost of slower writes and additional storage space.

Enables sub-linear time lookup.

Data is stored in arbitrary order, while logical ordering is achieved by using the index.


Terminology Recap

B(R) – the number of blocks needed to hold R

T(R) – the number of tuples in R

V(R, a) – the number of distinct values of attribute a in R

Clustered relation – tuples are packed into as few blocks as possible.

Clustered indexes – indexes on attribute(s) such that all tuples with a fixed value for the search key appear in as few blocks as possible.


Clustering Indexes

A relation is clustered if its tuples are packed into relatively few blocks.

Clustering indexes are indexes on an attribute or attributes such that all the tuples with a fixed value for the search key of the index appear in as few blocks as possible.

Tuples are stored to match the index order.

A relation that isn’t clustered cannot have a clustering index.


Clustering Indexes

Let R(a, b) be a relation sorted on attribute a.

Let the index on a be a clustering index.

Let a1 be a specific value for a.

A clustering index has all tuples with a fixed value packed into the minimum number of blocks.

Figure: all the tuples with a = a1 packed into a few consecutive blocks.


Pros/Cons

  • Pros

    • Faster reads for particular selections

  • Cons

    • Writing to a table with a clustered index can be slower since there might be a need to rearrange data.

    • Only one clustered index possible.


Clustered Index Example

  • Problem: We want to quickly retrieve all orders for a particular customer.

  • How do we do this?


Clustered Index Example

Solution: Create a clustered index on the "CustomerID" column of the Order table.

Now the tuples with the same CustomerID will be physically stored close to one another on the disk.


Non-Clustered Indexes

There can be many per table

Quicker for insert and update operations.

The physical order of tuples is not the same as index order.


Index-Based Algorithms

Especially useful for the selection operator.

Join and other binary operators also benefit.


Index-Based Selection

  • No index

    • Without an index on relation R, we have to read all the tuples in order to implement the selection σC(R), and see which tuples match our condition C.

    • What is the cost in terms of disk I/O's to implement σC(R)? (For both clustered and non-clustered relations)


Index-Based Selection

  • No index

    • Answer:

      • B(R) if the relation is clustered

      • Up to T(R) if the relation is not clustered


Index-Based Selection

Let us consider an index on attribute a, where our condition C is a = v.

σa=v(R)

In this case we just search the index for value v, and we get pointers to exactly the tuples we need.


Index-Based Selection

Let's say that for our selection σa=v(R), our index is clustering.

What is the cost in the number of disk I/O's to retrieve the set σa=v(R)?


Index-Based Selection

  • Answer

    • The average is: B(R) / V(R, a)

  • A few more I/O's may be needed:

    • The index might not be in main memory

    • Tuples with a = v might not be block-aligned

    • Even if clustered, the tuples might not be packed as tightly as possible (extra space is left for insertions)


Index-Based Selection

Now suppose the index on a is non-clustering for our selection σa=v(R).

What is the cost in the number of disk I/O's to retrieve the set σa=v(R)?


Index-Based Selection

  • Answer

    • The worst case is: T(R) / V(R, a)

    • This can happen if each matching tuple lives in a different block. A small worked illustration follows.
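A small worked illustration of the two estimates, with assumed statistics:

B_R = 1000       # assumed: blocks needed to hold R
T_R = 20000      # assumed: tuples in R
V_R_a = 100      # assumed: distinct values of attribute a

print("clustering index:      ~", B_R // V_R_a, "disk I/O's")      # ~10
print("non-clustering index: up to", T_R // V_R_a, "disk I/O's")   # up to 200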


Joining by Using an Index (Algorithm 1)

Consider the natural join R(X, Y) ⋈ S(Y, Z).

Suppose S has an index on attribute Y.

Start by examining each block of R, and within each block consider each tuple t, where tY is the component of t corresponding to attribute Y.

Now we use the index to find the tuples of S that have tY in their Y component.

These tuples form the tuples of the join.


Joining by Using an Index (Algorithm 1): Analysis

Consider R(X, Y) ⋈ S(Y, Z).

If R is clustered, then we have to read B(R) blocks to get all the tuples of R. If R is not clustered, then up to T(R) disk I/O's are required.

For each tuple t of R, we must read an average of T(S) / V(S, Y) tuples of S.

Total: B(R)T(S) / V(S, Y) for a clustered index, and T(R)T(S) / V(S, Y) for a non-clustered index. A worked illustration follows.
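A small worked illustration of these formulas, with assumed statistics:

B_R, T_R = 1000, 10000    # assumed: blocks and tuples of R
T_S, V_S_Y = 5000, 100    # assumed: tuples of S and distinct Y-values in S

print("clustered:    ", B_R * T_S // V_S_Y, "disk I/O's")   # B(R)T(S)/V(S,Y)
print("non-clustered:", T_R * T_S // V_S_Y, "disk I/O's")   # T(R)T(S)/V(S,Y)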


Join Using a Sorted Index

Consider R(X, Y) ⋈ S(Y, Z).

Data structures such as B-trees provide the best sorted indexes.

In the best case, if we have sorted indexes on Y for both R and S, then we perform only the last step of the simple sort-based join. (This is sometimes called a zig-zag join.)


Join Using a Sorted Index (Zig-Zag Join)

Consider R(X, Y) ⋈ S(Y, Z), where we have indexes on Y for both R and S.

Tuples from R with a Y-value that does not appear in S never need to be retrieved, and vice versa.

Figure: the index on Y in R and the index on Y in S are scanned in step, zig-zagging between them.


Thank You!

Questions?


Buffer Management

By Snigdha Rao Parvatneni

SJSU ID: 008648978

Roll Number: 124

Course: CS257


Agenda

Introduction

Role of Buffer Management

Architecture of Buffer Management

Buffer Management Strategies

Relation Between Physical Operator Selection And Buffer Management

Example


Introduction2
Introduction

We assume that relational operators have some main-memory buffers in which to store the data they need.

It is rare that these buffers are allocated to an operator in advance.

The task of assigning main-memory buffers to processes is given to the buffer manager.

The buffer manager is responsible for allocating main memory to processes as needed, minimizing delays and unsatisfiable requests.


Role of buffer manager
Role of Buffer Manager

The buffer manager responds to requests for main-memory access to disk blocks. (Figure: processes request blocks from the buffer manager, which moves them between the buffers and disk.)


Architecture of buffer management
Architecture of Buffer Management

  • There are two broad architectures for a buffer manager:

    • The buffer manager controls main memory directly, as in many relational DBMSs.

    • The buffer manager allocates buffers in virtual memory and lets the OS decide which buffers are in main memory and which are in the OS-managed disk swap space, as in many object-oriented DBMSs and main-memory DBMSs.


Problem
Problem

  • Regardless of the approach, the buffer manager must limit the number of buffers in use so that they fit in the available main memory.

    • In the case where the buffer manager controls main memory directly:

      • If requests exceed the available space, the buffer manager has to select a buffer to empty, returning its contents to disk.

      • If the block has not been changed, it is simply erased from main memory; if it has been changed, it must be written back to its place on disk.

    • In the case where the buffer manager allocates space in virtual memory:

      • The buffer manager may allocate more buffers than can actually fit in main memory. When all of these buffers are in active use, thrashing results.

      • Thrashing is an operating-system problem in which many blocks are moved in and out of the disk's swap space; the system ends up spending most of its time swapping blocks and gets very little work done.


Solution
Solution

To resolve this problem, the number of buffers is fixed when the DBMS is initialized.

Users need not worry about which mode of buffering is used.

From the users' point of view there is a fixed-size buffer pool: a set of buffers available to queries and other database actions.


Buffer management strategies
Buffer Management Strategies

  • The buffer manager must make a critical choice: which blocks to keep and which to discard when a buffer is needed for a newly requested block.

  • For this it uses a buffer-replacement strategy. Some common strategies are:

    • Least-Recently Used (LRU)

    • First-In-First-Out (FIFO)

    • The Clock Algorithm (Second Chance)

    • System Control


Least recently used lru
Least-Recently Used (LRU)

  • This rule throws out the block that has not been read or written for the longest time.

  • To do this, the buffer manager maintains a table indicating the last time the block in each buffer was accessed.

  • Each database access must also make an entry in this table, so a significant amount of effort is involved in maintaining this information.

  • Buffers that have not been used for a long time are less likely to be accessed soon than buffers that have been accessed recently, so LRU is an effective strategy.
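A minimal LRU buffer-pool sketch in Python (illustrative, not actual DBMS code; read_block stands in for a disk read, and write-back of dirty blocks is omitted):

from collections import OrderedDict

class LRUBufferPool:
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffers = OrderedDict()   # block_id -> contents, oldest first

    def access(self, block_id, read_block):
        if block_id in self.buffers:               # hit: most recently used
            self.buffers.move_to_end(block_id)
        else:
            if len(self.buffers) >= self.capacity:
                self.buffers.popitem(last=False)   # evict the least recent
            self.buffers[block_id] = read_block(block_id)
        return self.buffers[block_id]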


First in first out fifo
First-In-First-Out (FIFO)

  • Under this rule, when a buffer is needed, the buffer that has held the same block the longest is emptied and used for the new block.

  • The buffer manager needs to know only the time at which the block occupying each buffer was loaded.

  • An entry is made in the table when a block is read from disk, not every time it is accessed.

  • FIFO involves less maintenance than LRU, but it is more prone to mistakes.


The clock algorithm
The Clock Algorithm

It is an efficient approximation of LRU and is commonly implemented.

Buffers are treated as if arranged in a circle, with an arrow pointing to one of them; the arrow rotates clockwise when it needs to find a buffer in which to place a disk block.

Each buffer has an associated flag with value 0 or 1. Buffers with flag value 0 are vulnerable to having their contents sent back to disk; buffers with flag value 1 are not.

Whenever a block is read into a buffer, or the contents of the buffer are accessed, its flag is set to 1.


Working of clock s algorithm
Working of the Clock Algorithm

Whenever a buffer is needed for a block, the arrow looks for the first 0 it can find, moving clockwise; as it passes over a buffer whose flag is 1, it changes that flag to 0.

A block is thrown out of its buffer only if it remains unaccessed (flag value 0) for the time between two passes of the arrow: the first pass sets its flag from 1 to 0, and on the second pass the arrow comes back and finds the flag still 0.
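A minimal Python sketch of the clock algorithm (illustrative; dirty-block write-back is omitted):

class ClockBufferPool:
    def __init__(self, capacity):
        self.blocks = [None] * capacity   # block held by each buffer
        self.flags = [0] * capacity       # 1 = recently used, 0 = vulnerable
        self.hand = 0                     # the rotating "arrow"

    def access(self, block_id, read_block):
        if block_id in self.blocks:                # hit: set the flag to 1
            self.flags[self.blocks.index(block_id)] = 1
            return
        while self.flags[self.hand] == 1:          # sweep, clearing 1s to 0s
            self.flags[self.hand] = 0
            self.hand = (self.hand + 1) % len(self.blocks)
        self.blocks[self.hand] = block_id          # evict victim, load block
        self.flags[self.hand] = 1
        read_block(block_id)
        self.hand = (self.hand + 1) % len(self.blocks)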


System control
System Control

The query processor and other DBMS components can advise the buffer manager, avoiding some of the mistakes that occur with LRU, FIFO, or the clock algorithm.

Some blocks cannot be moved out of main memory without first modifying other blocks that point to them; such blocks are called pinned blocks.

The buffer manager must modify its replacement strategy to avoid expelling pinned blocks, so some blocks remain in main memory even when there is no technical reason they could not be written to disk.


Relation between p hysical operator selection a nd buffer management
Relation Between Physical Operator Selection and Buffer Management

  • Physical operators are selected by the query optimizer to execute the query. This selection may assume a certain number of available buffers for executing the operators.

  • But the buffer manager does not guarantee that this number of buffers will be available when the query is executed.

  • Two questions therefore arise:

    • Can an algorithm adapt to changes in the number of available main-memory buffers?

    • When the expected number of buffers is not available and blocks that were expected in main memory are moved to disk, how does the buffer manager's replacement strategy affect the number of I/O's performed?


Example4
Example

Block-based nested-loop join: the algorithm itself does not depend on the number of available buffers M, but its performance does.

For each group of M−1 blocks of the outer-loop relation S, read the blocks into main memory and organize their tuples into a search structure whose key is the common attribute of R and S.

Then, for each block b of R, read b into main memory, and for each tuple t of b, find the tuples of S in main memory that join with t.

The number M−1 depends on the average number of buffers available at each iteration: the outer loop uses M−1 buffers, and one buffer is reserved for a block of R, the inner-loop relation.

If we pin the M−1 blocks used for S in one iteration of the outer loop, we cannot lose those buffers during that round. If additional buffers become available, more than one block of R can be kept in memory; will that improve the running time? (A sketch of the algorithm appears below.)
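A minimal Python sketch of block-based nested-loop join (illustrative assumptions: relations are lists of blocks, each block a list of tuples; key_S and key_R extract the common attribute Y):

def block_nested_loop_join(S_blocks, R_blocks, M, key_S, key_R):
    out = []
    for i in range(0, len(S_blocks), M - 1):     # M-1 blocks of S at a time
        search = {}                              # search structure keyed on Y
        for block in S_blocks[i:i + M - 1]:
            for s in block:
                search.setdefault(key_S(s), []).append(s)
        for block in R_blocks:                   # one block of R in the last buffer
            for t in block:
                for s in search.get(key_R(t), []):
                    out.append((t, s))           # t joins with these S tuples
    return out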


Cases with lru
Cases with LRU

  • Case 1

    • Suppose LRU is the buffer-replacement strategy and k buffers are available to hold blocks of R.

    • R is read in order, so the blocks that remain in the buffers at the end of an iteration of the outer loop are the last k blocks of R.

    • The next iteration starts again from the beginning of R; therefore all k buffered blocks of R must be replaced.

  • Case 2

    • A better implementation of nested-loop join under LRU visits the blocks of R in an order that alternates: first-to-last, then last-to-first.

    • This way we save k disk I/O's on each iteration except the first.


With other algorithms
With Other Algorithms

Other algorithms are also affected by the fact that buffer availability can vary, and by the buffer-replacement strategy used by the buffer manager.

In sort-based algorithms, when buffer availability shrinks we can reduce the size of the sorted sublists. The major limitation is that we may be forced to create so many sublists that we cannot allocate a buffer to each sublist during the merging phase.

In hash-based algorithms, when buffer availability shrinks we can reduce the number of buckets, provided the buckets do not then become so large that they no longer fit in the allotted main memory.


References
References

DATABASE SYSTEMS: The Complete Book, Second Edition by Hector Garcia-Molina, Jeffrey D. Ullman & Jennifer Widom


Thank you4
Thank You


15 8 algorithms using more than two passes
15.8 Algorithms Using More than Two Passes

Presented By: Seungbeom Ma (ID 125)

Professor: Dr. T. Y. Lin

Computer Science Department

San Jose State University


Multipass algorithms
Multipass Algorithms

  • The algorithms seen previously require at most two passes.

  • There are cases where we need more than two passes.

  • Case: the data is too big to handle in two passes.

    • We have to hash or sort the relation with a multipass algorithm.


Agenda3
Agenda

1. Multipass Sort-Based Algorithm

2. Multipass Hash-Based Algorithm


Multipass sort based algorithm
Multipass Sort-Based Algorithm

M: Number of memory buffers

R: Relation

B(R): Number of blocks holding the relation

BASIS: If R fits in M blocks (B(R) ≤ M):

1. Read R into main memory.

2. Sort R in main memory with any sorting algorithm.

3. Write the sorted relation to disk.


Multipass sort based algorithm1
Multipass Sort-Based Algorithm

INDUCTION (B(R) > M):

1. Partition the blocks holding R into M groups, called R1, R2, …, RM.

2. Recursively sort Ri for each i = 1 to M.

3. Once the sorting is done, merge the M sorted sublists.
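A minimal Python sketch of the recursion (illustrative assumptions: a relation is a list of blocks, each a list of tuples; the result is returned as one sorted list, with re-blocking omitted):

import heapq

def multipass_sort(blocks, M):
    if len(blocks) <= M:                            # BASIS: fits in M buffers
        return sorted(t for b in blocks for t in b)
    group = -(-len(blocks) // M)                    # ceiling: blocks per group
    runs = [multipass_sort(blocks[i:i + group], M)  # INDUCTION: sort each of
            for i in range(0, len(blocks), group)]  # at most M groups
    return list(heapq.merge(*runs))                 # merge the sorted sublists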


Performance multipass sort based algorithms
Performance: Multipass Sort-Based Algorithms

1) Each pass of a sorting algorithm:

1. Reads data from the disk.

2. Sorts the data with any sorting algorithm.

3. Writes the data back to the disk.

2-1) A k-pass sorting algorithm therefore needs

2k B(R) disk I/O's.

2-2) A sort-join of R and S whose sorting takes k passes needs A + B disk I/O's, where

A: 2(k−1)(B(R) + B(S)) [disk I/O's to sort the sublists], and

B: B(R) + B(S) [disk I/O's to read the sorted sublists in the final pass].

Total: (2k−1)(B(R) + B(S)) disk I/O's.
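As an illustrative check of the formula: take B(R) = B(S) = 10,000 and k = 2. Sorting the sublists costs 2(2−1)(10,000 + 10,000) = 40,000 disk I/O's, and reading the sorted sublists in the final pass costs another 20,000, for a total of (2·2−1)(20,000) = 60,000 disk I/O's.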


Multipass hash based algorithms
Multipass Hash-Based Algorithms

  • 1. Hash the relation(s) into M−1 buckets, where M is the number of memory buffers.

  • 2. Unary case: apply the operation to each bucket individually.

  • 1. Duplicate elimination (δ) and grouping (γ):

    • 1) Grouping: MIN, MAX, COUNT, SUM, AVG, which group the data in the table.

    • 2) Duplicate elimination: DISTINCT.

      Basis:

      If the relation fits in M memory blocks,

      -> read the relation into memory and perform the operation.

  • 3. Binary case: apply the operation to each corresponding pair of buckets.

  • Query operations: union, intersection, difference, and join.

    • If either relation fits in M−1 memory blocks:

    • -> read that relation into main memory, using M−1 blocks;

    • -> read the other relation one block at a time into the Mth buffer;

    • then perform the operation.


Induction
INDUCTION

  • If the relation (unary case) or neither relation (binary case) fits in the main-memory buffers:

    • Hash each relation into M−1 buckets.

    • Recursively perform the operation on each bucket or corresponding pair of buckets.

    • Accumulate the output from each bucket or pair.
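A minimal Python sketch of the unary case, using duplicate elimination (illustrative assumptions: `capacity` is the number of tuples that fit in M buffers, standing in for the B(R) ≤ M test):

def multipass_distinct(tuples, M, capacity):
    if len(tuples) <= capacity:               # BASIS: fits in memory
        return list(dict.fromkeys(tuples))
    buckets = [[] for _ in range(M - 1)]      # hash into M-1 buckets
    for t in tuples:
        buckets[hash(t) % (M - 1)].append(t)
    out = []
    for b in buckets:                         # INDUCTION: recurse per bucket;
        if len(b) == len(tuples):             # guard against a degenerate hash
            out.extend(dict.fromkeys(b))
        else:
            out.extend(multipass_distinct(b, M, capacity))
    return out                                # duplicates always share a bucket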


Hash based algorithms unary operators
Hash-Based Algorithms: Unary Operators


Performance hash based algorithms
Performance: Hash-Based Algorithms

R: Relation.

The operations are unary, like δ and γ.

M: Number of memory buffers.

u(M, k): Number of blocks in the largest relation that a k-pass hashing algorithm can handle.


Performance induction
Performance: Induction

Induction:

1. Assume the first step divides relation R into M−1 equal buckets.

2. Each bucket for the next pass must be small enough to handle in k−1 passes, i.e., of size at most u(M, k−1).

3. Since R is divided into M−1 buckets, we need u(M, k) = (M−1)·u(M, k−1).
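Combining this with the basis u(M, 1) = M (a one-pass algorithm handles a relation of M blocks), the recurrence u(M, k) = (M−1)·u(M, k−1) unwinds to u(M, k) = M(M−1)^(k−1), which is approximately M^k. So a k-pass hash-based unary algorithm can handle a relation of roughly M^k blocks.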


Sort based vs hash based
Sort-Based vs. Hash-Based

1. Sort-based algorithms can produce output in sorted order, which may help reduce rotational latency or seek time in later operations.

2. Hash-based algorithms depend on the buckets being of equal size. For binary operations, hash-based costs are limited only by the size of the smaller relation, so hash-based can be faster than sort-based when one relation is small.


Thanks
THANKS


David le cs257 id 126 feb 28 2013
David Le, CS257, ID: 126, Feb 28, 2013

15.9

Query Execution Summary


Secondary storage management

Overview

  • Query Processing

  • Outline of Query Compilation

  • Table Scanning

  • Cost Measures

  • Review of Algorithms

    • One-pass Methods

    • Nested-Loop Join

    • Two-pass

      • Sort-based

      • Hash-based

    • Index-based

    • Multi-pass


Query processing
Query Processing

  • The query is compiled. This involves extensive optimization using operations of relational algebra.

  • It is first compiled into a logical query plan, e.g. using expressions of relational algebra.

  • The logical plan is then converted to a physical query plan: selecting an implementation for each operator, ordering joins, etc.

  • The query is then executed.

(Figure: query → query compilation → query plan → query execution, consulting metadata and data.)


Outline of query compilation1
Outline of Query Compilation

  • Parsing: A parse tree for the query is constructed.

  • Query Rewrite: The parse tree is converted to an initial query plan and transformed into a logical query plan.

  • Physical Plan Generation: Logical plan is converted into physical plan by selecting algorithms and order of executions.

(Figure: SQL query → parse query → expression tree → select logical plan → query optimization → logical query plan tree → select physical plan → physical query plan tree → execute plan.)

Table scanning
Table Scanning

  • There are two approaches for locating tuples of relation R:

    • Table-scan: Get the blocks one by one.

    • Index-scan: Use index to lead us to all blocks holding R.

  • Sort-scan takes a relation R and sorting specifications and produces R in a sorted order. This can be accomplished with SQL clause ‘ORDER BY’.


Secondary storage management

  • Cost Measures

  • Estimates of cost are essential for query optimization.

  • It allows us to determine the slow and fast parts of a query plan.

  • Reading many consecutive blocks on a track is extremely important, since disk I/O's are expensive in terms of time.

EXPLAIN SELECT * FROM a JOIN b on a.id = b.id;


Secondary storage management

  • Review of Algorithms

  • One-pass Methods

  • Tuple-at-a-time: Selection and projection that do not require an entire relation in memory at once.

  • Full-relation, unary operations. These must see all or most of the tuples in memory at once; they include the grouping and duplicate-elimination operators.

  • Full-relation, binary operations. These include union, intersection, difference, product and join. They largely fall into three classes:

  • 1. Sorting-based methods (Section 15.4).

  • 2. Hash-based methods (Sections 15.5 and 20.1).

  • 3. Index-based methods (Section 15.6).


Secondary storage management

  • Review of Algorithms

  • Nested-Loop Joins

  • In a sense, it is ‘one-and-a-half’ passes, since one argument has its tuples read only once, while the other will be read repeatedly.

  • Can be used for relations of any size; the data does not have to fit entirely in main memory.

  • Two variations of nested-loop joins:

    • Tuple-based: Simplest form, can be very slow since it takes T(R)*T(S) disk I/O’s if we are joining R(x,y) with S(y,z).

    • Block-based: Organizing access to both argument relations by blocks and use as much main memory as we can to store tuples.


Secondary storage management

  • Review of Algorithms

  • Two-pass Algorithms

  • Usually enough even for large relations.

  • Based on Sorting:

    • Partition the arguments into memory-sized, sorted sublists.

    • Sorted sublists are then merged appropriately to produce desired results.

  • Based on Hashing:

    • Partition the arguments into buckets.

    • Useful if data is too big to store in memory.


Secondary storage management

  • Review of Algorithms

  • Two-pass Algorithms

  • Sort-based vs. Hash-based:

    • Hash-based methods are often superior to sort-based ones, since their cost depends only on the smaller of the two arguments fitting in memory.

    • Sort-based methods work well when there is a reason to keep some of the data sorted.


Secondary storage management

  • Review of Algorithms

  • Index-based Algorithms

  • Index-based joins are excellent when one of the relations is small, and the other has an index on join attributes.

  • Clustering and non-clustering indexes:

    • A clustering index has all tuples with a fixed value packed into the minimum possible number of blocks.

    • A clustered relation can have non-clustering indexes.


Secondary storage management

  • Review of Algorithms

  • Multi-pass Algorithms

  • Two-pass algorithms based on sorting or hashing extend naturally to three or more passes, which work for larger data sets.

  • Each pass of a sorting algorithm reads all data from disk and writes it out again.

  • Thus, a k-pass sorting algorithm requires 2·k·B(R) disk I/O’s.


Secondary storage management

Chapter 18


Chapter 18 1 topics
Chapter 18.1 Topics

  • Intro

    • Concurrency Control

    • Scheduler

  • Serializability

  • Schedules

    • Serial and Serializable


Intro concurrency control scheduler
Intro: Concurrency Control & Scheduler

  • Concurrently executing transactions can cause inconsistent database state

  • Concurrency Control assures transactions preserve consistency

  • Scheduler:

    • Regulates individual steps of different transactions

    • Takes reads/writes requests from transactions and executes/delays them


Intro scheduler
Intro: Scheduler

(Figure: the transaction manager passes read/write requests to the scheduler; the scheduler determines when the requests execute against the buffers.)


Serializability
Serializability

  • How to assure concurrently executing transactions preserve database state correctness?

    • Serializability – schedule transactions as if they were executed one-at-a-time

      • Determine a Schedule


Schedules
Schedules

  • Schedule – sequence of important actions performed by transactions

    • Actions: reads and writes

  • Example: Transactions and actions


Serial schedules
Serial Schedules

  • All actions of one transaction are followed by all the actions of another transaction, and so on.

  • No mixing of actions

  • Depends only on order of transactions

  • Serial Schedules:

    • T1 precedes T2

    • T2 precedes T1


Serial schedule example
Serial Schedule: Example

  • T1 precedes T2

    • Notation: (T1, T2)

  • Consistency constraint:

    • A = B

  • Final value:

    • A = B = 250

    • Consistency is preserved


Serializable schedules
Serializable Schedules

  • Serial schedules preserve consistency

  • Any other schedules that also guarantee consistency?

    • Serializable schedules

  • Definition:

    • A schedule S is serializable if there is a serial schedule S′ such that for every initial database state, the effects of S and S′ are the same.


Serializable schedule example
Serializable Schedule: Example

A serializable, but not serial, schedule:

T2 acts on A after T1, but before T1 acts on B.

The effect is the same as the serial schedule (T1, T2).


Notation transactions and schedules
Notation: Transactions and Schedules

  • Transaction: Ti (for example T1, T2, …)

  • Database element: X

  • Actions: read/write

    • rTi(X) = ri(X)

    • wTi(X) = wi(X)

  • Examples

    • Transactions:

      • T1: r1(A); w1(A); r1(B); w1(B);

      • T2: r2(A); w2(A); r2(B); w2(B);

    • Schedule:

      • r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B);


18 2 conflict serializability
18.2 Conflict-Serializability

18.2.1 Conflicts

18.2.2 Precedence Graphs and a Test for Conflict-Serializability

18.2.3 Why the Precedence-Graph Test Works


18 2 1 conflicts
18.2.1 Conflicts

A conflict is a pair of consecutive actions in a schedule such that, if their order is interchanged, the final state produced by the schedule may change.


18 2 1 conflicts1
18.2.1 Conflicts

  • Non-conflicting situations:

    Assuming Ti and Tj are different transactions, i.e., i ≠ j:

    • ri(X); rj(Y) will never conflict, even if X = Y.

    • ri(X); wj(Y) will not conflict for X ≠ Y.

    • wi(X); rj(Y) will not conflict for X ≠ Y.

    • wi(X); wj(Y) will not conflict for X ≠ Y.


18 2 1 conflicts2
18.2.1 Conflicts

Conflicting situations — three situations where actions may not be swapped:

  • Two actions of the same transaction always conflict: ri(X); wi(Y)

  • Two writes of the same database element by different transactions conflict: wi(X); wj(X)

  • A read and a write of the same database element by different transactions conflict: ri(X); wj(X) and wi(X); rj(X)


18 2 1 conflicts3
18.2.1 Conflicts

  • Conclusions:

    Any two actions of different transactions may be swapped unless:

    • They involve the same database element, and

    • At least one is a write

  • The schedules S and S’ are conflict-equivalent, if S can be transformed into S’ by a sequence of non-conflicting swaps of adjacent actions.

  • A schedule is conflict-serializable if it is conflict-equivalent to a serial schedule.


18 2 1 conflicts4
18.2.1 Conflicts

  • Example 18.6

    Conflict-serializable schedule

    S: r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B);

    Above schedule is converted to the serial schedule S’ (T1, T2) through a sequence of swaps.

    r1(A); w1(A); r2(A); w2(A); r1(B); w1(B); r2(B); w2(B);

    r1(A); w1(A); r2(A); r1(B); w2(A); w1(B); r2(B); w2(B);

    r1(A); w1(A); r1(B); r2(A); w2(A); w1(B); r2(B); w2(B);

    r1(A); w1(A); r1(B); r2(A); w1(B); w2(A); r2(B); w2(B);

    S’: r1(A); w1(A); r1(B); w1(B); r2(A); w2(A); r2(B); w2(B);


18 2 2 precedence graphs and a test for conflict serializability
18.2.2 Precedence Graphs and a Test for Conflict-Serializability

  • Given a schedule S involving transactions T1 and T2, T1 takes precedence over T2 (T1 <S T2) if there are actions A1 of T1 and A2 of T2 such that:

    • A1 is ahead of A2 in S,

    • Both A1 and A2 involve the same database element, and

    • At least one of A1 and A2 is a write action

  • We cannot swap the order of A1 and A2.

  • A1 will appear before A2 in any schedule that is conflict-equivalent to S.

  • A conflict-equivalent serial schedule must have T1 before T2.


18 2 2 precedence graphs and a test for conflict serializability1
18.2.2 Precedence Graphs and a Test for Conflict-Serializability

    • Precedence graph:

      • Nodes represent transactions of S

      • Arc from node i to node j, if Ti <S Tj

  • Example 18.7

    Given Schedule

    S: r2(A); r1(B); w2(A); r3(A); w1(B); w3(A); r2(B); w2(B);

    Precedence Graph

    Acyclic graph ⇒ the schedule is conflict-serializable

    (Figure: precedence graph with arcs T1 → T2 and T2 → T3.)
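A minimal Python sketch of the precedence-graph test (illustrative, not the book's code; a schedule is modeled as a list of (transaction, action, element) triples, e.g. (2, 'r', 'A') for r2(A)):

def conflict_serializable(schedule):
    arcs = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for (tj, aj, xj) in schedule[i + 1:]:
            if ti != tj and xi == xj and 'w' in (ai, aj):
                arcs.add((ti, tj))            # Ti takes precedence over Tj
    nodes = {t for (t, _, _) in schedule}
    while nodes:                              # peel nodes with no incoming arcs
        free = {n for n in nodes
                if not any(u in nodes and v == n for (u, v) in arcs)}
        if not free:
            return False                      # a cycle remains
        nodes -= free
    return True

# Example 18.7's schedule (acyclic graph, hence conflict-serializable):
S = [(2, 'r', 'A'), (1, 'r', 'B'), (2, 'w', 'A'), (3, 'r', 'A'),
     (1, 'w', 'B'), (3, 'w', 'A'), (2, 'r', 'B'), (2, 'w', 'B')]
print(conflict_serializable(S))   # True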


    18 2 2 precedence graphs and a test for conflict serializability2
18.2.2 Precedence Graphs and a Test for Conflict-Serializability

    • Example 18.8

      S: r2(A); r1(B); w2(A); r3(A); w1(B); w3(A); r2(B); w2(B);

      Convert S to serial schedule S’ (T1, T2, T3).

      r1(B); r2(A); w2(A); r3(A); w1(B); w3(A); r2(B); w2(B);

      r1(B); r2(A); w2(A); w1(B); r3(A); w3(A); r2(B); w2(B);

      r1(B); r2(A); w1(B); w2(A); r3(A); w3(A); r2(B); w2(B);

      r1(B); w1(B); r2(A); w2(A); r3(A); w3(A); r2(B); w2(B);

      r1(B); w1(B); r2(A); w2(A); r3(A); r2(B); w3(A); w2(B);

      r1(B); w1(B); r2(A); w2(A); r2(B); r3(A); w3(A); w2(B);

      r1(B); w1(B); r2(A); w2(A); r2(B); r3(A); w2(B); w3(A);

      r1(B); w1(B); r2(A); w2(A); r2(B); w2(B); r3(A); w3(A);

      S’: r1(B); w1(B); r2(A); w2(A); r2(B); w2(B); r3(A); w3(A);


    18 2 2 precedence graphs and a test for conflict serializability3
18.2.2 Precedence Graphs and a Test for Conflict-Serializability

    • Example 18.9

      Given Schedule

      S: r2(A); r1(B); w2(A); r2(B); r3(A); w1(B); w3(A); w2(B);

      Precedence Graph

      Cyclic graph ⇒ the schedule is NOT conflict-serializable

    (Figure: precedence graph with a cycle between T1 and T2, plus an arc from T2 to T3.)


    18 2 3 why the precedence graph test works
18.2.3 Why the Precedence-Graph Test Works

    • Consider a cycle involving n transactions

      T1 —> T2 ... —> Tn —> T1

    • In the hypothetical serial order, the actions of T1 must precede those of T2, which precede those of T3, and so on, up to Tn.

    • But the actions of Tn, which therefore come after those of T1, are also required to precede those of T1. This contradiction means no conflict-equivalent serial order can exist.

    • Thus, if there is a cycle in the precedence graph, then the schedule is not conflict-serializable.


    Secondary storage management

18.3 Enforcing Serializability by Locks


    Introduction3
INTRODUCTION

    • Enforcing serializability by locks

      • Locks

      • Locking scheduler

      • Two phase locking


    Locks
Locks

  • Locks are maintained on database elements to prevent unserializable behavior.

  • Locking works as follows:

    • A transaction requests a lock.

    • The scheduler checks the lock table to guide its decision.

    • The result is a serializable schedule of actions.


    Consistency of transactions
Consistency of Transactions

  • Actions and locks must relate to each other:

    • A transaction can read or write a database element only if it holds a lock on that element.

    • The transaction must eventually unlock every element it locks.

  • Legality of schedules:

    • Two transactions cannot hold locks on the same element at the same time; the earlier one must release its lock before the other may acquire one.


    Locking scheduler
Locking Scheduler

    • Grants a lock request only if the resulting schedule is legal.

    • Lock table stores the information about current locks on the elements.

    • Consider

      • li(X): Transaction Ti requests a lock on database element X

      • ui(X): Transaction Ti releases its lock on database element X


    Locking scheduler contd
Locking Scheduler (contd.)

A legal schedule of consistent transactions that is, unfortunately, not serializable.


    Locking schedule contd
Locking Scheduler (contd.)

    The locking scheduler delays requests in order to maintain a consistent database state.


    Two phase locking 2pl
Two-Phase Locking (2PL)

    • Guarantees that a legal schedule of consistent transactions is conflict-serializable.

    • In each transaction, all lock requests precede all unlock requests.

    • The growing phase:

      • Obtain locks; no unlocks allowed.

    • The shrinking phase:

      • Release locks; no new locks allowed.


    Working of two phase locking
Working of Two-Phase Locking

    • Assures serializability.

    • Two protocols for 2PL:

      • Strict two phase locking : Transaction holds all its write locks till commit / abort.

      • Rigorous two phase locking : Transaction holds all locks till commit / abort.

    • Two phase transactions are ordered in the same order as their first unlocks.


Two phase locking

(Figure: the number of locks a two-phase-locked transaction holds grows over time, peaks, then shrinks.)

Every two-phase-locked transaction has a point at which it may be thought to execute instantaneously.


    Failure of 2pl
Failure of 2PL

2PL does not protect against deadlocks. Consider:

    T1: l1(A); r1(A); A:=A+100; w1(A); l1(B); u1(A); r1(B); B:=B+100; w1(B); u1(B);

    T2: l2(B); r2(B); B:=B*2; w2(B); l2(A); u2(B); r2(A);A:=A*2; w2(A); u2(A);


    Concurrency control architecture for a locking scheduler section 18 5
Concurrency Control: Architecture for a Locking Scheduler (Section 18.5)


    Overview2
Overview

    Overview of Locking Scheduler

    Scheduler That Inserts Lock Actions

    The Lock Table

Handling Locking and Unlocking Requests


    Principles of simple scheduler architecture
Principles of a Simple Scheduler Architecture

    The transactions themselves do not request locks, or cannot be relied upon to do so. It is the job of the scheduler to insert lock actions into the stream of reads, writes, and other actions that access data.

    Transactions do not release locks. Rather, the scheduler releases the locks when the transaction manager tells it that the transaction will commit or abort.


    Scheduler that inserts lock actions into the transactions request stream
Scheduler That Inserts Lock Actions into the Transactions' Request Stream

(Figure: a transaction's stream of actions, e.g. Read(A); Write(B); Commit(T), enters Scheduler Part I, which inserts lock actions ahead of them, e.g. Lock(A); Read(A). Scheduler Part II executes the resulting sequence, consulting the lock table.)

    Secondary storage management

    The scheduler maintains a lock table, which, although it is shown as secondary-storage data, may be partially or completely in main memory

    Actions requested by a transaction are generally transmitted through the scheduler and executed on the database.

    Under some circumstances a transaction is delayed, waiting for a lock, and its requests are not (yet) transmitted to the database.


    The two parts of the scheduler perform
The Two Parts of the Scheduler

    Part I takes the stream of requests generated by the transactions and inserts appropriate lock actions ahead of all database-access operations, such as read, write, increment, or update.

    Part II takes the sequence of lock and database-access actions passed to it by Part I, and executes each appropriately


    Secondary storage management

Part II determines the transaction T to which each action belongs and whether T is delayed. If T is not delayed, then:

1. A database-access action is transmitted to the database and executed.

2. If a lock action is received by Part II, it checks the lock table to see whether the lock can be granted.

    • i> If granted, the lock table is modified to include the new lock.

    • ii> If not granted, the lock table is updated to record the request, and Part II delays transaction T.


    Secondary storage management

3. When a transaction T commits or aborts, Part I is notified by the transaction manager and releases all of T's locks.

- If any transactions are waiting for those locks, Part I notifies Part II.

4. When notified about a released lock on some database element, Part II determines the next transaction T′ to be granted that lock so it may continue.


    The lock table
The Lock Table

    A relation that associates database elements with locking information about that element

    Implemented with a hash table using database elements as the hash key

    Size is proportional to the number of lock elements only, not to the size of the entire database

(Figure: hash table mapping a DB element A to the lock information for A.)


    Secondary storage management

Lock-Table Entry Fields

    • Group Mode

      • S means that only shared locks are held.

      • U means that there is one update lock and perhaps one or more shared locks.

      • X means there is one exclusive lock and no other locks.

  • Waiting

    • Waiting bit tells that there is at least one transaction waiting for a lock on A.

  • A list

    • A list describing all those transactions that either currently hold locks on A or are waiting for a lock on A.


Handling lock requests
Handling Lock Requests

    Suppose transaction T requests a lock on A

    If there is no lock table entry for A, then there are no locks on A, so create the entry and grant the lock request

    If the lock table entry for A exists, use the group mode to guide the decision about the lock request


    Secondary storage management

How to Deal with an Existing Lock

    • 1) If group mode is U (update) or X (exclusive)

    • No other lock can be granted

      • Deny the lock request by T

      • Place an entry on the list saying T requests a lock

      • And Wait? = ‘yes’

    • 2) If group mode is S (shared)

    • Another shared or update lock can be granted

      • Grant request for an S or U lock

      • Create entry for T on the list with Wait? = ‘no’

      • Change group mode to U if the new lock is an update lock
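A minimal Python sketch of this grant decision by group mode (a simplified model of the rules above; names are illustrative):

def can_grant(requested, group_mode):
    if group_mode is None:           # no locks on the element: always grant
        return True
    if group_mode in ('U', 'X'):     # update or exclusive held: grant nothing
        return False
    return requested in ('S', 'U')   # group mode S: another S or one U is fine

print(can_grant('U', 'S'))   # True: shared group mode admits one update lock
print(can_grant('X', 'S'))   # False: the exclusive request must wait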


    Handling unlock requests
Handling Unlock Requests

    • Now suppose transaction T unlocks A

      • Delete T’s entry on the list for A

      • If T’s lock is not the same as the group mode, no need to change group mode

      • Otherwise check entire list for new group mode


    Handling unlock requests1
Handling Unlock Requests

  • If the value of Waiting is "yes", we need to grant one or more locks, using one of the following approaches:

      • First-Come-First-Served:

      • Grant the lock to the longest waiting request.

      • No starvation (waiting forever for lock)

      • Priority to Shared Locks:

      • Grant all S locks waiting, then one U lock.

      • Grant X lock if no others waiting

      • Priority to Upgrading:

      • If there is a U lock waiting to upgrade to an X lock, grant that first.


    18 concurrency control
18. Concurrency Control

18.6 Hierarchies of Database Elements


    Hierarchies of database elements
Hierarchies of Database Elements

  • Two problems arise when tree structures are encountered in data; each calls for its own locking approach:

    • 1) A hierarchy of lockable elements (this section):

      - large elements (relations), and

      - smaller elements within relations (blocks or individual tuples).

    • 2) Data itself organized as a tree (next, Section 18.7).


    Locks with multiple granularity
Locks with Multiple Granularity

    A database element can be a relation, block or a tuple

Different database systems use database elements of different sizes for locking purposes.


    Example5
Example

    • Consider a database for a bank

      • Choosing relations as database elements means one lock for an entire relation.

      • For a relation holding account balances, this kind of lock would be very inflexible and would provide very little concurrency:

      • only one deposit or withdrawal could be made at a time.

      • A better way is to lock individual blocks or pages, so that two accounts in different blocks can be updated simultaneously.


    Warning intention locks
Warning (Intention) Locks

This protocol helps manage locks at different levels of the hierarchy.



    Warning protocol rules
Warning Protocol Rules

    • These involve both ordinary (S and X) and warning (IS and IX) locks

  • The following rules are followed:

      • Begin at the root of hierarchy

      • Request the S or X lock if we are at the desired element.

      • If the desired element is further down the hierarchy, place a warning lock (IS if S, IX if X).

      • When the warning lock is granted, proceed to the child node and repeat these steps until the desired node is reached.

        Grant decisions use a compatibility matrix.
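Since the matrix itself is not reproduced in these notes, here is the standard compatibility matrix for IS, IX, S, and X group modes (a commonly cited form; "yes" means the request can be granted):

Held \ Requested:   IS    IX    S     X
IS                  yes   yes   yes   no
IX                  yes   yes   no    no
S                   yes   no    yes   no
X                   no    no    no    no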



    Example6
Example

    • Movie(title, year, length, studioName)

      Transaction T1

      - Select * from Movie where Title = ‘King kong’

      Transaction T2

      - Update Movie set year = 1939 where title=‘ gone with the wind’


    Secondary storage management

If T2 had updated tuples related to 'King Kong', it would have had to wait until T1 released its S lock.

    S and X are not compatible


    Phantoms and handling insertions correctly
Phantoms and Handling Insertions Correctly

This problem arises when transactions create new subelements of lockable elements.

Since we can lock only existing elements, the new elements fail to be locked.

    Example followed


    Example7
Example

    • Consider a transaction T3

    • Select sum(length) from Movie where studioName = 'Disney'

      • This calculates the total length of all tuples having studioName=Disney

      • Thus, T3 acquires IS for relation and S for targeted tuples

      • Now, if another transaction T4 inserts a new tuple for studioName = ‘Disney’, the result of T3 becomes incorrect


    Contd
…contd

  • By itself this is not a concurrency problem, since the result is consistent with the serial order (T3, T4).

  • Now consider T4 writing its new tuple while T3 is reading tuples of the same relation:

    • r3(d1); r3(d2); w4(d3); w4(X); w3(L); w3(X)

    • We want the total length of all the studio's movies; the outcome should be consistent with some serial order.

    • The scenario above is not consistent: T3 will not get the correct sum, because no lock on the new element (d3) was obtained.


    Contd1
…contd

    • The problem is that the relation has a phantom tuple (the newly inserted tuple), which should have been locked but did not exist when T3 acquired its locks.

    • The occurrence of phantoms can be avoided if all insertion and deletion transactions are treated as write operations (exclusive lock, X) on the whole relation.


    18 7 the tree protocol
18.7 THE TREE PROTOCOL


    Outline3
Outline

    Introduction

    Motivation

    Rules for Access to Tree-Structured Data

    Why the Tree Protocol Works


    Secondary storage management

Introduction

    • In this section, we deal with tree structures that are formed by the link pattern.

      • Database elements are disjoint pieces of data, but the only way to get to a node is through its parent.

      • B-trees are an important example of this sort of data.

      • Traversing a particular path to an element gives us freedom to manage locks differently from the two-phase locking approaches we have seen previously.


    B tree details
B-tree Details

    • Basic DS:

      - Keeps records in sorted order, allows searches, sequential access, insertions and deletions in logarithmic time.

      - It is used in databases and file systems.

    • Locking structure:

      - Granularity is at the node level: treating smaller pieces as elements is not beneficial, and treating the entire B-tree as a single element makes concurrency infeasible.


    Motivation for tree based locking
Motivation for Tree-Based Locking

    If we use a standard set of lock modes (shared, update and exclusive) and two phase locking, then concurrent use of the B-tree is almost impossible.

    The reason is that every transaction must begin by locking the root node of the B-tree.

    Any transaction that inserts or deletes could wind up rewriting the root of the B-tree.

    Thus, only one transaction that is not read-only can access the B-tree at any time

    However, in most situations a B-tree node will not be rewritten, even if the transaction inserts or deletes a tuple.


    Secondary storage management

Thus, as soon as a transaction moves to a child of the root and observes a situation that rules out a rewrite of the root, we would like to release the lock on the root.

    Releasing the lock on the root early will violate two phase locking, so we cannot be sure that the schedule of several transactions accessing the B-tree will be serializable.

    The solution is a specialized protocol for transactions that access tree structured data like B-trees.


    Rules for access to tree structured data
Rules for Access to Tree-Structured Data

  • The following restrictions on locks form the tree protocol.

      Assumptions:

      • There is only ONE kind of lock, represented in the form li(X).

      • Transactions are consistent, and schedules must be legal but there is no two-phase locking requirement on transactions.

        Rules:

      • A transaction’s first lock may be at any node of the tree

      • Subsequent locks may only be acquired if the transaction currently has a lock on the parent node.

      • Nodes may be unlocked at any time.

      • A transaction may not relock a node on which it has released a lock, even if it still holds a lock on the node’s parent
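A minimal Python sketch that enforces these rules for a single transaction (illustrative; `parent` maps each node to its parent, None for the root):

class TreeProtocolTxn:
    def __init__(self, parent):
        self.parent = parent
        self.held = set()         # nodes currently locked
        self.released = set()     # nodes unlocked earlier
        self.first = True

    def lock(self, node):
        if node in self.released:
            raise ValueError("rule 4: may not relock a released node")
        if not self.first and self.parent[node] not in self.held:
            raise ValueError("rule 2: must hold the lock on the parent")
        self.held.add(node)       # rule 1: the first lock may be anywhere
        self.first = False

    def unlock(self, node):       # rule 3: nodes may be unlocked at any time
        self.held.discard(node)
        self.released.add(node)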


    Example8
Example…

A tree of lockable elements:

(Figure: root A with children B and C; B has children D and E; E has children F and G.)


    3 transactions following the protocol
Three Transactions Following the Protocol…

The actions of T1, T2, and T3 occur in this order (the original three-column layout flattened into one schedule):

T1: l1(A); r1(A);

T1: l1(B); r1(B);

T1: l1(C); r1(C);

T1: w1(A); u1(A);

T1: l1(D); r1(D);

T1: w1(B); u1(B);

T2: l2(B); r2(B);

T3: l3(E); r3(E);

T1: w1(D); u1(D);

T1: w1(C); u1(C);

T2: l2(E) denied

T3: l3(F); r3(F);

T3: w3(F); u3(F);

T3: l3(G); r3(G); w3(E); u3(E);

T2: l2(E); r2(E);

T3: w3(G); u3(G);

T2: w2(B); u2(B);

T2: w2(E); u2(E);


    Why tree protocol works
Why the Tree Protocol Works

    • The tree protocol implies a serial order on the transactions involved in a schedule.

    • The order of precedence can be defined as:

      Ti <S Tj if, in a schedule S, the transactions Ti and Tj lock a node in common and Ti locks it first.

    • If the precedence graph drawn from these precedence relations has no cycles, then any topological order of the transactions is an equivalent serial schedule.

    • For example, either (T1, T2, T3) or (T3, T1, T2) is an equivalent serial schedule; the reason is that all common nodes are locked in the same order as they were in the original schedule.


    Secondary storage management

If two transactions lock several elements in common, then they are all locked in the same order.

Precedence graph derived from the schedule:

(Figure: nodes 1, 2, 3 with arcs T1 → T2 and T3 → T2.)


    Secondary storage management

Example: a path of elements locked by two transactions.

(Figure: a path of elements X, P, Z, Y; T locks X first, U locks Y first, and both lock the nodes in between.)

Now consider an arbitrary set of transactions T1, T2, …, Tn that obey the tree protocol and lock some of the nodes of a tree according to a schedule S.

Among those that lock the root, they also lock all common nodes in the same order: if Ti locks the root before Tj, then Ti locks every node it has in common with Tj before Tj does. That is, Ti <S Tj, but not Tj <S Ti.


    Concurrency control
CONCURRENCY CONTROL

    SECTION 18.8

    Timestamps


    What is timestamping
What is Timestamping?

    • The scheduler assigns each transaction T a unique number, its timestamp TS(T).

    • Timestamps must be issued in ascending order, at the time a transaction first notifies the scheduler that it is beginning.


    Timestamp ts t
Timestamp TS(T)

    • Two methods of generating Timestamps.

      • Use the value of the system clock as the timestamp.

      • Use a logical counter that is incremented after a new timestamp has been assigned.

  • Whichever method is used, the scheduler maintains a table of currently active transactions and their timestamps.


  • Timestamps for database element x and commit bit
Timestamps for Database Element X and the Commit Bit

    • RT(X): the read time of X, the highest timestamp of any transaction that has read X.

    • WT(X): the write time of X, the highest timestamp of any transaction that has written X.

    • C(X): the commit bit for X, true if and only if the most recent transaction to write X has already committed.


    Physically unrealizable behavior

Physically Unrealizable Behavior

  • Read too late: a transaction U started after transaction T, but wrote a value for X before T read X.

(Figure: timeline — T starts, then U starts, U writes X, then T tries to read X.)


    Physically unrealizable behavior1

Physically Unrealizable Behavior

  • Write too late: a transaction U started after T, but read X before T got a chance to write X.

(Figure: timeline — T starts, U starts, U reads X, then T tries to write X.)


    Dirty read

Dirty Read

  • It is possible that after T reads the value of X written by U, transaction U will abort.

(Figure: timeline — U starts, U writes X, T starts, T reads X, then U aborts: T has performed a dirty read.)


    Thomas write rule
Thomas Write Rule

  • A write by T on X can be skipped when a writer with a later timestamp has already written X. The risk: if that later writer subsequently aborts after T commits, it is too late to repair the damage.

(Figure: timeline — T starts, U starts, U writes X, T's write of X is skipped, T commits, U aborts.)


    Rules for timestamps based scheduling
Rules for Timestamp-Based Scheduling

    • The scheduler, in response to a read or write request from a transaction T has the choice of:

      • a) Granting the request,

      • b) Aborting T (if T would violate physical reality) and restarting T with a new timestamp (abort followed by restart is often called rollback), or

      • c) Delaying T and later deciding whether to abort T or to grant the request (if the request is a read, and the read might be dirty).


    Rules for timestamps based scheduling1
Rules for Timestamp-Based Scheduling

    • Scheduler receives a request rT(X)

      • If TS(T) ≥ WT(X), the read is physically realizable.

        • If C(X) is true, grant the request, if TS(T) > RT(X), set RT(X) := TS(T); otherwise do not change RT(X).

        • If C(X) is false, delay T until C(X) becomes true or transaction that wrote X aborts.

    • If TS(T) < WT(X), the read is physically unrealizable. Rollback T.


    Rules for timestamps based scheduling cont
Rules for Timestamp-Based Scheduling (Cont.)

  • Scheduler receives a request wT(X).

      • If TS(T) ≥ RT(X) and TS(T) ≥ WT(X), write is physically realizable and must be performed.

        • Write the new value for X,

        • Set WT(X) := TS(T), and

        • Set C(X) := false.

    • If TS(T) ≥ RT(X) but TS(T) < WT(X), then the write is physically realizable, but there is already a later value in X.

      • If C(X) is true, then the previous writer of X is committed, and we simply ignore the write by T (the Thomas write rule).

      • If C(X) is false, we must delay T.

  • if TS(T) < RT(X), then the write is physically unrealizable, and T must be rolled back.
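A minimal Python sketch of these read/write rules (illustrative; each element X is a dict carrying RT, WT, and C, and 'grant'/'delay'/'rollback'/'skip' stand in for the scheduler's responses):

def on_read(TS_T, X):
    if TS_T >= X['WT']:                    # physically realizable
        if not X['C']:
            return 'delay'                 # writer not committed: dirty risk
        X['RT'] = max(X['RT'], TS_T)
        return 'grant'
    return 'rollback'                      # read too late

def on_write(TS_T, X, value):
    if TS_T >= X['RT'] and TS_T >= X['WT']:
        X.update(WT=TS_T, C=False, value=value)   # perform the write
        return 'grant'
    if TS_T >= X['RT']:                    # a later value is already in X
        return 'skip' if X['C'] else 'delay'      # Thomas write rule / wait
    return 'rollback'                      # write too late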


  • Rules for timestamps based scheduling cont1
Rules for Timestamp-Based Scheduling (Cont.)

    • Scheduler receives a request to commit T. It must find all the database elements X written by T and set C(X) := true. If any transactions are waiting for X to be committed, these transactions are allowed to proceed.

    • Scheduler receives a request to abort T or decides to rollback T, then any transaction that was waiting on an element X that T wrote must repeat its attempt to read or write.



    Timestamps and locking
Timestamps and Locking

    • Generally, timestamping performs better than locking in situations where:

      • Most transactions are read-only.

      • It is rare that concurrent transactions will try to read and write the same element.

  • In high-conflict situations, locking performs better than timestamps.

  • The argument for this rule-of-thumb is:

    • Locking will frequently delay transactions as they wait for locks.

    • But if concurrent transactions frequently read and write elements in common, then rollbacks will be frequent in a timestamp scheduler, introducing even more delay than a locking system would.


    Concurrency control by validation
    Concurrency Control by Validation

    CS 257 : Database System Principles

    Section 18.9


    At a glance
    At a Glance

    What is Validation?

    Architecture of a Validation-Based Scheduler

    Validation Rules

    Comparison between Concurrency Control Mechanisms


    Validation p1
    Validation (p1)

    Validation is another type of optimistic concurrency control: it allows transactions to access data without locks.

    The validation scheduler keeps a record of what active transactions are doing.

    Each transaction goes through a validation phase before it starts to write values of database elements.

    If there is a physically unrealizable behavior, the transaction is rolled back.


    18 9 1 architecture of validation based scheduler p1
    18.9.1 Architecture of a Validation-Based Scheduler

    The scheduler must be told, for each transaction T:

    • Read set, RS(T) – the set of database elements T reads

    • Write set, WS(T) – the set of database elements T writes

    • Three phases of the Validation Scheduler

      • Read

        • The transaction reads from the database all elements in its read set.

        • It also computes, in its local address space, all the results it is going to write.

      • Validate

        • Validates the transaction by comparing its read and write sets with those of other transactions.

        • If validation fails, the transaction is rolled back; otherwise it proceeds to the write phase.

      • Write

        • Writes to the database its values for the elements in its write set.


    Validation based scheduler
    Validation-Based Scheduler

    • The scheduler has an assumed serial order of the transactions to work with.

    • It maintains three sets:

      • START: the set of transactions that have started but have not yet completed validation

        • START(T) – the time at which T started

      • VAL: the set of transactions that have been validated but have not yet finished the writing of phase 3

        • START(T) and VAL(T) – VAL(T) is the time at which T validated

      • FIN: the set of transactions that have completed phase 3

        • START(T), VAL(T), FIN(T) – FIN(T) is the time at which T finished


    18 9 2 validation rules
    18.9.2 Validation Rules

    Figure: Case 1 – T reads X while U writes X (timeline: U starts, T starts, U validates, T validating).

    • Case 1:

      • U is in VAL or FIN; that is, U is validated.

      • FIN(U) > START(T); that is, U did not finish before T started.

      • RS(T) ∩ WS(U) is not empty (suppose it contains database element X).

      • Since we don't know whether or not T got to read U's value, we must roll back T to avoid the risk that the actions of T and U will not be consistent with the assumed serial order.



    18 9 2 validation rules1
    18.9.2 Validation Rules

    Figure: Case 2 – T and U both write X (timeline: U validates, T validating, U finishes after T validates).

    • Case 2:

      • U is in VAL; i.e., U has successfully validated.

      • FIN(U) > VAL(T); i.e., U did not finish before T entered its validation phase.

      • WS(T) ∩ WS(U) is not empty (let database element X be in both write sets).

      • T and U must both write values of X, and if we let T validate, it is possible that it will write X before U does. Since we cannot be sure, we roll back T to make sure it does not violate the assumed serial order in which it follows U.



    Rules for validating a transaction t
    Rules for Validating a Transaction T

    • Check that RS(T) ∩ WS(U) = ∅ for any previously validated U that did not finish before T started, i.e., if FIN(U) > START(T).

    • Check that WS(T) ∩ WS(U) = ∅ for any previously validated U that did not finish before T validated, i.e., if FIN(U) > VAL(T).

      1. RS(T) ∩ WS(U) = ∅ when FIN(U) > START(T)

      2. WS(T) ∩ WS(U) = ∅ when FIN(U) > VAL(T)
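
    These two checks translate almost directly into code. A minimal sketch, assuming a Txn record that carries RS(T), WS(T) and the START, VAL, FIN times (fin is None until phase 3 completes); the names are illustrative:

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional

@dataclass
class Txn:
    name: str
    rs: FrozenSet[str]           # RS(T)
    ws: FrozenSet[str]           # WS(T)
    start: int = 0               # START(T)
    val: int = 0                 # VAL(T)
    fin: Optional[int] = None    # FIN(T); None while T has not finished writing

def validate(t: Txn, validated: List[Txn]) -> bool:
    """Validate T against every previously validated transaction U."""
    for u in validated:
        # Rule 1: U did not finish before T started, so T may have missed
        # some of U's writes while reading.
        if (u.fin is None or u.fin > t.start) and (t.rs & u.ws):
            return False
        # Rule 2: U did not finish before T validated, so T might write a
        # shared element before U does, violating the assumed serial order.
        if (u.fin is None or u.fin > t.val) and (t.ws & u.ws):
            return False
    return True
```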


    Example 18 2 9
    Example 18.2.9

    Four transactions T, U, V, W attempt to execute and validate. Each passes through its read, validate, and write phases (shown as a timeline in the original figure). Their read and write sets, in validation order:

    • U: RS = {B}, WS = {D}

    • T: RS = {A,B}, WS = {A,C}

    • V: RS = {B}, WS = {D,E}

    • W: RS = {A,D}, WS = {A,C}


    Example 18 2 91
    Example 18.2.9

    • Validation of U [RS = {B}; WS = {D}]

      • Nothing to check; U reads {B}, validates successfully, and writes {D}.

    • Validation of T [RS = {A,B}; WS = {A,C}]

      • FIN(U) > START(T); RS(T) ∩ WS(U) should be empty → {A,B} ∩ {D} = ∅

      • FIN(U) > VAL(T); WS(T) ∩ WS(U) should be empty → {A,C} ∩ {D} = ∅

    • Validation of V [RS = {B}; WS = {D,E}]

      • FIN(T) > START(V); RS(V) ∩ WS(T) should be empty → {B} ∩ {A,C} = ∅

      • FIN(T) > VAL(V); WS(V) ∩ WS(T) should be empty → {D,E} ∩ {A,C} = ∅

      • FIN(U) > START(V); RS(V) ∩ WS(U) should be empty → {B} ∩ {D} = ∅

    • Validation of W [RS = {A,D}; WS = {A,C}]

      • FIN(T) > START(W); RS(W) ∩ WS(T) should be empty → {A,D} ∩ {A,C} = {A}

      • FIN(V) > START(W); RS(W) ∩ WS(V) should be empty → {A,D} ∩ {D,E} = {D}

      • FIN(V) > VAL(W); WS(W) ∩ WS(V) should be empty → {A,C} ∩ {D,E} = ∅

      • W does not validate; it is rolled back and hence does not write values for A and C.
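
    Replaying the example with the validate sketch above confirms this outcome. The clock values are illustrative, chosen to match the figure (in particular, U finishes just as W starts, so W is never checked against U):

```python
U = Txn("U", frozenset({"B"}),      frozenset({"D"}),      start=1, val=2, fin=4)
T = Txn("T", frozenset({"A", "B"}), frozenset({"A", "C"}), start=2, val=3, fin=5)
V = Txn("V", frozenset({"B"}),      frozenset({"D", "E"}), start=3, val=4, fin=7)
W = Txn("W", frozenset({"A", "D"}), frozenset({"A", "C"}), start=4, val=6)

validated = []
for txn in [U, T, V, W]:
    if validate(txn, validated):
        print(txn.name, "validates")
        validated.append(txn)
    else:
        print(txn.name, "is rolled back")
# Prints: U, T, and V validate; W is rolled back, because
# RS(W) ∩ WS(T) = {A} and FIN(T) = 5 > START(W) = 4.
```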


    18 9 3 comparison between concurrency control mechanisms storage utilization
    18.9.3 Comparison between Concurrency Control Mechanisms – Storage Utilization

    • Timestamping and validation may use slightly more space than locking.

    • A potential problem with validation is that the write set of a transaction must be known before the writes occur.


    18 9 3 comparison between concurrency mechanisms delay
    18.9.3 Comparison between Concurrency Control Mechanisms – Delay

    • The performance of the three methods depends on whether interaction among transactions is high or low.

      (Interaction: the likelihood that a transaction will access an element that is also being accessed by a concurrent transaction.)

      • Locking delays transactions but avoids rollbacks, even when interaction is high.

      • Timestamps and validation do not delay transactions, but can cause them to roll back, which is a more serious form of delay and also wastes resources.

    • If interaction is low, then neither timestamps nor validation will cause many rollbacks, and either is preferable to locking.


    Secondary storage management

    Thank You

