Cs 257 chapter 15 9 summary of query execution database systems the complete book
This presentation is the property of its rightful owner.
Sponsored Links
1 / 21

CS 257 Chapter – 15.9 Summary of Query Execution Database Systems: The Complete Book PowerPoint PPT Presentation


  • 52 Views
  • Uploaded on
  • Presentation posted in: General

CS 257 Chapter – 15.9 Summary of Query Execution Database Systems: The Complete Book. Krishna Vellanki 124. Introduction. What is Query Processor? Group of components of a DBMS that converts a user queries and data-modification commands into a sequence of database operations

Download Presentation

CS 257 Chapter – 15.9 Summary of Query Execution Database Systems: The Complete Book

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Cs 257 chapter 15 9 summary of query execution database systems the complete book

CS 257Chapter – 15.9 Summary of Query ExecutionDatabase Systems: The Complete Book

Krishna Vellanki

124


Introduction

Introduction

What is Query Processor?

  • Group of components of a DBMS that converts a user queries and data-modification commands into a sequence of database operations

  • It also executes those operations

  • Must supply detail regarding how the query is to be executed


Building blocks of query processing

Building Blocks of Query processing

  • Query Execution:

    • The algorithms that manipulate the data of the database.

    • Focus on the operations of extended relational algebra.

3


Outline of query compilation

Outline of Query Compilation

4

Query compilation

  • Parsing: A parse tree for the query is constructed

  • Query Rewrite: The parse tree is converted to an initial query plan and transformed into logical query plan (less time)

  • Physical Plan Generation: Logical Q Plan is converted into physical query plan by selecting algorithms and order of execution of these operator.


Scanning tables

Scanning Tables

5

  • One of the basic thing we can do in a Physical query plan is to read the entire contents of a relation R.

  • Variation of this operator involves simple predicate, read only those tuplesof the relation R that satisfy the predicate.

  • Basic approaches to locate the tuplesof a relation R

    • Table Scan

      • Relation R is stored in secondary memory with its tuples arranged in blocks

      • It is possible to get the blocks one by one

    • Index-Scan

      • If there is an index on any attribute of Relation R, we can use this index to get all the tuples of Relation R


Sorting while scanning tables

Sorting While Scanning Tables

6

Number of reasons to sort a relation

  • Query could include an ORDER BY clause, requiring that a relation be sorted.

  • Algorithms to implement relational algebra operations requires one or both arguments to be sorted relations.

  • Physical-query-plan operator sort-scan takes a relation R, attributes on which the sort is to be made, and produces R in that sorted order


Parameters for measuring costs

Parameters for Measuring Costs

7

  • Parameters that affect the performance of a query

    • Buffer space availability in the main memory at the time of execution of the query

    • Size of input and the size of the output generated

    • The size of memory block on the disk and the size in the main memory also affects the performance

  • B: The number of blocks are needed to hold all tuples of relation R.

    Also denoted as B(R).

  • T is the number of tuples in relation R, also denoted as T(R).

  • V: The number of distinct values that appear in a column of a relation R

  • V(R, a)- is the number of distinct values of column for a in relation R


One pass algorithms for database operations

One-Pass Algorithms for Database Operations

The choice of an algorithm for each operator is an essential

part of the process of transforming a logical query plan into

a physical query plan.

  • Main classes of Algorithms:

    • Sorting-based methods

    • Hash-based methods

    • Index-based methods

  • Division based on degree difficulty and cost:

    • 1-pass algorithms

    • 2-pass algorithms

    • 3 or more pass algorithms


One pass algorithm methods

One-Pass Algorithm Methods

9

One-Pass Algorithms for Tuple-at-a-Time Operations: selection and projection

One-Pass Algorithms for Unary, fill-Relation Operations: Duplicate Elimination and Grouping

One-Pass Algorithms for Unary, fill-Relation Operations: Binary operations including Union, Intersection, Difference, Product and Join


Nested loop joins

Nested Loop Joins

  • Used for relations of any side.

  • Not necessary that relation fits in main memory

  • Uses “One-and-a-half” pass method in which for each variation:

  • One argument read just once.

  • Other argument read repeatedly.

  • Two kinds:

    • Tuple-Based Nested Loop Join

    • Block-Based Nested Loop Join


Improvement modification

Improvement & Modification

To decrease the cost

  • Method 1: Use algorithm for Index-Based joins

    • We find tuple of R that matches given tuple of S

    • We need not to read entire relation R

  • Method 2: Use algorithm for Block-Based joins

    • Tuples of R & S are divided into blocks

    • Uses enough memory to store blocks in order to reduce the number of disk I/O’s.


Physically unrealizable behaviors

Physically Unrealizable Behaviors

Read too Late

Transaction T tries to read too late


Cs 257 chapter 15 9 summary of query execution database systems the complete book

Write too Late

Transaction T tries to write too late


Problem with dirty data

Problem with dirty data

T could perform a dirty read if it is reads X


Cs 257 chapter 15 9 summary of query execution database systems the complete book

A write is cancelled because of a write with a later timestamp, but the writer then aborts


Timestamps vs locks

Timestamps Vs Locks


Two passed algorithm based on hashing

Two passed Algorithm based on hashing

Hashing is done if the data is too big to store in main memory buffers.

  • Hash all the tuples of the argument(s) using an appropriate hash key.

  • For all the common operations, there is a way to select the hash key so all the tuples that need to be considered together when we perform the operation have the same hash value.

  • This reduces the size of the operand(s) by a factor equal to the number of buckets.


Cs 257 chapter 15 9 summary of query execution database systems the complete book

Steps to be followed for a Two passed Algorithm based on hashing

  • Duplicate Elimination

  • Grouping and Aggregation

  • Union, Intersection, and Difference

  • Hash-Join Algorithm


Sort based vs hash based

Sort based Vs Hash based

  • For binary operations, hash-based only limits size to min of arguments, not sum

  • Sort-based can produce output in sorted order, which can be helpful

  • Hash-based depends on buckets being of equal size

  • Sort-based algorithms can experience reduced rotational latency or seek time


15 6 index based algorithms

15.6Index based Algorithms

  • Clustered Relation: Tuples are packed into roughly as few blocks as can possibly hold those tuples

  • Clustering indexes: Indexes on attributes that all the tuples with a fixed value for the search key of this index appear on roughly as few blocks as can hold them

  • A relation that isn’t clustered cannot have a clustering index

  • A clustered relation can have nonclustering indexes


Thank you

Thank You..!!


  • Login