Chapter 13: Query Processing. Database System Concepts.
1. Parsing and translation
Let M denote memory size (in pages).
Thus the total number of disk accesses for external sorting is:
$b_r \left( 2 \lceil \log_{M-1}(b_r / M) \rceil + 1 \right)$
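As a rough check of the cost expression above, the following Python sketch counts the merge passes and the resulting block transfers. The function name and the sample values are illustrative assumptions, not part of the original text. The passes are counted with integer arithmetic rather than a floating-point logarithm, which gives the same value as the ceiling of the logarithm whenever the relation does not fit in memory.

```python
import math

def external_sort_cost(b_r: int, M: int) -> int:
    """Estimate block transfers for external sort-merge.

    b_r: number of blocks in the relation (assumed name).
    M:   memory size in pages; M - 1 runs are merged per pass.
    """
    if M < 3:
        raise ValueError("merging needs at least 3 memory pages")
    # Run generation produces ceil(b_r / M) sorted runs.
    runs = math.ceil(b_r / M)
    # Each merge pass reduces the number of runs by a factor of M - 1.
    passes = 0
    while runs > 1:
        runs = math.ceil(runs / (M - 1))
        passes += 1
    # Every pass reads and writes each block; the "+ 1" term follows the formula.
    return b_r * (2 * passes + 1)

# Example: 1000 blocks with 11 memory pages needs 2 merge passes.
cost = external_sort_cost(1000, 11)
```

With 1000 blocks and 11 pages, run generation yields 91 runs, which two 10-way merge passes reduce to one, so the estimate is 1000 × (2 × 2 + 1) = 5000 block transfers.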
for each tuple t_r in r do begin
    for each tuple t_s in s do begin
        test pair (t_r, t_s) to see if they satisfy the join condition
        if they do, add t_r • t_s to the result.
    end
end
In the cost estimate for this algorithm, n_r is the number of records in r, and b_s and b_r are the numbers of disk blocks in s and r, respectively.
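The nested-loop join can be sketched in Python as follows. This is a minimal in-memory illustration, assuming relations are lists of Python tuples and the join condition is passed as a function; the names `nested_loop_join` and `theta` are assumptions made for the example, and no disk I/O is modelled.

```python
def nested_loop_join(r, s, theta):
    """Join r and s by testing every pair of tuples against theta."""
    result = []
    for t_r in r:                  # outer relation
        for t_s in s:              # inner relation, scanned once per outer tuple
            if theta(t_r, t_s):    # join condition
                result.append(t_r + t_s)  # concatenate the two tuples
    return result

# Example: equi-join on the first attribute of each tuple.
r = [(1, "a"), (2, "b")]
s = [(1, "x"), (3, "y")]
joined = nested_loop_join(r, s, lambda t_r, t_s: t_r[0] == t_s[0])
```

Because the inner relation is scanned once for every outer tuple, the number of pair tests is the product of the two relation sizes.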
for each block B_r of r do begin
    for each block B_s of s do begin
        for each tuple t_r in B_r do begin
            for each tuple t_s in B_s do begin
                check if (t_r, t_s) satisfy the join condition
                if they do, add t_r • t_s to the result.
            end
        end
    end
end
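A block-at-a-time version can be sketched in Python by representing each relation as a list of blocks, where a block is a list of tuples. The helper `to_blocks` and the block size are assumptions made for illustration; in a real system a block would be a disk page.

```python
def to_blocks(tuples, block_size):
    """Split a list of tuples into fixed-size blocks (hypothetical helper)."""
    return [tuples[i:i + block_size] for i in range(0, len(tuples), block_size)]

def block_nested_loop_join(r_blocks, s_blocks, theta):
    """Pair every block of r with every block of s, then join within the pair."""
    result = []
    for B_r in r_blocks:
        for B_s in s_blocks:       # each inner block is read once per outer block
            for t_r in B_r:
                for t_s in B_s:
                    if theta(t_r, t_s):
                        result.append(t_r + t_s)
    return result

r_blocks = to_blocks([(1, "a"), (2, "b"), (3, "c")], block_size=2)
s_blocks = to_blocks([(3, "x"), (1, "y")], block_size=2)
joined = block_nested_loop_join(r_blocks, s_blocks, lambda a, b: a[0] == b[0])
```

The result contains the same tuples as the tuple-at-a-time version; the difference is that the inner relation is scanned once per outer block rather than once per outer tuple.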
The hash-join of r and s is computed as follows.
When partitioning a relation, one block of memory is reserved as the output buffer for each partition.
2. Partition r similarly.
3. For each i:
This hash index uses a different hash function than the earlier one h.
For each tuple t_r, locate each matching tuple t_s in s_i using the in-memory hash index.
Output the concatenation of their attributes.
Relation s is called the build input and r is called the probe input.
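The partition, build, and probe phases can be sketched in Python as below. This is an in-memory illustration under stated assumptions: the function name, the key-extractor parameters `r_key` and `s_key`, and the partition count are hypothetical, partitions are kept in lists rather than written to disk, and a Python dict stands in for the in-memory hash index (its internal bucketing is separate from the partitioning function `h`).

```python
from collections import defaultdict

def hash_join(r, s, r_key, s_key, n_partitions=4):
    """Equi-join r and s on r_key(t_r) == s_key(t_s) using hash partitioning."""
    h = lambda v: hash(v) % n_partitions   # partitioning hash function

    # Partition both relations with the same function h.
    s_parts = defaultdict(list)
    r_parts = defaultdict(list)
    for t_s in s:
        s_parts[h(s_key(t_s))].append(t_s)
    for t_r in r:
        r_parts[h(r_key(t_r))].append(t_r)

    result = []
    for i in range(n_partitions):
        # Build: in-memory hash index on partition s_i (the build input).
        index = defaultdict(list)
        for t_s in s_parts[i]:
            index[s_key(t_s)].append(t_s)
        # Probe: look up each tuple of r_i (the probe input) in the index.
        for t_r in r_parts[i]:
            for t_s in index.get(r_key(t_r), []):
                result.append(t_r + t_s)
    return result

r = [(1, "a"), (2, "b"), (3, "c")]
s = [(3, "x"), (1, "y"), (1, "z")]
joined = sorted(hash_join(r, s, lambda t: t[0], lambda t: t[0]))
```

Tuples with equal join keys always land in partitions with the same index, so each r_i only needs to be compared with the matching s_i.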
$2(b_r + b_s) \lceil \log_{M-1}(b_s) - 1 \rceil + b_r + b_s$
Keep the first partition of the build relation in memory.
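Join with a conjunctive condition: $r \bowtie_{\theta_1 \land \theta_2 \land \dots \land \theta_n} s$
Remaining conditions: $\theta_1 \land \dots \land \theta_{i-1} \land \theta_{i+1} \land \dots \land \theta_n$
Join with a disjunctive condition: $r \bowtie_{\theta_1 \lor \theta_2 \lor \dots \lor \theta_n} s$
Computed as the union of the individual joins: $(r \bowtie_{\theta_1} s) \cup (r \bowtie_{\theta_2} s) \cup \dots \cup (r \bowtie_{\theta_n} s)$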
r ∩ s: Output tuples in s_i to the result if they are already in the hash index.
r – s: For each tuple in s_i, if it is in the hash index, delete it from the index.
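The two hash-based set operations can be sketched in Python as follows. The sketch assumes that both relations are duplicate-free lists of tuples, that each partition's index is built on r_i, and that for the difference the tuples remaining in the index after processing s_i are emitted as the result; the function name and the `op` parameter are hypothetical.

```python
def hash_set_op(r, s, op, n_partitions=4):
    """Compute r ∩ s (op="intersect") or r – s (op="difference") by hashing."""
    h = lambda t: hash(t) % n_partitions
    r_parts = [[] for _ in range(n_partitions)]
    s_parts = [[] for _ in range(n_partitions)]
    for t in r:
        r_parts[h(t)].append(t)
    for t in s:
        s_parts[h(t)].append(t)

    result = []
    for r_i, s_i in zip(r_parts, s_parts):
        index = set(r_i)               # in-memory hash index on r_i
        if op == "intersect":
            for t in s_i:
                if t in index:         # already in the index: part of r ∩ s
                    result.append(t)
        elif op == "difference":
            for t in s_i:
                index.discard(t)       # delete matching tuples from the index
            result.extend(index)       # what remains belongs to r – s
        else:
            raise ValueError("op must be 'intersect' or 'difference'")
    return result

r = [(1,), (2,), (3,)]
s = [(2,), (3,), (4,)]
common = sorted(hash_set_op(r, s, "intersect"))
only_r = sorted(hash_set_op(r, s, "difference"))
```

Identical tuples hash to the same partition, so each pair (r_i, s_i) can be processed independently of the others.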
In the process of generating the internal form of the query, the parser checks the syntax of the user’s query, verifies that the relation names appearing in the query are names of relations in the database, and so on.
If the query was expressed in terms of a view, the parser replaces all references to the view name with the relational-algebra expression to compute the view.
It is the responsibility of the query optimizer to transform the query as entered by the user into an equivalent query that can be computed more efficiently.
Chapter 14 covers query optimization.
We can handle complex selections by computing unions and intersections of the results of simple selections.
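As an illustration of this idea, the Python sketch below evaluates each simple selection to a set of record identifiers and then combines the sets. The sample relation, its attribute names, and the helper `select_ids` are assumptions made for the example; a real system would typically obtain the identifiers from indices rather than by scanning.

```python
def select_ids(records, predicate):
    """Return the identifiers of the records satisfying a simple condition."""
    return {rid for rid, rec in records.items() if predicate(rec)}

# Hypothetical relation: record id -> record.
records = {
    1: {"branch": "Perryridge", "balance": 400},
    2: {"branch": "Perryridge", "balance": 900},
    3: {"branch": "Downtown", "balance": 1200},
}

in_perryridge = select_ids(records, lambda rec: rec["branch"] == "Perryridge")
high_balance = select_ids(records, lambda rec: rec["balance"] > 500)

conjunction = in_perryridge & high_balance   # both conditions hold
disjunction = in_perryridge | high_balance   # at least one condition holds
```

A conjunctive selection corresponds to the intersection of the identifier sets and a disjunctive selection to their union; the records themselves are fetched only for the identifiers that survive.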
Parsing of query languages differs little from parsing of traditional programming languages.
Based on performance studies conducted in the mid-1970s, database systems of that period used only nested-loop join and merge join.
These studies, which were related to the development of System R, determined that either the nested-loop join or the merge join nearly always provided the optimal join method (Blasgen and Eswaran); hence, these two were the only join algorithms implemented in System R.
The System R study, however, did not include an analysis of hash join algorithms. Today, hash joins are considered to be highly efficient.