Practical Database Design and Tuning. Hardware Raid San Nas Software Hashing B Trees Optimization . Disk Storage Devices. Parallelizing Disk Access using RAID Technology. Secondary storage technology must take steps to keep up in performance and reliability with processor technology.
Different raid organizations were defined based on different combinations of the two factors of granularity of data interleaving (striping) and pattern used to compute redundant information.
Different raid organizations are being used under different situations
Advantages of SANs are:
There are numerous methods for collision resolution, including the following:
- Fixed number of buckets M is a problem if the number of records in the file grows or shrinks.
- Ordered access on the hash key is quite inefficient (requires sorting the records).
Dynamic and Extendible Hashing Techniques
B-tree structures. (a) A node in a B-tree with q – 1 search values. (b) A B-tree of order p = 3. The values were inserted in the order 8, 5, 1, 7, 3, 12, 9, 6.
The nodes of a B+-tree. (a) Internal node of a B+-tree with q –1 search values. (b) Leaf node of a B+-tree with q – 1 search values and q – 1 data pointers.
Implementing the SELECT Operation (cont.):
Search Methods for Simple Selection:
Implementing the JOIN Operation
Use Common Sense
First do the WHERE conditions in the various tables and then do the JOIN
Use DISTINCT if possible before the join
In the FROM statement think about the order to select the tables, it might make it faster
Cost Components for Query Execution
Factors that Influence Physical Database Design
Analyzing the database queries and transactions
For each query, the following information is needed.
Factors that Influence Physical Database Design (cont.)
Analyzing the expected frequency of invocation of queries and transactions
20 % queries are made 80 % of the time
Physical Database Design Decisions
Statistics internally collected in DBMSs q –1 search values. (b) Leaf node of a B+-tree with q – 1 search values and q – 1 data pointers.:
Size of individual tables
Number of distinct values in a column
The number of times a particular query or transaction is submitted/executed in an interval of time
The times required for different phases of query and transaction processing
Statistics obtained from monitoring:
I/O and device performance statistics
Query/transaction processing statistics
Locking/logging related statistics
Index statisticsKnow your Data (sample size it)