1 / 21

Planning Warehouse Storage

Planning Warehouse Storage. Chapter 9. Data Partitioning. Ease of: - Restructuring - Reorganization - Removal - Recovery - Monitoring - Management - Archiving - Indexing. Breaking up a data into separate physical units that can be handled

Download Presentation

Planning Warehouse Storage

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Planning Warehouse Storage Chapter 9

  2. Data Partitioning • Ease of: - Restructuring - Reorganization - Removal - Recovery - Monitoring - Management - Archiving - Indexing • Breaking up a data into separate physical units that can be handled independently Add Order table Drop Other data is not affected

  3. Objects to Partition • Tables: - Fact - Dimension • Indexes

  4. Horizontal Partitioning • Table and index data are split by: - Time - Sales region or person - Geography - Organization - Line of business • Candidate columns appear in WHERE clause • Analysis determines requirement

  5. Vertical Partitioning You may use vertical partitioning when: • Speed of query and update actions is improved by it • Users require access to specific columns • Some data is changed infrequently • Descriptive dimension text may be better moved away from the dimension itself

  6. Partitioning Methods • Range partitioning (Oracle8 and Oracle8i) • Hash partitioning (Oracle8i) • Composite partitioning (Oracle8i)

  7. Star Query Optimization Optimum performance with star schema models 1. Dimensions are queried to create a 2. Cartesian product, computed against 3. Smaller reference table. 4. The result is joined to 5. A fact table to produce a query result.

  8. Star Transformation Market_Table Time_Table Fact_Table Product_Table

  9. Indexing Indexing is used because: • It is huge cost saving, greatly improving performance and scalability • Can replace a full table scan by a quick read of the index followed by a read of only those disk blocks that contain the rows needed

  10. B-Tree Index • Most common type of indexing • Used for high cardinality columns • Designed for few rows returned

  11. Bitmap Indexes • Provide performance benefits and storage savings • Store values as 1s and 0s • Use instead of B-tree indexes when: - Tables are large - Columns have relatively low cardinality

  12. Oracle8 and Oracle8i Index Enhancements • Oracle8 index enhancements: - Partitioned index - Index-organized tables • Oracle8i index enhancements: - Function-based index - New bitmap index improvements - Online index build and rebuild - Descending index - Statistics can be collected when an index is created

  13. Protecting the Database • RAID is essential with large databases • RAID improves: - Reliability - Storage management • There are different levels of RAID • You can eliminate disk contention with disk striping

  14. RAID 0: Striping Disk array controller The file is written to a four-drive disk array: • Block 1 on Drive 1 • Block 2 on Drive 2… • Block 5 in another sector on Drive 1

  15. RAID 0: Striping • Benefits: - Good for simultaneous reads and writes - No redundancy - Scalable • Limitations: - Not recommended for mission-critical systems - No recovery from data loss - One bad sector affects entire disk of data

  16. RAID 1: Mirrored Disk Disk array controller Disk 1 Disk 1 Mirror Disk 2 Disk 2 Mirror Copy of files stored on mirror disk

  17. RAID 1: Mirrored Disk • Benefits: - Complete data redundancy - No performance penalty - Improves reads - Scalability • Limitations: - Highest cost of all RAID configurations

  18. RAID 5: Independent Disk Array Disk array controller Disk 1 Disk 2 Disk 3 Disk 4 Data striped with parity across array

  19. RAID 5: Independent Disk Array • Benefits: - Efficient data integrity - Data reconstruction - Multiple concurrent seeks across array - Scalable • Limitations: - Disk overhead - Data write rate

  20. Backup • Plan at the design stage • Use hot backups for VLDBs • Back up necessary components: - Fact and dimension data - Warehouse schema - Metadata schema - Metadata • Export/Import utility - Disk space - Time

  21. Summary This lesson discussed the following topics: • Explaining vertical partitioning and horizontal partitioning • Distinguishing the different types of partitioning methods • Distinguishing between B-tree index and bitmap index • Understanding why warehouse typically uses RAID 0, or 5 to protect the database

More Related