1 / 18

Storage Considerations for the Physical Model

Storage Considerations for the Physical Model. Database Sizing. Sizing influences capacity planning and systems environment management. Sizing is required for: The database Other storage areas Indexes Sizing is not a science. Techniques vary. DWM suggests when to perform sizing.

yasir-hill
Download Presentation

Storage Considerations for the Physical Model

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Storage Considerations for the Physical Model

  2. Database Sizing • Sizing influences capacity planningand systems environment management. • Sizing is required for: • The database • Other storage areas • Indexes • Sizing is not a science. • Techniques vary. • DWM suggests when to perform sizing.

  3. Estimating the Database Size 1. Estimate the size of each row in the fact table. 2. Determine the grain of each dimension and estimate the number of entries in the finest level. 3. Multiply the number of rows of all dimensions and multiply the result by the fact table row size. 4. Determine whether the fact table is sparse or dense and estimate the reduction or increase in size.

  4. Validating Database Size Assumptions • After you estimate the size of the database, you can • validate your assumptions by doing the following: • Extract sample files • Load data into the database • Compute exact expected row lengths • Add overhead for indexing, rollback and temporary tablespaces, aggregates, views, and a file system staging area for flat files

  5. Example: Estimating the Database Size Description Estimation Estimate the size 52 bytes (assumed for this example) of one row of the fact table Estimate the Channel 3 channels entries in the Customer 63 ship_to_locations lowest level Product 36 items within each History 48 months dimension Multiply t he # of (3 x 63 x 36 x 48) x 52 = 16,982,784 bytes entries for each dimension and multiply the result by the fact table row size Sparsity is low, 16,982,784 * .10 = 1,698,278 adjust by 10% 16,982,784 - 1,698,278 = 15,284,506 bytes Estimated 15.3 MB database size

  6. Applying the Test Load Sampling • Analyze statistically significant data samples • Use test loads for different periods • Reflect day-to-day operations • Include seasonal data and worst-case scenarios: • Calculate the number of transactions • Use the average sales price approach • Consider indexes and summaries

  7. Test Load Sampling Using the Server • Load a sample of data. • Query to determine the number of rows per block. • Estimate based on the number of rows. Query 3000+rows

  8. ORDER_LINE Order_PK Order_Line_Num Item_FK Units PRODUCT Item_PK Package Item_Desc Item_Source Family_PK Class_PK Indexes Unique index Primarykey Nonunique index Foreignkey Unique index Primarykey Database Table data blocks Index data blocks

  9. Indexing Types • B*tree index • Bitmap index

  10. Indexing Types • B*tree index: small number of distinct values in a particular column • Bitmap index: used for star query transformations

  11. Indexing Strategies ? ? ? Columns, tables, and index types size = ‘SMALL’ 0 0 1 0 0 1 0 1 0 1 0 0 0 1 0 1 size = ‘MED’ 1 0 0 0 1 0 1 0 0 0 0 1 0 1 0 0 size = ‘LARGE’ 0 1 0 1 0 0 0 0 1 0 1 0 1 0 1 0 color = ‘BLUE’ 0 1 0 1 0 0 1 0 1 0 1 0 0 0 1 0 color = ‘RED’ 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 1 color = ‘GREEN’ 1 0 0 0 0 1 0 1 0 1 0 1 0 1 0 0 Result: color = blue and size = medium or large

  12. B*tree Index KING KING KING MILLER TURNER BLAKE BLAKE JAMES TURNER WARD MILLER SCOTT SMITH KING MARGIN JAMES JONES BLAKE CLARK FORD ADAMS ALLEN BLAKE-ROWID CLARK-ROWIDFORD-ROWID

  13. Bitmap Indexes • Store values as 1s and 0s • Are used instead of B*tree indexes when: • Tables are large • Columns have low cardinality • Multiple columns are constrained in the same query

  14. 101 single east male bracket_1 102 married central female bracket_4 103 married west female bracket_2 104 divorced west male bracket_4 105 single central female bracket_2 106 married central female bracket_3 REGION='east' REGION = 'central' REGION = 'west' 1 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 1 0 Bitmap Index Example CUSTOMERtable CUSTOMER_NBR MARITAL_STATUS REGION GENDER INCOME_LEVEL CREATE BITMAP INDEX REGION_IDX ON CUSTOMER(REGION); Sample bitmap index on the REGION column

  15. 0 0 0 0 1 0 1 1 col2 AND 0 OR 1 = 1 1 col3 0 1 0 1 1 0 0 1 1 0 1 1 col 5 Bitmap Index Example SELECT COUNT(*) FROM CUSTOMER WHERE MARITAL_STATUS = ‘married’ AND REGION IN (‘central’, ‘west’); REGION= 'west' Query Result REGION='central' MARITAL_STATUS ='married' 0 0 1 1 1 AND = 1 0 0 0 0 1 1

  16. Partitioning Tables and Indexes • Large tables and indexes can be partitioned into • smaller, more manageable pieces. Tablespace Segment Extent Blocks Structure

  17. Advantages of Partitioning • Consider using partitioning for: • Very large databases (VLDBs) • Reduction of down time for scheduled maintenance and reloading • Reduction of down time for data failure • Decision support systems (DSS) performance • I/O performance • Disk striping • Partition transparency

  18. Partitioning • Partition table and index data by: • Time • Sales • Geography • Organization • Line of business • Partition by time

More Related