1 / 38

Data Management and Index Options for SQL Server Data Warehouses

Data Management and Index Options for SQL Server Data Warehouses. Atlanta MDF. Lance England. http://lance-england.com. Goal. To present options to consider for your data warehouse. The options can be used separately or combined. Agenda. Data Warehouse Overview Partitioned Views

dunn
Download Presentation

Data Management and Index Options for SQL Server Data Warehouses

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Management and Index Options for SQL Server Data Warehouses Atlanta MDF

  2. Lance England http://lance-england.com

  3. Goal To present options to consider for your data warehouse. The options can be used separately or combined.

  4. Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  5. Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  6. Data Warehouse Overview

  7. Dimensional Model

  8. Dimension Tables • Many columns, denormalized for speed and simplicity • Have both surrogate keys and business keys • Often capture changes over time (slowly changing dimension)

  9. Fact Tables • A few columns that capture the measures of interest • Many key columns to dimension tables • Can grow very large To improve the scalability and manageability of large tables, we can partition the data. We can do this with partitioned views and/or partitioned tables.

  10. Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  11. Partitioned Views (SQL 7) • Multiple tables, but presented as a single logical entity with a view • CHECK constraint on each table enables partition elimination in queries • Use UNION ALL in view definition to prevent engine from attempting de-duplication

  12. Partitioned Views (SQL 7) Pros • Adding/dropping tables is easy • Able to update statistics for a single partition (table) • Can partition on multiple columns Cons • Changing schema/indexes is multiplied by number of tables

  13. Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  14. Partitioned Tables (SQL 2005) • The TABLE is the single logical entity, with partitioning abstracted away by the storage engine • Implemented with partition function and partition scheme • Can participate in partition elimination

  15. Partitioned Tables (SQL 2005) Pros • ALTER only one table • Uniform indexes help query analyzer • Can load/drop partitions with partition switching Cons • Can’t update statistics at partition level • Partition switching more complicated than adding/dropping tables • Single partitioning column only

  16. Partition Switching • Quickly add/remove partitions with data • Create a separate schema-identical table (with a CHECK constraint if switching IN) • Minimally logged meta-data operation

  17. SPLIT/MERGE Warning Avoid split or merge of populated partitions. Splitting or merging populated partitions can be extremely inefficient, as this may cause as much as 4 times more log generation, and also cause severelocking. http://sqlcat.com/sqlcat/b/top10lists/archive/2008/02/06/top-10-best-practices-for-building-a-large-scale-relational-data-warehouse.aspx

  18. Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  19. Compression (SQL 2008) • Trading CPU usage for less I/O (more rows per page on disk and cache) • Object-level, not a database setting • Two types: Row or Page compression • Most opinions are to opt for page compression first, and scale back to row compression (or none) if CPU usage is unacceptable

  20. Row Compression • Removes unused space at data-type level • Variable-length storage for numbers and dates • Variable-length storage for CHAR, NCHAR • Optimizes storage of NULL and zero

  21. Page Compression • Row compression • Prefix compression – replaces repeated sequences in each column with a reference to the CI structure • Dictionary compression – replaces repeated sequences on the page with a reference to the CI structure

  22. Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  23. Indexed Views (SQL 2005) • Great for aggregating data • Longlist of restrictions (see link): • When aggregating • Clustering key can only reference columns in GROUP BY • View definition must contain COUNT_BIG(*) • Can participate in partition switching (with extra steps) http://msdn.microsoft.com/en-us/library/ms191432.aspx

  24. Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  25. Filtered Indexes (SQL 2008) • Index with a WHERE clause • Filtering column does not have to be included in index • Best for • Sparse columns (many NULLs) • Heterogeneous columns with categories of data (think FK columns) • Columns with ranges of values such as dollar amounts, time, and dates.

  26. Agenda Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  27. Columnstore Indexes (SQL 2012) Pros • Stores data by column; more efficient page retrieval • Optimized compression • Segment elimination (~1 million rows/per) • Batch-mode evaluation • Query operators pass batch of 1000 instead of row-by-row (not all operators supported) • Affected by #of cores and MAXDOP setting

  28. ColumnstoreIndexes Cons • Table can’t be updated once created • Workaround choices: • Disable index/load/rebuild • Partition switching • Partitioned view • Not good for single-record SEEK queries

  29. Clustered Columnstore (SQL 2014) • Less storage: Not a copy of data • Updatable • Improved batch mode support

  30. Clustered Columnstore

  31. Recap Data Warehouse Overview Partitioned Views Partitioned Tables Compression Indexed views Filtered indexes Columnstore indexes

  32. Linkapalooza DATA WAREHOUSE Microsoft EDW Architecture, Guidance and Deployment Best Practices http://msdn.microsoft.com/en-us/library/hh147624.aspx The Data Loading Performance Guide http://msdn.microsoft.com/en-us/library/dd425070(v=sql.100).aspx Top 10 Best Practices for Building a Large Scale Relational Data Warehouse http://sqlcat.com/sqlcat/b/top10lists/archive/2008/02/06/top-10-best-practices-for-building-a-large-scale-relational-data-warehouse.aspx

  33. Linkapalooza PARTITIONED TABLES Partitioned Table and Index Strategies Using SQL Server 2008 http://technet.microsoft.com/en-us/library/dd578580%28v=sql.100%29.aspx Strategies for Partitioning Relational Data Warehouses in Microsoft SQL Server http://technet.microsoft.com/en-us/library/cc966457.aspx Table Partitioning Resources http://www.brentozar.com/sql/table-partitioning-resources/ Query Processing Enhancements on Partitioned Tables and Indexes http://msdn.microsoft.com/en-us/library/ms345599.aspx

  34. Linkapalooza DATA COMPRESSION Data Compression: Strategy, Capacity Planning and Best Practices http://msdn.microsoft.com/en-us/library/dd894051%28v=sql.100%29.aspx

  35. Linkapalooza INDEXED VIEWS Creating Indexed Views (long list of restrictions) http://msdn.microsoft.com/en-us/library/ms191432.aspx Partition Switching with Indexed Views http://technet.microsoft.com/en-us/library/bb964715(v=sql.105).aspx FILTERED INDEXES Filtered Index Design Guidelines http://msdn.microsoft.com/en-us/library/cc280372(v=sql.100).aspx

  36. Linkapalooza COLUMNSTORE INDEXES Books Online http://msdn.microsoft.com/en-us/library/gg492088.aspx Columnstore Internals http://rusanu.com/2012/05/29/inside-the-sql-server-2012-columnstore-indexes/ SQL Server Columnstore Performance Tuning http://social.technet.microsoft.com/wiki/contents/articles/4995.sql-server-columnstore-performance-tuning.aspx#Ensuring_use_of_the_Fast_Batch_Mode_of_Query_Execution

  37. Linkapalooza CLUSTERED COLUMNSTORE INDEXES (SQL 2014) http://rusanu.com/2013/06/11/sql-server-clustered-columnstore-indexes-at-teched-2013/ TechEd Video http://channel9.msdn.com/Events/TechEd/NorthAmerica/2013/DBI-B322#fbid=rAFPjiEmlNt

  38. Thanks! http://lance-england.com

More Related