Module 4 DesigningDatabases for Optimal Performance
Module Overview • Guidelines for Designing Indexes • Designing a Partitioning Strategy • Designing a Plan Guide • Designing Scalable Databases
Lesson 1: Guidelines for Designing Indexes • Guidelines for Selecting a Clustered Index • Guidelines for Selecting a Nonclustered Index • Guidelines for Selecting a Filtered Index • Guidelines for Selecting a Computed Column Index • Guidelines for Selecting a Strategy for Index Compression • Discussion: Using Indexing
Guidelines for Selecting a Clustered Index ü Create a clustered index on the frequently used columns ü Consider clustered index data types and column widths ü Consider the frequency of data changes Clustered Index
Guidelines for Selecting a Nonclustered Index id id indid = 2 indid = 2 root root Consider performance gain versus maintenance cost ü Index on frequently used search arguments ü Consider nonclustered indexes for columns with high selectivity ü Akers Akers Akers Akers Ganio Ganio … … Martin Martin ... ... Consider placing nonclustered indexes on foreign key columns ü Choose a nonclustered index to cover the query ü Consider using included columns ü sys.sysindexes Mather Owen 4:706:04 4:707:02 Page 12 - Root Page 12 - Root Nonleaf Level Martin Martin Page 37 Page 28 Page 37 Page 28 Martin Martin Martin Martin Smith Smith ... ... Leaf Level(Key Value) Page 51 Page 51 Page 61 Page 71 Page 61 Page 71 Page 41 4:706:03 Martin 4:708:01 Ganio Martin Smith 4:708:01 4:709:01 Akers 4:706:01 4:708:04 Mather 4:706:04 Hall Smith Mather Mather 4:706:04 4:706:04 4:709:03 Barr 4:705:03 Owen 4:707:02 Owen 4:707:02 Jones Owen 4:707:02 Smith 4:707:01 4:709:02 Con 4:704:01 Consider using sys.sysindexes to gather information about an index ü
Guidelines for Selecting a Filtered Index Create filtered indexes for heterogeneous data ü Create filtered indexes for subsets of data ü Compare views with filtered indexes ü Include a small number of key or included columns in a filtered index definition ü Use filtered indexes when columns contain well-defined subsets of data ü Compare indexed views with filtered indexes ü Use data conversion operators in the filter predicate ü Use referencing dependencies ü
Guidelines for Selecting a Computed Column Index • Choose a deterministic and precise computed column expression • Assign only values of other columns in the same row • Assess benefits for common or important queries • Assess performance cost against performance gain • Use CLR functions in computed columns to restrict access
Guidelines for Selecting a Strategy for Index Compression Compress tables with row size less than 8,060 bytes ü For individual partitions, set the compression type to NONE and for a list of partitions, set the type to ROW ü Avoid specifying out-of-range partitions ü Rebuild a heap to compress new pages allocated to the heap ü Compresses Nonclustered indexes individually ü Rebuild all the nonclustered indexes on the table to compress a heap ü Enable or disable ROW or PAGE compression online or offline ü Non–leaf-level pages do not receive page compression when compressing indexes ü Data compression is not available for data that is stored separately ü
Discussion: Using Indexing • Is it necessary for every table to have a clustered index? Justify your answer. • An Orders table has a clustered index on the InvoiceNumber (int). The most frequently executed queries use SARG arguments on the OrderDate (datetime) column. A nonclustered index has been created on the OrderDate column. What are the advantages and disadvantages of this clustered index?
Lesson 2: Designing a Partitioning Strategy • Overview of Partitioning • Guidelines for Planning Partitioned Tables and Indexes • Designing Partitions to Manage Subsets of Data • Designing Partitions to Improve Query Performance • Special Guidelines for Partitioned Indexes • Discussion: Using Partitioning
Overview of Partitioning Partitioning helps to break a large table into multiple physical files without comprising the integrity or structure of the database Advantages of Partitioning When to Implement Partitioning? • Partitioning makes large tables or indexes more manageable • Partitioned tables and indexes support designing and querying • Maintenance operations performed on subsets of data can be performed more efficiently • Partitioning a table or index might improve query performance • Implement partitioning when: • The table contains, or is expected to contain data that is used in different ways • Queries or updates against the table are not performing as intended • Maintenance costs exceed predefined maintenance periods
Guidelines for Planning Partitioned Tables and Indexes Partition scheme Maps each partition specified by the partition function to a filegroup Partition function Defines how the rows of a table or index are mapped to partitioning columns
Designing Partitions to Manage Subsets of Data Adding a table as a partition to an already existing partitioned table Switching a partition from onepartitioned table to another Removing a partition to form a single table
Designing Partitions to Improve Query Performance Partitioning for Join Queries Taking Advantage of Multiple Disk Drives Controlling Lock Escalation Behavior
Special Guidelines for Partitioned Indexes Partitioning Unique Indexes ü Partitioning Clustered Indexes ü Partitioning Nonclustered Indexes ü Memory Limitations and Partitioned Indexes ü
Discussion: Using Partitioning • What problems does table partitioning solve? How? • Please explain how to create a table partition, identifying the T-SQL object and statement level support
Lesson 3: Designing a Plan Guide • Overview of Plan Guide • Guidelines for Designing Plan Guides • Designing Plan Guides for Parameterized Queries • Discussion: Using Plan Guides
Overview of Plan Guide Plan guides in SQL Server are useful when a small subset of queries in a database application deployed from a third-party vendor are not performing as expected. Plan guides influence optimization of queries by attaching query hints or a fixed query plan to them Types of plan guides include: • Object plan guide • SQL plan guide • Template plan guide
Guidelines for Designing Plan Guides Attach a query plan to a plan guide Attach query hints to plan guide ü ü Follow the plan guide that matches requirements ü Evaluate the plan guide effect on the plan cache ü
Designing Plan Guides for Parameterized Queries Create a plan guide of type TEMPLATE to force parameterization If the query is not already being parameterized by SQL Server by using the sp_executesql or the PARAMETERIZATION FORCED database SET option Obtain the parameterized form of the query by executing the sp_get_query_template Create a plan guide of type SQL on the parameterized query 1 2 3 To obtain the parameterized form of a query and create a plan guide on it, perform the following steps:
Discussion: Using Plan Guides • What problems does plan guide solve? How?
Lesson 4: Designing Scalable Databases • Guidelines for Scaling-Out Databases • Overview of Federated Databases • Selecting Federated Databases • Overview of Scalable Shared Databases • Guidelines for Selecting Scalable Shared Databases • Overview of Replication • Guidelines for Selecting Replication • Overview of Database Mirroring • Guidelines for Selecting Database Mirroring • Discussion: Using Scalable Databases
Guidelines for Scaling-Out Databases Scale out to multiple database servers and instances Scale out with redundancy Scale up for improved performance
Overview of Federated Databases SQL Server shares the database processing load across a group of servers that process database requests cooperatively. This cooperative group of servers is called a federation.
Selecting Federated Databases Distributed Partitioned Views Symmetric Partitions Asymmetric Partitions Distributed Partitioned Views Symmetric Partitions Asymmetric Partitions • Symmetric partitions are effective when: • Related data is put on the same member server • Data is partitioned uniformly across the member servers • Asymmetric partitions can: • Improve the performance of databases that cannot be symmetrically partitioned • Partition a large, existing system by using a series of iterative, asymmetric improvements • To use distributed partitioned views, consider the: • Pattern of SQL statements executed by an application • Relationships of the tables • Frequency of SQL statements against the partitions • SQL statement routing rules
Overview of Scalable Shared Databases Scalable shared databases let you attach a read-only reporting database to multiple server instances over a storage area network (SAN) SAN Benefits Limitations • Allows workload scale-out on reporting databases by using commodity servers and hardware • Provides workload isolation • Ensures identical views of reporting data from all servers • The database must be on a read-only volume • The data files can be accessed only over a SAN • The databases do not support database snapshots
Guidelines for Selecting Scalable Shared Databases • Verify that the reporting servers and associated reporting database are running on identical platforms • Update all reporting servers for a scalable shared database uniformly • Limit your scalable shared database configurations to eight server instances per shared database • Ensure that the reporting database has the same layout as the production database • Use a single path for the reporting database and the production database • Ensure that the scalable shared database is on a read-only volume that is accessible over your SAN from all the reporting servers • Ensure that all the server instances use the same sort order • Ensure that all the server instances use the same memory footprint
Overview of Replication Snapshot Replication Distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data Transactional Replication Takes an initial snapshot. Subsequent data changes and schema modifications are delivered to the Subscriber as they occur Merge Replication Takes an initial snapshot. Subsequent data changes and schema modifications are tracked with triggers Peer-to-Peer Replication Provides a scale-out and high-availability solution by maintaining copies of data across multiple server instances Replication
Guidelines for Selecting Replication Snapshot Replication Merge Replication Transactional Replication Peer-to-Peer Replication Snapshot Replication Merge Replication Transactional Replication Peer-to-Peer Replication • Create and secure the snapshot folder • Estimate the disk space required to transfer and store snapshot files • Schedule snapshots at off-peak hours • Set up a mail-enabled user account in Active Directory Domain Services (ADDS) • Ensure adequate space for the transaction log • Ensure adequate space for the distribution database • Declare primary keys for each published table • Consider the issues with using triggers • Consider using large object (LOB) data types • Use each node for its own distribution database • Avoid including tables in multiple peer-to-peer publications in a single publication database • Enable publications for peer-to-peer replication before creating subscriptions • Initialize subscriptions by using a backup • Avoid using identity columns • Ensure that any SELECT and INSERT statements that reference published tables use column lists • Filter out Timestamp columns during article validation • Specify a value of TRUE for the @stream_blob_columns parameter of sp_addmergearticle • Add a dummy UPDATE statement within a transaction • Track changes when performing bulk updates
Overview of Database Mirroring Benefits Witness Server (optional) Improved data protection Allows reporting of MirrorServer Improved database availability Data Flow Mirror Server Principal Server Improved availability of the production databaseduring upgrades Working of Database Mirroring
Guidelines for Selecting Database Mirroring Consider using the high-performance mode for disaster-recovery scenarios in which the principal and mirror servers are separated by a significant distance and where you do not want small errors to impact the principal server ü Consider using log shipping as an alternative to asynchronous database mirroring ü Consider setting the WITNESS property to OFF if the SAFETY property is set to OFF when you use Transact-SQL to configure high-performance mode ü When the principal server fails, you can: • Leave the database unavailable until the principal server becomes available • Manually update the database and then begin a new database mirroring session • Sparingly use forced service on the mirror server ü
Discussion: Using Scalable Databases • Federated databases can increase the total storage and performance in extremely high capacity or high performance systems. What is the single key element necessary to ensure that a query is executed on the server contains the appropriate data? • What is the primary problem that scalable shared databases solve? • A single table from the production database is required to be copied to a different database, on a different server instance. Select the best solution from the following options. Why?(A) Clustering, (B) Mirroring, (C) Replication
Lab 4: Designing Databases for Optimal Performance Virtual machine NYC-SQL1 Administrator User name Password Pa$$w0rd • Exercise 1: Applying Optimization Techniques • Exercise 2: Creating Plan Guides • Exercise 3: Designing a Partitioning Strategy Logon Information Estimated time: 60 minutes
Lab Scenario You are a lead database administrator at QuantamCorp. You are working on the Human Resources Vacation and Sick Leave Enhancement (HR VASE) project that is designed to enhance the current HR system of your organization. This system is based on the QuantamCorp sample database in SQL Server 2008. The main goals of the HR VASE project are as follows: • Provide managers with current and historical information about employee vacation and sick-leave data. • Provide permission to individual employees to view their vacation and sick-leave balances. • Provide permission to selected employees in the HR department to view and update employee vacation and sick-leave data. • Provide permission to the HR manager to view and update all data. •Ensure that the application uses the database in an optimal way and optimize the performance of reports for managers and HR personnel. You need to formulate a list of tasks that you would need to ensure optimal query performance. Before finalizing the task, you need to verify the result of each task. In this lab, you will examine the business requirements and identify different ways to improve performance. You will enhance the database performance by creating appropriate indexes, plan guide, and partition.
Lab Review • What is the purpose of examining the database model, schema, data metadata, and dynamic management views before you decide the course of action to improve query performance. • What is a plan guide? • You are developing a partitioning scheme for your application database. The table that you need to partition is sorted according to the date. Users usually access yearly data from that table. How would you design the partitioning scheme? • You are working on partitioning a data warehouse table by using a column that has the datetime datatype. Why you would you use RIGHT as the RANGE parameter for the partitioning scheme?
Module Review and Takeaways • Review Questions • Real-world Issues and Scenarios • List of Tools