1 / 47

Building Highly Available Systems with SQL Server™ 2005

Building Highly Available Systems with SQL Server™ 2005. Availability What does it mean to you?. Why not? Site is unavailable System is unavailable Database is unavailable Database is partially un available Table is unavailable Data is unavailable.

elvis-young
Download Presentation

Building Highly Available Systems with SQL Server™ 2005

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Building Highly Available Systems with SQL Server™ 2005

  2. AvailabilityWhat does it mean to you? • Why not? • Site is unavailable • System is unavailable • Database is unavailable • Database is partially unavailable • Table is unavailable • Data is unavailable Can your customers get done, what they need to get done, when they need to do it? 24x7x365

  3. Take Advantage When?How much work to leverage the technology? Design and Architect Upgrade Immediate Minimal Work to Leverage • Online Index OperationsWhen Criteria NOT Met (minority) • Snapshot IsolationWith Update Conflict Detection • Replication • Online Index OperationsWhen Criteria Met • Snapshot IsolationStatement-level Snapshot • Snapshot IsolationTransaction-level Snapshot (RO) • Failover Clustering • Database Mirroring • Log Shipping • Database Snapshots • Partial Database Availability • Online Piecemeal Restore • Instant File Initialization • Fast Recovery • Improving Availability from Installation to Design • Availability in Layers to minimize downtime and data loss

  4. Improved Data Availability without Requiring Standby • Downtime is reduced and/or prevented for these barriers: • Database is partially unavailable • Table is unavailable • Data is unavailable

  5. Barriers to AvailabilityIsolated Failures • Continuing to work with isolated failures • Limiting the scope of failure • Partial Database Availability • Online Piecemeal Restore • Supporting Technology • Instant File Initialization • How do they work?

  6. What happens when… • Disks Fail • In SQL Server™ 2000 • Database is marked suspect • Users are unable to access the database • In SQL Server™ 2005 • Filegroup is marked offline • Users are able to access undamaged data

  7. What happens when… • Recovery begins • In SQL Server™ 2000 • Database is in a restoring state • Users are unable to access the database • File needs to be recreated and zero initialized • File Restore can proceed – offline • In SQL Server™ 2005 • Filegroup is in a restoring state • Users are able to access undamaged data • File can be recreated with instant file Initialization • Piecemeal Restore can proceed – online

  8. How is This Possible? • Fine grained operations are based on “functional partitioning” • Partitioning – in this sense – does not require Partitioned Tables • Partitioned Tables benefit significantly from fine grained operations • Partitioning for fine grained operations requires secondary, non-primary data files where data is strategically placed • Recovery of your damaged devices can be prioritized and then the database can be brought online in stages

  9. Functional PartitioningStrategies to separate Objects/Data • Related Object-groupings  Separate tables – strategically placed on different filegroups • Time-based data placement/partitioning  Structures designed for sliding window scenario • List-based groupings/partitioning  Range-based partitioning based on complete lists • To fully leverage Partial Database Availability for partitioned objects – use Partitioned Tables • Partitioned Tables – new feature in SQL Server™ 2005 to further simplify the process of building large data warehouses

  10. Benefits of Partitioning • Speed in managing sliding window  Partition manipulation outside of active table • Piecemeal Backup  Backup active components more frequently, inactive less frequently • Partial Database Availability  If a filegroup becomes unavailable the undamaged data remains available • Online Piecemeal Restore  During the restore, the undamaged data remains available

  11. Partial Database AvailabilityImproving Availability for Isolated Disaster • Undamaged data remains available while damaged data is inaccessible • File Status shown in sys.database_files catalog view • Page Errors written to suspect_pages table in msdb • Agent alerts: • Notification of the damaged file • Can take the database offline, if desired • Can automate the restore, for read-only data

  12. Database Components • Databaseconsists of… • Filegroupsconsist of… • Filesconsist of… • Extentsconsist of… • Pages consist of data TicketSalesDB Primary Readwritefilegroup File1 File2 2004 File3 Readonlyfilegroups 0 4 8 12 16 20 24 28 File Header 2003 File4 1 5 9 13 17 21 25 29 … 2002 File5 2 6 10 14 18 22 26 30 2001 File6 3 7 11 15 19 23 27 31 extent0 extent1 extent2 extent3 Log

  13. Improving Data Availability, Part IPartial Database Availability NameTitle Company

  14. Online Piecemeal RestoreImproving Availability during Recovery • Almost any component (page, file, filegroup) can be restored – ONLINE • If a page is damaged – restore only that page from a file, filegroup or database backup • If a file is damaged – restore only that file from a file, filegroup or database backup • If a filegroup is damaged – restore only that filegroup from a filegroup or database backup • Readonly filegroups can be restored without rolling forward log changes • Users can access the database during the restore

  15. Instant File InitializationImproving Availability by Reducing Downtime • SQL Server™ 2000 • All data and log files must be zero initialized • Downtime during recovery negatively impacted by the file creation phase of restore • SQL Server™ 2005 • Only log files must be zero initialized • Downtime during recovery significantly reduced by skipping zero initialization during the file creation phase of restore • Not only a benefit to Restore • Database Creation • All Restores: File, Filegroup and Database Restores • Database File Changes: autogrow, manual resizing

  16. Improving Data Availability, Part IOnline Piecemeal Restore NameTitle Company

  17. Summary: Isolated Failures

  18. Barriers to AvailabilityConcurrency Requirements • Database is available but the application/user cannot complete required operations • What about operational impacts? • Maintenance Operations which cause blocking  New Online Index Rebuilds • What about application impact? • Poorly designed and/or long running transactions • Varying data access patterns  New Snapshot Isolation options

  19. What happens when… • Indexes need to be rebuilt • In SQL Server™ 2000 • Index rebuilds require an exclusive table-level lock, resulting in offline rebuilds • Users are unable to access the table • In SQL Server™ 2005 • Rebuilds of an index can be performed online if a few simple criteria are met • Users are able to access the table

  20. Online Index OperationsImproving Concurrency during Index Maintenance • SQL Server™ 2000 • Offline Index Rebuilds; table data is unavailable during operation • Rebuild options: DBCC DBREINDEX and CREATE with DROP_EXISTING • SQL Server™ 2005 • Includes all of the above offline operations, plus… • New ALTER INDEX…REBUILD: • ONLINE – allows concurrent user access (queries as well as modifications) to the index during rebuild • OFFLINE – works using locks (same as SQL Server™ 2000) • If online is not possible by default, consider design alternatives to fully leverage online index rebuilds

  21. Online Index Rebuilds NameTitle Company

  22. What happens when… • Readers and Writers desire the same data • In SQL Server™ 2000 • Locking is used to guarantee the intended level of isolation • Users must wait to access locked data • Concurrency and performance compromised • Correctness is compromised when lower isolation levels are used to avoid locking • In SQL Server™ 2005 • Locking OR Versioning can be used to guarantee a variety of isolation levels • With versioning, Readers won’t block writers and writers won’t block readers • Performance improved if contention was primary bottleneck • Correctness is not compromised due to use of lower isolation levels

  23. Snapshot IsolationImproving Concurrency in Mixed Workloads • SQL Server™ 2000 • Isolation implemented solely through locking • Mixed workloads may experience: • Concurrency problems due to blocking • The Inconsistent Analysis problem • SQL Server™ 2005 • Isolation implemented using locking and versioning • Mixed workloads can improve read consistency and performance using: • Read committed with Statement-level snapshot to improve statement-level consistency • Snapshot Isolation to improve transaction-level consistency

  24. Snapshot Isolation NameTitle Company

  25. Summary: Concurrency Requirements

  26. Improved Availabilitywith Standby Technologies • Downtime is reduced and/or prevented for these barriers: • Site is unavailable • System is unavailable • Database is unavailable

  27. Barriers to AvailabilityCatastrophic Failures • Database is completely unavailable • Server is unavailable • Site is unavailable • Standby Technologies • Failover Clustering • Database Mirroring • Replication • Log Shipping • Supporting Technology • Fast Recovery • How do they work?

  28. Failover Cluster Failover ClusteringServer-level Redundancy • Established High Availability Technology • Hot Standby: Automatic Detection and Automatic Failover • No work loss exposure and no direct impact to workload • Protects against node failures • Geographically DispersedFailover Clusters with approved hardware • Recovery on failover improved by Fast Recovery

  29. Failover ClusteringNew for SQL Server™ 2005 • Faster Failover through Fast Recovery • Supports up to an 8-node Failover Cluster with Enterprise Edition • Supports up to a 2-node Failover Cluster with Standard Edition • Supports mounted volumes for better explicit disk usage – helps in server consolidation • Supports dynamic AWE for better memory utilization • Unattended setup • All SQL Server data services participate • Database Engine, SQL Server Agent, Full-Text Search • Analysis Services – Now has multiple instances

  30. Fast RecoveryImproving Availability by Reducing Downtime • Not only beneficial to Failover Clustering • On every server startup, Restart Recovery runs to guarantee consistency • Restart Recovery has two phases: • REDO: rolls forward committed transactions • UNDO: rolls back any incomplete transactions • In SQL Server™ 2005, users are allowed access after REDO SQL Server™ 2005 SQL Server™ 2000 Redo Undo ONLINE ONLINE

  31. Database MirroringDatabase-level Redundancy • Upcoming High Availability Technology • Released for testing and prototyping in SQL Server™ 2005 RTM • Certified for Production Use in the first half of 2006 • Supports three configurations: • High Availability • High Protection • High Performance

  32. Database MirroringTechnology Overview • Principal Database handles user activity • Mirror Database receives changes via secure, dedicated TCP channel • Server does NOT require a license if the server acts solely for redundancy • Optional Witness Server • Lightweight mechanism to help provide quorum • Can run on any SQL Server Edition • Supports three configurations: • High Availability • High Protection • High Performance

  33. Database MirroringBasic Principal of Synchronous Mirroring Acknowledge Commit Acknowledge Constantly Redoing on Mirror Transmit to Mirror Write to Local Log Committed in Log Write to Remote Log DB Log Log DB

  34. Database MirroringConfiguration Summary • Mirror database is available within seconds of failover • Mirror database is available for read-only analysis through the use of Database Snapshots High Availability High Protection High Performance • Automatic Detection • Automatic Failover • Uses synchronous form of mirroring • Requires Witness • Principal performance is affected by network speed and distance • No Automatic Detection • Manual Failover • Uses synchronous form of mirroring • Does not require Witness • Principal performance is affected by network speed and distance • No Automatic Detection • Manual Failover • Uses asynchronous form of mirroring • Does not require Witness • Principal performance is NOT affected by network speed and distance

  35. Database Mirroring NameTitle Company

  36. Database Scale OutPeer to Peer Replication • Identical databases continuously synchronize in near real time • Scale query workloads beyond what’s possible with a single database Example: Distributed Trading System Chicago London Tokyo

  37. Availability through ScalabilityPeer to Peer Replication • Enables load-balancing and improved availability through scalability • Database failures shouldn’t bring down the application system • Database upgrades should be done without outages • Individual databases can be taken online/offline and maintained without application downtime • Warm Standby • Small possibility of some data loss on failure

  38. Peer-to-Peer Replication • Based on Established Transaction Replication Technology • Based on Bi-directional Transactional Replication • All participants are peers • Schema is identical on all sites • Publish the updates made on “their” data • Subscribe to others to pick up their changes • No hierarchy as in “normal” transactional replication • A given set of data can be updated at only one site at a time • Data “ownership” is purely logical; does not prevent conflicts • SQL Server prevents a change from round-tripping

  39. Logreader Agent Logreader Agent Logreader Agent Dist DB Dist DB Dist DB Distribution Agent Distribution Agent Distribution Agent Peer to Peer Topology London Chicago Peer to PeerTransactional Replication Tokyo

  40. Peer-to-Peer Replication NameTitle Company

  41. Log ShippingDatabase-level Redundancy • Established High Availability Technology • Supports multiple secondary servers • Secondary for Failover • Secondary for Reporting • Secondary with delay for Human Error Recovery • Can be combined with other technologies such as Failover Clustering and Database Mirroring • New for SQL Server™ 2005 • Integration in SQL Server Management Studio • Log Shipping is not delayed during Database or Differential Backups

  42. Summary: Standby Technologies

  43. Barriers to Availability I’m going to modify this data…right… People here! This job would be great if it weren’t for……the users …the staff …us

  44. Barriers to Availability Many more barriers than discussedOnly some are addressable by database technologyBe sure to consider people, planning, procedures and training • Microsoft SQL Server™ 2005 gives you greatly improved tools to overcome these barriers to availability: • Database Server Failure or Disaster • Isolated Disk Failure • Data Access Concurrency Limitations • Database Maintenance and Operations • Availability at Scale • User or Application Error

  45. Summary • SQL Server™ 2005 offers greater availability – immediately • Many technologies available just by upgrading! • Some architected/implemented over time • SQL Server™ 2005 is more Available • Partially damaged databases remain available • Databases being recovered remain available • Instant File Initialization, Fast Recovery • New and Improved Replication Alternatives • SQL Server™ 2005 is more Robust

  46. Take Advantage When?How much work to leverage the technology? Design and Architect Upgrade Immediate Minimal Work to Leverage • Online Index OperationsWhen Criteria NOT Met (minority) • Snapshot IsolationWith Update Conflict Detection • Replication • Online Index OperationsWhen Criteria Met • Snapshot IsolationStatement-level Snapshot • Snapshot IsolationTransaction-level Snapshot (RO) • Failover Clustering • Database Mirroring • Log Shipping • Database Snapshots • Partial Database Availability • Online Piecemeal Restore • Instant File Initialization • Fast Recovery • Improving Availability from Installation to Design • Availability in Layers to minimize downtime and data loss

More Related