1 / 51

T-SQL Tools:

T-SQL Tools:. Simplicity for Synchronizing Changes Martin Perez. T-SQL Tools: Simplicity for Synchronizing Changes Abstract: One of most common problems a data professional may face is finding and syncing data changes efficiently between multiple systems.

doli
Download Presentation

T-SQL Tools:

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. T-SQL Tools: Simplicity for Synchronizing Changes Martin Perez

  2. T-SQL Tools: Simplicity for Synchronizing Changes Abstract: One of most common problems a data professional may face is finding and syncing data changes efficiently between multiple systems. In this session we will look into different methods focusing on effective use of T-SQL and explore how these methods can mixed and matched in your workflows. Target Audience: Any DBA or developer that has ever faced the problem of moving data changes across databases.

  3. The Story We have a third party application critical to the company.  Any changes we make to the database means we are now on an “unsupported configuration”. No table changes, no new indexes. Also, there is no consistent “updated date time” data for the tables we need. Querying the online system causes noticeable blocking. We could not get data in the window for daily reporting. Brute force – Scan all

  4. Good News: A combination of differential database restore to read only standby with full table scans to find new and updated rows.  By restoring the database to a reporting server with more cores and faster storage than the leased server housing the application database, we could brute force the table compare and run multiple queries in parallel to find new and updated rows to feed into the data warehouse. The simplistic approach met the needs and needed relative little work to maintain once the initial work was done. The DBAs were able to set up and schedule the back and restore jobs and the BI Developers were able to get the data. No one was a bottle-neck for the other. Brute force – Scan all

  5. The Downside: Backup, restore and file copy all take time and need to be accounted for. Your storage and network speeds can easily become a limiting factor if you have a tight window for reporting. As table size increases the time needed to read all data can become unreasonable.  Brute force – Scan all

  6. The Upside: Relatively easy to implement using  well understood technology and existing infrastructure. Easy to support, DBAs and developers could each do the work they best prepared for and it provided immediate answers to the business. It was relatively low-tech but provided immediate value. Brute force – Scan all

  7. Recap: Dual table scans Compare every column Watch for datatype mismatch Handle NULL Values Collation mismatch Case-Sensitive v Case-Insensitive Batch your deletes Brute force

  8. Recap: • NOT IN • Primary Key • Indexed column • Can be efficient • NOT EXISTS • Good for comparing multiple columns • OUTER JOIN • Find new and deleted rows • EXCEPT • Good for comparing multiple columns Brute force

  9. USE StackOverflow; SELECT Id FROM dbo.Badges WHERE Id NOTIN ( SELECT Id FROM the_target.dbo.BadgesCopy ); GO Syntax – NOT IN– NOT IN

  10. SELECT Id FROM dbo.Badges AS s WHERENOTEXISTS ( SELECT t.Id FROM the_target.dbo.BadgesCopy AS t WHERE t.Id=s.Id ); Syntax – not exists

  11. SELECT s.Id FROM dbo.Badges AS s LEFTOUTERJOIN the_target.dbo.BadgesCopy AS t ON s.Id=t.Id WHERE t.Id ISNULL; Syntax– OUTER JOIN

  12. Similar execution plans • Index or table scan • Large tables, problematic performance

  13. Rowversion Change Tracking Change Data Capture Temporal Tables SQL Server for the assist

  14. Can narrow the range of rows to compare dramatically Additional programming and maintenance overhead Rowversion Change Tracking Change Data Capture Temporal Tables SQL Server for the assist

  15. Notpictured SQL Server Replication Triggers Third-Party Tools Hashbytes

  16. Each database has a counter that is incremented for each insert or update operation that is performed on a table that contains a rowversion column within the database. This counter is the database rowversion. https://docs.microsoft.com/en-us/sql/t-sql/data-types/rowversion-transact-sql Rowversion

  17. Alias for timestamp Can be added to existing table Watch out for transform times Rowversion

  18. One rowversion column per table Value increments with each Insert, Update operation, per row Not guaranteed to be unique Not recommended for Primary Key or Clustered Index “No-op” update increments the value Rowversion

  19. Efficient query based on increasing values Column can be indexed Rowversion

  20. Finding deleted rows is still a brute force operation Rowversion

  21. System maintained Unique per database (mostly) Easily find range of modified rows Easily find new rows Does not help tracking deletes No-op (update column to itself) increments row version One rowversion column allowed per table Rowversion RECAP

  22. An option since SQL Server 2018 is to enable change tracking. Once this is enabled at the database level we get an incrementing row version and the type of operation, insert update or delete. Deletes are available as separate operations and we have the list of columns that have changed per operation. Change tracking

  23. Enable/Disable change tracking Set retention period Set cleanup behavior Change tracking Database setup

  24. Enable for each table required • Set option for column changes Change tracking Table setup

  25. Usage • Query change table function • SYS_CHANGE_OPERATION • CHAR(1) for Insert, Update, Delete operations • SYS_CHANGE_COLUMNS • VARBINARY(4100) • Bit-mask for updated columns • SYS_CHANGE_VERSION • BIGINT Incremental version Change tracking

  26. Track columns updated allows gives us more details about update operations • Enables the update mask Change tracking Table setup

  27. Change tracking

  28. Create targeted updates Batch updates based on column groupings Change tracking

  29. Does not require Enterprise Edition Additional storage overhead Keep track of retention period No supported way of cleaning up data manually Understand how insert and update results change based on change version Keep track of KB Articles related to reported Change Tracking bugs Change tracking recap

  30. The Story: You need a bit more of the story as data changes. Deltas matter and maybe you want to audit the before and after values. Maybe you want to monitor for specific values as the data is being changed. Maybe you want to stream these changes to your data warehouse in near-real-time.  Change data capture

  31. Scenario: ERP system configuration changes were being made that caused time consuming closing process to fail. As year-end loomed, getting to the bottom of things became critical. Although the application had built-in auditing, the audit process added overhead and the output was not easily to read. Any solution put into place could not impact day-to-day operations. Change data capture

  32. Good News: If you are at least SQL Server 2008 Enterprise Edition or SQL Server 2016 Service Pack 1 Standard Edition, Change Data Capture may be the solution you’re looking for.  Change data capture

  33. Prior to SQL Server 2016 SP1, requires Enterprise Edition Up to two capture instances per table Stores before and after values per column Lot’s of flexibility (pro) Lot’s of flexibility (con) Change data capture

  34. Separate file group is recommended Additional IO overhead Make sure CDC SQL Agent Jobs are running Separate OLTP database files from Audit, Integration, ETL… Security for tracked data Restore WITH KEEP_CDC to preserve change data CDC setup Change data capture

  35. The Upside: Track before and after values Insert update and delete actions fully logged and available Change data can be read without hitting transactional tables Full control over retention length Multiple strategies available to maximize performance Change data capture

  36. The Downside: As you bring this solution into the mix you’ll find there’s More to Manage™.  Keeping track of storage needs require your attention and to take full advantage of the solution some custom development may be in order.  Use NCHAR or NVARCHAR data type for columns containing non-ASCII data could cause problems, code or table changes may be still needed. Change data capture

  37. Change data capture

  38. Change data capture

  39. Capture instanceRecommended Name the capture instance Test performance with net change support enabled Index must be unique Only include required columns Monitor rate of change for data Keep CDC data in its own filegroup Change data capture

  40. Creates system table per capture instance System columns are prefixed with “__$” Contains before and after values for updates Row is copied for deletes __$command_id – order of operation in a transaction Change data capture

  41. Change data capture

  42. LSN can be set by a date range Change data capture

  43. Change data capture

  44. Change data capture

  45. New things to monitor Watch Transaction log growth Watch cleanup job performance Cleanup window and data loads need to be coordinated Change data capture

  46. Additional IO overhead Requires more programming effort Requires more administration More options to limit interference with OLTP workloads Flexible (pro) Flexible (con) Available in SQL Server 2016 SP1 Standard Change data capture recap

  47. Available in SQL Server 2016 System-versioned Designed to keep a full history of data changes Temporal Tables

  48. Period start column added Period end column added New syntax for time-traveling queries Minimal maintenance compared to custom code “No-op” updates increments system time Temporal Tables

  49. AS OF <datetime> FROM <start> TO <end> BETWEEN <start> AND <end> CONTAINED IN (<start>, <end>) Temporal Tables – FOR SYSTEM TIME

  50. “Warm standby” Restore full/differential Read-only standby (no locks) Mitigate load on production system Combine withrowversion, change table, cdcetc… Combining Techniques

More Related