slide1 n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
January 15, 2014 13 Tips for using SQL Server Change Data Capture (Introduction) PowerPoint Presentation
Download Presentation
January 15, 2014 13 Tips for using SQL Server Change Data Capture (Introduction)

Loading in 2 Seconds...

play fullscreen
1 / 28

January 15, 2014 13 Tips for using SQL Server Change Data Capture (Introduction) - PowerPoint PPT Presentation


  • 128 Views
  • Uploaded on

BDBI Webcast Radio. The B aker’s D ozen B usiness I ntelligence Webcast Radio 13 SQL Server -Business Intelligence Productivity Tips Kevin S. Goff Microsoft SQL Server MVP . January 15, 2014 13 Tips for using SQL Server Change Data Capture (Introduction). BDBI Radio.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'January 15, 2014 13 Tips for using SQL Server Change Data Capture (Introduction)' - maisie


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
slide1

BDBI Webcast Radio

The Baker’s Dozen

Business Intelligence

Webcast Radio

13 SQL Server -Business Intelligence

Productivity Tips

Kevin S. Goff

Microsoft SQL Server MVP

January 15, 2014

13 Tips for using SQL Server Change Data Capture (Introduction)

bdbi radio
BDBI Radio
  • Produced and Hosted by Kevin S. Goff, Microsoft SQL Server MVP and author of CoDe Magazine“Baker’s Dozen” Productivity Series from 2004-2013
    • http://www.codemag.com/People/Bio/Kevin.Goff
  • Pre-recorded webcast – Available twice a month
  • Site: http://www.BDBIRadio.com
  • My overall blog/site: http://www.KevinSGoff.net
  • Program format:
    • 13 quick tips on different SQL/BI/SharePoint topics
    • In any week, tips might cover one topic (13 features in PowerPivot, 13 SQL Interview topics), or 13 random tips on anything from data warehousing to MDX/DAX programming, to maybe even a few .NET tips for scenarios when the .NET and BI worlds collide
    • (Email me at BDBIRadio@gmail.comif you have questions)

13 tips for SQL Server CDC (Intro)

bdbi radio1
BDBI Radio
  • Today: 13 tips for getting started with SQL Server Change Data Capture
  • Change Data Capture (CDC) was a new feature added to SQL Server 2008 Enterprise
  • A major feature added to the database engine to support “audit trail” capabilities
  • This is an introductory session, first of a two-part series
  • Next broadcast I’ll have a more advanced session

13 tips for SQL Server CDC (Intro)

bdbi radio topics for today
BDBI Radio – Topics for today
  • Change Data Capture as a concept and why it’s important
  • History of capturing changes to data in SQL Server
  • How SQL Server CDC works and how it differs from past solutions
  • Configuring a database for CDC
  • Configuring tables for CDC
  • A simple CDC Demo with a table and some inserts, updates, deletes
  • Viewing the CDC Log tables
  • Querying the log tables
  • CDC Latency
  • Issue with CDC – capturing “who” made the change
  • Handling DDL Changes
  • Turning off SQL Server Agent
  • Limitations of Change Data Capture

13 tips for SQL Server CDC (Intro)

1 why change data capture is important
1 – Why Change Data Capture is important
  • For years, database developers/DBAs have looked for best way to…
    • Capture Changes that occur to data
    • Write out the changes to log (audit trail) tables
  • Log (Audit Trail) tables usually contain three pieces of information:
    • Who changed data
    • When was it changed
    • What was the change (i.e. Discount % changed from 5% to 7%)
  • In some industries, this functionality is absolutely critical
  • Need to know full history of changes – a customer profile might change 7 times in a year. Want to know each change, who made it, and when
  • Used in data warehousing systems – to feed a data warehouse with “what was added and what was changed” since the previous load
  • Challenge is capturing EVERY change – whether made by the application, by stored procedures, by import jobs, or manual updates in a query window in Management Studio

13 tips for SQL Server CDC (Intro)

1 why change data capture is important1
1 – Why Change Data Capture is important
  • Just to illustrate that last point – on the challenge of capturing EVERY change – whether made by the application, by stored procedures, by import jobs, or manual updates in a query window in Management Studio

Need mechanism that will capture what was changed, regardless of the original process

Some solutions are more robust and more effective than others

13 tips for SQL Server CDC (Intro)

2 history of capturing changes
2 – History of capturing changes
  • Prior to SQL Server 2008, here were the solutions available:
  • Database Triggers:
    • PRO: They are “tried and true” and they work, regardless of update scenario
    • CONs:
      • Often requires large amount of code (some will generate code via scripts & metadata)
      • In some instances, they can adversely affect performance
  • Use of OUTPUT INTO statement added in SQL 2005
    • PRO: Simpler than triggers – can redirect changes directly into INSERT/UPDATE statement
    • CONs:
      • Every DML statement on a table must be modified to include the OUTPUT INTO clause. Might not be practical in open-ended systems where UPDATES occur many ways
  • Implement logic in the application layer or manually in stored procs
    • CONs: same as with OUTPUT INTO (might not handle every update scenario)
  • Purchase third-party product (SQLAudit)
    • Debate on Build vs Buy, though some third-party products work well

13 tips for SQL Server CDC (Intro)

3 how cdc works how it differs
3 – How CDC works & how it differs
  • Found in SQL Server Enterprise, starting in 2008
  • Requires the use of SQL Server Agent
  • Mechanism in the Database Engine
  • Captures insert/update/delete activity and posts the details to change tracking tables
  • Details include:
    • State of an inserted row, after insert
    • State of an updated row (both before and after update)
    • State of a row just before deletion
    • Change tracking log tables include all versions of the row!
  • Feature not enabled by default – must run a few system stored procedures to configure

13 tips for SQL Server CDC (Intro)

3 how cdc works how it differs1
3 – How CDC works & how it differs
  • Database engine reads the transaction log asynchronously, using SQL Server Agent Jobs
  • Logs details associated with CRUD operations to change tracking tables
  • Change tracking tables contain complete version history of the rows
  • Note: CDC does not track “who” made the change – we need to handle that manually
  • CDC Jobs clean up change tracking logs rows that are 3+ days old

13 tips for SQL Server CDC (Intro)

4 configuring a database for cdc
4 – Configuring a database for CDC
  • SQL Server Agent must be ON
  • This might be a problem for organizations that manage database services using a Managed Services area – some of them might use other approaches for scheduling jobs and might not enable SQL Server Agent
  • Again, CDC requires SQL Server Agent
  • By default, CDC is not enabled on a database
  • To enable CDC on a database:
    • Exec sys.sp_cdc_enable_db

13 tips for SQL Server CDC (Intro)

5 configuring tables for cdc
5 – Configuring tables for CDC
  • We’ll create a simple table with three columns
  • Then we need to enable CDC for the table, using the system stored procedure exec sys.sp_cdc_enable_table
  • This will do two things:
    • Creates a change tracking table to store all versions of inserted & modified rows
    • Creates a job in SQL Server Agent to scan transaction log for DML statements
  • Again, SQL Server Agent MUST be turned on!!!
  • Next slide, we’ll see what SQL Server created

Specify the schema, the table, and an optional role for user access to the change tracking table that will be created

CDC creates 2 jobs in SQL Server Agent

13 tips for SQL Server CDC (Intro)

5 configuring tables for cdc1
5 – Configuring tables for CDC

Under system tables, we see the Change Tracking table for the base table TestEmployees. We’ll look at entries in the table in a few minutes

Capture job runs system stored procedure sys.sp_MScdc_capture_job

Cleanup job runs system stored procedure sys.sp_MScdc_cleanup_job

13 tips for SQL Server CDC (Intro)

6 cdc demo with inserts updates deletes
6 – CDC Demo with inserts, updates, deletes
  • Insert two rows, then change a rate on one of the rows, then delete a row
  • We want to see the state of each row after the inserts, before and after each update, and when the delete occurs
  • Next step we’ll see what’s in the change tracking log

13 tips for SQL Server CDC (Intro)

7 viewing the cdc log tables
7 – Viewing the CDC Log tables

Each row represents state of the row for each insert, update (old state, new state) and delete

  • First two rows ($Operation = 2) represent the state of the rows after the insert (for the 2 new rows)
  • Second two rows ($Operation =3 and 4) represent the “old version of the row” and “new version of the row” when Katy’s rate was changed from 21.00 to 22.00
  • The third row ($Operation = 1) represents the state of Katy’s row at the time of the deletion
  • We’ll look at the Log Sequence Number (_$start_lsn) later - this is important!
  • One thing missing: datetime of the update?

13 tips for SQL Server CDC (Intro)

8 querying the log tables
8 – Querying the log tables
  • On the issue of the timestamp (datetime) not being in the log table
  • SQL Server stores the time in a separate table (cdc.lsn_time_mapping), with a reference to the log sequence number column from the change tracking table

Time for all updates is stored in cdc.lsn_time_mapping

We can join this to the Change tracking table on the LSN

  • Time mapping table holds the Begin and End time for each update
  • We can join this table to the Change tracking table by the Start_LSN column
  • This also holds the Transaction ID – helpful for debugging
  • Reason for storing this in separate table: transactions might span multiple base tables

13 tips for SQL Server CDC (Intro)

8 querying the log tables1
8 – Querying the log tables
  • Here’s the JOIN

The transaction ID is useful for debugging “what actually happened”

Remember, data only lives in the log tables for 3 days!!!

13 tips for SQL Server CDC (Intro)

9 cdc latency
9 – CDC Latency
  • After you insert/update/delete data from the core tables, you might not immediately see data in the change tracking logs
  • Because the CDC jobs scan the transaction log asynchronously, there is a slight period of delay/latency
  • A transaction that updates a million rows in five seconds might not necessarily see data in the change tracking logs for at least another five seconds
  • You can adjust these jobs for different levels of latency – I’ll cover that next webcast

13 tips for SQL Server CDC (Intro)

10 handling who made an update
10 – Handling who made an update
  • Change Data Capture does not capture one potentially key piece of information: who made the change
  • Because CDC scans the transaction log asynchronously, after the transaction occurs, it cannot capture the user who made the change
  • We need to implement some kind of workaround

13 tips for SQL Server CDC (Intro)

10 handling who made an update1
10 – Handling who made an update
  • Run SSMS as a different user (SQLUser)
  • Run an update and change rows

Even tough current user is SQLUser, an update still marks last user as original user (both in the Employees table and also the change tracking log)

13 tips for SQL Server CDC (Intro)

10 handling who made an update2
10 – Handling who made an update
  • But wait – when we created dbo.TestCustomer, didn’t we include this in the DDL?
    • That only works on INSERTS, not UPDATES
  • OK , since the Change Tracking Log always inserts new rows (even when original tables are updated), can’t we change the DDL for the Change Tracking Log and make the LastUser column use a default value for current user?
    • This is a great question, and often asked.
    • Unfortunately, when SQL Server CDC inserts a row into the Change Tracking table, the “current user” is the SQL Service admin account, not the current user
  • So, considering an update could happen many different ways, how can we get CDC to accurately capture the user who made the change???

13 tips for SQL Server CDC (Intro)

10 handling who made an update3
10 – Handling who made an update
  • Only ways:
    • Explicitly specify the LastUser Column when the original table is originally updated
      • Might not always be possible or practical, if you have an open system where updates can occur many different ways
    • Database Trigger (ironically) to mark the LastUser in the original base table on a change – this will ripple through to the Change Tracking table

Code accesses the INSERTED and DELETED system tables, which are “exposed” by the transaction log and represent the state of the row BEFORE (DELETED) and AFTER (INSERTED) the update

Compare the “old value” and “new value” of each column whose change would warrant updating the Last User. This is the ONLY WAY to detect an actual change was made

13 tips for SQL Server CDC (Intro)

11 handling ddl changes
11 – Handling DDL Changes
  • Suppose you add a column to a core table
  • Will Change Data Capture automatically start logging that new column?
  • Unfortunately, no
  • We need to go through a series of steps:
    • Alter the core table to add the new column
    • Back up the change tracking table to another temp table
    • Disable CDC on that table (with a CDC stored procedure)
    • Re-enable CDC on that table (with a CDC stored procedure), which will add a reference to the new column (but will wipe out the data in the change tracking log, which is why we backed it up)
    • Select all of the data from the temp table (from step 2) back into the new change tracking table

13 tips for SQL Server CDC (Intro)

11 handling ddl changes1
11 – Handling DDL Changes

13 tips for SQL Server CDC (Intro)

12 turning off sql server agent
12 – Turning off SQL Server Agent
  • Suppose you turn off SQL Server Agent and then add rows to a table
  • Since the CDC scan job must be running in SQL Server Agent, turning Agent off will keep any logging from occurring
  • However, when you turn Agent back on, the job will run and “pick up” any changes from the transaction log that weren’t initially captured
  • Believe it or not, even if you attempt to truncate the transaction log in this period of time, any “uncaptured changes” will still be captured when Agent is turned back on!

13 tips for SQL Server CDC (Intro)

13 limitations of change data capture
13 – Limitations of Change Data Capture
  • Only available in Enterprise Edition of SQL Server
  • Cannot truncate tables involved in Change Data Capture
  • Cannot be implemented on tables that do not have a primary key
  • Cannot use CDC on a table that has a columnstore index

13 tips for SQL Server CDC (Intro)

next webcast advanced cdc
Next webcast: Advanced CDC
  • Tools in SSIS 2012 to work in conjunction with SSIS
  • Options for adjusting latency/delay of CDC scan jobs
  • Additional system stored procedures for using CDC
  • Performance benchmarks using CDC

13 tips for SQL Server CDC (Intro)

some good links references
Some good links/references
  • http://www.codeproject.com/Articles/28535/Change-Data-Capture-SQL-Server-2008
  • https://www.simple-talk.com/sql/learn-sql-server/introduction-to-change-data-capture-(cdc)-in-sql-server-2008/

13 tips for SQL Server CDC (Intro)

today s jazz recommendation
Today’s Jazz Recommendation
  • Florian Hoefner Group
  • Great jazz quartet
  • Search YouTube, you’ll find some GREAT performances
  • http://www.amazon.com/Songs-Without-Words-Florian-Hoefner/dp/B00784620I/ref=sr_1_1?ie=UTF8&qid=1390110649&sr=8-1&keywords=florian+hoefner
  • http://www.amazon.com/Falling-Up-Florian-Hoefner-Group/dp/B00DSAUM5M/ref=sr_1_2?ie=UTF8&qid=1390111023&sr=8-2&keywords=florian+hoefner

13 tips for SQL Server CDC (Intro)