1 / 29

Copying, Managing, and Transforming Data With DTS

Copying, Managing, and Transforming Data With DTS. Defining Bulk Insert Task Functionality. Quickly Loads Data from a File into SQL Server Encapsulates the Transact-SQL Bulk Insert Statement Supports Table or View Destinations in SQL Server Loads Data with No Applied Transformations

padma
Download Presentation

Copying, Managing, and Transforming Data With DTS

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Copying, Managing, and Transforming DataWith DTS

  2. Defining Bulk Insert Task Functionality • Quickly Loads Data from a File into SQL Server • Encapsulates the Transact-SQL Bulk Insert Statement • Supports Table or View Destinations in SQL Server • Loads Data with No Applied Transformations • Supports Format Files to Specify File Layout • Requires Sysadmin or Bulkadmin Fixed Server Roles Membership The Bulk Insert Task is One of Three Ways to Run SQL Server Bulk Copy Operations

  3. Sidebar: SQL Server Bulk Copy Operations • What Do Bulk Copy Operations Offer? • Allow Fast Loading of Data into SQL Server • Configure Data Load Batches • Allow You to Control Logging Operations • Bcp Utility • Bulk Insert Task or T-SQL Bulk Insert Statement • Bulk Copy APIs for OLE DB, ODBC, DB-Library Applications • Ways to Access Bulk Copy Operations

  4. Defining the Sales_stage Table Load • Using the Bulk Insert Task to Load Tab-delimited File Data into Sales_stage • Loading Sales_stage with Data Bound for Sales_fact DTS Tab Delimited File Polaris

  5. Defining Execute SQL Task Functionality • Executing SQL Statements • Source database must understand SQL syntax • SQL statement determines task performance • Task supports single or multiple SQL statements • You can create queries in the DTS Query Designer • Running Parameterized Queries • Input parameters • Output parameters

  6. Using Parameterized Queries • Understanding Global Variable Basics • User-defined storage locations • Information is shared across package steps • Using Parameters with Global Variables • Assign global variable values to query input parameters • Store query results to a global variable with output parameters

  7. Global Variables Parameter ProductName CategoryName Parameter 1 Parameter 2 Parameter 1 The Parameter’s Position in the Query Determines Its Name Global Variables Provide Data to Input Parameters Creating Dynamic Queries ? Question Marks Represent Query Parameters SELECT * FROM product_dim WHERE product_name = ? AND category_name = ?

  8. Storing Query Results Storing Row Values SELECT begin_date, end_date FROM financial_period WHERE quarter = 1 Output Parameter Global Variables begin_date end_date BeginDate EndDate Storing Entire Rowsets SELECT * FROM product Output Parameter Global Variable Entire Rowset Product Store Query Results in Global Variables

  9. DTS Defining the Time_dim Data Load Time_dim_build Stored Procedure • Input Parameters • @p_start_date • @p_end_date

  10. Defining the DTS Data Pump • DTS Mechanism for Moving and Transforming Data • Allows for High-speed Batch Copying of Data • Contains Supplied Data Transformations • Can Also Define ActiveX Script Transformations • Provides An Extendable COM-based Architecture That Allows for Custom Transformations (C++) • Permits the Application of Transformation Logic to Specific Phases of a Data Pump Operation • Multi Phase Data Pump

  11. OLE DB ODBC Understanding How the Data Pump Processes Data X Forms Source Destination ActiveX Script Copy Trim String … Custom OLE DB ODBC • Connects to the source and destination • Reads OLE DB metadata about source and destination columns • Gathers data transformation definitions DTS Data Pump In Out Implements the transformation Writes completed record to the destination

  12. Defining the Tasks That Transform Data • The Transform Data Task • Inserts • The Transform Data Task • Inserts • The Data Driven Query Task • Inserts • Updates • Deletes • The ParallelDataPumpTask • Processes hierarchical rowsets

  13. Defining the Transform Data Task • Data Movement and Transformation Functionality • Copying data between heterogeneous data sources • Applying optional column level transformations • Extended Data Transfer Functionality • Supporting batch processing of data • Providing error-handling capabilities • Containing optimization settings for SQL Server destinations

  14. Selecting Transformation Types Transformation Description ActiveX Script Invokes user-defined ActiveX scripts. Copy Column Copies data from source to destination. DateTime String Converts a date to a new destination format. Lowercase String Converts a string to lowercase characters. Uppercase String Converts a string to uppercase characters. Middle of String Extracts a sub string of source data. Trim String Removes white space from a source string. Read File Copies contents of a file to a destination column. File path is specified by a source column. Write File Copies contents of a source column to a file. File path is specified by a second source column.

  15. Defining Column Mappings • One-to-One Mappings • Symmetric Many-to-Many Mappings • Asymmetric Mappings

  16. Creating Efficient Column Mappings • Minimizing the Number of Column Mappings • Using Many-to-Many Mappings When Possible • Grouping Common Transformations Together

  17. Loading Customer_dim Northwind OLTP SQL Server Database

  18. Performance Settings • Enabling Fast Load • Using high-speed bulk copy processing • Accepting batches of transformed data • Only applies to SQL Server destinations • Using a Table Lock • Configuring Batch Size

  19. Configuring Batch Size • Assembling Records into Groups • DTS commits records to database as a group • Insert batch size sets the number of records in the group • Understanding Default Behavior • Insert batch size is 0 • DTS assigns one batch for all records • Setting the Insert Batch Size • Value between 0-9999 • Setting value can improve performance

  20. Defining SQL Solutions • You Can Use the Source Query of the Transform Data Task to Implement Data Transformations • The Source SQL Statement Must Be Understood by the Source Database • The Performance of the Source Query Depends on the SQL Statement • You Can Use Parameters in the Source Query to Create Dynamic Source SQL Statements • If You Use the Source Query to Manipulate Data, You Can Use the Copy Column Transformation to Load Data into the Destination

  21. Applying SQL Solutions to Load Fact Tables • Using the Source Query to Join Staging Table Data to Dimension Tables • Retrieving Primary Key Values to Store as Foreign Keys on the Fact Table • Using a Copy Column Transformation in the Transform Data Task • Configuring Fast Load for SQL Server Destinations

  22. time_dim 134 1/1/2000 Loading the Fact Table Source Data DimensionTables customer id product id order date quantity_sales amount_sales • Identifying Dimension Application Key Values in the Fact Table Source Data • Retrieving Primary Keys from Each Dimension Table to Assign Foreign Keys ALFI 123 123 1/1/2000 1/1/2000 400 400 10,789 10,789 customer_dim ALFI 201 ALFI Alfreds Sales Fact Data cust_key prod_key time_key quantity_sales amount_sales product_dim 25 123 Chai 201 123 25 1/1/2000 134 400 400 10,789

  23. DTS Loading Sales_fact • Extracting Data from the Sales_stage Table • Assigning Foreign Keys by Retrieving Primary Keys from the Product_dim, Customer_dim, and Time_dim Dimensions

  24. Best Practices - Performing Inserts • Bulk Insert Task • Accessing data in files • Loading data into SQL Server destinations • Copying data with no transformations • Transform Data Task • Accessing any source • Loading to any destination • Creating data transformations • Using input parameters in the source query • Applying custom logic to phases of the data pump

  25. Best Practices - Performance Settings • Tuning the Transform Data Task • Fast load for SQL Server destinations • Batch size • Table lock • Tuning the Bulk Insert Task • Sort order for clustered indexes • Batch size • Table lock

  26. Best Practices - Executing Flexible Queries • The Data Driven Query Task • Execute flexible queries on a row-by-row basis • Meet flexibility needs that outweigh performance needs • Perform non-insert queries • The Execute SQL Task • Execute SQL statements and extended SQL statements • Perform parameterized queries • Assign query outputs to global variables

  27. Best Practices - Using Custom Tasks • Creating Reusable Functions and Utilities • Adding Functionality to DTS Package Designer • Implementing a Faster Alternative to ActiveX Script Tasks

  28. Best Practices - Creating Efficient Column Mappings • Minimizing the Number of Column Mappings • Using Many-to-Many Mappings When Possible • Grouping Common Transformations Together

  29. Best Practices - The Right Transformation Type • Using Supplied Transformations When Possible • Minimizing ActiveX Script Transformations When Performance Outweighs Flexibility • Using SQL Solutions with Copy Column Transformations • Developing Custom Transformations as a Faster Alternative to ActiveX Script Transformations

More Related