Data integration at microsoft technologies and solution patterns l.jpg
This presentation is the property of its rightful owner.
Sponsored Links
1 / 46

Data Integration at Microsoft: Technologies and Solution Patterns PowerPoint PPT Presentation


  • 104 Views
  • Uploaded on
  • Presentation posted in: General

SESSION CODE: BIE202. Data Integration at Microsoft: Technologies and Solution Patterns. Jeff Bernhardt SSIS Product Unit Manager Microsoft Corporation. An Ancient Story from India …. A Modern Story from Redmond…. Integration Services. BizTalk Server. SQL Server Service Broker.

Download Presentation

Data Integration at Microsoft: Technologies and Solution Patterns

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Data integration at microsoft technologies and solution patterns l.jpg

SESSION CODE: BIE202

Data Integration at Microsoft:Technologies and Solution Patterns

Jeff Bernhardt

SSIS Product Unit Manager

Microsoft Corporation


An ancient story from india l.jpg

An Ancient Story from India …


A modern story from redmond l.jpg

A Modern Story from Redmond…

Integration Services

BizTalk Server

SQL Server

Service Broker

SQL Server

Replication

Others …


Data integration two globs of ideas why and how l.jpg

Data Integration: Two Globs of Ideas, Why and How

Replication / Synchronization

Data Management

Streams - CEP

Business Intelligence

SOA

Federated Views

Data Warehousing

Migration

Message Oriented

ETL – Bulk Move

CDC

Operational Consistency

Data Quality

Consolidation

Master Data

B2B

Technology Choices

Problem/Solution types


Di problem types building a dw and business intelligence l.jpg

DI Problem Types: Building a DW and Business Intelligence

Text Files

Reports

My DW

ETL

My ODS or OLTP System

  • Alter the shape

    • Create a Star Schema (de-nomalized for analysis queries)

    • Surrogate Keys (in place of business keys)

    • Pre-Aggregations (to support some types of reporting)

  • Track History

    • Slowly Changing Dimensions (history of entities)

    • Manage Partitions (once a month, roll up details and archive)

  • Take changes from the store

    • React to Inserts/Updates/Deletes.

    • Could be a “full refresh” or incremental


Di problem types data consistency between applications l.jpg

DI Problem Types: Data Consistency Between Applications

New Custom AR system in SQL

Old Accounts Receivable on SAP

Create Consistency

Create Consistency

  • A long running ‘bridge’

    • Existing systems will be left in place and kept in synch.

    • Reacts to changes in either system.

    • Needs a way to react to changes or messages to minimize tax on App systems

  • The systems are different

    • Often different back ends.

    • Match schemas, tables, columns

    • Consistent data domains (like keys)

    • Detect and resolve duplicates

    • Create a consistent level of granularity

      • Aggregate

      • Allocate


Di problem types migration or consolidation l.jpg

DI Problem Types: Migration or Consolidation

New Custom AR system in SQL

Once the design is set and tested, execute this

Old Accounts Receivable on SAP

Transfer all data and map the shape

  • Systems or Companies merged or acquired.

    • Bring the data together into the “new” place.

  • An integration system is design and built and tested to minimize the down time for the old system and make one smooth transition.

    • Match schemas

    • Consistent data domains (like keys)

    • Detect and resolve duplicates

  • May create a long running ‘bridge’ while the systems settle.


Di problem types master data management l.jpg

DI Problem Types: Master Data Management

Customers

Support

Customers

Marketing

Customers

Sales

Customers

Accounting

Customers

  • Creating ‘One Version of the Truth’

  • Data residing in many sources where each source schema is fixed but different. Combined into one store with a consistent schema

    • Pivot / Unpivot

    • Type and domain mapping

    • Key generation

  • Ensure quality

    • Remove duplicates

    • Provide missing data

    • Hard matching to find duplicates

  • Bulk update and trickle changes

  • Changes to central store delivered back to operational system


Di problem types b2b inter enterprise data sharing l.jpg

DI Problem Types: B2B (Inter-Enterprise Data Sharing)

PartsAreUs

EZ Buy

Orders

Internet / WAN

Supplier’s

System

Order Fulfillment System

WeShip

Shipper’s

System

  • Contracts

  • SLAs

  • Standardized formats

  • Long running transactions or business process

  • Loosely coupled

  • Coordination, message passing

  • A very specific perspective on Application Integration.


Organizing the problem solution ideas l.jpg

Organizing the Problem/Solution Ideas

  • Data Warehouse and Business Intelligence

  • Data Consistency Between Applications

  • Data System Migration and Consolidation

  • Master Data Management

  • Inter Enterprise Data Acquisition and Sharing


Technology types bulk movement l.jpg

Technology Types: Bulk Movement

Point B

Point A

ETL

Text Files

RDBMS

RDBMS

Text Files

ELT

XML

XML

  • Move a sizeable set of rows from point A to point B

  • Often

    • Part of a scheduled process

    • Transform the shape of the data being moved

    • Combine many sources or split into many destinations

  • Two flavors

    • ETL (Extract Transform Load)

      • SSIS

      • AscentialDatastage (IBM)

    • ELT (Extract Load Transform)

      • Oracle Warehouse Builder

      • Bulk Insert


Technology types message oriented movement l.jpg

Technology Types: Message Oriented Movement

C

B

A

Coordinator

Line Of Business Application

RDBMS

D

Event

XML

Text Files

  • Central ‘Coordinator’

    • Guarantees receipt and delivery of messages.

    • Components are ‘at rest’ until activated by the coordinator or an external event.

  • Data delivered in packets along with the message.

  • Terms that might fit in this category:

    • CDC

    • Trickle Feed

    • SOA

    • Message Bus


Technology types replication and synchronization l.jpg

Technology Types: Replication and Synchronization

Repl / Sync Agent

  • Maintaining equivalent copies of data in different locations

    • One master, many slaves

    • Multi-master

    • High Availability (live backups)

  • Similarity between systems

    • Most often table copies on the same brand of RDBMS

    • Heterogeneous possible

      • Attunity, Goldengate, etc.

      • Transformations: Little to none

  • Terms that might fit in this category:

    • CDC, Log mining

    • Merge Replication

    • Checksum tables


Technology types federated views l.jpg

Technology Types: Federated Views

Reports

View Provider

  • Answers queries directly from many source systems

  • View Provider may:

    • Optimize and execute the combined query (Joins, etc.)

    • Pushes query parts down to the source.

    • Provide unified security model

    • Provide unified metadata

    • Cache source data

    • Support Heterogeneous Sources


Technology types stream processing l.jpg

Technology Types: Stream Processing

Destination

Source

CEP Engine

Event Processing

Event

Log

Event

  • Monitor a stream of data, Create an event when

    • Temporal (time based) events occur

    • Running average or aggregate hits a limit

    • Interesting sequence of records is detected

  • Also called CEP (Complex Event Processing)

  • Different from the other Technology Types??? I Can’t tell yet.


Technology types data management and quality l.jpg

Technology Types: Data Management and Quality

  • A collection of services common to most Data Integration solutions

  • Shared semantic model

    • Metadata library

    • Manage hierarchies

    • Data artifact level security model

  • Data Quality

    • Profile to understand

    • Merge to resolve duplicates

    • Find approximate matches

    • Test and monitor quality.

  • Version management for data.


Organizing the technology choice ideas l.jpg

Organizing the Technology Choice Ideas

Bulk Movement

  • Message Oriented Movement

Replication and Synchronization

  • Federated Views

  • Data Management and Quality

  • Stream Processing (CEP)


Problems and technologies l.jpg

Problems and Technologies

  • Data Warehouse and Business Intelligence

  • Data Consistency Between Applications

  • Data System Migration and Consolidation

  • Master Data Management

  • Inter Enterprise Data Acquisition and Sharing

Bulk Movement

15%

  • Message Oriented Movement

10%

Replication and Synchronization

60%

  • Federated Views

  • Data Management and Quality

15%

  • Stream Processing (CEP)


Microsoft s offerings l.jpg

Microsoft’s Offerings

  • Data Warehouse and Business Intelligence

  • Data Consistency Between Applications

  • Data System Migration and Consolidation

  • Master Data Management

  • Inter Enterprise Data Acquisition and Sharing

Bulk Movement

SSIS

Service

Broker

  • Message Oriented Movement

BizTalk

SQL

Replication

Replication and Synchronization

Distributed

Query

  • Federated Views

Master Data

Services

  • Data Management and Quality

  • Stream Processing (CEP)

Stream

Insights


What should you use and when it depends on the artifacts facets attributes the flavor l.jpg

What Should You Use and When? It Depends On The Artifacts / Facets / Attributes / “the flavor”:


A brief tour of the microsoft offerings l.jpg

A Brief Tour of the Microsoft Offerings

  • Integration Services

  • Service Broker

  • Replication

  • Distributed Query

  • BizTalk Server

  • StreamInsights


Integration services ssis overview l.jpg

Integration Services (SSIS) - Overview


Integration service problems technologies artifacts l.jpg

Integration Service – Problems / Technologies / Artifacts

My ODS or OLTP System

Text Files

  • Constructing a Data Warehouse

  • Migration / Consolidation

  • Bulk Movement

  • ETL

My DW

SSIS


Integration services customer solution l.jpg

Integration Services – Customer Solution

Data Mart

(Reporting and Analytics)

Inventory Management

(Oracle)

Attunity CDC for Oracle

SSIS Package

Lookups, load facts and dimensions, surrogate key generation, …

SSIS Package

Data Warehouse

(SQL Server)

SQL Server Source

SSIS Package

Lookups, slowly changing dimensions, address cleansing, …

CRM

(SQL Server)

Flat File Source

SSIS Package

Data conversions, parsing, data quality, aggregations, …

SSIS Package

Manufacturing Data

(Flat files)

Staging DB

Operational Database

(Shop Floor Application)


Service broker overview l.jpg

Service Broker - Overview


Service broker problems technologies artifacts l.jpg

Service Broker – Problems / Technologies / Artifacts

Database 1

Database 2

  • Consistency Between Applications

  • Master Data Management (?)

  • Message Oriented Movement

conversation

Service A

Service B

Queue A

Queue B

Database 1

Database 2


Service broker customer solution l.jpg

Service Broker – Customer Solution

  • Bank, lost loan provisioning

  • Requires very fast processing and analysis of up to date data.

SSIS

Service Broker

subset 1 (x rows)

SSIS

server 1

subset 2 (x rows)

sproc

server 2

Result

Table

Source

Table

32 cores

subset n (x rows)

server n


Sql replication overview l.jpg

SQL Replication - Overview


Sql replication problems technologies artifacts l.jpg

SQL Replication – Problems / Technologies / Artifacts

  • Data Warehousing

  • Data consistency between Applications

  • Migration / Consolidation

  • Replication and Synchronization

Reporting and Staging

Enterprisee Information Management


Sql replication customer solution l.jpg

SQL Replication – Customer Solution

Corporate Offices

Edcon

  • 1000 branch offices

  • One way replication of catalog data from hub to spoke

  • Catalog downloads are partitioned with complex Dynamic and Join filtering – Catalog/Pricing data per store

  • Uses merge Replication for downloads

  • Subscribers located in each store - multi-user server databases

  • Uses Service Broker for uploads of transacted data – requires guaranteed in order delivery

LOB Systems

SSIS

(daily)

‘Central’

Service

Broker

Merge Replication

Branch Office

‘Branch’

Transactional

Replication

Online Terminal


Distributed query overview l.jpg

Distributed Query - Overview


Distributed query problems technologies artifacts l.jpg

Distributed Query – Problems / Technologies / Artifacts

Access

SQL

  • Business Intelligence

  • Master Data Management

  • Federated Views

SQL


Biztalk server overview l.jpg

BizTalk Server - Overview


Biztalk server problems technologies artifacts l.jpg

BizTalk Server – Problems / Technologies / Artifacts

LOB App

Bus

OLTP

  • B2B

  • Data Consistency Between Applications

  • Message Oriented Movement

XML

Docs

Orchestration Logic


Biztalk server customer solution l.jpg

BizTalk Server – Customer Solution

Emerging Practice: Loose Coupled Batch

  • BizTalk Coordinates process

  • Some traditional data flow through messages

  • Many SSIS Packages with complicated relationships and dependencies.

  • Messages control activation of SSIS pieces.

  • Messages deliver intermediate results or pointers to batch data.

  • Scale out SSIS execution

Bus

LOB App

Orchestration Logic

XML

Docs

OLTP

SSIS Package

SSIS Package

DW


Streaminsights overview l.jpg

StreamInsights - Overview


Streaminsights problems technologies artifacts l.jpg

StreamInsights – Problems / Technologies / Artifacts

  • Business Intelligence

  • Data Warehousing

  • Message Oriented Movement

  • Stream Processing

Data Sources, Operations, Assets, Feeds, Sensors, Devices

Input Data Streams

Results

Operational Data Store & Archive

CEP Engine

f(x)

g(y)

f'(x)

h(x,y)


Streaminsights customer solution l.jpg

StreamInsights– Customer Solution

Telco

  • Detect Fraud callbacks

  • During regular processing of data warehouse facts

  • Custom SSIS Component encapsulates the CEP engine.

  • Events detected and sent for follow-up

Switch

Logs

Switch

Logs

Switch

Logs

SSIS

Fact Processing

StreamInsight

Component

Fraud

DW


A bigger picture l.jpg

A Bigger Picture?


Resources l.jpg

Required Slide

Resources

Learning

  • Sessions On-Demand & Community

  • Microsoft Certification & Training Resources

www.microsoft.com/teched

www.microsoft.com/learning

  • Resources for IT Professionals

  • Resources for Developers

  • http://microsoft.com/technet

  • http://microsoft.com/msdn


Slide43 l.jpg

Required Slide

Complete an evaluation on CommNet and enter to win!


Slide44 l.jpg

Sign up for Tech·Ed 2011 and save $500 starting June 8 – June 31st

http://northamerica.msteched.com/registration

You can also register at the North America 2011 kiosk located at registrationJoin us in Atlanta next year


Slide45 l.jpg

© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.


  • Login