Transportation loading warehouse data
Download
1 / 24

Transportation: Loading Warehouse Data - PowerPoint PPT Presentation


  • 159 Views
  • Updated On :

Transportation: Loading Warehouse Data. Chapter 12. Transporting data into the Warehouse. Loading moves the data into the warehouse Loading can be time-consuming: - Consider the load window. - Schedule the task; automates all processes.

Related searches for Transportation: Loading Warehouse Data

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Transportation: Loading Warehouse Data' - fala


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Transportation loading warehouse data l.jpg

Transportation: Loading Warehouse Data

Chapter 12


Transporting data into the warehouse l.jpg
Transporting data into the Warehouse

  • Loading moves the data into the warehouse

  • Loading can be time-consuming:

    - Consider the load window.

    - Schedule the task; automates all

    processes.

  • Initial load moves large volumes

  • Subsequent refresh moves smaller volumes

  • Business determines the cycle


Extract processing environment l.jpg
Extract Processing Environment

  • After each time interval, build a new database

  • Run queries

Operational

database

T1

T2

T3


Warehouse processing environment l.jpg
Warehouse Processing Environment

  • Build a new database

  • After each time interval, add changes to database

  • Archive or pure oldest data

  • Run queries

Operational

database

T1

T2

T3


First time load l.jpg
First-Time Load

  • Single event that populates the data with historical data

  • Involves large volume of data

  • Employs distinct ETT tasks

  • Involves large amounts of processing after load

Operational

database

T1

T2

T3


Refresh l.jpg
Refresh

  • Performed according to a business cycle

  • Simple task

  • Less data to load than first-time load

  • Less-complex ETT

  • Smaller amounts of postload processing

Operational

database

T1

T2

T3


Building the transportation process l.jpg
Building the Transportation Process

Specification

  • Techniques and tools

  • File transfer methods

  • The load window

  • Time window for other tasks

  • First-time and refresh cycle

  • Connectivity bandwidth


Building the transportation process8 l.jpg
Building the Transportation Process

  • Test the proposed techniques

  • Document proposed load

  • Gain agreement on the process

  • Monitor

  • Review

  • Revise


Granularity l.jpg
Granularity

Low-level grain

- Expensive, high

level of processing,

more disk, detail

High-level grain

- Cheaper, less

processing, less disk,

little detail

  • Important design and operational issue

  • Space requirements

    - Storage

    - Backup

    - Recovery

    - Load


Transportation techniques l.jpg
Transportation Techniques

  • Tools

  • Utilities and 3GL

  • Gateways

  • Customized copy programs

  • Replication

  • FTP

  • Manual


Transportation technique considerations l.jpg
Transportation Technique Considerations

  • Tools are comprehensive but costly.

  • Data-movement utilities are fast and powerful.

  • Gateways are not always the fastest method:

    - Access other databases

    - Supply dependent data marts

    - Support a distributed environment

    - Provide real-time access if needed


Using sql loader to load data l.jpg
Using SQL*Loader to Load Data

Control file

Input files

  • Fastest load mechanism

  • Direct path

  • Parallel and unrecoverable

  • Direct-load INSERT (Oracle8)

  • Direct-path load API (Oracle8i)

SQL*Loader

Bad files

Log files

Discard files


Direct path load api in oracle8i l.jpg
Direct-Path Load API in Oracle8i

  • Allows ETT and other tools to load Oracle databases efficiently

  • Permits load behavior to be customized

  • Gives direct-path load performance

  • Provided complete access to all direct-load functionality using OCI


More transportation technique considerations l.jpg
More Transportation Technique Considerations

  • Use customized programs as a last resort

  • Replication is limited by data-transfer rates


Postprocessing of loaded data l.jpg
Postprocessing of Loaded Data

Transform

Extract

Transport

Create

Indexes

Generate

keys

Postprocessing

of loaded data

Summarize

Filter


Indexing data l.jpg
Indexing Data

  • Before load: fast index reenablement

  • During load: adds time to load window

  • After load: adds time to load window

Index

Operational

databases

Staging

files

Warehouse

database


Unique indexes l.jpg
Unique Indexes

  • Disable constraints load

  • Enable constraints to create index

Disable

constraints

Load

data

Enable

constraints

Create

index

Catch

errors

Reprocess


Creating artificial keys l.jpg
Creating Artificial Keys

  • Use generalized or derived keys

  • Maintain the uniqueness of a row

  • Use an administrative process to assign the key

  • Concatenate operational key with number:

    - Easy to maintain

    - Cumbersome keys

    - No clean value for retrieval


Creating unique keys for records l.jpg
Creating Unique Keys for Records

  • Assign a number from a list:

    - No semantic meaning

    - Extract operations must reference table

    assign numbers

    Update metadata

  • Verdict

109908

1


Creating summary tables l.jpg
Creating Summary Tables

  • CTAS

  • pCTAS

Summary data

Warehouse

Data marts


Verifying data integrity l.jpg
Verifying Data Integrity

  • Load data into intermediate file

  • Compare target flash totals with totals before load

Flash

totals

Counts

and

amounts

File

1

Load

Load

File

1

File

2

File

2

Preserve,

inspect,

fix, then load

Intermediate

file

Warehouse



Standard quality assurance checks l.jpg
Standard Quality Assurance Checks

  • Load status

  • Completion of the process

  • Completeness of the data

  • Data reconciliation

  • Violations

  • Reprocessing

  • Comparison of counts and amounts


Summary l.jpg
Summary

This lesson discussed the following topics:

  • First-time load considerations

  • Techniques for transporting data

  • Tasks involved in the postload processing stage