Agenda
Download
1 / 62

Agenda - PowerPoint PPT Presentation


  • 81 Views
  • Uploaded on
  • Presentation posted in: General

Agenda. ODI Performance ODI Scheduling ODI Deployment /Release. Uli Bethke. Dublin based Blog www.bi-q.ie ODI 2007 Reviewer two ODI books ODI articles OTN Deputy chair OUG BI SIG. Next event 11 th June ODI advanced trainer. ODI performance.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha

Download Presentation

Agenda

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Agenda

  • ODI Performance

  • ODI Scheduling

  • ODI Deployment/Release


Uli Bethke

  • Dublin based

  • Blog www.bi-q.ie

  • ODI 2007

  • Reviewer two ODI books

  • ODI articles OTN

  • Deputy chair OUG BI SIG. Next event 11thJune

  • ODI advanced trainer


ODI performance

ODI is a metadata driven (SQL) code generator using code templates (knowledge modules). It uses a Java agent to communicate and send data between source and target systems and the repository over the network.


SQL

  • > 80%: ODI performance issues = SQL issues => SQL main ODI skill

  • Perfect your SQL. Advanced SQL. Analytic Functions

  • Know your database(s) inside out. In particular the target

  • Understand, write, and modify Knowledge Modules


Agent

  • Light weight Java based application

  • Tied to host OS

  • Generates code based on ODI metadata.

  • Communicates source, target, repository.

  • JDBC data transport

  • XML

  • Jetty

  • Interpreters: Jython, JBS, JavaScript, Groovy

  • HSQLDB in memory database

  • Scheduler

  • Sizing


Agent

Target

  • Least amount of roundtrips. Network (JDBC, XML)

  • One target database server only (DW)

    Another Server

  • ODBC drivers

  • JEE agent on Weblogic

  • No support for target OS

  • Resources on target

  • DBA


interfaces

  • No!! KM using row by row processing

  • Use ODI functions rather than DB functions

  • Don’t overuse CKM (especially for large data volumes)

  • temp indexes (I$)

  • Gather statistics (C$, I$, TGT when applicable)

  • Rule of thumb: Use loader KMs or db link KMs rather than JDBC KMs


Source/target

  • Schemas on same database server. Physical schema and not data server.

  • Have sources physically close to target

  • Minimize impact on source

  • Chunking


CRITICAL PATH

Network Paths:Path Durations:

B>E>H6+2+11=19B>D>F6+4+14=24B>D>G6+4+10=20A>C>G9+8+10=27Critical Path


Micro Tuning

  • JDBC drivers

  • JVM

  • Type 4 or 5 JDBC drivers (Data Direct)

  • Array fetch size.

  • DB packet size.

  • Network packet size.


Performance Monitoring

  • ODI Log Data Mart

  • Facts

  • Dimensions

  • Metrics

  • Frontend


Dbms_sqltune_util0

  • dbms_sqltune_util0.sqltext_to_sqlid

  • Link to Data Dictionary Tables


maciEJKOCON

  • Dublin based

  • ODI 2005 (Sunopsis)

  • Reviewer two ODI books

  • Blog www.bi-q.ie

  • maciek@bi-q.ie


ORCHESTRATING DWHPROCESSES

  • Orchestration of Data Process Flow

    • Standard DWH Process flow orchestration

    • Packages in Oracle Data Integrator 10g

    • Load Plans in Oracle Data Integrator 11g

  • Process Flow use cases - efficiency analysis

  • Alternative scheduling

    • benefits


TYPICAL DATA FLOW in DWH

1

step

STAGE

E-LT

DATA EXTRACT

loads data from sources


TYPICAL DATA FLOW in DWH

1

2

step

step

DIMs

STAGE

E-LT

LABEL

provides structured labeling

information

DATA EXTRACT

loads data from sources


TYPICAL DATA FLOW in DWH

1

2

3

step

step

step

FACTS

DIMs

STAGE

E-LT

FACTS

consists of measurements, metrics or facts

LABEL

provides structured labeling

information

DATA EXTRACT

loads data from sources


TYPICAL DATA FLOW in DWH

1

2

3

step

step

step

FACTS

DIMs

STAGE

E-LT

FACTS

consists of measurements, metrics or facts

LABEL

provides structured labeling

information

DATA EXTRACT

loads data from sources

data transport &

transform units


TYPICAL DATA FLOW in DWH

1

2

3

step

step

step

FACTS

DIMs

STAGE

E-LT

FACTS

consists of measurements, metrics or facts

LABEL

provides structured labeling

information

DATA EXTRACT

loads data from sources

data transport &

transform units

ODI 11

Load Plans

ODI 10g

Packages

orchestration


ORCHESTRATION – ODIPACKAGES

using object directly

PRC_B

PKG_ABC

PKG_DE

INT_A

INT_D

INT_C

INT_E


ORCHESTRATION – ODIPACKAGES

using object directly

using scenarios – compiled code

PRC_B

PRC_B

SYNCHRONOUS

PKG_ABC

PKG_DE

PKG_ABCDE

PKG_DE

INT_A

INT_C

INT_A

INT_D

INT_C

INT_E


ORCHESTRATION – ODIPACKAGES

using object directly

using scenarios – compiled code

PRC_B

PRC_B

PRC_B

SYNCHRONOUS

PKG_ABCDE

PKG_ABC

PKG_DE

PKG_ABCDE

PKG_DE

PKG_DE

INT_A

INT_A

INT_C

INT_C

INT_A

INT_D

INT_E

INT_C

ASYNCHRONOUS


ODI 10g vs. ODI 11

FACTS

DIMs

STAGE

PRC_D

PRC_G

PRC_B

PKG_DM

A

C

D

F

E

G

B

PKG_FG

PKG_DE

PKG_ABC

ODI 10g

Packages

INT_C

INT_F

INT_A

INT_C


ODI 10g vs. ODI 11

FACTS

DIMs

STAGE

PRC_G

PRC_D

PRC_B

PKG_DM

PKG_FG

PKG_DE

PKG_ABC

ODI 10g

Packages

INT_A

INT_C

INT_F

INT_C

ODI 11

Load plans


ODI 10g vs. ODI 11

FACTS

DIMs

STAGE

PRC_B

PRC_G

PRC_D

PKG_DM

C

A

D

F

B

G

E

PKG_FG

PKG_DE

PKG_ABC

ODI 10g

Packages

INT_F

INT_C

INT_A

INT_C

ODI 11

Load plans

same

effect!


PROCESS FLOW EFFICIENCY ANALYSIS

Standard Flow Orchestration: Stage-(stop)DIMs-(stop)Facts

30

A

B

G

C

F

D

E

sequential

10

30

10

10

10

10

parallel

10

30

10

10

30 + 30 + 10 = 70

G

A

B

F

D

E

C

10

30

10


PROCESS FLOW EFFICIENCY ANALYSIS

Standard Flow Orchestration: Stage-(stop)DIMs-(stop)Facts

30

A

B

G

C

F

D

E

sequential

10

10

10

10

10

30

parallel

10

30

10

10

30 + 30 + 10 = 70

B

D

E

C

G

A

F

10

30

10

  • DOWNSIDES:

  • POSSIBLE INEFFICIENCIES (IDLE RESOURCES)


PROCESS FLOW EFFICIENCY ANALYSIS

30

A

B

G

C

D

F

E

OPTIMIZATION ATTEMPT

10

10

10

30

10

10


PROCESS FLOW EFFICIENCY ANALYSIS

30

A

B

G

C

F

D

E

OPTIMIZATION ATTEMPT

sequential

30

10

10

10

10

10

30 + 10

10 + 30

+ 10 = 50

parallel

10

30

10

10

G

F

B

C

A

E

D

10

30

10

70  50 = 1.4 times quicker!

  • UPSIDE:

  • EFFICIENCY IMPROVED


ADVANCEDData Flow example


Enterprise DWH Data Flow example


Enterprise DWH Data Flow example


PROCESS FLOW EFFICIENCY ANALYSIS

30

A

B

G

C

F

D

E

OPTIMIZATION ATTEMPT

sequential

30

10

10

10

10

10

30 + 10

10 + 30

+ 10 = 50

parallel

10

30

10

10

G

F

B

C

A

E

D

10

30

10

70  50 = 1.4 times quicker!

  • UPSIDE:

  • EFFICIENCY IMPROVED

  • DOWNSIDES:

  • TIMINGS KNOWLEDGE REQUIRED

  • OVERALL DEPENDECY KNOWLEDGE REQURED


PROCESS FLOW EFFICIENCY ANALYSIS

30

A

B

G

C

F

D

E

OPTIMIZATION ATTEMPT

sequential

30

10

10

10

10

10

parallel

10

30

10

70

10

30 + 30 + 10 = 70

A

B

C

E

F

G

D

10

30

10

70

  • DOWNSIDE:

  • INEFFICIENCY EXISTS BUT CAN’T BE RESOLVED

  • CONSUMER WAITING & IMPACT


TraditionalScheduling - limitations

  • Possible inefficiencies (idle resources)

  • Timings knowledge required

  • Overall dependecy knowledge requred

  • Inefficiency exists but can’t be resolved

  • Consumer waiting & impact


TraditionalScheduling - limitations

  • Possible inefficiencies (idle resources)

  • Timings knowledge required

  • Overall dependecy knowledge required

  • Inefficiency exists but can’t be resolved

  • Consumer waiting & impact

SCHEDULER


DEPENDENCY DRIVENScheduling

C

E

A

E

D

C

B

C

E

D

D

A

B

B

A

C

E

E

B

A

C

D

D

A

C

D

E

B

A

C

B

B

D

A

E


DEPENDENCY DRIVENScheduling

C

A

B

E

D

C

B

D

E

E

C

A

B

A

D

C

D

B

A

C

D

E

E

B

A

D

E

B

A

C

C

B

PACKGAGES&LOAD PLANS

D

A

E


PROCESS FLOW EFFICIENCY ANALYSIS

sequential

30

30

A

A

B

B

G

G

C

C

D

F

F

D

E

E

30

10

10

10

30

10

10

10

10

10

10

10

parallel

10

30

10

10

30 + 30 + 10 = 70

10

30

10

70

D

C

B

A

F

G

E

70

10

10

10

10

10

30

30


PROCESS FLOW EFFICIENCY ANALYSIS

sequential

30

30

A

A

B

B

G

G

C

C

F

D

D

F

E

E

30

10

10

30

10

10

10

10

10

10

10

10

parallel

10

30

10

10

30 + 30 + 10 = 70

10

30

10

30

70

G

D

C

E

F

A

B

70

70

10

10

10

10

10

30

30

70  30 = 2.3 times faster!


Dependency Driven Scheduling

  • Simplifies orchestrating the flow

    • only immediate upstream definition required

    • execution timings not relevant

    • self-adapts in the most effective way

  • Improves overall E-LT performance

    • Less idle resources – better utilization

    • Independency

    • unveilsits full potential in complex Enterprise class DWHs (Inmon)


Dependency Driven Scheduling

  • Notifications

    • errors (+auto-restartability)

    • finish summary

    • logging

  • Multiple/overlapping E-LT streams

    • load with different frequencies

  • Parameterization

    • improved system stress control

    • process prioritization


FIRST RUN

10

processes


FIRST RUN

TODAY

10

584

1389

DEPENDENCIES

processes

processes


FIRST RUN

TODAY

10

584

1389

DEPENDENCIES

processes

processes

132 231 SCENARIOS RUN


FIRST RUN

TODAY

10

584

1389

DEPENDENCIES

processes

processes

132 231 SCENARIOS RUN

12h43m

TIME

LOAD PLANS


FIRST RUN

TODAY

10

584

1389

DEPENDENCIES

processes

processes

132 231 SCENARIOS RUN

2.9

12h43m

4h21m

TIME

TIMES

FASTER

LOAD PLANS

SCHEDULER


Enterprise DWH Data FloW


Release 1.0


Release 2.0 TST


TESTING Release 2.0


Deploy Release 2.0 PRD


The Hot fix SITUATION


Release frequently


CI environment


CI environment


The build master


AUTOMATE Stuff


ODI vs. Source control


ODI structure


Beyond intra build Dependencies


ad
  • Login