
Informix Formation

Informix Formation. Chetana Mehta (chetana@pspl.co.in), PSPL, Pune. Outline: Overview of Formation; PSPL’s role; Future work. Data warehouse architecture components: Relational Databases, Extraction/Cleansing, Optimized Loader, ERP Systems, Data Warehouse Engine, Analyze/Query, Purchased Data, Legacy Data.


Presentation Transcript


  1. Informix Formation Chetana Mehta (chetana@pspl.co.in) PSPL, Pune

  2. Outline • Overview of Formation • PSPL’s role • Future work

  3. Data Warehouse Architecture (diagram) • Sources: Relational Databases, ERP Systems, Purchased Data, Legacy Data • Extraction/Cleansing • Optimized Loader • Data Warehouse Engine • Metadata Repository • Analyze/Query

  4. What is ETL? • Extract data from existing operational and legacy sources, transform it, and load it into the warehouse. • Issues: • Sources of data for the warehouse • Data quality at the sources • Merging different data sources • Data transformation • How to propagate updates (on the sources) to the warehouse • Terabytes of data to be loaded
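The extract, transform, and load steps on this slide can be illustrated with a minimal sketch. It is not Formation's implementation: the CSV sample, the `sales` table, and the function names are all assumptions made for illustration, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an operational or legacy extract.
LEGACY_CSV = """id,name,amount
1, Alice ,10.50
2,Bob,not-a-number
3,carol,7
"""

def extract(text):
    """Extract: read raw rows from a source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cleanse names and drop rows whose amount is not numeric."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # data-quality problem at the source: skip the row
        clean.append((int(row["id"]), row["name"].strip().title(), amount))
    return clean

def load(conn, rows):
    """Load: bulk-insert the cleansed rows into the (assumed) warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (id INTEGER, name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(LEGACY_CSV)))
warehouse_rows = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
```

Skipping the bad row is only one possible cleansing policy; a real pipeline might instead route rejected rows to an error table for later inspection.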

  5. Overview of Formation • ETL Tool • User-friendly • Scalable

  6. Operators • Join - Hash, Non-equi, Nested loop, Sort-merge • Aggregate/GroupBy • Sort • Deduplicate • Surrogate Key
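As a rough illustration of three of the operators listed on this slide, the sketch below shows a hash join, a deduplicate, and a surrogate-key assignment over lists of dictionaries. The function names, the record layout, and the `sk` column are assumptions; the slides do not describe how Formation implements these operators.

```python
from collections import defaultdict

def hash_join(left, right, key):
    """Equi-join: build a hash table on `right`, then probe it with `left`."""
    table = defaultdict(list)
    for r in right:
        table[r[key]].append(r)
    return [{**l, **r} for l in left for r in table.get(l[key], [])]

def deduplicate(rows, key):
    """Keep the first record seen for each key value."""
    seen = set()
    out = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

def add_surrogate_key(rows, natural_key, column="sk"):
    """Assign a dense integer key per distinct natural-key value."""
    mapping = {}
    out = []
    for row in rows:
        sk = mapping.setdefault(row[natural_key], len(mapping) + 1)
        out.append({**row, column: sk})
    return out

# Hypothetical sample records.
orders = [{"cust": "A", "qty": 1}, {"cust": "B", "qty": 2}, {"cust": "A", "qty": 1}]
customers = [{"cust": "A", "city": "Pune"}]

joined = hash_join(orders, customers, "cust")
unique_orders = deduplicate(orders, "cust")
keyed_orders = add_surrogate_key(orders, "cust")
```

The hash join builds its table on the smaller input here (`customers`), which is the usual choice because the build side must fit in memory.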

  7. Performance Subsystem • Periodic statistics • Summary statistics • Operator summary • Group summary • Performance hints

  8. Periodic Statistics • No. of records pushed/pulled • Memory used • Disk reads/writes • Temporary space used
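One way to picture periodic statistics is a pass-through wrapper that records a snapshot every fixed number of records. The sketch below is an assumption-laden illustration: the wrapper name, the snapshot fields, and the use of Python's `tracemalloc` as a stand-in for "memory used" are not taken from Formation, and disk and temporary-space counters are omitted.

```python
import tracemalloc

def with_periodic_stats(rows, every, snapshots):
    """Pass rows through, appending a snapshot every `every` records."""
    pulled = 0
    for row in rows:
        pulled += 1
        if pulled % every == 0:
            # get_traced_memory() returns (current, peak) in bytes.
            current, _peak = tracemalloc.get_traced_memory()
            snapshots.append({"records_pulled": pulled, "memory_bytes": current})
        yield row

tracemalloc.start()
snapshots = []
total = sum(with_periodic_stats(range(1000), every=250, snapshots=snapshots))
tracemalloc.stop()
```

Sampling by record count keeps the overhead proportional to throughput; a time-based sampler would be an equally plausible design.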

  9. Summary Statistics • No. of records pulled/pushed • Record size • Time when first/last record sent/received • No. of unique keys/groups • Ratio of output size to input size • Selectivity
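The summary statistics on this slide can be sketched as counters attached to a single operator. In the example below, a filter operator records how many records it received and sent, when the first record arrived and the last was sent, and a selectivity figure. The class and field names are hypothetical, and selectivity is assumed here to mean records out divided by records in; the slides do not give Formation's exact definition.

```python
import time

class OperatorStats:
    """Summary counters for one operator (names are illustrative)."""

    def __init__(self):
        self.records_in = 0
        self.records_out = 0
        self.first_received = None
        self.last_sent = None

    @property
    def selectivity(self):
        # Assumed definition: fraction of input records that pass the operator.
        return self.records_out / self.records_in if self.records_in else 0.0

def monitored_filter(rows, predicate, stats):
    """Filter rows while updating the summary counters in `stats`."""
    for row in rows:
        if stats.first_received is None:
            stats.first_received = time.monotonic()
        stats.records_in += 1
        if predicate(row):
            stats.records_out += 1
            stats.last_sent = time.monotonic()
            yield row

stats = OperatorStats()
kept = list(monitored_filter(range(10), lambda x: x % 2 == 0, stats))
```

Because the counters live outside the operator logic, the same `OperatorStats` object could be rolled up into an operator summary or a group summary.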

  10. Performance Hints • Ideal memory size • Suggested memory size • Parallelizing

  11. Future work • Memory-cognizant optimization • Parametric query optimization • Operator ordering • XML extensions
