1 / 13

Module 5 Metadata, Tools, and Data Warehousing

Module 5 Metadata, Tools, and Data Warehousing. Section 4 Data Warehouse Administration. Data Warehouse and Characteristics.

zyta
Download Presentation

Module 5 Metadata, Tools, and Data Warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Module 5 Metadata, Tools, and Data Warehousing ITEC 450 Section 4 Data Warehouse Administration

  2. Data Warehouse and Characteristics A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data that is designed for query and analysis rather than for transaction processes. • Subject-oriented – data pertains to a particular subject instead of the many subjects pertinent to the company’s ongoing operations. • Integrated – consistent naming conventions, formats, encoding structures; from multiple data sources • Time-variant – data is identified with a particular time period, can study trends and changes • Non-updatable – data is stable in a data warehouse. Data loaded, and should not be removed. ITEC 450

  3. Comparison of Database Characteristics ITEC 450

  4. Data Warehouse and Business Intelligence • A data warehouse usually contains historical data derived from transaction data and other sources. • It enables an organization to consolidate data. • It includes • An extraction, transportation, transformation, and loading (ETL) solution • An online analytical processing (OLAP) engine • Client analysis tools • Reporting ITEC 450

  5. Analytical vs. Transaction Processing • Analytical processing – informational systems • DSS – decision support system • OLAP – online analytical processing • Data mining – the process of mining or discovery of new information in terms of patterns or rules from vast amounts of data • Transaction processing – operational system • OLTP – online transaction processing ITEC 450

  6. Data Warehouse Design • Star schema - data modeling technique used to map multidimensional decision support data into a relational database. It is excellent for ad-hoc queries, but bad for online transaction processing. It contains four components: • Fact table • Dimension tables • Attributes • Attribute hierarchies • Snowflake schema – a star schema in which the dimension tables have additional relationships ITEC 450

  7. Star Schema Components ITEC 450

  8. Star Schema Example ITEC 450

  9. Data Movement – ETL Process • ETL – Extract, Transform, and Load • Capture – extract or obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse • Scrub or data cleansing – uses pattern recognition and AI techniques to upgrade data quality • Transform – convert data from format of operational system to format of data warehouse • Load – place transformed data into the warehouse and create indexes ITEC 450

  10. Data Warehouse Performance • Perspectives of data warehouse performance • Extract performance – how ETL process performs • Data management – database design and data quality • Query performance – OLAP tuning • Server performance – hardware support • Automated summary tables • Provide a proper set of aggregate information • Commonly implement with materialized views or batch operation tables • DBMS features to support data warehousing • Materialized views – automatically creation of summaries • Bitmap indexes – widely used in data warehousing, in addition to B-tree • Parallel execution – multiple processes work together simultaneously to run a single SQL statement ITEC 450

  11. Module 5 Metadata, Tools, and Data Warehousing ITEC 450 Section 5 DBA Rules of Thumb

  12. The Rules of Thumb • Personal DBA handbook • Write down your own experience • Categorize them in a searchable note or repository • Backup everything and plan for worst all the time • Before making any changes, ensure that you can recover from them • Automation and share your knowledge • Create a systematic way to troubleshoot problems • Create, reuse and share scripts • Knowledge sharing will open many revenues for you • Next levels • Understand the business, not just the technology • Keep up-to-date on technology ITEC 450

  13. Course Summary (Your Learning) • DBA Roles and Responsibilities • DBMS Architecture, Physical and Logical Structures • DBMS Installation and Database Creation • Database Connectivity and Network Components • Database Security and Audit Capability • Database Backup and Recovery • Database Monitoring, DBMS System Tuning, Physical Configuration Optimization • SQL Query Coding and Tuning, Data Loading • Database Metadata, Data Dictionary • Data Warehouse Characteristics and Overview ITEC 450

More Related