130 likes | 271 Views
Module 5 Metadata, Tools, and Data Warehousing. Section 4 Data Warehouse Administration. Data Warehouse and Characteristics.
E N D
Module 5 Metadata, Tools, and Data Warehousing ITEC 450 Section 4 Data Warehouse Administration
Data Warehouse and Characteristics A data warehouse is a subject-oriented, integrated, time-variant, non-volatile collection of data that is designed for query and analysis rather than for transaction processes. • Subject-oriented – data pertains to a particular subject instead of the many subjects pertinent to the company’s ongoing operations. • Integrated – consistent naming conventions, formats, encoding structures; from multiple data sources • Time-variant – data is identified with a particular time period, can study trends and changes • Non-updatable – data is stable in a data warehouse. Data loaded, and should not be removed. ITEC 450
Data Warehouse and Business Intelligence • A data warehouse usually contains historical data derived from transaction data and other sources. • It enables an organization to consolidate data. • It includes • An extraction, transportation, transformation, and loading (ETL) solution • An online analytical processing (OLAP) engine • Client analysis tools • Reporting ITEC 450
Analytical vs. Transaction Processing • Analytical processing – informational systems • DSS – decision support system • OLAP – online analytical processing • Data mining – the process of mining or discovery of new information in terms of patterns or rules from vast amounts of data • Transaction processing – operational system • OLTP – online transaction processing ITEC 450
Data Warehouse Design • Star schema - data modeling technique used to map multidimensional decision support data into a relational database. It is excellent for ad-hoc queries, but bad for online transaction processing. It contains four components: • Fact table • Dimension tables • Attributes • Attribute hierarchies • Snowflake schema – a star schema in which the dimension tables have additional relationships ITEC 450
Star Schema Components ITEC 450
Star Schema Example ITEC 450
Data Movement – ETL Process • ETL – Extract, Transform, and Load • Capture – extract or obtaining a snapshot of a chosen subset of the source data for loading into the data warehouse • Scrub or data cleansing – uses pattern recognition and AI techniques to upgrade data quality • Transform – convert data from format of operational system to format of data warehouse • Load – place transformed data into the warehouse and create indexes ITEC 450
Data Warehouse Performance • Perspectives of data warehouse performance • Extract performance – how ETL process performs • Data management – database design and data quality • Query performance – OLAP tuning • Server performance – hardware support • Automated summary tables • Provide a proper set of aggregate information • Commonly implement with materialized views or batch operation tables • DBMS features to support data warehousing • Materialized views – automatically creation of summaries • Bitmap indexes – widely used in data warehousing, in addition to B-tree • Parallel execution – multiple processes work together simultaneously to run a single SQL statement ITEC 450
Module 5 Metadata, Tools, and Data Warehousing ITEC 450 Section 5 DBA Rules of Thumb
The Rules of Thumb • Personal DBA handbook • Write down your own experience • Categorize them in a searchable note or repository • Backup everything and plan for worst all the time • Before making any changes, ensure that you can recover from them • Automation and share your knowledge • Create a systematic way to troubleshoot problems • Create, reuse and share scripts • Knowledge sharing will open many revenues for you • Next levels • Understand the business, not just the technology • Keep up-to-date on technology ITEC 450
Course Summary (Your Learning) • DBA Roles and Responsibilities • DBMS Architecture, Physical and Logical Structures • DBMS Installation and Database Creation • Database Connectivity and Network Components • Database Security and Audit Capability • Database Backup and Recovery • Database Monitoring, DBMS System Tuning, Physical Configuration Optimization • SQL Query Coding and Tuning, Data Loading • Database Metadata, Data Dictionary • Data Warehouse Characteristics and Overview ITEC 450