Data Warehouse

Data Warehouse. Yong Shi CSE DEPARTMENT. Strategic delivery of information. The current Situation The never-ending quest to access any information, anywhere, anytime. The problem Data is scattered in many types of incompatible structures. Analytical processing requirements.

Presentation Transcript


  1. Data Warehouse Yong Shi CSE DEPARTMENT

  2. Strategic delivery of information • The current situation: the never-ending quest to access any information, anywhere, anytime. • The problem: data is scattered in many types of incompatible structures.

  3. Analytical processing requirements • Four levels of analytical processing: 1. Simple queries and reports 2. The ability to do “what if” processing 3. Step back and analyze what has previously occurred to bring about the current state of the data 4. Analyze what has happened in the past and what needs to be done in the future to bring about a specific change
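The first two levels above can be illustrated with a minimal sketch. The `sales` table, its columns, and the figures are hypothetical and not from the slides; Python's built-in `sqlite3` module stands in for a real warehouse. A `price_factor` of 1.0 gives a simple report (level 1), while any other value answers a "what if" question (level 2).

```python
import sqlite3

# Hypothetical sales table; names and figures are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, units INTEGER, unit_price REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 10, 2.0), ("east", 5, 2.0), ("west", 8, 3.0)],
)


def revenue_by_region(price_factor=1.0):
    """Revenue per region, with unit prices scaled by price_factor."""
    rows = conn.execute(
        "SELECT region, SUM(units * unit_price * ?) FROM sales "
        "GROUP BY region ORDER BY region",
        (price_factor,),
    )
    return dict(rows.fetchall())


report = revenue_by_region()                   # level 1: simple report
what_if = revenue_by_region(price_factor=1.5)  # level 2: "what if" prices rose 50%?
```

Levels 3 and 4 would need historical snapshots of the data, which this sketch does not model.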

  4. Information data superstore (IDSS) • Definition: The architecture needed to support the far-ranging requirements of the four levels of analysis. • Also called a super data warehouse • A data warehouse is not an end in itself but merely a step on the path to the information data superstore

  5. Why a separate environment is needed • The use of operational systems vs. the data warehouse • The data’s characteristics • The type of access

  6. A strategy for building a data warehouse • Need indicators • Action steps • Three-stage data warehousing process: model (understand) → build (establish) → deploy (implement)

  7. Organizational and cultural issues • Cultural imperatives • Success criteria • Satisfy users’ requirements • Make a significant contribution to the success of the business • The users accept and actively use it • The benefits are not exceeded by the costs • An adequate budget must be in place

  8. Organizational and cultural issues • Success criteria (continued) • The implementation of the data warehouse must not cause other problems that overshadow the benefits • A reasonable schedule must be established

  9. Organizational and cultural issues • End user (client) • Strategic architecture • User liaison • End-user support • Data analyst • Security office • Data administration

  10. Organizational and cultural issues • Database administration • Choosing the initial data and department • Establishing an infrastructure • Training users • Change in the power structure

  11. End Users • A crucial part of the project • Gathering requirements and managing expectations • Cost justification process • Design reviews • User perspective • User training

  12. A technical architecture for DW (diagram) • Components: Design, Data Acquisition, Data Manager, Management, Information Directory, Data Access, Middleware, Data Delivery • Data stores: Source Data, External Data, Warehouse Data

  13. Data Quality • Why is data quality important? Data quality is a critical issue. Poor data quality limits the ability of end users to make informed decisions, has a profound effect on the image of the enterprise, and makes it difficult to make major changes in an organization.

  14. Data Quality • What is data quality? • The data is accurate • The data is stored according to data type • The data has integrity • The data is consistent • The databases are well designed

  15. Data Quality • The data is not redundant • The data follows business rules • The data corresponds to established domains • The data is timely • The data is well understood

  16. Data Quality • The data satisfies the needs of the business • The user is satisfied with the quality of the data and the information derived from that data • There are no duplicate records • Data anomalies
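A few of the criteria listed on slides 14 to 16 can be checked mechanically. The following is a minimal sketch, not a complete quality tool: the record layout, the `VALID_STATES` domain, and the "closed date must not precede opened date" business rule are all assumptions made for illustration. It covers only four criteria: storage by data type, established domains, business rules, and duplicate records.

```python
from datetime import date

# Hypothetical customer records; field names, domain, and rule are assumptions.
RECORDS = [
    {"id": 1, "state": "NY", "opened": date(2020, 1, 5), "closed": None},
    {"id": 2, "state": "ZZ", "opened": date(2021, 3, 1), "closed": None},
    {"id": 2, "state": "CA", "opened": date(2021, 3, 1), "closed": None},
    {"id": "4", "state": "CA", "opened": date(2022, 6, 1), "closed": date(2022, 1, 1)},
]
VALID_STATES = {"NY", "CA"}  # the "established domain" for the state field


def check_quality(records):
    """Return a list of (record position, problem) pairs."""
    problems = []
    seen_ids = set()
    for pos, rec in enumerate(records):
        # Stored according to data type
        if not isinstance(rec["id"], int):
            problems.append((pos, "id is not stored as an integer"))
        # Corresponds to established domains
        if rec["state"] not in VALID_STATES:
            problems.append((pos, "state outside established domain"))
        # Follows business rules
        if rec["closed"] is not None and rec["closed"] < rec["opened"]:
            problems.append((pos, "closed before opened"))
        # No duplicate records (by id)
        if rec["id"] in seen_ids:
            problems.append((pos, "duplicate id"))
        seen_ids.add(rec["id"])
    return problems


problems = check_quality(RECORDS)
```

Criteria such as "the data is well understood" or "the user is satisfied" cannot be tested this way and need input from the users themselves.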

  17. Data Quality • Assessment of existing data quality • Programs that abnormally terminate with data exceptions • Clients who experience errors/anomalies • Clients who do not know or are confused about what the data actually means • Data that cannot be shared due to lack of integration

  18. Data Quality • What data should be improved? The energy should be spent on data where the quality improvement will bring an important benefit to the business. We can ignore unimportant data and obsolete data. Other criteria: improve the data that can be fixed and kept clean.
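One possible way to apply the selection rule above is sketched here. The `DataSet` fields, the 0 to 5 value scale, and the score (business value times estimated error rate) are assumptions introduced for illustration; the slides do not prescribe a scoring formula. Unimportant, obsolete, and unfixable data sets are filtered out first, and the rest are ranked.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    name: str
    business_value: int  # 0 (unimportant) to 5 (critical); hypothetical scale
    error_rate: float    # estimated share of bad records
    obsolete: bool
    fixable: bool        # can be fixed and kept clean


def purification_order(candidates):
    """Drop ignorable data sets, then rank the rest by expected benefit."""
    eligible = [
        d for d in candidates
        if d.business_value > 0 and not d.obsolete and d.fixable
    ]
    # Assumed score: more valuable and dirtier data is cleaned first.
    return sorted(eligible, key=lambda d: d.business_value * d.error_rate, reverse=True)


candidates = [
    DataSet("customer_master", 5, 0.10, obsolete=False, fixable=True),
    DataSet("legacy_fax_numbers", 1, 0.60, obsolete=True, fixable=True),
    DataSet("order_history", 4, 0.05, obsolete=False, fixable=True),
    DataSet("free_text_notes", 3, 0.40, obsolete=False, fixable=False),
]
order = [d.name for d in purification_order(candidates)]
```

In practice the value and error estimates would come from the users and data owners rather than from fixed numbers like these.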

  19. Data Quality • Purification process • Determine the importance of data quality to the organization • Identify the enterprise’s most important data and evaluate the quality. • Determine users’ and owners’ perception of data quality. • Prioritize which data to purify. • Assemble and train a team to clean the data. • Select tools to aid in the purification process, etc.

  20. Data Quality • Data quality case • Lesson 1: If those entering the data have a stake in the data being incorrect, the data will be incorrect. • Lesson 2: Reports may show desired results, but the reports may be highly inaccurate.

  21. Directory/Catalog • The challenge Providing short-term benefit without disabling broader long-term information handling solutions. Getting data into a warehouse is only half of the process.

  22. Security in the data warehouse • Basic security concepts • Physical security • Stand-alone or shared security • Remote access
