1 / 25

Databases Management Systems

Databases Management Systems. Data Warehousing Lectures based on material from Phil Trinder (HW) Monica Farrow G30 email : monica@macs.hw.ac.uk. Data Warehouse. A data warehouse is a specialised database to support strategic decision making Decision making Analyse the problem, e.g.

Jimmy
Download Presentation

Databases Management Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Databases Management Systems Data Warehousing Lectures based on material from Phil Trinder (HW) Monica Farrow G30 email : monica@macs.hw.ac.uk DBIS:Data Warehousing

  2. DBIS:Data Warehousing Data Warehouse • A data warehouse is a specialised database to support strategic decision making • Decision making • Analyse the problem, e.g. • Why are my sales not meeting my targets? • What products are not meeting their targets? • What are the trends for the failing products? • Generate alternative solutions, evaluate them, and choose the best

  3. Decision Support Systems • These are used by management to make strategic or policy decisions • They have existed for a long time • Characteristics • Aimed at loosely specified problems • Combine models and analytical approaches with data retrieval • Good usability for non-specialist use • Flexible: to support multiple decision-making approaches DBIS:Data Warehousing

  4. A wine club example • 100,000 members, 2000 wines, 150 suppliers, 750,000 orders per year • Systems : storage technology • Member administration : indexed sequential files • Stock control: relational database • Order processing: relational database • Despatch: proprietary database DBIS:Data Warehousing

  5. Wine Club Operational Schema Member places supplies Supplier Wine MemberOrder in On OrderItem Stock Is for DBIS:Data Warehousing

  6. Wine Club Situation • Competitors have moved in. Is our market share falling? • What products are increasing/decreasing in popularity? • Which products are seasonal? • Which members place regular orders? • Are some products more popular in certain parts of the country? • Which members concentrate on particular products? DBIS:Data Warehousing

  7. Strategic vs Operational Issues • Strategic: planning and policy making, long term and broad brush, higher levels of management, e.g. • When to launch a new product? • Should a supermarket open on a Sunday? • Operational: day-to-day running of business. Details and immediate, lower levels of management • Which items are out of stock? • What is the status of order 34522? DBIS:Data Warehousing

  8. Motivation for data warehousing • Operational data is not suitable to guide strategic decisions • Need to examine trends • What is happening over time? • Solution: record sales daily and analyse them • This is the start of a warehouse • Operational data is usually archived periodically DBIS:Data Warehousing

  9. Data warehouse characteristics • Subject-oriented e.g. sales • Non-volatile e.g. once data is stored, it isn’t changed • Integrated, data from multiple (operational) sources is accumulated in an integrated format • Time variant: data is recorded against time to allow trend analysis • Some data duplication • Less joins, more indexes • More data, derived data and attributes DBIS:Data Warehousing

  10. Data Warehouse design: dimensional analysis • Used to identify the requirements of the warehouse • What are the aspects of the data that are strategically important e.g. • Member • Product • Time always • We don’t know in advance exactly what the queries will be DBIS:Data Warehousing

  11. Smith Jones Bloggs MEMBER Q1 2008 Q4 2007 TIME Q3 2007 Macon Chablis Merlot Chardonnay PRODUCT 3D analysis DBIS:Data Warehousing

  12. Star Schema SALES Centralfacttable Wine Area Time Member DBIS:Data Warehousing

  13. Warehouse construction Source1 Extraction Presentation Source n Aggregate Navigators Integration DBMS DBIS:Data Warehousing

  14. Extraction • Retrieve data from all data sources: files, databases etc • Must be add-on to existing operational system. For example, • Day-end extraction run • Trigger to record sale DBIS:Data Warehousing

  15. Integration • Format Integration, similar to type mismatch • Examples: address or gender • ‘male’, ‘female’ • ‘M’, ‘F’ • 0 and 1 • Semantic integration: does a datum mean the same in the data being integrated? • Example – a sale • order processing: order received • stock control: extracted items from physical warehouse • despatch: goods shipped DBIS:Data Warehousing

  16. Star Schema Wine winecode, winename, vintage, description, price Area areacode, description SALES Centralfacttable winecode, membercode, areacode, quantity, itemcost Time timecode, date, periodno, quarterno, year Member membercode, membername, memberAddress DBIS:Data Warehousing

  17. Warehouse Database • Centre of star schema becomes a relation: the fact table – numeric facts and foreign keys • Sales(membercode, winecode, timecode, areacode, qty, itemcost) • Each dimension becomes a relation: a dimension table • Member(membercode, membername, memberaddress) • Wine(winecode, name, vintage, description, price) • Area(areacode, description) • There is ALWAYS a time dimension table • Time(timecode, date, periodno, quarterno, year) DBIS:Data Warehousing

  18. Using the Warehouse • The strategic questions can now be investigated using data extracted by SQL queries • For example, to discover which wines have increasing and decreasing sales, we can retrieve a table giving the total sales for each wine against time: • SELECT w.winename, t.period_number,SUM(s.qty)FROM sales s, wine w, time tWHERE s.winecode = w.winecode AND s.timecode = t.timecodeGROUP BY w.winename, t.periodnoORDER BY w.winename, t.periodno DBIS:Data Warehousing

  19. Using the Warehouse • The resulting table might be visualised as a graph, with a different coloured line for each product • Usually involves a lot of indexes • Bitmap indexes are widely used • May use materialised views • Commonly used sums, counts, joins DBIS:Data Warehousing

  20. Aggregation • Data warehouses grow continuously, and may become very large indeed • Problems: the time to compute a query and the size of the result can be very large indeed • Solution: aggregation DBIS:Data Warehousing

  21. Aggregates • Basic idea: sacrifice detail to reduce the size of the data • Store precomputed tables at a useful level of detail. Similar to stored joins. • Must be carefully selected, e.g. • Sales to each member of each wine summer for each quarter • Sales of each wine summed for each quarter for each month • Sales of each wine summed for each area for each quarter • Levels of aggregation • None(i.e. detail) • Light (e.g. monthly) • Highly (e.g. quarterly) DBIS:Data Warehousing

  22. Aggregate navigator • An aggregate navigator uses information about available aggregates to automatically rewrite queries to use them • It also records aggregates usage, so that unnecessary aggregates can be removed • It can suggest useful new aggregates DBIS:Data Warehousing

  23. Presentation requirements • Must be easy to use • Visualise the results of queries in many ways e.g. charts, graphs, scatter diagrams etc • Make good use of colour and dimensions 2D, 2.5D, 3D, animation Example of 2.5D graph • Have analysis tools: statistical and curve fitting • For example the product sales trend table would be plotted as a graph DBIS:Data Warehousing

  24. Discussion/Exercise • A company sells garden trees from several store locations around the country. People visit the store, and buy trees. The names of the customers are always recorded, and many customers place repeat orders. • The company would like to set up a data warehouse so that they can analyse details such as • Frequency of sales per customer • Which location has the best sales, ranked by season • Top selling tree by month • Etc etc • Create a suitable star schema, inventing appropriate attributes DBIS:Data Warehousing

  25. Answer This is a general answer – replace product by trees etc Anything sensible along these lines is ok DBIS:Data Warehousing

More Related