data warehousing and data mining n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
DATA WAREHOUSING AND DATA MINING PowerPoint Presentation
Download Presentation
DATA WAREHOUSING AND DATA MINING

Loading in 2 Seconds...

play fullscreen
1 / 17

DATA WAREHOUSING AND DATA MINING - PowerPoint PPT Presentation


  • 89 Views
  • Uploaded on

DATA WAREHOUSING AND DATA MINING. MARK VROOMAN AND AMINAT ONI PROJECT REPORT KENNESAW STATE UNIVERSITY DEPARTMENT OF COMPUTER SCIENCE CS8630- DATABASE ADMINISTRATION.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'DATA WAREHOUSING AND DATA MINING' - cwen


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
data warehousing and data mining

DATA WAREHOUSING AND DATA MINING

MARK VROOMAN AND AMINAT ONI

PROJECT REPORT

KENNESAW STATE UNIVERSITY

DEPARTMENT OF COMPUTER SCIENCE

CS8630- DATABASE ADMINISTRATION

purpose of data warehouse
Beazerhomes is a builder that has properties all over the United States. They have been building homes and apartments since 1995 and they feel the need to have a data warehouse to help record and store what the selling and renting trends and patterns for their buildings are and have been. This will help top management make decisions such as where to build more properties, which properties to sell and which to rent out, what incentives make their buildings high sellers, and what prices to put on the buildings to maximize profit and at the same time, encourage buyers.

The data warehouse will have top management access that will include all departments, middle management access that will cover areas they manage, and staff access for their individual sections of work. It will also allow for simple queries on past and present data.

Purpose of data warehouse
benefits of data warehouse
All the data from all the OLTP systems can be integrated and made consistent for easier query.

Data present in the data warehouse cannot be modified, but can only be updated and deleted. This ensures that the data gotten straight from the OLTP systems are the same ones used in business analysis.

Data stored in a data warehouse contains more detailed information on a more historical basis. This makes for a better analysis of home-buying and selling trends and patterns.

Benefits of Data warehouse
slide4

Data warehouse design

- Dimensional modeling is a technique used in describing the database component of a data warehouse. It organizes data in a form that corresponds to the way users query data. A dimensional model includes one more fact tables and a series of smaller tables called dimension tables. Different schemas can be used for dimensional modeling. These are star schema, snowflake schema, and starflake schema. In a star schema, the fact table is surrounded by dimension tables containing reference data that can be denormalized. It can also be described as each dimension table being linked to the primary key fact table.

- The data warehouse design chosen by Beazerhomes uses the star schema for its dimensional modeling. The fact tables; Lease, PropertySale, PropertyMaintenance, PropertyViewing, and Advert contain the primary keys for the dimension tables

slide6

OLTP System

- The OLTP (Online Transaction processing) system is the daily processing database that contains data used daily, on a short term basis. The data from the OLTP is what is stored in the data warehouse. The “DreamHome” contains tables from the OLTP for BeazerHomes and shows information on how each property is sold or rented.

The diagram below contains the fact and dimension tables for the OLTP system, “DreamHome”:

slide8

Data warehouse

  • The data warehouse contains historical and long-term information from the OLTP system.
  • The data will be loaded via an operational datastore. The data will be extracted from OLTP database by the operational datastore after the end of business hours on Friday (assuming a standard business week). The data will then be cleansed and its integrity will be verified. The data will then be loaded into the data warehouse by the operational datastore via the warehouse manager. Prior to the loading of the data, an incremental backup of the data warehouse will be performed. This will ensure a good rollback point if issues occur with the data loading. By carrying these activities out during non-business hours, there will be increased availability of resources and will not slow down operations during normal hours. This also allows time for a recovery if necessary.
slide11

E-R Description

  • The above diagram is the entity-relationship diagram for the data warehouse, that shows how each dimension or fact table interacts with the other tables. These are described as follows:
  • Promotion uses Advert displayed in the Newspaper, which recommends PropertyForRent and promotes PropertyForSale.
  • - Advert describes PropertyForSale.
  • - ClientRenter holds Lease and seeks PropertyViewing.
  • - Branch registers ClientRenter and offers PropertyForRent, sells PropertyForSale, has Staff, and contacts ClientBuyer.
  • - ClientBuyer requests PropertyViewing and agrees to PropertySale.
  • - PropertySale is for PropertyForSale.
  • - PropertyForRent is placed in Advert and takes PropertyViewing. It is also leased via Lease and requires PropertyMaintenance.
  • - Staff oversees PropertyForRent, manages PropertyForSale, and attends to PropertyMaintenance.
slide12

- Owner owns provides (rent to) PropertyForRent or owns PropertyForSale

 - PropertyViewing views PropertyForSale.

slide13

SAMPLE QUERIES

SELECT COUNT (timeID)

FROM PropertySale

Where timeID =

(SELECT (timeID) FROM Time WHERE year BETWEEN ‘2001’ AND ‘2003’);

This query will provide a count of the number of properties that were sold between the beginning of 2001 and the end of 2003. The BETWEEN command includes the endpoints of the range.

slide14

SAMPLE QUERIES, CONTD.

SELECT SUM (sellingPrice) AS TotalRevenue

FROM PropertySale

Where timeID =

(SELECT (timeID) FROM Time WHERE year BETWEEN ‘2001’ AND ‘2003’);

This query will give the total of sales (TotalRevenue) for the dates from the beginning of 2001 to the end of 2003.

slide15

Data Dictionary

The data dictionary is a catalog of data files in the data warehouse. It also contains the characteristics of each file and is only used for the management of the files. An excerpt from Beazerhomes’ data warehouse can be seen below. Although a data dictionary does not contain data, without one, a management system will not be able to access data information from the warehouse.

slide17

A data warehouse is a beneficial database storage system for organizations that want to have a competitive edge in their industry through using past and present data to view positive and negative trends and patterns in order to improve their technology, marketing, financial, and sales. To further make the benefits of a data warehouse more advantageous, data mining tools should be used to uncover unknown patterns and trends that may have been lost in large databases. Using data mining to reveal more complex queries, reports, and patterns will give an even more beneficial advantage to organizations.