
MIS 451 Building Business Intelligence Systems


mura

Presentation Transcript


  1. MIS 451 Building Business Intelligence Systems: Introduction to Data Mining

  2. Why data mining? • OLAP can only provide shallow data analysis: it answers "what" questions • Ex: sales distribution by product

  3. Why data mining? • Shallow data analysis is not sufficient to support business decisions, which depend on "how" questions • Ex: how to boost sales of other products • Ex: when people buy product 6, what other products are they likely to buy? (cross-selling)

  4. Why data mining? • OLAP can only do shallow data analysis • OLAP is based on SQL, for example: SELECT PRODUCTS.PNAME, SUM(SALESFACTS.SALES_AMT) FROM DBSR.PRODUCTS PRODUCTS, DBSR.SALESFACTS SALESFACTS WHERE PRODUCTS.PRODUCT_KEY = SALESFACTS.PRODUCT_KEY GROUP BY PRODUCTS.PNAME; • SQL is designed for retrieving and aggregating data, which makes it ill-suited to implementing complicated algorithms. • Complicated algorithms need to be developed to support deep data analysis: this is data mining
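To make the query on this slide concrete, here is a minimal sketch that runs it against an in-memory SQLite database using Python's standard `sqlite3` module. The table and column names come from the slide; the `DBSR` schema prefix is dropped, an `ORDER BY` is added only to make the output order predictable, and the sample rows and the helper name `sales_by_product` are invented for illustration.

```python
import sqlite3

def sales_by_product(product_rows, sales_rows):
    """Return (product name, total sales) pairs, as in the slide's OLAP query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE PRODUCTS (PRODUCT_KEY INTEGER PRIMARY KEY, PNAME TEXT)")
    conn.execute("CREATE TABLE SALESFACTS (PRODUCT_KEY INTEGER, SALES_AMT REAL)")
    conn.executemany("INSERT INTO PRODUCTS VALUES (?, ?)", product_rows)
    conn.executemany("INSERT INTO SALESFACTS VALUES (?, ?)", sales_rows)
    # Same join and aggregation as the slide; ORDER BY is an addition.
    query = """
        SELECT PRODUCTS.PNAME, SUM(SALESFACTS.SALES_AMT)
        FROM PRODUCTS, SALESFACTS
        WHERE PRODUCTS.PRODUCT_KEY = SALESFACTS.PRODUCT_KEY
        GROUP BY PRODUCTS.PNAME
        ORDER BY PRODUCTS.PNAME
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result

# Hypothetical sample data, not taken from the course.
products = [(1, "Widget"), (2, "Gadget")]
sales = [(1, 10.0), (1, 5.0), (2, 7.5)]
totals = sales_by_product(products, sales)
print(totals)
```

The result is a "what" answer (sales totals per product). It says nothing about how to raise those totals, which is the gap the slide attributes to OLAP.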

  5. Why data mining? • OLAP results generated from data sets with a large number of attributes are difficult to interpret • Ex: cluster my company's customers for target marketing • Pick two attributes related to a customer: income level and sales amount

  6. Why data mining? • Ex: cluster my company's customers for target marketing • Pick three attributes related to a customer: income level, education level, and sales amount
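The slides do not name a clustering algorithm, so the sketch below uses k-means purely as one common choice. It groups customers described by numeric attributes such as income level and sales amount, and works unchanged when a third attribute such as education level is added. The customer values are invented, the first k points serve as starting centroids to keep the run deterministic, and no attribute scaling is done, which a real analysis would likely need.

```python
import math

def kmeans(points, k, iterations=20):
    """Cluster points (tuples of numbers) into k groups with basic k-means."""
    # Assumption: deterministic start using the first k points as centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(values) / len(members) for values in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return assignment, centroids

# Hypothetical customers: (income level, sales amount).
customers = [(20, 100), (22, 120), (21, 90), (80, 900), (85, 950), (78, 880)]
assignment, centroids = kmeans(customers, k=2)
print(assignment)
```

With two attributes the groups could still be spotted on a scatter plot. The point of the slides is that this stops being practical as attributes are added, while the algorithm does not change.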

  7. What is data mining? • Data mining is the process of extracting hidden and interesting patterns from data. • Data mining is a step in the process of Knowledge Discovery in Databases (KDD).

  8. Steps of the KDD Process (figure) • Data → Step 1: Selection → Target Data → Step 2: Cleaning → Preprocessed Data → Step 3: Transformation → Transformed Data → Step 4: Data Mining → Patterns → Step 5: Interpretation & Evaluation → Knowledge

  9. Steps of the KDD Process • Step 1: select the columns (attributes) and rows (records) of interest to be mined • Step 2: clean errors from the selected data • Step 3: transform the data into a form suitable for high-performance data mining • Step 4: data mining • Step 5: filter out uninteresting patterns from the data mining results
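The five steps above can be sketched as a chain of small functions. This is only an illustration of how the stages hand data to one another: the records, the field names, and the "mining" step (a simple ranking of products by total sales) are all assumptions standing in for real data and a real algorithm.

```python
from collections import Counter

# Hypothetical raw records; None and negative amounts represent errors.
raw = [
    {"customer": "a", "product": "milk", "amount": 3.0, "store": "north"},
    {"customer": "a", "product": "bread", "amount": 2.0, "store": "north"},
    {"customer": "b", "product": "milk", "amount": None, "store": "south"},
    {"customer": "b", "product": "milk", "amount": 4.0, "store": "south"},
    {"customer": "c", "product": "eggs", "amount": -1.0, "store": "north"},
]

def select(records):
    # Step 1: keep only the attributes of interest.
    return [{"product": r["product"], "amount": r["amount"]} for r in records]

def clean(records):
    # Step 2: drop records with missing or negative amounts.
    return [r for r in records if r["amount"] is not None and r["amount"] >= 0]

def transform(records):
    # Step 3: reshape into a compact form, here total amount per product.
    totals = Counter()
    for r in records:
        totals[r["product"]] += r["amount"]
    return dict(totals)

def mine(totals):
    # Step 4: placeholder for a mining algorithm; ranks products by total.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

def interpret(patterns, min_total):
    # Step 5: filter out patterns below an interest threshold.
    return [p for p in patterns if p[1] >= min_total]

result = interpret(mine(transform(clean(select(raw)))), min_total=3.0)
print(result)
```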

  10. Data mining: on what kind of data? • Transactional database • Data warehouse • Flat file • Web data: Web content, Web structure, Web log

  11. Major data mining tasks • Association rule mining – cross selling • Clustering – target marketing • Classification – potential customer identification, fraud detection
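As a small illustration of the first task, the sketch below computes the two standard measures of an association rule, support and confidence, for a rule of the form "customers who buy product 6 also buy product 2". The baskets and product numbers are invented, and `rule_stats` is a hypothetical helper name. A real association rule miner would search over all candidate rules instead of checking one.

```python
def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    n = len(transactions)
    if n == 0:
        return 0.0, 0.0
    with_antecedent = sum(1 for t in transactions if antecedent in t)
    with_both = sum(
        1 for t in transactions if antecedent in t and consequent in t
    )
    # Support: share of all baskets containing both items.
    support = with_both / n
    # Confidence: share of antecedent baskets that also contain the consequent.
    confidence = with_both / with_antecedent if with_antecedent else 0.0
    return support, confidence

# Hypothetical baskets of product numbers.
baskets = [{6, 2}, {6, 2, 9}, {6, 4}, {2, 9}, {6, 2}]
support, confidence = rule_stats(baskets, 6, 2)
print(support, confidence)
```

A rule with high confidence is a candidate for cross-selling: when a customer puts the antecedent product in the basket, the consequent product is a reasonable one to recommend.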

  12. Reading: data mining book, chapter 1
