html5-img
1 / 12

The Art and Science of Data Mining

The Art and Science of Data Mining. By Marianna Dizik, Williams-Sonoma, Inc. . What is Data Mining?. Data Mining is the process of exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns and rules.

mahala
Download Presentation

The Art and Science of Data Mining

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The Art and Scienceof Data Mining By Marianna Dizik, Williams-Sonoma, Inc.

  2. What is Data Mining? • Data Mining is the process of exploration and analysis, by automatic or semi-automatic means, of large quantities of data in order to discover meaningful patterns and rules

  3. Directed Data Mining: Classification Estimation Prediction The goal: To build a model that describes/predict one particular variable of interest in terms of the rest of available data Undirected Data Mining: Affinity grouping or association rules Clustering Description or visualization The goal: To establish some relationship among all the variables What Can Data Mining Do?

  4. The Business Context for Data Mining • DM as a Research Tool • Pharmaceutical, bioinformatics • DM for Process Improvement • Manufacturing process control • DM for Marketing • DM for Customer Relationship Management

  5. Technical Context for Data Mining • DM and Machine Learning: • AI, pattern recognition • DM and Statistics • DM and Decision Support • Data Warehouse: • OLAP, Data Marts, and Multidimensional Databases • Single view of the data to increase speed and responsiveness • DM and Computer Technology

  6. The Societal Context for Data Mining • Privacy

  7. Classical Techniques:Statistics • Descriptive Statistics and Visualization • Statistic for Prediction: • Linear Regression • Power Transformation • WLS Regression • Logistic Regression: ln(p/(1-p))=a+Bx • p/(1-p) – odds ratio • Ln(p/(1-p)) - logit

  8. Classic Techniques: Neighborhoods • Clustering: grouping similar objects • Hierarchical and Non-Hierarchical • Clustering for clarity (data reduction) • Clustering for action (policies, rules, promotions) • Clustering for outliers (fraud detection) • Nearest Neighbor Prediction • Recommendation Engines • Response Prediction • Stock Market Predictions • Building Taxonomies • Text Mining, Spam Identification

  9. Next Generation Techniques:Trees • Decision Trees: Segmentation with a Purpose • Exploration • Preprocessing • Prediction • CHAID (Chi-Square Interaction Detector) • CART (Gini metrics)

  10. Next Generation Techniques:Neural Networks • Neural Networks: Black Box Approach • Supervised Learning • Predictions • Unsupervised Learning • Clustering • Kohonen Network • Outlier Analysis • Feature Extraction • NN Models • Input Nodes • Hidden Nodes • Output Nodes

  11. Next Generation Techniques:Rule Induction • Rule Induction: Knowledge Discovery in unsupervised learning system • “If pickles are purchased, then ketchup is purchased” • Rule +Accuracy + Coverage • Target • The Antecedent • The Consequent • Based on Accuracy • Based on Coverage • Based on “Interestingness” • Rules do not Imply Causality

  12. Conclusion: Next GenerationBusiness Intelligence and Data Mining • Information Mining: • Uncover associations, patterns and trends • Detect deviations • Group and classify information • Develop predictive models • Knowledge Management (KM) • Text Mining • Data “Archeology”: • Semantic Computing Tools • Artificial Perception Systems

More Related