Data Mining
Models Created by Data Mining • Linear Equations • Rules • Clusters • Graphs • Tree Structures • Recurrent Patterns
Knowledge Discovery in Databases (KDD) • Select target data • Preprocess data • Transform (if necessary) • Data mine information • Interpret discovered structures
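The five KDD steps can be read as a pipeline in which each stage feeds the next. The following is a minimal sketch, not a real KDD system: the records, field names, age bands, and helper function names are all hypothetical, and the "mining" step is a simple frequency count standing in for a real algorithm.

```python
from collections import Counter

# Hypothetical raw records: (customer_id, age, purchase_category)
raw_records = [
    (1, 34, "books"),
    (2, None, "games"),  # missing age, dropped during preprocessing
    (3, 61, "books"),
    (4, 25, "games"),
    (5, 47, "books"),
]

def select(records):
    """Select target data: keep only the age and category fields."""
    return [(age, category) for _, age, category in records]

def preprocess(rows):
    """Preprocess data: drop rows that contain missing values."""
    return [row for row in rows if None not in row]

def transform(rows):
    """Transform: bucket ages into two coarse bands (an assumed cut at 40)."""
    return [("under_40" if age < 40 else "40_plus", category)
            for age, category in rows]

def mine(rows):
    """Data mine: count how often each (age band, category) pair occurs."""
    return Counter(rows)

def interpret(counts):
    """Interpret: report the most frequent pair and its count."""
    return counts.most_common(1)[0]

pattern, support = interpret(mine(transform(preprocess(select(raw_records)))))
print(pattern, support)
```

Keeping each step as its own function mirrors the slide: any stage (for example the transform) can be skipped or swapped without touching the others.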
Dependent and Independent Variables • Dependent Variable - Attribute to be predicted. • Independent Variable - Attributes used for making the prediction.
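To make the distinction concrete, here is a small sketch that fits a straight line by ordinary least squares, one possible way to predict a dependent variable from a single independent variable. The variable names and data (ad_spend, sales) are invented for illustration and do not come from the slides.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: ad_spend is the independent variable,
# sales is the dependent variable (the attribute to be predicted).
ad_spend = [1.0, 2.0, 3.0, 4.0]
sales = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(ad_spend, sales)
predicted_sales = intercept + slope * 5.0  # prediction for an unseen ad_spend
print(slope, intercept, predicted_sales)
```

The fitted line is also an instance of the "Linear Equations" model type listed earlier in the deck.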
Fields Contributing to Data Mining • Database Technology • Statistics • Machine Learning • High Performance Computing • Pattern Recognition • Neural Networks • Data Visualization • Information Retrieval
Applications of Data Mining • Decision Making • Process Control • Information Management • Query Processing
Methods of Data Reduction • Drill-down analysis • Clustering • Aggregation • Simple Tabulation
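Aggregation and simple tabulation, two of the methods above, can be sketched in a few lines. The transaction records below are hypothetical; the point is only that many detailed rows collapse into a handful of summary figures.

```python
from collections import defaultdict

# Hypothetical transactions: (region, category, amount)
transactions = [
    ("north", "books", 120.0),
    ("north", "games", 80.0),
    ("south", "books", 200.0),
    ("south", "books", 50.0),
]

revenue_by_region = defaultdict(float)  # aggregation: sum of amounts per region
count_by_category = defaultdict(int)    # simple tabulation: row count per category

for region, category, amount in transactions:
    revenue_by_region[region] += amount
    count_by_category[category] += 1

print(dict(revenue_by_region))
print(dict(count_by_category))
```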
Exploratory Data Analysis (EDA) • Distributions of Variables • Correlation Matrices • Multi-way Frequency Tables • Cluster Analysis • Classification Trees • Other multivariate techniques
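As an illustration of the correlation-matrix item, the sketch below computes pairwise Pearson correlation coefficients for three columns. The column names and values are made up, and Pearson's coefficient is assumed here as the correlation measure; the slides do not name one.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical columns of a small data set
columns = {
    "age": [25, 35, 45, 55],
    "income": [30, 50, 70, 90],
    "web_visits": [40, 30, 20, 10],
}

# Correlation matrix as a nested dict: matrix[a][b] is the correlation of a with b
matrix = {a: {b: pearson(columns[a], columns[b]) for b in columns}
          for a in columns}

for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values near +1 or -1 flag strongly related variable pairs worth a closer look; values near 0 suggest no linear relationship.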
Statistical Methods Used in Data Mining • Regression Analysis • Standard Distribution • Cluster Analysis
Industries Using Data Mining • Banking • Insurance • Medicine • Retail • Security • Sciences
Financial Uses of Data Mining • Fraud Detection • Money Laundering Detection • Risk Management
Medical Uses of Data Mining • Chemical Compounds • Genetic Material • Predictive Treatment Models
Retail Uses of Data Mining • Direct Marketing • Store Design • Store Operations
Security Uses of Data Mining • Assess crime patterns • Homeland Security • Identification of suspicious activities • Pre-screening
Scientific Uses of Data Mining • Image analysis • Classification of large data sets
Other Novel Uses for Data Mining • NBA’s Advanced Scout Program • Firefly
Predictive Analytics • An advanced form of data mining that builds models to predict the behavior of variables in large data sets. • Highly specialized for each application
Uses of Predictive Analytics • Cost-Benefit Analysis • Predicting Customer Behavior • Reducing Costs
Financial Uses of Predictive Analytics • Credit Ratings • Economic Prediction Models • Federal Reserve
Text Mining • Extracts data from unstructured data sets • Allows for data mining of large data sets that are not databases
Sentiment Analysis • Uses semantic techniques and keywords to detect favorable and unfavorable opinions toward specific subjects.
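A keyword-based approach can be sketched as below. This is a deliberately naive illustration: the word lists are invented, and it ignores negation, sarcasm, and context, which practical sentiment analysis has to handle.

```python
import re

# Hypothetical keyword lists; a real system would use a much larger lexicon
POSITIVE = {"good", "great", "love", "excellent"}
NEGATIVE = {"bad", "poor", "hate", "terrible"}

def sentiment(text):
    """Label text by counting positive and negative keywords."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "favorable"
    if score < 0:
        return "unfavorable"
    return "neutral"

print(sentiment("I love this phone, the camera is great"))
print(sentiment("Terrible battery and poor support"))
print(sentiment("It arrived on Tuesday"))
```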
Privacy Concerns with Data Mining • Big Brother • Puts too much power into the hands of Governmental Security Forces
False Positives in Data Mining for Security Reasons • Costs the people and the Government • Subject of controversy and civilian mistrust
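A short calculation shows why false positives dominate when the thing being searched for is rare. All figures below (population size, number of real threats, detection rate, false positive rate) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical screening scenario; none of these figures come from the slides
population = 1_000_000
actual_threats = 100
sensitivity = 0.99           # assumed share of real threats that get flagged
false_positive_rate = 0.01   # assumed share of innocent people wrongly flagged

true_positives = actual_threats * sensitivity
false_positives = (population - actual_threats) * false_positive_rate

# Of everyone flagged, what fraction is a real threat?
precision = true_positives / (true_positives + false_positives)

print(round(true_positives), round(false_positives), round(precision, 4))
```

Under these assumptions roughly 99 of every 100 flagged people are innocent, even though the screen is "99% accurate" in both directions.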
Data Mining as Another Tool for Security • Government doesn’t wish to interfere in civilian life • Actual intrusions of privacy incur legal costs • Useful for correlating with other sources of data
Visual and Speech Processing • Examining large amounts of real-time input for specific data and relationships between data • Requires a certain amount of predictive modeling
Data Mining is an Essential Use of Computers • It makes the previously impossible possible • Powerful tool for progress and understanding • Lasting Impact