DATA MINING Lecture #2
Data Mining, Machine Learning, & Statistics
Data Mining combines methods & tools from at least 3 areas:
• Machine Learning
• Statistics
• Databases
Data Mining, Machine Learning, & Statistics
What is Data Mining?
• Data Mining is just Machine Learning!
• Data Mining is just Statistics!
• What does Data Mining have to do with Databases?
Machine Learning
• AI includes many DM techniques
• AI is more general & involves areas outside DM
• AI is not concerned with scalability
• ML is a subarea of AI
• Automation of a learning process
• Writing programs that can learn
• In DM, ML is used for “prediction” & “classification”
Supervised Learning
• Learning from examples
• Training Data Set
• Training set acts as examples for classes
• Formulation of a classification rule
• Prediction of the class of previously unseen data
• Similar to Discriminant Analysis in Statistics
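The supervised workflow on this slide (training set, classification rule, prediction for unseen data) can be sketched with a nearest-centroid classifier. This is only one simple choice of rule, not the method the lecture prescribes; the function names `fit_centroids` and `predict` and the toy height/weight data are assumptions made up for illustration.

```python
from collections import defaultdict
from math import dist


def fit_centroids(training_set):
    """Formulate a classification rule: one mean point (centroid) per class."""
    groups = defaultdict(list)
    for features, label in training_set:
        groups[label].append(features)
    return {
        label: tuple(sum(col) / len(points) for col in zip(*points))
        for label, points in groups.items()
    }


def predict(centroids, features):
    """Assign an unseen point to the class whose centroid is nearest."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical labeled examples: (height_cm, weight_kg) -> class label
training_set = [
    ((150.0, 50.0), "small"),
    ((155.0, 55.0), "small"),
    ((180.0, 85.0), "large"),
    ((185.0, 90.0), "large"),
]

centroids = fit_centroids(training_set)          # learn from the examples
prediction = predict(centroids, (178.0, 80.0))   # classify previously unseen data
print(prediction)
```

The labels in the training set are what make this supervised: the rule is derived from classes that are known in advance.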
Unsupervised Learning
• Learning from observation & discovery
• NO Training Data Set
• No prior knowledge of classes
• Outcome is a set of class descriptors, one for each class discovered
• Similar to Cluster Analysis in Statistics
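As a rough illustration of learning without a training set, the sketch below runs a small k-means clustering, one common form of cluster analysis. The slide does not name a specific algorithm, so k-means is an assumption here, as are the function name `k_means`, the naive "first k points" initialization, and the toy data. The returned centroids play the role of the class descriptors, one per discovered class.

```python
from math import dist


def k_means(points, k, iterations=10):
    """Group unlabeled points into k clusters; return (centroids, assignments)."""
    # Naive deterministic initialization (an assumption): use the first k points.
    centroids = list(points[:k])
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return centroids, assignments


# Hypothetical unlabeled observations: no class information is supplied.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]

centroids, assignments = k_means(points, k=2)
print(centroids)    # one descriptor per discovered class
print(assignments)  # discovered class index for each point
```

Unlike the supervised case, the number of classes `k` is the only thing supplied; which points belong together is discovered from the data.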
Machine Learning: Applications
• Speech Recognition
• Training moving robots
• Classification of astronomical objects
• Game Playing
Machine Learning & Data Mining
• ML is the basis for many DM tasks
• Major differences between the approaches taken by the AI & Database disciplines
• ML mainly focuses on the learning process
• ML tries to mimic human behavior
• ML aims at improving the performance of an intelligent system/agent on problem-solving tasks
• DM aims at uncovering information
• Databases require more efficient learning algorithms, as DBs are normally large & noisy