This overview delves into the essential classification techniques used in data mining, as outlined in CSE 591 by H. Liu. It covers the fundamentals of data formats, classification problems, and various approaches such as decision trees, k-nearest neighbors, and naive Bayes classifiers. The document emphasizes important concepts like information gain, building decision trees, and multilayer perceptrons. Additionally, it discusses the principles behind lazy learning and the challenges associated with noise sensitivity in classification algorithms.
Classification: a task of induction to find patterns
CSE 591: Data Mining, by H. Liu
Outline
• Data and its format
• Problem of classification
• Learning a classifier
• Different approaches
• Key issues
Data and its format
• Data
  • attribute-value pairs
  • with/without class labels
• Data type
  • continuous/discrete
  • nominal
• Data format
  • flat
Sample data
Induction from databases
• Induction: inferring knowledge from data
• Contrast with deduction: inferring information that is a logical consequence of the data, as in querying a database
  • Who taught this class before?
  • Which courses does Mary attend?
• Deductive databases extend the RDBMS
Classification
• One type of induction
  • data with class labels
• Example rule:
  • If weather is rainy, then no golf
Different approaches
• Many techniques exist for classification:
  • Decision trees
  • Neural networks
  • K-nearest neighbors
  • Naïve Bayes classifiers
  • and many more ...
A decision tree

  Outlook
  ├── sunny → Humidity
  │     ├── high → NO
  │     └── normal → YES
  ├── overcast → YES
  └── rain → Wind
        ├── strong → NO
        └── weak → YES
Inducing a decision tree
• There are many possible trees
  • let’s try it on the golfing data
• How do we find the most compact tree that is consistent with the data?
• Why the most compact?
  • Occam’s razor principle
• Issue of efficiency w.r.t. optimality
Entropy and information gain
• Entropy of a node: H(S) = -Σ_i p_i log2 p_i, where p_i is the fraction of examples in class i
• Information gain: the difference in entropy between the node before and after splitting on attribute A:
  IG(S, A) = H(S) - Σ_v (|S_v| / |S|) H(S_v)
Building a compact tree
• The key to building a decision tree is which attribute to choose in order to branch.
• The heuristic: choose the attribute with the maximum information gain.
• Equivalently, reduce uncertainty as much as possible.
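As a concrete illustration of the heuristic above, here is a minimal sketch of entropy and information gain on a toy golf-style split. The data and attribute are invented for the example, not taken from the slides:

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels: -sum p_i log2 p_i."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr_index):
    """IG = entropy(parent) - weighted entropy of children after splitting."""
    n = len(labels)
    children = {}
    for row, label in zip(rows, labels):
        children.setdefault(row[attr_index], []).append(label)
    remainder = sum(len(ch) / n * entropy(ch) for ch in children.values())
    return entropy(labels) - remainder

# Toy data: one attribute (Outlook), class label = play golf or not
rows = [("sunny",), ("sunny",), ("overcast",), ("rain",), ("rain",)]
labels = ["no", "no", "yes", "yes", "no"]
print(round(information_gain(rows, labels, 0), 3))  # → 0.571
```

The tree-building loop would call `information_gain` once per candidate attribute and branch on the argmax, then recurse on each child.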
The learned decision tree

  Outlook
  ├── sunny → Humidity
  │     ├── high → NO
  │     └── normal → YES
  ├── overcast → YES
  └── rain → Wind
        ├── strong → NO
        └── weak → YES
K-nearest neighbor
• One of the most intuitive classification algorithms
• An unseen instance’s class is determined by its nearest neighbor
• The problem: this is sensitive to noise
• Instead of using one neighbor, we can use k neighbors
K-NN
• New problems
  • lazy learning: no model is built until a query arrives
  • large storage: all training instances must be kept
• An example
• How good is k-NN?
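A minimal k-NN sketch along the lines described above; the toy points and labels are invented for illustration:

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points.
    train: list of (point, label) pairs; distance: squared Euclidean."""
    dist = lambda p, q: sum((a - b) ** 2 for a, b in zip(p, q))
    nearest = sorted(train, key=lambda pl: dist(pl[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

train = [((0, 0), "A"), ((0, 1), "A"), ((1, 0), "A"),
         ((5, 5), "B"), ((6, 5), "B")]
print(knn_predict(train, (1, 1), k=3))  # → A
```

Note how the "lazy" and "large storage" points show up directly: all work happens inside `knn_predict`, and the entire training set is scanned on every query.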
Naïve Bayes classifier
• A direct application of Bayes’ rule:
  P(C|X) = P(X|C) P(C) / P(X), where X is a vector (x1, x2, …, xn)
• With the true probabilities, that’s the best classifier you can build
• But there are problems: P(X|C) is hard to estimate directly from data
NBC (2)
• Assume conditional independence between the xi’s given the class
• We then have P(X|C) = P(x1|C) P(x2|C) … P(xn|C)
• An example
• How good is it in reality?
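Under the conditional-independence assumption, a count-based NBC can be sketched as follows. The toy data is invented for illustration, and no smoothing is applied, so unseen attribute values get probability zero:

```python
from collections import Counter, defaultdict

def train_nbc(rows, labels):
    """Estimate P(C) and P(x_i|C) from raw counts."""
    prior = Counter(labels)                 # class -> count
    cond = defaultdict(Counter)             # (attr_index, class) -> value counts
    for row, c in zip(rows, labels):
        for i, v in enumerate(row):
            cond[(i, c)][v] += 1
    return prior, cond

def predict(prior, cond, row):
    """argmax_c P(c) * prod_i P(x_i|c); P(X) is the same for all classes."""
    n = sum(prior.values())
    best, best_score = None, -1.0
    for c, pc in prior.items():
        score = pc / n
        for i, v in enumerate(row):
            score *= cond[(i, c)][v] / pc
        if score > best_score:
            best, best_score = c, score
    return best

rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("overcast", "hot")]
labels = ["no", "no", "yes", "yes"]
prior, cond = train_nbc(rows, labels)
print(predict(prior, cond, ("sunny", "hot")))  # → no
```

This makes the "how good is it in reality?" question concrete: the independence assumption rarely holds exactly, yet the argmax is often still correct.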
Classification via neural networks
• A perceptron: a weighted sum of the inputs passed through a squashing function
What can a perceptron do?
• Neuron as a computing device
• It can separate linearly separable points
• Nice things about a perceptron:
  • distributed representation
  • local learning
  • weight adjusting
Linear threshold unit
• Basic concepts: projection and thresholding
• Input vectors whose projection onto the weight vector W exceeds the threshold evoke output 1
• (Figure example: W = [.11 .6], input L = [.7 .7], threshold .5)
Eg 1: solution region for the AND problem
• Find a weight vector that satisfies all the constraints

AND problem:
  x1 x2 | out
   0  0 |  0
   0  1 |  0
   1  0 |  0
   1  1 |  1
Eg 2: solution region for the XOR problem?

XOR problem:
  x1 x2 | out
   0  0 |  0
   0  1 |  1
   1  0 |  1
   1  1 |  0
Learning by error reduction
• Perceptron learning algorithm:
  • If the activation level of the output unit is 1 when it should be 0, reduce the weight on the link to the ith input unit by r*Li, where Li is the ith input value and r a learning rate
  • If the activation level of the output unit is 0 when it should be 1, increase the weight on the link to the ith input unit by r*Li
  • Otherwise, do nothing
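The three-case rule above can be sketched directly in code, here applied to the AND problem from the earlier slide. The threshold is handled as a weight on a constant input of 1 (a common convention, not spelled out on the slide):

```python
def step(x):
    """Hard threshold: activation 1 if the net input is positive."""
    return 1 if x > 0 else 0

def train_perceptron(data, r=0.1, epochs=20):
    """Perceptron rule from the slide: shift weights by ±r*Li on errors."""
    w = [0.0, 0.0, 0.0]                       # two inputs + bias weight
    for _ in range(epochs):
        for inputs, target in data:
            x = list(inputs) + [1]            # constant bias input
            out = step(sum(wi * xi for wi, xi in zip(w, x)))
            if out == 1 and target == 0:      # too high: reduce weights
                w = [wi - r * xi for wi, xi in zip(w, x)]
            elif out == 0 and target == 1:    # too low: increase weights
                w = [wi + r * xi for wi, xi in zip(w, x)]
    return w

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(AND)
print([step(w[0] * a + w[1] * b + w[2]) for (a, b), _ in AND])  # → [0, 0, 0, 1]
```

Running the same loop on the XOR table never converges, no matter how many epochs are allowed, which is exactly the point of the previous slide: XOR has no linear solution region.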
Multi-layer perceptrons
• Using the chain rule, we can back-propagate the errors through a multi-layer perceptron.
• Layers: input layer → hidden layer → output layer
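A minimal sketch of one back-propagation step through a 2-2-1 network with sigmoid units and squared error. The architecture, initial weights, and learning rate are illustrative choices, not from the slides:

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

def forward(x, w_h, w_o):
    """2-2-1 net: two sigmoid hidden units, one sigmoid output.
    Each weight row is [w1, w2, bias]."""
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    y = sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, y

def backprop_step(x, target, w_h, w_o, r=0.1):
    """One gradient step on squared error, errors propagated by the chain rule.
    Returns the squared error before the update."""
    h, y = forward(x, w_h, w_o)
    delta_o = (y - target) * y * (1 - y)            # output-layer error signal
    for j in range(2):                              # hidden-layer error signals
        delta_h = delta_o * w_o[j] * h[j] * (1 - h[j])
        for i in range(2):
            w_h[j][i] -= r * delta_h * x[i]
        w_h[j][2] -= r * delta_h                    # hidden bias
    for j in range(2):                              # output-layer update
        w_o[j] -= r * delta_o * h[j]
    w_o[2] -= r * delta_o                           # output bias
    return (y - target) ** 2

w_h = [[0.5, -0.4, 0.1], [-0.3, 0.6, -0.2]]
w_o = [0.7, -0.5, 0.2]
before = backprop_step((1, 0), 1, w_h, w_o)
_, y_after = forward((1, 0), w_h, w_o)
print((y_after - 1) ** 2 < before)  # one step reduces the error on this example
```

Repeating such steps over all training examples is what lets a multi-layer network learn non-linearly-separable targets such as XOR, which the single perceptron could not.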