

  1. Examples of classification methods CSIT5210

  2. Content • KNN • Decision Tree • Naïve Bayesian • Bayesian Belief Network • Naïve Neural Network • Multilayer Neural Network • SVM

  3. KNN • Question: Assignment 1 Q1 • Solution: 1) Understand the distance function: the number of differing attributes. The distance between tuple 2 and tuple 3: one attribute is the same and three attributes are different, so Dist(2,3) = |{Height (low != med), Weight (med != high), BloodPressure (med != high)}| = 3
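A minimal sketch of this distance function in Python (the tuple values here are hypothetical, since the assignment's data table is not reproduced in the transcript):

```python
def dist(t1, t2):
    """Number of attributes on which two tuples differ (Hamming distance)."""
    return sum(a != b for a, b in zip(t1, t2))

# Hypothetical tuples: one attribute equal and three different, so dist = 3,
# mirroring Dist(2,3) above.
tuple2 = ("short", "low", "med", "med")
tuple3 = ("short", "med", "high", "high")
print(dist(tuple2, tuple3))  # 3
```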

  4. KNN • 2) Calculate the distance table between the training data and the testing data.

  5. KNN • 3) For k=1, find the nearest neighbor (choosing the one with the smaller id to break ties in distance), and compare the actual and predicted results. There are 2 errors in the 10 test tuples (ids 11-20), so the error rate is 2/10 = 0.2.

  6. KNN • 4) For k=3, repeat the above procedure, using the majority-win rule to get the prediction (see the sketch below). There are 4 errors, so the error rate is 4/10 = 0.4.
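A sketch of the prediction step, assuming training tuples are stored as (id, attributes, label); ties in distance and in the vote are broken by the smaller id, as in the solution:

```python
from collections import Counter

def dist(t1, t2):
    return sum(a != b for a, b in zip(t1, t2))

def knn_predict(train, test_attrs, k):
    """train: list of (id, attrs, label) tuples."""
    # Sorting by (distance, id) makes equal distances fall back to the smaller id.
    neighbors = sorted(train, key=lambda r: (dist(r[1], test_attrs), r[0]))[:k]
    votes = Counter(label for _, _, label in neighbors)
    best = max(votes.values())
    # Among tied majority classes, take the one backed by the earliest
    # (nearest, then smaller-id) neighbor.
    for _, _, label in neighbors:
        if votes[label] == best:
            return label
```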

  7. Decision Tree • Question: Assignment 1 Q2 • Solution: • There are 6 yes and 6 no in the training data, so: Info(D) = I(6,6) = 1 bit • For each of the 4 attributes, calculate the information gained by branching on it. • E.g.: branching on the attribute "age" splits the data into: • D(age=old) = {4 yes, 0 no} • D(age=young) = {2 yes, 6 no} • Info_age(D) = 8/12 * I(2,6) + 4/12 * I(4,0) ≈ 0.541, so Gain(age) = 1 - 0.541 = 0.459
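The entropy and gain computations can be checked with a few lines of Python; I(p, n) is the entropy of a node with p yes and n no tuples:

```python
from math import log2

def info(p, n):
    """Entropy I(p, n) in bits; an empty class contributes 0."""
    total = p + n
    return -sum((c / total) * log2(c / total) for c in (p, n) if c)

info_D = info(6, 6)                               # I(6,6) = 1.0
info_age = 8/12 * info(2, 6) + 4/12 * info(4, 0)  # ≈ 0.541
print(info_D - info_age)                          # Gain(age) ≈ 0.459
```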

  8. Decision Tree • Age has the largest gain, so we choose Age as the root. • For the age=old branch, all decisions are yes, so it needs no further splitting. • For the age=young branch, repeat the gain calculation.

  9. Decision Tree • Then we choose Married for splitting. The remaining data is: • D(married=yes) = {4 approved=no} • D(married=no) = {2 approved=yes, 2 approved=no} • We choose approved=yes for the branch married=no. Here is the final tree:

  10. Decision Tree • The final tree:
  Age = old → yes
  Age = young → Married?
    Married = yes → no
    Married = no → yes
Apply the tree to the testing data: error rate = 4/6 = 0.667
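The final tree is small enough to transcribe directly as a function:

```python
def predict(age, married):
    """The final decision tree: Age at the root, Married under age=young."""
    if age == "old":
        return "yes"                    # all age=old tuples were approved
    return "no" if married == "yes" else "yes"

print(predict("young", "no"))  # yes
```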

  11. Naive Bayesian • Question: Assignment 1 Q3 • Answer: • In the training data there are 6 approved=yes and 6 approved=no, so: • P(C1) = P(approved=yes) = 6/12 = 0.5 • P(C2) = P(approved=no) = 6/12 = 0.5 • For every attribute and class, compute P(X|Ci): • P(Sex = “male” | C1) = 4/6 = 0.667 • P(Sex = “female” | C1) = 2/6 = 0.333 • P(Sex = “male” | C2) = 4/6 = 0.667 • P(Sex = “female” | C2) = 2/6 = 0.333

  12. Naive Bayesian • P(Age = “old” | C1) = 4/6 = 0.667 • P(Age = “young” | C1) = 2/6 = 0.333 • P(Age = “old” | C2) = 0/6 = 0 • P(Age = “young” | C2) = 6/6 = 1 • P(Housing = “yes” | C1) = 1/6 = 0.167 • P(Housing = “no” | C1) = 5/6 = 0.833 • P(Housing = “yes” | C2) = 4/6 = 0.667 • P(Housing = “no” | C2) = 2/6 = 0.333

  13. Naive Bayesian • P(Employed = “yes” | C1) = 4/6 = 0.667 • P(Employed = “no” | C1) = 2/6 = 0.333 • P(Employed = “yes” | C2) = 1/6 = 0.167 • P(Employed = “no” | C2) = 5/6 = 0.833 • For the first testing tuple: X1 = (Sex = “female”, Age = “young”, Housing = “yes”, Employed = “yes”) • P(X1|C1) = 0.333 × 0.333 × 0.167 × 0.667 = 0.012 • P(X1|C2) = 0.333 × 1 × 0.667 × 0.167 = 0.037 • P(X1|C1) × P(C1) = 0.012 × 0.5 = 0.006 • P(X1|C2) × P(C2) = 0.037 × 0.5 = 0.019 > 0.006 = P(X1|C1) × P(C1) • So X1 belongs to C2 (approved=no).
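A sketch of the same computation, plugging in the class-conditional probabilities listed above:

```python
p_c = {"C1 (approved=yes)": 0.5, "C2 (approved=no)": 0.5}
# P(attribute value | class) for X1's values, copied from the slides above;
# order is sex=female, age=young, housing=yes, employed=yes.
p_x_given_c = {
    "C1 (approved=yes)": [0.333, 0.333, 0.167, 0.667],
    "C2 (approved=no)":  [0.333, 1.0, 0.667, 0.167],
}

for c, probs in p_x_given_c.items():
    score = p_c[c]
    for p in probs:
        score *= p                      # naive independence assumption
    print(c, round(score, 3))           # C1: 0.006, C2: 0.019 -> approved=no
```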

  14. Naive Bayesian • For the remaining testing data, repeat the same procedure. • Two of the three testing tuples are misclassified, so the error rate is 2/3 = 0.667.

  15. Bayesian Network • Question: • Smoking is prohibited on high-speed trains. If someone smokes, the alarm may sound, and other passengers may also report it to the police. If the police hear the alarm or get a report, they will, very possibly, come and arrest the smoker. • This can be modeled with the following Bayes network:

  16. Bayesian Network • The alarm is not accurate enough: it misses some smoking and sometimes sounds for nothing. Not every passenger wants to report smokers, and some passengers make mistakes. (The alarm does not affect the passengers.) • Nodes: Smoking → Alarm, Smoking → Report; Alarm and Report → Police comes. • The police come if they believe there is someone smoking. They do not trust the alarm very much, and they may, rarely, patrol the train.

  17. Bayesian Network • Suppose the probability of someone smoking is 0.5; what is the probability that the police come? • Answer: • P(S=T) = 0.5 and P(S=F) = 0.5 • The alarm sounds: • P(A) = P(A|S)*P(S) + P(A|¬S)*P(¬S) = 0.4 • Passengers report: • P(R) = P(R|S)*P(S) + P(R|¬S)*P(¬S) = 0.25 • The police come: P(P) = P(P|A,R)*P(A)*P(R) + P(P|A,¬R)*P(A)*P(¬R) + P(P|¬A,R)*P(¬A)*P(R) + P(P|¬A,¬R)*P(¬A)*P(¬R) = 0.08 + 0.12 + 0.09 + 0.0045 = 0.2945 • (This factorization treats Alarm and Report as independent; strictly they are dependent through their common parent Smoking, so an exact answer would enumerate over S, as in the sketch below.)
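A sketch of that exact enumeration. The conditional probability tables live in the slide figure, not in the transcript, so the P(A|S) and P(R|S) values below are assumptions chosen only to reproduce the marginals P(A) = 0.4 and P(R) = 0.25; P(P|A,R) is recovered from the four terms of the sum above.

```python
from itertools import product

p_s = {True: 0.5, False: 0.5}
p_a_given_s = {True: 0.7, False: 0.1}    # assumed; yields P(A) = 0.4
p_r_given_s = {True: 0.45, False: 0.05}  # assumed; yields P(R) = 0.25
p_p_given_ar = {(True, True): 0.8, (True, False): 0.4,
                (False, True): 0.6, (False, False): 0.01}

# Exact marginal P(Police) by summing the joint over S, A, R.
p_police = 0.0
for s, a, r in product([True, False], repeat=3):
    joint = (p_s[s]
             * (p_a_given_s[s] if a else 1 - p_a_given_s[s])
             * (p_r_given_s[s] if r else 1 - p_r_given_s[s]))
    p_police += joint * p_p_given_ar[(a, r)]
print(p_police)  # ≈ 0.283 under these CPTs; 0.2945 if A and R are treated as independent
```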

  18. Naïve Neural Network • Question: • Given a perceptron, the training samples are given in the table below. • In addition, the initial weights are also given: W0=0.5, W1=0.4, W2=0.5. The learning rate α is 0.2. Please use the sample data as training data and update W0, W1, and W2.

  19. Naïve Neural Network • Answer: Step 1: • a = w0*x0 + w1*x1 + w2*x2 = -0.5 + 0.4*0 + 0.5*0 = -0.5 < 0 (the bias input x0 is fixed at -1) • y = 0 = T1, so no need to change the weights. Step 2: • a = -0.5 + 0.4*0 + 0.5*1 = 0 ≥ 0 • y = 1 = T2, so no need to change the weights.

  20. Naïve Neural Network • Step 3: • a = -0.5 + 0.4*1 + 0.5*0 = -0.1 < 0 • y = 0 ≠ T3 = 1, so update the weights: • ∆w0 = α (t-y) x0 = 0.2 * 1 * (-1) = -0.2 • ∆w1 = α (t-y) x1 = 0.2 * 1 * 1 = 0.2 • ∆w2 = α (t-y) x2 = 0.2 * 1 * 0 = 0 • Thus: • w0 = w0 + ∆w0 = 0.5 - 0.2 = 0.3 • w1 = w1 + ∆w1 = 0.4 + 0.2 = 0.6 • w2 = w2 + ∆w2 = 0.5

  21. Naïve Neural Network • Step 4: • a = -0.3 + 0.6*1 + 0.5*1 = 0.8 > 0 • y = 1 = T4, so no need to change the weights. • So, the final weights are: • w0 = 0.3 • w1 = 0.6 • w2 = 0.5
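Steps 1-4 amount to one pass of the perceptron learning rule; a compact sketch with the inputs and targets read off the steps above (bias input x0 fixed at -1, threshold activation y = 1 when a ≥ 0):

```python
alpha = 0.2
w = [0.5, 0.4, 0.5]                                  # initial w0, w1, w2
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]  # (x1, x2) -> T

for (x1, x2), t in samples:
    x = (-1, x1, x2)                                 # prepend bias input x0 = -1
    a = sum(wi * xi for wi, xi in zip(w, x))
    y = 1 if a >= 0 else 0                           # threshold activation
    if y != t:                                       # update only on a mistake
        w = [wi + alpha * (t - y) * xi for wi, xi in zip(w, x)]

print([round(wi, 1) for wi in w])                    # [0.3, 0.6, 0.5]
```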

  22. Multilayer Neural Network • Given the following neural network with weights initialized as in the picture (next page), we are trying to distinguish between nails and screws. The training tuples are: • T1 = {0.6, 0.1, nail} • T2 = {0.2, 0.3, screw} • Let the learning rate (l) be 0.1. Do the forward propagation of the signals in the network using T1 as input, then perform the back propagation of the error, showing the changes of the weights. Then, with the weights updated by T1, use T2 as input and show whether the prediction is correct or not.

  23. Multilayer Neural Network

  24. Multilayer Neural Network • Answer: • First, use T1 as input, do the forward pass, and then perform the back propagation. • At Unit 3: • a3 = x1*w13 + x2*w23 + θ3 = 0.14 • o3 = 1/(1 + e^(-a3)) = 0.535 • Similarly, at Units 4, 5, 6: • a4 = 0.22, o4 = 0.555 • a5 = 0.64, o5 = 0.655 • a6 = 0.1345, o6 = 0.534

  25. Multilayer Neural Network • Now go back and perform the back propagation, starting at Unit 6: • Err6 = o6 (1 - o6) (t - o6) = 0.534 * (1 - 0.534) * (1 - 0.534) = 0.116 • ∆w36 = (l) Err6 o3 = 0.1 * 0.116 * 0.535 = 0.0062 • w36 = w36 + ∆w36 = -0.394 • ∆w46 = (l) Err6 o4 = 0.1 * 0.116 * 0.555 = 0.0064 • w46 = w46 + ∆w46 = 0.1064 • ∆w56 = (l) Err6 o5 = 0.1 * 0.116 * 0.655 = 0.0076 • w56 = w56 + ∆w56 = 0.6076 • θ6 = θ6 + (l) Err6 = -0.1 + 0.1 * 0.116 = -0.0884

  26. Multilayer Neural Network • Continue the back propagation: • Error at Unit 3: Err3 = o3 (1 - o3) (w36 Err6) = 0.535 * (1 - 0.535) * (-0.394 * 0.116) = -0.0114
w13 = w13 + ∆w13 = w13 + (l) Err3 x1 = 0.1 + 0.1 * (-0.0114) * 0.6 = 0.09932
w23 = w23 + ∆w23 = w23 + (l) Err3 x2 = -0.2 + 0.1 * (-0.0114) * 0.1 = -0.2001154
θ3 = θ3 + (l) Err3 = 0.1 + 0.1 * (-0.0114) = 0.09886 • Error at Unit 4: Err4 = o4 (1 - o4) (w46 Err6) = 0.555 * (1 - 0.555) * (0.1064 * 0.116) = 0.003
w14 = w14 + ∆w14 = w14 + (l) Err4 x1 = 0 + 0.1 * 0.003 * 0.6 = 0.00018
w24 = w24 + ∆w24 = w24 + (l) Err4 x2 = 0.2 + 0.1 * 0.003 * 0.1 = 0.20003
θ4 = θ4 + (l) Err4 = 0.2 + 0.1 * 0.003 = 0.2003 • Error at Unit 5: Err5 = o5 (1 - o5) (w56 Err6) = 0.655 * (1 - 0.655) * (0.6076 * 0.116) = 0.016
w15 = w15 + ∆w15 = w15 + (l) Err5 x1 = 0.3 + 0.1 * 0.016 * 0.6 = 0.30096
w25 = w25 + ∆w25 = w25 + (l) Err5 x2 = -0.4 + 0.1 * 0.016 * 0.1 = -0.39984
θ5 = θ5 + (l) Err5 = 0.5 + 0.1 * 0.016 = 0.5016

  27. Multilayer Neural Network • After T1, the weights and biases take the updated values computed above. • Now, with the updated values, use T2 as input: • At Unit 3: • a3 = x1*w13 + x2*w23 + θ3 = 0.0586898 • o3 = 1/(1 + e^(-a3)) = 0.515

  28. Multilayer Neural Network • Similarly: • a4 = 0.260345, o4 = 0.565 • a5 = 0.441852, o5 = 0.6087 • At Unit 6: • a6 = o3*w36 + o4*w46 + o5*w56 + θ6 = 0.13865 • o6 = 1/(1 + e^(-a6)) = 0.5348 • Since o6 is closer to 1, the prediction is nail, which differs from the given label “screw”. • So this prediction is NOT correct.
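A sketch of the whole computation. It follows the slides, except that the hidden-unit errors are computed with the pre-update output-layer weights, as in standard back propagation, so the last decimals may differ slightly from the slides' rounded values:

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

# Initial weights and biases from the slide figure; keys name the edge endpoints.
w = {"13": 0.1, "23": -0.2, "14": 0.0, "24": 0.2, "15": 0.3, "25": -0.4,
     "36": -0.4, "46": 0.1, "56": 0.6}
theta = {"3": 0.1, "4": 0.2, "5": 0.5, "6": -0.1}
l = 0.1  # learning rate

def forward(x1, x2):
    o = {}
    for j in ("3", "4", "5"):            # hidden layer
        o[j] = sigmoid(x1 * w["1" + j] + x2 * w["2" + j] + theta[j])
    o["6"] = sigmoid(sum(o[j] * w[j + "6"] for j in ("3", "4", "5")) + theta["6"])
    return o

# Train on T1 = (0.6, 0.1, nail), encoding nail as t = 1.
x1, x2, t = 0.6, 0.1, 1
o = forward(x1, x2)
err6 = o["6"] * (1 - o["6"]) * (t - o["6"])
err = {j: o[j] * (1 - o[j]) * w[j + "6"] * err6 for j in ("3", "4", "5")}
for j in ("3", "4", "5"):
    w[j + "6"] += l * err6 * o[j]        # output-layer weight updates
    w["1" + j] += l * err[j] * x1        # hidden-layer weight updates
    w["2" + j] += l * err[j] * x2
    theta[j] += l * err[j]
theta["6"] += l * err6

# Predict T2 = (0.2, 0.3, screw) with the updated weights.
print(round(forward(0.2, 0.3)["6"], 3))  # ≈ 0.535, closer to 1 (nail): misclassified
```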

  29. SVM • Consider the following data points. Please use SVM to train a classifier, and then classify these data points. A point with ai = 1 is a support vector. For example, point 1 (1,2) is a support vector, but point 5 (5,9) is not. • Training data: • Testing data:

  30. SVM • Question: • (a) Find the decision boundary; show the calculation process in detail. • (b) Use the decision boundary you found to classify the testing data. Show all calculations in detail, including the intermediate results and the formulas you used.

  31. SVM • Answer: • a) As the picture shows, P1, P2, P3 are support vectors.

  32. SVM • Suppose w is (w1, w2). Since both P1(1,2) and P3(0,1) have y = 1, while P2(2,1) has y = -1: • w1*1 + w2*2 + b = 1 • w1*0 + w2*1 + b = 1 • w1*2 + w2*1 + b = -1 • Solving gives w1 = -1, w2 = 1, b = 0, so the decision boundary is: • w1*x1 + w2*x2 + b = 0, i.e. -x1 + x2 = 0 • Shown in the picture on the next page.
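The three support-vector conditions form a 3×3 linear system in (w1, w2, b); a quick numpy check:

```python
import numpy as np

# Rows: P1 (1,2) -> +1, P3 (0,1) -> +1, P2 (2,1) -> -1; unknowns w1, w2, b.
A = np.array([[1.0, 2.0, 1.0],
              [0.0, 1.0, 1.0],
              [2.0, 1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])
w1, w2, b = np.linalg.solve(A, y)
print(w1, w2, b)  # -1.0 1.0 0.0 -> decision boundary: -x1 + x2 = 0
```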

  33. SVM

  34. SVM • b) Use the decision boundary to classify the testing data: • For the point P9 (2,5): -x1 + x2 = -2 + 5 = 3 >= 1, so we choose y = 1. • For the point P10 (7,2): -x1 + x2 = -7 + 2 = -5 <= -1, so we choose y = -1. • Shown in the picture on the next page.
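The classification step as a small decision function:

```python
def classify(x1, x2):
    """Sign of the decision function f(x) = -x1 + x2 (w = (-1, 1), b = 0)."""
    return 1 if -x1 + x2 >= 0 else -1

print(classify(2, 5))  # f = 3  -> y = 1
print(classify(7, 2))  # f = -5 -> y = -1
```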

  35. SVM

  36. Q&A
