
Questions and Topics Review Nov. 30, 2010


aldan




Presentation Transcript


  1. Questions and Topics Review Nov. 30, 2010
     • Give an example of a problem that might benefit from feature creation.
     • How does DENCLUE form clusters? Why does DENCLUE use grid-cells? What are the main differences between DENCLUE and DBSCAN?
     • Compute the Silhouette of the following clustering that consists of 2 clusters: {(0,0), (0,1), (2,2)} {(3,2), (3,3)}.
     • Compare Decision Trees, Support Vector Machines, and K-NN with respect to the number of decision boundaries each approach uses.
     • K-NN is a lazy approach; what does that mean? What are the disadvantages of K-NN's lazy approach? Do you see any advantages in using K-NN's lazy approach?
     • Why do some support vector machine approaches map examples from a lower dimensional space to a higher dimensional space?
     • What is the role of slack variables in the Linear/SVM/Non-separable approach (textbook pages 266-270), and what do they measure? What properties of hyperplanes are maximized by the objective function f(w) (on page 268) in the approach?
     • Silhouette: for an individual point i
        • Calculate a = average distance of i to the points in its cluster
        • Calculate b = min (average distance of i to points in another cluster)
        • The silhouette coefficient for a point is then given by: s = (b - a) / max(a, b)
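The silhouette definition above can be applied to the two-cluster exercise with a short sketch. It assumes Euclidean distance (the slide does not name a distance function) and that every cluster has at least two distinct points; the function name `silhouette` is chosen for illustration only.

```python
from math import dist  # Euclidean distance, Python 3.8+

def silhouette(clusters):
    """Return {point: s} with s = (b - a) / max(a, b), as defined on the slide."""
    scores = {}
    for ci, cluster in enumerate(clusters):
        for p in cluster:
            # a: average distance from p to the other points of its own cluster
            others = [q for q in cluster if q != p]
            a = sum(dist(p, q) for q in others) / len(others)
            # b: smallest average distance from p to the points of any other cluster
            b = min(
                sum(dist(p, q) for q in other) / len(other)
                for cj, other in enumerate(clusters)
                if cj != ci
            )
            scores[p] = (b - a) / max(a, b)
    return scores

clusters = [[(0, 0), (0, 1), (2, 2)], [(3, 2), (3, 3)]]
scores = silhouette(clusters)
for point, s in scores.items():
    print(point, round(s, 3))
print("mean silhouette:", round(sum(scores.values()) / len(scores), 3))
```

Under the Euclidean assumption, the point (2,2) receives a negative coefficient (about -0.52) because it lies closer on average to the second cluster than to its own, while the remaining four points score between roughly 0.51 and 0.68.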

  2. Support Vector Machines • What if the problem is not linearly separable?

  3. Linear SVM for Non-linearly Separable Problems
     • What if the problem is not linearly separable?
     • Introduce slack variables
     • Need to minimize:
     • Subject to (i=1,..,N):
     • C is chosen using a validation set, trying to keep the margins wide while keeping the training error low.
     • Annotations on the formulas: C is a parameter; the norm term is the inverse size of the margin between the hyperplanes; the slack term measures the training error; each slack variable allows a constraint to be violated to a certain degree.
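The formulas for this slide did not survive in the transcript. As a hedged reference, the standard soft-margin linear SVM is usually written as below; the symbols w, b, ξ_i, x_i, y_i are the conventional ones and are an assumption here, and the textbook version on page 268 may differ in detail (for example, by raising the slack sum to a power k).

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \quad
  f(\mathbf{w}) = \frac{\lVert \mathbf{w} \rVert^{2}}{2} + C \sum_{i=1}^{N} \xi_i

\text{subject to} \quad
  y_i \left( \mathbf{w} \cdot \mathbf{x}_i + b \right) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0,
  \qquad i = 1, \dots, N
```

In this form the first term shrinks as the margin grows (the margin width is 2/‖w‖), the sum of slack variables penalizes training examples that violate the margin, and C trades one against the other.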

  4. Questions and Topics Review Nov. 30, 2010
     • Discussion of Problem 1/2 of Assignment 4
     • Give an example of a problem that might benefit from feature creation.
     • How does DENCLUE form clusters? Why does DENCLUE use grid-cells? What are the main differences between DENCLUE and DBSCAN?
     • Compute the Silhouette of the following clustering that consists of 2 clusters: {(0,0), (0,1), (2,2)} {(3,2), (3,3)}.
     • Compare Decision Trees, Support Vector Machines, and K-NN with respect to the number of decision boundaries each approach uses.
        • DT: many, rectangular for numerical attributes
        • K-NN: many, convex polygons (Voronoi cells)
        • SVM: one, hyperplane
     • K-NN is a lazy approach; what does that mean? What are the disadvantages of K-NN's lazy approach? Do you see any advantages in using K-NN's lazy approach?
        • … advantages: for quickly changing streaming data, learning the model might be a waste of time, and a lazy approach might be better …
     • Why do some support vector machine approaches map examples from a lower dimensional space to a higher dimensional space?
        • To make them linearly separable.
     • What is the role of slack variables in the Linear/SVM/Non-separable approach (textbook pages 266-270), and what do they measure? What properties of hyperplanes are maximized by the objective function f(w) (on page 268) in the approach?
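The answer "to make them linearly separable" can be illustrated with a toy sketch. Everything in it is hypothetical and chosen for illustration: the 1-D data set, the feature map `phi` (x to (x, x²)), and the hand-picked hyperplane `w`, `b`. It is not taken from the slides or the textbook.

```python
def phi(x):
    """Hypothetical feature map from 1-D to 2-D: x -> (x, x^2)."""
    return (x, x * x)

def label_changes(data):
    """Count label switches along the x-axis.

    A single 1-D threshold (a linear classifier in 1-D) can handle at most
    one switch, so two or more switches mean the data is not linearly
    separable in the original space.
    """
    labels = [y for _, y in sorted(data)]
    return sum(1 for u, v in zip(labels, labels[1:]) if u != v)

def predict(w, b, z):
    """Linear classifier sign(w . z + b) in the mapped space."""
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b > 0 else -1

# Toy data: class +1 lies on both sides of class -1 on the number line.
data = [(-3.0, 1), (-2.0, 1), (-0.5, -1), (0.0, -1), (0.5, -1), (2.0, 1), (3.0, 1)]

changes = label_changes(data)

# Hand-picked hyperplane in the mapped space: x^2 - 2 = 0.
w, b = (0.0, 1.0), -2.0
preds = [predict(w, b, phi(x)) for x, _ in data]
all_correct = preds == [y for _, y in data]

print("label switches in 1-D:", changes)
print("separated after mapping:", all_correct)
```

In the original space the labels switch twice along the axis, so no single threshold works; after the mapping, one hyperplane on the x² coordinate classifies every example correctly.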
