- 656 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about 'K-nearest neighbor methods' - victoria

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

X

Kernel regression

- aka locally weighted regression, locally linear regression, LOESS, …

What does making the kernel wider do to bias and variance?

BellCore’s MovieRecommender

- Participants sent email to [email protected]
- System replied with a list of 500 movies to rate on a 1-10 scale (250 random, 250 popular)
- Only subset need to be rated
- New participant P sends in rated movies via email
- System compares ratings for P to ratings of (a random sample of) previous users
- Most similar users are used to predict scores for unrated movies (more later)
- System returns recommendations in an email message.

Suggested Videos for: John A. Jamus.

- Your must-see list with predicted ratings:
- 7.0 "Alien (1979)"
- 6.5 "Blade Runner"
- 6.2 "Close Encounters Of The Third Kind (1977)"
- Your video categories with average ratings:
- 6.7 "Action/Adventure"
- 6.5 "Science Fiction/Fantasy"
- 6.3 "Children/Family"
- 6.0 "Mystery/Suspense"
- 5.9 "Comedy"
- 5.8 "Drama"

The viewing patterns of 243 viewers were consulted. Patterns of 7 viewers were found to be most similar. Correlation with target viewer:

- 0.59 viewer-130 ([email protected])
- 0.55 bullert,jane r ([email protected])
- 0.51 jan_arst ([email protected])
- 0.46 Ken Cross ([email protected])
- 0.42 rskt ([email protected])
- 0.41 kkgg ([email protected])
- 0.41 bnn ([email protected])
- By category, their joint ratings recommend:
- Action/Adventure:
- "Excalibur" 8.0, 4 viewers
- "Apocalypse Now" 7.2, 4 viewers
- "Platoon" 8.3, 3 viewers
- Science Fiction/Fantasy:
- "Total Recall" 7.2, 5 viewers
- Children/Family:
- "Wizard Of Oz, The" 8.5, 4 viewers
- "Mary Poppins" 7.7, 3 viewers

- Mystery/Suspense:
- "Silence Of The Lambs, The" 9.3, 3 viewers
- Comedy:
- "National Lampoon\'s Animal House" 7.5, 4 viewers
- "Driving Miss Daisy" 7.5, 4 viewers
- "Hannah and Her Sisters" 8.0, 3 viewers
- Drama:
- "It\'s A Wonderful Life" 8.0, 5 viewers
- "Dead Poets Society" 7.0, 5 viewers
- "Rain Man" 7.5, 4 viewers
- Correlation of predicted ratings with your actual ratings is: 0.64 This number measures ability to evaluate movies accurately for you. 0.15 means low ability. 0.85 means very good ability. 0.50 means fair ability.

Algorithms for Collaborative Filtering 1: Memory-Based Algorithms (Breese et al, UAI98)

- vi,j= vote of user i on item j
- Ii = items for which user i has voted
- Mean vote for i is
- Predicted vote for “active user” a is weighted sum

weights of n similar users

normalizer

Basic k-nearest neighbor classification

- Training method:
- Save the training examples
- At prediction time:
- Find the k training examples (x1,y1),…(xk,yk) that are closest to the test example x
- Predict the most frequent class among those yi’s.
- Example: http://cgm.cs.mcgill.ca/~soss/cs644/projects/simard/

What is the decision boundary?

Voronoi diagram

Basic k-nearest neighbor classification

- Training method:
- Save the training examples
- At prediction time:
- Find the k training examples (x1,y1),…(xk,yk) that are closest to the test example x
- Predict the most frequent class among those yi’s.
- Improvements:
- Weighting examples from the neighborhood
- Measuring “closeness”
- Finding “close” examples in a large training set quickly

Ways of rescaling for KNN

Dot product:

Cosine distance:

TFIDF weights for text: for doc j, feature i: xi=tfi,j * idfi :

#docs in corpus

#occur. of term i in doc j

#docs in corpus that contain term i

William W. Cohen & Haym Hirsh (1998): Joins that Generalize: Text Classification Using WHIRL in KDD 1998: 169-173.

M2

Vitor Carvalho and William W. Cohen (2008): Ranking Users for Intelligent Message Addressing in ECIR-2008, and current work with Vitor, me, and Ramnath Balasubramanyan

Computing KNN: pros and cons

- Storage: all training examples are saved in memory
- A decision tree or linear classifier is much smaller
- Time: to classify x, you need to loop over all training examples (x’,y’) to compute distance between x and x’.
- However, you get predictions for every class y
- KNN is nice when there are many many classes
- Actually, there are some tricks to speed this up…especially when data is sparse (e.g., text)

Efficiently implementing KNN (for text)

IDF is nice computationally

Tricks with fast KNN

K-means using r-NN

- Pick k points c1=x1,….,ck=xkas centers
- For each xi, find Di=Neighborhood(xi)
- For each xi, let ci=mean(Di)
- Go to step 2….

Efficiently implementing KNN

dj3

Selective classification: given a training set and test set, find the N test cases that you can most confidently classify

dj2

dj4

Download Presentation

Connecting to Server..