Probability based Recommendation System

Probability based Recommendation System. Course: ECE541. Chetan Tonde, Vrajesh Vyas, Ashwin Revo. Under the guidance of Prof. R. D. Yates.


Presentation Transcript


  1. Probability based Recommendation System
     Course: ECE541
     Chetan Tonde, Vrajesh Vyas, Ashwin Revo
     Under the guidance of Prof. R. D. Yates

  2. Problem Statement
     • Given: Netflix™ data as (user_id, movie_id, rating) records for users U = {u1, u2, ..., un}, each with a movie history Y = {y1, y2, ..., ym} and corresponding ratings V = {v1, v2, ..., vm}.
     • Goal: estimate ratings for unseen movies. [To recommend movies to a user, we select the K unseen movies with the highest estimated ratings.]
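The top-K selection step can be sketched as follows. This is a minimal illustration, assuming the predicted ratings for one user are already available as an array; the function name `top_k_unseen` and the toy numbers are hypothetical and not taken from the slides.

```python
import numpy as np

def top_k_unseen(predicted, seen_mask, k):
    """Return indices of the k unseen movies with the highest predicted rating."""
    # Movies the user has already rated are excluded by scoring them -inf.
    scores = np.where(seen_mask, -np.inf, predicted.astype(float))
    k = min(k, int((~seen_mask).sum()))
    order = np.argsort(-scores, kind="stable")  # descending by score
    return order[:k].tolist()

# Toy example: five movies, the third one already seen by the user.
predicted = np.array([4.2, 3.1, 4.8, 2.0, 4.5])
seen = np.array([False, False, True, False, False])
recommended = top_k_unseen(predicted, seen, k=2)
print(recommended)
```

Movie 2 has the highest predicted rating but is skipped because the user has already seen it.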

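  3. Basic Idea
     • Method suggested by T. Hofmann, called probabilistic Latent Semantic Analysis (pLSA).
     • The random variables u (user) and y (movie) are not independent: P(u, y) ≠ P(u) P(y).
     • Introduce a latent variable z that makes user u and movie y conditionally independent: P(u, y | z) = P(u | z) P(y | z). (θ = model parameters)
     • In effect, z groups users and movies into latent classes; within a class, the user and the movie are treated as independent.
     [Diagram: u and y connected only through z]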
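Conditional independence given z means the movie distribution for a user can be written as p(y | u) = Σ_z p(y | z) p(z | u). The sketch below evaluates that sum as a matrix product; the probability tables are made-up toy values (2 users, 3 movies, 2 latent classes), not parameters from the project.

```python
import numpy as np

# Hypothetical toy parameters.
p_z_given_u = np.array([[0.9, 0.1],        # rows: users, columns: latent class z
                        [0.2, 0.8]])
p_y_given_z = np.array([[0.7, 0.2, 0.1],   # rows: latent class z, columns: movies
                        [0.1, 0.3, 0.6]])

# p(y | u) = sum_z p(y | z) p(z | u), computed for all users and movies at once.
p_y_given_u = p_z_given_u @ p_y_given_z
print(p_y_given_u)
```

Each row of `p_y_given_u` is again a probability distribution over movies, because it is a convex combination of the rows of `p_y_given_z`.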

  4. Basic Idea - Prediction of ratings
     • Based on z and y, the model is extended to predict the rating v.
     • The distribution of ratings is assumed to be a mixture of Gaussians whose mean µ and standard deviation σ depend on y and z (µy,z and σy,z).
     [Diagram: graphical model relating u, y, z and v] * Hofmann [1]
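A minimal sketch of prediction under this mixture, assuming the mixture weights are the user's class probabilities p(z | u) (an assumption consistent with the latent-class setup, not stated explicitly on the slide). The function names and toy parameters are hypothetical.

```python
import numpy as np

def expected_rating(p_z_given_u, mu_yz):
    """E[v | u, y] = sum_z p(z | u) * mu_{y,z} for one user and one movie."""
    return float(np.dot(p_z_given_u, mu_yz))

def mixture_density(v, p_z_given_u, mu_yz, sigma_yz):
    """p(v | u, y) as a Gaussian mixture with weights p(z | u)."""
    comp = np.exp(-0.5 * ((v - mu_yz) / sigma_yz) ** 2) / (np.sqrt(2 * np.pi) * sigma_yz)
    return float(np.dot(p_z_given_u, comp))

# Toy parameters for a single (user, movie) pair and two latent classes.
weights = np.array([0.75, 0.25])   # p(z | u)
mu = np.array([4.0, 2.0])          # mu_{y,z}
sigma = np.array([0.5, 1.0])       # sigma_{y,z}
pred = expected_rating(weights, mu)
print(pred)
```

The mean of a mixture is the weighted average of the component means, so the point prediction needs only µy,z and the class weights; σy,z matters for the likelihood used during fitting.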

  5. Model-fitting
     • Expectation Maximization (EM) algorithm.
     • EM is an iterative procedure that converges to a (local) maximum of the posterior probability P(θ|X) ∝ p(X|θ) p(θ), where θ = {σyz, µyz} is the set of unknown parameters of the data X.
     • In other words, EM is a general method for finding the maximum-likelihood estimate of the parameters from a given data set (the two estimates coincide when the prior p(θ) is uniform).
     • The purpose is to estimate the θ of the real data distribution.
     * Hofmann [1]
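The slide's own update equations are not preserved in this transcript. As an illustration only, the sketch below uses the standard responsibility-weighted updates for a Gaussian latent-class model: the E-step computes q(z) ∝ p(z | u) N(v; µyz, σyz) for every observed rating, and the M-step re-estimates p(z | u), µyz and σyz from those weights. The function name `em_step`, the floor `min_sigma`, and the synthetic data are all assumptions.

```python
import numpy as np

def em_step(users, movies, ratings, p_z_u, mu, sigma, min_sigma=1e-3):
    """One EM iteration for a Gaussian latent-class rating model.

    users, movies, ratings: 1-D arrays, one entry per observed rating.
    p_z_u: (n_users, K) rows p(z | u); mu, sigma: (n_movies, K).
    Returns updated parameters and the log-likelihood of the input parameters.
    """
    v = ratings[:, None]
    m, s = mu[movies], sigma[movies]                       # (n_obs, K)

    # E-step in log space: q(z) proportional to p(z | u) * N(v; mu_yz, sigma_yz).
    log_joint = (np.log(p_z_u[users]) - np.log(s)
                 - 0.5 * np.log(2 * np.pi) - 0.5 * ((v - m) / s) ** 2)
    shift = log_joint.max(axis=1, keepdims=True)
    log_norm = shift + np.log(np.exp(log_joint - shift).sum(axis=1, keepdims=True))
    q = np.exp(log_joint - log_norm)                       # rows sum to 1
    log_lik = float(log_norm.sum())

    # M-step: responsibility-weighted re-estimates.
    new_p = np.zeros_like(p_z_u)
    np.add.at(new_p, users, q)
    new_p /= new_p.sum(axis=1, keepdims=True)

    weight = np.zeros_like(mu)
    np.add.at(weight, movies, q)
    weighted_sum = np.zeros_like(mu)
    np.add.at(weighted_sum, movies, q * v)
    new_mu = weighted_sum / weight

    weighted_sq = np.zeros_like(mu)
    np.add.at(weighted_sq, movies, q * (v - new_mu[movies]) ** 2)
    # Assumed safeguard: keep sigma away from zero.
    new_sigma = np.maximum(np.sqrt(weighted_sq / weight), min_sigma)
    return new_p, new_mu, new_sigma, log_lik

# Synthetic data (hypothetical): 2 taste groups, 20 users, 5 movies, all pairs rated.
rng = np.random.default_rng(0)
n_users, n_movies, K = 20, 5, 2
group = np.arange(n_users) % 2
true_mu = np.array([[4.0, 2.0], [4.0, 2.0], [4.0, 2.0], [2.0, 4.0], [2.0, 4.0]])
grid = np.meshgrid(np.arange(n_users), np.arange(n_movies), indexing="ij")
users, movies = grid[0].ravel(), grid[1].ravel()
ratings = true_mu[movies, group[users]] + 0.5 * rng.standard_normal(users.size)

p_z_u = rng.dirichlet(np.ones(K), size=n_users)
mu = ratings.mean() + 0.5 * rng.standard_normal((n_movies, K))
sigma = np.ones((n_movies, K))

log_liks = []
for _ in range(10):
    p_z_u, mu, sigma, ll = em_step(users, movies, ratings, p_z_u, mu, sigma)
    log_liks.append(ll)
print(log_liks[0], log_liks[-1])
```

Working in log space avoids underflow when a rating is far from a class mean. The sketch assumes every user and every movie has at least one rating; a sparse real data set would need guards for empty rows.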

  6. Results - total log-likelihood
     [Plot not preserved in this transcript] * Hofmann [1]

  7. Removing Outliers

  8. Removing user outliers
     • Compute the mean µ and standard deviation σ of the number of movies viewed per user, across all users.
     • Replace each user whose view count is above µ + 3σ or below µ − 3σ with another user from the data set: the one whose ratings on commonly rated movies have the maximum (first-ranked) user-user correlation with the outlier.
     • This removes outliers and thus improves the RMSE.
     • RMSE before: 0.6905
     • RMSE after: 0.6598
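One possible reading of this procedure, sketched on a toy rating matrix where 0 means "not viewed". The helper names, the use of Pearson correlation via `np.corrcoef`, and the choice to copy the substitute user's whole rating row are assumptions; the slides do not spell out these details.

```python
import numpy as np

def find_view_outliers(R):
    """Flag users whose number of viewed movies lies outside mu +/- 3*sigma."""
    views = (R > 0).sum(axis=1)
    mu, sigma = views.mean(), views.std()
    return np.abs(views - mu) > 3 * sigma

def most_correlated_user(R, u, candidates):
    """Pick the candidate whose ratings on commonly viewed movies correlate best with u."""
    best, best_corr = None, -np.inf
    for c in candidates:
        common = (R[u] > 0) & (R[c] > 0)
        if common.sum() < 2:
            continue                      # correlation needs at least two points
        a, b = R[u, common], R[c, common]
        if a.std() == 0 or b.std() == 0:
            continue                      # correlation undefined for constant ratings
        corr = np.corrcoef(a, b)[0, 1]
        if corr > best_corr:
            best, best_corr = c, corr
    return best

# Toy matrix: 19 users rate 5 movies each, user 19 rates all 50 movies.
R = np.zeros((20, 50))
R[:19, :5] = [5, 4, 3, 2, 1]
R[3, :5] = [1, 2, 3, 4, 5]
R[19, :] = 3
R[19, :5] = [1, 2, 3, 4, 5]

outliers = find_view_outliers(R)
inliers = np.flatnonzero(~outliers)
R_clean = R.copy()
for u in np.flatnonzero(outliers):
    sub = most_correlated_user(R, u, inliers)
    if sub is not None:
        R_clean[u] = R[sub]               # replace the outlier with its best match
print(np.flatnonzero(outliers))
```

Note that a single extreme user can only exceed 3σ when there are enough other users (at least 11 if all the others are identical), which is why the toy matrix has 20 rows.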

  9. Results - Observations
     • We ran the method on a database of 8000 users and 600 movies, with ratings ranging from 0 to 5 (0 meaning the movie was not viewed).
     • The results were verified by blanking out a few hundred ratings and predicting them (observed RMSE ≈ 0.6598).
     • The number of latent classes k (values of z) was chosen empirically as the one giving the lowest RMSE (k = 6; k < 6 under-fits and k > 6 over-fits).
     • Since users rate on a personal scale, each user's ratings need to be normalized to mean 0 and variance 1.
     • For some movies, the variance of ratings across users is such that the computed σyz becomes very small for certain z. To avoid this, we replaced every σyz below 1e-4 with 1.5.
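The normalization, the σyz replacement rule, and the RMSE check can be sketched as below. The threshold 1e-4 and replacement value 1.5 come from the slide; the function names, the handling of users with constant ratings, and the toy data are assumptions.

```python
import numpy as np

def normalize_per_user(R):
    """Z-score each user's observed ratings (0 = not viewed) to mean 0, variance 1.

    Returns the normalized matrix, the mask of observed entries, and per-user
    (mean, std) so predictions can be mapped back to the original scale.
    """
    seen = R > 0
    Z = np.zeros_like(R, dtype=float)
    stats = np.zeros((R.shape[0], 2))
    for u in range(R.shape[0]):
        if not seen[u].any():
            stats[u] = (0.0, 1.0)
            continue
        mean, std = R[u, seen[u]].mean(), R[u, seen[u]].std()
        if std == 0:
            std = 1.0                     # assumption: constant raters keep scale 1
        Z[u, seen[u]] = (R[u, seen[u]] - mean) / std
        stats[u] = (mean, std)
    return Z, seen, stats

def repair_sigma(sigma, threshold=1e-4, replacement=1.5):
    """Replace every sigma below the threshold with the fixed replacement value."""
    return np.where(sigma < threshold, replacement, sigma)

def rmse(pred, true):
    """Root-mean-square error between predicted and held-out ratings."""
    diff = np.asarray(pred, dtype=float) - np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

R = np.array([[5.0, 3.0, 0.0, 1.0],
              [4.0, 0.0, 0.0, 4.0]])
Z, seen, stats = normalize_per_user(R)
fixed = repair_sigma(np.array([1e-6, 0.8]))
err = rmse([1.0, 2.0], [1.0, 4.0])
print(Z, fixed, err)
```

The observed-entry mask is returned separately because, after normalization, a value of 0 no longer distinguishes "not viewed" from "rated exactly at the user's mean".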

  10. Results - Observations
     • The EM algorithm is assumed to have converged when the increase in the log-likelihood between iterations is very small (< 100), at which point the clustering has reached the best approximation available from that starting point.
     • The rating matrix is sparse, which makes the method sensitive to unreliable ratings and makes fitting the model to the true ratings difficult.
     • Convergence of EM depends significantly on the initial estimate of the model, which is a mixture of Gaussians, so initialization was done using the observed ratings.
     • The model over-fits the given data, which is undesirable because it also fits the sampling noise. This shows up when comparing RMSE on training data with RMSE on unseen data (the latter is high).
     • Execution time is around a minute per iteration, which is high. This is due to the complexity of the update equations and the large data set.
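The stopping rule can be sketched as a small driver loop, assuming the quantity being tracked is the log-likelihood and that each iteration is available as a function returning the new state and its log-likelihood. The names `run_until_converged` and `toy_step`, and the `max_iter` cap, are hypothetical.

```python
def run_until_converged(step, state, tol=100.0, max_iter=200):
    """Call `step` repeatedly until the log-likelihood gain drops below `tol`."""
    history = []
    prev = None
    for _ in range(max_iter):
        state, log_lik = step(state)
        history.append(log_lik)
        if prev is not None and log_lik - prev < tol:
            break                         # gain too small: treat as converged
        prev = log_lik
    return state, history

def toy_step(t):
    """Stand-in for one EM iteration: log-likelihood rises with shrinking gains."""
    return t + 1, -1000.0 * 0.5 ** t

final_state, history = run_until_converged(toy_step, 0)
print(history)
```

In the toy run the gains are 500, 250, 125 and 62.5, so the loop stops after the fifth evaluation. An absolute threshold such as 100 only makes sense relative to the size of the data set, since the total log-likelihood scales with the number of ratings.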

  11. References
     [1] T. Hofmann. Collaborative Filtering via Gaussian Latent Semantic Analysis. In ACM Transactions on Information Systems, Volume 22, January 2004.
     [2] A. Das, M. Datar, A. Garg, S. Rajaram. Google News Personalization: Scalable Online Collaborative Filtering. WWW 2007, Track: Industrial Practice and Experience, May 8-12, Banff, Alberta, Canada.
     [3] Andrew Ng. Lectures on Machine Learning. WWW, http://www.youtube.com/user/stanforduniversity.

  12. Thank-You
