Modeling User Rating Profiles


Modeling User Rating Profiles For Collaborative Filtering. Benjamin M. Marlin. University of Toronto, Department of Computer Science, Toronto, Ontario, Canada.


Modeling User Rating Profiles For Collaborative Filtering

Benjamin M. Marlin

University of Toronto, Department of Computer Science, Toronto, Ontario, Canada


AP 08


1. Abstract

• We present a new latent variable model for rating-based collaborative filtering called the User Rating Profile model (URP). URP has complete generative semantics at the user and rating profile levels.

• URP is related to several models including a multinomial mixture model, the aspect model, and latent Dirichlet allocation, but has advantages over each.

• A variational Expectation Maximization procedure is used to fit the URP model. Rating prediction makes use of a well defined variational inference procedure.

• Empirical results on two rating prediction tasks using the EachMovie and MovieLens data sets show that URP attains lower error rates than the multinomial mixture model, the aspect model, and neighborhood-based techniques.

2. Introduction


Collaborative Filtering Formulations

Preference Indicators

Co-occurrence Pair (u, y): u is a user index and y is an item index.

Count Vector (n^u_1, n^u_2, …, n^u_M): n^u_y is the number of times (u, y) is observed.

Rating Triplet (u, y, r): u is a user index, y is an item index, r is a rating value.

Rating Vector (r^u_1, r^u_2, …, r^u_M): r^u_y is the rating assigned to item y by user u.

Additional Features

In a pure formulation no additional features are used. A hybrid formulation incorporates additional content-based item and user features.

Preference Dynamics

In a sequential formulation the rating process is modeled as a time series. In a non-sequential formulation preferences are assumed to be static.


The Pure, Non-Sequential, Rating-Based Formulation

Formal Description:

Items: y = 1, …, M
Users: u = 1, …, N
Ratings: r = 1, …, V

Additional Features: None
Preference Dynamics: Non-sequential
Preference Indicators: Ordinal rating vectors

Tasks: The two main tasks under this formulation are recommendation and rating prediction.

Rating prediction is the task of estimating all unknown ratings for the active user.

The focus of research is developing highly accurate methods for rating prediction.

[Figure 1 labels: Rating Database, Active User Ratings, Rating Prediction, Predicted Ratings, Item List]

Figure 1: Given a rating prediction method, a recommendation method is easily obtained: predict, then sort.
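The "predict, then sort" reduction in Figure 1 can be sketched in a few lines. This is a minimal illustration only: the `recommend` function, the `toy_predict` stand-in, and the dictionary layout (item index mapped to rating) are assumptions made for the example, not part of the original poster.

```python
from typing import Callable, Dict, List

def recommend(predict: Callable[[Dict[int, int]], Dict[int, float]],
              active_ratings: Dict[int, int],
              top_n: int = 10) -> List[int]:
    """Turn any rating predictor into a recommender: predict, then sort."""
    predicted = predict(active_ratings)  # item index -> predicted rating
    # Sort items from highest to lowest predicted rating.
    ranked = sorted(predicted, key=lambda y: predicted[y], reverse=True)
    # Recommend only items the active user has not rated yet.
    return [y for y in ranked if y not in active_ratings][:top_n]

def toy_predict(active_ratings: Dict[int, int]) -> Dict[int, float]:
    # Hypothetical stand-in for any rating prediction method.
    return {1: 4.0, 2: 2.5, 3: 4.8, 4: 3.1}

items = recommend(toy_predict, {1: 4}, top_n=2)
```

Any of the rating prediction methods discussed in the later sections could be passed in place of `toy_predict`.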


3. Related Work

Neighborhood Methods:

• Introduced by Resnick et al (GroupLens), Shardanand and Maes (Ringo).

• All variants can be seen as modifications of the K-Nearest Neighbor classifier.

Rating Prediction:

1. Compute similarity measure between active user and all users in database.

2. Compute predicted rating for each item.
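The two steps above can be sketched as follows. The poster does not say which similarity measure or weighting scheme is used, so this example assumes one common variant: Pearson correlation over co-rated items and a similarity-weighted average of mean-centered ratings. The function names and data layout are illustrative.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation over items rated by both users (0.0 if undefined)."""
    common = [y for y in a if y in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[y] for y in common) / len(common)
    mean_b = sum(b[y] for y in common) / len(common)
    cov = sum((a[y] - mean_a) * (b[y] - mean_b) for y in common)
    var_a = sum((a[y] - mean_a) ** 2 for y in common)
    var_b = sum((b[y] - mean_b) ** 2 for y in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def predict_rating(active, database, item):
    """Step 1: similarity to every user. Step 2: weighted prediction for item."""
    mean_active = sum(active.values()) / len(active)
    numerator = denominator = 0.0
    for other in database:
        if item not in other:
            continue  # this neighbor has no opinion on the item
        w = pearson(active, other)
        mean_other = sum(other.values()) / len(other)
        numerator += w * (other[item] - mean_other)
        denominator += abs(w)
    if denominator == 0:
        return mean_active  # fall back to the active user's mean rating
    return mean_active + numerator / denominator

active = {1: 5, 2: 1}
database = [{1: 5, 2: 1, 3: 5}, {1: 1, 2: 5, 3: 1}]
prediction = predict_rating(active, database, 3)
```

Here the second user disagrees perfectly with the active user, so the negative weight turns that user's low rating of item 3 into evidence for a high rating.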

Multinomial Mixture Model:

• A simple mixture model with fast, reliable learning by EM, and low prediction time.

• Simple but correct generative semantics. Each profile is generated by 1 of K types.




Rating Prediction:
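As a sketch of how prediction works in a multinomial mixture model, the code below computes the posterior over the K types given the active user's observed ratings and then mixes the per-type rating distributions. The parameter names `theta` (type prior) and `beta` (per-type, per-item rating distributions) and the 0-based rating indices are assumptions for illustration; this is the standard mixture-model computation and may differ in notation from the original slide.

```python
def type_posterior(theta, beta, observed):
    """P(Z = z | observed ratings), with beta[z][y][v] = P(R_y = v | Z = z)."""
    weights = []
    for z, prior in enumerate(theta):
        w = prior
        for y, r in observed.items():
            w *= beta[z][y][r]  # likelihood of each observed rating under type z
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def rating_distribution(theta, beta, observed, item):
    """Predictive distribution over rating values for an unrated item."""
    post = type_posterior(theta, beta, observed)
    num_values = len(beta[0][item])
    return [sum(post[z] * beta[z][item][v] for z in range(len(theta)))
            for v in range(num_values)]

# Two types, two items, two rating values (indices 0 and 1).
theta = [0.5, 0.5]
beta = [
    [[0.9, 0.1], [0.8, 0.2]],  # type 0: distributions for item 0 and item 1
    [[0.1, 0.9], [0.2, 0.8]],  # type 1
]
dist = rating_distribution(theta, beta, {0: 0}, item=1)
```

For long rating profiles the product of likelihoods can underflow, so a practical version would accumulate log probabilities instead.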




The Aspect Model:

• Many versions proposed by Hofmann. Of main interest are the dyadic and triadic versions, and the new vector version proposed by Marlin.

• All have incomplete generative semantics.

Learning (Vector):

Rating Prediction (Vector):

Latent Dirichlet Allocation:

• Proposed by Blei et al. for text modeling.

• Can be used in a co-occurrence based CF formulation. Cannot model ratings.

• A correct generative version of the dyadic aspect model. The user's distribution over types is a random variable with a Dirichlet prior.

• Model learned using variational EM or Minka's Expectation Propagation.

• Exact inference is not possible.

• Needs approximate inference. Variational methods result in an iterative algorithm.


Graphical Models:

Figure 2: Dyadic Aspect Model

Variable U: User index
Variable Z: Attitude index
Variable Y: Item index
Parameter θ: P(Z|U=u)
Parameter β: P(Y|Z=z)

Co-occurrence to Ratings

Figure 3: Triadic Aspect Model

Variable U: User index
Variable Z: Attitude index
Variable Y: Item index
Variable R: Rating value
Parameter θ: P(Z|U=u)
Parameter β: P(R|Z=z,Y=y)

Ratings to Rating Profiles

Figure 4: Vector Aspect Model

Variable U: User index
Variable Z_y: Attitude index
Variable R_y: Rating value
Variable Y: Item index
Parameter θ: P(Z|U=u)
Parameter β: P(R|Z=z,Y=y)




Figure 5: LDA Model

Variable θ: P(Z|U=u)
Variable Z: Attitude index
Variable Y: Item index
Parameter α: Dirichlet prior
Parameter β: P(Y|Z=z)

Co-occurrence to Rating Profile

Figure 6: URP Model

Variable θ: P(Z|U=u)
Variable Z_y: Attitude index
Variable R_y: Rating value
Variable Y: Item index
Parameter α: Dirichlet prior
Parameter β: P(R_y|Z=z)


4. The URP Model

Model Specification:

• The latent space description of a user is a Dirichlet random variable θ that encodes a multinomial distribution over user types.

• Each setting of the multinomial variables Z_y is an index into K user types or user attitudes.

• Each user attitude is represented by a multinomial distribution over ratings for each item, encoded by β.

• The multinomial variables R_y give the ratings for each item y. Possible values are from 1 to V.

• Unlike a simple mixture model, each user has a unique distribution over user attitudes.

• Unlike the aspect model family, there are proper generative semantics on θ.

• Unlike LDA, URP generates a set of complete user rating profiles.

Generative Process:

1. For each user u = 1 to N
2.     Sample θ ~ Dirichlet(α)
3.     For each item y = 1 to M
4.         Sample z ~ Multinomial(θ)
5.         Sample r ~ Multinomial(β_yz)
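The generative process can be simulated directly. The sketch below follows the five steps using only the Python standard library; the function names, the nested-list layout `beta[y][z]` (a distribution over the V rating values for item y under attitude z), and the seeding scheme are assumptions made for the example.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def sample_index(probs, rng):
    """Draw a single index from a discrete distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def sample_profiles(alpha, beta, num_users, seed=0):
    """Generate complete rating profiles; ratings take values 1..V."""
    rng = random.Random(seed)
    profiles = []
    for _ in range(num_users):                       # step 1: each user u
        theta = sample_dirichlet(alpha, rng)         # step 2: theta ~ Dirichlet(alpha)
        profile = []
        for item_params in beta:                     # step 3: each item y
            z = sample_index(theta, rng)             # step 4: z ~ Multinomial(theta)
            r = sample_index(item_params[z], rng)    # step 5: r ~ Multinomial(beta_yz)
            profile.append(r + 1)
        profiles.append(profile)
    return profiles

# K = 2 attitudes, M = 3 items, V = 5 rating values.
alpha = [1.0, 1.0]
low = [0.6, 0.3, 0.1, 0.0, 0.0]
high = [0.0, 0.0, 0.1, 0.3, 0.6]
beta = [[low, high], [high, low], [low, high]]
profiles = sample_profiles(alpha, beta, num_users=4)
```

Note that a fresh attitude z is drawn for every item, so a single user's profile can mix several attitudes, which is what distinguishes URP from a simple mixture model.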



Variational Approximation

• Exact inference is intractable with URP. We define a fully factorized approximate q-distribution with variational multinomial parameters φ_u and variational Dirichlet parameters γ_u.

Parameter Estimation

Variational Inference
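The update equations for this step did not survive in the transcript. As a rough illustration only, the sketch below uses LDA-style mean-field updates for a fully factorized q-distribution with multinomial parameters (`phi`) and Dirichlet parameters (`gamma`). The exact form is an assumption based on the model's similarity to LDA, not a reproduction of the original slide; the `digamma` helper, the function names, and the `beta[y][z][v]` layout are likewise illustrative.

```python
from math import exp, log

def digamma(x):
    """Digamma for x > 0, via the recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return (result + log(x) - 0.5 / x
            - inv2 * (1.0 / 12 - inv2 * (1.0 / 120 - inv2 / 252)))

def infer_user(alpha, beta, observed, iterations=50):
    """Assumed mean-field updates; observed maps item -> 0-based rating index."""
    K = len(alpha)
    gamma = [a + len(observed) / K for a in alpha]
    phi = {}
    for _ in range(iterations):
        psi_sum = digamma(sum(gamma))
        log_theta = [digamma(g) - psi_sum for g in gamma]  # E_q[log theta_z]
        for y, r in observed.items():
            weights = [exp(log_theta[z]) * beta[y][z][r] for z in range(K)]
            total = sum(weights)
            phi[y] = [w / total for w in weights]
        gamma = [alpha[z] + sum(phi[y][z] for y in observed) for z in range(K)]
    return gamma, phi

def rating_distribution(gamma, beta, item):
    """Mix per-attitude rating distributions by the normalized gamma."""
    total = sum(gamma)
    num_values = len(beta[item][0])
    return [sum(gamma[z] / total * beta[item][z][v] for z in range(len(gamma)))
            for v in range(num_values)]

alpha = [1.0, 1.0]
beta = [[[0.9, 0.1], [0.1, 0.9]],   # item 0: attitude 0, attitude 1
        [[0.8, 0.2], [0.2, 0.8]]]   # item 1
gamma, phi = infer_user(alpha, beta, {0: 0})
dist = rating_distribution(gamma, beta, item=1)
```

The iteration alternates between the two sets of variational parameters until they stop changing; a fixed iteration count is used here only to keep the sketch short.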



Rating Prediction

• Once rating distributions are estimated, any number of prediction techniques can be used. The prediction technique should match the error measure used.
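To illustrate matching the prediction technique to the error measure: the median of a predictive distribution minimizes expected absolute error, while the mean minimizes expected squared error. The helper names below are illustrative, and rating values are assumed to run from 1 to V.

```python
def predict_median(dist):
    """Smallest rating whose cumulative probability reaches 0.5 (suits absolute error)."""
    cumulative = 0.0
    for value, prob in enumerate(dist, start=1):
        cumulative += prob
        if cumulative >= 0.5:
            return value
    return len(dist)

def predict_mean(dist):
    """Expected rating (suits squared error)."""
    return sum(value * prob for value, prob in enumerate(dist, start=1))

dist = [0.1, 0.1, 0.2, 0.5, 0.1]  # P(R = 1), ..., P(R = 5)
median_rating = predict_median(dist)
mean_rating = predict_mean(dist)
```

With an absolute-error measure such as the one used in the experiments, the median prediction is the natural choice.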

5. Experimentation

Strong Generalization Experiment:

• Users split into training set and testing set. Ratings for test users split into observed and unobserved sets. Trained on training users, tested on test users.

• Repeated on 3 random splits of data.

Weak Generalization Experiment:

• Available ratings for each user split into observed and unobserved sets. Trained on the observed ratings, tested on the unobserved ratings.

• Repeated on 3 random splits of data.
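The two protocols differ only in what is held out, which the following sketch makes concrete. The number of ratings held out per user is not stated on the poster, so `num_held_out` is a hypothetical parameter, as are the function names and the dictionary layout (user mapped to an item-to-rating dictionary).

```python
import random

def split_ratings(profile, num_held_out, rng):
    """Split one user's ratings into observed and unobserved sets."""
    held = set(rng.sample(sorted(profile), num_held_out))
    observed = {y: r for y, r in profile.items() if y not in held}
    unobserved = {y: r for y, r in profile.items() if y in held}
    return observed, unobserved

def weak_split(profiles, num_held_out=1, seed=0):
    """Weak generalization: every user contributes observed and unobserved ratings."""
    rng = random.Random(seed)
    return {u: split_ratings(profiles[u], num_held_out, rng) for u in sorted(profiles)}

def strong_split(profiles, num_test_users, num_held_out=1, seed=0):
    """Strong generalization: test users are never seen during training."""
    rng = random.Random(seed)
    test_users = set(rng.sample(sorted(profiles), num_test_users))
    train = {u: p for u, p in profiles.items() if u not in test_users}
    test = {u: split_ratings(profiles[u], num_held_out, rng)
            for u in sorted(test_users)}
    return train, test

profiles = {u: {y: 3 for y in range(5)} for u in range(6)}
weak = weak_split(profiles)
train, test = strong_split(profiles, num_test_users=2)
```

Repeating either split with three different seeds would correspond to the three random splits mentioned above.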


Error Measure:

Normalized Mean Absolute Error:

• Average over all users of the absolute difference between predicted and actual ratings.

• Normalized by the expectation of the difference between predicted and actual ratings under the empirical rating distribution of the base data set.

Data Sets:

EachMovie: Compaq Systems Research Center

• Ratings: 2,811,983
• Sparsity: 97.6%
• Filtering: 20 ratings
• Users: 72,916
• Items: 1,628
• Rating Values: 6

MovieLens: GroupLens Research Center

• Ratings: 1,000,209
• Sparsity: 95.7%
• Filtering: 20 ratings
• Users: 6,040
• Items: 3,900
• Rating Values: 5

Figure 7: Distribution of ratings in weak and strong filtered data sets compared to base data sets.
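The normalized mean absolute error described under Error Measure can be sketched as follows. The normalizer is interpreted here as the expected absolute difference between two ratings drawn independently from the empirical rating distribution; that reading, along with the function names and input layout, is an assumption made for this example.

```python
def expected_abs_difference(rating_probs):
    """E|a - b| for a, b drawn independently from the rating distribution (values 1..V)."""
    values = range(1, len(rating_probs) + 1)
    return sum(pa * pb * abs(a - b)
               for a, pa in zip(values, rating_probs)
               for b, pb in zip(values, rating_probs))

def nmae(predictions, rating_probs):
    """predictions holds one list of (predicted, actual) pairs per user."""
    per_user = [sum(abs(p - a) for p, a in pairs) / len(pairs)
                for pairs in predictions]
    mae = sum(per_user) / len(per_user)  # average over users
    return mae / expected_abs_difference(rating_probs)

uniform = [0.2] * 5  # toy empirical distribution over 5 rating values
normalizer = expected_abs_difference(uniform)
score = nmae([[(4, 5), (3, 3)], [(2, 4)]], uniform)
```

Under this normalization a score of 1.0 corresponds to predictions no better than random draws from the base rating distribution, which makes error rates comparable across data sets with different rating scales.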


6. Results



Figure 9: MovieLens Strong Generalization Results

Figure 8: MovieLens Weak Generalization Results

• URP and the aspect model attain the same minimum weak generalization error rate, but URP does so using far fewer model parameters.




Figure 11: EachMovie Strong Generalization Results

Figure 10: EachMovie Weak Generalization Results

• On the more difficult EachMovie data set, URP clearly performs better than the other rating prediction methods considered.


7. Conclusions and Future Work


• We have introduced URP, a new generative model designed specifically for pure, non-sequential, ratings-based collaborative filtering. URP has consistent generative semantics at both the user level and the rating profile level.

• Empirical results show that URP outperforms other popular rating prediction methods using fewer model parameters.

Future Work:

• Models with more intuitive generative semantics. A promising family of product models is currently under study.

• Models that integrate additional features, or sequential dynamics, or both.


8. References

1. D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993-1022, January 2003.

2. John S. Breese, David Heckerman, and Carl Kadie. Empirical Analysis of Predictive Algorithms for Collaborative Filtering. In Proceedings of the Fourteenth Annual Conference on Uncertainty in Artificial Intelligence, pages 43-52, July 1998.

3. Thomas Hofmann. Learning What People (Don't) Want. In Proceedings of the European Conference on Machine Learning (ECML), 2001.

5. Thomas Minka and John Lafferty. Expectation-Propagation for the Generative Aspect Model. In Proceedings of the 18th Conference on Uncertainty in Artificial Intelligence, 2002.

6. R. M. Neal and G. E. Hinton. A new view of the EM algorithm that justifies incremental, sparse and other variants. In M. I. Jordan, editor, Learning in Graphical Models, pages 355-368. Kluwer Academic Publishers, 1998.

7. P. Resnick, N. Iacovou, M. Suchak, P. Bergstrom, and J. Riedl. GroupLens: An Open Architecture for Collaborative Filtering of Netnews. In Proceedings of ACM 1994 Conference on Computer Supported Cooperative Work, pages 175-186, Chapel Hill, North Carolina, 1994. ACM.

8. Upendra Shardanand and Patti Maes. Social information filtering: Algorithms for automating "word of mouth". In Proceedings of ACM CHI'95, volume 1, pages 210-217, 1995.