enhancing matrix factorization through initialization for implicit feedback databases
Download
Skip this Video
Download Presentation
Enhancing Matrix Factorization Through Initialization for Implicit Feedback Databases

Loading in 2 Seconds...

play fullscreen
1 / 20

Enhancing Matrix Factorization Through Initialization for Implicit Feedback Databases - PowerPoint PPT Presentation


  • 81 Views
  • Uploaded on

Enhancing Matrix Factorization Through Initialization for Implicit Feedback Databases. Balázs Hidasi Domonkos Tikk Gravity R&D Ltd. Budapest University of Technology and Economics. CaRR workshop , 14th February 2012, Lisbon. Outline. Matrix factoriztion Initialization concept

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Enhancing Matrix Factorization Through Initialization for Implicit Feedback Databases' - rolf


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
enhancing matrix factorization through initialization for implicit feedback databases

Enhancing Matrix Factorization Through Initialization forImplicit Feedback Databases

Balázs Hidasi

Domonkos Tikk

Gravity R&D Ltd.

Budapest University of Technology and Economics

CaRR workshop, 14th February 2012, Lisbon

outline
Outline
  • Matrixfactoriztion
  • Initializationconcept
  • Methods
    • Naive
    • SimFactor
  • Results
  • Discussion
matrix factorization
MatrixFactorization
  • CollaborativeFiltering
  • One of the most commonapproaches
  • Approximatestheratingmatrixasproduct of low-rankmatrices

Items

R

P

Q

Users

matrix factorization1
Matrixfactorization
  • Initialize P and Q withsmallrandomnumbers
  • Teach P and Q
    • AlternatingLeastSquares
    • GradientDescent
    • Etc.
  • Transformsthedatato a featurespace
    • Separatelyforusers and items
    • Noisereduction
    • Compression
    • Generalization
implicit feedback
Implicit feedback
  • No ratings
  • User-iteminteractions (events)
  • Muchnoisier
    • Presence of an event mightnot be positivefeedback
    • Absence of an event doesnotmeannegativefeedback
    • No negativefeedback is available!
  • More commonproblem
  • MF for implicit feedback
    • Less accurateresultsduetonoise
    • Mostly ALS is used
    • Scalabilityproblems (ratingmatrix is dense)
concept
Concept
  • Good MF model
    • The featurevectors of similarentitiesaresimilar
    • Ifdata is toonoisy similarentitieswon’t be similarbytheirfeatures
  • Start MF from a „good” point
    • Featurevectorsimilaritiesare OK
  • Data is more thanjustevents
    • Metadata
      • Infoaboutitems/users
    • Contextualdata
      • Inwhatcontextdidtheeventoccured
    • Canweincorporatethosetohelp implicit MF?
naive approach
Naiveapproach
  • Describeitemsusinganydatawehave (detailedlater)
    • Long, sparsevectorsforitemdescription
  • Compressthesevectorstodensefeaturevectors
    • PCA, MLP, MF, …
    • Length of desiredvectors = Numberoffeaturesin MF
  • Usethesefeaturesas starting points
naive approach1
Naiveapproach
  • Compression and alsonoisereduction
  • Doesnotreallycareaboutsimilarities
  • Butoftenfeaturesimilaritiesarenotthatbad
  • If MF is used
    • Half of theresults is thrown out

Descriptors

Descriptorfeatures

Description of items

Itemfeatures

Items

simfactor algorithm
SimFactoralgorithm
  • Trytopreservesimilaritiesbetter
  • Starting from an MF of itemdescription
  • Similarities of items: DD’
    • Somemetricsrequiretransformationon D

Descriptors

Descriptorsfeatures

Description of items

(D)

Itemfeatures

Items

Description of items

(D’)

Itemsimilarities

(S)

Description of items

(D)

=

simfactor algorithm1
SimFactoralgorithm
  • Similarityapproximation
  • Y’Y  KxKsymmetric
    • Eigendecomposition

Itemfeatures (X’)

Itemsimilarities

(S)

Itemfeatures (X)

Descriptorsfeatures

(Y’)

Descriptorsfeatures

(Y)

Itemfeatures (X’)

Itemsimilarities

(S)

Itemfeatures (X)

Y’Y

simfactor algorithm2
SimFactoralgorithm
  • λ diagonal λ = SQRT(λ) * SQRT(λ)
  • X*U*SQRT(λ) = (SQRT(λ)*U’*X’)’=F
  • F is MxKmatrix
  • S  F * F’  Fusedforinitialization

Y’Y

U

λ

U’

=

Itemfeatures (X’)

Itemsimilarities

(S)

Itemfeatures (X)

U

SQRT

(λ)

SQRT

(λ)

U’

F’

Itemsimilarities

(S)

F

creating the description matrix
Creatingthedescriptionmatrix
  • „Any” dataabouttheentity
    • Vector-spacereprezentation
  • ForItems:
    • Metadatavector (title, category, description, etc)
    • Eventvector (whoboughttheitem)
    • Context-statevector (inwhichcontextstatewasitbought)
    • Context-event (inwhichcontextstatewhoboughtit)
  • ForUsers:
    • Allaboveexceptmetadata
  • Currently: Chooseonesourcefor D matrix
  • Contextused: seasonality
experiments similarity preservation
Experiments: Similaritypreservation
  • Real life dataset: online grocery shopping events
  • SimFactorapproximatessimilaritiesbetter
experiments initialization
Experiments: Initialization
  • Usingdifferentdescriptionmatrices
  • And bothnaiveandSimFactorinitialization
  • Baseline: random init
  • Evaluationmetric: [email protected]
experiments grocery db
Experiments: Grocery DB
  • Upto 6% improvement
  • Best methodsuseSimFactor and usercontextdata
experiments implicitized movielens
Experiments: „Implicitized” MovieLens
  • Keeping 5 starratings implicit events
  • Upto 10% improvement
  • Best methodsuseSimFactor and itemcontextdata
discussion of results
Discussion of results
  • SimFactoryieldsbetterresultsthannaive
  • Contextinformationyieldsbetterresultsthanotherdescriptions
  • Contextinformationseparateswellbetweenentities
    • Grocery: Usercontext
      • People’sroutines
      • Differenttypes of shoppingsindifferenttimes
    • MovieLens: Itemcontext
      • Differenttypes of movieswatchedondifferenthours
  • Context-basedsimilarity
why context
Whycontext?
  • Groceryexample
    • Correlationbetweencontextstatesbyusers low
  • Whycancontextawarealgorithms be efficient?
    • Differentrecommendationsindifferentcontextstates
    • Contextdifferentiateswellbetweenentities
      • Easiersubtasks
conclusion future work
Conclusion & Futurework
  • SimFactor Similaritypreservingcompression
  • Similaritybased MF initialization:
    • Descriptionmatrixfromanydata
    • ApplySimFactor
    • Use output asinitialfeaturesfor MF
  • Contextdifferentitatesbetweenentitieswell
  • Futurework:
    • Mixed descriptionmatrix (multipledatasources)
    • Multipledescriptionmatrix
    • Usingdifferentcontextinformation
    • Usingdifferentsimilaritymetrics
thanks for your attention
Thanksforyourattention!

For more of my recommender systems related research visit my website: http://www.hidasi.eu

ad