Overview of KDDCUP 2011

Nathan Liu (nliu@cse.ust.hk)

KDDCUP 2011 Music Recommendation
• KDDCUP is the most prominent data mining competition.
• In recent years, there have been a number of contests related to movie recommendation:
  • Netflix 2006: predict future ratings
  • KDDCUP 2007: how many ratings and who rated what
  • CAMRA 2010: context-aware movie recommendation
• KDDCUP 2011 is organized by Yahoo! and provides the first and largest music ratings datasets.

Yahoo Music

KDDCUP 2011
• There are three types of items: songs, artists, albums.
• Songs and albums are annotated with genres.
• You are given the date, time and score of each user’s ratings of these different items.
• Challenges:
  • Scale: biggest public dataset ever. 1 million users, 0.6 million items, 300 million ratings
  • Hierarchical item relations: songs belong to albums, albums belong to artists. All of them are annotated with genre tags.
  • Rich metadata: over 900 genres
  • Fine temporal resolution: no previous challenge provided time in addition to date.
• For the project, you will be provided with a small subset of the data, and we will hold a mini internal competition to determine which group obtains the best results.

KDDCUP 2011: Task 1
• The test set consists of held-out ratings from users in the training set. Each rating is timestamped.
• In the test set, you are given who rated which items at what time.
• You are asked to predict the rating scores.
• Closely related to the Netflix competition, but may require modeling time-of-day effects.
• References:
  • Koren. Matrix Factorization Techniques for Recommender Systems (IEEE Computer 2009)
  • Koren. Collaborative Filtering with Temporal Dynamics (KDD’09)
  • Xiong. Time-Evolving Collaborative Filtering (SDM’10)
  • Liu. Online Evolutionary Collaborative Filtering (RECSYS’10)
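To make the rating-prediction setting of Task 1 concrete, here is a minimal sketch of a biased matrix factorization model trained by stochastic gradient descent, in the spirit of the Koren references. It is not one of the provided project algorithms. The function names, the hyperparameters, the toy data and its 1 to 5 score scale are all assumptions chosen for illustration, and timestamps are ignored.

```python
import math
import random

def train_biased_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i by SGD.

    ratings: list of (user, item, score) tuples.
    Returns a predict(user, item) function.
    """
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    bu, bi, p, q = {}, {}, {}, {}
    for u, i, _ in ratings:
        bu.setdefault(u, 0.0)
        bi.setdefault(i, 0.0)
        p.setdefault(u, [rng.gauss(0, 0.1) for _ in range(n_factors)])
        q.setdefault(i, [rng.gauss(0, 0.1) for _ in range(n_factors)])

    def predict(u, i):
        # Unseen users or items fall back to the biases that are known.
        est = mu + bu.get(u, 0.0) + bi.get(i, 0.0)
        if u in p and i in q:
            est += sum(a * b for a, b in zip(p[u], q[i]))
        return est

    order = list(ratings)
    for _ in range(n_epochs):
        rng.shuffle(order)
        for u, i, r in order:
            err = r - predict(u, i)
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict

def rmse(predict, ratings):
    """Root mean squared error of predict over (user, item, score) tuples."""
    return math.sqrt(sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings))

# Toy data on an assumed 1-5 scale (illustrative only).
ratings = [("u1", "a", 5), ("u1", "b", 4), ("u2", "a", 4), ("u2", "c", 1),
           ("u3", "b", 5), ("u3", "c", 2), ("u4", "a", 5), ("u4", "c", 1)]
predict = train_biased_mf(ratings)
global_mean = sum(r for _, _, r in ratings) / len(ratings)
baseline_rmse = rmse(lambda u, i: global_mean, ratings)
model_rmse = rmse(predict, ratings)
```

Here the error is measured on the training ratings only to show that the model fits better than the global mean; for the project, the comparison should be made on held-out ratings.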

KDDCUP 2011: Task 2
• The test set consists of held-out ratings from users in the training set. Time has been removed.
• In the test set, you are given 6 items for each user.
• You are asked to predict which 3 of the 6 are actually rated by the user.
• Closely related to KDDCUP 2007 “who rated what” and the CAMRA 2010 weekly recommendation track.
• References:
  • Hu. Collaborative Filtering for Implicit Feedback Datasets (ICDM’08)
  • Rendle. Bayesian Personalized Ranking from Implicit Feedback (UAI’09)
  • Cremonesi. Performance of Recommender Algorithms on Top-N Recommendation Tasks (RECSYS’10)
  • Steck. Training and Testing of Recommender Systems on Data Missing Not at Random (KDD’10)
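The decision step of Task 2 can be sketched as follows: given any model that assigns a score to each of a user's 6 candidate items, label the 3 highest-scoring ones as rated. The scores below are hypothetical placeholders for the output of some model, and the error-rate function is an assumed evaluation measure; the slides do not state the official metric.

```python
def pick_top3(scores):
    """scores: dict mapping each of a user's 6 candidate items to a model score.

    Returns the set of 3 items predicted to be rated (ties broken by item name).
    """
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return set(ranked[:3])

def error_rate(predicted, actual, candidates):
    """Fraction of candidates whose rated / not-rated label is predicted wrongly."""
    wrong = sum((item in predicted) != (item in actual) for item in candidates)
    return wrong / len(candidates)

# Hypothetical scores for one user's six candidate items.
scores = {"s1": 0.9, "s2": 0.1, "s3": 0.5, "s4": 0.7, "s5": 0.2, "s6": 0.4}
actual_rated = {"s1", "s3", "s6"}          # assumed ground truth for the example
predicted_rated = pick_top3(scores)        # {"s1", "s3", "s4"}
err = error_rate(predicted_rated, actual_rated, scores)  # 2 of 6 labels wrong
```

Because exactly 3 items are selected per user, every false positive is matched by a false negative, so errors always come in pairs.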

For The Project
• We will extract a subset for you to work on.
• We will provide some basic algorithms.
• You can choose to work on one of the two tasks.
• The minimum requirement is that you run thorough experiments with the provided algorithms and write a report on your findings about the different algorithms.
• There are also new things to try…

Things to Try (1): Ensemble
• Same algorithm, different parameter settings
• Different algorithms
• Stacking:
  • Which meta-learner? Gradient boosted decision trees, linear regression
  • Any meta-features? Tail vs. head segmentation strategy
• References:
  • Bao et al. Stacking Recommendation Engines with Additional Meta-Features (RECSYS’09)
  • Jahrer et al. Combining Predictions for Accurate Recommender Systems (KDD’10)
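As one possible reading of the stacking idea, the sketch below fits a linear regression meta-learner on the predictions that two base models make for a hold-out set. The helper names and the toy numbers are assumptions, and the least-squares fit is written out in plain Python only to keep the example self-contained.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_linear_blend(base_predictions, targets):
    """Least-squares weights [intercept, w_1, ..., w_k] via the normal equations.

    base_predictions: one row per hold-out rating, one column per base model.
    """
    rows = [[1.0] + list(p) for p in base_predictions]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

def blend(weights, preds):
    """Combine one row of base-model predictions with the fitted weights."""
    return weights[0] + sum(w * p for w, p in zip(weights[1:], preds))

def mse(preds, targets):
    return sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)

# Toy hold-out set: predictions of two hypothetical base models and true scores.
holdout_preds = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 2.0), (5.0, 4.0)]
targets = [2.0, 2.0, 4.0, 4.0, 5.0]
weights = fit_linear_blend(holdout_preds, targets)
blended = [blend(weights, p) for p in holdout_preds]
```

On the data it was fitted on, the blend can never be worse than either base model, since each base model corresponds to one particular choice of weights. A fair comparison therefore needs a second hold-out set that the meta-learner has not seen.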

Things to Try (3): Exploiting Item Relations and Genres
• From social networks of users to networks of items.
• Combining collaborative filtering with genre-based prediction to alleviate sparseness.
• References:
  • Ma. Recommender Systems with Social Regularization (WSDM’11)
  • Agarwal. Regression-based Latent Factor Models (KDD’09)
  • Popescul. Probabilistic Models for Unified Collaborative and Content-based Recommendation in Sparse-Data Environments (UAI’01)
  • Gunawardana. Tied Boltzmann Machines for Cold Start Recommendations (RecSys’08)
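One simple way to combine collaborative filtering with genre-based prediction is sketched below. It is a shrinkage heuristic assumed for illustration, not a method taken from the cited papers: the genre-based score is the user's mean score on items that share a genre with the target, and the collaborative filtering score is trusted more as the target item accumulates ratings. The function names, the constant `k` and the 1 to 5 toy scores are all hypothetical.

```python
def genre_based_score(user_ratings, item_genres, target_item, default):
    """Mean of the user's scores on items sharing at least one genre with target_item.

    user_ratings: dict item -> score for one user.
    item_genres:  dict item -> set of genre tags.
    Returns default when the user has rated no item with a shared genre.
    """
    target = item_genres.get(target_item, set())
    shared = [score for item, score in user_ratings.items()
              if item_genres.get(item, set()) & target]
    return sum(shared) / len(shared) if shared else default

def combine(cf_score, genre_score, n_item_ratings, k=10.0):
    """Weight the CF score by n / (n + k): sparse items lean on the genre score."""
    weight = n_item_ratings / (n_item_ratings + k)
    return weight * cf_score + (1.0 - weight) * genre_score

# Toy example on an assumed 1-5 scale.
user_ratings = {"song1": 5, "song2": 4, "song3": 1}
item_genres = {"song1": {"rock"}, "song2": {"rock", "pop"},
               "song3": {"jazz"}, "song4": {"rock"}}
g = genre_based_score(user_ratings, item_genres, "song4", default=3.0)  # (5 + 4) / 2
cold = combine(cf_score=3.0, genre_score=g, n_item_ratings=0)    # item with no ratings
warm = combine(cf_score=3.0, genre_score=g, n_item_ratings=10)   # equal weighting at n = k
```

The same back-off pattern could use the album or artist of a song in place of its genres, following the item hierarchy in the dataset.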

Things to Try (2): Temporal Dynamics
• Various possible types of temporal dynamics:
  • Long-term effect: people getting pickier over time
  • Short-term effect: festival mood
  • Time-of-day effect: daytime vs. nighttime preference
  • Periodicity: every Friday night is party time
• References:
  • Koren. Collaborative Filtering with Temporal Dynamics (KDD’09)
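The time-of-day and periodicity effects can be explored with a very small baseline: estimate one bias per (weekday, day or night) bin as the mean deviation from the global mean. This is only a sketch under stated assumptions. The bin boundaries (06:00 to 18:00 counts as day), the function names and the toy ratings are made up for illustration, and a fuller model would add such a term to the user and item biases of a factorization model.

```python
from collections import defaultdict
from datetime import datetime

def time_bin(ts):
    """Map a timestamp to (weekday, 'day' or 'night'); Monday is weekday 0."""
    period = "day" if 6 <= ts.hour < 18 else "night"
    return ts.weekday(), period

def fit_time_bias(ratings):
    """ratings: list of (datetime, score).

    Returns (global_mean, {bin: mean residual of scores in that bin}).
    """
    mu = sum(s for _, s in ratings) / len(ratings)
    sums, counts = defaultdict(float), defaultdict(int)
    for ts, s in ratings:
        b = time_bin(ts)
        sums[b] += s - mu
        counts[b] += 1
    return mu, {b: sums[b] / counts[b] for b in sums}

def predict_with_time(mu, bias, ts):
    """Global mean plus the bias of the timestamp's bin (0 for unseen bins)."""
    return mu + bias.get(time_bin(ts), 0.0)

# Toy ratings on an assumed 1-5 scale: high on Friday nights, low on Monday mornings.
toy = [(datetime(2011, 3, 4, 22, 0), 5), (datetime(2011, 3, 11, 23, 0), 5),
       (datetime(2011, 3, 7, 10, 0), 2), (datetime(2011, 3, 14, 9, 0), 2)]
mu, bias = fit_time_bias(toy)
friday_night = predict_with_time(mu, bias, datetime(2011, 3, 18, 21, 30))
```

With real data, bins with few ratings would need shrinkage toward zero so that a handful of ratings does not produce a large bias.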
