
Collaborative Ranking Function Training for Web Search Personalization



Presentation Transcript


  1. National Technical University of Athens, School of Electrical and Computer Engineering, Division of Computer Science
  Institute for the Management of Information Systems, "Athena" Research Center
  Collaborative Ranking Function Training for Web Search Personalization
  Giorgos Giannopoulos (IMIS/"Athena" R.C. and NTU Athens, Greece)
  Theodore Dalamagas (IMIS/"Athena" R.C., Greece)
  Timos Sellis (IMIS/"Athena" R.C. and NTU Athens, Greece)

  2. Intro
  • How to personalize search results?

  3. Intro
  [Figure: a result list annotated with relevance judgments: relevant, partially relevant, irrelevant, unjudged]
  • How to personalize search results?
  • Step 1. Implicit (from clicks in user logs) or explicit feedback can give you relevance judgments, i.e. irrelevant, partially relevant, relevant

  4. Intro
  [Figure: a result list annotated with relevance judgments: relevant, partially relevant, irrelevant, unjudged]
  • How to personalize search results?
  • Step 1. Implicit (from clicks in user logs) or explicit feedback can give you relevance judgments, i.e. irrelevant, partially relevant, relevant
  • Step 2. Extract features from query-result pairs, e.g. 1. text similarity between query and result, 2. rank of the result in Google, 3. domain of the result URL

  5. Intro
  [Figure: the annotated result list and extracted features feeding a trained ranking function]
  • How to personalize search results?
  • Step 1. Implicit (from clicks in user logs) or explicit feedback can give you relevance judgments, i.e. irrelevant, partially relevant, relevant
  • Step 2. Extract features from query-result pairs, e.g. 1. text similarity between query and result, 2. rank of the result in Google, 3. domain of the result URL
  • Step 3. Feed a ranking function (e.g. Ranking SVM) with the judgments and features

  6. Intro
  [Figure: the result list before and after re-ranking, annotated with relevance judgments]
  • How to personalize search results?
  • Step 1. Implicit (from clicks in user logs) or explicit feedback can give you relevance judgments, i.e. irrelevant, partially relevant, relevant
  • Step 2. Extract features from query-result pairs
  • Step 3. Feed a ranking function (e.g. Ranking SVM) with the judgments and features
  • Step 4. Re-rank the results using the trained function (a sketch of steps 2-4 follows)
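
A minimal sketch of steps 2-4, assuming the relevance judgments have already been extracted from clicks. The toy feature values and the use of scikit-learn's LinearSVC on pairwise difference vectors (a common stand-in for Ranking SVM) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.svm import LinearSVC

# One row per query-result pair: [text similarity, Google rank, .edu-domain flag]
X = np.array([[0.82, 1, 1],
              [0.35, 2, 0],
              [0.61, 3, 0],
              [0.10, 4, 0]], dtype=float)
judgments = np.array([2, 0, 1, 0])   # relevant=2, partially relevant=1, irrelevant=0
query_ids = np.array([7, 7, 7, 7])   # here all pairs come from the same query

# Pairwise transform: for two results of the same query with different judgments,
# build a difference vector labelled by which of the two should rank higher.
X_pairs, y_pairs = [], []
for i in range(len(X)):
    for j in range(len(X)):
        if query_ids[i] == query_ids[j] and judgments[i] > judgments[j]:
            X_pairs.append(X[i] - X[j]); y_pairs.append(1)
            X_pairs.append(X[j] - X[i]); y_pairs.append(-1)

model = LinearSVC(C=1.0).fit(np.array(X_pairs), np.array(y_pairs))

# Step 4: re-rank results by the learned linear scoring function w·x (higher = better).
scores = X @ model.coef_.ravel()
print(np.argsort(-scores))   # indices of the results, best first
```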

  7. Problem I
  • Users usually search in more than one area of interest
  • Example scenario 1:
    • A PhD student searches for papers on "web search ranking"
      • They would prefer clicking on results with "acm" in the title or URL
      • They would also prefer PDF results
    • The same student searches for information about the "samsung omnia" cellphone
      • They would prefer results with "review", "specs" or "hands on" in the title or abstract
      • They would also prefer results from blogs or forums, or video results

  8. Problem I
  • Users usually search in more than one area of interest
  • Example scenario 1:
    • A PhD student searches for papers on "web search ranking"
      • They would prefer clicking on results with "acm" in the title or URL
      • They would also prefer PDF results
    • The same student searches for information about the "samsung omnia" cellphone
      • They would prefer results with "review", "specs" or "hands on" in the title or abstract
      • They would also prefer results from blogs or forums, or video results
  • Training a single ranking function model for this user:
    • Could favor video results while they search for papers
    • Could favor PDF results (e.g. the cellphone manual) while they search for reviews of a cellphone

  9. Problem II
  • Even users with different search behaviors may share a common search area
  • Example scenario 2:
    • User A is a PhD student in IR and searches mostly for papers in their research area
    • User B is a linguist and searches mostly for papers in their research area
    • However, they could both be interested in new cellphones

  10. Problem II
  • Even users with different search behaviors may share a common search area
  • Example scenario 2:
    • User A is a PhD student in IR and searches mostly for papers in their research area
    • User B is a linguist and searches mostly for papers in their research area
    • However, they could both be interested in new cellphones
  • Training a common ranking function model for both users:
    • Would probably give a better model for searches about cellphones
    • Would probably give a worse model for the rest of their searches

  11. Problem II
  • Even users with different search behaviors may share a common search area
  • Example scenario 2:
    • User A is a PhD student in IR and searches mostly for papers in their research area
    • User B is a linguist and searches mostly for papers in their research area
    • However, they could both be interested in new cellphones
  • Training a single ranking function model for each user:
    • Would not utilize each user's behavior in the common search areas
    • Example:
      • User A is familiar with "www.gsmarena.com", a very informative site about cellphones, while user B is not
      • Training a common ranking function on this particular search area would favor "www.gsmarena.com" in both users' searches
      • As a result, user B would become aware of this site and use it in future searches

  12. Solution
  • Train multiple ranking functions
  • Each ranking function corresponds:
    • Not to a single user
    • Not to a group of users
    • But to a topic area:
      • A group of search results
      • With similar content
      • Collected from all users
  • When re-ranking search results:
    • Check which topic areas match each new query
    • Re-rank the query's results according to the ranking functions trained for those topic areas

  13. Our method (phase 1)
  • Clustering of the clicked results of all queries of all users
    • Clicked results are more informative than the full result list or the query itself
  • Partitional clustering (repeated bisections)
  • Result representation:
    • A term vector of size N (the number of distinct terms in all results)
    • Every feature is a weight w relating a result with a term
    • w depends on the term's tf and idf
    • The title and abstract are taken as the result's text
  • Cosine similarity on the term vectors is used as the metric to compare two results
  • The clustering criterion function maximizes intra-cluster similarity
  • Output:
    • Clusters containing (clicked) results with similar content (topic clusters); a code sketch of this phase follows
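
A minimal sketch of phase 1, assuming each clicked result is represented by the concatenation of its title and abstract. Bisecting the currently largest cluster with 2-means is one simple repeated-bisections strategy; the exact criterion function optimized in the slides is not reproduced here.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

clicked_results = [
    "web search ranking acm paper learning to rank",
    "ranking svm optimizing search engines using clickthrough data",
    "samsung omnia review specs hands on",
    "samsung omnia cellphone forum video review",
]

# Term vectors with tf-idf weights; the L2-normalised rows that TfidfVectorizer
# produces make Euclidean k-means behave like clustering under cosine similarity.
X = TfidfVectorizer().fit_transform(clicked_results)

def repeated_bisections(X, n_clusters):
    labels = np.zeros(X.shape[0], dtype=int)
    for new_label in range(1, n_clusters):
        # pick the currently largest cluster and bisect it with 2-means
        target = np.bincount(labels).argmax()
        idx = np.where(labels == target)[0]
        split = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X[idx])
        labels[idx[split == 1]] = new_label
    return labels

print(repeated_bisections(X, n_clusters=2))   # topic clusters of the clicked results
```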

  14. Our method (phase 2)
  • Cluster indexing
    • Needed in order to calculate the similarity of each new query with each cluster
  • Extract the title and abstract text of all results belonging to each cluster
  • Use this text as the textual representation of the cluster
  • Index the resulting cluster "documents"
  • Output:
    • An inverted index over the clusters' textual representations (sketched below)
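
A minimal sketch of phase 2: each cluster becomes one "document" (the concatenated titles and abstracts of its results) and is scored against new queries. A tf-idf/cosine index built with scikit-learn stands in for the inverted index of the slides; a search library such as Lucene would play the same role in practice.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Textual representation of each topic cluster from phase 1.
cluster_texts = [
    "web search ranking acm paper learning to rank ranking svm clickthrough",
    "samsung omnia review specs hands on cellphone forum video",
]

vectorizer = TfidfVectorizer()
cluster_index = vectorizer.fit_transform(cluster_texts)   # the "index"

def query_cluster_weights(query):
    """Return w_qi: the textual similarity of the query with each topic cluster."""
    q_vec = vectorizer.transform([query])
    return cosine_similarity(q_vec, cluster_index).ravel()

print(query_cluster_weights("samsung omnia review"))   # higher weight for cluster 1
```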

  15. Our method (phase 3)
  • Multiple ranking function training
  • Ranking SVM is used
  • Each ranking function model Fi is trained with clickstream data only from the corresponding cluster i (see the sketch below)
  • Features used:
    • Textual similarity between the query and the (title, abstract, URL) of the result
    • Domain of the result (.com, .edu, etc.)
    • Rank in Google
    • Special words ("blog", "forum", "wiki", "portal", etc.) found in the result's title, abstract or URL
      • Features denoting the textual similarity of each such word with the title/abstract/URL
    • URL suffix (.html, .pdf, .ppt)
    • The 100 most frequent words in all result documents of all searches
      • Features denoting the textual similarity of each such word with the title/abstract/URL
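
A minimal sketch of phase 3, reusing the pairwise-difference construction from the intro sketch. LinearSVC on the difference vectors is again an illustrative stand-in for Ranking SVM, and the feature extraction is assumed to have already produced one numeric vector per query-result pair using the features listed above.

```python
import numpy as np
from sklearn.svm import LinearSVC

def pairwise_transform(X, judgments, query_ids):
    """Build difference vectors for result pairs of the same query with different judgments."""
    Xp, yp = [], []
    for i in range(len(X)):
        for j in range(len(X)):
            if query_ids[i] == query_ids[j] and judgments[i] > judgments[j]:
                Xp.append(X[i] - X[j]); yp.append(1)
                Xp.append(X[j] - X[i]); yp.append(-1)
    return np.array(Xp), np.array(yp)

def train_cluster_models(training_data):
    """training_data: {cluster_id: (X, judgments, query_ids)}, where each query's
    clickstream data is assigned to the topic cluster of its clicked results."""
    models = {}
    for cluster_id, (X, judgments, query_ids) in training_data.items():
        Xp, yp = pairwise_transform(np.asarray(X, dtype=float), judgments, query_ids)
        models[cluster_id] = LinearSVC(C=1.0).fit(Xp, yp)   # ranking model F_i
    return models
```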

  16. Our method (phase 4)
  • For each new query q:
    • We calculate its textual similarity wqi with each cluster i, using the index from phase 2
    • We produce one ranking Rqi per cluster i, with rqij being the rank of result j for query q after re-ranking the results with the ranking function trained on cluster i
  • Final rank: rqj = Σi wqi · rqij
  • In other words:
    • wqi represents how similar the content of cluster i is to query q
    • rqij gives the rank of result j when using the ranking function of cluster i
    • We combine all the produced rankings, weighting each one by how similar its cluster is to the query (see the sketch below)
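
A minimal sketch of phase 4: the per-cluster rankings are combined by weighting the rank each function assigns to a result with the query's similarity to that function's topic cluster. The weighted-sum formula rqj = Σi wqi · rqij is reconstructed from the slide's description, so treat it as an assumption rather than the exact combination used.

```python
import numpy as np

def combine_rankings(weights, per_cluster_ranks):
    """weights: w_qi, shape (n_clusters,).
    per_cluster_ranks: r_qij, shape (n_clusters, n_results); the rank of each
    result j under the ranking function of cluster i (1 = best)."""
    combined = weights @ per_cluster_ranks      # weighted rank per result
    return np.argsort(combined)                 # results ordered by combined rank, best first

w = np.array([0.8, 0.2])                        # the query is much closer to cluster 0
r = np.array([[1, 3, 2],                        # ranking produced by cluster 0's model
              [3, 1, 2]])                       # ranking produced by cluster 1's model
print(combine_rankings(w, r))                   # -> [0 2 1]
```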

  17. Our method

  18. Evaluation
  • Logging mechanism on top of the Google search engine, recording:
    • Queries
    • Result lists
    • Clicked results
    • User IPs
    • Date and time
  • Search topics:
    • Gadgets
    • Cinema
    • Auto/Moto
    • Life & Health
    • Science
  • Users: 10 PhD students and researchers from our lab
    • 1-3 search topics per user
    • 2-month search period
    • 671 queries
    • The first 501 queries were used as the training set
    • The last 170 queries were used as the test set

  19. Evaluation
  • Comparison of our method (T3) with:
    • Training a common ranking function for all users (T1)
    • Training one ranking function per user (T2)
  • Results:
    • Average change in rank between our method and the baseline methods (one way to compute this is sketched below)
    • Percentages of clicked results belonging to each cluster, for each user
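
One plausible reading of "average change in rank", sketched below as an assumption since the slides do not define the metric: for every clicked result, compare its position in the baseline ranking with its position after re-ranking, and average over all clicks.

```python
def average_rank_change(baseline_ranks, reranked_ranks, clicked_ids):
    """Both rank dicts map result id -> position in the list (1 = top).
    A positive value means clicked results moved up on average."""
    changes = [baseline_ranks[r] - reranked_ranks[r] for r in clicked_ids]
    return sum(changes) / len(changes)

baseline = {"a": 5, "b": 2, "c": 9}
reranked = {"a": 1, "b": 3, "c": 4}
print(average_rank_change(baseline, reranked, ["a", "c"]))   # -> 4.5
```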

  20. Conclusion and future work
  • A first-cut approach to the problem
    • "Collaborative" training of ranking functions for personalization, based on topic areas
    • Encouraging results
    • Preliminary experiments
  • Extensions:
    • More extensive experiments with (much) larger datasets
      • Experiments to verify the homogeneity of the topic clusters
      • Experiments to verify efficiency on very large datasets
      • More performance measures (precision/recall)
    • Topic cluster inference:
      • Clustering on feature vectors of the results (instead of their text)
      • Use of pre-defined topic hierarchies (e.g. ODP) and classification techniques for detecting topic areas
    • Dynamic switching between topic clusters by the users themselves
