SIGIR 2013 Recap

September 25, 2013


Today’s Paper Summaries

  • Yu Liu

    • Personalized Ranking Model Adaptation for Web Search

  • Nadia Vase

    • Toward Self-Correcting Search Engines: Using Underperforming Queries to Improve Search

  • Riddick Jiang

    • Fighting Search Engine Amnesia: Reranking Repeated Results

SIGIR 2013 Reference Material

  • Jul 28 – Aug 1, 2013. Dublin, Ireland

  • Proceedings (ACM Digital library): http://dl.acm.org/citation.cfm?id=2484028

    • Available free via the eBay intranet

  • Best paper nominations: http://www.bibsonomy.org/user/nattiya/sigir2013

  • Papers we liked: SIGIR 2013 Recap Wiki

  • SIGIR 2014: July 6-11, Queensland, Australia

Personalized Ranking Model Adaptation for Web Search

Hongning Wang (University of Illinois at Urbana-Champaign), Xiaodong He (Microsoft Research), Ming-Wei Chang (Microsoft Research), Yang Song (Microsoft Research), Ryen W. White (Microsoft Research), Wei Chu (Microsoft Bing)

Paper Review by Yu Liu

Motivations

  • Searcher’s information needs are diverse

  • Need personalization for web search

  • Existing methods for personalization

    • Extracting user-centric features [Teevan et al. SIGIR’05]

      • Location, gender, click history

      • Require large volume of user history

    • Memory-based personalization [White and Drucker WWW’07, Shen et al. SIGIR’05]

      • Learn direct association between query and URLs

      • Limited coverage, poor generalization

  • Major considerations

    • Accuracy

      • Maximize the search utility for each single user

    • Efficiency

      • Executable at the scale of all search engine users

      • Adapt to the user’s result preferences quickly


Personalized Ranking Model Adaptation

Adapting the global ranking model for each individual user

Adjusting the generic ranking model’s parameters with respect to each individual user’s ranking preferences
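Below is a minimal sketch of this idea, assuming a group-wise scale-and-shift of the global weights (variable names and values are illustrative, not taken from the paper):

```python
import numpy as np

# Global linear ranker: score(d) = x_d . w_global.
# Per-user adaptation: features in the same group share one scale and one shift,
# so each user is described by 2K parameters instead of V.
def adapt_weights(w_global, groups, scale, shift):
    """w_user[i] = scale[groups[i]] * w_global[i] + shift[groups[i]]"""
    return scale[groups] * w_global + shift[groups]

w_global = np.array([0.8, 1.2, -0.3, 0.5, 0.1, 0.9])   # V = 6 ranking features
groups   = np.array([0, 0, 0, 1, 1, 1])                # K = 2 feature groups
scale    = np.array([1.10, 0.70])                      # learned from this user's clicks
shift    = np.array([0.05, -0.02])

w_user = adapt_weights(w_global, groups, scale, shift)
X = np.random.rand(10, 6)                              # 10 candidate documents
personalized_scores = X @ w_user                       # rank with the adapted model
```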


Linear Regression Based Model Adaptation

Loss function can come from any linear learning-to-rank algorithm, e.g., RankNet, LambdaRank, RankSVM

Complexity of adaptation

  • Adapting global ranking model for each individual user
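Reading these bullets together, one plausible way to write the per-user objective (a hedged reconstruction; the paper's exact loss and regularizer are not reproduced here) is:

```latex
\min_{\{a_{u,k},\, b_{u,k}\}}
  \sum_{(q,d) \in \mathcal{D}_u} L\big(f_{\mathbf{w}_u}(q, d)\big)
  + \lambda \sum_{k=1}^{K} \big[(a_{u,k} - 1)^2 + b_{u,k}^2\big],
\qquad
w_{u,i} = a_{u,g(i)}\, w^{s}_{i} + b_{u,g(i)}
```

Here L is the loss of the chosen learning-to-rank algorithm, D_u is user u's adaptation data, w^s is the global model, and g(i) maps feature i to its group; since only 2K parameters are estimated per user, the optimization stays within the complexity of the original ranker.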

Ranking feature grouping

  • Organize the ranking features so that a shared transformation is applied to the parameters of features in the same group

  • Maps V original ranking features to K different groups

    • Grouping features by name - Name

      • Exploring informative naming scheme

        • BM25_Body, BM25_Title

      • Clustering by manually crafted patterns

    • Co-clustering of documents and features – SVD [Dhillon KDD’01] (see the sketch after this list)

      • SVD on document-feature matrix

      • k-Means clustering to group features

    • Clustering features by importance - Cross

      • Estimate linear ranking model on different splits of data

      • k-Means clustering by feature weights in different splits
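A minimal sketch of the SVD-based grouping, assuming scikit-learn and an illustrative document-feature matrix (sizes and K are made up):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.random((1000, 50))        # document-by-feature matrix: 1000 docs, 50 ranking features

K = 8                             # number of feature groups
_, _, Vt = np.linalg.svd(X, full_matrices=False)
feature_embedding = Vt[:K].T      # each feature represented by its top-K right singular vectors

groups = KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(feature_embedding)
# groups[i] is the group of feature i; features in one group share the same
# scale/shift during per-user adaptation.
```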


Discussion

  • A general framework for ranking model adaptation

    • Applicable to a majority of existing learning-to-rank algorithms

    • Model-based adaptation: no need to operate on the large volume of data from the source domain

    • Within the same optimization complexity as the original ranking model

    • Adaptation sharing across features to reduce the requirement of adaptation data


Experimental Setup

  • Dataset

    • Bing.com query log: May 27, 2012 – May 31, 2012

    • Manual relevance annotation

      • 5-grade relevance score

    • 1830 ranking features

      • BM25, PageRank, tf*idf, etc.

Improvement analysis

  • User-level improvement

    • Against global model

Conclusions

  • Efficient ranking model adaptation framework for personalized search

    • Linear transformation for model-based adaptation

    • Transformation sharing in a group-wise manner

  • Future work

    • Joint estimation of feature grouping and model transformation

    • Incorporate user-specific features and profiles

    • Extend to non-linear models

Toward Self-Correcting Search Engines: Using Underperforming Queries to Improve Search

Ahmed Hassan (Microsoft), Ryen W. White (Microsoft Research), Yi-Min Wang (Microsoft Research)

Paper Review by Nadia Vase

Overview

  • What to do with a dissatisfying query?

    • Why is it bad? New features to fix it?

    • If the same problem recurs, can find a pattern

  • Identify dissatisfying (DSAT) queries

  • Cluster them

  • Train specialized rankers + the general ranker

Identifying dissatisfying queries

  • Use toolbar data

  • Based on search engine switching events

    • 60% of switching events: DSAT search

  • Trained a classifier to predict the switch cause

    • Logistic regression; 562 labeled switching events, 107 users

    • Binary classifier (see the sketch below)
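A minimal sketch of such a classifier, assuming scikit-learn and synthetic stand-in data (the real inputs are the switch/session signals summarized on the next slide):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# One row per search-engine switching event; label 1 = switch caused by
# dissatisfaction, 0 = other reasons (coverage, habit, ...).  The feature
# matrix here is random placeholder data.
rng = np.random.default_rng(0)
X = rng.random((562, 20))             # 562 labeled switching events, 20 placeholder features
y = rng.integers(0, 2, size=562)

clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, X, y, cv=5).mean())   # default scoring = accuracy
```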

Features for dissatisfying switches

Clustering DSAT Queries

  • What to do with DSAT queries

  • DSAT instance has 140 binary features

    • Query: length, language, “phrase (NP, VP) type”, ODP category

    • SERP: direct answer/feature, query suggestion shown, spell correction, etc.

    • Search instance: market (US, UK, etc), query vertical (Web, News, etc), search engine, temporal attributes

  • Use Weka’s implementation of FP-Growth to cluster (a rough Python equivalent is sketched below)
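A rough Python equivalent of that clustering step, assuming the mlxtend library and invented attribute names (the real instances carry 140 binary attributes):

```python
import pandas as pd
from mlxtend.frequent_patterns import fpgrowth

# One row per DSAT query instance, one boolean column per attribute
# (query length bucket, ODP category, SERP answer shown, market, vertical, ...).
dsat = pd.DataFrame({
    "query_is_long":    [True, True, False, True, True],
    "market_US":        [True, True, True, False, True],
    "vertical_Web":     [True, False, True, True, True],
    "spell_correction": [False, True, False, False, False],
})

# Frequent attribute sets act as the DSAT clusters; each set defines the slice of
# queries that a specialized 2nd-round ranker is trained for.
clusters = fpgrowth(dsat, min_support=0.4, use_colnames=True)
print(clusters.sort_values("support", ascending=False))
```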

Clustering: FP-Growth

Filter and order features & create the FP-tree

Bottom-up algorithm to find attribute clusters

Example of attribute sets

Building Modified Rankers

  • A 2nd-round ranker for each DSAT group

    • Trained on DSAT data, with the general ranker’s output score as an input feature (see the sketch below)
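A minimal sketch of that 2nd-round setup, using a gradient-boosted regressor as a hypothetical stand-in for the production ranker and synthetic data:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Queries falling into one DSAT attribute cluster (synthetic stand-in data).
X_dsat        = rng.random((5000, 30))         # 30 hypothetical ranking features
general_score = rng.random((5000, 1))          # output score of the general ranker
y             = rng.integers(0, 5, size=5000)  # 5-grade relevance labels

# Append the general ranker's score as an extra feature, so the specialized
# ranker only has to learn a correction on top of it.
X = np.hstack([X_dsat, general_score])
specialized_ranker = GradientBoostingRegressor().fit(X, y)
```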

Experiment results

Fighting Search Engine Amnesia: Reranking Repeated Results

Milad Shokouhi (Microsoft), Ryen W. White (Microsoft Research), Paul Bennett (Microsoft Research), Filip Radlinski (Microsoft)

Paper Review by Riddick Jiang

Repetition

40%-60% of sessions have two or more queries

16-44% of sessions with two queries (depending on the search engine) have at least one repeated result

Repetition increases to almost all sessions with ten or more queries

Intuition

  • Promote new results (previously missed or new)

  • Demote previously skipped results

  • Demote previously clicked results

    • Promote previously clicked results if clicked >= 2 (personal nav); a rule-based sketch follows this list
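A rule-based sketch of the intuition above (the adjustments are hard-coded here for illustration; the actual ranker learns these effects from features):

```python
def rerank_repeats(results, skipped, click_counts, delta=1.0):
    """Re-score results that repeat within a session.

    results      : list of (url, base_score) from the current ranker
    skipped      : set of urls shown earlier in the session but never clicked
    click_counts : dict url -> number of clicks earlier in the session
    """
    rescored = []
    for url, score in results:
        clicks = click_counts.get(url, 0)
        if clicks >= 2:
            score += delta            # personal navigation: promote re-finding
        elif clicks == 1:
            score -= delta            # demote results the user already consumed
        elif url in skipped:
            score -= delta            # demote previously skipped results
        else:
            score += 0.5 * delta      # promote new / previously missed results
        rescored.append((url, score))
    return [u for u, _ in sorted(rescored, key=lambda x: x[1], reverse=True)]
```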

CTR for skipped results

CTR for clicked results

Ranking features

Evaluation

Personal Nav: Score, Position, and a Personal Navigation feature that counts the number of times a particular result has been clicked for the same query previously in the session

ClickHistory: Score, Position, and Click-history - click counts for each result on a per-query basis
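A sketch of how these click-count features could be assembled per (query, result) pair (the class and data layout are assumptions, not the paper's implementation; in particular the session vs. long-term scope of ClickHistory is an assumption):

```python
from collections import defaultdict

class RepeatFeatures:
    """Builds the Score / Position / click-count features for a repeated result."""

    def __init__(self, historical_clicks=None):
        self.session_clicks = defaultdict(int)             # (query, url) -> clicks this session
        self.historical_clicks = historical_clicks or {}   # (query, url) -> prior click counts

    def record_click(self, query, url):
        self.session_clicks[(query, url)] += 1

    def features(self, query, url, score, position):
        return {
            "Score": score,
            "Position": position,
            "PersonalNav": self.session_clicks[(query, url)],            # same-query clicks this session
            "ClickHistory": self.historical_clicks.get((query, url), 0),
        }
```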

A/B testing

Interleave results from R-cube and the control ranker

Randomly allocate each result position to R-cube or the baseline

Credit each click to the ranker that supplied the clicked result
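A simplified sketch of that interleaving step (a coin flip per result slot, with click credit going to whichever ranker supplied the slot; the production scheme may differ in detail):

```python
import random

def interleave(run_rcube, run_control, k=10):
    """Fill each result slot from R-cube or the control ranker by a coin flip,
    skipping duplicates, and record which ranker supplied each slot so that a
    click on it can be credited to that ranker."""
    pools = {"R-cube": list(run_rcube), "control": list(run_control)}
    merged, credit, seen = [], [], set()
    while len(merged) < k and (pools["R-cube"] or pools["control"]):
        side = random.choice(["R-cube", "control"])
        if not pools[side]:                     # that ranker is exhausted; use the other
            side = "control" if side == "R-cube" else "R-cube"
        url = pools[side].pop(0)
        if url in seen:
            continue
        merged.append(url)
        credit.append(side)
        seen.add(url)
    return merged, credit
```

The ranker credited with more clicks on a query is counted as preferred for that query.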

Five days in June, 2012

370,000 queries

R-cube ranker was preferred for 53.8% of queries (statistically significant)
