
Fast Learning of Document Ranking Functions with the Committee Perceptron

Fast Learning of Document Ranking Functions with the Committee Perceptron. Jonathan Elsas, LTI Student Research Symposium, Sept. 14, 2007. Briefly: joint work with Vitor Carvalho and Jaime Carbonell; submitted to the Web Search and Data Mining conference (WSDM 2008), http://wsdm2008.org.


Presentation Transcript


  1. Fast Learning of Document Ranking Functions with the Committee Perceptron Jonathan Elsas LTI Student Research Symposium Sept. 14, 2007

  2. Briefly… • Joint work with Vitor Carvalho and Jaime Carbonell • Submitted to Web Search and Data Mining conference (WSDM 2008) http://wsdm2008.org

  3. Evolution of Features in IR • “In the beginning, there was TF…” • It became clear that other features were needed for effective document ranking: IDF, document length… • Along came HTML: document structure & link network features… • Now, we have collective annotation: social bookmarking features…

  4. Challenges • Which features are important? How to best choose the weights for each feature? • With just a few features, manual tuning or parameter sweeps sufficed. • This approach becomes impractical with more than 5-6 features.

  5. Learning Approach to Setting Feature Weights • Goal: utilize existing relevance judgments to learn an optimal weight setting • This has recently become a hot research area in IR, known as “Learning to Rank” (see the SIGIR 2007 Learning to Rank workshop, http://research.microsoft.com/users/LR4IR-2007/)

  6. Pairwise Preference Learning • Learning a document scoring function, treated as a classification problem on pairs of documents • Assume our ranking function is of the form score(d, q) = w · Φ(d, q), where Φ(d, q) is a vector of feature values for this document-query pair • A pair is classified correctly when the relevant document scores above the non-relevant one, and incorrectly otherwise • The resulting scoring function is used as the learned document ranker
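The reduction described on this slide can be written as a small Python sketch (illustrative only, not the authors' implementation; the function names are hypothetical):

```python
def score(w, phi):
    # Linear ranking function: score(d, q) = w . Phi(d, q)
    return sum(wi * fi for wi, fi in zip(w, phi))

def to_pairwise_examples(judged_pairs):
    # Reduce ranking to classification: for each (relevant, non-relevant)
    # feature-vector pair, the difference Phi_R - Phi_N should receive a
    # positive score under a correct weight vector w.
    return [[r - n for r, n in zip(phi_r, phi_n)]
            for phi_r, phi_n in judged_pairs]
```

A pair is ranked correctly exactly when score(w, Phi_R - Phi_N) > 0, which is the classification view the slide describes.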

  7. Perceptron Algorithm • Proposed in 1958 by Rosenblatt • Online algorithm (processes one instance at a time) • Whenever a ranking mistake is made, update the hypothesis: w ← w + (Φ(dR, q) − Φ(dN, q)) • Provable mistake bounds & convergence
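The update rule can be turned into a minimal runnable sketch, assuming training data given as (Phi_R, Phi_N) feature-vector pairs (an illustration, not the paper's code):

```python
def dot(w, x):
    # Inner product of two equal-length vectors
    return sum(wi * xi for wi, xi in zip(w, x))

def perceptron_rank(pairs, n_iters=10):
    # Pairwise ranking perceptron: pairs is a list of (phi_rel, phi_nonrel)
    # feature-vector tuples; on each mis-ranked pair, move the weight
    # vector toward the difference vector.
    dim = len(pairs[0][0])
    w = [0.0] * dim
    for _ in range(n_iters):
        for phi_r, phi_n in pairs:
            diff = [r - n for r, n in zip(phi_r, phi_n)]
            if dot(w, diff) <= 0:                         # ranking mistake
                w = [wi + di for wi, di in zip(w, diff)]  # w <- w + (Phi_R - Phi_N)
    return w
```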

  8. Perceptron Algorithm Variants • Pocket Perceptron (Gallant, 1990): keep the one best hypothesis seen during training • Voted Perceptron (Freund & Schapire, 1999): keep all the intermediate hypotheses and combine them at the end; in practice, the hypotheses are often averaged
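The averaging variant mentioned above can be sketched as follows (a simplification of the voted perceptron, with hypothetical names):

```python
def averaged_perceptron_rank(pairs, n_iters=10):
    # Averaged perceptron: instead of keeping only the final weight
    # vector, accumulate every intermediate hypothesis and return the
    # mean -- a common practical stand-in for full voting.
    def dot(w, x):
        return sum(wi * xi for wi, xi in zip(w, x))

    dim = len(pairs[0][0])
    w = [0.0] * dim
    w_sum = [0.0] * dim
    steps = 0
    for _ in range(n_iters):
        for phi_r, phi_n in pairs:
            diff = [r - n for r, n in zip(phi_r, phi_n)]
            if dot(w, diff) <= 0:                         # ranking mistake
                w = [wi + di for wi, di in zip(w, diff)]
            w_sum = [s + wi for s, wi in zip(w_sum, w)]   # accumulate hypothesis
            steps += 1
    return [s / steps for s in w_sum]
```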

  9. Committee Perceptron Algorithm • Ensemble method • Selectively keeps the N best hypotheses encountered during training • Significant advantages over previous perceptron variants • Many ways to combine the output of the hypotheses: voting, score averaging, hybrid approaches • Weight each hypothesis by a retrieval performance metric
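A sketch of the committee idea, with assumed simplifications: here pairwise accuracy on the training pairs stands in for a retrieval metric such as MAP, and the committee output is a metric-weighted average of its members (illustrative, not the paper's exact algorithm):

```python
def committee_perceptron(pairs, n_committee=3, n_iters=10):
    # Whenever the hypothesis is updated, score the new hypothesis and
    # retain only the n_committee best-scoring hypotheses seen so far.
    def dot(w, x):
        return sum(wi * xi for wi, xi in zip(w, x))

    def accuracy(w):
        # Fraction of training pairs ranked correctly (metric stand-in)
        return sum(dot(w, [r - n for r, n in zip(pr, pn)]) > 0
                   for pr, pn in pairs) / len(pairs)

    dim = len(pairs[0][0])
    w = [0.0] * dim
    committee = []                                # (score, hypothesis)
    for _ in range(n_iters):
        for phi_r, phi_n in pairs:
            diff = [r - n for r, n in zip(phi_r, phi_n)]
            if dot(w, diff) <= 0:                 # mistake: update hypothesis
                w = [wi + di for wi, di in zip(w, diff)]
                committee.append((accuracy(w), list(w)))
                committee.sort(key=lambda sh: -sh[0])
                committee = committee[:n_committee]   # keep the N best
    # Combine members by metric-weighted score averaging
    total = sum(s for s, _ in committee) or 1.0
    return [sum(s * h[i] for s, h in committee) / total
            for i in range(dim)]
```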

  10.–13. Committee Perceptron Training [animated diagram over four slides: training data of (q, dR, dN) query/relevant-document/non-relevant-document triples feeds the current hypothesis, and the best hypotheses encountered are retained in the committee]

  14. Evaluation • Compared the Committee Perceptron to RankSVM (Joachims et al., 2002) and RankBoost (Freund et al., 2003) • Learning To Rank (LETOR) dataset: http://research.microsoft.com/users/tyliu/LETOR/default.aspx • Provides three test collections, standardized feature sets, and train/validation/test splits

  15. Committee Perceptron Learning Curves

  16. Committee Perceptron Performance

  17. Committee Perceptron Performance (OHSUMED)

  18. Committee Perceptron Performance (TD2004)

  19. Committee Perceptron Training Time • Much faster than other rank learning algorithms. • Training time on OHSUMED dataset: • CP: ~450 seconds for 50 iterations • RankSVM: > 21k seconds • 45-fold reduction in training time with comparable performance.

  20. Committee Perceptron: Summary • CP is a fast perceptron-based learning algorithm, applied to document ranking. • Significantly outperforms the pocket and average perceptron variants on learning document ranking functions. • Performs comparably to two strong baseline rank learning algorithms, but trains in much less time.

  21. Future Directions • Performance of the Committee Perceptron is good, but it could be better • What are we really optimizing? (not MAP or NDCG…)

  22. Loss Functions for Pairwise Preference Learners • Minimizing the number of mis-ranked document pairs • This only loosely corresponds to rank-based evaluation measures • Problem: all rank positions are treated the same
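The loss being minimized is just a count of mis-ranked pairs; a minimal sketch (hypothetical function name) makes the position-blindness explicit:

```python
def pairwise_loss(scores_rel, scores_nonrel):
    # Number of (relevant, non-relevant) pairs where the non-relevant
    # document scores at least as high as the relevant one.  Every pair
    # counts equally, regardless of where in the ranking the error falls.
    return sum(1 for sr in scores_rel for sn in scores_nonrel if sr <= sn)
```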

  23. Problems with Optimizing the Wrong Metric [figure: the best-BPREF ranking vs. the best-MAP ranking]

  24. Ranked Retrieval Pairwise- Preference Loss Functions • Average Precision places more emphasis on higher-ranked documents.

  25. Ranked Retrieval Pairwise- Preference Loss Functions • Average Precision places more emphasis on higher-ranked documents. • Re-writing AP as a pairwise loss function:
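Average precision itself is short enough to state as a runnable sketch, which makes the emphasis on higher ranks visible (illustrative; not the paper's pairwise rewriting of AP):

```python
def average_precision(ranked_relevance):
    # AP over a ranked list of 0/1 relevance labels: precision is taken
    # at each relevant document's rank, so a relevant document pushed
    # down from the top of the list costs more than one near the bottom.
    hits, ap = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            ap += hits / rank
    return ap / hits if hits else 0.0
```

Errors near the top of the ranking reduce AP more than the same errors near the bottom, which a pure pair-counting loss ignores.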

  26. Preliminary Results [figures comparing training with MAP-loss vs. pairs-loss]

  27. Questions?
