
Fine-tuning Ranking Models: a two-step optimization approach






Presentation Transcript


  1. Fine-tuning Ranking Models: a two-step optimization approach. Vitor. Text Learning Meeting, CMU. Jan 29, 2008. With invaluable ideas from ….

  2. Motivation • Rank, Rank, Rank… • Web retrieval, movie recommendation, NFL draft, etc. • Einat's contextual search • Richard's set expansion (SEAL) • Andy's context-sensitive spelling correction algorithm • Selecting seeds in Frank's political blog classification algorithm • Ramnath's Thunderbird extension for • Email leak prediction • Email recipient suggestion

  3. Help your brothers! • Try Cut Once!, our Thunderbird extension • Works well with Gmail accounts • It’s working reasonably well • We need feedback.

  4. Thunderbird plug-in: Email Recipient Recommendation • Leak warnings: hit x to remove a recipient • Suggestions: hit + to add a recipient • Timer: the message is sent after 10 sec by default; sending can be paused or cancelled • Classifier/rankers written in JavaScript

  5. Email Recipient Recommendation [results figure: 36 Enron users]

  6. Email Recipient Recommendation, Threaded [results figure] [Carvalho & Cohen, ECIR-08]

  7. Aggregating Rankings [Aslam & Montague, 2001]; [Ogilvie & Callan, 2003]; [Macdonald & Ounis, 2006] • Many "Data Fusion" methods • 2 types: • Normalized scores: CombSUM, CombMNZ, etc. • Rank-based (no score normalization): BordaCount, Reciprocal Rank Sum, etc. • Reciprocal Rank: the sum of the inverses of a document's ranks across the input rankings, RR(d) = Σ_r 1/rank_r(d); see the sketch below
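A minimal sketch of reciprocal-rank fusion, under the definition above (the function name and toy data are illustrative, not from the talk):

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings):
    """Aggregate several ranked lists by summing 1/rank per document.

    rankings: list of ranked lists, each an ordered sequence of doc ids
    (position 0 = rank 1). Returns doc ids sorted by fused score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / position  # reciprocal of the rank
    return sorted(scores, key=scores.get, reverse=True)

# Example: three base rankers over the same candidate set
fused = reciprocal_rank_fusion([
    ["d1", "d2", "d3"],
    ["d2", "d1", "d3"],
    ["d2", "d3", "d1"],
])
print(fused)  # ['d2', 'd1', 'd3']
```

Because only rank positions enter the fused score, no per-ranker score normalization is needed.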

  8. Aggregated Ranking Results [Carvalho & Cohen, ECIR-08]

  9. Intelligent Email Auto-completion [results figures: TO+CC+BCC and CC+BCC prediction tasks]

  10. Can we do better? • Not by using other features, but by using better ranking methods • Machine learning to improve ranking: "learning to rank" • Many (recent) methods: • ListNet, Perceptrons, RankSVM, RankBoost, AdaRank, Genetic Programming, Ordinal Regression, etc. • Mostly supervised • Generally small training sets • Workshop at SIGIR-07 (Einat was on the PC)

  11. Pairwise-based Ranking. Goal: given a query q with documents ranked d_1 ≻ d_2 ≻ … ≻ d_T, induce a ranking function f(d) such that f(d_i) > f(d_j) whenever d_i is ranked above d_j. We assume a linear function f(d) = w · d; the constraints are therefore w · (d_i − d_j) > 0 for all pairs with i < j.

  12. Ranking with Perceptrons • Nice convergence properties and mistake bounds • a bound on the number of mistakes/misranks • Fast and scalable • Many variants [Collins, 2002; Gao et al., 2005; Elsas et al., 2008] • Voting, averaging, committee, pocket, etc. • General update rule: on a misranked pair (w · d_i ≤ w · d_j although d_i should precede d_j), update w ← w + η (d_i − d_j) • Here: averaged version of the perceptron, sketched below
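A minimal sketch of the averaged pairwise perceptron, assuming training data arrives as feature-vector pairs (d_i, d_j) with d_i the preferred document; the function name, learning rate, and toy data are illustrative:

```python
import numpy as np

def train_averaged_rank_perceptron(pairs, n_features, epochs=10, lr=1.0):
    """pairs: list of (di, dj) arrays where di should outrank dj.

    Returns the averaged weight vector; score a document as w @ d.
    """
    w = np.zeros(n_features)
    w_sum = np.zeros(n_features)   # running sum of weights for averaging
    n_seen = 0
    for _ in range(epochs):
        for di, dj in pairs:
            if w @ (di - dj) <= 0:        # pair is misranked
                w = w + lr * (di - dj)    # pairwise perceptron update
            w_sum += w
            n_seen += 1
    return w_sum / n_seen

# Toy usage: feature 0 is informative, feature 1 is noise.
rng = np.random.default_rng(0)
pairs = [(np.array([1.0, rng.normal()]), np.array([0.0, rng.normal()]))
         for _ in range(50)]
w_base = train_averaged_rank_perceptron(pairs, n_features=2)
```

Averaging the weight vector over all updates, rather than keeping only the final one, is what gives the averaged variant its stability on small training sets.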

  13. RankSVM [Joachims, KDD-02]; [Herbrich et al., 2000] • Equivalent to maximizing AUC, since AUC is exactly the fraction of correctly ordered pairs • Equivalent to the quadratic program: minimize (1/2)‖w‖² + C Σ_ij ξ_ij subject to w · (d_i − d_j) ≥ 1 − ξ_ij and ξ_ij ≥ 0, for every pair with d_i ranked above d_j

  14.–16. Loss Function [three slides of loss-function plots; figures not in the transcript]

  17. Loss Functions • SVMrank: hinge loss (convex) • SigmoidRank: sigmoid loss (not convex); both are written out below
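A hedged reconstruction of the two pairwise losses, written in terms of the margin m = w · (d_i − d_j); the scale parameter s is assumed to be the one that reappears as the regularization parameter on slide 25:

```latex
% Hinge (RankSVM) loss on a pair that should satisfy m > 0:
\ell_{\mathrm{hinge}}(m) = \max(0,\, 1 - m)

% Sigmoid (SigmoidRank) loss: a smooth approximation of the 0/1
% misrank indicator [m \le 0]; larger s makes the step sharper:
\ell_{\mathrm{sig}}(m) = \frac{1}{1 + e^{s m}}
```

Summing ℓ_sig over all pairs therefore approximates the number of misranks, which is the quantity the fine-tuning step minimizes.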

  18. Fine-tuning Ranking Models: two-step optimization • Step 1: train a base ranker (e.g., RankSVM, Perceptron, etc.) to get a base ranking model • Step 2: starting from the base model, minimize the SigmoidRank loss to get the final model • Non-convex, but minimizes a very close approximation to the number of misranks; the whole procedure is sketched below
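A minimal sketch of the two-step procedure, reusing the averaged perceptron above as the base ranker; the step size, iteration count, and scale s are illustrative choices, not values from the talk:

```python
import numpy as np

def sigmoid_loss_grad(w, pairs, s=2.0):
    """Gradient of sum over pairs of 1 / (1 + exp(s * w.(di - dj)))."""
    grad = np.zeros_like(w)
    for di, dj in pairs:
        m = w @ (di - dj)                   # pairwise margin
        sig = 1.0 / (1.0 + np.exp(s * m))   # ~1 if misranked, ~0 if correct
        grad += -s * sig * (1.0 - sig) * (di - dj)
    return grad

def fine_tune(w_base, pairs, s=2.0, lr=0.1, steps=200):
    """Step 2: plain gradient descent on the sigmoid loss from the base model."""
    w = w_base.copy()
    for _ in range(steps):
        w -= lr * sigmoid_loss_grad(w, pairs, s)
    return w

# Step 1 then Step 2 on the same pairwise training data:
# w_final = fine_tune(train_averaged_rank_perceptron(pairs, 2), pairs)
```

Starting the non-convex descent from the convex base solution is the point of the two steps: gradient descent only has to refine an already reasonable w rather than escape poor local minima from a random start.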

  19. Gradient Descent
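The slide's formulas are not in the transcript; under the sigmoid loss defined above, the standard objective and gradient used for the descent would be:

```latex
L(w) = \sum_{(i,j)} \ell_{ij},
\qquad
\ell_{ij} = \frac{1}{1 + e^{\,s\, w \cdot (d_i - d_j)}},
\qquad
\nabla_w L = \sum_{(i,j)} -s\, \ell_{ij} (1 - \ell_{ij})\, (d_i - d_j)
```

with the update w ← w − η ∇_w L for a step size η.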

  20. Results in CC prediction [results figure: 36 Enron users]

  21. Set Expansion (SEAL) Results [Wang & Cohen, ICDM-2007]; [ListNet: Cao et al., ICML-07]

  22. Results on LETOR

  23. Learning Curve, TO+CC+BCC task (Enron: user lokay-m)

  24. Learning Curve, CC+BCC task (Enron: user campbel-m)

  25. Regularization Parameter (s = 2) [plots: TREC3, TREC4, Ohsumed]

  26. Some Ideas • Instead of the number of misranks, optimize other loss functions: • Mean Average Precision, MRR, etc. • Rank term: • Some preliminary results with Sigmoid-MAP • Does it work for classification?

  27. Thanks
