Fine-tuning Ranking Models


Presentation Transcript


Fine-tuning Ranking Models:

Vitor

Jan 29, 2008

Text Learning Meeting - CMU

a two-step optimization approach

With invaluable ideas from …


Motivation

  • Rank, Rank, Rank…

    • Web retrieval, movie recommendation, NFL draft, etc.

    • Einat’s contextual search

    • Richard’s set expansion (SEAL)

    • Andy’s context sensitive spelling correction algorithm

    • Selecting seeds in Frank’s political blog classification algorithm

    • Ramnath’s thunderbird extension for

      • Email Leak prediction

      • Email Recipient suggestion


Help your brothers!

  • Try Cut Once!, our Thunderbird extension

    • Works well with Gmail accounts

  • It’s working reasonably well

  • We need feedback.


Thunderbird plug-in

  • Leak warnings: hit x to remove a recipient

  • Recipient suggestions: hit + to add

  • Pause or cancel sending of the message

  • Timer: the message is sent after 10 seconds by default

  • Classifier/rankers written in JavaScript


Email Recipient Recommendation

36 Enron users


Email Recipient Recommendation

Threaded

[Carvalho & Cohen, ECIR-08]


Aggregating Rankings

[Aslam & Montague, 2001]; [Ogilvie & Callan, 2003]; [Macdonald & Ounis, 2006]

  • Many “Data Fusion” methods

    • 2 types:

      • Normalized scores: CombSUM, CombMNZ, etc.

      • Unnormalized scores: BordaCount, Reciprocal Rank Sum, etc.

  • Reciprocal Rank:

    • The sum of the inverse of the rank of the document in each ranking.
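The Reciprocal Rank sum described above can be sketched in a few lines. A minimal illustration (function name and tie-breaking are my own, not from the talk): each ranking contributes 1/rank for every document it lists, and documents are re-ordered by the summed score.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings):
    """rankings: list of ranked lists of document ids (best first).
    Returns one fused ranking, highest aggregated score first."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            # inverse of the rank of the document in this ranking
            scores[doc] += 1.0 / rank
    return sorted(scores, key=scores.get, reverse=True)

fused = reciprocal_rank_fusion([["a", "b", "c"], ["b", "a", "c"]])
# "a" and "b" tie at 1.5; "c" scores 2/3 and ends up last
```

Because only ranks are used, no score normalization across the input rankers is needed, which is why it falls in the "unnormalized scores" family alongside BordaCount.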


Aggregated Ranking Results

[Carvalho & Cohen, ECIR-08]


Intelligent Email Auto-completion

TOCCBCC

CCBCC


Can we do better?

  • Not using other features, but better ranking methods

  • Machine learning to improve ranking: Learning to rank:

    • Many (recent) methods:

      • ListNet, Perceptrons, RankSvm, RankBoost, AdaRank, Genetic Programming, Ordinal Regression, etc.

    • Mostly supervised

    • Generally small training sets

    • Workshop at SIGIR-07 (Einat was on the PC)


Pairwise-based Ranking

Goal: induce a ranking function f(d) such that, for a query q with correct ranking d1 ≻ d2 ≻ … ≻ dT, f(di) > f(dj) whenever i < j.

We assume a linear function f(d) = w · d

Therefore, the constraints are: w · di > w · dj, i.e., w · (di − dj) > 0, for all i < j
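The pairwise constraints can be checked directly: a weight vector violates one whenever a document ranked lower scores at least as high. A small sketch (names and toy vectors are illustrative):

```python
import numpy as np

def misranked_pairs(w, docs):
    """docs: feature vectors listed in their correct order (best first).
    Returns pairs (i, j), i < j, where w scores d_j at least as high as d_i,
    i.e. the constraint w.(d_i - d_j) > 0 is violated."""
    scores = [float(np.dot(w, d)) for d in docs]
    return [(i, j)
            for i in range(len(docs))
            for j in range(i + 1, len(docs))
            if scores[i] <= scores[j]]

w = np.array([1.0, 0.0])
docs = [np.array([3.0, 0.0]), np.array([2.0, 5.0]), np.array([1.0, 1.0])]
misranked_pairs(w, docs)  # [] -- this w satisfies all constraints
```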


Ranking with Perceptrons

  • Nice convergence properties and mistake bounds

    • bound on the number of mistakes/misranks

  • Fast and scalable

  • Many variants [Collins, 2002; Gao et al., 2005; Elsas et al., 2008]

    • Voting, averaging, committee, pocket, etc.

    • General update rule:

    • Here: Averaged version of perceptron
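A sketch of the averaged ranking perceptron: the usual pairwise update is w ← w + (di − dj) whenever a pair is misranked, and the averaged variant returns the mean of all intermediate weight vectors. This assumes the standard update; the exact variant used in the talk may differ.

```python
import numpy as np

def averaged_ranking_perceptron(pair_data, n_features, epochs=5):
    """pair_data: list of (d_pos, d_neg) feature-vector pairs,
    where d_pos should be ranked above d_neg."""
    w = np.zeros(n_features)
    w_sum = np.zeros(n_features)
    n_steps = 0
    for _ in range(epochs):
        for d_pos, d_neg in pair_data:
            if np.dot(w, d_pos - d_neg) <= 0:   # misrank: apply update rule
                w = w + (d_pos - d_neg)
            w_sum += w                          # accumulate for averaging
            n_steps += 1
    return w_sum / n_steps                      # averaged weight vector
```

Averaging makes the learned ranker less sensitive to the order of the training pairs, at no extra asymptotic cost, which is part of why the perceptron family is fast and scalable.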


Rank SVM

[Joachims, KDD-02],

[Herbrich et al, 2000]

  • Equivalent to maximizing AUC

Equivalent to: minimizing ½‖w‖² + C Σ ξij subject to w · (di − dj) ≥ 1 − ξij and ξij ≥ 0, for every pair where di should be ranked above dj
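In unconstrained form, the slack variables collapse into a pairwise hinge loss, giving the objective ½‖w‖² + C Σ max(0, 1 − w·(di − dj)). A sketch of that objective (illustrative only, not Joachims' SVMlight implementation):

```python
import numpy as np

def ranksvm_objective(w, pairs, C=1.0):
    """pairs: (d_pos, d_neg) with d_pos to be ranked above d_neg.
    Returns the regularized pairwise hinge loss."""
    hinge = sum(max(0.0, 1.0 - float(np.dot(w, dp - dn)))
                for dp, dn in pairs)
    return 0.5 * float(np.dot(w, w)) + C * hinge
```

Each pair that is ranked correctly with margin at least 1 contributes zero loss; misranked or low-margin pairs are penalized linearly, which is what ties the objective to the number of discordant pairs and hence to AUC.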


Loss Function


Loss Functions

  • SVMrank

  • SigmoidRank (not convex)


Fine-tuning Ranking Models

The two-step approach: a base ranker (e.g., RankSVM, Perceptron) produces the base ranking model; fine-tuning it with SigmoidRank yields the final model.

Non-convex: minimizing a very close approximation to the number of misranks


Gradient Descent
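The fine-tuning step can be sketched as plain gradient descent on a sigmoid approximation of the misrank count, L(w) = Σ σ(−s · w·(di − dj)) with σ(x) = 1/(1 + e⁻ˣ) and steepness s, starting from the base ranker's weights. The loss form, s, learning rate, and step count below are illustrative assumptions, not the talk's exact settings.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_rank_loss(w, pairs, s=2.0):
    """Smooth approximation of the number of misranked pairs:
    each term is near 1 for a misrank, near 0 for a safe margin."""
    return sum(sigmoid(-s * np.dot(w, dp - dn)) for dp, dn in pairs)

def fine_tune(w0, pairs, s=2.0, lr=0.1, steps=200):
    """Gradient descent starting from the base ranker's weights w0."""
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(w)
        for dp, dn in pairs:
            diff = dp - dn
            p = sigmoid(-s * np.dot(w, diff))
            grad += -s * p * (1.0 - p) * diff   # d/dw of each sigmoid term
        w -= lr * grad
    return w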
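The fine-tuning step can be sketched as plain gradient descent on a sigmoid approximation of the misrank count, L(w) = Σ σ(−s · w·(di − dj)) with σ(x) = 1/(1 + e⁻ˣ) and steepness s, starting from the base ranker's weights. The loss form, s, learning rate, and step count below are illustrative assumptions, not the talk's exact settings.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_rank_loss(w, pairs, s=2.0):
    """Smooth approximation of the number of misranked pairs:
    each term is near 1 for a misrank, near 0 for a safe margin."""
    return sum(sigmoid(-s * np.dot(w, dp - dn)) for dp, dn in pairs)

def fine_tune(w0, pairs, s=2.0, lr=0.1, steps=200):
    """Gradient descent starting from the base ranker's weights w0."""
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(w)
        for dp, dn in pairs:
            diff = dp - dn
            p = sigmoid(-s * np.dot(w, diff))
            grad += -s * p * (1.0 - p) * diff   # d/dw of each sigmoid term
        w -= lr * grad
    return w
```

Because the loss is not convex, the descent only finds a local minimum near the starting point, which is exactly why a good base model from step one matters.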


Results in CC prediction

36 Enron users


Set Expansion (SEAL) Results

[Wang & Cohen, ICDM-2007]

[Listnet: Cao et al. , ICML-07]


Results in LETOR


Learning Curve

TOCCBCC Enron: user lokay-m


Learning Curve

CCBCC Enron: user campbel-m


Regularization Parameter

s = 2 (datasets: TREC3, TREC4, Ohsumed)


Some Ideas

  • Instead of the number of misranks, optimize other loss functions:

    • Mean Average Precision, MRR, etc.

    • Rank Term:

    • Some preliminary results with Sigmoid-MAP

  • Does it work for classification?


Thanks

