

Multi-Task Learning and Web Search Ranking

Gordon Sun (孙国政)

Yahoo! Inc

March 2007


Outline:

  • Brief Review: Machine Learning in web search ranking and Multi-Task learning.
  • MLR with Adaptive Target Value Transformation – each query is a task.
  • MLR for Multi-Languages – each language is a task.
  • MLR for Multi-query classes – each type of query is a task.
  • Future work and Challenges.

MLR (Machine Learning Ranking)

  • General Function Estimation and Risk Minimization:
  • Input: feature vector x = (x1, x2, …, xd)
  • Output: y
  • Training set: {(yi, xi)}, i = 1, …, n
  • Goal: estimate the mapping function y = F(x)
  • In MLR work:
  • x = x(q, d) = (x1, x2, …, xd) --- ranking features
  • y = judgment label: e.g. {P, E, G, F, B} mapped to {4, 3, 2, 1, 0}
  • Loss Function: L(y, F(x)) = (y – F(x))²
  • Algorithm: MLR with regression.
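As a minimal illustration of the regression formulation above (a sketch, not the production system): fit a linear F(x) by least squares on judged (query, document) feature vectors, then rank documents by descending F(x). The feature values and the grade mapping P=4 … B=0 are illustrative assumptions.

```python
import numpy as np

# Toy judged set: each row is a ranking feature vector x(q, d); labels are
# editorial grades mapped to numbers (assumed here: P=4 ... B=0).
X = np.array([[0.9, 0.8],   # e.g. text-match score, page quality (made up)
              [0.7, 0.6],
              [0.4, 0.5],
              [0.2, 0.1],
              [0.1, 0.0]])
y = np.array([4.0, 3.0, 2.0, 1.0, 0.0])

# Squared loss L(y, F(x)) = (y - F(x))^2 with a linear F(x) = w.x + b.
A = np.hstack([X, np.ones((len(X), 1))])      # append a bias column
w, *_ = np.linalg.lstsq(A, y, rcond=None)

def F(x):
    return float(np.append(x, 1.0) @ w)

# Ranking = sort a query's documents by descending F(x).
order = np.argsort([F(x) for x in X])[::-1]
print(order.tolist())  # [0, 1, 2, 3, 4]: best-graded documents first
```

The same recipe carries over to any regression learner; only the choice of F changes.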

Ranking feature construction

    • Query features:
      • query language, query word types (Latin, Kanji, …), …
    • Document features:
      • page_quality, page_spam, page_rank,…
    • Query-Document dependent features:
      • Text match scores in body, title, anchor text (TF/IDF, proximity), ...
  • Evaluation metric – DCG (Discounted Cumulative Gain):
  • DCGn = Σi=1..n Gi / log2(1 + i), where Gi is the grade value of the result at rank i (grades {P, E, G, F, B}); NDCG normalizes by the ideal DCG, and variants use exponential gains 2^Gi. DCG5 uses n = 5; DCG10 uses n = 10.
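The DCG computation can be sketched as follows; the grade-to-gain mapping {P: 4 … B: 0} is an illustrative assumption, not taken from the slide.

```python
import math

# Assumed numeric gains for the editorial grades {P, E, G, F, B}.
GRADE = {"P": 4, "E": 3, "G": 2, "F": 1, "B": 0}

def dcg(grades, n=5):
    """DCG_n = sum_{i=1..n} G_i / log2(1 + i) over the top-n ranked results."""
    return sum(GRADE[g] / math.log2(1 + i)
               for i, g in enumerate(grades[:n], start=1))

ranking = ["E", "P", "G", "B", "F"]   # grades in ranked order
print(round(dcg(ranking, n=5), 3))    # 6.911
```

Swapping the two top results (P first) would raise the DCG, which is exactly what the metric rewards.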

Multi-Task Learning

  • Single-Task Learning (STL)
      • One prediction task (classification/regression):
      • to estimate a function based on one training/testing set:
      • T= {yi, xi}, i = 1, …, n
  • Multi-Task Learning (MTL)
      • Multiple prediction tasks, each with their own training/testing set:
      • Tk= {yki, xki}, k = 1, …, m, i = 1, …, nk
      • Goal is to solve multiple tasks together:
      • - Tasks share the same input space (or at least partially):
      • - Tasks are related (say, MLR -- share one mapping function)

Multi-Task Learning: Intuition and Benefits

  • Empirical Intuition
    • Data from “related” tasks could help --
    • Equivalent to increasing the effective sample size
  • Goal: Share data and knowledge from task to task --- Transfer Learning.
  • Benefits
      • - when # of training examples per task is limited
      • - when # of tasks is large and cannot be handled by a separate MLR for each task.
      • - when it is difficult/expensive to obtain examples for some tasks
      • - possible to obtain meta-level knowledge

Multi-Task Learning: “Relatedness” Approaches

  • Probabilistic modeling for task generation
    • [Baxter ’00], [Heskes ’00], [Teh, Seeger, Jordan ’05], [Zhang, Ghahramani, Yang ’05]
  • Latent variable correlations
    • Noise correlations [Greene ’02]
    • Latent variable modeling [Zhang ’06]
  • Hidden common data structure and latent variables
    • Implicit structure (common kernels) [Evgeniou, Micchelli, Pontil ’05]
    • Explicit structure (PCA) [Ando, Zhang ’04]
  • Transformation relatedness [Shai ’05]

Multi-Task Learning for MLR

  • Different levels of relatedness.
    • Grouping data based on queries, each query could be one task.
    • Grouping data based on languages of queries, each language is a task.
    • Grouping data based on query classes, each query class is a task.

Outline:

  • Brief Review: Machine Learning in web search ranking and Multi-Task learning.
  • MLR with Adaptive Target Value Transformation – each query is a task.
  • MLR for Multi-Languages – each language is a task.
  • MLR for Multi-query classes – each type of query is a task.
  • Future work and Challenges.

Adaptive Target Value Transformation

  • Intuition:
  • Rank features vary a lot from query to query.
  • Rank features vary a lot from sample to sample with the same label.
  • MLR is a ranking problem, but regression minimizes prediction error, not ranking error.
  • Solution: adaptively adjust the training target values per query: yqi → gq(yqi) = αq yqi + βq,
  • where a linear (monotonic) transformation is required
  • (a nonlinear g() may not preserve the order of E(y|x)).

Adaptive Target Value Transformation

  • Implementation: Empirical Risk Minimization:
  • minF, {αq, βq} Σq Σi (αq yqi + βq – F(xqi))² + λα Σq |αq – 1|^p + λβ Σq |βq|^p
  • where the linear transformation weights are regularized; λα and λβ are the regularization parameters and p is the norm (p = 1 or 2).
  • The solution is the jointly estimated F(x) and the per-query weights (αq, βq).

Adaptive Target Value Transformation

  • Norm p = 2 solution: for each (λα, λβ):
    1. For the current (αq, βq), find F(x) by regression against the transformed targets αq yqi + βq.
    2. For the given F(x), solve for each (αq, βq), q = 1, 2, …, Q.
    3. Repeat from step 1 until convergence.
  • Norm p = 1 solution: solve a constrained quadratic program [Lasso/LARS].
  • Convergence analysis: assuming each subproblem is solved exactly, the empirical risk is non-increasing at every step, so the alternating procedure converges.
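The p = 2 alternating scheme can be sketched as below. Two assumptions not stated on the slide: F is taken to be linear for simplicity, and the regularizer anchors each αq at 1 and βq at 0 (the identity transform), which keeps the per-query solve a small closed-form problem.

```python
import numpy as np

def solve_alpha_beta(y, f, lam_a=1.0, lam_b=1.0):
    """Closed-form p=2 solve for one query:
    min_(a,b) sum_i (a*y_i + b - f_i)^2 + lam_a*(a-1)^2 + lam_b*b^2."""
    n = len(y)
    A = np.array([[y @ y + lam_a, y.sum()],
                  [y.sum(),       n + lam_b]])
    rhs = np.array([y @ f + lam_a, f.sum()])
    return np.linalg.solve(A, rhs)               # (alpha_q, beta_q)

def atvt(queries, n_iter=20):
    """Alternate: refit a linear F on the transformed targets a_q*y + b_q,
    then re-solve each (a_q, b_q) against the current F."""
    X = np.vstack([Xq for Xq, _ in queries])
    ab = [np.array([1.0, 0.0])] * len(queries)   # start at the identity map
    for _ in range(n_iter):
        t = np.concatenate([a * yq + b
                            for (Xq, yq), (a, b) in zip(queries, ab)])
        w, *_ = np.linalg.lstsq(X, t, rcond=None)  # step 1: fit F
        ab = [solve_alpha_beta(yq, Xq @ w)         # step 2: per-query (a, b)
              for Xq, yq in queries]
    return w, ab
```

In the production setting F would be the full MLR model rather than a linear fit, and the p = 1 case would go through Lasso/LARS as the slide notes.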

Adaptive Target Value Transformation

Observations:

1. Relevance gain (DCG5 ~ 2%) is visible.

2. Regularization is needed.

3. Different query types gain differently from aTVT.


Outline:

  • Brief Review: Machine Learning in web search ranking and Multi-Task learning.
  • MLR with Adaptive Target Value Transformation – each query is a task.
  • MLR for Multi-Languages – each language is a task.
  • MLR for Multi-query classes – each type of query is a task.
  • Future work and Challenges.
Multi-Language MLR

Objective:

  • Make MLR globally scalable: >100 languages, >50 regions.
  • Improve MLR for small regions/languages using data from other languages.
  • Build a Universal MLR for all regions that do not have data and editorial support.
Multi-Language MLR Part 1
  • Feature Differences between Languages
  • MLR function differences between Languages.
Multi-Language MLR: Distribution of Text Score

[Figure: distributions of the text-match score for Perfect+Excellent URLs vs. Bad URLs; legend: JP, CN, DE, UK, KR.]

Multi-Language MLR: Distribution of Spam Score

[Figure: distributions of the spam score for Perfect+Excellent URLs vs. Bad URLs; legend: JP, CN, DE, UK, KR.]

  • JP and KR distributions are similar.
  • DE and UK distributions are similar.

Multi-Language MLR: Training and Testing on Different Languages

[Table: % DCG improvement over the base function for each (train language, test language) pair.]

Multi-Language MLR: Language Differences: Observations
  • Feature difference across languages is visible but not huge.
  • MLR trained for one language does not work well for other languages.
Multi-Language MLR Part 2

Transfer Learning with Region features

Multi-Language MLR: Query Region Feature
  • New feature: query region:
  • Multiple Binary Valued Features:
    • Feature vector: qr = (CN, JP, UK, DE, KR)
    • CN queries: (1, 0, 0, 0, 0)
    • JP queries: (0, 1, 0, 0, 0)
    • UK queries: (0, 0, 1, 0, 0)
  • To test the Trained Universal MLR on new languages: e.g. FR
    • Feature vector: qr = (0, 0, 0, 0, 0)
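The encoding above amounts to a one-hot region feature with an all-zeros fallback for regions outside the training set:

```python
# Region order as given on the slide.
REGIONS = ("CN", "JP", "UK", "DE", "KR")

def query_region_feature(region):
    """One-hot query-region vector; a region outside REGIONS (e.g. FR)
    yields all zeros, so the universal model treats it region-agnostically."""
    return [1 if r == region else 0 for r in REGIONS]

print(query_region_feature("JP"))  # [0, 1, 0, 0, 0]
print(query_region_feature("FR"))  # [0, 0, 0, 0, 0]
```

The all-zeros fallback is what lets the trained universal model be applied to a new language with no region-specific data.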
Multi-Language MLR: Query Region Feature: Experiment Results (CJK and UK/DE Models)

All models include query region feature

Multi-Language MLR: Query Region Feature: Observations
  • The query region feature seems to improve combined-model performance in every case, though not always statistically significantly.
  • It helped more when there was less data (KR).
  • It helped more when introducing “near languages” models (CJK, EU).
  • Would not help for languages with large training data (JP, CN).
Multi-Language MLR: Experiments: Overweighting Target Language
  • This method deals with the common case where there is a language with a small amount of data available.
  • Use all available data, but change the weight of the data from the target language.
  • When weight = 1, this is the “Universal Language Model”.
  • As weight → ∞, it becomes the single-language model.
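The scheme can be sketched as per-example training weights; the default of 10 here reflects the deck's later observation that 10 worked better than 1 or 100 on average.

```python
def sample_weights(example_langs, target_lang, weight=10.0):
    """Per-example weights: overweight the target language's training data.
    weight = 1 gives the Universal Language Model; as weight grows, the fit
    approaches the single-language model."""
    return [weight if lang == target_lang else 1.0 for lang in example_langs]

langs = ["KR", "JP", "KR", "CN", "DE"]
print(sample_weights(langs, "KR"))  # [10.0, 1.0, 10.0, 1.0, 1.0]
```

Most regression learners accept such weights directly as a per-sample weighting of the squared loss.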
Multi-Language MLR: Overweighting Target Language: Observations
  • It helps on certain languages with small amounts of data (KR, DE).
  • It does not help on some languages (CN, JP).
  • For languages with enough data, it does not help.
  • The weighting of 10 seems better than 1 and 100 on average.
Multi-Language MLR Part 3

Transfer Learning with Language-Neutral Data and Regression Diff

Multi-Language MLR: Selection of Language-Neutral Queries
  • For each of (CN, JP, KR, DE, UK), train an MLR on its own data.
  • Test each language’s queries with every language’s MLR.
  • Select the queries that show good DCG across the different languages’ MLRs.
  • Consider these queries language neutral; they can be shared by all language MLR development.
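The selection step above can be sketched as follows, assuming each query's DCG has already been computed under every language's model. The thresholds are illustrative assumptions, not values from the slide.

```python
def language_neutral(dcg_by_model, min_dcg=5.0, max_spread=1.0):
    """Keep queries whose per-model DCGs are all high (>= min_dcg) and
    stable across language models (max - min <= max_spread).
    dcg_by_model: {query: [DCG under each of the CN/JP/KR/DE/UK models]}."""
    neutral = []
    for query, dcgs in dcg_by_model.items():
        if min(dcgs) >= min_dcg and max(dcgs) - min(dcgs) <= max_spread:
            neutral.append(query)
    return neutral

scores = {"q1": [6.0, 6.2, 5.9, 6.1, 6.0],   # stable across models
          "q2": [7.0, 2.0, 6.5, 3.0, 6.8]}   # strongly model-dependent
print(language_neutral(scores))  # ['q1']
```

Queries surviving both filters behave the same regardless of which language's ranker scores them, which is the operational meaning of "language neutral" here.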

Multi-Language MLR

Evaluation of Language Neutral Queries on CN-simplified dataset (2,753 queries).


Outline:

  • Brief Review: Machine Learning in web search ranking and Multi-Task learning.
  • MLR with Adaptive Target Value Transformation – each query is a task.
  • MLR for Multi-Languages – each language is a task.
  • MLR for Multi-query classes – each type of query is a task.
  • Future work and Challenges.
Multi-Query Class MLR

Intuitions:

  • Different types of queries behave differently:
    • They require different ranking features
      (time-sensitive queries → page_time_stamps).
    • They expect different results
      (navigational queries → one official page at the top).
  • Also, different types of queries can share the same ranking features.
  • Multi-class learning can be done in a unified MLR by:
    • Introducing query classification and using the query class as an input ranking feature.
    • Adding page-level features for the corresponding classes.
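The unified-MLR recipe above can be sketched as simple feature augmentation; the helper name and the 90-day freshness cutoff are illustrative assumptions (the deck only says "within the last three months").

```python
def class_augmented_features(base_features, is_time_sensitive, page_age_days):
    """Append the query-class flag and its matching page-level feature:
    a binary 'time sensitive query' feature and a binary 'page discovered
    within the last ~3 months' feature (assumed here as <= 90 days)."""
    fresh = int(page_age_days <= 90)
    return base_features + [int(is_time_sensitive), fresh]

x = class_augmented_features([0.7, 0.3], True, page_age_days=30)
print(x)  # [0.7, 0.3, 1, 1]
```

A single MLR trained on such vectors can learn class-specific behavior through interactions between the class flag and the page-level feature.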
Multi-Query Class MLR

Time Recency experiments:

  • Feature implementation:
    • Binary query feature: time sensitive (0/1).
    • Binary page feature: page discovered within the last three months.
  • Data:
    • 300 time-sensitive queries (editorial).
    • ~2,000 ordinary queries.
    • Time-sensitive queries overweighted by 3×.
    • 10-fold cross-validation for MLR training/testing.
Multi-Query Class MLR

Time Recency experiment results:

Comparison of MLR with and without the page_time feature.

Multi-Query Class MLR

Named-Entity queries:

  • Feature implementation:
    • Binary query feature: named-entity query (0/1).
    • 11 new page features implemented, including:
      • Path length
      • Host length
      • Number of host components (URL depth)
      • Path contains “index”
      • Path contains “cgi”, “asp”, “jsp”, or “php”
      • Path contains “search” or “srch”, …
  • Data:
    • 142 place name-entity queries.
    • ~2,000 ordinary queries.
    • 10-fold cross-validation for MLR training/testing.
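A few of the URL-based page features listed above can be sketched as follows; the exact feature definitions used in the original system are not given, so these are plausible readings, not the production code.

```python
from urllib.parse import urlparse

def url_page_features(url):
    """URL-derived page features in the spirit of the list above."""
    parsed = urlparse(url)
    path = parsed.path.lower()
    return {
        "path_length": len(parsed.path),
        "host_length": len(parsed.netloc),
        # number of non-empty path segments, one reading of "URL depth"
        "url_depth": sum(1 for seg in parsed.path.split("/") if seg),
        "path_has_index": int("index" in path),
        "path_has_script": int(any(s in path for s in ("cgi", "asp", "jsp", "php"))),
        "path_has_search": int(any(s in path for s in ("search", "srch"))),
    }

feats = url_page_features("http://www.example.com/travel/search/index.php")
print(feats["url_depth"], feats["path_has_index"], feats["path_has_search"])
# 3 1 1
```

The intuition: official pages for a named entity tend to sit at shallow, clean URLs, so depth and script/search markers are informative for this query class.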
Multi-Query Class MLR

Named-Entity query experiment results:

Comparison against the base model without named-entity features.

Multi-Query Class MLR

Observations:

  • Query class combined with page level features could help MLR relevance.
  • More research is needed on query classification and page level feature optimization.

Outline:

  • Brief Review: Machine Learning in web search ranking and Multi-Task learning.
  • MLR with Adaptive Target Value Transformation – each query is a task.
  • MLR for Multi-Languages – each language is a task.
  • MLR for Multi-query classes – each type of query is a task.
  • Future work and Challenges.
Future Work and Challenges
  • Multi-task learning extended to different types of training data:
    • Editorial judgment data.
    • User click-through data.
  • Multi-task learning extended to different types of relevance judgments:
    • Absolute relevance judgments.
    • Relative relevance judgments.
  • Multi-task learning extended to use both:
    • Labeled data.
    • Unlabeled data.
  • Multi-task learning extended to different types of search user intentions.

Contributors from Yahoo! International Search Relevance team:

  • Algorithm and model development:
    • Zhaohui Zheng,
    • Hongyuan Zha,
    • Lukas Biewald,
    • Haoying Fu
  • Data exporting/processing/QA:
    • Jianzhang He
    • Srihari Reddy
  • Director:
    • Gordon Sun.