Predicting Short-Term Interests Using Activity-Based Search Context

CIKM’10

Advisor: Jia Ling, Koh

Speaker: Yu Cheng, Hsieh


Outline

  • Introduction

  • Modeling Search Activity

  • Study

  • Conclusions


Introduction

  • Satisfying searchers’ information needs involves a thorough understanding of their interests through:

    - search query

    - search engine result page (SERP) clicks

    - post-SERP browsing behavior

  • Construct interest models of the current query that include:

    - previous queries

    - previous clicks on SERP

  • Evaluate the predictive effectiveness of these models using future actions


Modeling Search Activity

  • Data

    - The data set contained browser logs with both searching and browsing episodes.

    - Log entries include a timestamp for each page view and the URL of the Web page visited.

    - Only entries from the English-speaking United States locale were used.

    - Search sessions on the Bing Web search engine were extracted.


Modeling Search Activity

  • ODP Labeling

    - Context is represented as a distribution across categories in the ODP topical hierarchy.

    - This provides a consistent topical representation of queries and page visits from which to build the models.

    - ODP category labels can also reflect topical differences in the search results for a query or in a user’s interests.

    - An automatic classification technique assigns an ODP category label to each page.

    - The 219 categories at the top two levels of the ODP hierarchy were used (called L).


Modeling Search Activity

  • ODP Labeling

    - Strategy for labeling a page:

    1. Begin with URLs present in the ODP.

    2. Incrementally prune non-present URLs until a match is found or a miss is declared.

    3. Check for an exact match with a logistic regression classifier.
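
The pruning in step 2 can be pictured as a back-off lookup. The sketch below is only an illustration under one assumption: that pruning means dropping trailing path segments one at a time. The `odp_index` dictionary, its entries, and the function name `backoff_label` are hypothetical and do not come from the slides. The classifier in step 3 is not shown.

```python
from urllib.parse import urlsplit


def backoff_label(url, odp_index):
    """Return the label of the longest URL prefix found in odp_index, or None (a miss)."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Try the full path first, then drop one trailing segment per step.
    while True:
        candidate = "/".join([parts.netloc] + segments)
        if candidate in odp_index:
            return odp_index[candidate]
        if not segments:
            return None  # even the bare host is absent: declare a miss
        segments.pop()


# Hypothetical index mapping URL prefixes to top-two-level ODP categories.
odp_index = {
    "example.com/sports": "Sports/Soccer",
    "example.org": "Computers/Software",
}
```

A page such as `http://example.com/sports/teams/a.html` would back off twice before matching the `example.com/sports` entry.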


Modeling Search Activity

  • Sources and Source Combinations

    - ODP labels are automatically assigned to the following sources:

    1. Query: the top 10 search results for the query

    2. SERPClick: the search results clicked by the user during the search session

    3. NavTrail: Web pages that the user visits from a SERP click


Modeling Search Activity

  • Model Definitions – Query Model (Q)

    - For each query, the category labels for the top 10 search results were obtained.

    - Probabilities are assigned to the categories in L by:

    1. obtaining normalized click frequencies for each of the top 10 results from search-engine click log data

    2. computing the distribution across all ODP category labels

    - ODP categories in L that are not used to label any result are assigned a prior probability.
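
A minimal sketch of how such a query model might be computed is given below. The prior value, the renormalization step, the equal-weight fallback when no clicks are logged, and the names `query_model`, `results`, and `categories` are all assumptions made for illustration. The slides do not specify them.

```python
def query_model(results, categories, prior=0.001):
    """Build a distribution over `categories` from (label, clicks) pairs for the top results."""
    total_clicks = sum(clicks for _, clicks in results)
    mass = {c: 0.0 for c in categories}
    for label, clicks in results:
        # Normalized click frequency; equal weights if no clicks were logged (assumption).
        share = clicks / total_clicks if total_clicks else 1.0 / len(results)
        mass[label] += share
    # Categories that received no mass get the assumed prior; then renormalize to sum to 1.
    scores = {c: (m if m > 0 else prior) for c, m in mass.items()}
    norm = sum(scores.values())
    return {c: s / norm for c, s in scores.items()}


# Three labels stand in for the 219 categories in L; the click counts are made up.
categories = ["Sports/Soccer", "News/Sports", "Arts/Music"]
results = [("Sports/Soccer", 30), ("News/Sports", 10), ("Sports/Soccer", 10)]
q = query_model(results, categories)
```

Here `Arts/Music` labels no result, so it keeps only the small prior mass, while the two clicked categories share the rest in proportion to their clicks.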


Modeling Search Activity

  • Model Definitions – Context Model (X)

    - The context model is constructed from the actions that precede the current query:

    1. Queries

    2. Web pages visited through a SERP click

    3. Web pages visited on the navigational trail following a SERP click
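
One way to picture the context model is as a weighted average of the category distributions of the earlier actions. The sketch below rests on assumptions: the slides do not state how actions are weighted, so the `decay` parameter (1.0 means uniform weights) and the function name `context_model` are hypothetical.

```python
def context_model(action_dists, categories, decay=1.0):
    """Weighted average of per-action category distributions, most recent action last."""
    totals = {c: 0.0 for c in categories}
    weight_sum = 0.0
    n = len(action_dists)
    for i, dist in enumerate(action_dists):
        weight = decay ** (n - 1 - i)  # the most recent action gets weight 1
        weight_sum += weight
        for c in categories:
            totals[c] += weight * dist.get(c, 0.0)
    if weight_sum == 0:
        # No prior actions: fall back to a uniform distribution (assumption).
        return {c: 1.0 / len(categories) for c in categories}
    return {c: t / weight_sum for c, t in totals.items()}


cats = ["A", "B"]
history = [{"A": 1.0}, {"B": 1.0}]  # oldest action first
x_uniform = context_model(history, cats)
x_recent = context_model(history, cats, decay=0.5)
```

With `decay` below 1, the most recent action contributes more than older ones, so `x_recent` leans toward category `B`.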


Modeling Search Activity

  • Model Definition – Intent Model (I)


Modeling Search Activity

  • Relevance Model or Ground Truth (R)

    - The relevance model contains the actions that occur after the current query in the session.


Study

  • Learning Optimal Context Weights

    Steps:

    1. Identify the optimal context weight (w) for each query on a held-out training set.

    2. Create features for the query and the context that could be useful in predicting w.


Study

  • Learning Optimal Context Weights

    - To create a training set, the query, context, and relevance models were used to compute the optimal context weight per query by minimizing the regularized cross-entropy for each query independently.


Study

The objective includes a regularizer that penalizes deviations from w = 0.5.
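
The objective itself did not survive in this transcript, so the following is only a sketch under stated assumptions. It assumes the intent model is the linear mixture I = w·X + (1 − w)·Q, that the loss is the cross-entropy of the relevance model R under I, and that the regularizer is the quadratic penalty lam·(w − 0.5)². The value of `lam`, the grid search, and all function names are hypothetical.

```python
import math


def intent_model(q, x, w):
    """Mix query model q and context model x with context weight w in [0, 1]."""
    return {c: (1.0 - w) * q[c] + w * x[c] for c in q}


def regularized_cross_entropy(r, q, x, w, lam=0.1, eps=1e-12):
    """Cross-entropy of relevance model r under the intent model, plus lam * (w - 0.5)**2."""
    intent = intent_model(q, x, w)
    ce = -sum(r[c] * math.log(intent[c] + eps) for c in r)
    return ce + lam * (w - 0.5) ** 2


def optimal_context_weight(r, q, x, lam=0.1, steps=100):
    """Grid-search w over [0, 1] and return the value with the lowest objective."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda w: regularized_cross_entropy(r, q, x, w, lam))


q_model = {"A": 0.9, "B": 0.1}
x_model = {"A": 0.2, "B": 0.8}
r_model = {"A": 0.2, "B": 0.8}  # future actions resemble the context
w_star = optimal_context_weight(r_model, q_model, x_model)
```

When the future actions resemble the context, the selected weight moves toward 1. When they resemble the query, it moves toward 0. The penalty pulls both cases back toward 0.5.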


Study

  • Generating Features of Query and Context

    - Features are divided into three classes:

    1. Query class: capturing characteristics of the current query and the query model.

    2. Context class: capturing aspects of the pre-query interaction behavior as well as features of the context model itself.

    3. QueryContext class: capturing aspects of how the query model and context model compare.

    - These features were generated for each session in the set and used to train a predictive model.


Study

  • Predicting the Optimal Context Weight

    - 60% of the queries were used for training, 20% for validation, and 20% for testing.

    - 10-fold cross-validation was performed to improve the reliability of the results.

    - The folds were constructed by splitting on sessions, so that all queries in a session are used for only one of training, validation, or testing.
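
The session-level split described above can be sketched as follows. This is an illustration, not the authors' procedure: the data layout (pairs of session id and query), the seed, and the function name `split_by_session` are assumptions. The 60/20/20 proportions are applied to sessions here, so the share of queries in each part is only approximate.

```python
import random


def split_by_session(queries, seed=0):
    """Assign (session_id, query) pairs to train/valid/test so that no session is split."""
    session_ids = sorted({sid for sid, _ in queries})
    random.Random(seed).shuffle(session_ids)
    n = len(session_ids)
    cut_train, cut_valid = int(0.6 * n), int(0.8 * n)
    bucket = {}
    for i, sid in enumerate(session_ids):
        bucket[sid] = "train" if i < cut_train else "valid" if i < cut_valid else "test"
    splits = {"train": [], "valid": [], "test": []}
    for sid, query in queries:
        splits[bucket[sid]].append((sid, query))
    return splits


# Ten made-up sessions with three queries each.
queries = [(s, f"query-{s}-{k}") for s in range(10) for k in range(3)]
splits = split_by_session(queries)
```

Splitting on sessions rather than on individual queries keeps queries from the same session out of both the training and the test data.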


Study

  • Predicting the Optimal Context Weight

    - The best-performing features relate to the information divergence between the query model and the context model.
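
As an illustration of a divergence-based feature, the sketch below computes the Kullback-Leibler divergence between two category distributions. The slide does not say which divergence was used, so the choice of KL, the `eps` smoothing constant, and the function name `kl_divergence` are assumptions.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats over the categories of p; eps guards against log(0)."""
    return sum(
        p[c] * math.log((p[c] + eps) / (q.get(c, 0.0) + eps))
        for c in p
        if p[c] > 0
    )


query_dist = {"A": 0.9, "B": 0.1}
context_dist = {"A": 0.2, "B": 0.8}
feature = kl_divergence(query_dist, context_dist)
```

The value is zero when the two models agree and grows as the query and the context point to different topics.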


Study

  • Varying Context and Relevance Information


Conclusions

  • Presented a study of the effectiveness of activity-based context in predicting users’ search interests.

  • Explored the value of modeling the current query, its context, and their combination, across different sources.

  • Intent models developed from many sources perform best overall.

  • Developed techniques to learn the optimal combinations.

