Implicit user modeling for personalized search

Implicit User Modeling for Personalized Search

Xuehua Shen, Bin Tan, ChengXiang Zhai

Department of Computer Science

University of Illinois, Urbana-Champaign


Current Search Engines are Mostly Document-Centered…

[Figure: a search engine over a collection of documents]

Search is generally non-personalized…


Example of Non-Personalized Search

Query = Jaguar (results as of Oct. 17, 2005)

Top results mix several senses: Car, Car, Software, Car, Animal, Car

Without knowing more about the user, it’s hard to optimize…



Therefore, personalization is necessary to improve the existing search engines.

However, many questions need to be answered…


Research Questions

  • Client-side or server-side personalization?

  • Implicit or explicit user modeling?

  • What’s a good retrieval framework for personalized search?

  • How to evaluate personalized search?


Client-Side vs. Server-Side Personalization

  • So far, personalization has mostly been done on the server side

  • We emphasize client-side personalization, which has 3 advantages:

    • More information about the user, thus more accurate user modeling (complete interaction history + other user activities)

    • More scalable (“distributed personalization”)

    • Alleviates the privacy problem


Implicit vs. Explicit User Modeling

  • Explicit user modeling

    • More accurate, but users generally don’t want to provide additional information

    • E.g., relevance feedback

  • Implicit user modeling

    • Less accurate, but no extra effort for users

    • E.g., implicit feedback

We emphasize implicit user modeling


“Jaguar” Example Revisited

Suppose we know:

  • Previous query = “racing cars”

  • “car” occurs far more frequently than “Apple” in pages browsed by the user in the last 20 days

  • User just viewed an “Apple OS” document

All this information is naturally available to an IR system


Remaining Research Questions

  • Client-side or server-side personalization?

  • Implicit or explicit user modeling?

  • What’s a good retrieval framework for personalized search?

  • How to evaluate personalized search?


Outline

  • A decision-theoretic framework

  • UCAIR personalized search agent

  • Evaluation of UCAIR


Implicit user information exists in the user’s interaction history.

We thus need to develop a retrieval framework for interactive retrieval…


Modeling Interactive IR

  • Model interactive IR as an “action dialog”: cycles of user action (Ai) and system response (Ri)


Retrieval Decisions

Given U, C, At, and H, choose the best Rt from all possible responses to At: Rt ∈ r(At)

  • User U: actions A1, A2, …, At-1, At (e.g., At = the query “Jaguar”, or a click on the “Next” button)

  • System: responses R1, R2, …, Rt-1; Rt = ?

  • History: H = {(Ai, Ri)}, i = 1, …, t-1

  • C: document collection

  • If At is a new query, r(At) = all possible rankings of C, and the best Rt is the best ranking for the query

  • If At is a click on “Next”, r(At) = all possible rankings of unseen docs, and the best Rt is the best ranking of the unseen docs


Decision-Theoretic Framework

Observed:

  • User: U

  • Interaction history: H

  • Current user action: At

  • Document collection: C

  • All possible responses: r(At) = {r1, …, rn}

Inferred:

  • User model M = (S, U, …): the seen docs S and the user’s information need U

Loss function: L(ri, At, M)

Optimal response: the Rt with minimum loss (expected risk)


A Simplified Two-Step Decision-Making Procedure

  • Approximate the expected risk by the loss at the mode of the posterior distribution

  • Two-step procedure

    • Step 1: Compute an updated user model M* based on the currently available information

    • Step 2: Given M*, choose a response to minimize the loss function
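The two-step procedure above can be sketched as a generic loop body. This is a minimal sketch, assuming the user model and loss are supplied as callables; `update_user_model`, `loss`, and `responses` are hypothetical placeholders, not functions from the paper.

```python
def respond(user, history, action, collection,
            update_user_model, loss, responses):
    """One round of the two-step procedure: estimate M*, then pick
    the response with minimum loss under M*."""
    # Step 1: point estimate of the user model (mode of the posterior),
    # replacing the expectation over all models with a single M*
    m_star = update_user_model(user, history, action, collection)
    # Step 2: choose the response minimizing the loss given M*
    return min(responses(action), key=lambda r: loss(r, action, m_star))
```

Approximating the expected risk by the loss at the posterior mode is what makes each round cheap: only one user model is estimated per action, and the minimization runs over candidate responses only.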


Optimal Interactive Retrieval

The IR system serves user U over collection C by repeating the two-step procedure on every action:

  • A1 → compute M*1 from P(M1|U,H,A1,C) → respond with R1 minimizing L(r, A1, M*1)

  • A2 → compute M*2 from P(M2|U,H,A2,C) → respond with R2 minimizing L(r, A2, M*2)

  • A3 → …


Refinement of the Decision-Theoretic Framework

  • r(At): decision space (At dependent)

    • r(At) = all possible rankings of docs in C

    • r(At) = all possible rankings of unseen docs

  • M: user model

    • Essential component: U = user information need

    • S = seen documents

  • L(ri,At,M): loss function

    • Generally measures the utility of ri for a user modeled as M

  • P(M|U, H, At, C): user model inference

    • Often involves estimating U


Case 1: Non-Personalized Retrieval

  • At = “enter a query Q”

  • r(At) = all possible rankings of docs in C

  • M = U, a unigram language model (word distribution)

  • p(M|U,H,At,C) = p(U|Q)


Case 2: Implicit Feedback for Retrieval

  • At = “enter a query Q”

  • r(At) = all possible rankings of docs in C

  • M = U, a unigram language model (word distribution)

  • H = {previous queries} + {viewed snippets}

  • p(M|U,H,At,C) = p(U|Q,H) ← implicit user modeling
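One simple way to realize p(U|Q,H) is to interpolate the query’s word distribution with a distribution estimated from the history. The α weight and the function name here are illustrative assumptions, not the paper’s estimator.

```python
from collections import Counter

def implicit_query_model(query_terms, history_terms, alpha=0.8):
    """Estimate p(w|U) by interpolating the current query's word
    distribution with one built from the interaction history H.
    alpha (assumed value) controls how much the query dominates."""
    def dist(terms):
        counts = Counter(terms)
        total = sum(counts.values())
        return {w: c / total for w, c in counts.items()} if total else {}
    q, h = dist(query_terms), dist(history_terms)
    vocab = set(q) | set(h)
    return {w: alpha * q.get(w, 0.0) + (1 - alpha) * h.get(w, 0.0)
            for w in vocab}
```

With a history dominated by “car” pages, the resulting model for the query “jaguar” leans toward the car sense without any extra effort from the user.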


Case 3: More General Personalized Search with Implicit Feedback

  • At = “enter a query Q”, click on the “Back” button, or click on a “Next” link

  • r(At) = all possible rankings of unseen docs in C

  • M = (U, S), where S = seen documents

  • H = {previous queries} + {viewed snippets}

  • p(M|U,H,At,C) = p(U|Q,H) ← eager feedback


Benefit of the Framework

  • Traditional view of IR

    • Retrieval  Match a query against documents

    • Insufficient for modeling personalized search (user and the interaction history are not part of a retrieval model)

  • The new framework provides a map for systematic exploration of

    • Methods for implicit user modeling

    • Models for eager feedback

  • The framework also provides guidance on how to design a personalized search agent (optimizing responses to every user action)



UCAIR Toolbar Architecture
(http://sifaka.cs.uiuc.edu/ir/ucair/download.html)

UCAIR sits between the user and a search engine (e.g., Google):

  • Query Modification: the user’s query may be modified before it is sent to the search engine

  • Search History Log: records past queries and clicked results

  • User Modeling: infers the user model from the search history and clickthrough

  • Result Re-Ranking: reranks the results held in the result buffer before showing them to the user


Decision-Theoretic View of UCAIR

  • User actions modeled

    • A1 = Submit a keyword query

    • A2 = Click the “Back” button

    • A3 = Click the “Next” link

  • System responses

    • r(Ai) = rankings of the unseen documents

  • History

    • H = {previous queries, clickthroughs}

  • User model: M=(X,S)

    • X = vector representation of the user’s information need

    • S = documents seen by the user


Decision-Theoretic View of UCAIR (cont.)

  • Loss functions:

    • L(r, A2, M) = L(r, A3, M) → reranking, vector space model

    • L(r, A1, M) ≈ L(q, A1, M) → query expansion, favor a good q

  • Implicit user model inference

    • X* = argmaxx p(x|Q,H), computed using Rocchio feedback over the vectors of seen snippets

    • S* = all seen docs in H

Newer versions of UCAIR have adopted language models
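A Rocchio-style positive-feedback update over sparse term vectors might look like the following; α and β are assumed weights (the slides do not give UCAIR’s actual parameter values), and only clicked snippets are used as positive evidence.

```python
def rocchio_update(query_vec, clicked_vecs, alpha=1.0, beta=0.5):
    """Move the user-model vector toward the centroid of clicked
    (positive-feedback) snippet vectors, Rocchio-style.
    Vectors are sparse dicts mapping term -> weight."""
    updated = {w: alpha * v for w, v in query_vec.items()}
    if clicked_vecs:
        n = len(clicked_vecs)
        for vec in clicked_vecs:
            for w, v in vec.items():
                # add beta times each snippet's share of the centroid
                updated[w] = updated.get(w, 0.0) + beta * v / n
    return updated
```

Because only viewed snippets feed the update, the model improves with every click at no extra cost to the user.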


UCAIR in Action

  • In responding to a query

    • Decide the relationship of the current query to the previous query (based on result similarity)

    • Possibly do query expansion using the previous query and results

    • Return a ranked list of documents using the (expanded) query

  • In responding to a click on “Next” or “Back”

    • Compute an updated user model based on clickthroughs (using Rocchio)

    • Rerank unseen documents (using a vector space model)
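The reranking step can be sketched with cosine similarity between the updated user-model vector and each unseen document’s vector. All names and the toy vectors below are illustrative, not UCAIR’s internals.

```python
import math

def rerank_unseen(user_vec, docs, seen):
    """Rank unseen documents by cosine similarity to the user-model
    vector; docs maps doc id -> sparse term-weight dict, seen is S."""
    def cosine(a, b):
        dot = sum(a.get(w, 0.0) * b[w] for w in b)
        na = math.sqrt(sum(v * v for v in a.values()))
        nb = math.sqrt(sum(v * v for v in b.values()))
        return dot / (na * nb) if na and nb else 0.0
    # restrict the decision space to unseen documents, per r(At)
    unseen = [d for d in docs if d not in seen]
    return sorted(unseen, key=lambda d: cosine(user_vec, docs[d]),
                  reverse=True)
```

Restricting the ranking to unseen documents matches the decision space r(At) for “Next”/“Back” actions: documents the user has already viewed are never re-shown.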



A User Study of Personalized Search

  • Six participants used the UCAIR toolbar to do web search

  • Topics were selected from the TREC Web track and Terabyte track

  • Participants explicitly evaluated the relevance of the top 30 search results from Google and from UCAIR


UCAIR Outperforms Google: Precision at N Docs

More user interactions → better user models → better retrieval accuracy



Summary

  • Proposed a decision-theoretic framework to model interactive IR

  • Built a personalized search agent for web search

  • Conducted a user study of web search showing that the UCAIR personalized search agent can improve retrieval accuracy


The End

Thank you !

