Query Suggestion Using Hitting Time


Qiaozhu Mei †, Dengyong Zhou ‡, Kenneth Church ‡

† University of Illinois at Urbana-Champaign

‡ Microsoft Research, Redmond

Motivating Examples

MSG: Sports center? Food Additive?

1. Difficult for a user to express information need

2. Difficult for a search engine to infer information need

Query suggestions: accurate to express the information need; easy to infer the information need

Motivating Examples (Cont.)

Welcome to the hotel california

Motivating Examples: Personalization

MSR

Metropolis Street Racer

Magnetic Stripe Reader

Molten salt reactor

Mars Sample Return

Mountain safety research

Actually Looking for Microsoft Research…

Research Questions
  • How can we generate query suggestions in a principled way?
  • Can we generate personalized query suggestions using the same method?
  • Can this method be generalized to other search related tasks?
Rest of This Talk
  • Random Walk, Hitting Time, and Bipartite Graph
  • Generating Query Suggestion
  • Personalized Query Suggestion
  • Experiments
  • Discussion and Summary
Random Walk and Hitting Time
  • Hitting Time
    • T_A: the first time that the random walk is at a vertex in A
  • Mean Hitting Time
    • h_i^A: expectation of T_A given that the walk starts from vertex i

[Figure: a random walk starting at vertex i moves to vertex k with probability 0.3 and to vertex j with probability 0.7; A is the target set of vertices.]
Computing Hitting Time
  • T_A: the first time that the random walk is at a vertex in A
  • h_i^A: expectation of T_A given that the walk starts from vertex i
  • Apparently, h_i^A = 0 for those vertices i in A
  • For the example walk, where i moves to j with probability 0.7 and to k with probability 0.3: h_i^A = 0.7 h_j^A + 0.3 h_k^A + 1
  • Iterative computation
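
The recurrence on this slide lends itself to a simple fixed-point iteration. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: the function name `hitting_times`, the dictionary-based graph, the iteration count, and the transitions out of j and k are all made up for illustration. Only the probabilities 0.7 and 0.3 out of vertex i come from the slide.

```python
def hitting_times(transitions, targets, iterations=100):
    """Approximate mean hitting times to `targets` by repeated substitution.

    transitions: dict mapping each vertex to {neighbor: probability}
    targets:     set of vertices forming the target set A
    """
    h = {v: 0.0 for v in transitions}
    for _ in range(iterations):
        new_h = {}
        for v, out_edges in transitions.items():
            if v in targets:
                new_h[v] = 0.0  # a walk starting inside A has already hit A
            else:
                # one step now, plus the expected remaining time from the next vertex
                new_h[v] = 1.0 + sum(p * h[u] for u, p in out_edges.items())
        h = new_h
    return h


# Toy graph: only the edges out of "i" follow the slide; the rest are assumed.
toy = {
    "i": {"j": 0.7, "k": 0.3},
    "j": {"A": 1.0},
    "k": {"A": 0.5, "i": 0.5},
    "A": {"A": 1.0},
}
h = hitting_times(toy, {"A"})
print({v: round(t, 4) for v, t in h.items()})
```

After enough iterations the values satisfy the slide's equation for vertex i, which is a quick sanity check on any implementation.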

Bipartite Graph and Hitting Time
  • Bipartite Graph:
    • Edges between V1 and V2
    • No edge inside V1 or V2
    • Edges are weighted
    • e.g., V1 = query; V2 = Url

[Figure: a weighted bipartite graph between V1 and V2 containing the target A and vertices i, j, k, with edge weights (5, 4, 7, 1, and w(i, j) = 3) and probabilities (0.4, 0.7).]

Expected proximity of query i to the query A: hitting time of i → A, h_i^A

  • convert to a directed graph, even collapse one group
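
One plausible way to read "collapse one group" is to fold each query → url → query double step into a single query → query transition. The sketch below assumes that reading; the function name `query_transitions`, the click weights, and the node names q1, u1, and so on are hypothetical and do not come from the talk.

```python
from collections import defaultdict


def query_transitions(clicks):
    """Collapse the URL side of a weighted query-URL graph.

    clicks: dict mapping (query, url) -> edge weight
    Returns p[q1][q2], the probability of the two-step walk q1 -> url -> q2.
    """
    query_degree = defaultdict(float)
    url_degree = defaultdict(float)
    queries_of_url = defaultdict(list)
    for (query, url), weight in clicks.items():
        query_degree[query] += weight
        url_degree[url] += weight
        queries_of_url[url].append((query, weight))

    p = defaultdict(lambda: defaultdict(float))
    for (q1, url), w1 in clicks.items():
        step_to_url = w1 / query_degree[q1]  # p(q1 -> url)
        for q2, w2 in queries_of_url[url]:
            p[q1][q2] += step_to_url * w2 / url_degree[url]  # times p(url -> q2)
    return {q: dict(row) for q, row in p.items()}


# Hypothetical weights, chosen only to make the arithmetic easy to follow.
clicks = {("q1", "u1"): 3, ("q1", "u2"): 1, ("q2", "u1"): 1, ("q3", "u2"): 1}
p = query_transitions(clicks)
print(p["q1"])
```

Note that the collapsed walk keeps self-loops such as q1 → u1 → q1, and each row still sums to 1.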
Generate Query Suggestion
  • Construct a (kNN) subgraph from the query log data (of a predefined number of queries/urls)
  • Compute transition probabilities p(i → j)
  • Compute hitting time h_i^A
  • Rank candidate queries using h_i^A

[Figure: a query-URL click graph. Queries: aa, mexiana, american airline (with a node marked T). URLs: www.aa.com, www.theaa.com/travelwatch/planner_main.jsp, en.wikipedia.org/wiki/Mexicana. Example edge weights: 300, 15.]
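
The four steps on this slide can be sketched end to end on a toy click graph. This is an illustrative sketch only: the function name `suggest`, the node names, the click counts, and the default of 20 iterations are assumptions. It also walks on the bipartite graph directly rather than on a collapsed query graph, and it skips the kNN subgraph construction.

```python
from collections import defaultdict


def suggest(clicks, target, iterations=20):
    """Rank queries by truncated hitting time to `target` (smaller is closer).

    clicks: dict mapping (query, url) -> click count
    """
    # Undirected weighted bipartite graph; vertices are tagged to keep sides apart.
    adj = defaultdict(dict)
    for (query, url), count in clicks.items():
        adj[("q", query)][("u", url)] = count
        adj[("u", url)][("q", query)] = count

    goal = ("q", target)
    h = {v: 0.0 for v in adj}
    for _ in range(iterations):
        new_h = {}
        for v, neighbors in adj.items():
            if v == goal:
                new_h[v] = 0.0
            else:
                total = sum(neighbors.values())
                # transition probability is the edge weight over the vertex degree
                new_h[v] = 1.0 + sum(w / total * h[n] for n, w in neighbors.items())
        h = new_h

    candidates = [name for side, name in adj if side == "q" and name != target]
    return sorted(candidates, key=lambda q: h[("q", q)]), h


# Hypothetical click counts: "near" shares a URL with the target, "far" does not.
clicks = {
    ("target", "u1"): 10,
    ("near", "u1"): 5,
    ("near", "u2"): 1,
    ("far", "u2"): 5,
    ("far", "u3"): 5,
}
ranked, h = suggest(clicks, "target")
print(ranked)
```

Stopping after a fixed number of iterations gives a truncated hitting time, which is enough to order the candidates in this toy graph.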

Intuition
  • Why does it work?
    • A url is close to a query if freq(q, url) dominates the number of clicks on this url (most people use q to access url)
    • A query is close to the target query if it is close to many urls that are close to the target query
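
The first point can be made concrete with a little arithmetic: when one query accounts for most of the clicks on a URL, a single step of the walk from that URL almost always lands on that query. The sketch below uses hypothetical counts and a hypothetical helper name, `url_to_query_probability`.

```python
def url_to_query_probability(freq, url, query):
    """Probability that one step of the walk from `url` lands on `query`."""
    clicks_on_url = sum(count for (q, u), count in freq.items() if u == url)
    if clicks_on_url == 0:
        return 0.0
    return freq.get((query, url), 0) / clicks_on_url


# Hypothetical counts: "q" accounts for 90 of the 100 clicks on "url".
freq = {("q", "url"): 90, ("other", "url"): 10}
p_dominant = url_to_query_probability(freq, "url", "q")
p_minor = url_to_query_probability(freq, "url", "other")
print(p_dominant, p_minor)
```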
Personalized Query Suggestion
  • Queries are ambiguous
  • Different user → different information need → different query suggestions
  • Simple approach: build the graph, compute hitting time solely based on the user’s history
  • Data Sparseness
    • E.g., you cannot see a query if you never used it
  • Alternative: modify the bipartite graph instead of rebuilding all
Personalize the Bipartite Graph
  • Key: How to compute
    • From w(url, user, query) – Sparse data!
    • Compute a smoothed p(Url | User, Query)

  • Introduce a pseudo (personalized) query, e.g., “aa” + user
  • Reweight edges using personalized probabilities

[Figure: a query-URL graph with queries aa, alcoholics anonymous, american airline, and the pseudo query “aa” + user (nodes marked T and P); URLs www.aa.com, www.theaa.com/travelwatch/planner_main.jsp, en.wikipedia.org/wiki/Alcoholics_Anonymous, www.alcoholics-anonymous.org.]

Personalization with Backoff (Mei and Church 08)
  • Full personalization (156.111.188.243): sparse data!
  • Personalization with backoff: 156.111.188.* → 156.111.*.* → 156.*.*.*
  • No personalization (*.*.*.*): lose the opportunity
  • We don’t have enough data for everyone!
    • Back off to classes of users (e.g., IP)
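
The IP backoff chain can be sketched as follows. The slides do not spell out how the levels are combined, so the linear interpolation here is a stand-in, and the level weights, click counts, and function names (`backoff_keys`, `smoothed_click_probability`) are assumptions made for illustration.

```python
def backoff_keys(ip):
    """Return the IP followed by progressively coarser prefixes, ending at *.*.*.*."""
    parts = ip.split(".")
    return [".".join(parts[:4 - k] + ["*"] * k) for k in range(5)]


def smoothed_click_probability(counts, ip, query, url, weights):
    """Interpolate p(url | user class, query) across backoff levels.

    counts:  dict mapping a backoff key -> {(query, url): clicks}
    weights: one interpolation weight per level (assumed values, not learned)
    Levels with no clicks for `query` are skipped and the rest renormalized.
    """
    numerator = 0.0
    used_weight = 0.0
    for key, weight in zip(backoff_keys(ip), weights):
        level = counts.get(key, {})
        total = sum(c for (q, _), c in level.items() if q == query)
        if total > 0:
            numerator += weight * level.get((query, url), 0) / total
            used_weight += weight
    return numerator / used_weight if used_weight > 0 else 0.0


keys = backoff_keys("156.111.188.243")
print(keys)

# Hypothetical click counts at the most specific and the most general level.
counts = {
    "156.111.188.243": {("aa", "u1"): 2},
    "*.*.*.*": {("aa", "u1"): 1, ("aa", "u2"): 3},
}
weights = [0.5, 0.2, 0.15, 0.1, 0.05]
p_u1 = smoothed_click_probability(counts, "156.111.188.243", "aa", "u1", weights)
p_u2 = smoothed_click_probability(counts, "156.111.188.243", "aa", "u2", weights)
print(p_u1, p_u2)
```

The user's own history dominates when it exists, while the global level keeps unseen URLs such as u2 from dropping to zero.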
Experiments
  • Query Suggestion using Query Logs
    • commercial search engine log (1.5 years)
    • 637 million queries; 585 million urls
    • Query-click bipartite graph
  • Author/keyword suggestion using DBLP
    • titles and authors from DBLP
    • 110k papers, 580k authors
    • Coauthor graph, keyword graph, author-keyword bipartite graph
  • Baselines: nearest neighbor; personalized pagerank
Result: Query Suggestion (II)

Query = aa

Query = ranknet

Result: Author Suggestion

Favor students, especially current students

Query = Jon Kleinberg

(Personalized PageRank is similar)

Famous researchers + former students

Result: Keyword Suggestion for Author

Query = Michael I. Jordan

Query = Jiawei Han

Discussions
  • Hitting time effectively boosts infrequent queries
    • Nearest neighbor and personalized PageRank favor frequent queries
  • Fast convergence: a few iterations and a subgraph gets most of the value
  • No parameter to tune
  • Can be generalized to many other tasks (on different graphs)
Ranking on Query Log Graph and Search Tasks
  • Query → Query: query suggestion
  • Url → Url: finding related pages
    • www.cs.jhu.edu/~brill → research.microsoft.com/users/brill
  • IP → IP: finding similar users
  • Url → Query: Annotation, Summarization, ads term
  • Query → Url: Search
  • IP, Query → Url: Personalized Search
  • IP, Query → Query: Personalized Query Suggestion
  • Many other opportunities!
Summary
  • Generate query suggestions using hitting time on query-click graph
  • Personalized query suggestion
  • Generalizable to other search tasks
  • Future work:
    • Different types of graphs: e.g., query sessions
    • Combine with other features
    • Large scale evaluation