
Mining Term Association Patterns from Search Logs for Effective Query Reformulation

Xuanhui Wang and ChengXiang Zhai

Department of Computer Science

University of Illinois at Urbana-Champaign

ACM CIKM 2008, Oct. 26-30, Napa Valley


Ineffective Queries

reduce space command latex



Effective Queries

squeeze space command latex



More Examples

  • If you want to wash your vehicle

    • “vehicle wash”, “auto wash”

    • “car wash”, “truck wash”

  • If you want to buy a car

    • “auto quotes”

    • “auto sale quotes”?

    • “auto insurance quotes”?



What Makes a Query Ineffective?

  • Vocabulary mismatch (addressed by term substitution)

    • “reduce space command latex” vs “squeeze space command latex”

    • “auto wash” vs “car wash”

  • Lack of discrimination (addressed by term addition)

    • “auto quotes” vs “auto sale quotes”

How can we help improve ineffective queries?



Our Contribution

  • We cast query reformulation as term-level pattern mining from search logs

  • We define two basic types of patterns at term level and propose probabilistic methods

    • Context-sensitive term substitution

      • “auto → car | _wash”, “car → auto | _trade”

    • Context-sensitive term addition

      • “+sale | auto_quotes”

  • We evaluate our methods on commercial search engine logs and show their effectiveness



Problem Formulation

  • Offline part (from search logs)

    • Query collection

    • Task 1: Contextual models

    • Task 2: Translation models

    • Task 3: Pattern mining

  • Online part

    • Input query: q = auto wash

    • Patterns: “auto → car | _wash”, “auto → truck | _wash”, “+southland | _auto wash”, …

    • Reformulated queries: “car wash”, “truck wash”, “southland auto wash”, …



Task 1: Contextual Models

  • Syntagmatic relations

  • Capture terms that frequently co-occur with w inside queries

Sample query collection

  • enterprise car rental

  • rental car

  • budget car rental

  • car pricing

  • car pictures

  • car accidents

  • …

G: General context

Model: P_G(* | car)

  • rental: 0.375

  • enterprise: 0.125

  • budget: 0.125

  • pricing: 0.125

  • …
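The general context model can be sketched as a plain relative-frequency count over the six sample queries shown on the slide. This is a minimal illustration, assuming an unsmoothed maximum-likelihood estimate; the function name `general_context_model` is hypothetical, and the slide's trailing "…" is not filled in.

```python
from collections import Counter

# The six sample queries visible on the slide (the elided "…" is left out).
QUERIES = [
    "enterprise car rental",
    "rental car",
    "budget car rental",
    "car pricing",
    "car pictures",
    "car accidents",
]

def general_context_model(queries, w):
    """Estimate P_G(c | w): relative frequency of terms that share a query with w."""
    counts = Counter()
    for query in queries:
        terms = query.split()
        if w in terms:
            # Count every other term in the query, regardless of position.
            counts.update(t for t in terms if t != w)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}

model = general_context_model(QUERIES, "car")
```

With these six queries, "rental" co-occurs with "car" three times out of eight co-occurrences, which matches the 0.375 on the slide.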



Task 1: Contextual Models


L1: 1st left context

Model: P_L1(* | car)

  • rental: 0.333

  • enterprise: 0.333

  • budget: 0.333

  • …



Task 1: Contextual Models


R1: 1st right context

Model: P_R1(* | w), shown for w = car

  • rental: 0.4

  • pricing: 0.2

  • pictures: 0.2

  • accidents: 0.2

  • …
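The position-specific L1 and R1 models can be sketched the same way, counting only the term immediately to the left or right of w. This is an illustrative sketch assuming unsmoothed relative frequencies over the six sample queries from the slides; `positional_context_model` and its `offset` parameter are hypothetical names.

```python
from collections import Counter

# The six sample queries visible on the slides (the elided "…" is left out).
QUERIES = [
    "enterprise car rental",
    "rental car",
    "budget car rental",
    "car pricing",
    "car pictures",
    "car accidents",
]

def positional_context_model(queries, w, offset):
    """Estimate P(c | w) for the term at a fixed offset from w.

    offset=-1 gives the L1 (first left) context, offset=+1 the R1 (first right) context.
    """
    counts = Counter()
    for query in queries:
        terms = query.split()
        for i, term in enumerate(terms):
            j = i + offset
            if term == w and 0 <= j < len(terms):
                counts[terms[j]] += 1
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}

left1 = positional_context_model(QUERIES, "car", -1)
right1 = positional_context_model(QUERIES, "car", +1)
```

On this collection the left neighbours of "car" are "enterprise", "rental", and "budget" (one third each), while "rental" accounts for two of the five right neighbours, in line with the values on the slides.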



Task 2: Translation Models

  • Paradigmatic relations (“car” and “auto”)

  • Capture terms that are substitutable with w

  • Similar contexts → high translation probability

  • Translation models

    • Probability of generating s’s context from w’s contextual model

    • Size of L1 context

    • Size of R1 context
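The translation formula itself did not survive in this transcript, so the following is only a rough sketch of the idea in the bullets, not the authors' exact model. It assumes additive smoothing with a hypothetical parameter `mu`, scores a candidate s by the per-term likelihood of its L1 and R1 context terms under w's smoothed contextual models (so each side is weighted by its context size), and normalizes over the candidates. The toy query collection and all function names are invented for illustration.

```python
import math
from collections import Counter

# Toy query collection (hypothetical, not from the slides).
QUERIES = [
    "car wash", "car rental", "cheap car rental",
    "auto wash", "auto rental", "cheap auto insurance",
    "pizza delivery", "cheap pizza",
]

def context_counts(queries, w, offset):
    """Count the terms found at a fixed offset from w (-1 = L1, +1 = R1)."""
    counts = Counter()
    for query in queries:
        terms = query.split()
        for i, term in enumerate(terms):
            j = i + offset
            if term == w and 0 <= j < len(terms):
                counts[terms[j]] += 1
    return counts

def log_likelihood(s_counts, w_counts, vocab_size, mu=0.1):
    """Log-probability of generating s's context terms from w's smoothed model."""
    w_total = sum(w_counts.values())
    return sum(
        n * math.log((w_counts[c] + mu) / (w_total + mu * vocab_size))
        for c, n in s_counts.items()
    )

def translation_probs(queries, w, candidates):
    """Assumed form of t(s | w): per-term context likelihood, normalized over candidates."""
    vocab_size = len({t for q in queries for t in q.split()})
    scores = {}
    for s in candidates:
        total_ll, total_size = 0.0, 0
        for offset in (-1, +1):  # L1 and R1 contexts
            s_counts = context_counts(queries, s, offset)
            w_counts = context_counts(queries, w, offset)
            total_ll += log_likelihood(s_counts, w_counts, vocab_size)
            total_size += sum(s_counts.values())  # size of this context
        if total_size:
            scores[s] = math.exp(total_ll / total_size)
    norm = sum(scores.values())
    return {s: v / norm for s, v in scores.items()}

probs = translation_probs(QUERIES, "car", ["auto", "pizza"])
```

In this toy collection "auto" shares the neighbours "wash", "rental", and "cheap" with "car", so it receives a higher translation probability than "pizza".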



Task 3.1: Pattern Mining–Term Substitution

q = [w_1 … w_{i-1} w_i w_{i+1} … w_n]

Substitute w_i by s

q’ = [w_1 … w_{i-1} s w_{i+1} … w_n]

Which word s should be chosen?

  • Global factor: translation model

  • Local factor



Estimating Local Factor

Candidate s for the slot in w_1 … w_{i-1} _ w_{i+1} … w_n

  • Independence

  • Ignore those terms far away
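One way to read the substitution scoring is as a product of a global factor and a local factor. The sketch below is an assumption-laden illustration, not the authors' formula: the translation probabilities in `TRANSLATION` are made up, the positional models use additive smoothing with a hypothetical `mu`, context terms are treated as independent, and only terms within a window of `k` positions are used. The toy queries and function names are invented for the example.

```python
from collections import Counter

# Toy query collection and translation probabilities (both hypothetical).
QUERIES = [
    "car wash", "car wash", "car rental",
    "truck wash", "truck rental", "truck rental",
    "auto trade",
]
TRANSLATION = {("auto", "car"): 0.6, ("auto", "truck"): 0.4}  # assumed t(s | w)

def positional_prob(queries, s, offset, c, vocab_size, mu=0.1):
    """Smoothed probability that term c appears at `offset` from s."""
    counts = Counter()
    for query in queries:
        terms = query.split()
        for i, term in enumerate(terms):
            j = i + offset
            if term == s and 0 <= j < len(terms):
                counts[terms[j]] += 1
    return (counts[c] + mu) / (sum(counts.values()) + mu * vocab_size)

def substitution_score(queries, terms, i, s, translation, k=2):
    """Global factor t(s | w_i) times the local fit of s to the nearby context."""
    vocab_size = len({t for q in queries for t in q.split()})
    score = translation.get((terms[i], s), 0.0)  # global factor
    for j in range(1, k + 1):  # ignore terms more than k positions away
        if i - j >= 0:
            score *= positional_prob(queries, s, -j, terms[i - j], vocab_size)
        if i + j < len(terms):
            score *= positional_prob(queries, s, +j, terms[i + j], vocab_size)
    return score

query = "auto wash".split()
scores = {s: substitution_score(QUERIES, query, 0, s, TRANSLATION) for s in ("car", "truck")}
best = max(scores, key=scores.get)
```

Here "car" wins over "truck" for the slot before "wash" because it has both the larger assumed translation probability and the better local fit in the toy logs.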



Task 3.2: Pattern Mining–Term Addition

q = [w_1 … w_{i-1} w_i … w_n]

Adding r before w_i

q’ = [w_1 … w_{i-1} r w_i … w_n]

  • Uniform

  • Similar to the local factor in term substitution patterns
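Term addition can be sketched along the same lines, with the uniform factor dropped and only the local fit of the added term r scored. This is a hedged illustration under assumptions: additive smoothing with a hypothetical `mu`, independent context terms, a window of `k` positions, and a toy query collection invented for the example.

```python
from collections import Counter

# Toy query collection (hypothetical, not from the slides).
QUERIES = [
    "auto sale quotes", "auto sale quotes",
    "auto insurance quotes", "insurance prices",
]

def positional_prob(queries, r, offset, c, vocab_size, mu=0.1):
    """Smoothed probability that term c appears at `offset` from r."""
    counts = Counter()
    for query in queries:
        terms = query.split()
        for i, term in enumerate(terms):
            j = i + offset
            if term == r and 0 <= j < len(terms):
                counts[terms[j]] += 1
    return (counts[c] + mu) / (sum(counts.values()) + mu * vocab_size)

def addition_score(queries, terms, i, r, k=2):
    """Local fit of inserting r before terms[i]; the uniform factor is omitted."""
    vocab_size = len({t for q in queries for t in q.split()})
    score = 1.0
    for j in range(1, k + 1):
        if i - j >= 0:  # j-th term to the left of the new slot
            score *= positional_prob(queries, r, -j, terms[i - j], vocab_size)
        if i + j - 1 < len(terms):  # j-th term to the right of the new slot
            score *= positional_prob(queries, r, +j, terms[i + j - 1], vocab_size)
    return score

query = "auto quotes".split()
scores = {r: addition_score(QUERIES, query, 1, r) for r in ("sale", "insurance")}
best = max(scores, key=scores.get)
```

With this toy collection the pattern “+sale | auto_quotes” scores highest, since "sale" is most often seen between "auto" and "quotes".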



Evaluation: Data Preparation

Timeline: 5/1/2006 | History logs | 5/20/2006 | Future logs | 5/31/2006

  • From Microsoft Live Labs

  • History collection

    • 4.4M queries

    • 1.6M are distinct

    • 1.3M user sessions

  • Future logs: used to construct test cases



Examples of Contextual Models

  • Left and Right contexts are different

  • General context mixes them together



Examples of Translation Models

  • Conceptually similar keywords have high translation probabilities

  • Provide the possibility of exploratory search in an interactive manner



Examples of Term Substitution

  • Substitution is context sensitive

  • Intuitively, reworded queries are more effective



Effectiveness Comparison of Term Substitution – Experiment Design

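Session: Q1, Q2, …, Qk

  • Results of Q2: R21, R22, R23

  • Results of Qk: Rk1, Rk2, Rk3

  • Clicked results: C1, C2, C3

How well can a reformulated query rank C1, C2, and C3 on the top?

Reformulation: Q1 → Q1’, Q2’, Q3’

Top-5 results of each reformulated query (dx = a document other than C1, C2, C3)

  • Q1’: dx, C3, C1, C2, dx (P@5 = 0.6)

  • Q2’: dx, C1, dx, dx, dx (P@5 = 0.2)

  • Q3’: dx, C2, dx, C3, dx (P@5 = 0.4)

Best P@5 = 0.6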
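The P@5 values in this design can be checked with a few lines of code. The sketch below assumes the ranked items on the slide read as three top-5 lists, one per reformulated query, and that C1, C2, and C3 are the items counted as hits; `precision_at_k` is a hypothetical helper name.

```python
def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked items that belong to the relevant set."""
    return sum(1 for d in ranked[:k] if d in relevant) / k

CLICKED = {"C1", "C2", "C3"}

# Top-5 lists as read from the slide; "dx" stands for any other document.
RANKINGS = {
    "Q1'": ["dx", "C3", "C1", "C2", "dx"],
    "Q2'": ["dx", "C1", "dx", "dx", "dx"],
    "Q3'": ["dx", "C2", "dx", "C3", "dx"],
}

scores = {q: precision_at_k(docs, CLICKED) for q, docs in RANKINGS.items()}
best = max(scores.values())
```

Taking the maximum over the reformulated queries gives the best P@5 that the slide reports for this example.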



Results

[Figure: our method compared with [Jones’06]; x-axis: number of recommended queries]

Our method reformulates queries more effectively



Term Addition Patterns

Term addition patterns can refine a broad query



Related Work

  • Query suggestions [e.g., Jones’06, Sahami et al’06]

    • Discover patterns at query level

    • Rely on external resources or training data

    • Do not consider effectiveness

  • Query modifications in IR [Rocchio’71, Anick’03]

    • Expand queries from returned documents

    • Do not rely on search logs; mostly add terms

  • Related work in NLP community [Lin’98, Rapp’02]

    • Finding synonyms or near-synonyms

    • Syntagmatic and paradigmatic relations

    • Not used for query reformulation



Conclusions and Future Work

  • We propose a new way to mine search logs for patterns to address ineffective queries

    • Vocabulary mismatch

    • Lack of discrimination

  • We define and mine two basic patterns at term level

    • Context-sensitive term substitution patterns

    • Context-sensitive term addition patterns

  • Experiments show the effectiveness of our methods

  • In the future,

    • Use relevance judgments instead of clicks

    • Exploit click information for better query reformulation



Thank You!
