Mining Term Association Patterns from Search Logs for Effective Query Reformulation

Xuanhui Wang and ChengXiang Zhai

Department of Computer Science

University of Illinois at Urbana-Champaign

ACM CIKM 2008, Oct. 26-30, Napa Valley


Ineffective Queries

reduce space command latex

Effective Queries

squeeze space command latex

More Examples

  • If you want to wash your vehicle

    • “vehicle wash”, “auto wash”

    • “car wash”, “truck wash”

  • If you want to buy a car

    • “auto quotes”

    • “auto sale quotes”?

    • “auto insurance quotes”?

What Makes a Query Ineffective?

  • Vocabulary mismatch (addressed by term substitution)

    • “reduce space command latex” vs “squeeze space command latex”

    • “auto wash” vs “car wash”

  • Lack of discrimination (addressed by term addition)

    • “auto quotes” vs “auto sale quotes”

How can we help improve ineffective queries?

Our Contribution

  • We cast query reformulation as term-level pattern mining from search logs

  • We define two basic types of patterns at term level and propose probabilistic methods

    • Context-sensitive term substitution

      • “autocar | _wash”, “car  auto | _trade”

    • Context-sensitive term addition

      • “+sale | auto_quotes”

  • We evaluate our methods on commercial search engine logs and show their effectiveness

Problem Formulation

Search logs → Query collection → Task 1: Contextual models → Task 2: Translation models

q = auto wash → Task 3: Pattern mining

Patterns:

  auto → car | _wash
  auto → truck | _wash
  +southland | _auto wash
  …

Reformulated queries:

  car wash
  truck wash
  southland auto wash
  …

The pipeline has an offline part and an online part.
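To make the pattern notation concrete, here is a minimal Python sketch of how a substitution pattern (replace one term with another when the neighboring terms match) and an addition pattern (insert a term into a slot between given neighbors) could be applied to a tokenized query. The class names `Substitution` and `Addition`, their fields, and both helper functions are hypothetical and do not come from the slides; matching is simplified to an exact comparison of adjacent tokens.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Substitution:
    """Hypothetical encoding of a pattern such as: auto -> car | _wash."""
    old: str
    new: str
    left: Tuple[str, ...]   # tokens required immediately before the slot
    right: Tuple[str, ...]  # tokens required immediately after the slot


@dataclass(frozen=True)
class Addition:
    """Hypothetical encoding of a pattern such as: +sale | auto_quotes."""
    term: str
    left: Tuple[str, ...]
    right: Tuple[str, ...]


def _context_matches(tokens, start, end, left, right):
    # Compare the tokens adjacent to the slot [start, end) with the pattern context.
    before = tuple(tokens[max(0, start - len(left)):start])
    after = tuple(tokens[end:end + len(right)])
    return before == left and after == right


def apply_substitution(tokens: List[str], p: Substitution) -> Optional[List[str]]:
    """Return the rewritten query, or None if the pattern does not apply."""
    for i, tok in enumerate(tokens):
        if tok == p.old and _context_matches(tokens, i, i + 1, p.left, p.right):
            return tokens[:i] + [p.new] + tokens[i + 1:]
    return None


def apply_addition(tokens: List[str], p: Addition) -> Optional[List[str]]:
    """Insert p.term at the first slot whose neighbors match the pattern context."""
    for i in range(len(tokens) + 1):
        if _context_matches(tokens, i, i, p.left, p.right):
            return tokens[:i] + [p.term] + tokens[i:]
    return None
```

With these definitions, applying `Substitution("auto", "car", (), ("wash",))` to `["auto", "wash"]` gives `["car", "wash"]`, and applying `Addition("southland", (), ("auto", "wash"))` to the same query gives `["southland", "auto", "wash"]`.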

Task 1: Contextual Models

  • Syntagmatic relations

  • Capture terms that frequently co-occur with w inside queries

Sample query collection:

  enterprise car rental
  rental car
  budget car rental
  car pricing
  car pictures
  car accidents
  …

G: general context

  rental: 0.375, enterprise: 0.125, budget: 0.125, pricing: 0.125, …

Model: P_G(* | car)

L1: first left context

  rental: 0.333, enterprise: 0.333, budget: 0.333, …

Model: P_L1(* | car)


R1: first right context

  rental: 0.4, pricing: 0.2, pictures: 0.2, accidents: 0.2, …

Model: P_R1(* | w), with w = car
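The probabilities shown for the general, first-left, and first-right contexts of "car" can be reproduced with plain relative-frequency counts. The sketch below assumes that the sample collection consists of the six queries in `SAMPLE_QUERIES` and that no smoothing is applied; the function name `contextual_model` and the context labels passed as strings are illustrative choices, not part of the original slides.

```python
from collections import Counter
from typing import Dict, List

# Assumed reading of the sample query collection shown on the slide.
SAMPLE_QUERIES = [
    "enterprise car rental",
    "rental car",
    "budget car rental",
    "car pricing",
    "car pictures",
    "car accidents",
]


def contextual_model(queries: List[str], w: str, context: str) -> Dict[str, float]:
    """Estimate P_context(* | w) by relative frequency (no smoothing).

    context is "G" (all other terms in the query), "L1" (the term
    immediately to the left) or "R1" (the term immediately to the right).
    """
    counts: Counter = Counter()
    for query in queries:
        tokens = query.split()
        for i, term in enumerate(tokens):
            if term != w:
                continue
            if context == "G":
                counts.update(tokens[:i] + tokens[i + 1:])
            elif context == "L1":
                if i > 0:
                    counts[tokens[i - 1]] += 1
            elif context == "R1":
                if i + 1 < len(tokens):
                    counts[tokens[i + 1]] += 1
            else:
                raise ValueError(f"unknown context: {context}")
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


general = contextual_model(SAMPLE_QUERIES, "car", "G")   # rental: 3/8
left = contextual_model(SAMPLE_QUERIES, "car", "L1")     # rental: 1/3
right = contextual_model(SAMPLE_QUERIES, "car", "R1")    # rental: 2/5
```

The general context pools eight neighboring tokens, which is why "rental" gets 3/8 = 0.375 there but 1/3 on the left and 2/5 on the right.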


Task 2: Translation Models

  • Paradigmatic relations (“car” and “auto”)

  • Capture terms that are substitutable with w

  • Similar contexts → high translation probability

  • Translation models (the formula is not preserved in this transcript; its annotated components are listed below)

    • Probability of generating s’s context from w’s contextual model

    • Size of L1 context

    • Size of R1 context
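Since the formula itself is missing here, the following Python sketch shows only one plausible way to combine the three annotated ingredients; it should not be read as the authors' exact model. It assumes additive smoothing with a hypothetical parameter `mu`, scores a candidate s by the per-token probability of generating s's L1 and R1 contexts from w's smoothed contextual models, weights the two sides by the sizes of s's L1 and R1 contexts, and normalizes over all candidates. All function names and the toy query list are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def neighbor_counts(queries, offset):
    """Count neighbors at a fixed offset: -1 gives L1 contexts, +1 gives R1."""
    counts = defaultdict(Counter)
    for query in queries:
        tokens = query.split()
        for i, term in enumerate(tokens):
            j = i + offset
            if 0 <= j < len(tokens):
                counts[term][tokens[j]] += 1
    return counts


def generation_prob(ctx_s, ctx_w, vocab_size, mu=0.5):
    """Per-token probability of s's context under w's smoothed model.

    Additive smoothing with mu is an assumption made for this sketch.
    """
    total_w = sum(ctx_w.values())
    n = sum(ctx_s.values())
    log_p = sum(
        c * math.log((ctx_w[t] + mu) / (total_w + mu * vocab_size))
        for t, c in ctx_s.items()
    )
    return math.exp(log_p / n)


def translation_model(w, queries):
    """Return an illustrative t(s | w) over all other terms s."""
    vocab = {t for q in queries for t in q.split()}
    left = neighbor_counts(queries, -1)
    right = neighbor_counts(queries, +1)
    raw = {}
    for s in vocab - {w}:
        n_left, n_right = sum(left[s].values()), sum(right[s].values())
        if n_left + n_right == 0:
            continue  # s never has a neighbor, so there is nothing to compare
        score = 0.0
        if n_left:
            score += n_left * generation_prob(left[s], left[w], len(vocab))
        if n_right:
            score += n_right * generation_prob(right[s], right[w], len(vocab))
        raw[s] = score / (n_left + n_right)  # weight sides by context size
    norm = sum(raw.values())
    return {s: v / norm for s, v in raw.items()}


# Toy collection (invented): "auto" shares its right contexts with "car".
TOY_QUERIES = [
    "car wash", "auto wash", "car rental", "auto rental",
    "car insurance", "wash coupons", "rental deals",
]
scores = translation_model("car", TOY_QUERIES)
```

On this toy collection "auto" receives the highest score for w = "car" because its right-hand neighbors ("wash", "rental") are also right-hand neighbors of "car".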

Task 3.1: Pattern Mining–Term Substitution

q = [w1 … wi-1 wi wi+1 … wn]

Substitute wi by s

q’ = [w1 … wi-1 s wi+1 … wn]

Which word s should be chosen?

  • Global factor: translation model

  • Local factor

Estimating Local Factor

Candidate s fills the slot in [w1 … wi-1 _ wi+1 … wn]

  • Independence assumption

  • Ignore terms that are far away
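A minimal Python sketch of how the two factors might be combined when ranking substitution candidates is given below. It assumes the score is the product of the global factor t(s | wi) and a local factor that, under the independence assumption, multiplies the probabilities of the immediate left and right neighbors given s; restricting the window to one term on each side, the `floor` value for unseen pairs, the function names, and all numbers in the toy tables are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Model = Dict[str, Dict[str, float]]


def local_factor(s: str, tokens: List[str], i: int,
                 left_model: Model, right_model: Model,
                 floor: float = 1e-6) -> float:
    """How well s fits slot i, using only the adjacent terms.

    left_model[s][x] stands for P_L1(x | s), right_model[s][x] for P_R1(x | s).
    Unseen pairs fall back to a small floor value (an assumption).
    """
    score = 1.0
    if i > 0:
        score *= left_model.get(s, {}).get(tokens[i - 1], floor)
    if i + 1 < len(tokens):
        score *= right_model.get(s, {}).get(tokens[i + 1], floor)
    return score


def rank_substitutions(tokens: List[str], i: int, translation: Model,
                       left_model: Model, right_model: Model
                       ) -> List[Tuple[float, List[str]]]:
    """Rank rewrites of tokens[i] by global factor times local factor."""
    scored = []
    for s, t_prob in translation.get(tokens[i], {}).items():
        score = t_prob * local_factor(s, tokens, i, left_model, right_model)
        scored.append((score, tokens[:i] + [s] + tokens[i + 1:]))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy numbers, invented for illustration only.
TRANSLATION = {"auto": {"car": 0.5, "truck": 0.3, "automobile": 0.2}}
RIGHT = {
    "car": {"wash": 0.3, "rental": 0.4},
    "truck": {"wash": 0.1},
    "automobile": {"club": 0.5},
}
ranked = rank_substitutions(["auto", "wash"], 0, TRANSLATION, {}, RIGHT)
```

Here "automobile" has a nonzero translation probability but never precedes "wash" in the toy table, so the local factor pushes it to the bottom; this is the sense in which substitution is context sensitive.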

Task 3.2: Pattern Mining–Term Addition

q = [w1 … wi-1 wi … wn]

Adding r before wi

q’ = [w1 … wi-1 r wi … wn]

  • Uniform

  • Similar to the local factor in term substitution patterns
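Term addition can be sketched in the same style. The code below assumes that "uniform" means every candidate r gets the same prior, so the ranking is driven only by a local factor built from the terms adjacent to the insertion slot; the function name, the `floor` fallback, and the toy probabilities are illustrative assumptions rather than details taken from the slides.

```python
from typing import Dict, List, Tuple

Model = Dict[str, Dict[str, float]]


def rank_additions(tokens: List[str], i: int, candidates: List[str],
                   left_model: Model, right_model: Model,
                   floor: float = 1e-6) -> List[Tuple[float, List[str]]]:
    """Rank candidate terms r for insertion before tokens[i].

    left_model[r][x] stands for P_L1(x | r), right_model[r][x] for P_R1(x | r).
    A uniform prior over candidates is assumed, so it is omitted from the score.
    """
    scored = []
    for r in candidates:
        score = 1.0
        if i > 0:
            score *= left_model.get(r, {}).get(tokens[i - 1], floor)
        if i < len(tokens):
            score *= right_model.get(r, {}).get(tokens[i], floor)
        scored.append((score, tokens[:i] + [r] + tokens[i:]))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy numbers, invented for illustration only.
LEFT = {"sale": {"auto": 0.5}, "insurance": {"auto": 0.4}, "wash": {"auto": 0.6}}
RIGHT = {"sale": {"quotes": 0.5}, "insurance": {"quotes": 0.3}, "wash": {"coupons": 0.2}}
additions = rank_additions(["auto", "quotes"], 1,
                           ["sale", "insurance", "wash"], LEFT, RIGHT)
```

With these toy values the top rewrite is "auto sale quotes", while "wash" is penalized because it is never followed by "quotes".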

Evaluation: Data Preparation

  • From Microsoft Live Labs

  • History logs: 5/1/2006 to 5/20/2006

  • Future logs: 5/20/2006 to 5/31/2006

  • History collection

    • 4.4M queries

    • 1.6M are distinct

  • 1.3M user sessions

  • Used to construct test cases

Examples of Contextual Models

  • Left and Right contexts are different

  • The general context mixes them together

Examples of Translation Models

  • Conceptually similar keywords have high translation probabilities

  • Provide possibility for exploratory search in an interactive manner

Examples of Term Substitution

  • Substitution is context sensitive

  • Intuitively, reworded queries are more effective

Effectiveness Comparison of Term Substitution – Experiment Design

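Session: Q1, Q2, …, Qk

  Q2 results: R21, R22, R23
  Qk results: Rk1, Rk2, Rk3
  Clicked results in the session: C1, C2, C3

How well can a reformulated query rank C1, C2, and C3 at the top?

Reformulations of Q1 and their top-5 results (dx: a result other than C1, C2, C3):

  Q1’: dx, C3, C1, C2, dx    P@5 = 0.6
  Q2’: dx, C1, dx, dx, dx    P@5 = 0.2
  Q3’: dx, C2, dx, C3, dx    P@5 = 0.4

Best P@5 = 0.6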
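The numbers on this slide are consistent with precision at rank 5 computed against the clicked results of the session, taking the best value over the reformulated queries. The short Python sketch below reproduces that arithmetic; the function name `precision_at_k`, the dictionary layout, and the reading of each five-item list as one reformulation's ranking are assumptions made for illustration.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k results that are in the relevant (clicked) set."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


# Clicked results from the session act as the relevant set.
CLICKED = {"C1", "C2", "C3"}

# Top-5 rankings for three reformulations of Q1; "dx" is any other result.
RANKINGS: Dict[str, List[str]] = {
    "Q1'": ["dx", "C3", "C1", "C2", "dx"],
    "Q2'": ["dx", "C1", "dx", "dx", "dx"],
    "Q3'": ["dx", "C2", "dx", "C3", "dx"],
}

p_at_5 = {q: precision_at_k(r, CLICKED) for q, r in RANKINGS.items()}
best_p_at_5 = max(p_at_5.values())
```

Taking the maximum over reformulations rewards a method if at least one of its suggestions ranks the clicked results highly, which matches the "Best" value reported on the slide.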

Results

[Figure: our method vs. [Jones’06], plotted against the number of recommended queries]

Our method reformulates queries more effectively

Term Addition Patterns

Term addition patterns can refine a broad query

Related Work

  • Query suggestions [e.g., Jones’06, Sahami et al’06]

    • Discover patterns at the query level

    • Rely on external resources or training data

    • Do not consider effectiveness

  • Query modifications in IR [Rocchio’71, Anick’03]

    • Expand queries from returned documents

    • Does not rely on search logs, mostly adding terms

  • Related work in NLP community [Lin’98, Rapp’02]

    • Finding synonyms or near-synonyms

    • Syntagmatic and paradigmatic relations

    • Not used for query reformulation

Conclusions and Future Work

  • We propose a new way to mine search logs for patterns to address ineffective queries

    • Vocabulary mismatch

    • Lack of discrimination

  • We define and mine two basic patterns at term level

    • Context-sensitive term substitution patterns

    • Context-sensitive term addition patterns

  • Experiments show the effectiveness of our methods

  • In the future,

    • Use relevance judgments instead of clicks

    • Exploit click information for better query reformulation

Thank You!
