Mining Term Association Patterns from Search Logs for Effective Query Reformulation

Xuanhui Wang and ChengXiang Zhai

Department of Computer Science

University of Illinois at Urbana-Champaign

ACM CIKM 2008, Oct. 26-30, Napa Valley

Ineffective Queries

reduce space command latex

Effective Queries

squeeze space command latex

More Examples
  • If you want to wash your vehicle
    • “vehicle wash”, “auto wash”
    • “car wash”, “truck wash”
  • If you want to buy a car
    • “auto quotes”
    • “auto sale quotes”?
    • “auto insurance quotes”?

What Makes a Query Ineffective?
  • Vocabulary mismatch → term substitution
    • “reduce space command latex” vs “squeeze space command latex”
    • “auto wash” vs “car wash”
  • Lack of discrimination → term addition
    • “auto quotes” vs “auto sale quotes”

How can we help improve ineffective queries?

Our Contribution
  • We cast query reformulation as term-level pattern mining from search logs
  • We define two basic types of patterns at term level and propose probabilistic methods
    • Context-sensitive term substitution
      • “auto → car | _wash”, “car → auto | _trade”
    • Context-sensitive term addition
      • “+sale | auto_quotes”
  • We evaluate our methods on commercial search engine logs and show their effectiveness

Problem Formulation

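[Diagram: system overview, split into an offline part and an online part]
  • Components: search logs, query collection, Task 1: contextual models, Task 2: translation models, Task 3: pattern mining
  • Input query: q = auto wash
  • Patterns: auto → car | _wash, auto → truck | _wash, +southland | _auto wash, …
  • Reformulated queries: car wash, truck wash, southland auto wash, …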
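To make the pattern notation concrete, here is a minimal sketch of how a mined pattern could be applied to a query. It assumes that "_" marks the slot, that the terms written on either side of "_" must appear immediately next to the slot, and that a substitution pattern replaces the term in the slot while an addition pattern inserts a term there. The `Pattern` class and `apply_pattern` function are hypothetical names for illustration, not part of the presented system.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Pattern:
    """A term-level pattern (hypothetical representation)."""
    kind: str                     # "sub" = substitution, "add" = addition
    term: str                     # replacement term s, or added term r
    left: Tuple[str, ...]         # context terms immediately left of the slot
    right: Tuple[str, ...]        # context terms immediately right of the slot
    target: Optional[str] = None  # term being replaced (substitution only)


def apply_pattern(query: List[str], p: Pattern) -> List[List[str]]:
    """Return every reformulation of `query` that pattern `p` produces."""
    results = []
    n, nl, nr = len(query), len(p.left), len(p.right)
    width = 1 if p.kind == "sub" else 0  # the slot covers one term or none
    for i in range(nl, n - nr - width + 1):
        if tuple(query[i - nl:i]) != p.left:
            continue
        if tuple(query[i + width:i + width + nr]) != p.right:
            continue
        if p.kind == "sub" and query[i] != p.target:
            continue
        results.append(query[:i] + [p.term] + query[i + width:])
    return results


q = ["auto", "wash"]
# "auto → car | _wash"
print(apply_pattern(q, Pattern("sub", "car", (), ("wash",), "auto")))
# "+southland | _auto wash"
print(apply_pattern(q, Pattern("add", "southland", (), ("auto", "wash"))))
```

The same function handles “+sale | auto_quotes” by setting `left=("auto",)` and `right=("quotes",)`.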

Task 1: Contextual Models
  • Syntagmatic relations
  • Capture terms that frequently co-occur with w inside queries

Sample query collection: enterprise car rental, rental car, budget car rental, car pricing, car pictures, car accidents, …

G: General context

Model P_G( * | car): rental: 0.375, enterprise: 0.125, budget: 0.125, pricing: 0.125, …
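The numbers on this slide can be reproduced with a plain relative-frequency estimate over the six listed queries. The sketch below assumes exactly that (no smoothing, and only the six queries shown, although the slide's list ends with "…"); the function name is illustrative.

```python
from collections import Counter

QUERIES = [
    "enterprise car rental", "rental car", "budget car rental",
    "car pricing", "car pictures", "car accidents",
]


def general_context_model(queries, w):
    """Estimate P_G(. | w) as the relative frequency of terms that
    co-occur with w anywhere in the same query (unsmoothed assumption)."""
    counts = Counter()
    for q in queries:
        terms = q.split()
        if w in terms:
            counts.update(t for t in terms if t != w)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


model = general_context_model(QUERIES, "car")
print(model["rental"])      # 0.375
print(model["enterprise"])  # 0.125
```

"rental" co-occurs with "car" three times out of eight co-occurring term instances, which gives 0.375.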

Task 1: Contextual Models

  • Syntagmatic relations
  • Capture terms that frequently co-occur with w inside queries

Sample query collection: enterprise car rental, rental car, budget car rental, car pricing, car pictures, car accidents, …

L1: 1st left context

Model P_L1( * | car): rental: 0.333, enterprise: 0.333, budget: 0.333, …

Task 1: Contextual Models

  • Syntagmatic relations
  • Capture terms that frequently co-occur with w inside queries

Sample query collection: enterprise car rental, rental car, budget car rental, car pricing, car pictures, car accidents, …

R1: 1st right context

Model P_R1( * | w), shown for w = car: rental: 0.4, pricing: 0.2, pictures: 0.2, accidents: 0.2, …
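The position-specific contexts differ from the general context only in which neighbors are counted. A minimal sketch, assuming unsmoothed relative frequencies and a signed `offset` parameter (an illustrative convention: -1 for L1, +1 for R1), reproduces the L1 and R1 numbers from the six sample queries:

```python
from collections import Counter

QUERIES = [
    "enterprise car rental", "rental car", "budget car rental",
    "car pricing", "car pictures", "car accidents",
]


def positional_context_model(queries, w, offset):
    """Estimate P(. | w) over terms found `offset` positions from w.
    offset=-1 is the 1st left context (L1), offset=+1 the 1st right (R1)."""
    counts = Counter()
    for q in queries:
        terms = q.split()
        for i, t in enumerate(terms):
            j = i + offset
            if t == w and 0 <= j < len(terms):
                counts[terms[j]] += 1
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


left = positional_context_model(QUERIES, "car", -1)
right = positional_context_model(QUERIES, "car", +1)
print(sorted(left))     # ['budget', 'enterprise', 'rental']
print(right["rental"])  # 0.4
```

Each of the three left neighbors appears once, so each gets probability 1/3, while "rental" follows "car" in two of the five queries that have a right neighbor.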

Task 2: Translation Models
  • Paradigmatic relations (“car” and “auto”)
  • Capture terms that are substitutable with w
  • Similar contexts → high translation probability
  • Translation models

[Equation not preserved in this transcript. Its annotated parts: the probability of generating s’s context from w’s contextual model, the size of the L1 context, and the size of the R1 context]
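Since the slide's formula did not survive, the following is only one plausible reading of its annotations, not the authors' exact model. It assumes that, for each of the L1 and R1 contexts, a candidate s is scored by the average log-probability of its context terms under w's additively smoothed context model, that these scores are exponentiated and normalized over the candidates, and that the two contexts are mixed in proportion to the sizes of w's L1 and R1 contexts. The smoothing constant `mu`, the toy log, and all function names are assumptions.

```python
import math
from collections import Counter


def context_counts(queries, w, offset):
    """Count the terms found `offset` positions from w (-1 = L1, +1 = R1)."""
    counts = Counter()
    for q in queries:
        terms = q.split()
        for i, t in enumerate(terms):
            j = i + offset
            if t == w and 0 <= j < len(terms):
                counts[terms[j]] += 1
    return counts


def log_generation_score(s_counts, w_counts, vocab_size, mu=0.1):
    """Average log-probability of s's context terms under w's smoothed model."""
    s_total = sum(s_counts.values())
    if s_total == 0:
        return float("-inf")  # s has no context of this type
    w_total = sum(w_counts.values())
    score = 0.0
    for term, c in s_counts.items():
        p = (w_counts[term] + mu) / (w_total + mu * vocab_size)
        score += (c / s_total) * math.log(p)
    return score


def translation_probs(queries, w, candidates):
    """Mix per-context scores, weighted by the sizes of w's L1 and R1 contexts."""
    vocab_size = len({t for q in queries for t in q.split()})
    w_ctx = {off: context_counts(queries, w, off) for off in (-1, +1)}
    sizes = {off: sum(c.values()) for off, c in w_ctx.items()}
    total_size = sum(sizes.values())
    probs = {s: 0.0 for s in candidates}
    if total_size == 0:
        return probs
    for off in (-1, +1):
        raw = {
            s: math.exp(log_generation_score(
                context_counts(queries, s, off), w_ctx[off], vocab_size))
            for s in candidates
        }
        z = sum(raw.values())
        if z == 0:
            continue
        for s in candidates:
            probs[s] += (sizes[off] / total_size) * raw[s] / z
    return probs


# Hypothetical toy log, not data from the presentation.
LOG = ["car wash", "auto wash", "car rental", "auto rental",
       "car insurance", "pizza delivery", "pizza coupons"]
print(translation_probs(LOG, "auto", ["car", "pizza"]))
```

On this toy log "car" shares the right-context terms "wash" and "rental" with "auto", so it receives most of the probability mass, while "pizza" shares none.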

Task 3.1: Pattern Mining–Term Substitution

q = [w_1 … w_{i-1} w_i w_{i+1} … w_n]

Substitute w_i by s

q’ = [w_1 … w_{i-1} s w_{i+1} … w_n]

Which word s should be chosen?
  • Global factor: translation model
  • Local factor

Estimating Local Factor

Candidate s fills the slot in: w_1 … w_{i-1} _ w_{i+1} … w_n
  • Independence
  • Ignore those terms far away
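One way to combine the two factors is sketched below. It assumes the score of a candidate s is the product of a global factor, the translation probability t(s | w_i), and a local factor that multiplies, independently, the probability of each context term within k positions of the slot under s's position-specific context model. The additive smoothing, the window size `k`, the translation probabilities, and the toy log are all illustrative assumptions.

```python
from collections import Counter


def context_prob(queries, s, offset, term, vocab_size, mu=0.1):
    """Smoothed probability that `term` appears `offset` positions from s."""
    counts = Counter()
    for q in queries:
        terms = q.split()
        for i, t in enumerate(terms):
            j = i + offset
            if t == s and 0 <= j < len(terms):
                counts[terms[j]] += 1
    total = sum(counts.values())
    return (counts[term] + mu) / (total + mu * vocab_size)


def substitution_scores(queries, query, i, translation, k=1):
    """Score each candidate s for replacing query[i].
    `translation` maps s to an assumed global factor t(s | query[i])."""
    vocab_size = len({t for q in queries for t in q.split()})
    scores = {}
    for s, t_prob in translation.items():
        local = 1.0
        for d in range(1, k + 1):  # terms farther than k are ignored
            if i - d >= 0:
                local *= context_prob(queries, s, -d, query[i - d], vocab_size)
            if i + d < len(query):
                local *= context_prob(queries, s, +d, query[i + d], vocab_size)
        scores[s] = t_prob * local
    return scores


# Hypothetical toy log and translation probabilities.
LOG = ["car wash", "car wash", "car rental",
       "truck rental", "truck rental", "truck wash"]
T = {"car": 0.6, "truck": 0.4}
print(substitution_scores(LOG, ["auto", "wash"], 0, T))
print(substitution_scores(LOG, ["auto", "rental"], 0, T))
```

With the same global factor, "car" scores higher before "wash" and "truck" scores higher before "rental", which illustrates why the chosen substitute depends on the context.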

Task 3.2: Pattern Mining–Term Addition

q = [w_1 … w_{i-1} w_i … w_n]

Uniform

Adding r before w_i

q’ = [w_1 … w_{i-1} r w_i … w_n]

Similar to the Local Factor in Term Substitution Patterns
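A term addition score can be sketched along the same lines. The version below assumes that the uniform factor contributes nothing to the ranking, so a candidate r inserted before w_i is scored only by how well its immediate neighbors fit r's L1 and R1 context models. The smoothing, the restriction to immediate neighbors, the candidate list, and the toy log are assumptions made for illustration.

```python
from collections import Counter


def neighbor_prob(queries, r, offset, term, vocab_size, mu=0.1):
    """Smoothed probability that `term` appears `offset` positions from r."""
    counts = Counter()
    for q in queries:
        terms = q.split()
        for i, t in enumerate(terms):
            j = i + offset
            if t == r and 0 <= j < len(terms):
                counts[terms[j]] += 1
    total = sum(counts.values())
    return (counts[term] + mu) / (total + mu * vocab_size)


def addition_scores(queries, query, i, candidates):
    """Score each candidate r for insertion before query[i]."""
    vocab_size = len({t for q in queries for t in q.split()})
    scores = {}
    for r in candidates:
        score = 1.0  # uniform factor: identical for every candidate
        if i - 1 >= 0:       # the term that would sit left of r
            score *= neighbor_prob(queries, r, -1, query[i - 1], vocab_size)
        if i < len(query):   # the term that would sit right of r
            score *= neighbor_prob(queries, r, +1, query[i], vocab_size)
        scores[r] = score
    return scores


# Hypothetical toy log, not data from the presentation.
LOG = ["auto sale quotes", "auto sale quotes", "auto insurance quotes",
       "auto insurance rates", "home insurance quotes"]
scores = addition_scores(LOG, ["auto", "quotes"], 1, ["sale", "insurance"])
print(max(scores, key=scores.get))  # sale
```

On this toy log the top candidate corresponds to the pattern “+sale | auto_quotes”, because "sale" always appears between "auto" and "quotes" while "insurance" also appears in other contexts.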

Evaluation: Data Preparation

  • From Microsoft Live Labs
  • Timeline: history logs (5/1/2006 to 5/20/2006), future logs (5/20/2006 to 5/31/2006)
  • History collection: 4.4M queries, of which 1.6M are distinct
  • 1.3M user sessions, used to construct test cases

Examples of Contextual Models
  • Left and Right contexts are different
  • General context mixes them together

Examples of Translation Models
  • Conceptually similar keywords have high translation probabilities
  • Make interactive exploratory search possible

Examples of Term Substitution
  • Substitution is context sensitive
  • Intuitively, reworded queries are more effective

Effectiveness Comparison of Term Substitution – Experiment Design

Session: queries Q1, Q2, …, Qk; results R21, R22, R23, …, Rk1, Rk2, Rk3; clicked documents C1, C2, C3

How well can a reformulated query rank C1, C2, and C3 at the top?

Q1 is reformulated into Q1’, Q2’, and Q3’ (dx = any other document):

  • Q1’: dx, C3, C1, C2, dx (P@5 = 0.6)
  • Q2’: dx, C1, dx, dx, dx (P@5 = 0.2)
  • Q3’: dx, C2, dx, C3, dx (P@5 = 0.4)

Best P@5 = 0.6
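The P@5 values in this example follow directly from counting clicked documents in each top-5 list. The sketch below assumes the three result lists belong to Q1’, Q2’, and Q3’ in the order they appear on the slide; the identifiers d1 to d9 are placeholders for the slide's unnamed "dx" documents.

```python
def precision_at_k(ranked, relevant, k=5):
    """Fraction of the top-k ranked documents that are in `relevant`."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


clicked = {"C1", "C2", "C3"}  # clicks from the session act as relevance labels
results = {
    "Q1'": ["d1", "C3", "C1", "C2", "d2"],
    "Q2'": ["d3", "C1", "d4", "d5", "d6"],
    "Q3'": ["d7", "C2", "d8", "C3", "d9"],
}

scores = {q: precision_at_k(docs, clicked) for q, docs in results.items()}
best = max(scores.values())
print(scores)  # {"Q1'": 0.6, "Q2'": 0.2, "Q3'": 0.4}
print(best)    # 0.6
```

Taking the best P@5 over the recommended reformulations measures whether at least one suggestion ranks the clicked documents well.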

Results

[Chart: our method compared with [Jones’06]; axis label: #Recommended Queries]

Our method reformulates queries more effectively

Term Addition Patterns

Term addition patterns can refine a broad query

Related Work
  • Query suggestions [e.g., Jones’06, Sahami et al.’06]
    • Discover patterns at the query level
    • Rely on external resources or training data
    • Do not consider effectiveness
  • Query modifications in IR [Rocchio’71, Anick’03]
    • Expand queries from returned documents
    • Do not rely on search logs; mostly add terms
  • Related work in the NLP community [Lin’98, Rapp’02]
    • Find synonyms or near-synonyms
    • Syntagmatic and paradigmatic relations
    • Not used for query reformulation

Conclusions and Future Work
  • We propose a new way to mine search logs for patterns to address ineffective queries
    • Vocabulary mismatch
    • Lack of discrimination
  • We define and mine two basic patterns at term level
    • Context-sensitive term substitution patterns
    • Context-sensitive term addition patterns
  • Experiments show the effectiveness of our methods
  • In the future,
    • Use relevance judgments instead of clicks
    • Exploit click information for better query reformulation


Thank You!
