Rutgers Information Interaction Lab at TREC 2005: Trying HARD

N.J. Belkin, M. Cole, J. Gwizdka, Y.-L. Li, J.-J. Liu, G. Muresan, D. Roussinov*, C.A. Smith, A. Taylor, X.-J. Yuan

Rutgers University; *Arizona State University


Our Major Goal

  • Clarification forms (CFs) are simulations of user-system interaction

  • Users are unwilling to engage in explicit interaction unless the payoff is high and the interaction is understood as relevant

  • Is explicit interaction worthwhile, and if so, under what circumstances?


General Approach to the Question

  • Use relatively “standard” interactive elicitation techniques to enhance/disambiguate the original query

  • Compare results to baseline

  • Compare results to baseline plus relatively “standard” non-interactive query enhancement techniques, in particular pseudo-relevance feedback (pseudo-RF)


Methods for Automatic Query Enhancement

  • Pseudo-relevance feedback (standard Lemur; a generic sketch follows this list)

  • Language modeling-based query expansion (“clarity”), derived from the collection

  • Web-based query expansion
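As a rough illustration of the pseudo-RF idea only (not the Lemur implementation actually used for the runs), the Python sketch below expands a query with frequent terms from the top of an initial ranking; `search()` is a hypothetical stand-in for the retrieval engine.

```python
from collections import Counter

def pseudo_relevance_feedback(query_terms, search, k_docs=10, k_terms=5, weight=0.5):
    """Generic pseudo-RF sketch (illustrative only, not Lemur's implementation).

    `search(terms)` is a hypothetical function returning a ranked list of
    documents, each represented as a list of tokens."""
    top_docs = search(query_terms)[:k_docs]          # assume the top documents are relevant
    counts = Counter(t for doc in top_docs for t in doc)
    for t in query_terms:                            # don't re-suggest the original terms
        counts.pop(t, None)
    expansion = [t for t, _ in counts.most_common(k_terms)]
    # Expanded query: original terms at full weight, added terms down-weighted.
    return [(t, 1.0) for t in query_terms] + [(t, weight) for t in expansion]
```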


Methods for User-Based Query Enhancement

  • User selection of terms suggested by “clarity” and web methods (user selection based on Koenemann & Belkin, 1996; Belkin, et al., 2000)

  • Elicitation of extended information problem descriptions (elicitation based on Kelly, Dollu & Fu, 2004; 2005)


Hypotheses for Automatic Enhancement

  • H1: Query expansion using “clarity”-derived terms will improve performance over baseline & baseline + pseudo-RF

  • H2: Query expansion using web-derived terms will improve performance over baseline & baseline + pseudo-RF

  • H2b: Query expansion using both clarity- and web-derived terms will improve performance over baseline & baseline + pseudo-RF


Hypotheses for User-Based Query Enhancement

  • H3: Query expansion with terms selected by the user from those suggested by the clarity and web methods will improve performance over all other runs

  • H4: Query expansion using “problem statements” elicited from users will improve performance over baseline & baseline + pseudo-RF


Hypothesis for When Elicitation is Useful

  • H5: The effectiveness of query expansion using problem statements will be negatively correlated with query clarity.
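Since H5 is a correlational claim, it could be tested per topic by correlating baseline query clarity with the gain obtained from problem-statement expansion. A minimal sketch, with invented topic numbers and scores purely for illustration:

```python
from scipy.stats import spearmanr

# Illustrative values only: per-topic baseline clarity and the change in
# R-precision after expanding with the elicited problem statement.
clarity     = {"303": 1.92, "307": 0.84, "322": 1.31, "344": 2.10, "353": 0.67}
delta_rprec = {"303": -0.02, "307": 0.06, "322": 0.01, "344": -0.04, "353": 0.09}

topics = sorted(clarity)
rho, p = spearmanr([clarity[t] for t in topics], [delta_rprec[t] for t in topics])
# H5 predicts rho < 0: the clearer the baseline query, the smaller the gain.
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")
```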


Query Run Designations

RUTGBL: Baseline query (title + description)

RUTGBF3: Baseline + pseudo-RF (Lemur)

RUTGWS1: Baseline + 0.1(Web-suggested terms)

RUTGLS1: Baseline + 0.1(clarity-suggested terms)

RUTGAS1: Baseline + 0.1(all suggested terms)

RUTGUS1: Baseline + 0.1(terms selected by user)

RUTGUG1: Baseline + 0.1(user-generated terms)

RUTGALL: Baseline + all suggested terms and all user-generated terms

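The 0.1 factor above is a differential weight on the added terms relative to the baseline terms. Purely as an illustration (the exact structured-query syntax used for the official runs is not shown here), a run such as RUTGWS1 could be assembled along these lines, assuming an InQuery-style #wsum operator:

```python
# Illustrative only: building a weighted structured query in the spirit of the
# RUTG*S1 runs (baseline terms at weight 1.0, suggested terms at weight 0.1).
baseline_terms  = ["human", "smuggling", "identify", "incidents"]   # title + description
suggested_terms = ["undocumented", "aliens", "border"]              # web/clarity suggestions

query = ("#wsum( "
         + " ".join(f"1.0 {t}" for t in baseline_terms) + " "
         + " ".join(f"0.1 {t}" for t in suggested_terms) + " )")
print(query)
# #wsum( 1.0 human 1.0 smuggling 1.0 identify 1.0 incidents 0.1 undocumented 0.1 aliens 0.1 border )
```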


Identification of Suggested Terms

  • Clarity: Compute query clarity for the topic baseline (Lemur QueryClarity); rank terms by their contribution to the clarity score; choose the top ten (sketched below)

  • Web: described on the next slides (NBE)
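A minimal sketch of the clarity-based term ranking described above, assuming the query and collection language models are already estimated; this illustrates only the term-selection step, not Lemur's QueryClarity code.

```python
import math

def clarity_top_terms(p_w_query, p_w_collection, n=10):
    """Rank terms by their contribution to the clarity score, i.e. the KL
    divergence between the query language model and the collection model.

    `p_w_query` and `p_w_collection` are hypothetical dicts mapping each
    term to its probability under the respective model."""
    contrib = {
        w: p * math.log2(p / p_w_collection[w])
        for w, p in p_w_query.items()
        if p > 0 and p_w_collection.get(w, 0) > 0
    }
    clarity = sum(contrib.values())                     # the clarity score itself
    top = sorted(contrib, key=contrib.get, reverse=True)[:n]
    return clarity, top                                 # score and the suggested terms
```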


Navigation by Expansion Paradigm (NBE)

[Diagram: example topic (Title: “human smuggling”; Description: “Identify incidents of human smuggling”) with WWW-derived expansion terms: undocumented, aliens, arrested, border, trafficked, haitians]


Navigation by Expansion Paradigm (NBE)

  • Step 1: Overview of the surroundings

    • Produces words and phrases “clearly related” to the topic

    • Internet mining: topic sent to Google

    • Logistic regression on the “signal to noise” ratio (sketched in code after this list):

      • Signal = df(results)/#results

      • Noise = df(web)/#web

      • Pr = 1 - exp(-(signal/noise - 1)/a)

  • Step 2: Valid “moves” identified

    • Related concepts from Step 1 that

      • Are present in AQUAINT

      • Would affect search results if selected: impact estimate = P*df*idf

  • Step 3: Selected moves executed

    • E.g. by query expansion:

      • Score = original score + expansion score * expansion factor
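The three NBE steps can be read as three small formulas. The sketch below restates them in Python; the scaling constant `a`, the clamping of negative probabilities to zero, and the 0.1 default expansion factor are assumptions added for illustration.

```python
import math

def relatedness_probability(df_results, n_results, df_web, n_web, a=100.0):
    """Step 1: how 'clearly related' a candidate term is to the topic,
    estimated from its signal-to-noise ratio.
    `a` is a scaling constant not given on the slide (placeholder value)."""
    signal = df_results / n_results      # term's document frequency among the Google results
    noise  = df_web / n_web              # term's document frequency on the web at large
    ratio  = signal / noise
    # Clamp to 0 when the term is no more frequent in the results than on the
    # web at large (an added assumption; the slide gives only the formula).
    return 1.0 - math.exp(-(ratio - 1.0) / a) if ratio > 1.0 else 0.0

def impact_estimate(p_related, df_aquaint, idf_aquaint):
    """Step 2: estimated impact of a candidate 'move' on the AQUAINT results."""
    return p_related * df_aquaint * idf_aquaint

def expanded_score(original_score, expansion_score, expansion_factor=0.1):
    """Step 3: combine original and expansion scores when a move is executed
    as query expansion (0.1 mirrors the weight used in the RUTG*S1 runs)."""
    return original_score + expansion_score * expansion_factor
```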


“Combination” Run

  • Combining pseudo-RF with user-selected terms from CF1 (run RUTBE)

  • R-Prec. for RUTBE: 0.334

  • Substantially better than all other runs, but not directly comparable, because it used a different ranking function (BM25) and a different differential weighting (0.3 for added terms)

  • Indicative of possible improvements




System Implementation

  • Lemur 3.1, 4.0, 4.1, using StructQueryEval

  • Could we ask for somewhat more detailed documentation from the Lemur group?


Comparison to Other Sites

| Run | R-precision Mean | R-precision SD | MAP Mean | MAP SD | P@10 Mean | P@10 SD |
|---|---|---|---|---|---|---|
| Overall Baseline median | 0.252 | 0.149 | 0.190 | 0.147 | 0.408 | 0.28 |
| RUTGBL | 0.270 | 0.167 | 0.206 | 0.163 | 0.408 | 0.30 |
| Overall Final median | 0.264 | 0.152 | 0.207 | 0.161 | 0.45 | 0.30 |
| RUTGALL | 0.299* | 0.182 | 0.253 | 0.188 | 0.49** | 0.31 |



Summary of Significant Differences, R-Prec.

| Run | R-Prec. | BL | AS1 | LS1 | WS1 | US1 | UG1 | BF3 | ALL |
|---|---|---|---|---|---|---|---|---|---|
| BL | 0.270 | ---- | | | | | | | |
| AS1 | 0.278 | * | ---- | | | | | | |
| LS1 | 0.279 | * | n/s | ---- | | | | | |
| WS1 | 0.281 | * | n/s | n/s | ---- | | | | |
| US1 | 0.282 | * | n/s | n/s | n/s | ---- | | | |
| UG1 | 0.286 | * | n/s | n/s | n/s | n/s | ---- | | |
| BF3 | 0.287 | n/s | n/s | n/s | n/s | n/s | n/s | ---- | |
| ALL | 0.299 | n/s | n/s | n/s | n/s | n/s | n/s | n/s | ---- |

(R-Prec. column: mean R-precision of each run; * = significant difference; n/s = not significant)


Varying Weights of Baseline Terms w.r.t. CF2 Terms


Varying Weights of CF2 Terms w.r.t. Baseline Terms


CF2 & Baseline Terms, Equal Weights

| Run name | R-Precision Mean | R-Precision SD | P@10 Mean | P@10 SD | MAP Mean | MAP SD |
|---|---|---|---|---|---|---|
| RUTGBL | 0.270 | 0.167 | 0.408 | 0.3 | 0.206 | 0.16 |
| Q1 | 0.290 | 0.178 | 0.498* | 0.325 | 0.236 | 0.183 |
| Q2 | 0.274 | 0.181 | 0.474* | 0.321 | 0.223 | 0.181 |
| Q3 | 0.295 | 0.164 | 0.498** | 0.303 | 0.237** | 0.175 |
| Q1Q2 | 0.298* | 0.182 | 0.514** | 0.326 | 0.248** | 0.190 |
| Q1Q3 | 0.313* | 0.176 | 0.538*** | 0.314 | 0.263** | 0.186 |
| Q1Q2Q3 | 0.314** | 0.179 | 0.564*** | 0.304 | 0.268** | 0.190 |


Results w.r.t. Hypotheses

  • H1, H2, H3, H4 weakly supported with respect to the baseline, but not with respect to pseudo-RF

  • H5 not supported

    • No correlation between baseline query clarity and the effectiveness of expanding with CF2 terms


Discussion (1)

  • Both automatic and user-based query enhancement improved performance over baseline, but not over pseudo-rf

  • No significant differences in performance between any of the enhancement methods, except Q1 vs. Q1Q3 (R-precision 0.290 vs. 0.313)


Discussion (2)

  • Some benefit both from automatic methods and from explicit interaction with the user, which requires effort from the user beyond the initial query formulation

  • This interpretation of the results depends on the assumption that title+description queries are accurate simulations of user behavior


(Tentative) Conclusions

  • Results indicate that invoking user interaction for query clarification is unlikely to be cost-effective

  • An alternative might be to develop ways to encourage more elaborate query formulation in the first instance, enhanced with automatic methods.

  • Subsequent enhancement could be via implicit sources of evidence, rather than explicit questioning, requiring no additional effort from the user.

