Query Reformulation: User Relevance Feedback


Introduction

  • Difficulty of formulating user queries

    • Users have insufficient knowledge of the collection make-up

    • Users have insufficient knowledge of the retrieval environment

  • Query reformulation to improve user query

    • two basic methods

      • query expansion

        • Expanding the original query with new terms

      • term reweighting

        • Reweighting the terms in the expanded query


Introduction

  • Approaches for query reformulation

    • user relevance feedback

      • based on feedback information from the user

    • local analysis

      • based on information derived from the set of documents initially retrieved (local set)

    • global analysis

      • based on global information derived from the document collection


User Relevance Feedback

  • User’s role in URF cycle

    • is presented with a list of the retrieved documents

    • marks relevant documents

  • Main idea of URF

    • selecting important terms, or expressions, attached to the documents that have been identified as relevant by the user

    • enhancing the importance of these terms in new query formulation

    • effect: the new query will be moved towards the relevant documents and away from the non-relevant ones


User Relevance Feedback

  • Advantages of URF

    • it shields the user from the details of the query reformulation process

      • users only have to provide a relevance judgment on documents

    • it breaks down the whole searching task into a sequence of small steps which are easier to grasp

    • it provides a controlled process designed to emphasize relevant terms and de-emphasize non-relevant terms


URF for Vector Model

  • Assumptions

    • the term-weight vectors of the documents identified as relevant to the query have similarities among themselves.

    • non-relevant documents have term-weight vectors which are dissimilar from the ones for the relevant documents.

  • Basic idea

    • reformulate the query such that it gets closer to the term-weight vector space of the relevant documents


The Perfect (Vector Model) Query

  • Assume we know which documents are relevant and which are not.

  • Given:

    • a collection of N documents

    • Cr : the set of relevant documents

  • What is the optimal query?
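The transcript does not carry the slide's formula, but the standard vector-model answer is that the optimal query is the difference between the centroid of the relevant documents and the centroid of all other documents:

```latex
\vec{q}_{opt} = \frac{1}{|C_r|} \sum_{\vec{d}_j \in C_r} \vec{d}_j
              \;-\; \frac{1}{N - |C_r|} \sum_{\vec{d}_j \notin C_r} \vec{d}_j
```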


Back to Reality

  • Actually, what we are trying to figure out is which documents are relevant and which are not.

  • Our ideal query & definitions:

    • a collection of N documents

    • Cr : the set of relevant documents

    • Dr : set of documents user identified as relevant

    • Dn : set of retrieved documents identified as non-relevant

    • α, β, γ : tuning constants

  • Modified Query (Rocchio):

    q_m = α·q + (β / |Dr|) · Σ_{dj ∈ Dr} dj − (γ / |Dn|) · Σ_{dj ∈ Dn} dj
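A minimal sketch of the Rocchio update over term-weight vectors; the toy vectors and the particular α, β, γ values below are illustrative assumptions, not part of the slides:

```python
def rocchio(q, Dr, Dn, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio query modification: move the query toward the centroid of
    the relevant documents (Dr) and away from the centroid of the
    non-relevant ones (Dn). Vectors are plain lists of term weights."""
    n = len(q)

    def centroid(docs):
        if not docs:
            return [0.0] * n
        return [sum(d[i] for d in docs) / len(docs) for i in range(n)]

    cr, cn = centroid(Dr), centroid(Dn)
    # negative term weights are conventionally clipped to zero
    return [max(0.0, alpha * q[i] + beta * cr[i] - gamma * cn[i])
            for i in range(n)]

# Toy 4-term vocabulary: the query matches term 0; the documents the
# user marked relevant also share term 1, so it gains weight.
q  = [1.0, 0.0, 0.0, 0.0]
Dr = [[0.8, 0.6, 0.0, 0.0], [0.6, 0.8, 0.0, 0.0]]   # marked relevant
Dn = [[0.0, 0.0, 1.0, 0.0]]                          # marked non-relevant
q_m = rocchio(q, Dr, Dn)   # term 1 gains weight; term 2 is clipped to zero
```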


Rocchio & Ide Variations

  • Standard Rocchio:

    q_m = α·q + (β / |Dr|) · Σ_{dj ∈ Dr} dj − (γ / |Dn|) · Σ_{dj ∈ Dn} dj

  • Ide (Regular):

    q_m = α·q + β · Σ_{dj ∈ Dr} dj − γ · Σ_{dj ∈ Dn} dj

  • Ide (Dec_Hi):

    q_m = α·q + β · Σ_{dj ∈ Dr} dj − γ · maxnonrelevant(dj)

  • where maxnonrelevant(dj) is the highest-ranked non-relevant document vector
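A sketch of the Dec_Hi variant, which sums relevant documents without normalisation and subtracts only the highest-ranked non-relevant document; the vectors are illustrative:

```python
def ide_dec_hi(q, Dr, top_nonrel, alpha=1.0, beta=1.0, gamma=1.0):
    """Ide Dec_Hi: add every relevant document vector in full, but
    subtract only the single highest-ranked non-relevant document."""
    n = len(q)
    sum_rel = [sum(d[i] for d in Dr) for i in range(n)]
    # clip negative weights to zero, as in the Rocchio family
    return [max(0.0, alpha * q[i] + beta * sum_rel[i] - gamma * top_nonrel[i])
            for i in range(n)]

# Toy 3-term vocabulary with Ide's original setting alpha = beta = gamma = 1.
q_m = ide_dec_hi([1.0, 0.0, 0.0],
                 [[0.5, 0.5, 0.0]],   # relevant documents
                 [0.0, 0.0, 1.0])     # highest-ranked non-relevant doc
```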


Tuning the Feedback

  • Modified Query (Rocchio): q_m = α·q + (β / |Dr|) · Σ_{dj ∈ Dr} dj − (γ / |Dn|) · Σ_{dj ∈ Dn} dj

  • How do we set the tuning constants α, β, γ?

    • Rocchio originally set α = 1

    • Ide originally set α = β = γ = 1

  • Often, positive relevance feedback is more valuable than negative relevance feedback.

    • this implies: β > γ

    • purely positive feedback mechanism: γ = 0


URF for Vector Model

  • Includes both query expansion and term reweighting

  • Advantages

    • simplicity

      • modified term weights are computed directly from the set of retrieved documents

    • good results

      • modified query vector does reflect a portion of the intended query semantics

  • Issue: As with all learning techniques, this assumes the information need is relatively static.


Evaluation of Relevance Feedback Strategies

  • A simplistic evaluation is to compare the results of the modified query to those of the original query.

    • This does not work!

    • The results look impressive, but mostly because documents already returned by the original query are ranked even higher.

    • The user has already seen these documents.


Evaluation of Relevance Feedback Strategies

  • More realistic evaluation

    • Compute precision and recall on residual collection (those documents not returned by the original query)

    • Because highly-ranked documents are removed, these results can be worse than for the original query.

    • That is okay if we are comparing between relevance feedback approaches.
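A sketch of residual-collection evaluation: documents the user already saw in the first pass are removed before precision and recall are computed. The document IDs are illustrative:

```python
def residual_precision_recall(ranked, relevant, seen):
    """Precision and recall on the residual collection: documents shown
    to the user in the first pass (`seen`) are removed from both the
    second-pass ranking and the relevant set before scoring."""
    residual = [d for d in ranked if d not in seen]
    residual_relevant = set(relevant) - set(seen)
    hits = sum(1 for d in residual if d in residual_relevant)
    precision = hits / len(residual) if residual else 0.0
    recall = hits / len(residual_relevant) if residual_relevant else 0.0
    return precision, recall

# Second-pass ranking after feedback; d1 and d2 were shown in pass one,
# so only d4, d5, d6 count toward the residual scores.
p, r = residual_precision_recall(
    ranked=["d1", "d4", "d5", "d6"],
    relevant={"d1", "d4", "d6"},
    seen={"d1", "d2"},
)
```

Because the easy, highly-ranked documents are excluded, these numbers can be lower than for the original query, which is fine when the goal is only to compare feedback strategies against each other.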

