Query Reformulation: User Relevance Feedback



  1. Query Reformulation: User Relevance Feedback

  2. Introduction
  • Difficulty of formulating user queries
    • users have insufficient knowledge of the collection make-up
    • users have insufficient knowledge of the retrieval environment
  • Query reformulation to improve the user query: two basic methods
    • query expansion: expanding the original query with new terms
    • term reweighting: reweighting the terms in the expanded query

  3. Introduction
  • Approaches for query reformulation
    • user relevance feedback: based on feedback information from the user
    • local analysis: based on information derived from the set of documents initially retrieved (the local set)
    • global analysis: based on global information derived from the whole document collection

  4. User Relevance Feedback
  • User’s role in the URF cycle
    • is presented with a list of the retrieved documents
    • marks the relevant documents
  • Main idea of URF
    • select important terms, or expressions, attached to the documents the user has identified as relevant
    • enhance the importance of these terms in the new query formulation
    • effect: the new query moves towards the relevant documents and away from the non-relevant ones

  5. User Relevance Feedback
  • Advantages of URF
    • it shields the user from the details of the query reformulation process: users only have to provide relevance judgments on documents
    • it breaks the whole searching task down into a sequence of small steps which are easier to grasp
    • it provides a controlled process designed to emphasize relevant terms and de-emphasize non-relevant terms

  6. URF for Vector Model
  • Assumptions
    • the term-weight vectors of the documents identified as relevant to the query have similarities among themselves
    • non-relevant documents have term-weight vectors which are dissimilar from those of the relevant documents
  • Basic idea
    • reformulate the query so that it moves closer to the term-weight vector space of the relevant documents

  7. The Perfect (Vector Model) Query
  • Assume we know which documents are relevant and which are not.
  • Given:
    • a collection of N documents
    • Cr : the set of relevant documents
  • What is the optimal query?
    • q_opt = (1/|Cr|) · Σ_{dj ∈ Cr} dj − (1/(N − |Cr|)) · Σ_{dj ∉ Cr} dj
    • i.e., the centroid of the relevant documents minus the centroid of the non-relevant ones
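The optimal query (the difference of the two centroids) can be sketched in a few lines. NumPy and the function name `optimal_query` are illustrative assumptions, not part of the original slides:

```python
import numpy as np

def optimal_query(docs, relevant_ids):
    """Optimal vector-model query, assuming full relevance knowledge.

    docs: dict mapping doc id -> term-weight vector (np.ndarray)
    relevant_ids: the set Cr of relevant document ids
    """
    cr = [docs[d] for d in docs if d in relevant_ids]
    cn = [docs[d] for d in docs if d not in relevant_ids]
    # centroid of relevant documents minus centroid of non-relevant ones
    return np.mean(cr, axis=0) - np.mean(cn, axis=0)
```

In practice Cr is unknown, which is exactly why the feedback formulas on the next slides approximate it from user judgments.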

  8. Back to Reality
  • Actually, what we are trying to figure out is which documents are relevant and which are not.
  • Our ideal query & definitions:
    • a collection of N documents
    • Cr : the set of relevant documents
    • Dr : the set of retrieved documents the user identified as relevant
    • Dn : the set of retrieved documents identified as non-relevant
    • α, β, γ : tuning constants
  • Modified query (Rocchio):
    • q_m = α·q + (β/|Dr|) · Σ_{dj ∈ Dr} dj − (γ/|Dn|) · Σ_{dj ∈ Dn} dj
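A minimal sketch of the Rocchio reformulation under the assumption that queries and documents are NumPy term-weight vectors; the default α, β, γ values below are common illustrative choices, not prescribed by the slides:

```python
import numpy as np

def rocchio(q, d_rel, d_nonrel, alpha=1.0, beta=0.75, gamma=0.15):
    """Standard Rocchio: q_m = a*q + (b/|Dr|)*sum(Dr) - (g/|Dn|)*sum(Dn).

    q: original query vector
    d_rel / d_nonrel: lists of document vectors judged relevant / non-relevant
    """
    q_mod = alpha * np.asarray(q, dtype=float)
    if d_rel:
        q_mod = q_mod + (beta / len(d_rel)) * np.sum(d_rel, axis=0)
    if d_nonrel:
        q_mod = q_mod - (gamma / len(d_nonrel)) * np.sum(d_nonrel, axis=0)
    # negative term weights are usually clipped to zero
    return np.maximum(q_mod, 0.0)
```

Note the clipping step: a term weight cannot meaningfully be negative in the vector model, so most implementations floor the modified vector at zero.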

  9. Rocchio & Ide Variations
  • Standard Rocchio: q_m = α·q + (β/|Dr|) · Σ_{dj ∈ Dr} dj − (γ/|Dn|) · Σ_{dj ∈ Dn} dj
  • Ide (Regular): q_m = α·q + β · Σ_{dj ∈ Dr} dj − γ · Σ_{dj ∈ Dn} dj
  • Ide (Dec_Hi): q_m = α·q + β · Σ_{dj ∈ Dr} dj − γ · max_non_relevant(dj)
    • where max_non_relevant(dj) is the highest-ranked non-relevant document
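The two Ide variants differ from Rocchio only in dropping the normalization and, for Dec_Hi, in subtracting just the highest-ranked non-relevant document. A sketch under the same NumPy-vector assumption (non-empty judgment sets assumed for brevity):

```python
import numpy as np

def ide_regular(q, d_rel, d_nonrel, alpha=1.0, beta=1.0, gamma=1.0):
    # Ide (Regular): unnormalized sums over all judged documents
    return (alpha * np.asarray(q, dtype=float)
            + beta * np.sum(d_rel, axis=0)
            - gamma * np.sum(d_nonrel, axis=0))

def ide_dec_hi(q, d_rel, d_nonrel_ranked, alpha=1.0, beta=1.0, gamma=1.0):
    # Ide (Dec_Hi): subtract only the highest-ranked non-relevant document,
    # so d_nonrel_ranked must be ordered by the original ranking
    return (alpha * np.asarray(q, dtype=float)
            + beta * np.sum(d_rel, axis=0)
            - gamma * np.asarray(d_nonrel_ranked[0], dtype=float))
```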

  10. Tuning the Feedback
  • Modified query: q_m = α·q + (β/|Dr|) · Σ_{dj ∈ Dr} dj − (γ/|Dn|) · Σ_{dj ∈ Dn} dj
  • How do we set the tuning constants α, β, γ?
    • Rocchio originally set α = 1
    • Ide originally set α = β = γ = 1
  • Positive relevance feedback is often more valuable than negative relevance feedback.
    • this implies β > γ
    • a purely positive feedback mechanism sets γ = 0

  11. URF for Vector Model
  • Includes both query expansion and term reweighting
  • Advantages
    • simplicity: modified term weights are computed directly from the set of retrieved documents
    • good results: the modified query vector does reflect a portion of the intended query semantics
  • Issue: as with all learning techniques, this assumes the information need is relatively static.

  12. Evaluation of Relevance Feedback Strategies
  • A simplistic evaluation is to compare the results of the modified query to those of the original query.
  • This does not work!
    • Results look great, but mostly because documents already returned by the original query are now ranked higher.
    • The user has already seen these documents.

  13. Evaluation of Relevance Feedback Strategies
  • A more realistic evaluation:
    • compute precision and recall on the residual collection (the documents not returned by the original query)
  • Because highly-ranked documents are removed, these results can be worse than for the original query.
    • That is acceptable when comparing relevance feedback approaches against each other.
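Residual-collection evaluation can be sketched as follows; the function and parameter names are hypothetical, chosen only to illustrate the idea of removing already-seen documents before scoring:

```python
def residual_precision(ranked_ids, relevant_ids, seen_ids, k=10):
    """Precision@k on the residual collection.

    ranked_ids: ranking produced by the modified query
    relevant_ids: ids judged relevant
    seen_ids: ids already shown to the user in the first round;
              these are removed before computing precision
    """
    residual = [d for d in ranked_ids if d not in seen_ids]
    top = residual[:k]
    if not top:
        return 0.0
    return sum(1 for d in top if d in relevant_ids) / len(top)
```

Scores computed this way are only meaningful relative to one another: comparing two feedback strategies on the same residual collection is fair, while comparing a residual score against the original query's score is not.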
