
Discovering Key Concepts in Verbose Queries

Michael Bendersky and W. Bruce Croft, University of Massachusetts. SIGIR 2008.


Presentation Transcript


  1. Discovering Key Concepts in Verbose Queries Michael Bendersky and W. Bruce Croft University of Massachusetts SIGIR 2008

  2. Objective • “Discovering Key Concepts in Verbose Queries”

  3. Objective • “Discovering Key Concepts in Verbose Queries” • <num> Number 829 <title> Spanish Civil War support <desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

  5. Objective • “Discovering Key Concepts in Verbose Queries” • Use of key concepts?

  6. Objective • “Discovering Key Concepts in Verbose Queries” • Use of key concepts? • Combine with current IR model

  7. Retrieval Model • Conventional Language Model: score(q,d) = p(q|d) = ∏_{w∈q} p(w|d)

  8. Retrieval Model • Conventional Language Model: score(q,d) = p(q|d) = ∏_{w∈q} p(w|d) • New Model: score(q,d) = p(q|d) = ∑_{c_i∈q} p(q|c_i,d) p(c_i|d) ≈ ∑_{c_i∈q} p(c_i|q) p(c_i|d)
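
The conventional query-likelihood model on this slide can be sketched in a few lines. The Dirichlet smoothing, the mu=2000 default, and all names below are illustrative assumptions, not the paper's exact implementation; the sketch assumes every query term occurs somewhere in the collection.

```python
import math

def lm_score(query_terms, doc_counts, doc_len, coll_counts, coll_len, mu=2000):
    """Query-likelihood log p(q|d) with Dirichlet smoothing (assumed choice).

    Assumes each query term has a nonzero collection count, so the
    smoothed probability never hits zero.
    """
    score = 0.0
    for w in query_terms:
        p_coll = coll_counts.get(w, 0) / coll_len   # background model p(w|C)
        p_w = (doc_counts.get(w, 0) + mu * p_coll) / (doc_len + mu)
        score += math.log(p_w)
    return score
```

A document containing the query terms scores higher than one that does not, which is all the ranking needs.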

  9. Final Retrieval Function score(q,d) = λ·p(q|d) + (1-λ)·∑_{c_i∈q} p(c_i|q) p(c_i|d)

  10. Final Retrieval Function score(q,d) = λ·p(q|d) + (1-λ)·∑_{c_i∈q} p(c_i|q) p(c_i|d) • Language Model: the λ·p(q|d) term

  11. Final Retrieval Function score(q,d) = λ·p(q|d) + (1-λ)·∑_{c_i∈q} p(c_i|q) p(c_i|d) • Key Concepts: the (1-λ)·∑_{c_i∈q} p(c_i|q) p(c_i|d) term
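
The interpolation of the two terms can be sketched as below, assuming a Dirichlet-smoothed likelihood for both the full query and each concept. The helper name, smoothing choice, and data layout are assumptions for illustration, not the authors' code.

```python
import math

def smoothed_logp(terms, doc_counts, doc_len, coll_counts, coll_len, mu=2000):
    # Dirichlet-smoothed log-likelihood of a term sequence under the
    # document model (assumed smoothing; terms assumed present in collection).
    lp = 0.0
    for w in terms:
        p_coll = coll_counts.get(w, 0) / coll_len
        lp += math.log((doc_counts.get(w, 0) + mu * p_coll) / (doc_len + mu))
    return lp

def score(query_terms, weighted_concepts, doc_counts, doc_len,
          coll_counts, coll_len, lam=0.8):
    # lam * language-model term + (1 - lam) * key-concept term.
    # weighted_concepts is a list of (concept_terms, p_ci_q) pairs.
    lm = smoothed_logp(query_terms, doc_counts, doc_len, coll_counts, coll_len)
    kc = sum(p_ci_q * smoothed_logp(c_terms, doc_counts, doc_len,
                                    coll_counts, coll_len)
             for c_terms, p_ci_q in weighted_concepts)
    return lam * lm + (1 - lam) * kc
```

With λ = 0.8 the language-model term dominates; the key-concept term nudges up documents that match the highly weighted concepts.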

  12. What is a Concept? • Noun phrase in a query

  13. What is a Concept? • Noun phrase in a query • <num> Number 829 <title> Spanish Civil War support <desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War

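
Extracting noun-phrase concepts from a query can be sketched with a toy chunker over already POS-tagged input. The grammar below (runs of adjectives and nouns) is a stand-in assumption; the paper does not prescribe this exact chunking rule.

```python
def extract_concepts(tagged):
    """Collect maximal adjective/noun runs from a POS-tagged query.

    `tagged` is a list of (word, Penn-Treebank-tag) pairs; this toy rule
    treats any run of JJ/NN*/NNP tokens as one noun-phrase concept.
    """
    concepts, chunk = [], []
    for word, tag in tagged:
        if tag.startswith('NN') or tag == 'JJ':
            chunk.append(word)
        else:
            if chunk:
                concepts.append(' '.join(chunk))
            chunk = []
    if chunk:
        concepts.append(' '.join(chunk))
    return concepts
```

On the example topic's description, this yields chunks such as "material international support" as candidate concepts.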
  15. Finding ‘Key’ Concepts • Rank concepts by p(c_i|q)

  16. Finding ‘Key’ Concepts • Rank concepts by p(c_i|q) • Compute p(c_i|q) by within-query frequency? • <num> Number 829 <title> Spanish Civil War support <desc> Provide information on all kinds of material international support provided to either side in the Spanish Civil War • Each concept occurs only once in the query, so frequency alone cannot discriminate
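
A tiny illustration of why within-query frequency fails on TREC description queries; the candidate concepts are taken from the example topic above.

```python
# The <desc> text of topic 829 and some candidate noun-phrase concepts.
query = ("Provide information on all kinds of material international support "
         "provided to either side in the Spanish Civil War")
candidate_concepts = ["information", "material international support",
                      "Spanish Civil War"]

# Estimating p(c_i|q) from within-query frequency: every candidate occurs
# exactly once, so the estimate is uniform and cannot single out a key concept.
freq = {c: query.count(c) for c in candidate_concepts}
```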

  17. Finding ‘Key’ Concepts • Approximate p(c_i|q) by machine learning • h(c_i) is c_i's query-independent importance score, taken from the confidence output of an AdaBoost.M1 classifier • p(c_i|q) = h(c_i) / ∑_{c_j∈q} h(c_j)
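
The normalization on this slide is a one-liner, assuming the classifier confidences h(c_i) are non-negative and not all zero:

```python
def concept_probs(confidences):
    """Turn query-independent importance scores h(c_i), e.g. AdaBoost.M1
    confidence values (assumed non-negative), into a distribution
    p(c_i|q) = h(c_i) / sum_j h(c_j) over the query's concepts."""
    total = sum(confidences.values())
    return {c: h / total for c, h in confidences.items()}
```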

  18. Features of a Concept • is_cap : the concept is capitalized • tf : term frequency in the corpus • idf : inverse document frequency in the corpus • ridf : residual idf; idf modified by a Poisson model • wig : weighted information gain; the change in entropy from the corpus to the retrieved documents • g_tf : Google term frequency • qp : number of times the concept appears as part of a query in the MSN Live query log • qe : number of times the concept appears as an exact query in the MSN Live query log
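
The ridf feature can be sketched as follows, assuming the standard residual-IDF definition: observed idf minus the idf that a Poisson scatter of the term's collection frequency would predict. Bursty, topical terms exceed the Poisson prediction and get a high ridf. Function names and counts are illustrative.

```python
import math

def idf(df, n_docs):
    # Inverse document frequency from document frequency df.
    return math.log(n_docs / df)

def ridf(df, cf, n_docs):
    """Residual IDF (assumed standard definition).

    A Poisson model that scatters the term's cf occurrences uniformly
    over n_docs documents predicts df' = n_docs * (1 - e^(-cf/n_docs));
    ridf is the gap between observed and predicted idf.
    """
    expected_df = n_docs * (1 - math.exp(-cf / n_docs))
    return idf(df, n_docs) - idf(expected_df, n_docs)
```

A term concentrated in few documents (df far below the Poisson prediction) scores well above a term spread evenly.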

  19. TREC Corpus

  20. Exp 1: Identifying Key Concept • Cross-validation on the corpus • Each fold has 50 queries • Check whether the top-ranked concept is a key concept • One key concept per query is assumed during annotation
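
The evaluation criterion of this experiment (does the top-ranked concept match the annotated key concept?) can be sketched as below; the data layout is an assumption.

```python
def top_concept_accuracy(queries):
    """Fraction of queries whose highest-ranked concept equals the
    annotated key concept (one key concept assumed per query).

    `queries` is a list of (ranked_concepts, key_concept) pairs.
    """
    hits = sum(1 for ranked, key in queries if ranked[0] == key)
    return hits / len(queries)
```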

  21. Exp 1: Identifying Key Concept

  22. Exp 1: Identifying Key Concept • Better than idf ranking

  23. Exp 2: Information Retrieval score(q,d) = λ·p(q|d) + (1-λ)·∑_{c_i∈q} p(c_i|q) p(c_i|d) • Use only the top 2 concepts for each query • q is the entire <desc> section • λ = 0.8

  24. Exp 2: Information Retrieval • KeyConcept[2]<desc> : the authors' method • SeqDep<desc> : sequential dependence baseline that includes all adjacent bigrams of the query
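
The bigram expansion used by the SeqDep baseline can be sketched as below; this shows only the adjacent-pair generation, not the full sequential-dependence retrieval model.

```python
def query_bigrams(terms):
    """All adjacent word pairs of the query, used (alongside single
    terms) by a sequential-dependence baseline."""
    return [(terms[i], terms[i + 1]) for i in range(len(terms) - 1)]
```

For a long <desc> query this produces one feature per adjacent pair, with no attempt to pick out the few pairs that form key concepts.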

  25. Exp 2: Information Retrieval

  26. What to take home? • Singling out key concepts improves retrieval
