Presentation Transcript


  1. INFORMATION RETRIEVAL TECHNIQUES BY DR. ADNAN ABID Lecture # 27: Mean Average Precision, Non-Binary Relevance, DCG, NDCG

  2. ACKNOWLEDGEMENTS The presentation of this lecture has been taken from the following sources • “Introduction to Information Retrieval” by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze • “Managing Gigabytes” by Ian H. Witten, Alistair Moffat, and Timothy C. Bell • “Modern Information Retrieval” by Ricardo Baeza-Yates and Berthier Ribeiro-Neto • “Web Information Retrieval” by Stefano Ceri, Alessandro Bozzon, and Marco Brambilla

  3. Outline • Mean Average Precision • Mean Reciprocal Rank • Cumulative Gain • Discounted Cumulative Gain • Normalized Discounted Cumulative Gain

  4. Mean Average Precision (MAP) • Average Precision: the average of the precision values at the ranks at which each relevant document is retrieved. • Ex1: (1 + 1 + 0.75 + 0.667 + 0.38 + 0)/6 = 0.633 • Ex2: (1 + 0.667 + 0.6 + 0.5 + 0.556 + 0.429)/6 = 0.625 • Mean Average Precision: the average of the average precision values over a set of queries; for these two queries, MAP = (0.633 + 0.625)/2 = 0.629. A sketch of both computations follows.
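A minimal Python sketch of the two definitions above, assuming binary relevance judgments. The names `ranking` (a list of 0/1 flags in rank order) and `num_relevant` (the total number of relevant documents for the query, assumed ≥ 1) are illustrative, not from the lecture.

```python
def average_precision(ranking, num_relevant):
    """Average of the precision values at each relevant document's rank.

    Dividing by num_relevant means a relevant document that is never
    retrieved contributes a precision of zero (see the next slide).
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranking, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / num_relevant


def mean_average_precision(queries):
    """Macro-average of AP over (ranking, num_relevant) pairs."""
    return sum(average_precision(r, n) for r, n in queries) / len(queries)
```

For instance, a query with 2 relevant documents in total, retrieved at ranks 1 and 3, gives average_precision([1, 0, 1, 0], 2) = (1 + 0.667)/2 ≈ 0.833.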

  5. Mean average precision • If a relevant document is never retrieved, its precision value is taken to be zero • MAP is macro-averaging: each query counts equally • Now perhaps the most commonly used measure in research papers • But is it good for web search? • MAP assumes the user is interested in finding many relevant documents for each query • MAP requires many relevance judgments in the text collection

  6. Mean Reciprocal Rank • Consider the rank position, K, of the first relevant doc • It could be the only clicked doc • Reciprocal Rank score = 1/K • MRR is the mean RR across multiple queries
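A matching Python sketch for RR and MRR, under the common convention (assumed here, not stated on the slide) that a query with no relevant document retrieved scores 0.

```python
def reciprocal_rank(ranking):
    """1/K, where K is the rank of the first relevant document."""
    for rank, is_relevant in enumerate(ranking, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0  # assumed convention: no relevant doc retrieved -> 0


def mean_reciprocal_rank(rankings):
    """Mean RR across multiple queries."""
    return sum(reciprocal_rank(r) for r in rankings) / len(rankings)
```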

  7. Non-Binary Relevance • Documents are rarely entirely relevant or non-relevant to a query • Many sources of graded relevance judgments: • Relevance judgments on a 5-point scale • Multiple judges • Click distribution and deviation from expected levels (but click-through != relevance judgments)

  8. Cumulative Gain • With graded relevance judgments, we can compute the gain at each rank. • Cumulative Gain at rank n: CG_n = rel_1 + rel_2 + … + rel_n = \sum_{i=1}^{n} rel_i (where rel_i is the graded relevance of the document at position i)
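As a sketch, CG is a plain sum of the graded judgments down to rank n; `rels` (graded relevance scores in rank order) is an illustrative name.

```python
def cumulative_gain(rels, n):
    """Sum of the graded relevance scores of the top-n results."""
    return sum(rels[:n])


# Example with judgments on a 0-3 scale: CG at rank 4 is 3 + 2 + 3 + 0 = 8.
print(cumulative_gain([3, 2, 3, 0, 1, 2], 4))  # 8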

  9. Discounted Cumulative Gain • Uses graded relevance as a measure of usefulness, or gain, from examining a document • Gain is accumulated starting at the top of the ranking and may be reduced, or discounted, at lower ranks • Typical discount is 1/log(rank) • With base 2, the discount at rank 4 is 1/2, and at rank 8 it is 1/3

  10. Discounting Based on Position • Users care more about high-ranked documents, so we discount results by 1/log_2(rank) • Discounted Cumulative Gain: DCG_n = rel_1 + \sum_{i=2}^{n} \frac{rel_i}{\log_2 i}
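A Python sketch of DCG with the 1/log2(rank) discount from the previous slide; since log2(1) = 0, the document at rank 1 is conventionally left undiscounted, matching the formula above.

```python
import math


def dcg(rels, n):
    """DCG_n = rel_1 + sum over i = 2..n of rel_i / log2(i)."""
    total = 0.0
    for i, rel in enumerate(rels[:n], start=1):
        total += rel if i == 1 else rel / math.log2(i)
    return total
```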

  11. Normalized Discounted Cumulative Gain (NDCG) • To compare DCGs across queries, normalize values so that an ideal ranking has a Normalized DCG of 1.0 • Ideal ranking: the same documents reordered by decreasing graded relevance; its DCG at rank n is written IDCG_n

  12. Normalized Discounted Cumulative Gain (NDCG) • Normalize by the DCG of the ideal ranking: NDCG_n = DCG_n / IDCG_n • NDCG ≤ 1 at all ranks • NDCG is comparable across different queries
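A self-contained Python sketch of NDCG: the ideal DCG is the DCG of the same judgments sorted in decreasing order, so a perfect ranking scores exactly 1.0 and any other ordering scores less.

```python
import math


def dcg(rels, n):
    """DCG_n = rel_1 + sum over i = 2..n of rel_i / log2(i)."""
    return sum(rel if i == 1 else rel / math.log2(i)
               for i, rel in enumerate(rels[:n], start=1))


def ndcg(rels, n):
    """DCG normalized by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(rels, reverse=True), n)
    return dcg(rels, n) / ideal if ideal > 0 else 0.0


# The ideal ordering of [3, 2, 3, 0, 1, 2] is [3, 3, 2, 2, 1, 0];
# any other ordering of the judgments yields ndcg <= 1.0.
print(ndcg([3, 2, 3, 0, 1, 2], 6))
```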
