
Evaluation of IR Performance


Presentation Transcript


  1. Evaluation of IR Performance Dr. Bilal IS 530 Fall 2006

  2. Searching for Information • Imprecise • Incomplete • Tentative • Challenging

  3. IR Performance Precision Ratio = (the number of relevant documents retrieved) / (the total number of documents retrieved)

  4. IR Performance Recall Ratio = (the number of relevant documents retrieved) / (the total number of relevant documents)
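
To make the two ratios concrete, here is a minimal Python sketch (not part of the original slides) that computes precision and recall from a set of retrieved documents and a set of known relevant documents. The document identifiers and counts are hypothetical.

```python
def precision(retrieved: set, relevant: set) -> float:
    """Relevant documents retrieved / total documents retrieved."""
    if not retrieved:
        return 0.0
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved: set, relevant: set) -> float:
    """Relevant documents retrieved / total relevant documents."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)

# Hypothetical example: 10 documents retrieved, 4 of them relevant,
# out of 8 relevant documents in the whole collection.
retrieved = {f"d{i}" for i in range(1, 11)}  # d1..d10
relevant = {"d2", "d4", "d6", "d8", "d11", "d12", "d13", "d14"}

print(precision(retrieved, relevant))  # 4/10 = 0.4
print(recall(retrieved, relevant))     # 4/8  = 0.5
```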

  5. Why Are Some Items Not Retrieved? • Indexing errors • Wrong search terms • Wrong database • Language variations • Other (to be answered by students)

  6. Why Do We Get Unwanted Items or Results? • Indexing errors • Wrong search terms • Homographs • Incorrect term relations • Other (to be answered by students)

  7. Boolean Operators • OR increases recall • AND increases precision • NOT increases precision by elimination
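
A small sketch of how the Boolean operators shift the balance: ORing terms enlarges the retrieved set (more of the relevant items are found, so recall tends to rise), while ANDing and NOTing terms shrink it (a larger share of what remains is on target, so precision tends to rise). The terms and postings sets below are invented for illustration.

```python
# Hypothetical postings: the documents containing each term.
postings = {
    "dogs":     {"d1", "d2", "d3", "d5", "d8"},
    "canines":  {"d3", "d4", "d9"},
    "training": {"d2", "d3", "d7"},
}

or_set  = postings["dogs"] | postings["canines"]   # dogs OR canines
and_set = or_set & postings["training"]            # ... AND training
not_set = and_set - postings["canines"]            # ... NOT canines

print(sorted(or_set))   # broadest set  -> favors recall
print(sorted(and_set))  # narrower set  -> favors precision
print(sorted(not_set))  # precision by elimination
```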

  8. Recall and Precision in Practice • Inversely related • Search strategies can be designed for high precision, high recall, or a balance of the two • The user's needs dictate whether the strategy favors recall or precision • Practice helps in revising queries to favor recall or precision

  9. Recall and Precision [Figure: recall plotted against precision, each axis running from 0 to 1.0, showing the inverse relationship between the two measures]

  10. Relevance • A match between a query and information retrieved • A judgment • Can be judged by anyone who is informed of the query and views the retrieved information

  11. Relevance • Judgment is dynamic • Documents can be ranked by likely relevance • In practice, not easy to measure • Not focused on user needs

  12. Pertinence • Based on information need rather than a match between a query and retrieved documents • Can only be judged by user • May differ from relevance judgment

  13. Pertinence • Transient, varies with many factors • Not often used in evaluation • May be used as a measure of satisfaction • User-based, as opposed to relevance

  14. High Precision Search • Use these strategies, as appropriate: • Controlled vocabulary • Limit features (e.g., specific fields, major descriptors, date(s), language, as appropriate) • AND operator • Proximity operators, used carefully • Truncation, used carefully

  15. High Recall Search • Use these strategies, as appropriate: • OR logic • Keyword searching • No or minimal limits to specific field(s) • Truncation • Broader terms
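
To illustrate one strategy from these two slides, the sketch below shows how truncation broadens a keyword search: a right-truncated stem such as "librar*" is expanded against the index vocabulary and the matching terms are ORed together, which widens the retrieved set and therefore favors recall. The vocabulary and the query syntax are hypothetical; real systems handle truncation internally.

```python
# Hypothetical index vocabulary.
vocabulary = ["library", "libraries", "librarian", "librarianship", "libretto"]

def expand_truncation(stem: str, vocab: list[str]) -> list[str]:
    """Expand a right-truncated stem (e.g. 'librar*') into matching index terms."""
    prefix = stem.rstrip("*")
    return [term for term in vocab if term.startswith(prefix)]

terms = expand_truncation("librar*", vocabulary)
high_recall_query = " OR ".join(terms)
print(high_recall_query)
# library OR libraries OR librarian OR librarianship
```

Note that truncating too early (e.g., "libr*") would also pull in "libretto", an unrelated term, which is why slide 14 recommends using truncation carefully when precision is the goal.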

  16. Relevance Judgment • Users base it on: • Topicality • Aboutness • Utility • Pertinence • Satisfaction

  17. Improving IR Performance • Good mediation of search topic before searching • User presence during search, if possible • Preliminary search judged by user • Evaluation during search (by searcher or by searcher and user) • Refinement of search strategies • Searcher evaluation of final results • User evaluation of final results

  18. Improving IR Performance • Better system design • Better indexing and word parsing • Better structure of thesauri • Better user interface (e.g., more effective help feature) • Better error recovery feedback • User-centered design
