Learn about the importance of relevance in IR, factors affecting precision and recall, and strategies to improve search performance.
Relevance and Evaluation of IR Performance Dr. Bilal IS 530 Fall 2009
Searching for Information • Imprecise • Incomplete • Tentative • Challenging
Why Are Some Items Not Retrieved? • Indexing errors • Wrong search terms • Wrong database • Language variations • Other (to be answered by students)
Boolean Operators • OR increases recall • AND increases precision • NOT increases precision by elimination
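As a rough illustration of these operator effects, the sketch below treats the documents matching each term as a Python set. The tiny index, its terms, and the document IDs are invented for illustration and do not come from the lecture.

```python
# Hypothetical inverted index: term -> set of matching document IDs (made-up data).
index = {
    "jaguar": {1, 2, 3, 4, 5},
    "cat": {2, 3, 6},
    "car": {4, 5, 7},
}

# OR is set union: the result set can only grow, which tends to raise recall.
jaguar_or_cat = index["jaguar"] | index["cat"]

# AND is set intersection: the result set can only shrink, which tends to raise precision.
jaguar_and_cat = index["jaguar"] & index["cat"]

# NOT is set difference: documents matching the unwanted term are eliminated.
jaguar_not_car = index["jaguar"] - index["car"]

print(sorted(jaguar_or_cat))   # [1, 2, 3, 4, 5, 6]
print(sorted(jaguar_and_cat))  # [2, 3]
print(sorted(jaguar_not_car))  # [1, 2, 3]
```

Whether precision or recall actually improves depends on which of the documents are relevant; the operators only guarantee that the result set grows or shrinks.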
Relevance • A match between a query and information retrieved • A judgment • Can be judged by anyone who is informed of the query and views the retrieved information
Relevance • Judgment is dynamic • Documents can be ranked by likely relevance • In practice, not easy to measure • Not focused on user needs
Pertinence • Based on information need rather than a match between a query and retrieved documents • Can only be judged by user • May differ from relevance judgment
Pertinence • Transient, varies with many factors • Not often used in evaluation • May be used as a measure of satisfaction • User-based, as opposed to relevance
Relevance Judgment • Users base it on: • Topicality • Aboutness • Utility • Novelty • Satisfaction
IR Performance • Recall Ratio = (number of relevant documents retrieved) / (total number of relevant documents)
Recall and Precision in Practice • Tend to be inversely related • Search strategies are designed for high precision, high recall, or a balance of the two • Users' needs dictate whether the search strategy favors recall or precision • With practice, searchers learn to adjust queries to favor recall or precision
Recall and Precision • [Figure: recall plotted against precision, each axis scaled to 1.0]
IR Performance • Precision Ratio = (number of relevant documents retrieved) / (total number of documents retrieved)
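The recall and precision ratios can be computed directly from two sets: the documents a search retrieved and the documents judged relevant. The following is a minimal sketch. The function names, the document IDs, and the choice to return 0.0 when a denominator is empty are assumptions made for illustration.

```python
def recall(retrieved, relevant):
    """Relevant documents retrieved / total number of relevant documents."""
    if not relevant:
        return 0.0  # assumed convention when nothing is relevant
    return len(retrieved & relevant) / len(relevant)


def precision(retrieved, relevant):
    """Relevant documents retrieved / total number of documents retrieved."""
    if not retrieved:
        return 0.0  # assumed convention when nothing is retrieved
    return len(retrieved & relevant) / len(retrieved)


# Made-up example: 8 relevant documents exist; the search returns 5, of which 4 are relevant.
relevant = {1, 2, 3, 4, 5, 6, 7, 8}
retrieved = {1, 2, 3, 4, 9}

print(recall(retrieved, relevant))     # 0.5
print(precision(retrieved, relevant))  # 0.8
```

Both ratios share the same numerator (relevant documents retrieved) and differ only in the denominator.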
High Precision Search • Use these strategies, as appropriate: • Controlled vocabulary • Limit features (e.g., specific fields, major descriptors, dates, language) • AND operator • Proximity operators (use carefully) • Truncation (use carefully)
High Recall Search • Use these strategies, as appropriate: • OR logic • Keyword searching • No or minimal limit to specific field(s) • Truncate • Broader terms
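To make the effect of truncation concrete, the sketch below compares an exact keyword search with a truncated one on a toy collection. The documents, the relevance judgments, and the `search` helper are all hypothetical, and truncation is approximated by simple prefix matching on whitespace-separated words.

```python
# Made-up collection: document ID -> text.
docs = {
    1: "library catalog searching",
    2: "libraries and online catalogs",
    3: "librarian training",
    4: "search engine ranking",
    5: "public library funding",
}

# Assumed relevance judgments for a request about libraries as institutions.
relevant = {1, 2, 5}


def search(stem, truncate=False):
    """Return IDs of documents containing the word, or any word starting with it if truncated."""
    hits = set()
    for doc_id, text in docs.items():
        words = text.split()
        if truncate:
            matched = any(word.startswith(stem) for word in words)
        else:
            matched = stem in words
        if matched:
            hits.add(doc_id)
    return hits


exact = search("library")                # exact keyword match
broad = search("librar", truncate=True)  # truncated form, like librar*

for label, hits in (("exact", exact), ("truncated", broad)):
    found = len(hits & relevant)
    print(label, sorted(hits),
          "recall:", round(found / len(relevant), 2),
          "precision:", round(found / len(hits), 2))
```

In this toy data the truncated search finds every relevant document but also picks up the "librarian" document, so recall rises while precision falls. This is one reason truncation suits high-recall searches and calls for care in high-precision ones.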
Improving IR Performance • Good mediation of search topic before searching • User presence during search, if possible • Preliminary search judged by user • Evaluation during search (by searcher or by searcher and user)
Improving IR Performance • Refinement of search strategies • Searcher to evaluate final results • User to evaluate final results
Improving IR Performance • Better system design • Better indexing and word parsing • Better structure of thesauri • Better user interface (e.g., more effective help feature) • Better error recovery feedback • User-centered design
Relevance in Information Science • Dr. Tefko Saracevic's talk at SIS: “Relevance in information science” • To watch the streaming video and the slide show, go to: • http://mediabeast.ites.utk.edu/mediasite4/Viewer/?peid=fb8f84cb-9f82-499f-b12c-9a56ab5cf5ba