Presentation Transcript


  1. The TREC Conferences (http://trec.nist.gov), Ellen Voorhees

  2. TREC Philosophy
     • TREC is a modern example of the Cranfield tradition
       • system evaluation based on test collections
     • Emphasis on advancing the state of the art from evaluation results
     • TREC’s primary purpose is not competitive benchmarking
       • experimental workshop: sometimes experiments fail!
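(The “system evaluation based on test collections” bullet above is the core of the Cranfield paradigm: a fixed set of documents, topics, and relevance judgments (“qrels”), against which any system’s ranked output can be scored offline. The Python sketch below illustrates that scoring step with invented topic and document IDs; it is a minimal average-precision / MAP computation, not TREC’s actual trec_eval tool.)

```python
# Minimal sketch of Cranfield-style, test-collection-based scoring.
# All topic and document IDs below are invented for illustration.

def average_precision(ranking, relevant):
    """Average precision for one topic: mean of the precision values
    observed at each rank where a relevant document is retrieved."""
    hits = 0
    precisions = []
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(relevant) if relevant else 0.0

# qrels: topic -> set of documents judged relevant by assessors
qrels = {"T1": {"doc3", "doc7"}, "T2": {"doc2"}}
# run: topic -> ranked list of documents returned by the system under test
run = {"T1": ["doc7", "doc1", "doc3"], "T2": ["doc5", "doc2"]}

ap = {topic: average_precision(run[topic], qrels[topic]) for topic in qrels}
mean_ap = sum(ap.values()) / len(ap)   # MAP over the topic set
print(ap, round(mean_ap, 3))
```

Because the judgments are fixed, the same collection can be reused to score any number of systems after the fact, which is what keeps the methodology relatively inexpensive compared with user studies.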

  3. Cranfield at Fifty
     • Evaluation methodology is still valuable…
       • carefully calibrated level of abstraction
       • has sufficient fidelity to real user tasks to be informative
       • general enough to be broadly applicable, feasible, relatively inexpensive
     • …but is showing some signs of age
       • size is overwhelming our ability to evaluate
       • new abstractions need to carefully accommodate variability to maintain power

  4. Evaluation Difficulties
     • Variability
       • despite stark abstraction, the user effect still dominates Cranfield results
     • Size matters
       • effective pooling has a corpus-size dependency
       • test collection construction costs depend on the number of judgments
     • Model coarseness
       • even slightly different tasks may not be a good fit
       • e.g., legal discovery, video features
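(The pooling referred to under “Size matters” is depth-k pooling: for each topic, the union of the top-k documents from every submitted run is sent to assessors, and documents outside the pool go unjudged. A hedged sketch for a single topic follows; the run rankings and the pool depth are invented.)

```python
# Hedged sketch of depth-k pooling for one topic.
# Each element of `runs` is one participant's ranked list of (invented) document IDs.

def build_pool(runs, depth):
    """Union of the top-`depth` documents from every run; only these are judged."""
    pool = set()
    for ranking in runs:
        pool.update(ranking[:depth])
    return pool

runs = [
    ["d12", "d7", "d3", "d9"],    # run A
    ["d7", "d44", "d3", "d21"],   # run B
    ["d5", "d12", "d44", "d3"],   # run C
]
print(sorted(build_pool(runs, depth=3)))
```

Judging cost grows directly with pool size, and on a much larger corpus a fixed pool depth covers a shrinking fraction of the truly relevant documents; that is the corpus-size dependency the slide points to.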

  5. TREC 2009
     • All tracks used some new, large document set
     • Different trade-offs in adapting evaluation strategy
       • tension between evaluating current participants’ ability to do the task and building reusable test collections
       • variety of tasks that are not simple ranked-list retrieval

  6. ClueWeb09 Document Set
     • Snapshot of the WWW in early 2009
       • crawled by CMU with support from NSF
       • distributed through CMU
       • used in four TREC 2009 tracks: Web, Relevance Feedback, Million Query, and Entity
     • Full corpus
       • about one billion pages and 25 terabytes of text
       • about half is in English
     • Category B
       • English-only subset of about 50 million pages (including Wikipedia) to permit wider participation

  7. TREC 2009 Participants

  8. The TREC Tracks (timeline figure: tracks grouped by theme, 1992-2009)
     • Personal documents: Blog, Spam
     • Retrieval in a domain: Chemical IR, Genomics
     • Answers, not documents: Novelty, QA, Entity
     • Searching corporate repositories: Legal, Enterprise
     • Size, efficiency, & web search: Terabyte, Million Query, Web, VLC
     • Beyond text: Video, Speech, OCR
     • Beyond just English: Cross-language, Chinese, Spanish
     • Human-in-the-loop: Interactive, HARD, Feedback
     • Streamed text: Filtering, Routing
     • Static text: Ad Hoc, Robust

  9. TREC 2010
     • Blog, Chemical IR, Entity, Legal, Relevance Feedback, Web continuing
     • Million Query merged with Web
     • New “Sessions” track: investigate search behavior over a series of queries (series of length 2 for first running in 2010)

  10. TREC 2011
      • Track proposals due Monday (Sept 27)
      • New track on searching free text fields of medical records likely
