STIR: Simultaneous Achievement of high Precision and high Recall through Socio-Technical Information Retrieval. Robert S. Bauer, Teresa Jade (www.H5technologies.com) & Mitchell P. Marcus (www.cis.upenn.edu/~mitch/). June 7, 2007. The e-Discovery IDEAL: High P with High R.
Simultaneous Achievement of high Precision and high Recall through Socio-Technical Information Retrieval
Robert S. Bauer, Teresa Jade, www.H5technologies.com
Mitchell P. Marcus
June 7, 2007
P = 0.8 (or better) @ R = 0.8 (or better)
P = 2/3 (or better) @ R = 2/3 (or better)
High P & Low R = RISK (important docs not retrieved)
Low P & High R = COST (many more documents must be reviewed)
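The P@R targets and the RISK/COST trade-off above can be illustrated with a minimal Python sketch. The function names `precision_recall` and `assess`, the document IDs, and the label "RISK+COST" for the case where both measures fall short are assumptions made for illustration; they do not come from the slides, and the sketch is not the authors' evaluation procedure.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for two collections of document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def assess(precision, recall, target=0.8):
    """Label a result against a P@R target (0.8 or 2/3 on the slides)."""
    if precision >= target and recall >= target:
        return "IDEAL"
    if precision >= target:
        return "RISK"       # high P, low R: important docs not retrieved
    if recall >= target:
        return "COST"       # low P, high R: many more docs must be reviewed
    return "RISK+COST"      # hypothetical label: both measures below target


# Hypothetical example: 5 documents retrieved, 5 truly relevant, 4 in common.
p, r = precision_recall(retrieved=[1, 2, 3, 4, 5], relevant=[1, 2, 3, 4, 6])
print(p, r, assess(p, r))
```

Passing `target=2/3` checks the same result against the looser P = 2/3 @ R = 2/3 goal.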
(from Chapter 3, “Retrieval System Evaluation” by Chris Buckley and Ellen M. Voorhees, in TREC: Experiment and Evaluation in Information Retrieval, Voorhees & Harman, eds., MIT Press, 2005, p. 62, Fig. 3.1)
[Figure labels: Retrieval Acceptable to lowest limit of statistical uncertainty; Recall Improvement]
Sampled Corpus Tests for 12 Topics in case I during STIR Training
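Because these tests rely on sampling, any recall or precision figure is an estimate with statistical uncertainty. The slides do not say how that uncertainty was computed. As one possible illustration, the sketch below uses the Wilson score interval (a standard technique, swapped in here as an assumption) to get a lower bound on a proportion estimated from a sample. The function name `wilson_lower_bound` and the sample counts are hypothetical.

```python
import math


def wilson_lower_bound(successes, n, z=1.96):
    """Lower end of the Wilson score interval for a sampled proportion.

    successes: sampled items that count as hits (e.g. relevant docs retrieved)
    n:         sample size
    z:         normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if n == 0:
        return 0.0
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = p_hat + z * z / (2 * n)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (centre - margin) / denom


# Hypothetical sample: 80 of 100 sampled relevant documents were retrieved.
print(round(wilson_lower_bound(80, 100), 3))
```

With the same observed proportion, a larger sample gives a lower bound closer to the point estimate, which is why sample size matters when checking a result against a recall target.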
Linguistics
Dimensions of e-Discovery: Socio-Technical-IR
State of Affairs