Learn about The Claim Framework and its role in identifying scientific claims in the vast sea of available text in life sciences. Explore how text mining can automate claim identification, bridging the gap between retrieval and synthesis. Discover the potential to uncover implicit connections and streamline access to relevant articles. Join the shift towards a more efficient and accurate scientific discourse.
The Claim Framework
Catherine Blake
School of Library and Information Science
University of Illinois at Urbana-Champaign
clblake@illinois.edu
Motivation
• Relentless increase in electronically available text
• Life Sciences
  • 17 millionth entry added in April 2007
  • 5,200 journals indexed
  • 12,000 new articles each week!
• Chemistry: more than 110,000 articles in 1 year alone
• Consequences:
  • Hundreds of thousands of relevant articles
  • Implicit connections between literature go unnoticed
Shift from Retrieval to Synthesis
The Claim Framework
• Scientists use a shared sublanguage to express claims made in an empirical study
  The Claim Framework captures the key characteristics of the claim sublanguage
• Text mining can be used to populate the Claim Framework automatically
  An automated system will identify all and only the claims that have been identified manually
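One way to read "all and only" in the slide above is as a requirement for perfect recall ("all") and perfect precision ("only") against the manually identified claims. The sketch below illustrates that reading; the function name `precision_recall` and the placeholder claim tuples are assumptions made for illustration, not part of the presented system or its evaluation.

```python
def precision_recall(predicted, gold):
    """Compare automatically extracted claims with manually identified ones.

    Both arguments are iterables of hashable claim representations
    (here: hypothetical (agent, change, object) tuples).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # "only the claims": every predicted claim is also in the manual set
    precision = true_positives / len(predicted) if predicted else 0.0
    # "all the claims": every manual claim was found by the system
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Placeholder data, invented purely to exercise the function.
gold_claims = {
    ("Tamoxifen", "reduces", "breast cancer risk"),
    ("agent A", "increases", "object B"),
}
system_claims = {
    ("Tamoxifen", "reduces", "breast cancer risk"),
    ("agent X", "reduces", "object Y"),
}

precision, recall = precision_recall(system_claims, gold_claims)
```

Under this reading, the hypothesis on the slide holds exactly when both values equal 1.0.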
Claim Definition
• “To assert in the face of possible contradiction”
• Example sentence reporting a claim
  • “This study showed that Tamoxifen reduces the breast cancer risk”
• Explicit Claim in the Claim Framework
  • Tamoxifen (agent)
  • reduces (change)
  • [breast cancer risk] (object)
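The explicit claim above can be pictured as a small record with one field per facet. This is a minimal sketch only: the class name `ExplicitClaim`, the field name `obj`, and the `change_direction` value "decrease" are assumptions chosen for illustration, and the slides do not specify how the framework is actually stored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExplicitClaim:
    """Hypothetical container for the facets of one explicit claim."""
    agent: str                               # what brings about the change
    change: str                              # the change term as written
    obj: str                                 # what is changed ("object" on the slide)
    change_direction: Optional[str] = None   # assumed encoding of the direction

    def as_triple(self):
        """Return the (agent, change, object) facets as a tuple."""
        return (self.agent, self.change, self.obj)


# The example sentence from the slide:
# "This study showed that Tamoxifen reduces the breast cancer risk"
claim = ExplicitClaim(
    agent="Tamoxifen",
    change="reduces",
    obj="breast cancer risk",
    change_direction="decrease",  # assumption: label inferred from "reduces"
)
```

A frozen dataclass is used so that claims are hashable and can be collected in sets when comparing annotators or systems.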
Inter-Annotator Agreement

Information Facet    Kappa   Agreement
Agent                0.71    substantial
Object               0.77    substantial
Change               0.57    moderate
Change+ChangeDir     0.88    almost perfect
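For readers unfamiliar with kappa, the sketch below shows how an agreement score of this kind can be computed and labeled. It assumes Cohen's kappa for two annotators and the Landis and Koch interpretation bands; the slides state neither choice, although the labels in the table are consistent with those bands. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Map a kappa value to the Landis and Koch verbal scale."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Toy labels, invented to exercise the function (not the study's data).
annotator_1 = ["claim", "claim", "none", "none"]
annotator_2 = ["claim", "none", "none", "none"]
toy_kappa = cohen_kappa(annotator_1, annotator_2)
```

Applying `landis_koch_label` to the four kappa values in the table reproduces the Agreement column.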
Interested?
• Send me an email: clblake@illinois.edu
• To see more details on the Claim Framework and an automated approach to populate explicit claims:
  • Blake, C. (2010). Beyond genes, proteins, and abstracts: Identifying scientific claims from full-text biomedical articles. Journal of Biomedical Informatics, 43(2), 173-189.