Finding Informative Sentences in Full-text Journal Articles



  1. Finding Informative Sentences in Full-text Journal Articles

  2. Introduction
     • “Informative”: make assertions about a gene’s function
     • Examples:
       • Positive: The in vivo interaction between CIPK23 and CBL1 or CBL9 was confirmed using BiFC assays as shown in Figure 6F. [PMID: 16814720]
       • Negative: We do not yet know how these protein complexes activate or inhibit the kaiBC promoter. [PMID: 12441347]

  3. Motivation
     • Information overload
       • Double-exponential growth of peer-reviewed literature
       • Breakdown of disciplinary boundaries
       [Hunter and Cohen, Mol Cell. Mar 2006]
     • Identifying informative sentences can:
       • Provide a simple mechanism for aggregating gene function information
       • Provide evidence sentences for database annotation
       • Provide a basis for generating gene summaries

  4. Related Work
     • Gene Reference Into Function (GeneRIF) annotations in the Entrez Gene database
     • Two problems:
       • Many Entrez genes have no GeneRIFs
       • GeneRIFs were mostly pulled from abstracts rather than the body of the article

  5. System and Method
     Input: biomedical full-text articles
     I. HTML Parsing: stripping off HTML tags
     II. Document Zoning: filtering certain sections, e.g. materials and methods
     III. Sentence Selection: scoring each sentence according to its:
        1. keywords of interest [user specific]
        2. location
        3. mentions of gene/protein names
        4. summary-indicative cue words
        5. mentions of experimental methods
        6. relation with figures/tables
     Example sentence: The in vivo interaction between CIPK23 and CBL1 or CBL9 was confirmed using BiFC assays as shown in Figure 6F. [PMID: 16814720]
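
The three stages above can be sketched in a few lines of Python. This is only an illustrative sketch, not the system described in the slides: the cue-word and method-term lists, the equal feature weights, the "first or last sentence" location rule, the naive sentence splitter, and all function names are assumptions made for the example. Input is assumed to arrive as (section title, HTML body) pairs.

```python
import re
from html.parser import HTMLParser

# Hypothetical resources; the slides do not list the actual cue words or methods.
SKIPPED_SECTIONS = {"materials and methods"}
CUE_WORDS = {"confirmed", "demonstrated", "suggest", "conclude"}
METHOD_TERMS = {"bifc", "western blot", "yeast two-hybrid", "pcr"}
FIGURE_REF = re.compile(r"\b(?:Figure|Fig\.|Table)\s*\d+", re.IGNORECASE)
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")  # naive splitter, breaks on "Fig. 1"


class TextExtractor(HTMLParser):
    """Stage I: collect text content and drop all HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_html(html):
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def score_sentence(sentence, position, total, gene_names, keywords):
    """Stage III: add one point per matched feature (equal weights assumed)."""
    lower = sentence.lower()
    score = 0.0
    score += sum(1 for k in keywords if k.lower() in lower)   # 1. keywords of interest
    if position == 0 or position == total - 1:                # 2. location (assumed rule)
        score += 1
    score += sum(1 for g in gene_names if g in sentence)      # 3. gene/protein names
    score += sum(1 for c in CUE_WORDS if c in lower)          # 4. cue words
    score += sum(1 for m in METHOD_TERMS if m in lower)       # 5. experimental methods
    if FIGURE_REF.search(sentence):                           # 6. figure/table reference
        score += 1
    return score


def select_sentences(sections, gene_names, keywords, top_n=3):
    """Run all three stages and return the top_n (score, sentence) pairs."""
    ranked = []
    for title, html in sections:
        # Stage II: document zoning, skip sections such as materials and methods.
        if title.strip().lower() in SKIPPED_SECTIONS:
            continue
        text = strip_html(html)
        sentences = [s.strip() for s in SENTENCE_END.split(text) if s.strip()]
        for i, sentence in enumerate(sentences):
            score = score_sentence(sentence, i, len(sentences), gene_names, keywords)
            ranked.append((score, sentence))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked[:top_n]
```

Calling `select_sentences` with the gene names `CIPK23`, `CBL1`, `CBL9` and the keyword `interaction` would rank the positive example from the Introduction above the negative one, since it matches gene names, a cue word, a method term, and a figure reference. A real system would likely replace the substring matching with a gene-name recognizer and tune the weights per feature.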

  6. Two Applications
     • Finding more GeneRIFs for Entrez genes (Lu et al., Pac Symp Biocomput, 2006)
       • 20% more accurate than other methods
       • Predicted GeneRIFs for over 8,000 human genes
     • Finding sentences about protein-protein interaction (BioCreative, 2006)
       • An international competition with 11 participating teams
       • Finding key sentences for IntAct and MINT database curators
