
Beespace Prototype Design Meeting: Entity Recognition



Beespace Prototype Design Meeting: Entity Recognition. Jing Jiang, 09/28/2005. Entity Recognition in Prototype V1. Target entities: gene names. Supervised learning: LingPipe (word trigram and tag bigram model). Training data: BioCreative (manually annotated).

Presentation Transcript

Beespace Prototype Design Meeting: Entity Recognition

Jing Jiang

09/28/2005


Entity Recognition in Prototype V1

  • Target entities: gene names

  • Supervised learning: LingPipe (word trigram and tag bigram model)

  • Training data:

    • BioCreative (manually annotated)

    • Drosophila (generated from gene lists)
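
One way training data could be "generated from gene lists" is to label every occurrence of a listed name in raw text. The sketch below is only an illustration under stated assumptions: whitespace tokenization, case-insensitive longest-match lookup, and BIO tags. The helper name `tag_with_gene_list` and the sample gene list are hypothetical; the slides do not say how the Drosophila data was actually produced.

```python
def tag_with_gene_list(tokens, gene_names):
    """Label tokens with BIO tags by longest match against a gene list.

    Hypothetical helper for illustration; not the procedure used in the prototype.
    """
    # Store each gene name as a tuple of lowercased tokens.
    entries = {tuple(name.lower().split()) for name in gene_names}
    max_len = max((len(e) for e in entries), default=0)
    lowered = [t.lower() for t in tokens]
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        match_len = 0
        # Try the longest candidate span first so multi-word names win.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i:i + n]) in entries:
                match_len = n
                break
        if match_len:
            tags[i] = "B-GENE"
            for j in range(i + 1, i + match_len):
                tags[j] = "I-GENE"
            i += match_len
        else:
            i += 1
    return tags


# Assumed sample gene list and sentence, for illustration only.
tokens = "Variation in npr-1 changes feeding behaviour".split()
gene_list = ["npr-1", "glutathione S-transferase"]
print(list(zip(tokens, tag_with_gene_list(tokens, gene_list))))
```

Data labeled this way is noisy, since a listed name may also occur as an ordinary word, which is one reason to compare it against manually annotated data such as BioCreative.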


Sample Results

  • http://sifaka.cs.uiuc.edu/jiang4/Beespace


Performance

  • Some gene names without explicit mention of “gene” can be captured

    • E.g., “glutathione S-transferase”

  • Problems

    • False positives on gene-like phrases, e.g., “China 2”, “13.8”

    • Mismatch of gene name boundaries and noun phrase boundaries, e.g., “nicotinic” in “nicotinic pathway”
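
Errors such as “13.8” and “China 2” could in principle be reduced by a post-filter over the tagger's candidates. The following is a minimal sketch, assuming two hand-written rules: reject purely numeric strings, and reject candidates whose first word is on a small block list. The function name `is_plausible_gene` and the block list are hypothetical and not part of the prototype; such rules would not fix the boundary mismatch problem.

```python
import re

# Strings made only of digits, optionally with a decimal part, e.g. "13.8".
NUMERIC = re.compile(r"\d+(\.\d+)?")


def is_plausible_gene(candidate, blocked_first_words=frozenset({"china"})):
    """Heuristic post-filter; both rules are illustrative assumptions."""
    text = candidate.strip()
    # Rule 1: empty or purely numeric candidates are rejected.
    if not text or NUMERIC.fullmatch(text):
        return False
    # Rule 2: reject candidates that start with a blocked (non-gene) word.
    first_word = text.split()[0].lower()
    return first_word not in blocked_first_words


for candidate in ["13.8", "China 2", "glutathione S-transferase", "npr-1"]:
    print(candidate, is_plausible_gene(candidate))
```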


V2 -- Entity Types

  • Annotation guideline for BioCreative

    • Guideline for Beespace?

    • Ontology? (GENIA ontology)

  • What to tag?

    • Genes and proteins

    • Family of genes

    • Gene descriptions

  • Entity boundaries and noun phrase boundaries

    • Tag only noun phrases that refer to genes or tag any occurrence of a gene name inside a noun phrase?
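
The two boundary options in the question above can be made concrete with token spans. This is a sketch only: it assumes gene-name occurrences and noun phrases are already available as half-open `(start, end)` token spans, and that each noun phrase carries a flag saying whether it refers to a gene. The function `spans_to_tag` and the policy names are hypothetical.

```python
def spans_to_tag(gene_mentions, noun_phrases, policy):
    """Return the spans to annotate under one of two illustrative policies.

    gene_mentions: list of (start, end) spans where a gene name occurs.
    noun_phrases:  list of (start, end, refers_to_gene) spans.
    """
    if policy == "any_occurrence":
        # Tag every gene name, even inside a phrase about something else.
        return sorted(gene_mentions)
    if policy == "noun_phrase_only":
        # Tag only whole noun phrases that themselves refer to a gene.
        return sorted((s, e) for s, e, refers in noun_phrases if refers)
    raise ValueError(f"unknown policy: {policy}")


# "The involvement of nicotinic pathways ..." with "nicotinic" at token 3
# and the noun phrase "nicotinic pathways" covering tokens 3-4.
mentions = [(3, 4)]
phrases = [(3, 5, False)]  # the phrase refers to a pathway, not a gene
print(spans_to_tag(mentions, phrases, "any_occurrence"))
print(spans_to_tag(mentions, phrases, "noun_phrase_only"))
```

Under the first policy “nicotinic” is tagged; under the second nothing is tagged, because the enclosing noun phrase does not refer to a gene.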


Sample Sentences

  • A dose-dependent transactivation of human hARE-mediated chloramphenicol acetyltransferase (cat) gene expression was observed upon treatments of the Hepa-1 transfectants with TPA, a known inducer, as well as with CAPE.

  • In the present study, we identified its preferred binding sequence as 5'-CCCTATCGATCG-ATCTCTACCT-3' and characterized its DNA-binding properties using truncated Mblk-1 mutants.


Sample Sentences (cont.)

  • At least two kinds of nicotinic receptors seem to be involved in honeybee memory, an alpha-bungarotoxin-sensitive and an alpha-bungarotoxin-insensitive receptor.

  • The involvement of nicotinic pathways in memory formation and retrieval processes was tested by injecting…


Sample Sentences (cont.)

  • We report the cloning of a honeybee CSP gene called ASP3c, as well as the structural and functional characterization of the encoded protein.

  • Naturally occurring variation in npr-1, a gene encoding a putative receptor for an NPY-like molecule, causes variation in feeding behaviour.


Sample Sentences (cont.)

  • The gene encoding ZENK, an EARLY IMMEDIATE GENE well known in other learning and memory contexts, has figured prominently in molecular songbird research thus far.

  • This is because frequent contacts of these types cause an increase in the expression of the gene encoding a glucocorticoid receptor in the hippocampus, and…


Training Data

  • Dictionary

  • Rules/guidelines

  • Bootstrapping

  • Cross-domain training

    • Can training data in other domains (fly, human, etc.) still be useful?

