
Semantic Relations for Interpreting DNA Microarray Data and for Novel Hypotheses Generation

Semantic Relations for Interpreting DNA Microarray Data and for Novel Hypotheses Generation. Dimitar Hristovski,1 PhD, Andrej Kastrin,2 Borut Peterlin,2 MD PhD, Thomas C Rindflesch,3 PhD. 1 Institute of Biomedical Informatics, Medical Faculty, University of Ljubljana, Slovenia


Presentation Transcript


  1. Semantic Relations for Interpreting DNA Microarray Data and for Novel Hypotheses Generation. Dimitar Hristovski,1 PhD, Andrej Kastrin,2 Borut Peterlin,2 MD PhD, Thomas C Rindflesch,3 PhD. 1 Institute of Biomedical Informatics, Medical Faculty, University of Ljubljana, Slovenia; 2 Institute of Medical Genetics, University Medical Centre, Ljubljana, Slovenia; 3 National Library of Medicine, National Institutes of Health, Bethesda, MD, U.S.A. E-mail: dimitar.hristovski@mf.uni-lj.si

  2. Introduction Microarray experiments: • great potential to support progress in biomedical research, • results NOT EASY to interpret, • information about functions and relations of relevant genes needs to be extracted from the vast biomedical literature

  3. Related Work • Text mining and microarray analysis • Literature-based Discovery

  4. Proposed Solution • Computerized text analysis system • Extract semantic relations from literature • SemRep • Integrate with microarray experiments • Develop tools for: • Interpretation • Novel hypotheses generation

  5. Overall Design • Medline → SemRep (semantic relations extraction) → semantic relations • GEO → R Bioconductor scripts → microarrays • Semantic relations + microarrays → Integrated Database → Interpretation & Discovery Tools

  6. SemRep • Extracts semantic relations from biomedical text (implemented in Prolog) • Based on UMLS Metathesaurus and Semantic Network • <MetaConc> SEMNET RELATION <MetaConc> • Database of relations extracted from MEDLINE • 6.7M citations (01/01/1999 through 03/31/2009) • 43M sentences • 21M relation instances • 7M relation types
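The <MetaConc> SEMNET RELATION <MetaConc> format and the difference between relation instances and relation types can be illustrated with a minimal Python sketch. It assumes that a "relation type" is a distinct subject-predicate-object triple and that an "instance" is one occurrence of such a triple in a sentence; the slides do not state this explicitly. The `Relation` class, the `count_types` helper, and the source labels are hypothetical and are not part of SemRep.

```python
from collections import Counter
from typing import NamedTuple


class Relation(NamedTuple):
    """One extracted relation instance: <MetaConc> SEMNET RELATION <MetaConc>."""
    subject: str    # Metathesaurus concept (assumed to be already normalized)
    predicate: str  # Semantic Network relation
    object: str     # Metathesaurus concept
    source: str     # hypothetical label for the sentence or citation of origin


# Toy instances; the triples come from the examples in this transcript,
# the source labels are made up for illustration.
instances = [
    Relation("MBD1", "INTERACTS_WITH", "HTR2C", "sentence-A"),
    Relation("MBD1", "INTERACTS_WITH", "HTR2C", "sentence-B"),
    Relation("MBD1", "CAUSES", "Autistic Disorder", "sentence-A"),
]


def count_types(relation_instances):
    """Collapse instances into distinct triples and count their occurrences."""
    return Counter((r.subject, r.predicate, r.object) for r in relation_instances)


type_counts = count_types(instances)
for triple, n in type_counts.items():
    print(triple, n)
```

Under this reading, the three toy instances collapse into two relation types, which is the same kind of reduction as going from 21M instances to 7M types.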

  7. Semantic Relations Extracted Wide range of relations in: • Clinical medicine • Molecular genetics • Pharmacogenomics Relation groups: • Genetic Etiology: associated_with, predisposes, causes • Substance Relations: interacts_with, inhibits, stimulates • Pharmacological Effects: affects, disrupts, augments • Clinical Actions: administered_to, manifestation_of, treats • Organism Characteristics: location_of, part_of, process_of • Co-existence: co-exists_with

  8. Examples “… the loss of Mbd1 could lead to autism-like behavioral phenotypes …” Relation: MBD1 causes Autistic Disorder “… Mbd1 can directly regulate the expression of Htr2c, one of the serotonin receptors, …” Relation: MBD1 interacts_with HTR2C

  9. Interpretation of Microarrays Find known facts from the literature: • Disease related: • Associated genes • Current treatments • … • Microarray Genes: • Relations between genes (INHIBITS, STIMULATES, …) • Relations between the genes and anything else
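Looking up known facts of this kind amounts to filtering stored relations by argument and, optionally, by predicate. The following Python sketch shows one way to express such a query. It assumes relations are held in memory as plain (subject, predicate, object) tuples; the function name `relations_with_argument` is hypothetical and does not describe the actual tool or its database schema. The sample triples are the ones mentioned elsewhere in this transcript.

```python
def relations_with_argument(relations, concept, predicates=None):
    """Return relations whose subject or object mentions `concept`.

    `predicates`, if given, restricts the result to those relation names.
    Matching is a case-insensitive substring test, a simplification.
    """
    needle = concept.lower()
    return [
        (subj, pred, obj)
        for subj, pred, obj in relations
        if (needle in subj.lower() or needle in obj.lower())
        and (predicates is None or pred in predicates)
    ]


# Triples taken from the examples and results in this transcript.
relations = [
    ("NR4A2", "ASSOCIATED_WITH", "Parkinson Disease"),
    ("Pramipexol", "STIMULATES", "NR4A2"),
    ("MBD1", "CAUSES", "Autistic Disorder"),
    ("MBD1", "INTERACTS_WITH", "HTR2C"),
]

# Disease-related query: what causes or is associated with Parkinson?
print(relations_with_argument(relations, "Parkinson", {"CAUSES", "ASSOCIATED_WITH"}))

# Gene-related query: every relation in which a microarray gene takes part.
print(relations_with_argument(relations, "NR4A2"))
```

A production system would match on concept identifiers rather than substrings; the substring test is used here only to keep the sketch short.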

  10. Relations with “Parkinson” as Argument?

  11. What Treats Parkinson?

  12. What (causes, associated_with) Parkinson?

  13. Sentences from which Relations are Extracted

  14. Genes from the Microarray Related to Anything?

  15. Novel Hypotheses Generation • Based on discovery patterns • Discovery patterns: • search templates that have a higher likelihood of returning a new discovery • Specific discovery patterns for specific discovery tasks

  16. Discovery Patterns • Inhibit the upregulated: • Search for substances, genes, ... which, according to the literature, inhibit the top N (e.g. 300) genes that are upregulated on a given microarray • Such substances, genes, … might be used to regulate the upregulated genes • Stimulate the downregulated: • Search for substances, genes, ... which, according to the literature, stimulate the top N (e.g. 300) genes that are downregulated on a given microarray • Such substances, genes, … might be used to regulate the downregulated genes
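The two discovery patterns can be sketched in Python as a join between ranked microarray genes and literature relations. This is a minimal illustration under stated assumptions, not the authors' implementation: expression is represented as a gene-to-log-fold-change dictionary, the function names `top_genes` and `discovery_candidates` are hypothetical, and the fold-change values and the `SUBSTANCE_Q` and `GENE_X` entries are invented for the example. The remaining triples are the ones reported in this transcript.

```python
def top_genes(expression, n, direction):
    """Return up to n genes with the strongest change in the given direction."""
    if direction == "up":
        genes = [g for g, change in expression.items() if change > 0]
        return sorted(genes, key=lambda g: expression[g], reverse=True)[:n]
    genes = [g for g, change in expression.items() if change < 0]
    return sorted(genes, key=lambda g: expression[g])[:n]


def discovery_candidates(expression, relations, n=300):
    """Apply 'inhibit the upregulated' and 'stimulate the downregulated'."""
    up = set(top_genes(expression, n, "up"))
    down = set(top_genes(expression, n, "down"))
    candidates = []
    for subj, pred, obj in relations:
        if pred == "INHIBITS" and obj in up:
            candidates.append((subj, pred, obj, "upregulated"))
        elif pred == "STIMULATES" and obj in down:
            candidates.append((subj, pred, obj, "downregulated"))
    return candidates


# Hypothetical log fold changes; only the direction of change matters here.
expression = {"HSPB1": 1.8, "NR4A2": -1.5, "GENE_X": 0.1}

relations = [
    ("paclitaxel", "INHIBITS", "HSPB1"),
    ("quercetin", "INHIBITS", "HSPB1"),
    ("Pramipexol", "STIMULATES", "NR4A2"),
    ("SUBSTANCE_Q", "STIMULATES", "HSPB1"),  # invented; should not match
]

for candidate in discovery_candidates(expression, relations):
    print(candidate)
```

The invented `SUBSTANCE_Q` relation is filtered out because it stimulates a gene that is already upregulated, which is the opposite of what either pattern looks for.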

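  17. Discovery Patterns – Graphical View • Literature: Drug Z1 (or substance) Inhibits Genes Y1; Microarray: Genes Y1 Upregulated in Disease X; hypothesis: Drug Z1 Maybe_Treats1 Disease X? • Literature: Drug Z2 (or substance) Stimulates Genes Y2; Microarray: Genes Y2 Downregulated in Disease X; hypothesis: Drug Z2 Maybe_Treats2 Disease X?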

  18. Results – Inhibit the Upregulated • Parkinson microarray GSE8397 • HSP27 (HSPB1) gene is upregulated on the microarray • We identified paclitaxel and quercetin as substances that inhibit the expression of this gene

  19. Inhibit the Upregulated

  20. Results – Stimulate the Downregulated • NR4A2 downregulated on the microarray • We found out that: • Pramipexol stimulates expression of NR4A2 • NR4A2 is associated with Parkinson disease

  21. Explaining a Relation - Closed Discovery

  22. Closed Discovery – Aligned Relations

  23. Evaluation • Estimate – based on [Masseroli, BMC Bioinformatics 2006]: • Extract known facts – baseline precision on 2,042 extracted relations: • Gene – Disease (causes, assoc_with, …) P=74.2% • Gene – Gene (inhibits, stimulates, …) P=41.95% • Propose Argument-Predicate distance for filtering (Gene-Gene): • At distance no more than 1: P=70.75%; R=43.6% • At distance no more than 2: P=55.88%; R=66.28% • We use Argument-Predicate distance for ranking of semantic relations and we show relations more likely to be correct first.
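The trade-off between precision and recall under a distance threshold, and the ranking step, can be sketched as follows. The slide does not define how argument-predicate distance is measured, so the sketch simply assumes each relation carries a precomputed integer `distance` and a gold-standard `correct` flag. All records below are invented toy data and do not reproduce the reported figures.

```python
def rank_by_distance(relations):
    """Order relations so that smaller argument-predicate distances come first."""
    return sorted(relations, key=lambda r: r["distance"])


def filter_by_distance(relations, max_distance):
    """Keep only relations whose distance does not exceed the threshold."""
    return [r for r in relations if r["distance"] <= max_distance]


def precision_recall(selected, all_relations):
    """Precision and recall of `selected` against the gold flags."""
    true_positives = sum(r["correct"] for r in selected)
    total_correct = sum(r["correct"] for r in all_relations)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / total_correct if total_correct else 0.0
    return precision, recall


# Invented toy relations: id, assumed distance, assumed gold judgement.
relations = [
    {"id": "r1", "distance": 1, "correct": True},
    {"id": "r2", "distance": 3, "correct": False},
    {"id": "r3", "distance": 1, "correct": True},
    {"id": "r4", "distance": 2, "correct": True},
    {"id": "r5", "distance": 4, "correct": True},
    {"id": "r6", "distance": 2, "correct": False},
]

for threshold in (1, 2):
    kept = filter_by_distance(relations, threshold)
    p, r = precision_recall(kept, relations)
    print(f"distance <= {threshold}: P={p:.2f} R={r:.2f}")

print([r["id"] for r in rank_by_distance(relations)])
```

On the toy data, loosening the threshold from 1 to 2 lowers precision and raises recall, which is the same direction of change as in the figures above. Ranking instead of filtering keeps every relation while still showing the likelier ones first.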

  24. Conclusion • A new bioinformatics tool for interpretation and novel hypotheses generation • Based on integration of semantic relations extracted from literature with microarrays • Available at: • http://sembt.mf.uni-lj.si

  25. Syntactic Processing Mbd1 can directly regulate the expression of Htr2c • MedPost tagger and shallow parser [ NP[head([… inputmatch(mbd1),tag(noun)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… inputmatch(htr2c),tag(noun)])] ]

  26. Semantic Processing • Identify concepts: MetaMap and ABGene [ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ]

  27. Semantic Processing • Identify concepts: MetaMap and ABGene [ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ] • Match semantic type patterns to ontology: <gngm> INTERACTS_WITH <gngm>

  28. Semantic Processing • Identify concepts: MetaMap and ABGene [ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ] • Match semantic type patterns to ontology: <gngm> INTERACTS_WITH <gngm>

  29. Semantic Processing • Identify concepts: MetaMap and ABGene [ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ] • Match semantic type patterns to ontology: <gngm> INTERACTS_WITH <gngm> • Apply indicator rule: Verb(regulate) → INTERACTS_WITH

  30. Semantic Processing • Identify concepts: MetaMap and ABGene [ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ] • Match semantic type patterns to ontology: <gngm> INTERACTS_WITH <gngm> • Apply indicator rule: Verb(regulate) → INTERACTS_WITH

  31. Semantic Processing • Identify concepts: MetaMap and ABGene [ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ] • Match semantic type patterns to ontology: <gngm> INTERACTS_WITH <gngm> • Apply indicator rule: Verb(regulate) → INTERACTS_WITH • Substitute concepts for semantic types:

  32. Semantic Processing • Identify concepts: MetaMap and ABGene [ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ] • Match semantic type patterns to ontology: <gngm> INTERACTS_WITH <gngm> • Apply indicator rule: Verb(regulate) → INTERACTS_WITH • Substitute concepts for semantic types:

  33. Semantic Processing • Identify concepts: MetaMap and ABGene [ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ... [verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],... NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ] • Match semantic type patterns to ontology: <gngm> INTERACTS_WITH <gngm> • Apply indicator rule: Verb(regulate) → INTERACTS_WITH • Substitute concepts for semantic types: MBD1 INTERACTS_WITH HTR2C
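The semantic processing steps on these slides (indicator rule, semantic type pattern match, concept substitution) can be mimicked in a few lines of Python. SemRep itself is implemented in Prolog; the sketch below is only an illustration under simplifying assumptions. The tables `INDICATOR_RULES` and `ONTOLOGY_PATTERNS`, the function `extract_relation`, and the dictionary layout of the arguments are hypothetical, and the tables hold just the single rule and pattern shown in the slides.

```python
# Indicator rules: verb lemma -> Semantic Network predicate (one rule from the slides).
INDICATOR_RULES = {"regulate": "INTERACTS_WITH"}

# Allowed semantic type patterns: (subject type, predicate, object type).
ONTOLOGY_PATTERNS = {("gngm", "INTERACTS_WITH", "gngm")}


def extract_relation(subject, verb_lemma, obj):
    """Return (subject concept, predicate, object concept) or None.

    `subject` and `obj` are dicts with 'concept' and 'semtype' keys, standing
    in for noun phrases already mapped to concepts in the previous step.
    """
    # Apply the indicator rule to the verb.
    predicate = INDICATOR_RULES.get(verb_lemma)
    if predicate is None:
        return None
    # Match the semantic type pattern against the ontology.
    if (subject["semtype"], predicate, obj["semtype"]) not in ONTOLOGY_PATTERNS:
        return None
    # Substitute concepts for semantic types.
    return (subject["concept"], predicate, obj["concept"])


mbd1 = {"concept": "MBD1", "semtype": "gngm"}
htr2c = {"concept": "HTR2C", "semtype": "gngm"}

print(extract_relation(mbd1, "regulate", htr2c))
```

Returning None when either the indicator rule or the type pattern fails mirrors the idea that a relation is asserted only when both the verb and the argument types license it.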
