
The Linked Clinical Data Project

Jyotishman Pathak, PhD, and Rick Kiefer. SemTIG, November 4, 2011.


Presentation Transcript


  1. The Linked Clinical Data Project. Jyotishman Pathak, PhD; Rick Kiefer. SemTIG, November 4, 2011

  2. Purpose
     • The Linked Clinical Data (LCD) project aims to investigate emerging Semantic Web technologies for developing an ontology-driven framework for high-throughput phenotyping using Electronic Medical Records (EMRs) to analyze multi-factorial phenotypes.
     • Investigate ontology-based techniques.
     • Develop a framework for publishing and integrating.
     • Propose and validate semantic reasoning techniques to support rapid cohort identification.

  3. LCD Architecture
     [Architecture diagram. Labels, in extraction order: Med Index, NRAF, MRIS, MICS, Health Quest, MCLSS Endpoint, Virtual Server, MCLSS Databases, Web Server, Linked Open Drug Data Endpoints, Linked Data API, Thick Client Application, Request Selector, Thin Client Application, Viewer, Virtuoso, SQL, SPARQL, Response Formatter, RDF View, Mobile Client Application.]

  4. Project – Automated SNPedia
     • SNPedia contains a wealth of data but the information in the wiki is manually curated. The focus of this project is to automate the results using patient data.
     • Using MCLSS, identify patients with specific conditions.
     • Join with OMIM to determine the genetic locus associated with those conditions.
     • Join with dbSNP to identify potentially associated SNPs.
     • Each of the joins will be done using a single federated SPARQL query.
     • Results will then be compared to data in SNPedia.
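The chain of joins on this slide (condition to gene to SNP, then a comparison against SNPedia) can be sketched in plain Python. This is only an illustration of the data flow, not the project's implementation: the slide states that the joins are done in a federated SPARQL query. The function names, the dictionary shapes, and every identifier such as `condition_x`, `GENE_A`, and `rs0000001` are made-up placeholders.

```python
from typing import Dict, Iterable, List, Set, Tuple

Association = Tuple[str, str, str]  # (condition, gene symbol, rsID)


def disease_to_snps(
    conditions: Iterable[str],
    omim_genes: Dict[str, List[str]],
    dbsnp_by_gene: Dict[str, List[str]],
) -> Set[Association]:
    """Join conditions to genes (OMIM stand-in), then genes to rsIDs (dbSNP stand-in)."""
    results: Set[Association] = set()
    for condition in conditions:
        for gene in omim_genes.get(condition, []):
            for rsid in dbsnp_by_gene.get(gene, []):
                results.add((condition, gene, rsid))
    return results


def compare_with_snpedia(
    derived: Set[Association], snpedia_rsids: Set[str]
) -> Dict[str, Set[str]]:
    """Split rsIDs into those found by both sources and those found by only one."""
    derived_rsids = {rsid for _, _, rsid in derived}
    return {
        "in_both": derived_rsids & snpedia_rsids,
        "only_derived": derived_rsids - snpedia_rsids,
        "only_snpedia": snpedia_rsids - derived_rsids,
    }


# Placeholder data only; none of these identifiers refer to real records.
conditions = ["condition_x"]
omim_genes = {"condition_x": ["GENE_A", "GENE_B"]}
dbsnp_by_gene = {"GENE_A": ["rs0000001", "rs0000002"], "GENE_B": ["rs0000003"]}
snpedia_rsids = {"rs0000002", "rs0000004"}

derived = disease_to_snps(conditions, omim_genes, dbsnp_by_gene)
report = compare_with_snpedia(derived, snpedia_rsids)
```

Keeping the comparison as three sets makes it easy to see both what the automated pipeline would add to SNPedia and what it misses.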

  5. Disease to SNP architecture
     [Diagram. Labels, in extraction order: RDF View Mapping, SPARQL Query, Patient, Request, Disease, MCLSS, SNOMED/ICD9, OMIM, Gene, Results, SNP, dbSNP, Endpoints, Databases.]

  6. dbSNP/OMIM federated query

     PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
     PREFIX omim: <http://bio2rdf.org/omim_resource:>
     PREFIX dbsnp: <http://edison.mayo.edu:8890/schemas/dbsnp2/>

     SELECT DISTINCT ?rsID ?geneSymbol ?alleleName
     {
       # OMIM: allelic variants whose title mentions "Diabetes" (case-insensitive)
       SERVICE <http://omim.bio2rdf.org/sparql> {
         SELECT ?geneSymbol ?alleleName
         WHERE {
           ?alleleVariant rdf:type omim:AllelicVariant ;
                          <http://purl.org/dc/terms/title> ?alleleName ;
                          omim:symbol ?geneSymbol .
           FILTER(regex(str(?alleleName), "Diabetes", "i"))
         }
       }
       # dbSNP: rsIDs per gene symbol; ?geneSymbol is projected so it joins with the OMIM results
       SERVICE <http://edison.mayo.edu:8890/sparql> {
         SELECT ?rsID ?geneSymbol
         WHERE {
           ?s dbsnp:symbol ?geneSymbol ;
              dbsnp:rsid ?rsID .
         }
       }
     }
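As a rough sketch of how a client might parameterize this query by disease term and address it to an endpoint, the Python below builds the query text and a SPARQL Protocol GET URL using the standard `query` parameter. It does not send the request. The template declares the rdf: prefix explicitly and projects ?geneSymbol from both subqueries so the join variable is visible outside them. The function names and the `__TERM__` placeholder are hypothetical, and sending the federated query to the dbSNP Virtuoso endpoint is an assumption.

```python
from urllib.parse import urlencode

OMIM_ENDPOINT = "http://omim.bio2rdf.org/sparql"
DBSNP_ENDPOINT = "http://edison.mayo.edu:8890/sparql"

# __TERM__ is a hypothetical placeholder token replaced by the disease term.
QUERY_TEMPLATE = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX omim: <http://bio2rdf.org/omim_resource:>
PREFIX dbsnp: <http://edison.mayo.edu:8890/schemas/dbsnp2/>
SELECT DISTINCT ?rsID ?geneSymbol ?alleleName
{
  SERVICE <OMIM_ENDPOINT> {
    SELECT ?geneSymbol ?alleleName
    WHERE {
      ?alleleVariant rdf:type omim:AllelicVariant ;
                     <http://purl.org/dc/terms/title> ?alleleName ;
                     omim:symbol ?geneSymbol .
      FILTER(regex(str(?alleleName), "__TERM__", "i"))
    }
  }
  SERVICE <DBSNP_ENDPOINT> {
    SELECT ?rsID ?geneSymbol
    WHERE { ?s dbsnp:symbol ?geneSymbol ; dbsnp:rsid ?rsID . }
  }
}
""".strip()


def build_federated_query(disease_term: str) -> str:
    """Fill the template; escapes only backslashes and double quotes.

    Regex metacharacters in the term are passed through unchanged.
    """
    safe_term = disease_term.replace("\\", "\\\\").replace('"', '\\"')
    return (
        QUERY_TEMPLATE.replace("OMIM_ENDPOINT", OMIM_ENDPOINT)
        .replace("DBSNP_ENDPOINT", DBSNP_ENDPOINT)
        .replace("__TERM__", safe_term)
    )


def build_request_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL; the caller decides whether to fetch it."""
    return endpoint + "?" + urlencode({"query": query})


query = build_federated_query("Diabetes")
url = build_request_url(DBSNP_ENDPOINT, query)
```

Separating query construction from the HTTP call keeps the sketch testable offline, which matters when endpoints are unreliable.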

  7. Partial SPARQL results

  8. Process – Creating dbSNP endpoint
     • No endpoint could be found, so one had to be created.
     • Download the dbSNP database from a Sybase dump.
     • Use Perl to filter the tables in order to isolate the desired data and rewrite it into tab-delimited form.
     • Create tables in MySQL and import the files.
     • Use Virtuoso to link to the tables.
     • Create RDF views by mapping the table columns to the desired endpoint subjects.
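Two of the steps above, rewriting selected columns into tab-delimited form and mapping table columns to RDF, can be illustrated with a short Python sketch. The project used Perl for the filtering and Virtuoso RDF views for the mapping, so this is a stand-in, not their code. It writes the triples out explicitly only to show the column-to-predicate idea. The predicate names `dbsnp:rsid` and `dbsnp:symbol` come from the federated query on slide 6; the column names, the `SUBJECT_BASE` IRI, and the sample row are assumptions.

```python
import csv
import io
from typing import Dict, Iterable, List

DBSNP_SCHEMA = "http://edison.mayo.edu:8890/schemas/dbsnp2/"
SUBJECT_BASE = "http://example.org/dbsnp/snp/"  # hypothetical subject IRI prefix


def rows_to_tsv(rows: Iterable[Dict[str, str]], columns: List[str]) -> str:
    """Keep only the wanted columns and emit one tab-delimited line per row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for row in rows:
        writer.writerow([row[column] for column in columns])
    return buffer.getvalue()


def escape_literal(value: str) -> str:
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def row_to_ntriples(row: Dict[str, str]) -> List[str]:
    """Map one table row to N-Triples lines: one subject per rsID, one triple per column."""
    subject = "<" + SUBJECT_BASE + row["rsid"] + ">"
    rsid = escape_literal(row["rsid"])
    symbol = escape_literal(row["symbol"])
    return [
        f'{subject} <{DBSNP_SCHEMA}rsid> "{rsid}" .',
        f'{subject} <{DBSNP_SCHEMA}symbol> "{symbol}" .',
    ]


# Placeholder row; the identifiers are made up for illustration.
sample_rows = [{"rsid": "rs0000001", "symbol": "GENE_A", "unused": "x"}]
tsv_text = rows_to_tsv(sample_rows, ["rsid", "symbol"])
triples = row_to_ntriples(sample_rows[0])
```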

  9. Hurdles
     • Endpoints
       • Difficult to find
       • Unreliable uptime
       • Unknown age of data
       • Schema documentation
     • Environment
       • Linux - could not find ODBC driver for Virtuoso
       • Virtuoso Bridge did not work with DB2
       • Virtual server – no admin permissions
       • Windows 2008 server – bug in WebDAV access

  10. Hurdles
     • Virtuoso
       • Did not support federated queries until March
       • March release has bugs
         • Unable to run SPARQL queries against non-local endpoints
         • Federated queries of mixed location crash the server
       • Beta fix release has performance issues
       • Documentation – outdated and poor navigation

  11. Next steps
     • MCLSS
       • Identify small MCLSS views
       • Federated query with SIDER and RxNorm
       • Use TMO/etc. for RDBMS -> RDF mapping
     • dbSNP RDF view
       • Standardized RDBMS -> RDF mapping
       • Visual graph for dbSNP/OMIM
     • SNPedia
       • Alter Bob’s Perl script to download data
       • Upload into MySQL for comparisons

  12. Questions?
     • Thank you! Bob Freimuth – Perl scripts to filter and transform the dbSNP database, as well as invaluable sharing of genomic knowledge and advice.
     • Website: http://informatics.mayo.edu/LCD/index.php/Main_Page
