

NCICB Informatics

Providing Innovative and Integrative

Informatics Solutions

Himanso Sahni (SAIC)

Sharon Settnek (SAIC)


NCI Center for Bioinformatics: Building common architecture, common tools, and common standards

[Diagram labels: access portals; participating group nodes; Clinical Trials; Molecular Signatures; CMAP; caCORE; Cancer Genomics; Mouse Models; GAI; CGAP]

  • Establish common data elements

  • Provide data exchange infrastructure

  • Develop electronic data interfaces

  • Distribute architecture model

  • Provide application tool chest

  • Develop portals


caCORE http://ncicb.nci.nih.gov


caBIO

  • The cancer Bioinformatics Infrastructure Objects (caBIO) is a standards-based set of bioinformatics components

  • caBIO objects simulate the behavior of actual genomic components such as genes, chromosomes, sequences, libraries, clones, ontologies, etc.

  • caBIO provides access to a variety of genomic data sources, including UniGene, HomoloGene, LocusLink, RefSeq, BioCarta, GoldenPath (via DAS), and NCICB’s CGAP (Cancer Genome Anatomy Project) and GAI (Genetic Annotation Initiative) data repositories

  • caBIO is “open source” and provides an abstraction layer that allows developers to access genomic information using a standardized tool set without concerns for implementation details and data management



Model Extensions

Clinical Protocols

  • A clinical protocols object model facilitates the integration of clinical data with genomic data

Animal Models

  • An animal models object model supports queries between human and animal models of cancer


CMAP: Cancer Molecular Analysis Project http://cmap.nci.nih.gov

Powered by caBIO!


Molecular Targets

Powered by caBIO!

  • A collection of genes organized by pathways can be displayed facilitating the evaluation of anomalies


Gene Navigator

Powered by caBIO!

  • The Gene Navigator presents gene-to-gene and enzyme-to-enzyme relationships as nodes in a connected graph

    • Spring, Rosette, and Clan-based algorithms are represented

  • Links between related nodes identify the relationships and the source pathways in which they occur

  • Researchers can drill down to detailed information about genes and pathways


Targeted Agents

Powered by caBIO!

  • Researchers can retrieve information about agents linked to multiple targets and contexts


Clinical Trials

Powered by caBIO!

  • Researchers can view detailed information about therapeutic trials associated with histology types and agents

  • A clinical protocols portal is available to allow researchers to search and submit clinical protocols affiliated with Specialized Programs of Research Excellence (SPOREs)


caBIO Architecture

  • caBIO was designed using a J2EE architecture with client interfaces, server components, back-end objects and data sources

  • Clients (browsers, applications) can receive information (HTML and XML) from back-end objects over HTTP

  • Client applications can also communicate with back-end objects via Java RMI (Java applications)

  • Non-Java based applications can communicate via SOAP or HTTP

  • Server components communicate with back-end objects via Java RMI

  • Back-end objects communicate directly with data sources (database, URLs, flat files)

  • caBIO web services can be advertised to facilitate information sharing

    • RDF can be used to advertise content to crawlers and agents

    • A UDDI registry may be configured to advertise services

    • caBIO services can be advertised via bioMOBY central


caBIO Architecture

[Architecture diagram. Layers: Clients; Presentation Layer; Object Layer; Data Sources. Component labels: Browsers; Other Apps; Other Java Apps; Web Server; Servlet Container; JSPs; Servlets; UI Bean; SOAP Engine; XML Builder; XSLT Engine; XSL Style Sheet; DTDs; RDF; Object Managers; Domain Objects (Genes, Chromosomes, Sequences, Libraries, Clusters, Tissues, Diseases, Agents); Data Access Objects; External Databases; EVS; caDSR; URLs; Flat Files; XML Docs. Connection labels: HTML/HTTP; XML/HTTP; SOAP; HTTP; RMI; JDBC; FTP]


Data Sources

[Diagram: data sources feeding caBIO. Source labels: External Public Databases; CGAP Database; UniGene; RefSeq; KEGG; BioCarta; CGAP/GAI; CTEP/SPOREs; LocusLink; HomoloGene; UCSC Golden Path (DAS). Data labels: Reference Sequences; Genes, Sequences, Chromosomes; Pathways; SNPs; Trials; Homologs; Gene Loci, LocusLink Summaries; Gene Annotations; Genes, Sequences]


caBIO Usage

Facilitates solving complex queries such as:

Find me the Pathways, with Genes that are expressed in tissues with a particular Histopathology that includes a particular Organ and a particular Disease.


caBIO APIs

  • A Java API is available for Java programmers

  • A Simple Object Access Protocol (SOAP) API is provided for non-Java programmers

  • An HTTP API is available

    • Developers can request XML or HTML (with XSL)


Java Packages

  • gov.nih.nci.caBIO.bean

    • Contains domain objects to access genomic and biomedical components

  • gov.nih.nci.caBIO.util.das

    • Primary interface to the UCSC DAS

    • Uses JAXB to convert DAS DTDs to objects

  • gov.nih.nci.caBIO.evs

    • Provides synonym search and concept based search to the NCI’s Enterprise Vocabulary System (EVS)

  • gov.nih.nci.caBIO.webservices

    • Provides access to caBIO via SOAP

  • gov.nih.nci.caBIO.servlet

    • Provides access to caBIO via HTTP

  • gov.nih.nci.caBIO.util

    • Provides interface to caBIO utilities


Java API

Domain objects have companion SearchCriteria objects:

// Search for genes by symbol through the Gene domain object
Gene myGene = new Gene();
GeneSearchCriteria criteria = new GeneSearchCriteria();
criteria.setSymbol("pTEN");
SearchResult result = myGene.search(criteria);
// The result set is returned as an array of matching Gene objects
Gene[] genes = (Gene[]) result.getResultSet();

  • caBIO supports nested SearchCriteria

    • SearchCriteria from one object type can be fed as parameters into SearchCriteria of another type.

  • Complex queries without any SQL
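The nesting idea can be illustrated without the caBIO library. The following is a minimal, self-contained sketch that assumes a much simpler design than caBIO's: `Criteria`, `where`, `nest`, and the sample pathway data are all hypothetical names invented for this illustration, and an in-memory list stands in for the data sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical stand-ins for illustration only; these are NOT the caBIO classes.
final class NestedCriteriaSketch {

    static final class Gene {
        final String symbol;
        Gene(String symbol) { this.symbol = symbol; }
    }

    static final class Pathway {
        final String name;
        final List<Gene> genes;
        Pathway(String name, Gene... genes) {
            this.name = name;
            this.genes = Arrays.asList(genes);
        }
    }

    /** Criteria for type T: attribute tests plus nested criteria, all combined with AND. */
    static final class Criteria<T> {
        private final List<Predicate<T>> tests = new ArrayList<>();

        Criteria<T> where(Predicate<T> test) {
            tests.add(test);
            return this;
        }

        /** Nests criteria of a related type: T matches if any related object matches. */
        <R> Criteria<T> nest(Function<T, List<R>> related, Criteria<R> inner) {
            tests.add(t -> related.apply(t).stream().anyMatch(inner::matches));
            return this;
        }

        boolean matches(T candidate) {
            return tests.stream().allMatch(test -> test.test(candidate));
        }
    }

    /** Made-up sample data standing in for a data source. */
    static List<Pathway> samplePathways() {
        return Arrays.asList(
            new Pathway("ptenPathway", new Gene("PTEN"), new Gene("GENE_A")),
            new Pathway("otherPathway", new Gene("GENE_B")));
    }

    /** Returns the names of sample pathways that contain a gene with the given symbol. */
    static List<String> pathwaysWithGene(String symbol) {
        Criteria<Gene> geneCriteria =
            new Criteria<Gene>().where(g -> g.symbol.equalsIgnoreCase(symbol));
        // Gene criteria are fed into pathway criteria, so no SQL join is written by hand.
        Criteria<Pathway> pathCriteria =
            new Criteria<Pathway>().nest(p -> p.genes, geneCriteria);

        List<String> names = new ArrayList<>();
        for (Pathway p : samplePathways()) {
            if (pathCriteria.matches(p)) {
                names.add(p.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(pathwaysWithGene("pTEN"));
    }
}
```

In this sketch the inner criteria become one more test on the outer object, which is why a chain of any depth reduces to a single `matches` call at the top level.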


Traverse Relationships in Model

Find me the Pathways, with Genes that are expressed in tissues with a particular Histopathology that includes a particular Organ and a particular Disease.

[Diagram: INPUT (Disease, Organ) → Histopathology → Genes → Pathways (OUTPUT)]


findPathway

Input disease, organ; create SearchCriteria Objects:

public Pathway[] findPathway(String disease, String organ) {
    DiseaseSearchCriteria diseaseCriteria = new DiseaseSearchCriteria();
    OrganSearchCriteria organCriteria = new OrganSearchCriteria();
    HistopathologySearchCriteria histoCriteria = new HistopathologySearchCriteria();
    GeneSearchCriteria geneCriteria = new GeneSearchCriteria();
    PathwaySearchCriteria pathCriteria = new PathwaySearchCriteria();


findPathway

Nest the SearchCriteria, then do the search:

    diseaseCriteria.setName(disease);
    organCriteria.setName(organ);
    histoCriteria.putSearchCriteria(diseaseCriteria, CriteriaElement.AND);
    histoCriteria.putSearchCriteria(organCriteria, CriteriaElement.AND);
    geneCriteria.putSearchCriteria(histoCriteria, CriteriaElement.AND);
    pathCriteria.putSearchCriteria(geneCriteria, CriteriaElement.AND);
    Pathway myPathway = new Pathway();
    return myPathway.searchPathways(pathCriteria);
}



Web Services: SOAP

http://cabio.nci.nih.gov/soap/services/index.html


SOAP API

Perl Example

use SOAP::Lite;

# Point the client at the gene service namespace and the SOAP router
my $s = SOAP::Lite
    ->uri("urn:nci-gene-service")
    ->proxy("http://cabio.nci.nih.gov/soap/servlet/rpcrouter");

# Search criteria are passed as a map of attribute name to value
my %searchCriteria = ();
$searchCriteria{symbol} = "pTEN";

my $som = $s->getGenes(SOAP::Data->type(map => \%searchCriteria));
my $xmldoc = $som->result;


SOAP output with xlinks

<?xml version="1.0" encoding="UTF-8" ?>
<nci-core>
  <gov.nih.nci.caBIO.bean.Gene id="2221" xmlns:xlink="http://www.w3.org/1999/xlink/">
    <name>PTEN</name>
    <title>phosphatase and tensin homolog (mutated in multiple advanced cancers 1)</title>
    <dbCrossRefs>{LOCUS_LINK=5728, OMIM=601728, UNIGENE=10712}</dbCrossRefs>
    <Pathway xlink:href="http://lpgprot101.nci.nih.gov:5080/CORE/GetXML?operation=Pathway&GeneId=2221" />
    [Additional xlinks for ExpressionExperiment, Organ, Chromosome, GeneHomolog, Sequence, Gene Alias, Protein, SNP, and MapLocation]
  </gov.nih.nci.caBIO.bean.Gene>
  [2 Additional Genes with “PTEN” in their name]
  <searchResult>
    <hasMore>false</hasMore>
    <startsAt>1</startsAt>
    <endsAt>3</endsAt>
  </searchResult>
</nci-core>
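A client that receives this kind of response would typically collect the xlink targets so that related objects can be fetched later. The sketch below shows one way to do that with the JDK's DOM parser. It is an illustration under assumptions, not the caBIO client code: the class and method names are invented here, the sample string is a trimmed copy of the response above, and the `&` in the link is written as `&amp;` so that a strict XML parser accepts it.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration; not part of caBIO.
final class XlinkSketch {

    // Namespace URI copied from the response above (note its trailing slash).
    static final String XLINK_NS = "http://www.w3.org/1999/xlink/";

    // Trimmed copy of the response above, with "&" escaped as "&amp;".
    static final String SAMPLE =
        "<nci-core>"
        + "<gov.nih.nci.caBIO.bean.Gene id=\"2221\" xmlns:xlink=\"" + XLINK_NS + "\">"
        + "<name>PTEN</name>"
        + "<Pathway xlink:href=\"http://lpgprot101.nci.nih.gov:5080/CORE/GetXML"
        + "?operation=Pathway&amp;GeneId=2221\" />"
        + "</gov.nih.nci.caBIO.bean.Gene>"
        + "</nci-core>";

    /** Maps "parentId/childTag" to the xlink:href carried by that child element. */
    static Map<String, String> linksOf(String xml, String parentTag) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getAttributeNS to work
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

            Map<String, String> links = new LinkedHashMap<>();
            NodeList parents = doc.getElementsByTagName(parentTag);
            for (int i = 0; i < parents.getLength(); i++) {
                Element parent = (Element) parents.item(i);
                NodeList children = parent.getChildNodes();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    if (child instanceof Element
                            && ((Element) child).hasAttributeNS(XLINK_NS, "href")) {
                        Element link = (Element) child;
                        links.put(parent.getAttribute("id") + "/" + link.getTagName(),
                                  link.getAttributeNS(XLINK_NS, "href"));
                    }
                }
            }
            return links;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse response", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(linksOf(SAMPLE, "gov.nih.nci.caBIO.bean.Gene"));
    }
}
```

Keying the map by object id and child tag keeps links from several Gene elements apart when a response contains more than one match.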


SOAP with returnHeavyXML

Data is now returned in full.

Here is a Pathway object snippet:

<gov.nih.nci.caBIO.bean.Pathway id="92">
  <name>ptenPathway</name>
  <displayValue>PTEN Dependent Cell Cycle Arrest and Apoptosis</displayValue>
  <pathwayDiagram>ptenPathway.svg</pathwayDiagram>
</gov.nih.nci.caBIO.bean.Pathway>


HTTP API

Direct access to XML-formatted data via URLs:

http://cabio.nci.nih.gov/servlet/GetXML?operation=Gene&Symbol=pTEN

[Callouts on the URL: Method; Search Parameter; Parameter Value]
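A request URL of this shape can be assembled programmatically. The sketch below only builds the string and sends nothing over the network. The base URL, the `operation` parameter, and the `Symbol` search parameter come from the example above; the class and method names are hypothetical, and percent-encoding the values is an assumption about what the servlet would accept.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of caBIO.
final class GetXmlUrlSketch {

    // Base URL taken from the example above.
    static final String BASE = "http://cabio.nci.nih.gov/servlet/GetXML";

    /** Builds a GetXML URL from an operation name and ordered search parameters. */
    static String buildUrl(String operation, Map<String, String> searchParams) {
        StringBuilder url = new StringBuilder(BASE)
            .append("?operation=").append(encode(operation));
        for (Map.Entry<String, String> param : searchParams.entrySet()) {
            url.append('&').append(encode(param.getKey()))
               .append('=').append(encode(param.getValue()));
        }
        return url.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("Symbol", "pTEN");
        System.out.println(buildUrl("Gene", params));
    }
}
```

A `LinkedHashMap` is used so that parameters appear in the order they were added, which keeps the generated URL predictable.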


caBIO Future

  • MYcaBIO Application

    • a user interface to the caBIO API

  • MYcaBIO Kernel

    • federation of caBIO servers to share information between local data sources and the NCICB caBIO server

  • Web Service Advertisement

    • publishing caBIO services


MYcaBIO

  • Allows researchers to browse the caBIO objects, map their data with caBIO objects, and retrieve the results of the data query without programming expertise

  • Researchers can upload MS Excel files and obtain query results in MS Excel format


caBIO Kernel

  • Facilitates the creation of a federation of caBIO servers to share information between local data sources and the NCICB caBIO server

    • Leverages the JXTA protocol for peer-to-peer communication

[Diagram components: mycaBIO Client; Proxy; Local caBIO Server; Data map; Object/DB bridge (shown twice); NCICB caBIO server; DSI (Data Source Identifier)]

Steps labeled in the diagram:

  1. Sends query and user info
  2. Parses query & DSI and authenticates user
  3. Passes query to NCICB server
  4. Parses query & DSI and authenticates user (if any)
  5. Queries persistence layer
  6. Returns objects to requestor
  7. Queries data map
  8. Queries persistence layer
  8. Returns results


caBIO and Web Services

[Diagram: a Service Provider publishes to the Service Registry; a Service Requestor locates services in the registry and invokes the Service Provider]

  • caBIO is a “Service Provider” providing web services accessible via SOAP

  • caBIO services are also described using the Web Services Description Language (WSDL)

  • caBIO will publish Web Services to a central registry

    • The Universal Description, Discovery, and Integration (UDDI) registry can be leveraged

  • Third party applications can publish their services and share information by leveraging caBIO service interfaces identified in the WSDL

  • Applications can request services by obtaining the appropriate location of the WSDL from the UDDI registry

  • Applications can invoke the appropriate Web Service identified in the WSDL by issuing a SOAP request
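The publish, locate, and invoke roles described above can be sketched in a few lines. The example below is a deliberately simplified, in-memory illustration: a map stands in for the UDDI registry, a Java function stands in for a SOAP endpoint described by WSDL, and every name in it (`ServiceRegistry`, the `getGenes` handler, the returned XML shape) is a hypothetical stand-in rather than anything defined by caBIO.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// In-memory illustration of the three roles; no UDDI, WSDL, or SOAP is involved.
final class RegistrySketch {

    /** Stand-in for a service registry: service name -> callable provider. */
    static final class ServiceRegistry {
        private final Map<String, Function<String, String>> services = new HashMap<>();

        /** Publish: a provider registers a service under a name. */
        void publish(String name, Function<String, String> provider) {
            services.put(name, provider);
        }

        /** Locate: a requestor looks a service up by name. */
        Optional<Function<String, String>> locate(String name) {
            return Optional.ofNullable(services.get(name));
        }
    }

    /** Runs publish, locate, and invoke for a made-up gene lookup service. */
    static String lookupGene(String symbol) {
        ServiceRegistry registry = new ServiceRegistry();

        // Service Provider: publishes a handler (here a lambda returning made-up XML).
        registry.publish("getGenes",
            s -> "<Gene><name>" + s.toUpperCase(Locale.ROOT) + "</name></Gene>");

        // Service Requestor: locates the service, then invokes it.
        return registry.locate("getGenes")
            .map(service -> service.apply(symbol))
            .orElse("service not found");
    }

    public static void main(String[] args) {
        System.out.println(lookupGene("pTEN"));
    }
}
```

The point of the indirection is that the requestor depends only on the registry and the service name, so a third-party provider could be published under the same name without changing the requesting code.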


Clinical Trials Web Services

[Diagram: a caBIO Clinical Service (J2EE), a Third-Party Clinical Service (J2EE), and a Third-Party Clinical Service (.NET), each with a Web Server and Component Container, publish to a Clinical Trials Registry. Clinical Trials Application(s) (1) locate services in the registry and (2) invoke them.]


bioMOBY and caBIO

[Diagram components: MOBY Central; caBIO; MOBY Client(s); MOBY Server(s). Labeled interactions: Register – MOBY Object; 1. Query – Object Type; 2. Results – WSDL; 3. Transact – URI Query]


caBIO Benefits

  • Provides an abstraction layer that allows developers to access genomic information using a standardized tool set without concerns for implementation details

  • Allows developers to obtain the information they need from a variety of data sources without managing the data themselves

  • Manages the display of large volumes of data to assist in load balancing

  • Provides an effective mechanism for performing complex queries that rely on diverse data sources

  • Facilitates information sharing without managing linkages between multiple data sources


Lessons learned

  • Integration

    • Data

    • Framework

  • Vocabulary

    • Implement upfront


Supporting Technologies

[Diagram labels: Genome Anatomy, Genetic Variants; Animal Models; Clinical Research Data; Gene Expression Data; Laboratory Data; Molecular Targets, Pathways]


Gene Expression Data Portal (GEDP) http://gedp.nci.nih.gov

  • The Gene Expression Data Portal (GEDP) allows users to submit, search, and analyze microarray experiments (Affymetrix and spotted arrays)

    • The microarray database was designed based on industry standards (MAGEML)


Experiment Details

  • Researchers can retrieve experiment details, data sets, and MAGEML documents

    • The GEDP automatically generates MAGEML from submitted experiments


Microarray Analysis

Powered by caBIO!

  • Researchers can retrieve a list of pathways represented in the selected array

  • Researchers can view each pathway, the genes on the chip, and view expressed genes and the level of gene expression


caLIMS http://lpglims.nci.nih.gov

  • caLIMS is an “Enterprise” Web-based system designed to automate workflow in the laboratory

    • Workflow operations include user and laboratory administration, inventory, project creation, project execution, project results, collaboration, and equipment operation

  • caLIMS can be extended to interface with a variety of laboratory equipment and analysis tools


Cancer Models Database (CMD) http://cancermodels.nci.nih.gov

  • The Cancer Models Database allows both intramural and extramural researchers to search and submit mouse models

    • All models submitted by extramural researchers are curated to ensure data integrity


Model Components

  • Researchers can search for and submit model information including:

    • General Information

    • Genetic Descriptions

    • Carcinogenic Agents

    • Publications

    • Histopathology

    • Therapeutic Approaches

    • Cell Lines

    • Images

    • Microarray Data (via GEDP interface)


Controlled Vocabulary

  • A generic vocabulary tree browser provides an interface to NCICB vocabulary

  • The vocabulary browser was designed to provide ontology browsing and the selection of specific concepts

    • An API is available to facilitate re-use across NCICB applications


caImage http://caimageportal.nci.nih.gov

  • The NCICB has developed an image portal to allow researchers to search for mouse and human images and annotations

    • Human and mouse images and annotations were provided by the MMHCC


Image Annotations

  • Image annotations are represented as XML

    • Image annotations may include a detailed description, species, organ, diagnosis, strain, and image dimensions across regions of interest

  • The NCICB is exploring existing standards (DICOM-SR) and interfacing with federated image servers (MIRC)

  • Future annotation tools may leverage MYcaBIO technologies
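To make the idea of an XML image annotation concrete, here is a minimal sketch that builds one with the JDK's DOM API. The slides do not give the caImage schema, so the element names (`imageAnnotation`, `description`, `species`, `organ`, `diagnosis`, `strain`) are assumptions chosen to mirror the fields listed above, the values in `main` are made up, and regions of interest are left out for brevity.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical annotation layout for illustration; not the caImage schema.
final class ImageAnnotationSketch {

    /** Builds an annotation document with one child element per field. */
    static Document annotation(String description, String species, String organ,
                               String diagnosis, String strain) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("imageAnnotation");
            doc.appendChild(root);
            addField(doc, root, "description", description);
            addField(doc, root, "species", species);
            addField(doc, root, "organ", organ);
            addField(doc, root, "diagnosis", diagnosis);
            addField(doc, root, "strain", strain);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("no XML parser available", e);
        }
    }

    private static void addField(Document doc, Element parent, String name, String value) {
        Element field = doc.createElement(name);
        field.setTextContent(value);
        parent.appendChild(field);
    }

    /** Reads the text of the first element with the given name. */
    static String field(Document doc, String name) {
        return doc.getElementsByTagName(name).item(0).getTextContent();
    }

    public static void main(String[] args) {
        // Made-up values, used only to exercise the sketch.
        Document doc = annotation("example description", "mouse", "lung",
                                  "example diagnosis", "example strain");
        System.out.println(field(doc, "species") + " / " + field(doc, "organ"));
    }
}
```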



Summary

Building integrative technologies increases the ‘ROIT’ and facilitates innovation

caBIO:

http://ncicb.nci.nih.gov/core/caBIO

caBIO users list:

http://list.nih.gov/archives/cabio_users.html

caBIO developers list:

http://list.nih.gov/archives/cabio_developers.html


Acknowledgements

NCICB: Kenneth Buetow, Peter Covitz, Carl Schaefer, Robert Clifford, Mike Edmonson, Frank Hartel, Sherri DeCoronado

SAIC: Scott Gustafson, Mike Connolly, Joshua Phillips, Brian Levine

