CaArray: Cancer Array Informatics

caArray: Cancer Array Informatics

Open Source Tools for Microarray Data Management, Analysis and Annotation

caArray overview & demo

Mervi Heiskanen (15 min)

caArray architecture

Scott Gustafson (15 min)

webCGH overview & demo

David Hall (15 min)

http://caarray.nci.nih.gov/


caArray Data Portal & Data Analysis Tools

  • Data Portal: Promotes data sharing through submission of original, raw data files with associated experiment and sample information.

  • Data analysis and visualization tools:

    • webCGH (NCICB/RTI), XpressionWay (NCICB/SAIC)

    • caBIG tools:

      • caWorkbench - Columbia

      • DWD - UNC Lineberger

      • GenePattern - MIT/Broad ?

      • Magellan - UC San Francisco

      • VISDA - Georgetown

      • Cancer Molecular Pages - Burnham

      • Function Express - Wash U Siteman

      • GoMiner - NCI/CCR


caArray version 1.0

  • Key features:

  • MIAME 1.1 compliant data annotation forms

  • Support for Affymetrix and GenePix native files

  • MAGE-ML import and export

  • controlled vocabularies (MGED ontology)

  • access to data via MAGE-OM API

  • caArray installations:

  • NCICB caArray instance supports NCI funded programs.

  • Local installations at the cancer centers:

    caBIG funded caArray adopters (Lombardi, Wistar, NYU)


  • caArray listservs:

  • caArray developers

  • caArray users

  • caArray team


caArray: Compliance with Standardization Efforts

  • MIAME

    • Minimum Information About a Microarray Experiment

    • 1.1 Draft 6 (April 1, 2002)

    • http://www.mged.org/Workgroups/MIAME/miame_1.1.html

  • MAGE-ML

    • MicroArray and GeneExpression Object Model and Markup Language

    • 1.1 (October 2003)

    • http://www.omg.org/docs/formal/03-10-01.pdf

  • MGED Ontology

    • Microarray Gene Expression Data Ontology

    • 1.1.8 (April 2004)

    • http://mged.sourceforge.net/ontologies/MGEDontology.php

caBIG compatibility guidelines

http://cabig.nci.nih.gov/guidelines_documentation/caBIG_Compatibility_Document


  • class TechnologyType

  • namespace:

    • http://mged.sourceforge.net/ontologies/MGEDOntology.daml#

  • documentation:

    • The technology type or platform of the reporters on the array.

  • type:

    • primitive

  • superclasses:

    • ArrayDesignPackage

  • used in classes:

    • FeatureGroup

  • used in individuals:

    • in_situ_oligo_features

    • spotted_antibody_features

    • spotted_colony_features

    • spotted_ds_DNA_features

    • spotted_protein_features

    • spotted_ss_oligo_features

  • class CellLineDatabase

  • namespace:

    • http://mged.sourceforge.net/ontologies/MGEDOntology.daml#

  • documentation:

    • Database of cell line information.

  • type:

    • primitive

  • superclasses:

    • Database

  • used in classes:

    • CellLine

  • used in individuals:

    • ATCC_Cultures

    • CABRI_Human_and_Animal_Cell_lines


caArray Phase 2

  • caArray 1.2 (June 2005)

    • Support for additional file formats via a software toolkit

    • Public search without login

    • Copy bio sample information

  • caArray 1.5 (September 2005)

    • XpressionWay, pathway visualization tool

    • Integration with caDSR 3.0

  • caArray 1.7 (December 2005)

    • Store filtered and normalized data

    • User management user interface

  • caArray 2.0 (March 2006)

    • Embedded MAGE-ML validation

All releases: defect fixes and usability enhancements


Acknowledgements

  • NCICB/SAIC

  • Development team:

  • Hangjiong Chen

  • Scott Gustafson

  • Juergen Lorenz

  • John Moy

  • Sumeet Muju

  • Beth Neuberger

  • Phu Tran

  • Jim Zhou

  • QA:

  • Durga Addepalli

  • Andrew Shinohara

  • Ye Wu

  • NCICB/TerpSys

  • Don Swan, Jamie Keller

  • Research Triangle Institute

  • David Hall (webCGH)

NCICB

Sue Dubman, Mervi Heiskanen, Xiaopeng Bian, Subha Madhavan, Carl Schaefer, Gilberto Fragoso, Denise Warzel…

and Ken Buetow


caARRAY’s Architecture

Credits to

Sumeet Muju

Phu Tran


caArray Architecture

[Architecture diagram. Box labels recovered from the slide; positions and connecting arrows are not recoverable.]

  • Containers: Tomcat web container, EJB container

  • Client side: browser, FTP applet

  • Web tier: Struts, JSP, servlet, security

  • EJBs: Vocab Mgr EJB, Security Mgr EJB, Protocol Mgr EJB, Experiment Mgr EJB, other Mgr EJBs

  • caCORE: caBIO, caDSR, EVS, vocab interface

  • Data representation: Data Transfer Object (DTO), security objects, MAGE-Objects (MAGE-stk), MAGE manager

  • Persistence: Object Relational Bridge (OJB), caArray DB and a second box labelled DB

  • File handling: MAGE-ML (Experiment and ArrayDesign), native data, MAGE-ML importer MDB, file uploader MDB, FTP staging area, file share, NetCDF API

  • MAGE-OM API: MAGE-OM JAR, MAGE-OM objects, RMI Mgr, MAGE-OM persistence


caArray Interfaces: caArray EJB API

  • caArray EJB API: Provides transaction control, asynchronous processes, service location, common security and distributed capabilities for submission and retrieval of microarray experiments.

    • The caArray presentation layer uses this functionality via the caArray EJB API.

    • Data Transfer Objects (DTOs) are used to transfer data between the calling application and the EJBs.

    • APIs can be used for federated access and submission of transaction data.
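
The DTO idea on this slide can be sketched in a few lines of Java. This is only an illustration: the class names ExperimentDTO and DtoSketch and the copyAcrossBoundary helper are hypothetical and are not part of the caArray EJB API. The helper serializes and deserializes the object to mimic what happens when a DTO is passed by value between a calling application and a remote EJB.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Illustration only: simulates passing a DTO by value across a remote boundary. */
final class DtoSketch {

    /** Serializes and then deserializes the DTO, as a remote call would. */
    static <T extends Serializable> T copyAcrossBoundary(T dto) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(dto);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("DTO could not be copied", e);
        }
    }

    public static void main(String[] args) {
        ExperimentDTO sent = new ExperimentDTO(
            "gov.nih.nci.ncicb.caarray:Experiment:789:1", "Sample Experiment");
        ExperimentDTO received = copyAcrossBoundary(sent);
        System.out.println(received.name + " (same instance: " + (received == sent) + ")");
    }
}

/** Hypothetical DTO: plain serializable fields and no behaviour. */
final class ExperimentDTO implements Serializable {
    private static final long serialVersionUID = 1L;
    final String identifier;
    final String name;

    ExperimentDTO(String identifier, String name) {
        this.identifier = identifier;
        this.name = name;
    }
}
```

Keeping the DTO free of behaviour and persistence references means the receiver gets a detached copy, so changes on one side of the boundary do not silently affect the other.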


caArray Interfaces: MAGE-OM API

  • MAGE-OM API: Provides fine-grained search and retrieval of all caArray data via a caBIO-like, RMI-based API.

    • The MAGE-OM API maps the MAGE objects to the new caArray database schema.

    • RMI Security module incorporated for user/group level data access.

    • NetCDF API logic incorporated for faster retrieval of data

    • Built to be grid enabled


caArray Middleware

  • Data Representation

    • Data Transfer Objects (DTO)

    • MicroArray Gene Expression Software Toolkit (MAGE-stk)

    • DTO - MAGE-stk Conversion

  • Data Persistence

    • Data Access Layer

      • ObJectRelationalBridge (OJB)

      • OJB Abstraction Layer and Data Access Objects (DAO)

    • EJB Layer

      • Stateless Session Façade

      • Bean-managed Persistence

    • NETCDF Files

      • Large Data Set

      • Fast Binary Access

  • MAGE-ML Import and Export

    • Message-Driven Beans


MAGE-ML Import and Export: An Example

<MAGE-ML identifier="gov.nih.nci.ncicb.caarray:MAGEML:123:1">
  <AuditAndSecurity_package>
    <Contact_assnlist>
      <Person identifier="gov.nih.nci.ncicb.caarray:Person:456:1"
              lastName="Doe"
              firstName="John">
      </Person>
    </Contact_assnlist>
  </AuditAndSecurity_package>
  <Experiment_package>
    <Experiment_assnlist>
      <Experiment identifier="gov.nih.nci.ncicb.caarray:Experiment:789:1"
                  name="Sample Experiment">
        <Descriptions_assnlist>
          <Description text="This is a sample experiment."></Description>
        </Descriptions_assnlist>
        <Providers_assnreflist>
          <Person_ref identifier="gov.nih.nci.ncicb.caarray:Person:456:1"/>
        </Providers_assnreflist>
      </Experiment>
    </Experiment_assnlist>
  </Experiment_package>
</MAGE-ML>

Person: Identifiable element

Person_ref: referenced Identifiable element to be resolved
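
To make the idea of resolving a reference concrete, here is a minimal sketch using only the standard Java DOM parser. It is not the caArray importer (which is described as SAX-based and backed by a database). The class name RefResolver is hypothetical, and the sketch assumes that any element whose tag name ends in _ref refers, through its identifier attribute, to an element defined elsewhere in the same document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Illustration only: resolves *_ref elements against identifiers in the same document. */
final class RefResolver {

    /** Maps each reference's identifier to the tag name it resolves to, or null if unresolved. */
    static Map<String, String> resolve(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("input is not well-formed XML", e);
        }
        NodeList all = doc.getElementsByTagName("*"); // every element, in document order

        // Pass 1: index the elements that define an identifier.
        Map<String, String> defined = new HashMap<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (e.hasAttribute("identifier") && !e.getTagName().endsWith("_ref")) {
                defined.put(e.getAttribute("identifier"), e.getTagName());
            }
        }

        // Pass 2: look up each reference in the index.
        Map<String, String> resolved = new LinkedHashMap<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (e.getTagName().endsWith("_ref")) {
                String id = e.getAttribute("identifier");
                resolved.put(id, defined.get(id));
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        String xml =
            "<MAGE-ML identifier=\"gov.nih.nci.ncicb.caarray:MAGEML:123:1\">"
            + "<Person identifier=\"gov.nih.nci.ncicb.caarray:Person:456:1\"/>"
            + "<Person_ref identifier=\"gov.nih.nci.ncicb.caarray:Person:456:1\"/>"
            + "</MAGE-ML>";
        System.out.println(resolve(xml));
    }
}
```

Two passes are used so that a reference can appear before the element it points to.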


MAGE-ML Import and Export

  • The importer is a modified version of the MAGE-stk’s SAX-based MAGE-ML parser, extended with a persistence mechanism to insert, update and resolve (look up) parsed objects

  • Any valid MAGE-ML can be imported. The importer assumes its input is valid; validation is typically done with ArrayExpress’s MAGEValidator

  • Identifiable objects are first resolved from the database by matching their identifier; if one is resolved, the existing object is updated from the incoming one

    • The identifier is the globally unique key of a MAGE object across domains for its entire lifecycle

    • The identifier is separate from the persisted MAGE-stk object’s primary key, which is internal to caArray
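
The resolve-then-update-or-insert behaviour described above can be sketched with an in-memory map standing in for the database. Everything here is an assumption made for illustration: IdentifiableStore, StoredObject and importObject are hypothetical names, not caArray or MAGE-stk classes, and the sketch takes "updated" to mean that the existing record receives the incoming values while keeping its internal primary key.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: an in-memory stand-in for identifier-based import. */
final class IdentifiableStore {
    private final Map<String, StoredObject> byIdentifier = new HashMap<>();
    private long nextPrimaryKey = 1;

    /** Resolves by identifier; updates the match if found, otherwise inserts a new object. */
    StoredObject importObject(String identifier, String name) {
        StoredObject existing = byIdentifier.get(identifier);
        if (existing != null) {
            existing.name = name; // update in place; the primary key is unchanged
            return existing;
        }
        StoredObject created = new StoredObject(nextPrimaryKey++, identifier, name);
        byIdentifier.put(identifier, created);
        return created;
    }

    int size() {
        return byIdentifier.size();
    }

    public static void main(String[] args) {
        IdentifiableStore store = new IdentifiableStore();
        String id = "gov.nih.nci.ncicb.caarray:Experiment:789:1";
        StoredObject first = store.importObject(id, "Sample Experiment");
        StoredObject second = store.importObject(id, "Sample Experiment (revised)");
        System.out.println(first.primaryKey == second.primaryKey); // same stored object
        System.out.println(second.name);
    }
}

/** Hypothetical persisted object: internal primary key plus global identifier. */
final class StoredObject {
    final long primaryKey;   // internal to the store
    final String identifier; // globally unique, carried in the MAGE-ML
    String name;

    StoredObject(long primaryKey, String identifier, String name) {
        this.primaryKey = primaryKey;
        this.identifier = identifier;
        this.name = name;
    }
}
```

Importing the same identifier twice leaves one stored object, which is what lets a MAGE-ML file be re-imported without creating duplicates.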


MAGE-ML Export

  • The entire object graph of an object, e.g., ArrayDesign, Experiment, is traversed to collect all Identifiable objects

  • The MAGE-stk’s MAGEJava object is utilized to contain all the Identifiable objects collected

    • When an Identifiable object is encountered, the appropriate method in the MAGEJava object is discovered and invoked using reflection to store the object into it

  • Finally, MAGEJava.writeMAGEML(Writer) is invoked, which recursively invokes the same method on all the contained Identifiable objects.

  • Xerces’s XMLSerializer pretty-formats the XML content as it is being written with appropriate new lines and indentations
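
The reflection step in the export can be illustrated as follows. This is a sketch under stated assumptions, not the MAGE-stk code: ExportContainer, Person, Experiment and the add<ClassName> naming convention are hypothetical, chosen only to show how a method can be discovered from the runtime type of the object and then invoked.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Illustration only: stores objects in a container by discovering the right method at runtime. */
final class ReflectiveCollector {

    /** Looks up a public method named "add" + simple class name and invokes it. */
    static void store(Object container, Object identifiable) {
        Class<?> type = identifiable.getClass();
        try {
            Method adder = container.getClass().getMethod("add" + type.getSimpleName(), type);
            adder.invoke(container, identifiable);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no add method for " + type.getSimpleName(), e);
        }
    }

    public static void main(String[] args) {
        ExportContainer container = new ExportContainer();
        store(container, new Person("gov.nih.nci.ncicb.caarray:Person:456:1"));
        store(container, new Experiment("gov.nih.nci.ncicb.caarray:Experiment:789:1"));
        System.out.println(container.persons.size() + " person(s), "
            + container.experiments.size() + " experiment(s)");
    }
}

/** Hypothetical Identifiable types. */
final class Person {
    final String identifier;
    Person(String identifier) { this.identifier = identifier; }
}

final class Experiment {
    final String identifier;
    Experiment(String identifier) { this.identifier = identifier; }
}

/** Hypothetical container with one typed add method per Identifiable class. */
final class ExportContainer {
    final List<Person> persons = new ArrayList<>();
    final List<Experiment> experiments = new ArrayList<>();

    public void addPerson(Person p) { persons.add(p); }
    public void addExperiment(Experiment e) { experiments.add(e); }
}
```

Reflection avoids a long chain of type checks in the traversal code: supporting a new Identifiable type only requires a matching method on the container.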


A caArray Configuration

[Configuration diagram. The slide appears to show two installations, one labelled caArray 1 and one labelled NCICB. Each carries the box labels caWorkbench, caBIO, caArray schema, caDSR / EVS, Security (NCICB Security at NCICB), caARRAY EJB and MAGE-OM API. Further labels: Java app, Grid (future), MAGE-ML.]


webCGH

A web application for the visualization and analysis of array-based CGH and gene expression data

David Hall, Ph.D.

Research Triangle Institute


arrayCGH


webCGH Functions

  • Visualization of copy number and gene expression levels

  • Interrogation of genome features

  • Data normalization and analysis

  • Virtual experiments


Whole-genome View


Ideograms


Chromosome 17


Chromosome 17


Zoom


Annotated Genes


Gene List


Gene Watch


Data Flow

[Data flow diagram. Box labels recovered from the slide: Database (two boxes), Adaptor (two boxes), Transformer, Op (four boxes), Analytical Pipeline, Cache, Plot Generator.]


Analytical Pipelines


Architecture


Key Design Features


Key Design Features


Past, Present, Future

  • Dec. 2003 – Version 1.0

    • Basic plots, analytics, GEDP

  • March 2005 – Version 2.0

    • More plots, analytics, caArray

  • Late April 2005 – Version 2.1

    • Mouse/human plots

    • CGH/gene expression

    • SKY/M-FISH & CGH integration


webCGH Team

  • NCICB

    • Mervi Heiskanen

  • RTI

    • David Hall

    • Vesselina Bakalov

    • Ying Chen

    • Matt Westlake

    • Bing Liu

    • Laxminarayana Ganapathi

    • Sheping Li

    • Stuart Allen

