
Support for MAGE-TAB in caArray 2.0


danno


Presentation Transcript


  1. Support for MAGE-TAB in caArray 2.0 Overview and feedback MAGE-TAB Workshop January 24, 2008

  2. Agenda • Brief overview of caArray 2.0 • caArray 2.0 and MAGE-TAB • MAGE-TAB feedback

  3. What is caArray? • caArray is a caBIG™-compliant microarray data repository at the NCICB • Developed to support a federated model of microarray data sharing • Developed in line with MIAME and MAGE guidelines [Figure: caArray 1.6 and caArray 2.0]

  4. Goals of caArray 2.0 • Address Adopter feedback gained from our 1.x experience • Improve the user experience for storing and retrieving data produced • Simplify and improve the performance of data access through the API and grid service, for analytical applications • Harmonize with caBIG™ tissue repository (caTissue) and annotation repository (caBIO) • Support additional array platforms, including SNP arrays • Organize the application around workflow between investigators and the labs that serve them • Use an agile software development approach that will allow more frequent feature additions and better responsiveness to the user community

  5. Features of caArray 2.0 • Store array data associated with experiment and sample annotations • Data entry through graphical user interface or MAGE-TAB • Parse Affymetrix, Illumina and GenePix formats for expression and SNP arrays • Role-based permissions for data access • Programmatic access via a Java API and grid service • Manage protocols and controlled vocabularies • MGED Ontology 1.3.1 comes pre-loaded • Basic Browse and Search Functionality

  6. caArray 2.0 Annotations • Capture information for • Experiment information • Contacts • Publications • Sample Annotations • Source • Sample • Extract • Labeled Extracts • Hybridizations

  7. caArray 2.0 supported formats • Parsable file formats • Annotation: MAGE-TAB (.ADF, .IDF, .SDRF) • Array data, parsed: Affymetrix Expression and SNP (.CDF, .CEL, .CHP); Illumina Expression and SNP (.CSV); GenePix (.GAL, .GPR) • Unparsed formats • Affymetrix: .dat, .exp, .rpt, .txt • Illumina: .txt, .idat • Agilent: .txt, .tsv • ImaGene: .txt, .tiv • Nimblegen: .txt, .gff
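
The format list on slide 7 can be read as a lookup from file extension to handling category. The following is a minimal Python sketch of that reading, not caArray code: the extension sets are copied from the slide, while the `classify` function and its category labels are hypothetical. It assumes each file carries exactly one of the listed extensions.

```python
from pathlib import PurePath

# Extension groups copied from slide 7; the grouping logic is illustrative only.
ANNOTATION = {".adf", ".idf", ".sdrf"}
PARSED = {
    ".cdf": "Affymetrix", ".cel": "Affymetrix", ".chp": "Affymetrix",
    ".csv": "Illumina",
    ".gal": "GenePix", ".gpr": "GenePix",
}
UNPARSED = {".dat", ".exp", ".rpt", ".txt", ".idat", ".tsv", ".tiv", ".gff"}


def classify(filename: str) -> str:
    """Return a hypothetical handling category for a file name."""
    ext = PurePath(filename).suffix.lower()  # last extension, case-insensitive
    if ext in ANNOTATION:
        return "annotation (MAGE-TAB)"
    if ext in PARSED:
        return f"parsed array data ({PARSED[ext]})"
    if ext in UNPARSED:
        return "unparsed"
    return "unknown"
```

Because `.txt` appears under several vendors in the unparsed list, the extension alone cannot identify the vendor for those files, which is why the sketch reports only the category.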

  8. caArray 2.0 permissions • Role-based permissions for each Installation • Anonymous user • System Administration • Principal Investigator/Biostatistician/Lab Administrator/Lab Scientist • Data is Private until made Public • Experiment title, PI, # samples are visible but experiment content is not available to the anonymous user • Collaboration groups can be managed by the PI for pre-public collaboration • CSM 4.0 • Experiment-level and samples-level security
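
The visibility rule on slide 8 (title, PI and sample count are always visible; content only once the experiment is public) could be sketched as below. This is an illustration in Python under stated assumptions, not the CSM-based implementation: the `Experiment` class, its field names and `view_for_anonymous` are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Experiment:
    # Hypothetical record; field names are not taken from the caArray model.
    title: str
    pi: str
    sample_count: int
    is_public: bool = False          # "Data is Private until made Public"
    content: list = field(default_factory=list)


def view_for_anonymous(exp: Experiment) -> dict:
    """Return what an anonymous user may see for one experiment."""
    # The summary fields are visible regardless of public/private status.
    view = {"title": exp.title, "pi": exp.pi, "sample_count": exp.sample_count}
    # Experiment content is exposed only after the experiment is made public.
    if exp.is_public:
        view["content"] = exp.content
    return view
```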

  9. caArray 2.0 API and Grid Service • Support for MAGE-TAB level of annotation – Simplified implementation of MAGE • API provides a data service and analytical services • Data service allows users to use CQL to issue queries that traverse the domain model • Analytical services provide convenience methods for data access

  10. caArray 2.0 browse and search • Browse by • Experiments • Organism • Provider • Array design • Search by specifying • Keyword • Category

  11. MAGE-TAB in caArray 2.0 • Support MAGE-TAB v1.0 – ADF, IDF, SDRF • Term Source providers and associated Terms are captured as Controlled Vocabularies (Manage Vocabularies) • Protocols imported and viewable in Manage Protocols • Characteristics displayed on the relevant detail pages • Original files are stored in association with the Experiment • Edits made to the information in the UI are not reflected in these files • Future feature – MAGE-TAB export based on current database values
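
For readers unfamiliar with the IDF layout, the sketch below shows how a row-oriented, tab-delimited IDF could be read into a dictionary in Python. It is not the caArray importer: `parse_idf` is a hypothetical helper, and the sample content (title, term source names, protocol names, SDRF file name) is invented for illustration.

```python
import csv
import io


def parse_idf(text: str) -> dict:
    """Map each IDF field name (first column) to its list of non-empty values."""
    fields = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if not row or not row[0].strip():
            continue  # skip blank rows
        fields[row[0].strip()] = [v.strip() for v in row[1:] if v.strip()]
    return fields


# Invented example content; real IDF files carry many more fields.
sample_idf = (
    "Investigation Title\tExample experiment\n"
    "Term Source Name\tMO\tNCBITaxon\n"
    "Protocol Name\tP-EXTRACT\tP-LABEL\n"
    "SDRF File\texample.sdrf\n"
)

idf = parse_idf(sample_idf)
term_sources = idf["Term Source Name"]  # candidates for controlled vocabularies
protocols = idf["Protocol Name"]        # candidates for Manage Protocols
```

In this reading, the `Term Source Name` and `Protocol Name` rows are the ones that would feed the Manage Vocabularies and Manage Protocols views mentioned on the slide.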

  12. MAGE-TAB for data migration • caArray 1.6 >> caArray 2.0 • Experiments in caArray 1.6 being migrated to 2.0 are being exported in MAGE-TAB format along with the associated native array data files • Challenges included • MAGE-OM >> MAGE-TAB mapping • Most challenges due to validation that all data “made it” over (not really a MAGE-TAB issue) • Manual checking still needed • Jackson Labs internal MAD database >> caArray 2.0

  13. MAGE-TAB Feedback • Initial experience with end-user-type customers is that there is a learning curve associated with using the SDRF, especially with regard to applying controlled vocabularies • Need tools to facilitate this • Source vs. Sample vs. Extract vs. Labeled Extract • Often confusion over “what goes where” • From Jackson Labs: • Documentation is good for a biologist-type end-user, but a software engineer would like more detail • More real-life examples would be helpful
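
To illustrate the "what goes where" question, the sketch below reads one SDRF row and lists the material chain from Source through Hybridization. It is a Python illustration with invented data, not a caArray tool: `material_chains` is a hypothetical helper, and it assumes the node columns are not repeated in the header (other SDRF columns, such as protocol references, may repeat and are ignored here).

```python
import csv
import io

# SDRF node columns in the order the material flows through the experiment.
NODE_COLUMNS = [
    "Source Name",
    "Sample Name",
    "Extract Name",
    "Labeled Extract Name",
    "Hybridization Name",
]


def material_chains(sdrf_text: str) -> list:
    """Return, for each SDRF row, the (column, value) pairs of its node chain."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    return [
        [(col, row[col]) for col in NODE_COLUMNS if row.get(col)]
        for row in reader
    ]


# Invented one-row example: one mouse, one liver sample, one RNA extract.
sample_sdrf = (
    "Source Name\tCharacteristics[Organism]\tSample Name\tExtract Name\t"
    "Labeled Extract Name\tLabel\tHybridization Name\n"
    "mouse1\tMus musculus\tliver1\tliver1_rna\tliver1_rna_biotin\tbiotin\thyb1\n"
)

chains = material_chains(sample_sdrf)
```

Read left to right, the single chain here says: organism-level material (`mouse1`) yields a tissue sample (`liver1`), which yields an extract, which is labeled and then hybridized.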

  14. Specific requests to consider • Need a way to specify required fields for particular implementations • caArray UI has certain required fields – need to be able to specify these in a MAGE-TAB template • Associate “Supplemental” files with an experiment • In IDF, recommend adding a field to specify the type of array experiment (Gene Expression, SNP, aCGH, etc.)
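
The first request, implementation-specific required fields, could take a shape like the Python sketch below. Everything here is an assumption for illustration: MAGE-TAB v1.0 defines no such mechanism, `missing_required` is a hypothetical helper, and `Experiment Type` is a made-up name standing in for the proposed experiment-type IDF field.

```python
# Hypothetical per-installation list of IDF fields that must be filled in.
# "Experiment Type" is an invented name for the field proposed on the slide.
REQUIRED_BY_INSTALLATION = ["Investigation Title", "Experiment Type"]


def missing_required(fields: dict, required: list) -> list:
    """Return required field names that are absent or have no non-blank value.

    `fields` maps an IDF field name to its list of values.
    """
    return [
        name
        for name in required
        if not any(value.strip() for value in fields.get(name, []))
    ]


# Invented example: the title is present, the experiment type is blank.
example_fields = {
    "Investigation Title": ["Example experiment"],
    "Experiment Type": [""],
}
problems = missing_required(example_fields, REQUIRED_BY_INSTALLATION)
```

A template that carried such a list would let a submitter find missing values before upload rather than at the UI's validation step.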
