
MAGE-TAB - The ArrayExpress Production Experience

This document discusses the ArrayExpress production experience, including data acquisition, validation, extension, and downloads. It also covers the long-term future of ArrayExpress and provides a tutorial on submitting in MAGE-TAB format.


Presentation Transcript


  1. MAGE-TAB - The ArrayExpress Production Experience Helen Parkinson, PhD

  2. Content • All change at ArrayExpress • Data acquisition • Validation • Extension • Downloads • Long Term Future • Tutorial – submitting in MAGETAB format

  3. [Diagram of the submission pipeline and its migration; labels: MAGEML, AE, M.EXPRESS, MAGETABULATOR, MAGETAB, MIGRATION, Tracking, AE2]

  4. Data acquisition • MAGETAB data acquisition is integrated with existing tab2mage submissions • MAGETAB export is being added to the MIAMExpress system • All MAGE-ML submissions will be converted to MAGETAB • We will unify data acquisition on MAGETAB • We decided to do most curation/validation/ontology matching at the end for MAGETAB submissions • MAGETAB makes curator edits and user updates much easier • Human-readable tab-delimited formats = efficient curation • 1600 experiments processed (1600/3700) • All curated • A subset of ArrayExpress MAGETAB data will be re-curated at migration

  5. Automated processing and validation • Sections • MAGETAB Column Headers • MAGETAB Column Orders • MAGETAB Content – length, terms • External data files – released monthly • vs. ArrayExpress content • MIAME score • DW candidates
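
The header, column-order, and content-length checks listed on slide 5 could be sketched roughly as below. This is a minimal illustration, not the ArrayExpress validation code: the header patterns cover only a small subset of MAGE-TAB SDRF columns, and the ordering rule, the `MAX_CELL_LENGTH` limit, and the function name `validate_sdrf` are assumptions made for the example.

```python
import csv
import io
import re

# Assumption: a small illustrative subset of SDRF column headers,
# not the full set accepted by the real validator.
KNOWN_HEADERS = [
    re.compile(r"^Source Name$"),
    re.compile(r"^Characteristics\s*\[.+\]$"),
    re.compile(r"^Hybridization Name$"),
    re.compile(r"^Array Data File$"),
    re.compile(r"^Factor Value\s*\[.+\]$"),
]

# Hypothetical ordering rule: these columns, when present, appear in this order.
NODE_ORDER = ["Source Name", "Hybridization Name", "Array Data File"]

# Hypothetical content-length limit for a single cell.
MAX_CELL_LENGTH = 255


def validate_sdrf(text):
    """Return a list of human-readable problems found in SDRF-like text."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    if not rows:
        return ["file is empty"]

    problems = []
    header = [column.strip() for column in rows[0]]

    # Column headers: every header must match a known pattern.
    for column in header:
        if not any(pattern.match(column) for pattern in KNOWN_HEADERS):
            problems.append(f"unknown column header: {column!r}")

    # Column order: positions of the node columns must be increasing.
    positions = [header.index(name) for name in NODE_ORDER if name in header]
    if positions != sorted(positions):
        problems.append("node columns are out of order")

    # Content: each row has one cell per header, and no cell is too long.
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} cells, found {len(row)}"
            )
        for cell in row:
            if len(cell) > MAX_CELL_LENGTH:
                problems.append(
                    f"line {line_no}: cell exceeds {MAX_CELL_LENGTH} characters"
                )
    return problems
```

Returning a list of messages rather than raising on the first error fits the curation workflow described in the slides, where a curator would want to see every problem in a submission at once.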

  6. Extensibility • Solexa data • Proteomics • Metabolomics • Array Genotype data (Gen2Phen) • Association study data (Gen2Phen, Engage) • Locus specific SNP data • Clinical Data • …..

  7. Downloads • All ArrayExpress data will be available in MAGETAB format now (exported direct from AE) • ~90% is currently available and passes checks (issues with MAGE-OM->MAGETAB) • More ontology term sources will be added incrementally – NCI thesaurus/OBI/ArrayExpress Factor Ontology • Beta MAGETAB ArrayExpress Bioconductor Module (Huber, Kauffman) • All MAGETAB generation code is available • All validation code is available

  8. Ontologies • Working to develop OBI to replace MGED ontology • Generating a sample/factor ontology for ArrayExpress based on data content • Developed in Protégé/OWL format • Will be served from OLS • Also mapping to external ontologies for samples, e.g. NCI thesaurus • Text mining to annotate external data using dictionaries based on NCI thesaurus and some custom ones (GEOimporter, tab2mage->MAGETAB) • Data import, meta analysis

  9. Future: ArrayExpress and Community • ArrayExpress Submission in MAGETAB ADF format • All ArrayExpress ADF in MAGETAB format • Alpha ArrayExpress-MAGETAB BioConductor MAGETAB importer • AE2 • AE2 data migration • More people post their MAGETAB examples and we agree on a gold standard validated set for typical cases • Community lists of MAGETAB supportive tools where people can register their interests and describe their applications (like GO tools) • Addressing HLA • MAGETAB model, firm up the spec • Decide what factors really are, and whether the MAGE case is still valid – controlled vs uncontrolled variables instead? • Issues with global variables: inter-experiment comparison of compounds needs to know dose even if dose doesn’t vary within an experiment
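
The global-variable issue raised on slide 9 can be made concrete with a small sketch. It separates the variables that vary within an experiment from those held constant, and shows that a constant value such as dose is still recorded and so remains available for comparison across experiments. The function name `split_variables` and the sample annotations are hypothetical, invented for this illustration; nothing here is taken from the MAGE-TAB specification or ArrayExpress code.

```python
def split_variables(samples):
    """Split annotated variables into those that vary and those held constant."""
    names = sorted({name for sample in samples for name in sample})
    varying, constant = {}, {}
    for name in names:
        values = [sample.get(name) for sample in samples]
        if len(set(values)) > 1:
            # Differs between samples: behaves like an experimental factor.
            varying[name] = values
        else:
            # Same in every sample: a "global" variable for this experiment.
            constant[name] = values[0]
    return varying, constant


# Hypothetical annotations for a two-sample time-course experiment.
samples = [
    {"compound": "aspirin", "dose": "10 mg/kg", "time": "2 h"},
    {"compound": "aspirin", "dose": "10 mg/kg", "time": "24 h"},
]

varying, constant = split_variables(samples)
print("varies within experiment:", varying)
print("constant but still needed for comparison:", constant)
```

Here only time varies, yet compound and dose are kept in the constant set, which is the information a cross-experiment comparison of compounds would need.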

  10. Acknowledgments • Anna Farne • Ele Holloway • James Malone • Margus Lukk • ArrayExpress Production Team • Helen Parkinson • Tim Rayner • Faisal Rezwan • Eleanor Williams • Mengyao Zhao • Holly Zheng • Mohammad Shojatalab • ArrayExpress Development Team • Funding: EC (FELICS, EMERALD, Gen2Phen, MUGEN); NIH (MAGE grant)

  11. Tutorial • Creation of MAGETAB templates • Completion of a pre-made template • Curation • Scoring and validation templates • Viewing Data in ArrayExpress • Backend of the template generation/tracking system • www.ebi.ac.uk/~parkinso/MAGETAB_tutorial/
