
MAGE : Revised submission against LSR RFP-007 "Gene Expression"

Ugis Sarkans (EBI), Michael Miller (Rosetta Inpharmatics)


Presentation Transcript


  1. MAGE: Revised submission against LSR RFP-007 "Gene Expression" • Ugis Sarkans, EBI • Michael Miller, Rosetta Inpharmatics

  2. Overview • Acknowledgements • Specification history and structure • Fundamental Terms • UML Packages • Mapping from PIM to XML-PSM • Schedule • Resources

  3. Acknowledgements • Doug Bassett (Rosetta) • Derek Bernhart (Affymetrix) • Alvis Brazma (EBI) • Steve Chervitz (Affymetrix) • Francisco Dela Vega (Applied Biosystems) • Michael Dickson (NetGenics) • David Frankel (IONA) • Ken Griffiths (NetGenics) • Scott Markel (NetGenics) • Michael Miller (Rosetta) • Dave Nellesen (Incyte) • Alan Robinson (EBI) • Ugis Sarkans (EBI) • Barry Schwartz (Affymetrix) • Martin Senger (EBI) • Paul Spellman (Stanford) • Jason Stewart (NCGR) • Charles Troup (Agilent) • participants of MAGE programming jamboree (hosted by Iobion) in Toronto, September 2001

  4. Model-Driven Architecture • Platform Independent Model (UML) • most of the effort spent on this • Platform Specific Model • XML • UML (refined from PIM): • not used (Rational Rose profile for UML not that useful) • DTD • generated from PIM • manual modifications

  5. History of the submittal • lifesci/01-06-02 - an interim draft before the Danvers meeting • not enough time to work out XML • lifesci/01-08-01 - not the final submission • programming jamboree after the Toronto meeting helped a lot, especially in the XML mapping area • lifesci/01-10-01 - current submission

  6. Specification Structure • Text document with explanations, including all diagrams • prepared partly by exporting from Rational Rose • PIM, UML model as a single XMI file • XMI => DTD translation software (as a formal representation of the mapping rules) • XML DTD

  7. Fundamental Terms • BioSample - tissue, cell-line, etc. that may be treated • BioMaterial - generic term for biological-based material • BioSequence - an abstraction of a biological sequence • BioAssay • treatment of an array with a labeled extract, i.e. hybridization • experimental step in a broader sense

  8. Fundamental Terms (2) • Reporter - the physical representation of biosequence(s) on an array • Feature - location on an array • Event - description of an action, i.e. treatment of a BioSample or the act of hybridization • Transformation - a specific Event, transforming a set of data to another set of data.

  9. UML Packages (1) • BioSequence and BQS • BioMaterial • BioEvent • ArrayDesign and DesignElement • ArrayManufacture • BioAssay • BioAssayData

  10. UML Packages (2) • Experiment • HigherLevelAnalysis • Miscellaneous • Describable • Measurement • QuantitationType • Protocol • Audit and Security

  11. UML Packages (3) • [Diagram; labels shown: BioEvent, Protocol, Treatment, HigherLevelAnalysis, Transformation, Experiment, BioMaterial, BioAssayData, BioAssay, Audit, QuantitationType, ArrayManufacture, Measurement, DesignElement, ArrayDesign, Description, BSANE, BQS, BioSequence]

  12. Package dependencies

  13. Important package dependencies

  14. Experiment • Represents the container for a hierarchical grouping of BioAssays • ExperimentDesign describes and annotates the overall design and purpose of the experiment • Description of experimental steps can be structured by ExperimentalFactors/FactorValues: • ExperimentalFactor is a part of ExperimentDesign • FactorValues can be attached to BioAssays

  15. Experiment

  16. HigherLevelAnalysis • The results of performing analysis on the BioAssayData from an Experiment • Clustering allows specifying the results of analysis as a hierarchical tree • Cluster Nodes can have NodeValues and are associated with *Dimension objects

  17. BioAssayData • The data associated with either a measured BioAssay or a derived BioAssay • Data is conceptually a 3-D matrix, with dimensions: • BioAssayDimension • DesignElementDimension • QuantitationTypeDimension • Transformations are used to capture data processing sequence and rules • *Mapping objects formalize dimension translations • Two representations for BioDataValues: • a set of BioDataTuples • BioDataCube
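The two representations named on slide 17 can be illustrated with a minimal Python sketch. The function names, the nesting order of the cube (BioAssay, then DesignElement, then QuantitationType), and the sample labels are assumptions made for illustration; they are not taken from the MAGE specification.

```python
from itertools import product


def cube_to_tuples(cube, bioassays, design_elements, quantitation_types):
    """Flatten a 3-D nested list into (bioassay, design_element, quantitation_type, value) tuples."""
    return [
        (b, d, q, cube[i][j][k])
        for (i, b), (j, d), (k, q) in product(
            enumerate(bioassays),
            enumerate(design_elements),
            enumerate(quantitation_types),
        )
    ]


def tuples_to_cube(tuples, bioassays, design_elements, quantitation_types):
    """Rebuild the 3-D nested list; cells without a tuple stay None."""
    b_idx = {b: i for i, b in enumerate(bioassays)}
    d_idx = {d: j for j, d in enumerate(design_elements)}
    q_idx = {q: k for k, q in enumerate(quantitation_types)}
    cube = [
        [[None for _ in quantitation_types] for _ in design_elements]
        for _ in bioassays
    ]
    for b, d, q, value in tuples:
        cube[b_idx[b]][d_idx[d]][q_idx[q]] = value
    return cube


# Hypothetical labels: 2 bioassays x 2 design elements x 2 quantitation types.
bioassays = ["assay_1", "assay_2"]
design_elements = ["feature_A", "feature_B"]
quantitation_types = ["signal", "background"]
cube = [
    [[10.0, 1.0], [20.0, 2.0]],
    [[30.0, 3.0], [40.0, 4.0]],
]
tuples = cube_to_tuples(cube, bioassays, design_elements, quantitation_types)
```

The tuple form is convenient for sparse data, while the cube form is compact when every cell is filled; the sketch only shows that the two carry the same information.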

  18. BioAssayData

  19. BioAssayData • [Diagram; labels shown: DesignElement, BioAssay, Transformation, QuantitationType]

  20. QuantitationType • StandardQuantitationTypes and SpecializedQuantitationTypes • list of SQTs • can refer to a Channel object • QuantitationTypeMap - within BioAssayData package

  21. BioAssay • Three types of BioAssays (experimental steps): • PhysicalBioAssay • Contains information and annotation on the event of joining an Array with BioMaterial, typically with LabeledExtract(s); also, Treatments • MeasuredBioAssay • FeatureExtraction • DerivedBioAssay • corresponds to a dry-lab experimental step

  22. BioAssay

  23. Array • Manufacturing information about the implementation of an array design • Defects and deviations from the design can be recorded • FeatureDefects • ZoneDefects • The LIMS biomaterial information for what was put on each feature can be recorded here • ArrayGroups and Fiducials

  24. Array

  25. BioMaterial • Describes how a BioSource is treated to obtain the BioMaterial for Hybridization (typically a LabeledExtract) • Used by a BioAssayCreation in combination with an Array to produce a PhysicalBioAssay • A set of treatments is typically linear in time but can form a Directed Acyclic Graph • Formalization of Treatments with Compounds
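The remark that treatments can form a Directed Acyclic Graph can be made concrete with a short Python sketch. The material names and the pooling step are hypothetical; the ordering uses the standard-library `graphlib` module (Python 3.9 or later), which is simply one convenient way to order a DAG and is not something the specification prescribes.

```python
from graphlib import TopologicalSorter

# Hypothetical treatment graph: each material maps to the set of
# materials it was derived from (its sources).
derived_from = {
    "Extract": {"BioSource"},
    "LabeledExtract_Cy3": {"Extract"},
    "LabeledExtract_Cy5": {"Extract"},
    "PooledLabeledExtract": {"LabeledExtract_Cy3", "LabeledExtract_Cy5"},
}


def treatment_order(graph):
    """Return materials so that every source precedes what is derived from it.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(graph).static_order())


order = treatment_order(derived_from)
```

A purely linear series of treatments is the special case in which every material has exactly one source.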

  26. BioMaterial

  27. DesignElement • DesignElements • Features are the locations on the array • Reporters represent some biological sequence (clone, oligo, etc.) that can be placed on one or more features • immobilized characteristics • CompositeSequence is a grouping that represents a biological sequence composed of other biological sequences (gene, exon, etc.) • biological characteristics • *Maps - for relating Features to Reporters etc. • MismatchInformation

  28. DesignElement

  29. BioSequence • BioSequence class - abstraction of various biosequences • DatabaseEntries for characterizing BioSequences • Simplification of BSANE draft; will need to be compatible with the end result of BSANE

  30. ArrayDesign • ArrayDesign describes a microarray design that can be manufactured • Zone information • DesignElementGroups

  31. ArrayDesign

  32. BioEvent • Abstraction of various MAGE events: • physical (e.g., BioMaterial Treatment) • data manipulation (Transformation) • Have associated ProtocolApplications (an ordered list) • Subclasses have some target (the result of the BioEvent) • Often have sources • Relevant for BioMaterial, BioAssay, BioAssayData packages

  33. Protocol • Protocol and ProtocolApplication • Protocol describes a generic laboratory procedure or analysis algorithm • ProtocolApplication describes the actual application of a protocol • ProtocolApplication: • values for the replaceable parameters • any variation from the Protocol • Similarly: • Hardware and HardwareApplication • Software and SoftwareApplication
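The split between a generic Protocol and its concrete ProtocolApplication might look like the following Python sketch. The field names (`default_parameters`, `parameter_values`, `deviations`) and the example protocol are assumptions for illustration and do not reproduce the attribute names of the MAGE object model.

```python
from dataclasses import dataclass, field


@dataclass
class Protocol:
    """A generic laboratory procedure or analysis algorithm."""
    name: str
    # Replaceable parameters with their default values (assumed representation).
    default_parameters: dict = field(default_factory=dict)


@dataclass
class ProtocolApplication:
    """One actual use of a Protocol."""
    protocol: Protocol
    parameter_values: dict = field(default_factory=dict)  # values for replaceable parameters
    deviations: list = field(default_factory=list)        # free-text variations from the Protocol

    def effective_parameters(self):
        """Defaults overridden by the values supplied for this application."""
        unknown = set(self.parameter_values) - set(self.protocol.default_parameters)
        if unknown:
            raise KeyError(f"not parameters of {self.protocol.name}: {sorted(unknown)}")
        return {**self.protocol.default_parameters, **self.parameter_values}


# Hypothetical example values.
hybridization = Protocol("hybridization", {"temperature_C": 42, "duration_h": 16})
application = ProtocolApplication(
    hybridization,
    parameter_values={"duration_h": 18},
    deviations=["wash step repeated once"],
)
```

The same pairing would apply to Hardware/HardwareApplication and Software/SoftwareApplication.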

  34. Protocol

  35. Miscellaneous (1) • Hierarchy of top-level abstract classes • Extendable - can have properties • Describable - can have also Descriptions and Security and Audit information • Identifiable - also has (unambiguous within some scope) identifier and a name • AuditAndSecurity package • Contact/Person/Organization classes • tracking of changes (audit trail) • user security (access rights to MAGE objects)
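The layering of the three top-level classes can be sketched in Python as a simple inheritance chain. The attribute names and the example subclass are illustrative assumptions; only the class names Extendable, Describable, and Identifiable and the capabilities listed on the slide come from the presentation.

```python
class Extendable:
    """Can carry arbitrary properties."""

    def __init__(self):
        self.properties = {}


class Describable(Extendable):
    """Adds descriptions plus security and audit information."""

    def __init__(self):
        super().__init__()
        self.descriptions = []
        self.audit_trail = []
        self.security = None


class Identifiable(Describable):
    """Adds an identifier (unambiguous within some scope) and a name."""

    def __init__(self, identifier, name=""):
        super().__init__()
        self.identifier = identifier
        self.name = name


# Hypothetical concrete class built on the hierarchy.
class Experiment(Identifiable):
    pass


experiment = Experiment("EXP:1", name="example experiment")
experiment.properties["note"] = "illustrative only"
experiment.descriptions.append("free text description")
```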

  36. Miscellaneous (2) • Description package • Description is a container for • free text description • OntologyEntries • DatabaseEntries • BibliographicReferences • BQS package • BibliographicReference class • Measurement package • Measurement is a quantity with a unit • simple Measurement ontology provided

  37. DTD & XML Format
      <MAGE-ML>
        <{packageName}_package>
          <{className}_assnlist> <!-- generated container element -->
            <{className}> <!-- independent class elements -->
              <{container}> <!-- one of *_assn, *_assnref, *_assnlist, *_assnreflist -->
                <{className or className_ref}>
                  … <!-- alternating {container} and {className or className_ref} -->
                </{className or className_ref}>
              </{container}>
            </{className}>
          </{className}_assnlist>
          ... <!-- more independent classes -->
        </{packageName}_package>
        ... <!-- more packages -->
      </MAGE-ML>
      * slide borrowed from Angel Pizarro, UPenn
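The outer layers of the naming pattern on slide 37 (package element, generated `_assnlist` container, class element) can be sketched with Python's standard `xml.etree.ElementTree`. The input dictionary shape, the function name, and the `identifier` attribute are assumptions for illustration; the sketch does not cover nested association containers and is not the generated DTD itself.

```python
import xml.etree.ElementTree as ET


def build_mage_ml(packages):
    """Build a MAGE-ML skeleton.

    packages: {package_name: {class_name: [attribute_dict, ...]}}
    (a hypothetical input shape chosen for this sketch).
    """
    root = ET.Element("MAGE-ML")
    for package_name, classes in packages.items():
        package = ET.SubElement(root, f"{package_name}_package")
        for class_name, instances in classes.items():
            # Generated container element for the independent class.
            container = ET.SubElement(package, f"{class_name}_assnlist")
            for attributes in instances:
                ET.SubElement(container, class_name, attributes)
    return root


root = build_mage_ml({"Experiment": {"Experiment": [{"identifier": "EXP:1"}]}})
xml_text = ET.tostring(root, encoding="unicode")
```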

  38. XML tree example
      MAGE-ML
        AuditAndSecurity_pkg
          Contact_assnlist
            Contact
        Experiment_pkg
          Experiment_assnlist
            Experiment
              Provider_assnref
                Contact_ref
              ExperimentDesign_assn
                ExperimentDesign
      * slide borrowed from Angel Pizarro, UPenn
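A reader of such a document has to follow `*_ref` elements back to the elements they point at. The Python sketch below parses a small document built from the element names on slide 38 and resolves the reference. The nesting is inferred from those names, and the `identifier` and `name` attributes and their values are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document using the element names from the slide.
DOCUMENT = """
<MAGE-ML>
  <AuditAndSecurity_pkg>
    <Contact_assnlist>
      <Contact identifier="CONTACT:1" name="Example Lab"/>
    </Contact_assnlist>
  </AuditAndSecurity_pkg>
  <Experiment_pkg>
    <Experiment_assnlist>
      <Experiment identifier="EXP:1">
        <Provider_assnref>
          <Contact_ref identifier="CONTACT:1"/>
        </Provider_assnref>
        <ExperimentDesign_assn>
          <ExperimentDesign/>
        </ExperimentDesign_assn>
      </Experiment>
    </Experiment_assnlist>
  </Experiment_pkg>
</MAGE-ML>
"""


def resolve_refs(root):
    """Pair every *_ref element with the element carrying the same identifier.

    Returns a list of (ref_element, target_element_or_None) tuples.
    """
    by_id = {
        element.get("identifier"): element
        for element in root.iter()
        if element.get("identifier") and not element.tag.endswith("_ref")
    }
    return [
        (ref, by_id.get(ref.get("identifier")))
        for ref in root.iter()
        if ref.tag.endswith("_ref")
    ]


root = ET.fromstring(DOCUMENT)
resolved = resolve_refs(root)
```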

  39. Programming APIs • Mapping of OM to language-specific OMs • APIs are automatically generated from the OM specifications • Get/set methods for associations • Get/set methods for attributes • XML <=> language-specific OM marshallers/unmarshallers - also automatically generated
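The idea of generating get/set methods from a model description can be illustrated with a small Python sketch. It builds a class at run time from lists of attribute and association names; the function `make_class`, the accessor naming scheme, and the example class description are all hypothetical and do not reflect how the actual MAGE APIs are generated.

```python
def make_class(class_name, attributes, associations):
    """Create a class with get_/set_ methods for each attribute and association."""
    field_names = (*attributes, *associations)

    def __init__(self):
        for field_name in field_names:
            setattr(self, f"_{field_name}", None)

    namespace = {"__init__": __init__}
    for field_name in field_names:
        # Default arguments bind the current field name to each accessor.
        def getter(self, _field=field_name):
            return getattr(self, f"_{_field}")

        def setter(self, value, _field=field_name):
            setattr(self, f"_{_field}", value)

        namespace[f"get_{field_name}"] = getter
        namespace[f"set_{field_name}"] = setter
    return type(class_name, (object,), namespace)


# Hypothetical model description for one class.
Experiment = make_class(
    "Experiment",
    attributes=["identifier", "name"],
    associations=["experiment_design"],
)
experiment = Experiment()
experiment.set_identifier("EXP:1")
experiment.set_experiment_design({"types": ["time course"]})
```

In a statically typed target language the same model description would typically drive source-code generation rather than run-time class creation.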

  40. Programming APIs (cont.) • Use standard modules/packages • Xerces, JDK, etc. • Implementation in Java, C++, Perl • Building annotation tools/database access modules on top of these APIs

  41. Schedule • LSR ‘vote to vote’ at Dublin OMG meeting in November • LSR, AB, DTC votes at Dublin OMG meeting • Setting up FTF • open source implementation efforts • Jamboree II at EBI, December 6-11 • MAGE v.2.0 • current MAGE <=> MAGE v.2.0 mapping rules

  42. Web Sites • MAGE specification - hosted by Rosetta • links to documents • presentations • UML models • XMI files • Rose .mdl files • HTML version • PNG image files of diagrams • http://www.geml.org/omg.htm • MGED programming effort: • http://sourceforge.net/projects/mged

  43. Mailing Lists • Specification-related • lsr-ge@ebi.ac.uk • to subscribe, send the following to majordomo@ebi.ac.uk: subscribe lsr-ge <yourEmailAddress> • MAGE-STK development-related • https://lists.sourceforge.net/lists/listinfo/mged-mage

  44. Questions?
