
The Functional Genomics Experiment Object Model (FuGE)

Andrew Jones, School of Computer Science, University of Manchester. MGED Society.


Presentation Transcript


  1. The Functional Genomics Experiment Object Model (FuGE) Andrew Jones, School of Computer Science, University of Manchester MGED Society

  2. What is FuGE? • Various groups have tried to fuse MAGE and PEDRo in the past • Such a model would be difficult to manage • FuGE is a model of the common components of functional genomics experiments • Aims to help the development of data standards • Should allow some cross-compatibility between different ‘omics experiments • Microarray & proteome standards will use parts of FuGE for some data formats

  3. So, what is FuGE? • An object model in UML (close to its first stable release) • An XML Schema (in development) • A software API (will be created from the UML) • FuGE uses ontologies extensively, such as the MGED Ontology or its successor (FuGO) • Developed by members of MGED / PSI with input from cross-omics experimentalists, e.g. RSBI

  4. What is FuGE not…? • Not an effort to create one data standard for all lab techniques • That problem is hard at a technical level, and getting agreement from all groups is very hard • Not a model for metabolomics metadata • But it might help in the development of one • …and we would like to encourage input from the metabolomics community

  5. FuGE Structure • 2 sections: Common and Bio • Common – components that aid the development of a rich data standard • Protocols, external references, auditing and security settings • Bio – biology-specific components • Biological (or chemical) materials, bio sequences • Summary of an investigation structure • References to the data model specific to each domain

  6. Protocols • Protocols have a set of ordered atomic actions • Actions are user-entered text or ontology terms • Protocols can be associated with Software and Equipment • Protocols, Software and Equipment can have a set of defined Parameters • Mechanism for defining a standard protocol, and an instance of a protocol (date, operator…) • Nested protocols can be defined for representing complex procedures • An Action can be a reference to another Protocol
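The protocol structure on this slide can be sketched in a few lines of Python. This is only an illustration of the ideas (ordered actions, nested protocols, and a separate record for each run of a protocol); the class and field names, the sample steps, the date and the operator are assumptions made for the example and are not taken from the FuGE UML.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass
class Parameter:
    """A named parameter that a Protocol defines (hypothetical shape)."""
    name: str
    default: Optional[str] = None


@dataclass
class Action:
    """One ordered step: free text / ontology term, or a reference to another Protocol."""
    step: Union[str, "Protocol"]


@dataclass
class Protocol:
    """The standard definition of a procedure."""
    name: str
    actions: List[Action] = field(default_factory=list)  # list order is the step order
    parameters: List[Parameter] = field(default_factory=list)

    def flatten(self) -> List[str]:
        """Expand nested protocols into a single ordered list of atomic steps."""
        steps: List[str] = []
        for action in self.actions:
            if isinstance(action.step, Protocol):
                steps.extend(action.step.flatten())
            else:
                steps.append(action.step)
        return steps


@dataclass
class ProtocolApplication:
    """One concrete run of a Protocol (who ran it, when, with which values)."""
    protocol: Protocol
    date: str
    operator: str
    parameter_values: Dict[str, str] = field(default_factory=dict)


# Illustrative data only.
digest = Protocol("digest", [Action("add trypsin"), Action("incubate overnight")])
prep = Protocol(
    "sample prep",
    [Action("lyse cells"), Action(digest), Action("desalt")],
    [Parameter("temperature", default="37C")],
)
run = ProtocolApplication(prep, date="2006-01-01", operator="example operator")
flat_steps = prep.flatten()
print(flat_steps)
```

Keeping `Protocol` and `ProtocolApplication` apart mirrors the slide's distinction between a standard protocol and an instance of it; letting an `Action` hold a `Protocol` is one simple way to get nesting.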

  7. FuGE Workflow [Diagram. Legend: one symbol marks the inputs and outputs of Protocols, the other marks an instance of some Protocol. Labels shown: Material, Treatment, Data Acquisition, Data Transformation, Data]

  8. FuGE Workflow • Materials defined using terms from ontologies • Treatments defined by Protocols • Data represented in domain-specific format • FuGE is the “glue” for sticking components together [Workflow diagram repeated from slide 7]
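One way to picture the "glue" role is as a chain of steps, each consuming and producing Materials or Data. The sketch below is a hypothetical illustration, not FuGE code: `Step`, `provenance` and the identifiers are invented for the example, and the trace is only meant for a simple linear chain like the one in the diagram.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Step:
    """One applied protocol linking inputs to outputs (hypothetical shape)."""
    protocol: str              # e.g. "Treatment", "Data Acquisition"
    inputs: Tuple[str, ...]    # identifiers of the Materials or Data consumed
    outputs: Tuple[str, ...]   # identifiers of the Materials or Data produced


def provenance(target: str, steps: List[Step]) -> List[str]:
    """Walk back from `target` and return the protocols that produced it.

    For a linear chain the result is ordered earliest step first.
    """
    produced_by: Dict[str, Step] = {out: s for s in steps for out in s.outputs}
    chain: List[str] = []
    frontier = [target]
    seen = set()
    while frontier:
        item = frontier.pop()
        step = produced_by.get(item)
        if step is None or step in seen:
            continue  # a starting material, or a step already visited
        seen.add(step)
        chain.append(step.protocol)
        frontier.extend(step.inputs)
    return list(reversed(chain))


# Illustrative workflow: material -> treatment -> material -> data -> data.
workflow = [
    Step("Treatment", ("tissue",), ("extract",)),
    Step("Data Acquisition", ("extract",), ("raw_data",)),
    Step("Data Transformation", ("raw_data",), ("processed_data",)),
]
history = provenance("processed_data", workflow)
print(history)
```

The domain-specific data format itself never appears here; the steps only carry identifiers, which is the sense in which a shared model can link components without defining them.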

  9. Other useful components • Each object can be tagged with audit info: • Who made a change, when, what type of change • Security information: • users, groups for accessing/changing data • Consistent mechanism for identifying objects • Life Science Identifiers (LSIDs) used to uniquely identify components • Objects can be referenced across documents • Mechanism for linking to external databases, literature refs and ontologies
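As a rough illustration of identification and auditing, the sketch below builds an identifier in the LSID URN layout (`urn:lsid:authority:namespace:object`) and attaches an audit trail to an object. The class names, the `record` method, and the authority, namespace and user values are assumptions made for the example; security settings are left out.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


def make_lsid(authority: str, namespace: str, object_id: str) -> str:
    """Build an identifier in the LSID layout urn:lsid:authority:namespace:object."""
    return f"urn:lsid:{authority}:{namespace}:{object_id}"


@dataclass
class AuditEntry:
    """Who made a change, when, and what type of change."""
    who: str
    when: datetime
    action: str  # e.g. "creation", "modification"


@dataclass
class Identifiable:
    """An object with a unique identifier and an audit trail (hypothetical shape)."""
    identifier: str
    audit_trail: List[AuditEntry] = field(default_factory=list)

    def record(self, who: str, action: str) -> None:
        """Append an audit entry stamped with the current UTC time."""
        self.audit_trail.append(AuditEntry(who, datetime.now(timezone.utc), action))


# Illustrative values only.
sample = Identifiable(make_lsid("example.org", "Material", "sample-1"))
sample.record("alice", "creation")
sample.record("bob", "modification")
print(sample.identifier, [entry.action for entry in sample.audit_trail])
```

Because the identifier is a plain string with a global naming scheme, another document can refer to `sample` simply by quoting it.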

  10. Investigation model • Stores a summary of the investigation to facilitate queries • Purpose of investigation (hypothesis) • Design of the investigation • e.g. strain differences, gene knockout, drug doses, time course • Stores the important variables • Values from ontology e.g. gene names, units etc… • Links from variables to relevant data items
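A minimal sketch of such a summary, assuming a simple shape of hypothesis, variables, values and links to data items, might look like this. The names `Investigation`, `Factor`, `FactorValue` and `data_for`, and all sample values, are hypothetical and chosen only to show how a query over the summary could work.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FactorValue:
    """One value of a variable, ideally an ontology term, with links to data items."""
    value: str
    data_items: List[str] = field(default_factory=list)


@dataclass
class Factor:
    """An important variable of the investigation, e.g. drug dose or time point."""
    name: str
    values: List[FactorValue] = field(default_factory=list)


@dataclass
class Investigation:
    """Summary of an investigation: its purpose and its variables."""
    hypothesis: str
    factors: List[Factor] = field(default_factory=list)

    def data_for(self, factor_name: str, value: str) -> List[str]:
        """Return identifiers of the data items linked to one variable value."""
        return [
            item
            for factor in self.factors
            if factor.name == factor_name
            for factor_value in factor.values
            if factor_value.value == value
            for item in factor_value.data_items
        ]


# Illustrative investigation with a single variable.
study = Investigation(
    hypothesis="Dose affects expression",
    factors=[
        Factor("drug dose", [
            FactorValue("0 mg", ["data-1", "data-2"]),
            FactorValue("10 mg", ["data-3"]),
        ])
    ],
)
print(study.data_for("drug dose", "0 mg"))
```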

  11. Benefits of shared components • Queries over common annotation • Samples, hypotheses, protocols • Shared software for experimental annotation and analysis • Microarrays, proteomics and metabolomics (and other experiments!) performed in same lab • Developing standards for each technique is a hard problem • Shared resources could alleviate the problems (audit, security, identifying objects, ontologies)

  12. Using FuGE in Practice • Import parts of the UML or XML Schema and extend them with domain-specific components • Example: Attempting to integrate FuGE with our Manchester metabolomics database • Reference a FuGE entry for investigation structure and bio samples • Define ontologies and use FuGE as it is for experimental metadata • This would not include a format for mass spec or NMR data, which would also be needed

  13. Conclusions • FuGE was created to solve the general problem: • What are the common requirements for a “functional genomics” data standard? • MGED will use FuGE for generating MAGE version 2 • PSI is evaluating FuGE for a protein separation standard format • FuGE-based systems are being implemented by a number of organisations • FuGE could help develop a metabolome format • http://fuge.sourceforge.net

  14. Acknowledgements • FuGE has been developed in collaboration with many groups, including: • Angel Pizarro (U Penn) • Paul Spellman (Lawrence Berkeley) • Michael Miller (Rosetta) • Members of Fred Hutchinson CRC, Seattle • RSBI • Various other members of MGED and PSI • http://fuge.sourceforge.net

  15. Describable, Identifiable

  16. Common.Description • Many classes inherit from Describable • Link to Audit / Security details • URI and text description

  17. Protocol

  18. Audit

  19. Investigation

  20. Material

  21. Common.Data • Ordered set of Dimensions • Data stored in Matrix • Matrix must be extended with subclasses
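The three bullets can be sketched as follows: a `Data` object holds an ordered list of dimensions and delegates storage to a `Matrix`, which does nothing until a subclass supplies a layout. All names here (`Dimension`, `DenseMatrix`, `value_at`, `get`) and the sample values are assumptions for illustration, not the FuGE class definitions.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Dimension:
    """One axis of the data, with its ordered element labels."""
    name: str
    elements: List[str]


class Matrix:
    """Abstract storage; subclasses decide how values are laid out."""

    def value_at(self, index: Tuple[int, ...]) -> float:
        raise NotImplementedError("Matrix must be extended with a subclass")


class DenseMatrix(Matrix):
    """Hypothetical subclass storing values in one flat, row-major list."""

    def __init__(self, shape: Sequence[int], values: Sequence[float]):
        if len(values) != math.prod(shape):
            raise ValueError("number of values does not match shape")
        self.shape = tuple(shape)
        self.values = list(values)

    def value_at(self, index: Tuple[int, ...]) -> float:
        offset = 0
        for size, i in zip(self.shape, index):
            offset = offset * size + i  # row-major offset
        return self.values[offset]


class Data:
    """An ordered set of Dimensions plus the Matrix holding the values."""

    def __init__(self, dimensions: Sequence[Dimension], matrix: Matrix):
        self.dimensions = list(dimensions)
        self.matrix = matrix

    def get(self, *labels: str) -> float:
        """Look up a value by one label per dimension, in dimension order."""
        index = tuple(d.elements.index(l) for d, l in zip(self.dimensions, labels))
        return self.matrix.value_at(index)


# Illustrative 2 x 3 data set.
dims = [Dimension("gene", ["g1", "g2"]), Dimension("sample", ["s1", "s2", "s3"])]
data = Data(dims, DenseMatrix((2, 3), [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
print(data.get("g2", "s1"))
```

The order of `dims` matters: swapping the two dimensions would change which stored value a pair of labels resolves to, which is why the slide stresses an ordered set.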
