Overview IST-2001-38344

Presentation Transcript


  1. Overview IST-2001-38344

  2. Cells are a collection of protein nanomachines

  3. A biological challenge
     • To build models of protein complexes & understand the function of each component, based upon available evidence.
     • However, to build evidence for each protein interaction, a biologist must find, integrate, compare & then validate the results from a number of separate resources.

  4. Genomics & Proteomics [Figure labels: DNA ‘chips’, Modelling, HTP Sequencing, SNP, Gene prediction, Proteomics, Domain analysis, Synchrotron, Expression, Folding, Protein structures, DNA]

  5. Genomics & Proteomics [Figure labels: Interaction Space, Expression Space, Literature Space]

  6. The need for computerised information systems
     • New HTP methods produce orders of magnitude more data than before:
       • More than is interpretable manually.
       • Data are stored in a (semi-)structured format.
     • Much knowledge is in literature & patents:
       • 13,000,000 abstracts in MEDLINE.
       • Knowledge is stored in an unstructured format.
     • Solution: computerised information systems:
       • Enable data mining & visualisation of integrated resources, with text analysis.

  7. Components of bioGrid
     • Gene expression:
       • ExpressionSpace:
         • Clustering of microarray data.
         • May require large memory.
     • Protein interaction:
       • PSIMAP:
         • Predicts interactions between protein domains.
         • Can be pre-computed, as the underlying data change relatively rarely.
     • Literature:
       • GoPubMed-D:
         • Organises a corpus of documents into the GO ontology.
         • Lexical analysis requires lengthy computation.
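
As a rough illustration of the literature component, the sketch below files abstracts under ontology terms by keyword matching and propagates each hit to the term's ancestors. It is not the GoPubMed-D algorithm: the ontology, keyword lists, document ids, and function names are all hypothetical toy values chosen for the example.

```python
from collections import defaultdict

# Toy ontology (hypothetical, not real GO identifiers): child term -> parent term.
PARENT = {
    "oxidative phosphorylation": "energy production",
    "energy production": "metabolic process",
    "apoptosis": "cell death",
}

# Hypothetical keyword lists used for the lexical match.
KEYWORDS = {
    "oxidative phosphorylation": ["oxidative phosphorylation", "atp synthase"],
    "energy production": ["energy production"],
    "apoptosis": ["apoptosis", "apoptotic"],
}


def ancestors(term):
    """Yield the term itself and every ancestor up to the root."""
    while term is not None:
        yield term
        term = PARENT.get(term)


def classify(abstracts):
    """Map each ontology term to the sorted ids of abstracts filed under it."""
    index = defaultdict(set)
    for doc_id, text in abstracts.items():
        lowered = text.lower()
        for term, words in KEYWORDS.items():
            if any(word in lowered for word in words):
                # A hit on a specific term also counts for its ancestors.
                for t in ancestors(term):
                    index[t].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Made-up abstracts for illustration only.
ABSTRACTS = {
    "doc1": "ATP synthase activity in yeast mitochondria.",
    "doc2": "Apoptotic signalling after microarray profiling.",
    "doc3": "Energy production shifts during the diauxic transition.",
}
INDEX = classify(ABSTRACTS)
```

Browsing `INDEX["metabolic process"]` then returns every document filed under any of its descendant terms, which is the kind of ontology-driven navigation the slide describes.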

  8. bioGrid: An integrated platform for gene expression data, protein interaction data, and literature
     • Expression Space: Space Explorer
     • Interaction Space: PSIMAP [Figure: sequence fragments LLNE, YLEEVE, EYEEDE]
     • Literature Space: Classification Server

  9. Workflow for use case - Part I
     • Search literature for papers about the experimental system studied:
       • Microarray & mitochondria.
     • Upload the gene expression data set.
     • Cluster the gene expression data set.
     • Identify a cluster that contains genes of interest, e.g. energy production.
     • Examine the expression profiles of the genes in the cluster.
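
The clustering step in this workflow can be sketched as follows. The slides do not say which clustering method ExpressionSpace uses, so this example assumes a simple single-linkage grouping on Pearson correlation; the gene names, profile values, and the 0.9 threshold are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cluster(profiles, threshold=0.9):
    """Single-linkage clustering: genes correlated at >= threshold share a cluster."""
    genes = sorted(profiles)
    parent = {g: g for g in genes}  # union-find forest

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for i, a in enumerate(genes):
        for b in genes[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(b)] = find(a)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    return sorted(groups.values())


# Made-up expression profiles over four conditions.
PROFILES = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [3.9, 3.2, 1.8, 1.1],
}
CLUSTERS = cluster(PROFILES)
```

With these toy values the rising profiles (`geneA`, `geneB`) and the falling profiles (`geneC`, `geneD`) end up in separate clusters, and either cluster can then be inspected for genes of interest.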

  10. Workflow for use case - Part II
     • Calculate an induced PSIMAP graph for the genes in the expression cluster.
     • Explore PSIMAP graph & nodes.
     • For pairs of genes predicted to interact:
       • Search literature for papers citing both genes.
       • Classify literature to assess possible function or metabolic processes of genes.
     • Assimilate evidence for components of a protein complex.
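
One way to picture the induced-graph and co-citation steps is the sketch below. It assumes a domain-level interaction map projected onto gene pairs through a gene-to-domain assignment; the domain ids, gene names, paper ids, and function names are hypothetical, and the real PSIMAP data model may differ.

```python
from itertools import combinations

# Hypothetical domain-domain interactions (unordered pairs; d4-d4 is a self-interaction).
DOMAIN_INTERACTIONS = {
    frozenset(pair) for pair in [("d1", "d2"), ("d2", "d3"), ("d4", "d4")]
}

# Hypothetical assignment of domains to genes.
GENE_DOMAINS = {
    "geneA": {"d1"},
    "geneB": {"d2"},
    "geneC": {"d3", "d4"},
    "geneD": {"d5"},
}

# Hypothetical literature index: paper id -> genes mentioned in that paper.
PAPERS = {
    "p1": {"geneA", "geneB"},
    "p2": {"geneB", "geneC", "geneD"},
    "p3": {"geneA"},
}


def induced_edges(cluster, gene_domains, domain_interactions):
    """Gene pairs in the cluster that share at least one interacting domain pair."""
    edges = []
    for a, b in combinations(sorted(cluster), 2):
        if any(
            frozenset((da, db)) in domain_interactions
            for da in gene_domains.get(a, ())
            for db in gene_domains.get(b, ())
        ):
            edges.append((a, b))
    return edges


def co_cited(a, b, papers):
    """Ids of papers that mention both genes."""
    return sorted(pid for pid, genes in papers.items() if a in genes and b in genes)


CLUSTER = {"geneA", "geneB", "geneC"}
EDGES = induced_edges(CLUSTER, GENE_DOMAINS, DOMAIN_INTERACTIONS)
# Evidence table: predicted interaction -> supporting papers.
EVIDENCE = {edge: co_cited(edge[0], edge[1], PAPERS) for edge in EDGES}
```

Each entry in `EVIDENCE` pairs a predicted interaction with the papers that cite both genes, which is the material a biologist would then classify and weigh when assembling a complex.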

  11. Distributed technology implementation
     • Globus, Unicore, Legion, …
       • Are geared towards computational complexity, not semantic complexity.
     • BioGrid’s approach:
       • Agent-based approach.
       • Integration of rules, reasoning, and messaging in a Java environment.
       • Use of a meta-model.
     • Advantage:
       • Easy to maintain, easy to use, includes code distribution, architecture independent, geared towards farms of local and remote machines.

  12. Prova-AA
     • Extensions to Prova for rule-based agent scripting.
     • Prova-AA introduces:
       • Messaging (local, JMS, and JADE).
       • Reaction rules.
       • Context-dependent inline reactions for asynchronous messaging.
       • Embedding of Prova agents in Java and Web apps.
     • Advantages:
       • Cooperating agents vs. GRID RPC.
       • Ease of development and maintenance.
       • Platform independence and portability.
       • High level specification of communication protocols.
       • Native syntax integration with Java.
       • Low-cost creation of distributed workflows and ad-hoc networks of computation nodes.
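
To make the idea of reaction rules and asynchronous messaging concrete, here is a small Python analogy. It is not Prova or Prova-AA syntax: the `Agent` class, its methods, and the message labels are invented for this example, and an in-process queue stands in for the local, JMS, or JADE transports named on the slide.

```python
import queue


class Agent:
    """Hypothetical agent: an inbox plus reaction rules keyed by message label."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()
        self.rules = {}  # message label -> handler(me, sender, payload)
        self.log = []

    def on(self, label, handler):
        """Register a reaction rule for messages carrying this label."""
        self.rules[label] = handler

    def send(self, other, label, payload):
        """Queue a message for another agent without waiting for a reply."""
        other.inbox.put((self, label, payload))

    def run_once(self):
        """React to every pending message; return how many rules fired."""
        fired = 0
        while not self.inbox.empty():
            sender, label, payload = self.inbox.get()
            handler = self.rules.get(label)
            if handler is not None:
                handler(self, sender, payload)
                fired += 1
        return fired


coordinator = Agent("coordinator")
worker = Agent("worker")

# Worker rule: on a request, do some work (here, sorting) and reply to the sender.
worker.on("request", lambda me, sender, genes: me.send(sender, "reply", sorted(genes)))
# Coordinator rule: on a reply, record who answered and with what.
coordinator.on("reply", lambda me, sender, result: me.log.append((sender.name, result)))

coordinator.send(worker, "request", ["geneB", "geneA"])
WORKER_FIRED = worker.run_once()
COORDINATOR_FIRED = coordinator.run_once()
```

The coordinator never calls the worker directly; it only posts a message and later reacts to the reply, which is the contrast with RPC-style invocation that the slide draws.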

  13. Proposed Architecture of integrated platform
