EcoCyc, MetaCyc, and the Pathway Tools Software

Presentation Transcript


  1. EcoCyc, MetaCyc, and the Pathway Tools Software Peter D. Karp, Ph.D. Bioinformatics Research Group SRI International pkarp@ai.sri.com http://www.ai.sri.com/pkarp/talks/ BioCyc.org EcoCyc.org, MetaCyc.org

  2. MetaCyc Family of Pathway/Genome Databases • 1,700+ databases from multiple institutions • Cover all domains of life with microbial emphasis • All DBs derived from MetaCyc via computational pathway prediction • Common schema • Common controlled vocabularies • Common methodologies Archives of Toxicology 2011

  3. Curated Databases Within the MetaCyc Family

  4. BioCyc Collection of 1,100 Pathway/Genome Databases • Pathway/Genome Database (PGDB) – combines information about • Pathways, reactions, substrates • Enzymes, transporters • Genes, replicons • Transcription factors/sites, promoters, operons • Tier 1: Literature-Derived PGDBs • MetaCyc • EcoCyc -- Escherichia coli K-12 • Tier 2: Computationally-derived DBs, Some Curation -- 28 PGDBs • HumanCyc, BsubCyc • Mycobacterium tuberculosis • Tier 3: Computationally-derived DBs, No Curation -- The remainder

  5. EcoCyc Project – EcoCyc.org • E. coli Encyclopedia • Review-level Model-Organism Database for E. coli • Tracks evolving annotation of the E. coli genome and cellular networks • The two paradigms of EcoCyc • “Multi-dimensional annotation of the E. coli K-12 genome” • Positions of genes; functions of gene products – 76% / 66% exp • Gene Ontology terms; MultiFun terms • Gene product summaries and literature citations • Evidence codes • Multimeric complexes • Metabolic pathways • Regulation of gene expression and of protein activity Karp, Gunsalus, Collado-Vides, Paulsen: Nuc. Acids Res. 35:7577 2007; ASM News 70:25 2004; Science 293:2040

  6. EcoCyc = E. coli Dataset + Pathway/Genome Navigator • URL: EcoCyc.org • EcoCyc v15.0 • Pathways: 260 • Reactions: Metabolic: 1,446; Transport: 287 • Compounds: 1,830 • Citations: 21,000 • Proteins: 4,479 • Complexes: 895 • RNAs: 285 • Regulation: Operons: 3,409; Trans Factors: 206; Promoters: 1,878; TF Binding Sites: 2,394; Reg Interactions: 5,345 • Genes: 4,489

  7. EcoCyc on the iPhone

  8. EcoCyc on the iPhone

  9. PortEco.org • EcoCyc + PortEco = E. coli model-organism database • Query multiple E. coli databases simultaneously • E. coli gene expression archive • E. coli Wiki • ~40 E. coli and Shigella databases available at BioCyc.org

  10. MetaCyc: Metabolic Encyclopedia • Describe a representative sample of every experimentally determined metabolic pathway • Describe properties of metabolic enzymes • Literature-based DB with extensive references and commentary • Pathways, reactions, enzymes, substrates • MetaCyc vs. BioCyc: Experimentally elucidated pathways • Jointly developed by • P. Karp, R. Caspi, C. Fulcher, SRI International • L. Mueller, A. Pujar, Boyce Thompson Institute • S. Rhee, P. Zhang, Carnegie Institution Nucleic Acids Research 2010

  11. Applications of MetaCyc • Reference source on metabolic pathways and enzymes • Predict pathways from genomes • Metabolic engineering • Find desired metabolic pathways and reactions • Find enzymes with desired activities, regulatory properties • Determine cofactor requirements

  12. MetaCyc Data -- Version 15.4

  13. Pathway Tools Software

  14. Pathway Tools Software • [Diagram; component labels: Annotated Genome, PathoLogic, Pathway/Genome Database, Pathway/Genome Navigator, Pathway/Genome Editors, Genome-Scale Flux Model] Briefings in Bioinformatics 11:40-79 2010

  15. Pathway Tools Software: PathoLogic • Computational creation of new Pathway/Genome Databases • Transforms genome into Pathway Tools schema and layers inferred information above the genome • Predicts operons • Predicts metabolic network • Predicts which genes code for missing enzymes in metabolic pathways • Infers transport reactions from transporter names

  16. Pathway Tools Software: Pathway/Genome Editors • Interactively update PGDBs with graphical editors • Support geographically distributed teams of curators with object database system • Gene and protein editor • Reaction editor • Compound editor • Pathway editor • Operon editor • Publication editor

  17. Pathway Tools Software: Pathway/Genome Navigator • Querying and visualization of: • Pathways • Reactions • Metabolites • Genes/Proteins/RNA • Regulatory interactions • Chromosomes • Two modes of operation: • Web mode • Desktop mode • Most functionality shared, but each has unique functionality

  18. Cellular Overview Diagram • Combines metabolic map and transporters • Automatically generated for each organism • Zoomable, queryable • Web-based and desktop • BioCyc.org • Tools → Cellular Overview • Tools → Regulatory Overview • Fastest with Safari, Chrome, Firefox

  19. Omics Data Graphing on Cellular Overview

  20. Genome Overview

  21. Genome Poster

  22. Regulatory Overview and Omics Viewer • Show regulatory relationships among gene groups

  23. Genome Browser: ChIP-Chip Data Shown in Graph Track

  24. Enrichment Analysis “My experiments yielded a set of genes/metabolites. What do they have in common?” • Given a set of genes: • What GO terms are statistically over-represented in that set? • What metabolic pathways are over-represented? • What transcriptional regulators are over-represented? • Given a set of metabolites: • What metabolic pathways are statistically over-represented in that set?
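The over-representation questions on slide 24 are commonly answered with a one-sided hypergeometric test. The sketch below is a minimal illustration under that assumption; the slides do not say which statistic Pathway Tools uses. The function names, the gene-to-pathway assignments, and the population size of 100 are all hypothetical.

```python
from math import comb

def hypergeom_pvalue(population: int, annotated: int, sample: int, hits: int) -> float:
    """P(X >= hits) when `sample` genes are drawn without replacement from
    `population` genes, of which `annotated` belong to the category."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total

def enrich(gene_set, categories, population_size):
    """Return (category, hits, p-value) tuples sorted by ascending p-value."""
    results = []
    for name, members in categories.items():
        hits = len(gene_set & members)
        p = hypergeom_pvalue(population_size, len(members), len(gene_set), hits)
        results.append((name, hits, p))
    return sorted(results, key=lambda r: r[2])

# Toy data: hypothetical assignments, not taken from EcoCyc.
categories = {
    "glycolysis": {"pgi", "pfkA", "fbaA", "tpiA", "gapA"},
    "TCA cycle": {"gltA", "acnB", "icd", "sucA", "sdhA"},
}
my_genes = {"pgi", "pfkA", "fbaA", "gltA"}

for name, hits, p in enrich(my_genes, categories, population_size=100):
    print(f"{name}: {hits} hits, p = {p:.3g}")
```

The sketch applies no multiple-testing correction; a real analysis over many GO terms or pathways would normally add one.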

  25. Automated Generation of Metabolic Flux Models from PGDBs Joint work with Mario Latendresse

  26. Goals • Decrease the time required to construct FBA models from 9-12 months to several weeks • Create richer FBA models that are tightly coupled to genome and regulatory information • Make FBA models and results more transparent

  27. Approach: Derive FBA Models from PGDBs • Store and update metabolic model within Pathway Tools • Export to constraint solver for model execution/solving • Fast generation of metabolic model from annotated genome • Pathway Tools schema • Associate a wealth of information with each metabolic model • Unique identifiers and controlled vocabulary for model components • Tools for querying and visualization of metabolic models • Tools for model debugging and analysis • Reaction balance checking • Dead-end metabolite analysis • Visualize reaction flux using cellular overview • Multiple gap filling
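As an illustration of the dead-end metabolite analysis listed among the debugging tools above, the following sketch flags metabolites that the network can only produce or only consume. It is a simplified reading of the idea, not the Pathway Tools implementation. The data layout (a dict of stoichiometries plus a reversibility flag) and the toy reactions are assumptions made for the example.

```python
def dead_end_metabolites(reactions, boundary=frozenset()):
    """Return metabolites that are only produced or only consumed.

    reactions: dict mapping reaction name -> (stoich, reversible), where
        stoich maps metabolite -> coefficient (negative = consumed,
        positive = produced) and reversible is a bool.
    boundary: metabolites exchanged with the environment (nutrients,
        secretions); these are never reported as dead ends.
    """
    produced, consumed = set(), set()
    for stoich, reversible in reactions.values():
        for met, coef in stoich.items():
            # A reversible reaction can both make and use each participant.
            if coef > 0 or reversible:
                produced.add(met)
            if coef < 0 or reversible:
                consumed.add(met)
    all_mets = produced | consumed
    return sorted(
        m for m in all_mets
        if m not in boundary and not (m in produced and m in consumed)
    )

# Toy network (hypothetical): A -> B, B -> C, B -> D.
toy = {
    "R1": ({"A": -1, "B": 1}, False),
    "R2": ({"B": -1, "C": 1}, False),
    "R3": ({"B": -1, "D": 1}, False),
}
# A is a nutrient and C is secreted; D is made but never used.
print(dead_end_metabolites(toy, boundary={"A", "C"}))  # prints ['D']
```

In a flux model at steady state, any reaction touching such a metabolite is forced to carry zero flux, which is why these metabolites are useful pointers to missing reactions or transporters.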

  28. FBA Model Execution • Runs SCIP solver on .lp file • Konrad-Zuse-Zentrum für Informationstechnik Berlin • Interpret SCIP output • Determine if SCIP found a solution • Map fluxes to PGDB reactions • Display resulting fluxes on the Cellular Overview
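To make the ".lp file" step concrete, here is a minimal sketch that writes a toy flux-balance problem in the CPLEX LP text format, which SCIP can read. The slides do not show the file Pathway Tools actually generates, so the variable naming (`v_<reaction>`), constraint naming (`mass_<metabolite>`), and the three-reaction model are assumptions for illustration only.

```python
def build_lp(reactions, objective):
    """Return LP-format text for: maximize v_objective subject to S v = 0.

    reactions: dict mapping reaction name -> (stoich, lower, upper), where
        stoich maps metabolite -> coefficient (negative = consumed).
    objective: name of the reaction whose flux is maximized.
    """
    mets = sorted({m for stoich, _, _ in reactions.values() for m in stoich})
    lines = ["Maximize", f" obj: v_{objective}", "Subject To"]
    for met in mets:
        # One steady-state mass-balance row per metabolite.
        terms = []
        for rxn, (stoich, _, _) in reactions.items():
            coef = stoich.get(met, 0)
            if coef:
                sign = "+" if coef > 0 else "-"
                terms.append(f"{sign} {abs(coef)} v_{rxn}")
        lines.append(f" mass_{met}: " + " ".join(terms) + " = 0")
    lines.append("Bounds")
    for rxn, (_, lower, upper) in reactions.items():
        lines.append(f" {lower} <= v_{rxn} <= {upper}")
    lines.append("End")
    return "\n".join(lines) + "\n"

# Toy model (hypothetical): nutrient uptake -> A -> B -> biomass.
toy_model = {
    "uptake": ({"A": 1}, 0, 10),
    "conv": ({"A": -1, "B": 1}, 0, 1000),
    "biomass": ({"B": -1}, 0, 1000),
}
lp_text = build_lp(toy_model, objective="biomass")
print(lp_text)
```

The resulting text could be saved to a file and handed to the solver; the reported flux values would then be mapped back to reactions by stripping the `v_` prefix.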

  29. Model Debugging via Multiple Gap Filling • Most FBA models are not initially solvable because of incomplete or incorrect information • Use meta-optimization to postulate alterations to a model to render it solvable • Each alteration has an associated cost; minimize cost of alterations • Formulate as MILP and submit to SCIP

  30. Multiple Gap Filling of FBA Models • Reaction gap filling (Kumar et al, BMC Bioinf 2007 8:212): • Reverse directionality of selected reactions • Add a minimal number of reactions from MetaCyc to the model to enable a solution • Reaction cost is a function of reaction taxonomic range • Metabolite gap filling: Postulate additional nutrients and secretions • Partial solutions: Identify maximal subset of biomass components for which model can yield positive production rates
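One way to write the gap-filling meta-optimization of slides 29 and 30 as a MILP is sketched below. This is a generic formulation in the spirit of the cited reaction gap-filling work, not necessarily the exact program Pathway Tools submits to SCIP. All symbols are introduced here for illustration: $M$ is the set of reactions already in the model with stoichiometric matrix $S$ and fluxes $v$; $C$ is the set of candidate alterations (MetaCyc reactions, reversed copies of model reactions, extra nutrient uptakes or secretions) with stoichiometric matrix $S^{C}$ and fluxes $w$; $y_j$ is a binary switch for candidate $j$ with cost $c_j$; $U$ is a large flux bound; and $\varepsilon$ is a small positive biomass threshold.

```latex
\begin{aligned}
\min_{v,\,w,\,y}\quad & \sum_{j \in C} c_j\, y_j \\
\text{s.t.}\quad & S\,v + S^{C} w = 0, \\
& l_i \le v_i \le u_i \quad \forall i \in M, \\
& 0 \le w_j \le U\, y_j, \quad y_j \in \{0,1\} \quad \forall j \in C, \\
& v_{\text{biomass}} \ge \varepsilon .
\end{aligned}
```

A candidate can carry flux only when its switch $y_j$ is 1, so minimizing the total cost selects the cheapest set of alterations that lets the model produce biomass. Under this reading, the taxonomic-range cost mentioned on the slide would enter through the $c_j$ values.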

  31. Comparative Analysis • Via Cellular Overview • Comparative genome browser • Comparative pathway table • Comparative analysis reports • Compare reaction complements • Compare pathway complements • Compare transporter complements
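At its core, comparing reaction, pathway, or transporter complements between two organisms is a set comparison. The sketch below shows that idea only; the function name and the reaction identifiers are hypothetical and do not come from any PGDB.

```python
def compare_complements(first, second):
    """Split two sets of identifiers into shared and organism-specific parts."""
    return {
        "shared": sorted(first & second),
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
    }

# Hypothetical reaction identifiers for two organisms.
organism_a = {"RXN-1", "RXN-2", "RXN-3"}
organism_b = {"RXN-2", "RXN-3", "RXN-4"}

report = compare_complements(organism_a, organism_b)
for label, ids in report.items():
    print(f"{label}: {', '.join(ids)}")
```

This works across databases only because the PGDBs share a common schema and controlled vocabularies, so the same reaction carries the same identifier in each.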

  32. Advanced Query Form • Intuitive construction of complex database queries of SQL power

  33. Work in Progress • Computation of reaction atom mappings • Program to generate metabolic pathways that synthesize target compound from feedstock compound

  34. How to Learn More • BioCyc.org Help menu • BioCyc Webinars • Biocyc.org/webinar.shtml • Publications page • Biocyc.org/publications.shtml • Tutorials held at SRI • Next week: FBA