
Creating a … Community Database Organism-Specific Database Model-Organism Database


marcie

Presentation Transcript


  1. Creating a …
     • Community Database
     • Organism-Specific Database
     • Model-Organism Database

  2. Why Create a PGDB?
     • Perform pathway analyses as part of a genome project
     • Analyze omics data
     • Create a central information resource for the organism
     • Create an FBA model
     • Perform comparative analyses

  3. Model Organism Databases
     • DBs that describe the genome and other information about an organism
     • Curated by experts for that organism
     • No one group can curate all the world’s genomes
     • Distribute workload across a community of experts to create a community resource
     • Every sequenced organism with an active experimental community requires a MOD
     • Integrate genome data with information about the biochemical and genetic network of the organism
     • Integrate literature-based information with computational predictions

  4. Rationale for MODs
     • Each “complete” genome is incomplete in several respects:
       • 40%-60% of genes have no assigned function
       • Roughly 7% of those assigned functions are incorrect
       • Many assigned functions are non-specific
     • MODs are platforms for global analyses of an organism
       • Interpret omics data in a pathway context
       • In silico prediction of essential genes
       • Characterize systems properties of metabolic and genetic networks
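
To make the percentages on this slide concrete, here is a minimal arithmetic sketch. The 4,000-gene genome size and the function name `annotation_gaps` are hypothetical, chosen only for illustration. The 40%-60% and 7% figures come from the slide, and the sketch reads "7% of those assigned functions" as 7% of the genes that do have an assigned function.

```python
def annotation_gaps(total_genes, unassigned_fraction, incorrect_fraction=0.07):
    """Estimate how many genes lack a function or likely carry a wrong one."""
    unassigned = round(total_genes * unassigned_fraction)
    assigned = total_genes - unassigned
    # Roughly 7% of the assigned functions are taken to be incorrect.
    likely_incorrect = round(assigned * incorrect_fraction)
    return {
        "unassigned": unassigned,
        "assigned": assigned,
        "likely_incorrect": likely_incorrect,
    }


# Hypothetical 4,000-gene genome at both ends of the 40%-60% range.
low = annotation_gaps(4000, 0.40)
high = annotation_gaps(4000, 0.60)
print(low)
print(high)
```

Under these assumptions, between 1,600 and 2,400 genes would have no assigned function, and a further 112 to 168 would carry an incorrect one. That is the scale of work a curation effort takes on.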

  5. What is Curation?
     • Ongoing updating and refinement of a PGDB
     • Correct false-positive and false-negative predictions
     • Incorporate information from experimental literature
     • Update genome sequence
     • Update gene functions, gene positions, gene names
     • Author comments and citations
     • Add new pathways, modify existing pathways
     • Enter information about regulatory networks

  6. Issues in Creating Public MODs
     • Obtaining funding
     • Scoping the project
     • Identify user community
     • Obtain buy-in and help from scientific community
     • IT: Set up database server, Web server
     • Hire and train curators

  7. Questions
     • Do you intend to make your PGDB public and to update it on an ongoing basis?
     • To create a Model Organism Database?

  8. Administering Pathway Tools

  9. Obtaining Pathway Tools
     • Free to non-commercial organizations
     • To obtain the license agreement, go to BioCyc.org and click on Software/Database Download
     • Follow the Installation Guide
     • ptools-local directory
       • Locate in common directory
       • PGDBs created by all users who use this ptools installation
       • PGDBs downloaded via the registry
       • ptools-init.dat for this ptools installation

  10. New Pathway Tools Releases
     • Major releases = External software releases
       • Twice per year
       • Announced on ptools-users mailing list
     • Minor releases twice per year affect only our BioCyc.org Web site and flatfile distributions
     • We support one prior release only
     • Releases announced on ptools-users@ai.sri.com
     • Read release notes at http://brg.ai.sri.com/ptools/release-notes.html
     • Install process:
       • Upgrade schema of your DB (software assisted)

  11. PGDB Storage: File or Relational Database
     • File storage:
       • Advantages:
         • No RDBMS installation and configuration
       • Disadvantages:
         • Must be loaded and saved in its entirety
         • No transaction history
         • No concurrent access for multiple users
     • Oracle/MySQL storage:
       • Advantages:
         • Faster read access, faster saves
         • Concurrent update access for multiple users
         • Stores history of all PGDB updates
       • Disadvantages:
         • RDBMS must be installed and configured

  12. Multiuser Access to PGDBs
     • PGDB stored within one Oracle or MySQL server
     • Each curator installs PTools on their workstation
     • Different curators can use different software platforms
     • Workstations query RDBMS server via internet
     • Local disk cache speeds access
     • For each frame access, PTools queries:
       • In-memory cache, disk cache, RDBMS server
     • After curator saves changes, all changes made by other users are loaded into curator’s session
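
The lookup order on this slide can be sketched as a generic tiered cache. This is not Pathway Tools code: the class `TieredFrameStore`, its attributes, and the frame ID are all hypothetical, and a plain dictionary stands in for the disk cache. It also assumes the three tiers are consulted in the order the slide lists them.

```python
class TieredFrameStore:
    """Look up frames in memory first, then on disk, then on the server."""

    def __init__(self, fetch_from_server):
        self.memory_cache = {}
        self.disk_cache = {}  # stand-in for a real on-disk cache
        self.fetch_from_server = fetch_from_server
        self.server_queries = 0  # counts round trips to the (fake) server

    def get_frame(self, frame_id):
        # Tier 1: in-memory cache.
        if frame_id in self.memory_cache:
            return self.memory_cache[frame_id]
        # Tier 2: disk cache.
        if frame_id in self.disk_cache:
            frame = self.disk_cache[frame_id]
        else:
            # Tier 3: the database server; remember the result on disk.
            self.server_queries += 1
            frame = self.fetch_from_server(frame_id)
            self.disk_cache[frame_id] = frame
        self.memory_cache[frame_id] = frame
        return frame


# Usage with a fake server table (hypothetical frame ID and contents).
server = {"FRAME-1": {"common-name": "example frame"}}
store = TieredFrameStore(server.__getitem__)
first = store.get_frame("FRAME-1")   # falls through to the server
second = store.get_frame("FRAME-1")  # served from the in-memory cache
print(store.server_queries)
```

The point of the layering is that repeated accesses to the same frame avoid a network round trip, which matters when workstations reach the server over the internet.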

  13. How to Release a PGDB?
     • Decide on release frequency and schedule
     • Don’t wait until it’s perfect to release it!
     • Freeze curation for 1 week
     • Quality assurance
       • Run consistency checker
         • Tools -> Consistency Checker
         • Also updates organism-summary statistics
     • Update publications, authors in organism frame
       • Update via Organism editor
     • Create new version of PGDB
       • ptools-local/pgdbs/yeastcyc/1.0/kb/yeastbase.ocelot
       • Edit against the new version, release the old version
     • Author release notes
     • Register PGDB in SRI PGDB registry
       • Will allow SRI to include it in BioCyc
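
The versioned path on this slide suggests a directory layout that can be sketched as below. The general pattern is inferred from the single yeastcyc example and is an assumption, as are the helper name `kb_path` and the next version number `1.1`. Check your own ptools-local directory before relying on it.

```python
from pathlib import PurePosixPath


def kb_path(ptools_local, org, version):
    """Build <ptools-local>/pgdbs/<org>cyc/<version>/kb/<org>base.ocelot.

    The pattern is generalized from one example path and may not hold
    for every PGDB.
    """
    return (
        PurePosixPath(ptools_local)
        / "pgdbs"
        / f"{org}cyc"
        / version
        / "kb"
        / f"{org}base.ocelot"
    )


released = kb_path("ptools-local", "yeast", "1.0")  # the version being released
working = kb_path("ptools-local", "yeast", "1.1")   # hypothetical next version
print(released)
print(working)
```

Keeping each version in its own directory is what makes it possible to continue curation in the new version while the old one is released unchanged.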

  14. Pathway Tools Data Import/Export
     • File->Export
     • File->Import
     • Export/import to/from tab-delimited files
     • Export to GenBank, SBML, BioPAX
     • Export to attribute-value files
       • Attribute-value files can be imported into BioWarehouse
         • Relational database system for bioinformatics database integration
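
Once data has been exported to a tab-delimited file, it can be read with standard tools. The sketch below uses Python's `csv` module on an in-memory sample. The column names and rows are hypothetical, since the actual columns depend on which export is chosen.

```python
import csv
import io

# Hypothetical export contents; real column names depend on the export chosen.
exported = "gene\tproduct\ng1\tenzyme A\ng2\tenzyme B\n"


def read_tab_delimited(handle):
    """Return one dict per data row, keyed by the header line."""
    return list(csv.DictReader(handle, delimiter="\t"))


rows = read_tab_delimited(io.StringIO(exported))
for row in rows:
    print(row["gene"], row["product"])
```

For a real export, replace the `io.StringIO` object with a file opened via `open(path, newline="")`.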

  15. Napster Comes to Bioinformatics
     • Public sharing of Pathway/Genome Databases
     • PGDB registry maintained by SRI at URL http://biocyc.org/registry.html
     • Registry operations
       • List contents of registry
       • Download PGDBs listed in the registry
       • Register PGDBs you have created

  16. Registry Details
     • Why register your PGDB?
       • Declare existence of your PGDB in a central location
       • Facilitate its download by other scientists
       • Facilitate its inclusion in BioCyc.org
     • Why download a PGDB?
       • Desktop Navigator provides more functionality than Web
       • Comparative operations
       • Programmatic querying and processing of PGDB
     • Registration process
       • Registered PGDBs have open availability by default
       • Authors can provide their own license agreements
       • Registered PGDBs reside in authors’ FTP site or HTTP server

  17. Desktop versus Web Mode
     • Pathway Tools runs in two different modes:
       • Desktop mode
       • Web mode (e.g., BioCyc.org)
     • Desktop vs Web functionality in Pathway Tools: http://biocyc.org/desktop-vs-web-mode.shtml
     • You can run both desktop and web modes at your site
     • Your PTools web server need not be open to the public
