
Management and Distribution of Chemical Data in the Protein Data Bank

Management and Distribution of Chemical Data in the Protein Data Bank. John Westbrook, Dimitris Dimitropoulos, Jasmine Young, Peter Rose, Philip E. Bourne and Helen Berman, RCSB Protein Data Bank. U.S. Government Chemical Databases and Open Chemistry, August 26, 2011.


Presentation Transcript


  1. Management and Distribution of Chemical Data in the Protein Data Bank. John Westbrook, Dimitris Dimitropoulos, Jasmine Young, Peter Rose, Philip E. Bourne and Helen Berman, RCSB Protein Data Bank. U.S. Government Chemical Databases and Open Chemistry, August 26, 2011

  2. What is the Protein Data Bank?
     • Single international archive for information about the structure of large biological molecules
     • PDB depositions should be restricted to atomic coordinates that are substantially determined by experimental measurements on specimens containing biological macromolecules (Outcome of a Workshop on Archiving Structural Models of Biological Macromolecules (2006) Structure 14: 1211-1217)

  3. What is the content of the PDB?
     Public archive (August 2011)
     • More than 75,000 entries
     • More than 550,000 files
     • Requires over 115 GB of storage
     • Data dictionaries
     • Derived data files
     For each entry
     • Atomic coordinates
     • Sequence information
     • Description of structure
     • Experimental data
     • Release status information
     Internal archive
     • Depositor correspondence
     • Depositor contact information
     • Paper records
     • Documentation
     • Historical records from Day One

  4. Who manages the PDB? (organization logos not preserved; the groups of names that follow accompanied them) EMBL-EBI, Wellcome Trust, BBSRC, NIGMS, EU; NSF, NIGMS, DOE, NLM, NCI, NINDS, NIDDK; NBDC-JST; NLM

  5. Who uses the PDB? Depositors Users

  6. Number of released entries (chart not preserved; axis label: Year)

  7. Chemical data in PDB
     • Understanding the interactions between proteins and small molecules is key to understanding biological function
     • Providing accurate chemical descriptions is a major focus of PDB annotation
     • All polymer and small molecule chemical components are described in the PDB Chemical Component Dictionary
     • Significant software and data infrastructure has been created to maintain this dictionary and to provide a consistent chemical representation across the PDB archive
     • Chemical representation in the PDB is under constant scrutiny and is continuously improved

  8. How does new chemistry enter the PDB?
     Deposited coordinates → Process deposited entry → Chemical components → Perceived covalent structure → Compare with dictionary → New?
     • Yes: Annotate chemical definition → Chemical Component Dictionary
     • No: Standardize residue/atom nomenclature
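
The decision on this slide (compare a perceived component with the dictionary, then either standardize its nomenclature or annotate a new definition) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the wwPDB annotation software: the `Component` record, the `process_component` function, and the idea of looking components up by a single descriptor string are all hypothetical, and the keys shown are placeholders rather than real descriptors.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Component:
    """Hypothetical, simplified chemical component record."""
    comp_id: str
    descriptor: str  # assumed lookup key, e.g. an InChIKey-like string
    atom_names: list[str] = field(default_factory=list)


def process_component(perceived: Component,
                      dictionary: dict[str, Component]) -> tuple[str, Component]:
    """Compare a perceived component with the dictionary.

    Returns ("standardized", definition) when the chemistry is already
    known, or ("annotated", new_definition) when it is new.
    """
    known = dictionary.get(perceived.descriptor)
    if known is not None:
        # Not new: adopt the dictionary identifier and atom nomenclature.
        standardized = Component(known.comp_id, known.descriptor,
                                 list(known.atom_names))
        return "standardized", standardized
    # New: a real pipeline would involve manual annotation; this sketch
    # simply registers the perceived component as the new definition.
    dictionary[perceived.descriptor] = perceived
    return "annotated", perceived


# Placeholder data: "KEY-A" and "KEY-B" are not real descriptors.
dictionary = {"KEY-A": Component("AAA", "KEY-A", ["C1", "O1"])}
status, comp = process_component(Component("UNK", "KEY-A", ["X1", "X2"]), dictionary)
print(status, comp.comp_id, comp.atom_names)
```

In practice, matching on a perceived covalent structure is a graph comparison rather than a string lookup; the dictionary key here only stands in for that step.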

  9. Assessing data quality
     • Chemical data in PDB are experimentally derived, subject to modeling restraints
     • Examples shown: PDB entry 3dnb (1.3 Å resolution); PDB entry 6bna (2.21 Å resolution)

  10. How are data checked now?
     • Chemistry
       • Polymer (match to sequence DB and internal consistency)
       • Ligands, ions, inhibitors (match to dictionary)
     • Geometry
       • Close contacts
       • Valence geometry
       • Torsion angles
     • Experimental data
       • Model vs. structure factors
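
As an illustration of the "close contacts" check named on this slide, a minimal sketch is given below. It flags every pair of atoms closer than a cutoff distance. The function name `close_contacts`, the input layout, and the 2.2 Å default are assumptions made for the example; the slide does not state which thresholds or software the PDB uses, and a real check would also exclude covalently bonded pairs.

```python
import math
from itertools import combinations


def close_contacts(atoms, cutoff=2.2):
    """Return (name_a, name_b, distance) for atom pairs closer than cutoff.

    atoms: list of (name, (x, y, z)) tuples with coordinates in angstroms.
    cutoff: illustrative distance threshold in angstroms (an assumption).
    """
    hits = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(atoms, 2):
        distance = math.dist(xyz_a, xyz_b)
        if distance < cutoff:
            hits.append((name_a, name_b, round(distance, 3)))
    return hits


# Made-up coordinates for demonstration only.
atoms = [("A", (0.0, 0.0, 0.0)), ("B", (1.0, 0.0, 0.0)), ("C", (5.0, 0.0, 0.0))]
print(close_contacts(atoms))
```

The all-pairs loop is quadratic in the number of atoms; for whole entries a spatial grid or k-d tree would be the usual choice.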

  11. On-going focus on data quality
     Method-specific Validation Task Forces have been convened to collect recommendations and develop consensus on method-specific issues, including validation checks that should be performed and identification of validation software applications.
     • X-ray Validation: 2008 Workshop on Next Generation Validation Tools for the wwPDB. White paper accepted by Structure. Chair: Randy J. Read (University of Cambridge)
     • NMR Validation: Meetings held September 2009, January 2011. Report in progress. Chairs: Gaetano Montelione (Rutgers), Michael Nilges (Institut Pasteur)
     • 3DEM Validation: Meeting September 2010. White paper in progress. Chairs: Richard Henderson (Maps, Cambridge University), Andrej Sali (Models, UCSF)
     • Small-Angle Scattering: Members: Jill Trewhella (University of Sydney), Dmitri Svergun (EMBL Hamburg), Andrej Sali (UCSF), Mamoru Sato (Yokohama City University), John Tainer (Scripps)

  12. Documenting PDB chemistry in the Chemical Component Dictionary
     • Library of all polymer and non-polymer chemical components in PDB
     • ~13,000 chemical component definitions
     • 400 additional definitions of amino acid protonation variants
     • ~700 new components released this year
     • ~1700 component definitions updated this year
     • Maintained by members of the wwPDB

  13. wwPDB resources wwpdb.org

  14. Chemical Component Dictionary and data download options
     • Chemical definitions in mmCIF, PDBML/XML and SDF/MOL formats
     • Tabulations of SMILES, InChI and InChIKey descriptors for each chemical definition
     • Bundles of coordinates extracted from PDB entries for each ligand in the archive, stored in mmCIF, PDBML and SDF/MOL formats
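
A downloaded descriptor tabulation could be loaded along the lines below. The slide does not describe the file layout, so this sketch assumes one tab-separated record per line in the order descriptor, component identifier, name; the function name `parse_descriptor_table` is hypothetical, and the layout should be checked against the actual download before use.

```python
def parse_descriptor_table(text):
    """Parse tab-separated descriptor records into a dict keyed by component ID.

    Assumed layout per line: descriptor <TAB> component ID <TAB> name.
    Blank lines and lines with fewer than two fields are skipped.
    """
    table = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        parts = line.split("\t")
        if len(parts) < 2:
            continue
        descriptor, comp_id = parts[0].strip(), parts[1].strip()
        name = parts[2].strip() if len(parts) > 2 else ""
        table[comp_id] = (descriptor, name)
    return table


# Two small records written by hand in the assumed layout.
sample = "O\tHOH\twater\nCCO\tEOH\tethanol\n"
print(parse_descriptor_table(sample))
```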

  15. Chemical Component Dictionary content
     • Molecular names and synonyms
     • Chemical formula, formula weight, and formal charge
     • Atom and residue nomenclature
     • Polymer linking type
     • Model coordinates (an example from a PDB entry)
     • Computed coordinates (Corina or OpenEye)
     • Connectivity and bond types
     • Stereochemistry and aromaticity
     • Systematic names (ACDLabs & OpenEye)
     • SMILES, InChI, and InChIKey descriptors
     • Release status and revision history
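
To make the relationship between two of the listed fields concrete, the sketch below computes a formula weight from a chemical formula. It assumes the formula is written as space-separated element/count tokens such as "C2 H6 O", and it carries only a small table of standard atomic masses; both the `formula_weight` name and the token format are assumptions for illustration, not a description of how the dictionary values are produced.

```python
import re

# Standard atomic masses (abridged) for a few common elements.
ATOMIC_MASS = {
    "H": 1.008,
    "C": 12.011,
    "N": 14.007,
    "O": 15.999,
    "P": 30.974,
    "S": 32.06,
}


def formula_weight(formula):
    """Sum atomic masses for a formula given as tokens like 'C2 H6 O'."""
    total = 0.0
    for token in formula.split():
        match = re.fullmatch(r"([A-Za-z]{1,2})(\d*)", token)
        if match is None:
            raise ValueError(f"unrecognized formula token: {token!r}")
        symbol = match.group(1).capitalize()
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"no atomic mass tabulated for {symbol!r}")
        count = int(match.group(2) or 1)  # a bare symbol means one atom
        total += ATOMIC_MASS[symbol] * count
    return round(total, 3)


print(formula_weight("H2 O"))     # water
print(formula_weight("C2 H6 O"))  # ethanol
```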

  16. Chemical Component Dictionary Interpretation
     Definitions include
     • Common or representative forms of the molecule
     • Generally neutral and complete molecules
     • Off-the-shelf reagents used to prepare an experimental sample
     • Model coordinates from a single experimental observation
     • Computed coordinates from programs: Corina or OpenEye/Omega

  17. Searching the Chemical Component Dictionary (ligand-expo.rcsb.org)
     Search options
     • Molecular Name
     • Formula
     • SMILES
     • InChI/InChIKey
     • PDB component identifier
     • Chemical substructure
     Browsing options
     • Standard and modified amino acids
     • Standard and modified nucleotides
     • Selected top-selling pharmaceuticals
     • Common aromatic ring systems

  18. Ligand Expo: Browse dictionary content

  19. Ligand Expo: View chemical details

  20. Ligand Expo: View chemical details

  21. Ligand Expo: Find data in related resources

  22. Find small molecules at the RCSB PDB (http://www.pdb.org)
     • Simple search for all entries containing a particular ligand
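
A "find all entries containing a particular ligand" lookup is commonly served by an inverted index from component identifier to entry identifiers. The sketch below shows that idea only; it is not the RCSB PDB search implementation, the function names are hypothetical, and the entry IDs ("E001" and so on) are placeholders rather than real PDB entries.

```python
from collections import defaultdict


def build_ligand_index(entry_ligands):
    """Map each component ID to the sorted list of entries that contain it.

    entry_ligands: dict of entry ID -> iterable of component identifiers.
    """
    index = defaultdict(set)
    for entry_id, ligands in entry_ligands.items():
        for comp_id in ligands:
            index[comp_id.upper()].add(entry_id)
    return {comp_id: sorted(entries) for comp_id, entries in index.items()}


def entries_with_ligand(index, comp_id):
    """Return the entries containing comp_id (case-insensitive), or []."""
    return index.get(comp_id.upper(), [])


# Placeholder entry IDs; the ligand assignments are invented for the demo.
entries = {"E001": ["ATP", "MG"], "E002": ["atp"], "E003": ["HEM"]}
index = build_ligand_index(entries)
print(entries_with_ligand(index, "atp"))
```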

  23. RCSB PDB Small molecule Advanced Search
     • Interactive chemical structure search with graphics
     • Exact, substructure, superstructure, MW searches
     • Restricted formula searches

  24. RCSB PDB report and display of molecular interactions

  25. Access
     • RCSB Protein Data Bank: www.pdb.org
     • Ligand Expo: ligand-expo.rcsb.org
     • wwPDB: www.wwpdb.org
     • Dictionary Resources: mmcif.pdb.org, pdbml.pdb.org

  26. Acknowledgements. Operated by two members of the RCSB: [logos not preserved]. The RCSB PDB is a member of the [logo not preserved]. Supported by: [logos not preserved; the extracted text includes NIGMS]
