wwpdb - PowerPoint PPT Presentation

Presentation Transcript

  1. www.wwpdb.org September 29, 2008

  2. Agenda
     • 10:00 am: Welcome and Introductions (KH)
     • 10:15: Overview of recent wwPDB progress (HB)
     • 10:35: Outreach (HN)
     • 10:55: NMR Task Force (JM)
     • 11:15: Improvements in Data Deposition and Processing (KH)
     • 11:45: New Projects (HB)
     • Noon: Working Lunch
     • 1:00 pm: Funding Update (All)
     • 1:30: Matters Arising (Committee membership; Next meeting)
     • 2:00: Discussion
     • 3:00: Executive Session
     • 3:15: Feedback
     • 3:30: Adjourn

  3. Worldwide Protein Data Bank www.wwpdb.org Overview Helen Berman

  4. wwPDBAC 2007 (on wwPDB Intranet)

  5. wwPDBAC 2007 Recommendations
     • Structure factors and/or NMR restraints should be a prerequisite for receiving a PDB ID. Status: done.
     • Inform the relevant journals of this new policy. Status: done; adopted by some but not all.
     • Validation: establish additional X-ray crystallography and NMR validation procedures. Status: in progress.
     • Results should be made available to depositors immediately after submission. Upon depositor request, the validation reports should be made available to designated scientific journal editors. Status: possible now; journal policies have not yet changed.
     • Work to establish recommendations for additional experimental data deposition and release requirements. Status: in progress.

  6. wwPDB Achievements, October 2007 - September 2008 • Continued growth of archive: now more than 50,000 structures • Website updates • Download statistics available • Publications and presentations • Enhanced complex molecule annotation • New Format document • Initiation of Common Annotation Tool development

  7. Depositions

  8. [Figure: Depositions to the PDB by decade. Axes: number of released entries by year.]

  9. [Figure: Depositor locations and download locations for RCSB PDB, PDBe, and PDBj.]

  10. PDB File Downloads (last 12 months) • FTP: 256,753,220 • HTTP: 47,102,103 • Total: 303,855,323
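
As a quick sanity check on the figures in slide 10, the short sketch below adds the FTP and HTTP counts and derives the FTP share. The variable names are illustrative only; the two input numbers are the ones quoted on the slide.

```python
# Download counts quoted on the slide (PDB file downloads, last 12 months).
ftp_downloads = 256_753_220
http_downloads = 47_102_103

# The slide's total should equal the sum of the two protocols.
total_downloads = ftp_downloads + http_downloads
ftp_share = ftp_downloads / total_downloads

print(f"Total: {total_downloads:,}")
print(f"FTP share: {ftp_share:.1%}")
```

The sum reproduces the total stated on the slide, and FTP accounts for most of the downloads.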

  11. Worldwide Protein Data Bank www.wwpdb.org Outreach Haruki Nakamura

  12. Outreach wwPDB website Simultaneous updating PDB archives Publications Professional society meetings Presentations Exhibit booth

  13. wwPDB website Deposition and Release Policies Deposition and download statistics Format Description Meeting information and preliminary recommendations

  14. Simultaneous weekly update of PDB archive • In the past, the PDBj site began copying the latest data and loading them into its local database system only after the RCSB PDB archive had been updated on Wednesday, so the PDBj database was updated with some delay. This frustrated potential PDBj users, who preferred to access RCSB PDB instead. • Since September 2008, PDBj copies the latest data directly from the internal database at RCSB PDB and pre-builds the PDBj database at midnight on Saturday. • RCSB PDB automatically sends an e-mail after updating its public FTP site every Wednesday; on receiving it, PDBj updates its own FTP site with little delay.

  15. Joint publications • K. Henrick, Z. Feng, W. Bluhm, D. Dimitropoulos, J.F. Doreleijers, S. Dutta, J.L. Flippen-Anderson, J. Ionides, C. Kamada, E. Krissinel, C.L. Lawson, J.L. Markley, H. Nakamura, R. Newman, Y. Shimizu, J. Swaminathan, S. Velankar, J. Ory, E.L. Ulrich, W. Vranken, J. Westbrook, R. Yamashita, H. Yang, J. Young, M. Yousufuddin, and H. Berman (2008) Remediation of the Protein Data Bank Archive. Nucleic Acids Res. 36 (Database issue): D426-D433. • J.L. Markley, E.L. Ulrich, H. Berman, K. Henrick, H. Nakamura, and H. Akutsu (2008) BioMagResBank (BMRB) as a partner in the Worldwide Protein Data Bank (wwPDB): New policies affecting biomolecular NMR depositions. J. Biomol. NMR 40: 153-155. • S. Dutta, K. Burkhardt, G.J. Swaminathan, T. Kosada, K. Henrick, H. Nakamura, and H.M. Berman (2008) Data deposition and annotation at the Worldwide Protein Data Bank. In Methods in Molecular Biology 426: Structural Proteomics: High-Throughput Methods, B. Kobe, M. Guss, and T. Huber, Editors. Humana Press: Totowa, NJ. • C.L. Lawson, S. Dutta, J.D. Westbrook, K. Henrick, and H.M. Berman (2008) Representation of viruses in the remediated PDB archive. Acta Cryst. D64: 874-882.

  16. Interactions
     • Exchange visits: PDBe/RCSB PDB; PDBj/RCSB PDB; PDBj/BMRB; BMRB/RCSB PDB; BMRB/PDBe
     • Phone conference with site directors twice a year
     • VTCs among staff: BMRB/RCSB PDB twice a month (ADIT-NMR); MSD/RCSB PDB weekly; RCSB PDB/PDBj and BMRB/PDBj; BMRB/PDBe
     • Daily emails among staff: PDBe/RCSB PDB; PDBj/RCSB PDB; BMRB/RCSB PDB, PDBj, PDBe
     • wwPDB Retreat 2007

  17. wwPDB Retreat

  18. IUCr Osaka 2008 • Joint exhibition stand • Presentations • Keynote lecture, What the Protein Data Bank tells us about the past, present and future of structural biology • Validation talk, Data Quality in the PDB Archive • Q&A at the Commission on Biological Macromolecules • Specialized Participation • Small Angle Commission • Workshop on New Routes to Crystallographic Data Publication • COMCIFs

  19. http://www.eccb08.org A demonstration describing the wwPDB highlighting the collaboration as well as services offered by member sites

  20. Worldwide Protein Data Bank www.wwpdb.org NMR Update John Markley

  21. NMR structure depositions • Number of NMR structures deposited through ADIT-NMR (09/01/07-08/31/08) • BMRB -> RCSB PDB 461 • PDBj - BMRB -> PDBj 112 • Restraints remediation • Processing is virtually complete • Will be released as soon as it can be made consistent with the remediated chemical components dictionary

  22. wwPDB policies and rules on NMR entries • Two types of NMR experiments will be distinguished in the PDB entries: solution NMR and solid-state NMR • NMR entries will have new PDB records: MDLTYP, to indicate MINIMIZED AVERAGE, and NUMMDL, to specify the number of models in the entry • These changes are reflected in Format Guide 3.2

  23. wwPDB policies and rules on NMR entries • The numbering of models is sequential, beginning with 1 • All models in a deposition (ensemble members and minimized average, if provided) should be superimposed in an appropriate author determined manner, and only one superposition method should be used. • All models in an NMR ensemble and the minimized average structure, if provided, should have the same sequence and covalent structure (exact same number and type of atoms: hydrogens and heavy atoms), and chemistry (e.g., protonation state)
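
The numbering and atom-content rules in slide 23 lend themselves to an automated check. The following is a minimal sketch, not the wwPDB's own validation code: it assumes fixed-column MODEL/ATOM/ENDMDL records, and the helper names (`model_line`, `atom_line`, `read_models`, `check_models`) are hypothetical. It checks only sequential numbering and identical atom content; it does not test superposition.

```python
def model_line(number):
    """Format a MODEL record with the serial in columns 11-14 (illustrative helper)."""
    return f"MODEL     {number:4d}"


def atom_line(serial, name, res_name, chain, res_seq, x=0.0, y=0.0, z=0.0):
    """Format a minimal fixed-column ATOM record (illustrative helper)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def read_models(pdb_text):
    """Return {model_number: [(atom, residue, chain, res_seq), ...]} in file order."""
    models = {}
    current = None
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = int(line[10:14])
            models[current] = []
        elif record == "ENDMDL":
            current = None
        elif record in ("ATOM", "HETATM") and current is not None:
            models[current].append(
                (line[12:16].strip(), line[17:20].strip(), line[21], int(line[22:26]))
            )
    return models


def check_models(models):
    """Report violations of sequential numbering and identical atom content."""
    problems = []
    numbers = list(models)
    if numbers != list(range(1, len(numbers) + 1)):
        problems.append(f"model numbers {numbers} are not sequential from 1")
    if numbers:
        reference = sorted(models[numbers[0]])
        for number in numbers[1:]:
            if sorted(models[number]) != reference:
                problems.append(
                    f"model {number} differs from model {numbers[0]} in atom content"
                )
    return problems


# A made-up two-model ensemble with identical atoms in each model.
ensemble = "\n".join([
    model_line(1), atom_line(1, "N", "ALA", "A", 1), atom_line(2, "CA", "ALA", "A", 1), "ENDMDL",
    model_line(2), atom_line(1, "N", "ALA", "A", 1), atom_line(2, "CA", "ALA", "A", 1), "ENDMDL",
])
print(check_models(read_models(ensemble)))
```

Because hydrogens are ordinary ATOM records, a difference in protonation state between models would show up as a difference in atom content under this check.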

  24. Policies clarified by NMR Task Force, August 26, 2008 • PDB will accept minimized average structures only if they meet the above criteria for alignment and covalent structure • The number of models will not be limited in a PDB file • Chemical shift deposition will become mandatory • Depositors are encouraged to avail themselves of third-party validation software prior to deposition of NMR structures

  25. Worldwide Protein Data Bank www.wwpdb.org Improvements in Data Deposition and Annotation Kim Henrick

  26. A year of VTCs and discussions

  27. PDB Contents Guide Version 3.2 The goal was to further clarify all formats and procedures so as to create a more uniform archive

  28. Process • Every record was reviewed for scientific correctness and clarity by wwPDB annotators • Some records were added and others expanded • Task Force members were consulted where appropriate

  29. Added PDB Format Records
     • SPLIT: for large structures, to indicate number of PDB entries
     • NUMMDL: number of MODELs in an entry
     • MDLTYP: model types, and whether chains are C-alpha only
     • REMARK 0: re-refinement notice
     • REMARK 475: residues modeled with zero occupancy
     • REMARK 480: polymer atoms modeled with zero occupancy
     • REMARK 620: metal coordination
     • REMARK 630: inhibitor description
     • DBREF1 / DBREF2: to match very long UniProt identifiers
     • DBREF: standard format still used
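
To see which of the records added in slide 29 a given file actually uses, one could scan the record names. The sketch below is an assumption-laden illustration: it treats the first six columns as the record name and columns 7-10 as the REMARK number, and the function name `find_new_records` and the sample text are invented for the example.

```python
# Record names and REMARK numbers listed on the slide as additions.
NEW_RECORDS = {"SPLIT", "NUMMDL", "MDLTYP", "DBREF1", "DBREF2"}
NEW_REMARKS = {0, 475, 480, 620, 630}


def find_new_records(pdb_text):
    """Return the set of newly added record types present in a PDB-format text."""
    found = set()
    for line in pdb_text.splitlines():
        name = line[:6].strip()
        if name in NEW_RECORDS:
            found.add(name)
        elif name == "REMARK":
            number = line[6:10].strip()
            if number.isdigit() and int(number) in NEW_REMARKS:
                found.add(f"REMARK {int(number)}")
    return found


# Made-up sample lines, for illustration only.
sample = "\n".join([
    "NUMMDL    20",
    "MDLTYP    MINIMIZED AVERAGE",
    "REMARK   2 RESOLUTION. NOT APPLICABLE.",
    "REMARK 475 ZERO OCCUPANCY RESIDUES",
])
print(sorted(find_new_records(sample)))
```

Older REMARK numbers, such as REMARK 2 in the sample, are ignored because they are not among the additions listed on the slide.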

  30. Internal Documentation

  31. Results • Complete new Format document produced and released to public September 15, 2008 • Files will be processed according to this specification starting November 15, 2008 • All files in archive will be brought up to this standard Q1 2009

  32. X-ray Validation Task Force Workshop April 14-16, 2008 EBI, Hinxton, UK www.wwpdb.org/workshop/2008/index.html Randy Read (Chair), Paul Adams, Axel Brunger, Paul Emsley, Robbie Joosten, Gerard Kleywegt, Eugene Krissinel, Thomas Luetteke, Zbyszek Otwinowski, Tassos Perrakis, Jane Richardson, Will Sheffler, Janet Smith, Ian Tickle, Gert Vriend

  33. wwPDB Validation Task Force • This meeting of the X-ray Validation Task Force was held to collect recommendations and develop consensus on additional validation that should be performed on PDB entries, and to identify software applications to perform validation tasks. • Preliminary outcomes: candidate global and local validation measures were identified; these measures were reviewed in terms of the requirements of depositors, reviewers, and users • Workshop report to be published in Fall 2008

  34. Remediation and Curation of Complex Chemistry in the PDB

  35. Scope • Inhibitor molecules: annotate the chemical component dictionary and migrate details to PDB entries • Ribosomal (post-translationally modified) and non-ribosomal cyclic, modified and conjugated peptides: consistently given SEQRES and SOURCE records; annotate an entity look-up table and transfer to PDB entries

  36. 2VUM AMANITIN

  37. AMANITIN • Recently shown to be a gene product • Mapping to UniProt, e.g. AMATX_AMAPH (P85421) • 2VUM is cyclically permuted and needs to be corrected from
       SEQRES 1 M 8 ASN HYP ILX TRX GLY ILE GLY CSX
     to
       SEQRES 1 M 8 ILX TRX GLY ILE GLY CSX ASN HYP
     to align with the gene sequence for beta-amanitin from Amanita phalloides and alpha-amanitin from Amanita bisporigera. The encoded sequence would be Ile-Trp-Gly-Ile-Gly-Cys-Asn-Pro. • Needs MODRES records to match the gene product
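
The 2VUM correction in slide 37 is a cyclic permutation of the residue list: the same residues in the same circular order, starting at a different position. A minimal sketch of how such a relationship could be detected is given below; the function name `rotation_offset` is hypothetical, and the two residue lists are the SEQRES lines quoted on the slide.

```python
def rotation_offset(deposited, corrected):
    """Return the left-rotation that turns `deposited` into `corrected`, or None."""
    if len(deposited) != len(corrected):
        return None
    if not deposited:
        return 0
    doubled = deposited + deposited  # every rotation is a window of the doubled list
    size = len(deposited)
    for offset in range(size):
        if doubled[offset:offset + size] == corrected:
            return offset
    return None


# Residue lists from the two SEQRES records on the slide.
deposited = "ASN HYP ILX TRX GLY ILE GLY CSX".split()
corrected = "ILX TRX GLY ILE GLY CSX ASN HYP".split()

print(rotation_offset(deposited, corrected))
```

For these two lists the corrected record is the deposited one rotated left by two residues, which moves ASN HYP to the end so that the order follows the encoded Ile-Trp-Gly-Ile-Gly-Cys-Asn-Pro sequence.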

  38. Cyclic, Modified and Conjugated Peptides • May be ribosomal or non-ribosomal • Non-gene peptides, e.g. actinomycin D, i.e. require a gene cluster • Nonribosomal peptides: http://bioinfo.lifl.fr/norine/ or Novel Antibiotics DataBase: http://www.nih.go.jp/~jun/NADB/search.html

  39. Value to users • To understand unique and shared aspects of a particular occurrence • To find a specific system: some components of a PDB file, such as inhibitors and antibiotic peptides, might not be found or even be apparent • To study related ligands across different proteins

  40. Challenges • Inclusion of non-standard amino acid, nucleotides, or other chemical groups in sequence • Non-linear (cyclic or branched) sequences • Microheterogeneity (some cases) • Non-uniform annotation of the same molecule in different PDB entries • Lack of annotation regarding the source and function of these molecules

  41. Solutions • Analysis and classification • Identify antibiotics and inhibitors and group them into polymeric molecules or single components • Dictionary updates • Build single chemical components for appropriate cases • Update dictionary with source, function and other details • Remediation and future processing • Edit/revise files to include compound name, sequence, source and function for all antibiotics and inhibitors • Establish rules and procedures to make new annotations consistent

  42. Single component vs. Polymeric • Single component antibiotics or inhibitors • Build component and retain subcomponent information; annotate dictionary with details about molecule • Migrate details from dictionary to entry files in specific remarks • e.g. D-Phenylalanyl-L-prolyl-L-arginine chloromethyl ketone (PPACK) • Polymeric (peptide-like) antibiotics or inhibitors • Present sequence, compound name, and source information as any regular polymer • Include details about functions in specific remarks • e.g. post-translationally modified ribosomal peptides, non-ribosomal cyclic, modified or conjugated peptides

  43. How many? • Antibiotics (~1300 identified PDB entries): Antibacterial, Antiviral, Antimicrobial, Antifungal, Antibiotic; overlap with Anticancer, Anti-inflammatory, Immunosuppressant, Herbicide • Single component: ~1000 • Polymeric: ~300 • Inhibitors: natural and synthetic inhibitors of enzymes and other cellular processes • Single component: ~350 • Polymeric: ~350 • Others • Toxins: ~120

  44. THIOSTREPTON

  45. THIOSTREPTON: 4 PDB entries with 4 different representations
     • 1e9w: SEQRES THR ILE ALA DHA ALA DHA PYT
     • 2jq7: SEQRES ILE ALA DHA ALA
     • 1oln: LINKed HETs; ROP incorrectly used
     • 3cf5: is single molecule TXX
     • SEQRES should be TZO THR TZB TSI TZO XAA QUA ILE ALA DHA ALA XBB TZO DHA PYT
     • Now matched in all 4 entries; TXX obsolete

  46. THIOSTREPTON
     _entity.pdbx_description
     ; Thiostrepton is a complex bacterial natural product containing thiazole rings that is used as a topical veterinary antibiotic and also has promising antimalarial and anticancer activity. First isolated from bacteria in 1955, thiostrepton has an unusual type of antibiotic activity: it disables protein biosynthesis by binding to ribosomal RNA and one of its associated proteins, and interacts directly with 23S rRNA nucleotides 1067A and 1095A
     ;
     _entity.type "Polypeptide, sulfur containing antibiotic"
     _entity.details
     ; Thiostrepton is a macrocyclic antibiotic incorporating thiazoles and other atypical amino acids. Patented in 1961, thiostrepton has been used as an antibiotic and acts by binding to ribosomes to prevent the binding of the EF-G elongation factor and GTP to the 50S ribosomal subunit. Thiostrepton is an inducer of tipA, a gene that controls the bacterial transcription regulators TipAL and TipAS, members of the MerR proteins that are central regulators in multidrug resistance. It is closely related to siomycin, a recently discovered inhibitor of the oncogenic transcription factor FoxM1. The thiostrepton-resistance gene is also commonly used as a selective marker for recombinant DNA/plasmid technologies.

  47. THIOSTREPTON
     1 "CAS" "1393-48-2" ?
     1 "PUBCHEM" "16130278" ?
     1 "Merck Index" "11:9295 ; 14:9364" ?
     1 "RTECS" "XN6300100" ?
     1 "MDL number" "MFCD00135828" http://www.mdli.com/
     1 "EG/EC Number" "215-734-9" ?
     1 "ChemSpider" 10469505 http://www.chemspider.com/
     1 "URL" http://www.fermentek.co.il/Thiostrepton.htm ?
     1 "URL" http://www.tebu-bio.com/file/product/170BIA-T1158-1/ ?
     1 "URL" http://www.bioaustralis.com/pdfs/thiostrepton.pdf ?
     1 "Sigma Aldrich" "T8902" http://www.sigmaaldrich.com/
     1 "Chemical Class" "macrolide" ?
     1 "MESH" "Peptides, Cyclic [D04.345.566]" ?
     1 "Pharm. Action" "Anti-Bacterial Agent" ?
     1 "Image" http://pubs.acs.org/cen/images/8239/8239notw4image.gif ?
     1 "Image" http://en.wikipedia.org/wiki/Image:Thiostrepton.png ?

  48. Alert - New Protein Modifications (Thu, September 25, 2008 1:17 pm; John S. Garavelli, UniProt/RESID database) • micrococcin P1: SCTTCVCTCSCCT, Bacillus cereus strain ATCC 14579, UniProt:Q812G9_BACCR; incorrectly annotated as a putative lantibiotic peptide • It is now believed that all the pyridinyl polythiazole antibiotics, including micrococcin P1, thiostrepton, thiocillin, GE2270 A and sulfamycin B, are genetically encoded directly.

  49. THIOSTREPTON
     TZO THR TZB TSI TZO XAA QUA ILE ALA DHA ALA XBB TZO DHA PYT
     SEQRES QUA ILE ALA SER ALA SER CYS THR THR CYS ILE CYS THR CYS SER CYS SER SER NH2