Databases and Resources on 3D Structures of Biological Macromolecules

Inter-University DEA/DES Bioinformatics 2000-2001. Shoshana J. Wodak, SCMBB-ULB.


Presentation Transcript


  1. Databases and Resources on 3D Structures of Biological Macromolecules. Inter-University DEA/DES Bioinformatics 2000-2001. Shoshana J. Wodak, SCMBB-ULB

  2. The different types of macromolecular structure databases
     - Major public repositories for (primary) structural data
     - Databases of derived information:
       - Classifications of protein domains & folds
       - etc.
       - On-line servers for analysing protein structures:
         - defining structural domains
         - assigning secondary structure
         - calculating surfaces and volumes
         - etc.
     Overview of the different databases and underlying methods

  3. Repositories for data on 3D structures of biological macromolecules
     - PDB: Protein Data Bank, 3D structures of biological macromolecules [http://www.rcsb.org/pdb/]
     - MMDB: Entrez (NCBI) structure database (no models) [http://www.ncbi.nlm.nih.gov:80/Structure/MMDB/mmdb.shtml]
     - BioMagResBank: data on 3D structures determined by NMR [http://www.bmrb.wisc.edu/]
     - CSD: Cambridge Structural Database (small molecules) [http://www.ccdc.cam.ac.uk/]
     - http://www.expasy.ch/alinks.html#Proteins

  4. Protein Data Bank (PDB)
     Holdings on Dec. 5, 2000:
     - 13861 coordinate entries
     - 4406 structure factor files
     - 904 NMR restraint files
     [Figure: total available structures and structures deposited per year; legend: new folds, old folds]

  5. Methods for determining 3D structures of biological macromolecules
     - Methods yielding models at atomic resolution
       - X-ray diffraction
       - Neutron diffraction
       - High-resolution Nuclear Magnetic Resonance (NMR)
     - Methods yielding low-resolution models, or information on structure
       - Low-angle X-ray diffraction
       - Electron microscopy; electron diffraction
       - CD and infra-red spectroscopy

  6. The 3D structure of biological macromolecules by X-ray diffraction
     [Figure labels: X-ray source; incident X-ray beam, λ; crystal; diffracted X-ray beams in discrete directions, λ; diffraction pattern]

  7. From the diffraction pattern to the 3D atomic model
     Diffraction pattern to atomic model:
     - derive phases
     - compute ρ(x) = FT(F(s))
     - build and refine model
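
The Fourier-transform step on the slide above can be written out as the usual electron-density synthesis. This is the standard textbook form, not taken from the slides: ρ is the electron density, V (the unit-cell volume) and φ(s) (the phase of each reflection) are symbols introduced here, and the sum runs over the measured reflections s. Only the amplitudes |F(s)| come from the diffraction intensities, which is why the phases have to be derived first.

```latex
\rho(\mathbf{x}) \;=\; \frac{1}{V} \sum_{\mathbf{s}}
  \lvert F(\mathbf{s}) \rvert \, e^{\,i\varphi(\mathbf{s})} \,
  e^{-2\pi i\, \mathbf{s}\cdot\mathbf{x}}
```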

  8. The 3D structure of biological macromolecules by NMR spectroscopy
     - (NOE) distance constraints
     - Chemical shifts, J couplings
     [Figure: 2D proton NMR spectrum of a β-hairpin; atomic model of a β-sheet from NMR data]

  9. Insulin gene enhancer protein, mammalian (rat): NMR structure, 50 conformations
     Engrailed homeodomain, Drosophila melanogaster: crystal structure, resolution 2.1 Å

  10. From atomic model to physical properties and molecular function
      - Ribbon drawing, Rasmol/MolMol/
      - Electrostatic potential displayed on molecular surface

  11. Classifications of protein structures (domains)
      - CATH: structural classification of proteins [http://www.biochem.ucl.ac.uk/bsm/cath/]
      - SCOP: structural classification of proteins [http://scop.mrc-lmb.cam.ac.uk/scop/]
      - FSSP: fold classification based on structure alignments [http://www.sander.ebi.ac.uk/fssp/]
      - HSSP: homology-derived secondary structure assignments [http://www.sander.ebi.ac.uk/hssp/]
      - DALI: classification of protein domains [http://www.ebi.ac.uk/dali/domain/]
      - VAST: structural neighbours by direct 3D structure comparison [http://www.ncbi.nlm.nih.gov:80/Structure/VAST/vast.shtml]
      - CE: structure comparisons by Combinatorial Extension [http://cl.sdsc.edu/ce.html]

  12. Some aspects of methodology
      - Secondary structure assignments
      - Structure comparisons; structure-structure alignments
      - Defining structural domains from atomic coordinates
      - Calculation of the molecular and solvent accessible surface (will be dealt with subsequently)
      - Homology modelling (will be dealt with subsequently)

  13. Secondary structure assignments
      - By the crystallographer
        - visual inspection; modelling programs (‘O’)
      - By completely automatic procedures
        - DSSP (Kabsch & Sander, 1983): computes φ,ψ angles, H-bonds, solvent accessibilities
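
As an illustration of the backbone angles mentioned on the slide above, the sketch below computes a signed dihedral angle from four atom positions and applies it to φ (C of the previous residue, N, Cα, C) and ψ (N, Cα, C, N of the next residue). It is a minimal stand-alone example, not DSSP's own code; the function names `dihedral` and `phi_psi` are hypothetical, and coordinates are assumed to be plain (x, y, z) tuples in Å.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p1, p2, p3, p4):
    """Signed dihedral angle in degrees about the p2-p3 bond."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)  # normal of the plane through p2, p3, p4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(c_prev, n, ca, c, n_next):
    """Backbone angles of one residue (hypothetical helper).

    phi: C(i-1), N(i), CA(i), C(i); psi: N(i), CA(i), C(i), N(i+1).
    """
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)
```

With this construction a cis arrangement gives 0° and a trans arrangement gives ±180°, so the result can be compared directly with the usual Ramachandran ranges.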

  14. Structure comparisons and structure-structure alignments
      Q: Is structure A similar to structure B?
      A: From structure alignments (see accompanying transparencies)
      [Figure labels: Structure A; Structure B]
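
Once an alignment has paired up equivalent residues, one simple way to put a number on "how similar" is a distance-matrix RMSD, which compares intra-molecular distances and so needs no superposition. The sketch below is a generic illustration under that assumption, not the method of any program listed in these slides; `drmsd` is a hypothetical name, and the two coordinate lists are assumed to hold equivalent residues (for example Cα atoms) in the same order.

```python
import math

def drmsd(coords_a, coords_b):
    """Distance-matrix RMSD between two equally long coordinate lists.

    Residue i of coords_a is assumed equivalent to residue i of coords_b.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have the same number of aligned residues")
    n = len(coords_a)
    if n < 2:
        raise ValueError("need at least two aligned residues")
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            d_a = math.dist(coords_a[i], coords_a[j])
            d_b = math.dist(coords_b[i], coords_b[j])
            total += (d_a - d_b) ** 2
            pairs += 1
    return math.sqrt(total / pairs)
```

Because only internal distances are used, the value is unchanged by rotating or translating either structure. It is also unchanged by mirroring, which is a known limitation of distance-based measures.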

  15. Defining Domains: What for?
      - Identify regions of the polypeptide chain that fold independently and are stable on their own
        → folding units; initiation sites for folding
      - Identify gene fusion or gene insertion events from analysis of the 3D structure
        → relate to evolutionary history
      - Allow for meaningful structural classification of proteins
        → SCOP; CATH classifications

  16. Defining Domains: What for?
      - Link domain structure to function
        - Enzyme active sites are often at domain interfaces; domain movements play a functional role
        - Different structural domains can be associated with different functions
      [Figure: DNA methyltransferase; cathepsin D]

  17. Methods for Identifying Domains
      Underlying principles:
      - Interactions between residues within domains are more extensive than between domains (Wetlaufer, 1973; Richardson, 1981)
      - Interactions are modelled by counting inter-atomic contacts or computing buried surface area
      [Figure labels: D1, D2]
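
The contact-counting idea above can be sketched as follows. This is a minimal illustration, not any of the published procedures: it assumes one representative point per residue (for example the Cα atom), a hypothetical distance cutoff of 8 Å, and a domain label already assigned to each residue; `contact_summary` is a made-up name.

```python
import math

def contact_summary(coords, labels, cutoff=8.0):
    """Count residue pairs in contact within and between domains.

    coords: one (x, y, z) tuple per residue; labels: one domain label per residue.
    Returns (intra_domain_contacts, inter_domain_contacts).
    """
    if len(coords) != len(labels):
        raise ValueError("need exactly one label per residue")
    intra = inter = 0
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            # Two residues are "in contact" if they lie within the cutoff (assumption).
            if math.dist(coords[i], coords[j]) <= cutoff:
                if labels[i] == labels[j]:
                    intra += 1
                else:
                    inter += 1
    return intra, inter
```

For a sensible domain assignment the first count should be much larger than the second. Sequence neighbours are counted too, so a real implementation might exclude pairs that are close along the chain.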

  18. Methods for Identifying Domains
      Underlying principles:
      - Domain limits are defined by identifying groups of residues such that the number of contacts between groups is minimized
      [Figure: chain partitions (N to C terminus) requiring 1 cut, 2 cuts, and 4 cuts]
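
The simplest case, a single cut of the chain into two contiguous pieces, can be sketched as a scan over all cut positions that keeps the one with the fewest contacts across the cut. This is an illustrative toy under stated assumptions, not one of the published algorithms: one point per residue, a hypothetical 8 Å cutoff, and a hypothetical `min_size` that stops the scan from trivially cutting off a few terminal residues.

```python
import math

def best_single_cut(coords, cutoff=8.0, min_size=2):
    """Find the 1-cut partition with the fewest inter-domain contacts.

    Returns (k, contacts): residues [0, k) form one domain, [k, n) the other.
    """
    n = len(coords)
    if n < 2 * min_size:
        raise ValueError("chain too short for two domains of min_size residues")
    best = None
    for k in range(min_size, n - min_size + 1):
        # Contacts between the N-terminal piece [0, k) and the C-terminal piece [k, n).
        contacts = sum(
            1
            for i in range(k)
            for j in range(k, n)
            if math.dist(coords[i], coords[j]) <= cutoff
        )
        if best is None or contacts < best[1]:
            best = (k, contacts)
    return best
```

Partitions needing 2 or 4 cuts correspond to domains built from non-contiguous segments, which a single scan like this cannot find; minimising the raw count also favours unbalanced splits, so some form of normalisation by domain size is a common refinement.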

  19. Methods for Identifying Domains
      - Visual inspection
        - Phillips (1956): hen lysozyme
        - Porter (1959; 1973): immunoglobulin light chains
        - Drenth et al. (1968): protease papain
        - Wetlaufer (1973) & Richardson (1981): several proteins
      - Systematic surveys
        - Rossmann & Liljas (1974): domains from distance maps
        - Crippen (1978): cluster segments/contact density
        - Rose (1979): iterative splitting of contiguous segments
        - Wodak & Janin (1981) & Rashin (1981): buried surface area & globularity index
        - Sander (1981): domain limits from Cα contacts

  20. Lactate dehydrogenase: domains from contact map

  21. Lactate dehydrogenase
      - Hierarchic splitting, Rose (1979)
      - Hierarchic assembly of segments, Crippen (1978)
      - Hierarchic splits based on SA scans, Wodak & Janin (1981)
      Concanavalin A

  22. Methods for Identifying Domains
      - Systematic surveys (continued)
        - Kikuchi et al. (1988): domains from distance maps
        - Holm & Sander (1994): cluster residues based on contacts
        - Islam et al. (1994): cluster segments with minimum inter-domain contacts
        - Siddiqui & Barton (1995): successive splits of contiguous segments
        - Sowdhamini & Blundell (1995): cluster secondary structure elements
        - Swindells (1995a,b): search for hydrophobic cores
        - Wernisch et al. (1999): graph heuristic/Voronoi cells
        - Taylor (1999): heuristic based on Ising model
        - Jones et al. (1998): consensus of 3 methods + manual

  23. STRUctural Domain Limits (STRUDL)
      Wernisch, Hunting & Wodak (1999)
      New generation of procedures
      - Uses interface area as contact measure
      - Is based on a graph heuristic
        - partitions into any arbitrary set of residues, with no reference to chain connectivity or secondary structure
        - approximates closely the exact solution (B&B)
      - Generated partitions are accepted or rejected on the basis of a set of optimised criteria

  24. Domain assignments with minor differences
      [Figure: 1gph1, STRUDL vs. CATH; 1pgd, STRUDL vs. CATH]

  25. Other databases or servers of derived structural data
      - HSSP: Homology-derived secondary structure of proteins db [http://www.sander.ebi.ac.uk/hssp/]
      - Mol_R_Us: generate images for structures in PDB [http://molbio.info.nih.gov/cgi-bin/pdb]
      - TOPS: Protein topology atlas [http://tops.ebi.ac.uk/tops/html/1tph1.html]
      - BMM & DEE: servers computing domain limits from coordinates [http://jura.ebi.ac.uk:8080/3Dee/help/help_intro.html]
      - ReLiBase: Receptor/ligand complexes db [http://relibase.ebi.ac.uk/reli-cgi/rll?/reli-cgi/general_layout.pl+home]
      - SWISS-MODEL: Automatically generated protein models db
      - ModBase: Db of comparative protein structure models
      (for links see Expasy server)
