Macromolecular structure

Bioinformatics Macromolecular structure

Contents • Determination of protein structure • Structure databases • Secondary structure elements (SSE) • Tertiary structure • Structure analysis • Structure alignment • Domain recognition • Structure prediction • Homology modelling • Threading/folder recognition • Secondary structure prediction • ab initio prediction

Structure Determination of protein structure Jacques van Heldenjvanheld@ucmb.ulb.ac.be

Crystallisation Hanging drop method / vapour diffusion method Microscope 1-Dilute protein solution Microscope slide many different conditions of 1&2 must be tried 2-Concentrated salt solution Crystal Slide courtesy from Shoshana Wodak

Determination of protein structure Diffraction pattern Atomic model Slide courtesy from Shoshana Wodak

The resolution problem q q q A high resolution protein structure : 1.5 - 2.0 Å resolution Slide courtesy from Shoshana Wodak

Nuclear Magnetic Resonance (NMR) Source: Branden & Tooze (1991)

Interatomic forces • Covalent interactions • Hydrogen bonds • Hydrophobic/hydrophilic interactions • Ionic interactions • van der Waals force • Repulsive forces

Structure Structure databases Jacques van Heldenjvanheld@ucmb.ulb.ac.be

Structure databases • PDB (Protein database) • Official structure repository • SCOP (Stuctural Classification Of Proteins) • Structure classification. Top level reflect structural classes.The second level, called Fold, includes topological and similarity criteria. • CATH (Class, Architecture, Topology and Homologous superfamily)

PDB entry header

Class Architecture Topology CATH - A protein domain classification • In CATH, protein domains are classified according to a tree with 4 levels of hierarchically • Class • Architecture • Topology • Homology Figure from Shoshana Wodak

Classifications of protein structures (domains) CATH: structural classification of proteins, [http://www.biochem.ucl.ac.uk/bsm/cath/] SCOP: Structural classification of proteins [http://scop.mrc-lmb.cam.ac.uk/scop/] FSSP:Fold classification based on structure alignments [http://www.sander.ebi.ac.uk/fssp/] HSSP: Homology derived secondary structure assignments [http://www.sander.ebi.ac.uk/hssp/] DALI:Classification of protein domains [http://www.ebi.ac.uk/dali/domain/] VAST: structural neighbours by direct 3D structure comparison [http://www.ncbi.nlm.nih.gov:80/Structure/VAST/vast.shtml] CE: Structure comparisons by Combinatorial Extension [http://cl.sdsc.edu/ce.html] Slide courtesy from Shoshana Wodak

Books • Branden, C. & Tooze, J. (1991). Introduction to protein structure. 1 edit, Garland Publishing Inc., New York and London. • Westhead, D.R., J.H. Parish, and R.M. Twyman. 2002. Bioinformatics. BIOS Scientific Publishers, Oxford. • Mount, M. (2001). Bioinformatics: Sequence and Genome Analysis. 1 edit. 1 vols, Cold Spring Harbor Laboratory Press, New York. • Gibas, C. & Jambeck, P. (2001). Developing Bioinformatics Computer Skills, O'Reilly.

Structure Secondary structure elements Jacques van Heldenjvanheld@ucmb.ulb.ac.be

Secondary structure - -helix Carbon Nitrogen Oxygen 3.6 residues hydrogen bond Source: Branden & Tooze (1991)

Hydrophobicity of side-chain residues in helices Blue: polar Red: basic or acidic Source: Branden & Tooze (1999)

Secondary structure -  sheets Antiparallel Parallel Source: Branden & Tooze (1991)

Secondary structure - twist of  sheets Mixed  sheet Source: Branden & Tooze (1991)

Angles of rotation • Each dipeptide unit is characterized by two angles of rotation • Phi around the N-Calpha bond • Psi around the Calpha-C bond Image from Branden & Tooze (1999)

The Ramachandran map Dipeptide unit Dipeptide unit Slide courtesy from Shoshana Wodak

Structure Tertiary structure Jacques van Heldenjvanheld@ucmb.ulb.ac.be

Retinol binding protein (PDB:1rpb) -sheet loop -helix Combinations of secondary structures

Bioinformatics Analysis of structure Jacques van Heldenjvanheld@ucmb.ulb.ac.be

Structure-structure alignment and comparison Structure B Structure A Question: Is structure A similar to structure B ? Approach: structure alignments Slide courtesy from Shoshana Wodak

Analyzing conformational changes Open form Closed form Citrate synthase, ligand induced conformational changes Domain motion and small structural distortions Slide courtesy from Shoshana Wodak

Defining Domains: What for? • Link between domain structure and function Different structural domains can be associated with different functions Enzyme active sites are often at domain interfaces; domain movements play a functional role DNA Methyltransferase Cathepsin D Slide courtesy from Shoshana Wodak

Methods for Identifying Domains • Underlying principle • Domain limits are defined by identifying groups of residues such that the number of contacts between groups is minimized. N N C C 4-cuts 1-cut N C 2-cuts Slide courtesy from Shoshana Wodak

Lactate dehydrogenase Domains From Contact Map Slide courtesy from Shoshana Wodak

Structure Structure prediction Jacques van Heldenjvanheld@ucmb.ulb.ac.be

Methods for structure prediction • Homology modelling • Building a 3D model on the basis of similar sequences • Threading • Threading the sequence on all known protein structures, and testing the consistency • Secondary structure prediction • ab initio prediction of tertiary structure • For proteins of normal size, it is almost impossible to predict structures ab initio. • Some results have been obtained in the prediction of oligopeptide structures.

Homology modelling - steps • Similarity search • Modelling of backbone • Secondary structure elements • Loops • Modelling of side chains • Refinement of the model • Verification • Steric compatibility of the residues

Homology modelling - similarity search • Starting from a query sequence, search for similar sequences with known structure. • Search for similar sequences in a database of protein structures. • Multiple alignment. • A weight can be assigned to each matching protein (higher score to more similar proteins) • The higher is the sequence similarity, the more accurate will be the predicted structure. • When one disposes of structure for proteins with >70% similarity with the query, a good model can be expected. • When the similarity is <40%, homology modeling gives poor results. • The lack of available structures constitutes one of the main limitations to homology modeling • In 2004, PDB contains

Homology modelling - Backbone modelling • Modelling of secondary structure elements • a-helices • b-sheets • For each secondary structure element of the template, align the backbone of query and template. • Loop modelling • Databases of loop regions • Loop main chain depends on number of aa and neighbour elements (a-a, a-b, b-a, b-b)

Homology modelling - Side chain modelling • Side-chain conformation (model building and energy refinement) • Conserved side chains take same coordinates as in the template. • For non-conserved side chains, use rotamer libraries to determine the most favourable conformation.

Homology modelling - refinement • After the steps above have been completed, the model can be refined by modifying the positions of some atoms in order to reduce the energy.

Macromolecular structure

Macromolecular structure

Presentation Transcript

Macromolecular Chemistry

Macromolecular CryoCrystallography Synchrotrons

Macromolecular Electron Microscopy

Macromolecular Chemistry

HTCondor and macromolecular structure validation

Discovering Macromolecular Interactions

Heteronuclear Relaxation and Macromolecular Structure and Dynamics Outline:

Macromolecular Chemistry

Macromolecular Structure Database Structural Database Infrastructure Services for Europe

Eugene Krissinel Macromolecular Structure Database keb@ebi.ac.uk

Workshop on Biological Macromolecular Structure Models

Macromolecular Chemistry

Hierarchy of Biological Complexity Macromolecular machines Protein and nucleic acid structure

the European Macromolecular Structure Database (EMSD).

Workshop on Biological Macromolecular Structure Models

Macromolecular structure refinement

Discovering Macromolecular Interactions

Macromolecular Chemistry

Macromolecular Machines - Introduction

Macromolecular Structures

High-speed macromolecular structure determination on a Superbend Beamline 8.3.1