- 2011- 3D Structures of Biological Macromolecules Exercise: Structural Comparison of Proteins. Jürgen Sühnel email@example.com. Leibniz Institute for Age Research, Fritz Lipmann Institute, Jena Centre for Bioinformatics Jena / Germany.
3D Structures of Biological Macromolecules
Exercise: Structural Comparison of Proteins
Leibniz Institute for Age Research, Fritz Lipmann Institute,
Jena Centre for Bioinformatics
Jena / Germany
Supplementary Material: http://www.fli-leibniz.de/www_bioc/3D/
Structure superposition: generate a set of superposed three-dimensional coordinates for each input structure in such a way that the root-mean-square deviation (RMSD) for all atom pairs, or for selected subsets of atom pairs, is minimal.
The most basic approach for the alignment of 3D structures requires a precalculated sequence alignment as input. An especially simple situation occurs if multiple conformations of the same protein are compared. In this case no alignment is necessary, since the sequences are the same. This method traditionally uses a simple least-squares fitting algorithm, in which the optimal rotations and translations are found by minimizing the sum of the squared distances among all structures.
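For the pairwise case, the least-squares criterion described above can be written out as follows. This is a sketch under assumed notation that is not in the slides: a_i and b_i are the position vectors of the i-th equivalent atom pair, R is a rotation matrix, and t is a translation vector.

$$E(R, t) = \sum_{i=1}^{N} \lVert R\,a_i + t - b_i \rVert^2$$

For any fixed R the optimal translation maps the centroid of one structure onto the centroid of the other, $$t = \bar{b} - R\,\bar{a}$$, so in practice both coordinate sets are first shifted to their centroids and only the rotation remains to be determined. The RMSD after superposition is then $$\sqrt{E_{\min}/N}$$.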
Algorithms based on multidimensional rotations and modified quaternions have been developed to identify topological relationships between protein structures without the need for a predetermined alignment.
Quantitative Structural Comparison of Protein Structures
Root Mean Square Deviation
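$$\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[(x_{ai}-x_{bi})^2+(y_{ai}-y_{bi})^2+(z_{ai}-z_{bi})^2\right]}$$
where N is the number of equivalent atom pairs and (x_ai, y_ai, z_ai), (x_bi, y_bi, z_bi) are the coordinates of the i-th atom in structures a and b.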
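A minimal Python sketch of the RMSD calculation, assuming the two coordinate lists are already paired atom by atom and already superposed. The function name `rmsd` and the tuple-based coordinate format are choices made for this illustration only.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    Assumes atom i of coords_a is equivalent to atom i of coords_b and
    that the structures have already been superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between the two atoms of pair i
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Calling `rmsd` on unsuperposed coordinates is still well defined, but the value then also reflects the arbitrary placement of the two molecules, not only their structural difference.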
Beginning with an input PDB file or set of files, SuperPose first extracts the sequences of all chains in the file(s). Each sequence pair is then aligned using a Needleman–Wunsch pairwise alignment algorithm. If the pairwise sequence identity falls below the default threshold (25%), SuperPose determines the secondary structure using VADAR (volume, area, dihedral angle reporter) and performs a secondary structure alignment using a modified Needleman–Wunsch algorithm.
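The pairwise alignment step can be illustrated with a small Needleman–Wunsch sketch. The scoring scheme (match +1, mismatch -1, linear gap -1) and the definition of identity as matches divided by alignment length are assumptions made here for brevity; the slides do not state which scoring matrix, gap penalties, or identity definition SuperPose uses.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Global alignment of two strings; returns the two gapped strings."""
    n, m = len(seq_a), len(seq_b)

    def pair_score(i, j):
        return match if seq_a[i - 1] == seq_b[j - 1] else mismatch

    # Fill the dynamic-programming table.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + pair_score(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical, non-gap columns as a percentage of the alignment length."""
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


aligned_a, aligned_b = needleman_wunsch("ACGT", "AGT")
identity = percent_identity(aligned_a, aligned_b)
# Below 25 % identity, SuperPose switches to a secondary structure alignment.
use_secondary_structure = identity < 25.0
```

For real protein sequences a substitution matrix and affine gap penalties would normally replace the simple scores used here.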
After the sequence or secondary structure alignment is complete, SuperPose then generates a difference distance (DD) matrix between aligned alpha carbon atoms. A difference distance matrix can be generated by first calculating the distances between all pairs of Cα atoms in one molecule to generate an initial distance matrix. A second pairwise distance matrix is generated for the second molecule and, for equivalent/aligned Cα atoms, the two matrices are subtracted from one another, yielding the DD matrix. From the DD matrix it is possible to quantitatively assess the structural similarity/dissimilarity between two structures. In fact, the difference distance method is particularly good at detecting domain or hinge motions in proteins. SuperPose analyzes the DD matrices and identifies the largest contiguous domain between the two molecules that exhibits <2.0 Å difference.
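The difference distance construction described above can be sketched in a few lines of Python. The coordinate lists are assumed to hold only the aligned Cα atoms, in the same order for both molecules. The helper `largest_rigid_segment` is a hypothetical, simplified reading of "largest contiguous domain with <2.0 Å difference" (every pair of residues inside the segment must stay below the cutoff); the slides do not say how SuperPose performs this search.

```python
import math


def distance_matrix(coords):
    """All pairwise distances within one molecule."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def difference_distance_matrix(coords_a, coords_b):
    """Element-wise difference of the two intramolecular distance matrices."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must contain the same aligned atoms")
    dist_a = distance_matrix(coords_a)
    dist_b = distance_matrix(coords_b)
    return [[da - db for da, db in zip(row_a, row_b)]
            for row_a, row_b in zip(dist_a, dist_b)]


def largest_rigid_segment(dd, cutoff=2.0):
    """Longest residue range [start, end) whose internal |DD| values are all < cutoff."""
    best = (0, 0)
    left = 0
    for right in range(len(dd)):
        # Shrink the window past the last residue that moved too much relative to `right`.
        for p in range(right - 1, left - 1, -1):
            if abs(dd[p][right]) >= cutoff:
                left = p + 1
                break
        if right + 1 - left > best[1] - best[0]:
            best = (left, right + 1)
    return best
```

Because the DD matrix is built only from intramolecular distances, it does not depend on how the two molecules are placed in space, so it can be evaluated before any superposition is carried out.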
From the information derived from the sequence alignment and DD comparison, the program then makes a decision regarding which regions should be superimposed and which atoms should be counted in calculating the RMSD. This information is then fed into the quaternion superposition algorithm and the RMSD calculation subroutine. The quaternion superposition program is written in C and is based on both Kearsley's method and the PDBSUP Fortran program developed by Rupp and Parkin. Quaternions were developed by W. Hamilton (the mathematician/physicist) in 1843 as a convenient way to parameterize rotations in a simple algebraic fashion. Because algebraic expressions are more rapidly calculable than trigonometric expressions using computers, the quaternion approach is exceedingly fast.
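To make the idea of a quaternion-based superposition concrete, here is a small pure-Python sketch. It is not SuperPose's C code and not Kearsley's exact formulation: it assumes the closely related unit-quaternion eigenvector formulation usually attributed to Horn, and it finds the eigenvector by shifted power iteration, a simplification chosen to avoid external libraries. All function and variable names are illustrative.

```python
import math


def centroid(coords):
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))


def quaternion_superpose(mobile, target, iterations=2000):
    """Fit `mobile` onto `target` (paired (x, y, z) lists); return (fitted, rmsd)."""
    ca, cb = centroid(mobile), centroid(target)
    a = [tuple(p[k] - ca[k] for k in range(3)) for p in mobile]
    b = [tuple(p[k] - cb[k] for k in range(3)) for p in target]

    # Correlation sums S[j][k] = sum_i a_i[j] * b_i[k]
    s = [[sum(p[j] * q[k] for p, q in zip(a, b)) for k in range(3)] for j in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s

    # Symmetric 4x4 matrix whose top eigenvector is the optimal unit quaternion.
    n_mat = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]

    # Shift by a bound on |eigenvalue| so power iteration converges to the
    # largest eigenvalue. The start vector is arbitrary; an unlucky start that
    # is orthogonal to the solution would need a different choice.
    shift = max(sum(abs(v) for v in row) for row in n_mat)
    q = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        nq = [sum(n_mat[r][c] * q[c] for c in range(4)) + shift * q[r] for r in range(4)]
        norm = math.sqrt(sum(v * v for v in nq))
        q = [v / norm for v in nq]
    w, x, y, z = q

    # Rotation matrix from the unit quaternion: products and sums only.
    rot = [
        [w * w + x * x - y * y - z * z, 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), w * w - x * x + y * y - z * z, 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), w * w - x * x - y * y + z * z],
    ]
    fitted = [tuple(sum(rot[j][k] * p[k] for k in range(3)) + cb[j] for j in range(3))
              for p in a]
    sq = sum(math.dist(f, t) ** 2 for f, t in zip(fitted, target))
    return fitted, math.sqrt(sq / len(target))
```

Note that the rotation matrix is assembled from the quaternion components without any trigonometric function calls, which is the property the text refers to. A production implementation would use a direct eigenvalue solver instead of power iteration.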
SuperPose can calculate both pairwise and multiple structure superpositions (using standard hierarchical methods) and can generate a variety of RMSD values for alpha carbons, backbone atoms, heavy atoms and all atoms (average and pairwise). When identical sequences are compared, SuperPose also generates 'per residue' RMSD tables and plots to allow users to identify, assess and view individual residue displacements.
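A per-residue RMSD table of the kind mentioned above could be computed along the following lines. The input format (a list of `(label, atom_coordinates)` pairs per structure, with identical residue order and atom order) is an assumption for this sketch and is not taken from SuperPose.

```python
import math


def per_residue_rmsd(residues_a, residues_b):
    """Return [(label, rmsd), ...] for two superposed structures.

    Each argument is a list of (label, [(x, y, z), ...]) entries; residue i
    of one structure must correspond to residue i of the other.
    """
    if len(residues_a) != len(residues_b):
        raise ValueError("structures must have the same number of residues")
    table = []
    for (label, atoms_a), (_, atoms_b) in zip(residues_a, residues_b):
        if len(atoms_a) != len(atoms_b) or not atoms_a:
            raise ValueError(f"atom mismatch in residue {label}")
        sq = sum(math.dist(p, q) ** 2 for p, q in zip(atoms_a, atoms_b))
        table.append((label, math.sqrt(sq / len(atoms_a))))
    return table
```

Sorting the resulting table by its second column is a quick way to spot the residues that move most between two conformations.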
Identical/same sequence but different structure
Calmodulin: 1A29 vs. 1CLL
(open and closed form)
Similar structure but slightly different sequence length
Thioredoxin: 3TRX vs. 2TRX_a
Similar structure but extremely different sequence
Thioredoxin/Glutaredoxin: 3TRX vs. 3GRX_1
The unified atomic mass unit (symbol: u) or Dalton (symbol: Da) is a unit that is used for indicating mass on an atomic or molecular scale.
It is defined as one twelfth of the rest mass of an unbound atom of carbon-12 in its nuclear and electronic ground state, and has a value of 1.660538782(83) × 10⁻²⁷ kg.
One Da is approximately equal to the mass of one proton or one neutron.
Glycine residue (C2H3ON): M = 2*12 + 3*1 + 16 + 14 = 57 Da
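The same arithmetic can be written as a short Python sketch. It uses rounded integer atomic masses, as in the calculation above; the dictionary and function names are illustrative. The composition C2H3ON is that of a glycine residue within a peptide chain; free glycine carries one additional water (H2O).

```python
# Rounded (nominal) atomic masses in Da, as used in the hand calculation.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16, "N": 14}


def nominal_mass(composition):
    """Sum of nominal atomic masses for an {element: count} composition."""
    return sum(NOMINAL_MASS[element] * count for element, count in composition.items())


glycine_residue = {"C": 2, "H": 3, "O": 1, "N": 1}
water = {"H": 2, "O": 1}

residue_mass = nominal_mass(glycine_residue)          # 57
free_glycine_mass = residue_mass + nominal_mass(water)  # 75
```

For accurate work the integer values would be replaced by average or monoisotopic atomic masses.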