Towards Low Resolution Refinement
Garib N Murshudov, York Structural Laboratory, Chemistry Department, University of York
Contents: some of the projects; TWIN refinement in REFMAC and its extension; problems of low resolution refinement
People involved: Alexei Vagin, Fei Long
Tries to solve the molecular replacement problem using a redesigned PDB that includes domain and multimeric organisation.
People involved: Andrey Lebedev and Paul Young (and ccp4)
A GUI to design links and covalent link descriptions.
Link
Conformation-invariant alignment and restraint generation
People involved:
Rob Nicholls
Green= 0
Red= 2
2cex : Sialic Acid Binding Protein
1mpd : Maltose Binding Protein
average fragment score = 1.2 Å
Andrey Lebedev
merohedral and pseudomerohedral twinning
Domain 1
Twinning operator

Domain 2
Crystal symmetry:     P3           P2                 P2
Constraint:           -            β = 90°            -
Lattice symmetry *:   P622         P222               P2
Possible twinning:    merohedral   pseudomerohedral   -
* rotations only
Crystal lattice is invariant with respect to twinning operator.
The crystal is NOT invariant with respect to twinning operator.
Twin refinement in refmac (5.5 or later) is automatic.
Intensities can be used
If phases are available they can be used
Maximum likelihood refinement is used
The dimension of integration is in general twice the number of twin-related domains. Since the phases do not contribute to the first part of the integrand, the second part becomes a Rice distribution.
The integration is carried out using the Laplace approximation.
These equations are general enough to account for non-merohedral twinning (including allotwinning) and unmerged data. A small modification should allow handling of simultaneous twinning and SAD/MAD phasing, and of radiation damage.
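The basic model behind twin refinement can be sketched in a few lines: the observed intensity is taken as the twin-fraction-weighted sum of the single-domain intensities of the twin-related reflections. This is a minimal illustration, not REFMAC's implementation. The operator matrix, the function names and the toy intensities are assumptions made for the example.

```python
# Sketch of the twinned-intensity model: the observed intensity is a
# twin-fraction-weighted sum of single-domain intensities.

IDENTITY = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
# Hypothetical twin operator (h, k, l) -> (-h, -k, h + l), for illustration only.
TWIN_OP = ((-1, 0, 0), (0, -1, 0), (1, 0, 1))


def apply_operator(op, hkl):
    """Apply a 3x3 integer matrix to a Miller index (h, k, l)."""
    return tuple(sum(op[r][c] * hkl[c] for c in range(3)) for r in range(3))


def twinned_intensity(hkl, intensity_of, operators, fractions):
    """Return sum_k fraction_k * I(T_k h) over all twin domains."""
    if len(operators) != len(fractions):
        raise ValueError("one twin fraction is needed per operator")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("twin fractions must sum to 1")
    return sum(
        frac * intensity_of(apply_operator(op, hkl))
        for op, frac in zip(operators, fractions)
    )


# Toy single-domain intensities (made-up numbers).
toy = {(1, 2, 3): 100.0, (-1, -2, 4): 20.0}
i_twin = twinned_intensity(
    (1, 2, 3), toy.__getitem__, [IDENTITY, TWIN_OP], [0.54, 0.46]
)
```

With twin fractions close to 0.5 the two twin-related reflections contribute almost equally, which is why intensity statistics of a nearly perfect twin can mimic a higher symmetry.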
Map coefficients
It seems to be working reasonably well. For an unbiased map it is necessary to integrate over errors in all parameters (observations as well as refined parameters).
“non twin” map
“twin” map
Space group: P21
Cell parameters: 54.63 142.77 84.37 90.00 108.76 90.00
Resolution: 1.8
Twin operators: H, K, H+L (or H, K, HL)
Twin fractions: 0.46, 0.54
“Rmerge”: 0.065
Rmerge for calc: 0.36
R/Rfree no twin: 0.30/0.34
R/Rfree twin: 0.21/0.26
R/Rfree are final statistics after refinement and rebuilding
Note: Reindexing may be needed. In these cases REFMAC warns that you may need to consider reindexing.
Twin off (difficult rebuilding)
Twin on (final model)
Rfactors
The random R factor in the presence of perfect twinning is around 40% if twinning is modelled and around 50% if it is not. Be careful after molecular replacement.
Small twin fractions: be careful with small twin fractions. Refmac removes twin domains with a fraction of less than 5%.
High symmetry:
If twin fractions refine towards perfect twinning then the true space group may have higher symmetry. The program zanuda, from the YSBL website, may sort out some of the space group uncertainty problems:
www.ysbl.york.ac.uk/YSBLPrograms/index.jsp
For the acentric case only, for a random structure:
Crystallographic R factors
  No twinning: 58%
  Perfect twinning, twin modelled: 40%
  Perfect twinning, twin not modelled: 50%
Rmerge without experimental error
  No twinning: 50%
  Perfect twinning, along an axis other than the twin axis: 37.5%
Non twin
Twin
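The random-structure values quoted above can be checked with a short Monte Carlo run. This is a sketch under stated assumptions: acentric Wilson statistics (intensities drawn from an exponential distribution with mean 1), "observed" and "calculated" data that are completely unrelated, R factors computed on amplitudes with no extra scale factor, and Rmerge computed on intensities for pairs of unrelated reflections. The function name and dictionary keys are illustrative.

```python
import math
import random


def random_r_factors(n=100_000, seed=0):
    """Monte Carlo estimates of random R factors and Rmerge (acentric case)."""
    rng = random.Random(seed)
    r_none = r_on = r_off = sum_fo = sum_fo_tw = 0.0
    m_none = m_twin = sum_i = sum_i_tw = 0.0
    for _ in range(n):
        # Four unrelated acentric intensities, each exponential with mean 1.
        io1, io2, ic1, ic2 = (rng.expovariate(1.0) for _ in range(4))
        fo, fc = math.sqrt(io1), math.sqrt(ic1)
        # Perfect twin: equal-weight average of two twin-related intensities.
        io_tw, ic_tw = 0.5 * (io1 + io2), 0.5 * (ic1 + ic2)
        fo_tw, fc_tw = math.sqrt(io_tw), math.sqrt(ic_tw)
        r_none += abs(fo - fc)        # no twinning
        r_on += abs(fo_tw - fc_tw)    # twinned data, twinning modelled
        r_off += abs(fo_tw - fc)      # twinned data, twinning ignored
        sum_fo += fo
        sum_fo_tw += fo_tw
        # Rmerge for a pair of unrelated intensities: |I1 - I2| / (I1 + I2).
        m_none += abs(io1 - ic1)
        sum_i += io1 + ic1
        m_twin += abs(io_tw - ic_tw)
        sum_i_tw += io_tw + ic_tw
    return {
        "R_no_twin": r_none / sum_fo,
        "R_twin_modelled": r_on / sum_fo_tw,
        "R_twin_ignored": r_off / sum_fo_tw,
        "Rmerge_no_twin": m_none / sum_i,
        "Rmerge_twin": m_twin / sum_i_tw,
    }
```

Calling `random_r_factors()` should give values close to the percentages listed above. The first value has the closed form 2 - √2 ≈ 0.586, and the two Rmerge values have the closed forms 1/2 and 3/8.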
Using twinning in refinement programs is straightforward. It improves statistics substantially (sometimes R factors can go down by 10%). However, the improvement in electron density is not very dramatic (just as when you use TLS). It may improve electron density in weak parts, but in general do not expect miracles. In particular, when the twinning and NCS operators are close, the improvements are marginal.
Use of available knowledge
1) Local NCS
2) Restraints to known structure(s)
3) Restraints to current interatomic distances (implicit normal modes or “jelly” body)
4) Better restraints on B values
These are available from version 5.6.
Note:
Buster/TNT has local NCS and restraints to known structures.
CNS has restraints to known structures (they call it deformable elastic network).
Phenix has B value restraints on non-bonded atom pairs and automatic global NCS.
Local NCS (only for torsion angle related atom pairs) has been available in SHELXL since the beginning of time.
[Figure: aligned regions of Chain A and Chain B, each with Shell 1, Shell 2 and a water or ligand; k (= 5).]
Example of alignment: 2vtu.
There are two chains similar to each other. There appears to be gene duplication
RMS – all aligned atoms
Ave(RmsLoc) – local RMS
********* Alignment results *********

: N: Chain 1 : Chain 2 : No of aligned :Score : RMS :Ave(RmsLoc):

: 1 : J( 131  256 ) : J( 3  128 ) : 126 : 1.0000 : 5.2409 : 1.6608 :
: 2 : J( 1  257 ) : L( 1  257 ) : 257 : 1.0000 : 4.8200 : 1.6694 :
: 3 : J( 131  256 ) : L( 3  128 ) : 126 : 1.0000 : 5.2092 : 1.6820 :
: 4 : J( 3  128 ) : L( 131  256 ) : 126 : 1.0000 : 3.0316 : 1.5414 :
: 5 : L( 131  256 ) : L( 3  128 ) : 126 : 1.0000 : 0.4515 : 0.0464 :

In many cases two or more copies of the same molecule can be expected to have (slightly) different conformations. For example, if there is a domain movement, the internal structures of the domains will be the same, but the distances between domains will differ between the two copies of the molecule.
[Figure: two copies of a molecule, each consisting of Domain 1 and Domain 2.]
One class of estimators that are robust to outliers is the M-estimators (maximum-likelihood-like estimators). One popular function is Geman-McClure.
Essentially when distances are similar then they should be kept similar and when they are too different they should be allowed to be different.
This function is used for NCS local restraints as well as for restraints to external structures
Red line: x^2
Black line: x^2/(1 + w x^2)
where x = (d1 - d2)/σ, w = 0.1
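The behaviour of the two curves can be reproduced with a small sketch. The function names are illustrative. The formula and the default w = 0.1 are those given above.

```python
def least_squares(x):
    """Ordinary quadratic penalty (the red line)."""
    return x * x


def geman_mcclure(x, w=0.1):
    """Geman-McClure penalty x^2 / (1 + w x^2) (the black line)."""
    return x * x / (1.0 + w * x * x)


def distance_penalty(d1, d2, sigma, w=0.1):
    """Penalty for two distances that are restrained to be similar."""
    x = (d1 - d2) / sigma
    return geman_mcclure(x, w)


# Small differences are penalised almost quadratically,
# large differences saturate at 1/w instead of growing without bound.
small = distance_penalty(3.80, 3.82, sigma=0.1)   # x = 0.2
large = distance_penalty(3.80, 9.80, sigma=0.1)   # x = 60
```

For small x the function is close to x^2, so similar distances are kept similar. For large x it levels off at 1/w, so pairs that genuinely differ are allowed to stay different.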
ProSmart
structure to be refined known similar structure (prior)
Remove bond and angle related pairs
The term is added to the target function:
The summation is over all pairs in the same chain and within a given distance (default 4.2 Å). d_current is recalculated at every cycle. This function does not contribute to gradients; it only contributes to the second derivative matrix.
It is equivalent to adding springs between atom pairs. During refinement interatomic distances are not changed very much. If all pairs were used and the weights were very large, it would be equivalent to rigid body refinement.
It could be called “implicit normal modes”, “soft” body or “jelly” body refinement.
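A minimal sketch of why such a term affects only the second derivatives, assuming it has the quadratic form weight * (d - d_current)^2 summed over atom pairs. That form is an assumption consistent with the description above, not a quotation of the REFMAC target. The pair selection here ignores chain membership and does not exclude bond or angle related pairs.

```python
import math


def pairs_within(coords, max_dist=4.2):
    """Return (i, j, distance) for all pairs i < j closer than max_dist (in Å)."""
    found = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d <= max_dist:
                found.append((i, j, d))
    return found


def jelly_term(d, d_current, weight):
    """Assumed quadratic spring term for one atom pair."""
    return weight * (d - d_current) ** 2


def jelly_first_derivative(d, d_current, weight):
    """Derivative of the term with respect to d."""
    return 2.0 * weight * (d - d_current)


def jelly_second_derivative(weight):
    """Second derivative with respect to d; independent of d."""
    return 2.0 * weight


# d_current is reset to the current distance at every cycle, so the term is
# always evaluated at d == d_current: zero value, zero gradient, but a
# non-zero curvature that damps changes in the distance.
coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (10.0, 0.0, 0.0)]
pairs = pairs_within(coords)
gradients = [jelly_first_derivative(d, d, weight=1.0) for _, _, d in pairs]
curvatures = [jelly_second_derivative(weight=1.0) for _ in pairs]
```

Because only the curvature is non-zero, the term acts as a damping on shifts that would change interatomic distances rather than as a pull towards any target value.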
[Figure: two TLS groups, TLS1 and TLS2, connected by a loop.]
If there are two probability densities, p(x) and q(x), then the symmetrised Kullback-Leibler divergence between them is defined as follows (it is a distance between distributions):
If both distributions are Gaussian with the same mean values and variances U1 and U2, then this distance becomes:
And for the isotropic case it becomes:
Restraints for bonded pairs have larger weights than those for non-bonded pairs. For non-bonded atoms the weights depend on the distance between the atoms.
This type of restraint is also applied for rigid bond restraints in anisotropic refinement
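A minimal sketch of this distance, assuming the standard definition of the symmetrised Kullback-Leibler divergence, ∫ (p(x) - q(x)) ln(p(x)/q(x)) dx. Under that definition, two three-dimensional Gaussians with equal means give 1/2 tr(U1 U2^-1 + U2 U1^-1 - 2I), and the isotropic case gives 3/2 (B1 - B2)^2 / (B1 B2). The overall constant depends on the normalisation convention and may differ from the one used in REFMAC. The function names are illustrative.

```python
import math


def symmetrised_kl_isotropic(b1, b2):
    """Isotropic 3D case: 3/2 * (B1 - B2)^2 / (B1 * B2)."""
    return 1.5 * (b1 - b2) ** 2 / (b1 * b2)


def symmetrised_kl_diagonal(u1, u2):
    """1/2 * tr(U1 U2^-1 + U2 U1^-1 - 2I) for tensors diagonal in the same axes."""
    return 0.5 * sum(a / b + b / a - 2.0 for a, b in zip(u1, u2))


def symmetrised_kl_numeric_1d(var1, var2, half_width=30.0, steps=6000):
    """Riemann sum of (p - q) * ln(p / q) for two zero-mean 1D Gaussians."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * h
        p = math.exp(-x * x / (2.0 * var1)) / math.sqrt(2.0 * math.pi * var1)
        q = math.exp(-x * x / (2.0 * var2)) / math.sqrt(2.0 * math.pi * var2)
        # ln(p/q) written analytically to avoid taking logs of tiny numbers.
        log_ratio = (0.5 * math.log(var2 / var1)
                     - x * x / (2.0 * var1) + x * x / (2.0 * var2))
        total += (p - q) * log_ratio * h
    return total


# One axis of the closed form, checked against direct integration.
closed_1d = 0.5 * (1.0 / 2.0 + 2.0 / 1.0 - 2.0)
numeric_1d = symmetrised_kl_numeric_1d(1.0, 2.0)
```

The distance depends only on the ratio of the two B values, so the same absolute difference is penalised more strongly between small B values than between large ones.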
Rfactors vs cycle
Black – simple refinement
Red – Global NCS
Blue – Local NCS
Green – “Jelly” body
Solid lines – Rfactor
Dashed lines – Rfree
Rfactors vs cycle
Black – Simple refinement
Red – External restraints
Blue – “Jelly” body
Solid lines – Rfactor
Dashed lines – Rfree
Rfactors vs cycle
Black – Simple refinement
Red – External restraints
Blue – “Jelly” + local NCS
Solid lines – Rfactor
Dashed lines – Rfree
We want to observe ρ0(x) but we observe ρ(x). These two entities are related:
If K is known, calculate ρ0. In principle the problem is easy: discretise and solve the linear equation. However, these problems are ill-posed (a small perturbation in the input causes a large deviation in the output). In practice, sharpening amplifies the noise as well as the signal.
Regularisation may help:
L is related to the regularisation function. For the L2 norm (the value of ρ should be small) L is the identity, and for the first-order Sobolev norm (ρ should be smooth) it is the Laplace operator.
MAP SHARPENING: INVERSE PROBLEM
In general K is the effect of such terms as TLS or a smoothly varying blurring function.
Sources of noise in electron density: series termination, errors in phases and noise in experimental data.
Very simple case: K is overall B value. Then the problem is solved using FFT: Fourier transform of Gaussian is Gaussian, Fourier transform of Laplace operator is square length of the reciprocal space vector.
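For the overall B value case the regularised solution can be written as a resolution-dependent factor applied to each Fourier coefficient. The sketch below assumes a Tikhonov-type solution ρ0 = (KᵀK + αL)⁻¹ Kᵀ ρ, with K(s) = exp(-B s²/4), s = 1/d, and L(s) = s² for the Sobolev case or 1 for the L2 case, ignoring constant factors. These conventions and the function names are assumptions made for illustration, not REFMAC's code.

```python
import math


def blur_factor(s, b):
    """Fourier transform of the Gaussian blurring: exp(-B s^2 / 4), s = 1/d."""
    return math.exp(-b * s * s / 4.0)


def sharpening_factor(s, b, alpha, norm="sobolev"):
    """Regularised inverse of the blurring, K / (K^2 + alpha * L)."""
    k = blur_factor(s, b)
    penalty = s * s if norm == "sobolev" else 1.0
    return k / (k * k + alpha * penalty)


# At 4 Å resolution with B = 100 Å^2 (illustrative numbers):
s = 1.0 / 4.0
plain = sharpening_factor(s, b=100.0, alpha=0.0)      # pure sharpening, 1/K
damped = sharpening_factor(s, b=100.0, alpha=0.01)    # regularised

# At very high resolution the unregularised factor explodes,
# while the regularised one goes to zero.
far_plain = sharpening_factor(1.0, b=100.0, alpha=0.0)
far_damped = sharpening_factor(1.0, b=100.0, alpha=0.01)
```

With α = 0 the factor is simply exp(+B s²/4), which amplifies noise at high resolution without limit. Any α > 0 caps the amplification and eventually suppresses terms where the blurred signal is negligible.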
One way of selecting the regularisation parameter: minimise the predicted error (PE).
where A_α = K K_α
If we restore unobserved data with their expected values Fe, then the last term would be replaced by
Restoring seems to give a smaller predicted error (the problem of bias towards errors in phases remains).
The “best” regularisation parameter is the one that minimises PE.
REGULARISATION PARAMETER
[Figure: map panels showing no sharpening; sharpening with median B, α optimised; sharpening with median B, α = 0. Top left and bottom: after local NCS refinement.]
MAP SHARPENING: 2R6C, 4 Å RESOLUTION
SAD refinement available from version 5.5
SIRAS refinement available from version 5.6
New and complete dictionary available from version 5.6
Improved mask solvent available from version 5.6
JLigand for ligand dictionary and link description
Electron density calculation: Bayesian filters
How to combine two conflicting ideas: sharpen the electron density to have “better” defined atoms, and integrate over errors to smooth the electron density and thus reduce noise
Local TLS restraints: Needs to be tested
Restraints on secondary structures and other internal patterns
Reticular twin: Almost there
Radiation damage
Error estimation
York              Leiden
Alexei Vagin      Pavol Skubak
Andrey Lebedev    Raj Pannu
Rob Nicholls
Fei Long
CCP4, YSBL people
REFMAC is available from CCP4 or from York’s ftp site:
www.ysbl.york.ac.uk/refmac/latest_refmac.html
This and other presentations can be found on:
www.ysbl.york.ac.uk/refmac/Presentations/