
With thanks to Zhijun Wu

An introduction to the algorithmic problems of Distance Geometry.


Presentation Transcript


  1. An introduction to the algorithmic problems of Distance Geometry. With thanks to Zhijun Wu.

  2. Distance Geometry: mapping from semi-metric to metric spaces, Euclidean and non-Euclidean
     • Multidimensional Scaling: data classification, geometric mapping of data
     • Fundamental problem: find the coordinates for a set of points, given the distances for all pairs of points (Cayley-Menger determinant; necessary and sufficient conditions of embedding; singular-value decomposition method; strain/stress minimization)
     • Molecular Conformation: embedding in 3D Euclidean space; protein structure prediction and determination; sparse, inexact distances, bounds on the distances, probability distributions

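  3. Molecular Distance Geometry Problem: given $n$ atoms $a_1, \dots, a_n$ and a set of distances $d_{i,j}$ between $a_i$ and $a_j$, $(i,j) \in S$, find coordinates $x_1, \dots, x_n$ in $\mathbb{R}^3$ such that $\|x_i - x_j\| = d_{i,j}$ for all $(i,j) \in S$.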

  4. Problems and Complexity
     • problems with all distances: solvable in $O(n^3)$ using SVD
     • problems with sparse sets of distances: NP-hard (Saxe 1979)
     • problems with distance ranges (NMR results): NP-hard (Moré and Wu 1997), if the ranges are small
     • problems with probability distributions of distances: stochastic multidimensional scaling, structure prediction

  5. Current Approaches
     • Embed Algorithm by Crippen and Havel
     • CNS Partial Metrization by Brünger et al.
     • Graph Reduction by Hendrickson
     • Alternating Projection by Glunt and Hayden
     • Global Optimization by Moré and Wu
     • Multidimensional Scaling by Trosset et al.

  6. Solution of Distance Geometry Problems. Fundamental Problem: SVD Decomposition
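
The formulas on slide 6 were not captured in the transcript. As a rough illustration of the SVD approach to the fundamental problem, the sketch below follows the classical multidimensional scaling recipe: double-center the squared distances to get a Gram matrix, then factor it. It uses `numpy.linalg.eigh` instead of an SVD call, which gives the same factorization for a symmetric positive semidefinite matrix. The function name `coordinates_from_distances` and the tolerance `tol` are assumptions for this example, not taken from the slides.

```python
import numpy as np

def coordinates_from_distances(D, k=3, tol=1e-8):
    """Recover k-dimensional coordinates from a full n x n distance matrix D.

    The result is unique only up to rotation, reflection and translation.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Double-center the squared distances: G is the Gram matrix of the
    # points after moving their centroid to the origin.
    J = np.eye(n) - np.ones((n, n)) / n
    G = -0.5 * J @ (D ** 2) @ J
    # Eigendecomposition of the symmetric matrix G, largest eigenvalues first.
    w, V = np.linalg.eigh(G)
    order = np.argsort(w)[::-1]
    w, V = w[order], V[:, order]
    # An exact embedding in R^k needs the top k eigenvalues to be
    # non-negative and all remaining eigenvalues to vanish.
    scale = max(abs(w[0]), 1.0)
    if np.any(w[:k] < -tol * scale) or np.any(np.abs(w[k:]) > tol * scale):
        raise ValueError("distances are not embeddable in R^k within tol")
    return V[:, :k] * np.sqrt(np.clip(w[:k], 0.0, None))
```

The dense eigendecomposition is the step that costs on the order of $n^3$ operations, which matches the complexity quoted on slide 4.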

  7. Embed Algorithm
     (time consuming, in $O(n^3)$ to $O(n^4)$)
     • bound smoothing: keep distances consistent
     • distance metrization: estimate the missing distances
     • repeat (say 1000 times):
       • randomly generate D in between L and U
       • find X using SVD with D
       • if X is found, stop
     • select the best approximation X
     • refine X with simulated annealing
     • final optimization
     (costly, in $O(n^2)$ to $O(n^3)$)
     Crippen and Havel 1988 (DGII, DGEOM); Brünger et al. 1992, 1998 (XPLOR, CNS)
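
The first steps of the Embed algorithm on slide 7 can be sketched as follows. This is a minimal illustration, assuming bound smoothing means tightening the bounds with the triangle inequality in a Floyd-Warshall-style sweep, and assuming trial distances are drawn uniformly between the bounds. The slide does not specify either detail, and the names `smooth_bounds` and `sample_distances` are hypothetical.

```python
import numpy as np

def smooth_bounds(L, U):
    """Tighten symmetric lower/upper bound matrices with the triangle inequality."""
    L = np.array(L, dtype=float)
    U = np.array(U, dtype=float)
    n = U.shape[0]
    # Upper bounds: d(i,j) <= d(i,k) + d(k,j).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if U[i, j] > U[i, k] + U[k, j]:
                    U[i, j] = U[i, k] + U[k, j]
    # Lower bounds: d(i,j) >= d(i,k) - d(k,j) and d(i,j) >= d(k,j) - d(i,k).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                L[i, j] = max(L[i, j], L[i, k] - U[k, j], L[k, j] - U[i, k])
    if np.any(L > U + 1e-12):
        raise ValueError("bounds are inconsistent")
    return L, U

def sample_distances(L, U, rng):
    """Draw a symmetric trial distance matrix D with L <= D <= U (assumed uniform)."""
    n = L.shape[0]
    D = np.zeros((n, n))
    iu = np.triu_indices(n, k=1)
    D[iu] = rng.uniform(L[iu], U[iu])
    return D + D.T
```

The triple loops make each smoothing pass cost on the order of $n^3$ operations. A sampled matrix is not guaranteed to be embeddable in three dimensions, which is why the slide repeats the sampling step many times.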

  8. Geometric Build-Up
     • Independent Points: A set of $k+1$ points in $\mathbb{R}^k$ is called independent if it is not a set of points in $\mathbb{R}^{k-1}$, that is, if the points do not all lie in a common hyperplane.
     • Metric Basis: A set of points B in a space S is a metric basis of S provided each point of S is uniquely determined by its distances from the points in B.
     • Fundamental Theorem: Any $k+1$ independent points in $\mathbb{R}^k$ form a metric basis for $\mathbb{R}^k$.
     Blumenthal 1953: Theory and Applications of Distance Geometry
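
Independence can be checked from distances alone with the Cayley-Menger determinant named on slide 2: it is proportional to the squared volume of the simplex spanned by the points, so it is nonzero exactly when $k+1$ points are independent in $\mathbb{R}^k$. The sketch below is an illustration only. The function names and the absolute threshold `tol` are assumptions, not part of the presentation.

```python
import numpy as np

def cayley_menger_det(D):
    """Cayley-Menger determinant for points with pairwise distance matrix D."""
    D = np.asarray(D, dtype=float)
    m = D.shape[0]
    # Border the squared-distance matrix with a row and a column of ones,
    # with a zero in the corner.
    B = np.ones((m + 1, m + 1))
    B[0, 0] = 0.0
    B[1:, 1:] = D ** 2
    return np.linalg.det(B)

def is_independent(D, tol=1e-9):
    """True if the points behind D span a simplex of nonzero volume.

    tol is an illustrative absolute threshold chosen for this example.
    """
    return abs(cayley_menger_det(D)) > tol

def pairwise_distances(P):
    """Helper for the demo: Euclidean distance matrix of the rows of P."""
    P = np.asarray(P, dtype=float)
    return np.linalg.norm(P[:, None] - P[None, :], axis=-1)

# Four points in R^3: a proper tetrahedron and a flat (coplanar) square.
tetra = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
flat = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
print(is_independent(pairwise_distances(tetra)))
print(is_independent(pairwise_distances(flat)))
```

Only the tetrahedron qualifies as a metric basis for $\mathbb{R}^3$ in the sense of the Fundamental Theorem.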

  9. Geometric Build-Up in two dimensions

  10. Geometric Build-Up in three dimensions

  11. Geometric Build-Up in three dimensions

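  12. Geometric Build-Up
      Four base points: $x_1 = (u_1, v_1, w_1)$, $x_2 = (u_2, v_2, w_2)$, $x_3 = (u_3, v_3, w_3)$, $x_4 = (u_4, v_4, w_4)$.
      Undetermined point $x_i = (u_i, v_i, w_i)$: $\|x_i - x_1\| = d_{i,1}$, $\|x_i - x_2\| = d_{i,2}$, $\|x_i - x_3\| = d_{i,3}$, $\|x_i - x_4\| = d_{i,4}$.
      Undetermined point $x_j = (u_j, v_j, w_j)$: $\|x_j - x_1\| = d_{j,1}$, $\|x_j - x_2\| = d_{j,2}$, $\|x_j - x_3\| = d_{j,3}$, $\|x_j - x_4\| = d_{j,4}$.
      [Figure: base points 1 to 4, with the undetermined points i and j marked "?".]
      The build-up algorithm solves a distance geometry problem in $O(n)$ when distances between all pairs are given; SVD takes $O(n^3)$ time.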
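
The four distance equations for an undetermined point on slide 12 become linear once they are squared and the first is subtracted from the other three, because the quadratic term in the unknown cancels. The sketch below solves the resulting 3 x 3 system. It assumes the coordinates of four independent base atoms are already known, and the names `locate_point` and `build_up` are hypothetical.

```python
import numpy as np

def locate_point(base, dist):
    """Find x in R^3 with ||x - base[m]|| = dist[m] for m = 0..3.

    base: (4, 3) coordinates of four independent points.
    dist: the four distances from the unknown point to those points.
    """
    base = np.asarray(base, dtype=float)
    dist = np.asarray(dist, dtype=float)
    # Squaring each equation and subtracting the first one gives
    #   2 (x_m - x_0) . x = |x_m|^2 - |x_0|^2 - d_m^2 + d_0^2,  m = 1, 2, 3.
    A = 2.0 * (base[1:] - base[0])
    b = (np.sum(base[1:] ** 2, axis=1) - np.sum(base[0] ** 2)
         - dist[1:] ** 2 + dist[0] ** 2)
    return np.linalg.solve(A, b)

def build_up(base, D):
    """Place every atom from its distances to the first four atoms.

    base: (4, 3) coordinates of atoms 0..3 (assumed known here).
    D:    full n x n distance matrix.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    X = np.zeros((n, 3))
    X[:4] = base
    for i in range(4, n):
        # One fixed-size linear solve per atom, so the loop is O(n) overall.
        X[i] = locate_point(X[:4], D[i, :4])
    return X
```

The matrix `A` is invertible exactly when the four base points are independent, which is where the Fundamental Theorem of slide 8 enters.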

  13. X-ray structure (left) of HIV-1 RT p66 protein (4200 atoms) and the structure (right) determined by the geometric build-up algorithm using distances for all pairs of atoms. The RMSD of the two structures is $\sim 10^{-4}$ Å. Build-up took 188,859 floating-point operations; SVD took 1,268,200,000 floating-point operations.
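
Slide 13 does not say how the RMSD between the two structures was computed. A common choice, shown here purely as an assumption, is to superimpose the structures with the Kabsch algorithm (optimal proper rotation after centering) and then take the root-mean-square deviation of matched atoms. The function name `rmsd_after_superposition` is hypothetical.

```python
import numpy as np

def rmsd_after_superposition(A, B):
    """RMSD between matched point sets A and B after optimal rigid alignment.

    A, B: (n, 3) arrays; row i of A corresponds to row i of B.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Remove translation by centering both sets on their centroids.
    A0 = A - A.mean(axis=0)
    B0 = B - B.mean(axis=0)
    # Kabsch: SVD of the 3 x 3 cross-covariance matrix.
    H = A0.T @ B0
    U, _, Vt = np.linalg.svd(H)
    # Force a proper rotation (determinant +1), excluding reflections.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = A0 @ R.T - B0
    return np.sqrt(np.mean(np.sum(diff ** 2, axis=1)))
```

Distances alone cannot distinguish a structure from its mirror image, so a structure rebuilt from distances may need to be reflected before a comparison restricted to proper rotations gives a small value.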

  14. Problems with Sparse Sets of Distances

  15. Control of Rounding Errors

  16. Control of Rounding Errors

  17. Tolerate Distance Errors

  18. Tolerate Distance Errors: atom $i$ is placed using the pairs $(i,j) \in S$ for which $x_j$ is already determined.

  19. The objective function, taken over the pairs $(i,j) \in S$ for which $x_j$ is determined, is convex, and the problem can be solved using a standard Newton method. Each function evaluation requires on the order of $n$ floating-point operations, where $n$ is the number of atoms. In the ideal case when every atom can be determined, $n$ atoms require $O(n^2)$ floating-point operations.

  20. NMR Structure Determination: the distances are given with their possible ranges.

  21. [Formula slide; only the index condition $(i, j) \in S$ survives in the transcript.]

  22. Computational Results: The structure of 4MBA (red lines) determined by using a geometric build-up algorithm with a subset of all pairs of inter-atomic distances. The X-ray crystallography structure is shown in blue lines.

  23. Computational Results: The total distance errors (red) for the partial structures of a polypeptide chain obtained by using a geometric build-up are all smaller than 1 Å, while those (blue) obtained by using CNS (Brünger et al.) grow quickly with increasing numbers of atoms in the chain.

  24. Extension to Statistical Distance Data: the distributions of the distances in a structure database; structure prediction.
