Structure Alignment in Polynomial Time

Rachel Kolodny (Stanford University) and Nati Linial (The Hebrew University of Jerusalem). Problem statement: two structures in R^3, A = {a_1, a_2, …, a_n} and B = {b_1, b_2, …, b_m}.


  1. Structure Alignment in Polynomial Time • Rachel Kolodny, Stanford University • Nati Linial, The Hebrew University of Jerusalem

  2. Problem Statement • Two structures in R^3: A = {a_1, a_2, …, a_n}, B = {b_1, b_2, …, b_m} • Find subsequences s_a and s_b such that the substructures {a_{s_a(1)}, a_{s_a(2)}, …, a_{s_a(l)}} and {b_{s_b(1)}, b_{s_b(2)}, …, b_{s_b(l)}} are similar

  3. Motivation • Structure is better conserved than amino acid sequence • Structure similarity can give hints to common functionality/origin • Allows automatic classification of protein structure

  4. Correspondence → Position • Given a correspondence, the rotation and translation that minimize the cRMS distance can be calculated (Kabsch, W., 1978).
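A minimal sketch of the correspondence-to-position step, assuming NumPy is available and that the correspondence is given as two equal-length lists of matched points. The function name `kabsch_crms` is illustrative and not from the slides; the SVD-based construction is the commonly used form of Kabsch's method.

```python
import numpy as np

def kabsch_crms(P, Q):
    """Return (R, t, crms) so that R @ p + t best matches q in the least-squares sense.

    P and Q are (l, 3) arrays of matched points (row i of P corresponds to row i of Q).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    Pc, Qc = P - p_mean, Q - q_mean           # center both point sets
    H = Pc.T @ Qc                             # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    diff = (R @ P.T).T + t - Q
    crms = float(np.sqrt((diff ** 2).sum() / len(P)))
    return R, t, crms
```

For two substructures that differ only by a rigid motion, the returned cRMS is zero up to rounding.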

  5. Position → Correspondence • Given a rotation and translation, one can calculate the alignment that optimizes a (separable) score • Using dynamic programming • Essentially similar to sequence alignment • Example score
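The dynamic programming step can be sketched as follows, for two structures that are already superimposed. The per-pair term 20 / (1 + d^2 / 5) and the penalty of 10 per skipped atom are assumptions modeled on a STRUCTAL-style score, since the slide's example score is not reproduced in this transcript; `align_dp` is a hypothetical name.

```python
import math

def pair_score(p, q):
    # Assumed STRUCTAL-like term: large for close atoms, decaying with distance.
    return 20.0 / (1.0 + math.dist(p, q) ** 2 / 5.0)

def align_dp(A, B, gap=10.0):
    """Return (score, pairs) for the best order-preserving correspondence.

    A, B: lists of 3D points. gap: assumed penalty per skipped atom.
    """
    n, m = len(A), len(B)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0], back[i][0] = -gap * i, "up"
    for j in range(1, m + 1):
        D[0][j], back[0][j] = -gap * j, "left"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            options = [
                (D[i - 1][j - 1] + pair_score(A[i - 1], B[j - 1]), "diag"),
                (D[i - 1][j] - gap, "up"),      # skip an atom of A
                (D[i][j - 1] - gap, "left"),    # skip an atom of B
            ]
            D[i][j], back[i][j] = max(options, key=lambda o: o[0])
    # Trace back to recover the matched index pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "diag":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move == "up":
            i -= 1
        else:
            j -= 1
    return D[n][m], pairs[::-1]
```

As in sequence alignment, the table has (n+1) x (m+1) cells and each cell is filled in constant time.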

  6. Score ≠ cRMS • We want to give “bonus points” for longer correspondences • e.g. corresponding ONE atom from each structure has 0 cRMS • Even better scores? • Vary gap penalty depending on position in structure • Incorporate sequence information

  7. Score ≠ cRMS • A specific correspondence

  8. Previous Work *most data taken from Orengo 94

  9. “…It can be proved that, for these reasons, finding an optimal structural alignment between two protein structures is an NP hard problem and thus there are no fast structural alignment algorithms that are guaranteed to be optimal within any given similarity measure…” (Adam Godzik, ‘The structural alignment between two proteins: Is there a unique answer?’, 1996) • “There is no exact solution to the protein structure alignment problem, only the best solution for the heuristics used in the calculation.” (Shindyalov & Bourne, ‘Protein Structure Alignment by Incremental Combinatorial Extension (CE) of the Optimal Path’, 1998)

  10. Exponentially many Focus on Scoring Functions

  11. Exponentially many Focus on Scoring Functions

  12. Exponentially many All Maxima are interesting Noisy data !!

  13. Good scoring functions • Each of the functions is well-behaved • Satisfies Lipschitz condition • Thus, the maximum over a finite set is well-behaved • In each dimension two points at distance ε have function values that vary by O(n) • Need O(n) samples in every dimension
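One hedged way to write out why a Lipschitz condition makes sampling sufficient. The symbols L (Lipschitz constant), δ (grid spacing per dimension) and d (number of dimensions, here 3 for rotation plus 3 for translation) are introduced for this sketch and do not appear in the slides. Every point of a gridded domain lies within δ√d/2 of some grid point, so the best sampled value cannot fall far below the true maximum:

```latex
|f(x) - f(y)| \le L \,\lVert x - y \rVert \quad \text{for all } x, y
\;\Longrightarrow\;
\max_{x} f(x) \;-\; \max_{g \in \text{grid}} f(g) \;\le\; \frac{L\,\delta\,\sqrt{d}}{2},
\qquad d = 6 .
```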

  14. Sampling is Sufficient

  15. Polynomial Algorithm • Sample in rotation and translation space • Compute best score (and alignment) for each sample point • Return maximum score • Need O(n^6 · n^2) time and O(n^2) space
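The steps above can be sketched as a brute-force loop over a grid of rigid motions. This is a toy illustration under stated assumptions, not the authors' implementation: rotations are sampled as ZYZ Euler angles on a uniform grid, translations come from a small user-supplied list of offsets, the per-pair score 20 / (1 + d^2 / 5) is an assumed STRUCTAL-like term, and gaps are not penalized. All function names are hypothetical.

```python
import itertools
import math

def pair_score(p, q):
    # Assumed STRUCTAL-like per-pair term.
    return 20.0 / (1.0 + math.dist(p, q) ** 2 / 5.0)

def dp_score(A, B):
    # Best separable score over order-preserving correspondences (no gap penalty).
    n, m = len(A), len(B)
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = max(D[i - 1][j - 1] + pair_score(A[i - 1], B[j - 1]),
                          D[i - 1][j], D[i][j - 1])
    return D[n][m]

def rotation_zyz(alpha, beta, gamma):
    # R = Rz(alpha) @ Ry(beta) @ Rz(gamma), written out explicitly.
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [
        [ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
        [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
        [-sb * cg, sb * sg, cb],
    ]

def transform(R, t, p):
    # Apply the rotation R, then the translation t, to the point p.
    return tuple(sum(R[r][c] * p[c] for c in range(3)) + t[r] for r in range(3))

def sample_and_score(A, B, angle_steps, shifts):
    """Return (best_score, (angles, translation)) over the sampled rigid motions of B."""
    angles = [2.0 * math.pi * k / angle_steps for k in range(angle_steps)]
    best_score, best_motion = -math.inf, None
    for abg in itertools.product(angles, repeat=3):
        R = rotation_zyz(*abg)
        for t in itertools.product(shifts, repeat=3):
            moved = [transform(R, t, p) for p in B]
            score = dp_score(A, moved)      # one O(n*m) dynamic program per sample
            if score > best_score:
                best_score, best_motion = score, (abg, t)
    return best_score, best_motion
```

The cost is the number of sampled motions times one dynamic program, which mirrors the slide's count of O(n) samples in each of six dimensions times an O(n^2) alignment.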

  16. Internal Distance Matrices • Invariant to position and rotation of structures ⇒ can be compared directly • Find largest common sub-matrices (LCM) whose distances are roughly the same
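A small sketch of the invariance claim: the matrix of pairwise distances within a structure does not change when the structure is rotated and translated. The helper names and the choice of a rotation about the z axis are illustrative assumptions.

```python
import math

def distance_matrix(points):
    # Entry [i][j] is the Euclidean distance between atoms i and j.
    return [[math.dist(p, q) for q in points] for p in points]

def rotate_z_then_shift(points, angle, shift):
    # Rigid motion used only for the demonstration: rotate about z, then translate.
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in points]

def max_abs_difference(M1, M2):
    # Largest entrywise difference between two equally sized matrices.
    return max(abs(a - b) for r1, r2 in zip(M1, M2) for a, b in zip(r1, r2))
```

Comparing `distance_matrix(points)` with the matrix of `rotate_z_then_shift(points, 0.7, (5.0, -2.0, 1.5))` should give a difference at the level of floating-point rounding.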

  17. LCM is NP-complete • Harder than MAX-CLIQUE • Matrices encode distances that are positive, symmetric and obey triangle inequality

  18. Example • 1dme: 28 amino acids • 1jjd: 51 amino acids • Best STRUCTAL score: 149 • Best score found by exhaustive search: 197

  19. Heuristic • Consider only translations that position an atom from protein A on an atom of protein B • O(m*n) instead of O((n+m)^3)
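A minimal sketch of the candidate set this heuristic describes, assuming it is B that gets translated: one translation per (atom of A, atom of B) pair, giving n*m candidates. The function name is hypothetical.

```python
def atom_on_atom_translations(A, B):
    """Return (i, j, t) triples where shifting B by t places atom b_j exactly on atom a_i."""
    return [(i, j, tuple(a[k] - b[k] for k in range(3)))
            for i, a in enumerate(A)
            for j, b in enumerate(B)]
```

Each returned translation can then be scored in place of a full grid over translation space.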
