
Sequence order independent structural alignment

Joe Dundas, Andrew Binkowski, Bhaskar DasGupta, Jie Liang. Department of Bioengineering/Bioinformatics, University of Illinois at Chicago.


Presentation Transcript


  1. Sequence order independent structural alignment Joe Dundas, Andrew Binkowski, Bhaskar DasGupta, Jie Liang Department of Bioengineering/Bioinformatics, University of Illinois at Chicago

  2. Background
     • Extended Central Dogma of molecular biology: DNA → RNA → primary structure → 3D structure → function
     • Evolution conserves 3D structure more than amino acid sequence.
     • Structural similarity often reflects a common function or origin of proteins.[1]
     • It is useful to classify proteins based on their structures (SCOP, CATH, FSSP).
     • Many methods for structure alignment have been reported (CE, DALI, FAST, Matchprot).

  3. Circular Permutation
     • Ligation of the N and C termini, and subsequent cleavage elsewhere.
     • In 1979, the first natural circular permutation was observed in favin vs. concanavalin A.[2]
     • In 1983, the first engineered circular permutation was performed on bovine pancreatic trypsin inhibitor.[3]
     • Since then, studies have shown that artificially permuted proteins are able to fold into stable structures that are similar to the native protein.[4]
     • Circular permutations have been discovered in lectins, β-glucanases, swaposin…[5]
     Figure credits: Uliel S., Fliess A., Amir A., Unger R. (1999)[6]; Uliel S., Fliess A., Unger R. (2001)[7]

  4. Alignment Problem
     • Most structural alignment methods require the structural units of the two proteins to align in sequential order (e.g., CE, FAST).
     • Some newer methods can perform non-sequential alignments (e.g., DALI, Matchprot).
     • After explaining our method, we will compare the results against DALI and Matchprot.

  5. Our Method
     • We exhaustively fragment protein A and protein B into fragments with lengths ranging from 4 to 7 residues.
     • Notation: fragment λa = (a1, a2), where a1 and a2 are the beginning and ending positions relative to the N terminus of protein A. Πa = {λa,1, λa,2, …, λa,n} is the set of all fragments from protein A. La,i is the length of fragment λa,i.
     • Each fragment λa,i from protein A is aligned to every fragment λb,j of protein B with La,i = Lb,j, forming a set of Aligned Fragment Pairs (Λ ⊆ Πa × Πb).
     • A similarity function σ assigns a score σ(Λi) to each aligned fragment pair Λi.
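The fragmentation and pairing step can be sketched in a few lines of Python. This is only an illustration of the enumeration described on the slide, not the authors' code: the function names `fragments` and `aligned_fragment_pairs` are hypothetical, positions are assumed to be 1-based and inclusive, and proteins are represented only by their residue counts. Scoring with σ is left out.

```python
MIN_LEN, MAX_LEN = 4, 7  # fragment lengths stated on the slide


def fragments(n_residues, min_len=MIN_LEN, max_len=MAX_LEN):
    """Return every fragment (start, end) of a chain with n_residues residues.

    Positions are 1-based and inclusive, counted from the N terminus
    (an assumption of this sketch).
    """
    return [
        (start, start + length - 1)
        for length in range(min_len, max_len + 1)
        for start in range(1, n_residues - length + 2)
    ]


def aligned_fragment_pairs(n_a, n_b):
    """Pair each fragment of protein A with every equal-length fragment of B."""
    frags_b_by_len = {}
    for b1, b2 in fragments(n_b):
        frags_b_by_len.setdefault(b2 - b1 + 1, []).append((b1, b2))
    return [
        ((a1, a2), frag_b)
        for a1, a2 in fragments(n_a)
        for frag_b in frags_b_by_len.get(a2 - a1 + 1, [])
    ]
```

The number of pairs grows roughly with the product of the two chain lengths, which is why the next slides filter the pairs by a similarity threshold before any optimization.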

  6. Similarity Function
     • All Λi with σ(Λi) > threshold are used to create a conflict graph.

  7. Conflict Graph
     • Two fragment pairs Λi and Λj are in conflict if any residue in λi,A is also in λj,A or any residue in λi,B is also in λj,B.
     • Conflicts can be found by a vertex sweep.
     Figure (simplified example): vertices δ1 to δ4 drawn over the reference protein residues and the query protein residues.
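The conflict rule can be written down directly. The sketch below assumes each aligned fragment pair is stored as `((a1, a2), (b1, b2))` with inclusive residue ranges, and it builds the graph by brute-force comparison of all pairs rather than the sweep mentioned on the slide; the function names are hypothetical.

```python
def overlaps(f, g):
    """True if inclusive residue ranges f = (f1, f2) and g = (g1, g2) share a residue."""
    return f[0] <= g[1] and g[0] <= f[1]


def in_conflict(pair_i, pair_j):
    """Two aligned fragment pairs conflict if they share a residue in A or in B."""
    (a_i, b_i), (a_j, b_j) = pair_i, pair_j
    return overlaps(a_i, a_j) or overlaps(b_i, b_j)


def conflict_graph(pairs):
    """Adjacency sets of the conflict graph, built by comparing all pairs (O(n^2))."""
    adj = {i: set() for i in range(len(pairs))}
    for i in range(len(pairs)):
        for j in range(i + 1, len(pairs)):
            if in_conflict(pairs[i], pairs[j]):
                adj[i].add(j)
                adj[j].add(i)
    return adj
```

For large sets of fragment pairs, sorting the fragment endpoints and sweeping over them would avoid the quadratic comparison.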

  8. LP Formulation
     • x is a 0/1 integer variable relaxed to a real value between 0 and 1 (0 = don't use fragment, 1 = use fragment).
     • Subject to:
       - No conflicting residues in the query or reference protein.
       - Consistency between variables.
       - All variables are between 0 and 1.
     • Solve using a linear programming package.
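The objective function is not part of this transcript. As a rough illustration only, a relaxed linear program consistent with the constraints listed above could take the following form. This is an assumption, not necessarily the authors' formulation: it maximizes the total similarity of the selected fragment pairs, allows each residue of either protein to be used at most once, and omits the consistency constraints because the transcript does not spell them out.

```latex
\begin{aligned}
\text{maximize}\quad & \sum_{i} \sigma(\Lambda_i)\, x_{\Lambda_i} \\
\text{subject to}\quad
& \sum_{i \,:\, r \in \lambda_{i,A}} x_{\Lambda_i} \le 1 && \text{for every residue } r \text{ of protein } A \\
& \sum_{i \,:\, r \in \lambda_{i,B}} x_{\Lambda_i} \le 1 && \text{for every residue } r \text{ of protein } B \\
& 0 \le x_{\Lambda_i} \le 1 && \text{for all } i
\end{aligned}
```

With the restriction to integer values of x, this would be a maximum-weight independent set problem on the conflict graph; the relaxation makes it solvable by a standard LP package at the cost of fractional solutions, which the following slides handle.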

  9. Local Conflict Number
     • The LP will assign a number between 0 and 1 to each xδ.
     • For each Λ, compute a local conflict number Θ.
     • Define δmin as the vertex with the smallest local conflict number.
     • Assign a new σ.
     • Remove all vertices with σ ≤ 0 from Λ and push them onto a stack Ω in descending order of σ.
     Figure (example with four vertices δ1 to δ4; one vertex is labeled δmin):
       Λ1: σ = 50, x = 0.85, Θ = 1.10
       Λ2: σ = 20, x = 0.25, Θ = 1.46
       Λ3: σ = 20, x = 0.60, Θ = 0.85
       Λ4: σ = 15, x = 0.01, Θ = 0.26
     After assigning the new σ: σ(Λ1) = 50, σ(Λ2) = 15, σ(Λ3) = 20, σ(Λ4) = 0

  10. Repeat
     • Repeat the LP formulation until all vertices have been pushed onto the stack Ω.
     • Begin with 5 empty alignments.
     • While the stack is not empty, retrieve an aligned pair by popping the stack.
     • Insert it into each non-empty alignment if and only if:
       - No residue conflicts occur.
       - The global RMSD does not change by more than some threshold.
     • If it cannot be inserted into any alignment, insert it into an available empty alignment.
     • Determine which alignment has the highest similarity score.
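The stack-unwinding step can be sketched as follows. Several details here are assumptions made for illustration: the function names, the representation of a pair as `((a1, a2), (b1, b2))`, the `rmsd` argument (a hypothetical caller-supplied function that returns the global RMSD of a list of pairs), and the choice to drop a pair when no empty alignment is left.

```python
def overlaps(f, g):
    """True if inclusive residue ranges f and g share at least one residue."""
    return f[0] <= g[1] and g[0] <= f[1]


def conflicts_with(pair, alignment):
    """True if pair shares a residue (in A or in B) with any pair already in alignment."""
    a, b = pair
    return any(overlaps(a, a2) or overlaps(b, b2) for a2, b2 in alignment)


def assemble(stack, rmsd, max_rmsd_change, n_alignments=5):
    """Pop aligned pairs off the stack and distribute them over candidate alignments.

    rmsd is a hypothetical callback: rmsd(list_of_pairs) -> float.
    The stack is consumed (mutated) by this function.
    """
    alignments = [[] for _ in range(n_alignments)]
    while stack:
        pair = stack.pop()
        placed = False
        for aln in alignments:
            if not aln or conflicts_with(pair, aln):
                continue
            if abs(rmsd(aln + [pair]) - rmsd(aln)) > max_rmsd_change:
                continue
            aln.append(pair)
            placed = True
        if not placed:
            # Seed the first still-empty alignment; if none is left,
            # the pair is dropped (an assumption of this sketch).
            for aln in alignments:
                if not aln:
                    aln.append(pair)
                    break
    return alignments
```

The final selection would then be something like `max(alignments, key=score)`, where `score` is whatever similarity score is used to rank complete alignments.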

  11. Results – Circular Permutation?
     • 1jqsC: 70S ribosome functional complex. Fold: Ribosome & ribosomal fragments
     • 2pii: PII (product of glnB). Fold: Ferredoxin-like
     • RMSD: 2.3194

  12. Results – Circular Permutation
     • 1iudA: Aspartate racemase. Fold: ATC-like
     • 1h0rA: Type II 3-dehydroquinate dehydratase. Fold: Flavodoxin

  13. Results
     • 1vet: Mitogen activated protein kinase
     • 1fe0: ATX1 metallochaperone. Fold: Ferredoxin-like

  14. Results
     • 1e50: Core binding factor. Fold: Core binding factor beta
     • 1pkv: Riboflavin synthase
