Topic 3: MSA

Iterative Algorithms in Multiple Sequence Alignment
Prepared By: 1. Chan Wei Luen 2. Lim Chee Chong 3. Poon Wei Koot 4. Xu Jin Mei 5. Yuan Ling 6. Zeng Sheng

Introduction
• The previous group introduced MSA, so we will not cover it here.
• The major problem with the progressive method is that alignment errors made during the initial phase are propagated to the final result.
• The iterative method seeks to overcome this limitation by repeatedly realigning subgroups of the sequences and then aligning these subgroups into a final alignment, to achieve the best possible alignment.

Introduction (continued)
• To correct the mistakes introduced by progressive alignment, an iterative algorithm was introduced in 1987.
• Barton suggested an algorithm that refines the alignment by realigning each sequence with the completed alignment minus that sequence.
• For instance, sequence A1 is aligned with the alignment of sequences A2, A3, …, Ai, from which any columns consisting only of gaps have first been removed.
• This process is repeated until all sequences have been realigned.
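The leave-one-out step described above can be sketched as follows. This is only a minimal illustration, not Barton's implementation: the function names `drop_common_gaps` and `leave_one_out` are hypothetical, the alignment is assumed to be a list of equal-length strings with `-` as the gap character, and the realignment of the removed sequence against the remaining alignment is left out.

```python
GAP = "-"

def drop_common_gaps(rows):
    """Remove every column that contains only gaps in all rows."""
    if not rows:
        return []
    keep = [c for c in range(len(rows[0])) if any(r[c] != GAP for r in rows)]
    return ["".join(r[c] for c in keep) for r in rows]

def leave_one_out(alignment, index):
    """Take one sequence out of the alignment.

    Returns the removed sequence without gaps and the remaining
    alignment with its common (all-gap) columns deleted.
    """
    removed = alignment[index].replace(GAP, "")
    rest = alignment[:index] + alignment[index + 1:]
    return removed, drop_common_gaps(rest)

# Toy alignment (made-up data): the third row is the only one with a
# residue in column 2, so that column becomes all-gap once it is removed.
alignment = ["AC-GT-", "A--GTA", "ACCGT-"]
seq, rest = leave_one_out(alignment, 2)
print(seq)   # ACCGT
print(rest)  # ['ACGT-', 'A-GTA']
```

A full refinement pass would call `leave_one_out` for each sequence in turn and realign the removed sequence against the remaining rows before moving on to the next one.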

[Figure: Architecture of multiple sequence alignment algorithms. Labels recoverable from the diagram: Progressive, Iterative, Stochastic, Global, Local, SB, NJ, ML, UPGMA, HMMs, Genetic Algorithm; programs: Multal, pima, clustalx, multalign, pileup, Praline, OMA, prrp, Iteralign, dialign, dialign2, hmmt, saga.]

OMA
• An iterative alignment algorithm
• Uses an improved algorithm, based on the A* algorithm, for the optimal alignment of multiple biological sequences
• Applies the Divide and Conquer Alignment (DCA) method repeatedly

OMA
Step 1) Use a small value of Z to divide the sequences.
Step 2) Align the subsequences using the A* algorithm and reassemble the alignment results.
Step 3) Use a larger value of Z to divide the results of the previous alignment.
Step 4) Remove the inserts (gaps) from the divided sequences, align them, and reassemble the alignment results.
Step 5) Repeat steps 3 and 4 with increasing values of Z until the alignment is optimal; the process can also be stopped at any time.
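The five steps above can be sketched for the two-sequence case. Several things here are assumptions made for illustration and are not taken from OMA itself: Z is treated as the width of the blocks into which the current alignment is cut, a plain Needleman-Wunsch aligner stands in for the A* aligner, the cut positions are fixed block boundaries rather than carefully chosen ones, and the names `needleman_wunsch`, `realign_in_chunks` and `iterative_dca` are hypothetical.

```python
GAP = "-"

def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global pairwise alignment; a stand-in for the A* aligner."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover the alignment.
    out_a, out_b, i, j = [], [], n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append(GAP); i -= 1
        else:
            out_a.append(GAP); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))

def realign_in_chunks(rows, z):
    """Steps 3 and 4: cut the current alignment into blocks of z columns,
    remove the gaps from each block, realign it, and reassemble."""
    out_a, out_b = [], []
    for start in range(0, len(rows[0]), z):
        seg_a = rows[0][start:start + z].replace(GAP, "")
        seg_b = rows[1][start:start + z].replace(GAP, "")
        al_a, al_b = needleman_wunsch(seg_a, seg_b)
        out_a.append(al_a)
        out_b.append(al_b)
    return "".join(out_a), "".join(out_b)

def iterative_dca(a, b, z_values):
    """Step 5: repeat with increasing Z; the loop may be stopped early."""
    width = max(len(a), len(b))
    # Trivial starting alignment: pad the shorter sequence with gaps.
    rows = (a.ljust(width, GAP), b.ljust(width, GAP))
    for z in z_values:
        rows = realign_in_chunks(rows, z)
    return rows

rows = iterative_dca("GATTACA", "GCATGCA", [2, 4, 100])
print(rows[0])
print(rows[1])
```

Once Z is at least as large as the current alignment width, the whole alignment is realigned as a single block, so the result equals the base aligner's output for the full sequences. Smaller values of Z trade alignment quality for less work per block.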

[Figure: Divide and Conquer Alignment iteration]

DiAlign / DiAlign2
• Background
  • A new method for pairwise and multiple alignment
  • DiAlign and DiAlign2 were proposed by Burkhard Morgenstern in 1998 and 1999 respectively
  • DiAlign2 modified the weight function of DiAlign so that:
    • the running time is reduced,
    • it can be applied to both globally and locally related sequence sets

DiAlign / DiAlign2
• Algorithm
• Step 1: All optimal pairwise alignments are formed, and their diagonals (gap-free segment pairs) are sorted
  • according to their weight scores
  • according to the degree of overlap with other diagonals
• Step 2: The diagonal with the highest weight is the first one selected for the alignment.

DiAlign / DiAlign2
• Step 3: The next diagonal in the list is checked for consistency and added to the alignment if it is consistent; this is repeated until no additional diagonals can be found.
• Step 4: The program introduces gaps into the sequences until all residues connected by the selected diagonals are arranged in the same columns.
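Steps 2 to 4 can be illustrated with a small greedy selection over diagonals. This is a simplified sketch for two sequences only, not the DiAlign program: the `Diagonal` type, the function names, the weights, and the example sequences are all made up for illustration, and the consistency test used here (a new diagonal must lie entirely before or entirely after every selected one in both sequences) is the pairwise simplification of the multiple-sequence consistency check.

```python
from typing import NamedTuple

class Diagonal(NamedTuple):
    i: int         # start position in sequence 1
    j: int         # start position in sequence 2
    length: int    # number of residue pairs, no gaps
    weight: float  # hypothetical weight score

def consistent(d, chosen):
    """True if d neither overlaps nor crosses any already selected diagonal."""
    for c in chosen:
        before = d.i + d.length <= c.i and d.j + d.length <= c.j
        after = c.i + c.length <= d.i and c.j + c.length <= d.j
        if not (before or after):
            return False
    return True

def greedy_select(diagonals):
    """Steps 2 and 3: take diagonals by decreasing weight, keep the consistent ones."""
    chosen = []
    for d in sorted(diagonals, key=lambda d: d.weight, reverse=True):
        if consistent(d, chosen):
            chosen.append(d)
    return sorted(chosen, key=lambda d: d.i)

def build_alignment(s1, s2, chosen):
    """Step 4: insert gaps so residues joined by a diagonal share a column.
    Residues outside the diagonals are placed in columns of their own."""
    row1, row2, p1, p2 = [], [], 0, 0
    for d in chosen:
        u1, u2 = s1[p1:d.i], s2[p2:d.j]
        row1 += [u1, "-" * len(u2), s1[d.i:d.i + d.length]]
        row2 += ["-" * len(u1), u2, s2[d.j:d.j + d.length]]
        p1, p2 = d.i + d.length, d.j + d.length
    u1, u2 = s1[p1:], s2[p2:]
    row1 += [u1, "-" * len(u2)]
    row2 += ["-" * len(u1), u2]
    return "".join(row1), "".join(row2)

s1, s2 = "AAGCTTGG", "GCTTAAGG"
candidates = [
    Diagonal(2, 0, 4, 4.0),  # GCTT / GCTT
    Diagonal(0, 4, 2, 2.0),  # AA / AA, crosses the first diagonal
    Diagonal(6, 6, 2, 1.5),  # GG / GG
]
chosen = greedy_select(candidates)
aligned = build_alignment(s1, s2, chosen)
print(aligned[0])  # AAGCTT--GG
print(aligned[1])  # --GCTTAAGG
```

The AA diagonal is rejected even though it outweighs GG, because it would have to sit before the GCTT diagonal in one sequence and after it in the other.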

DiAlign / DiAlign2
• Advantage
  • Good at aligning sequences where local homology is the driving signal.
• Disadvantage
  • Not as accurate as other algorithms such as Clustal W or Prrp, but it works well on sequences that require very long insertions to be properly aligned.

Iteralign
• The Iteralign algorithm is as follows:
• First, designate the r original sequences by {Si}
• Each of these sequences is matched against all r sequences in ungapped mode
• Construct an “ameliorated” sequence for each of the sequences and call the set {Sk(1)}
• Align each of the original sequences Si to Sk(1)
• Create a new ameliorated sequence {Sk(2)}
• Iterate the process until the ameliorated sequence {Sk(n)} no longer changes
• Call this final sequence Ck(1)

Iteralign
• Collect all Ck(1) sequences into the set {Ci(1)}, also known as the consensus sequences of round 1
• Use {Ci(1)} as the input to step 1 and repeat the whole process iteratively until there is no more change
• We call this final set the core blocks {Ci()}
• Core blocks have the property that the consensus aligns maximally to all individual sequences
• Use a local dynamic programming (DP) method to optimize the displacements (allowing gaps) of the individual sequences
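The inner loop (build an ameliorated sequence, realign, repeat until nothing changes) can be sketched in a much reduced form. This is an assumption-laden illustration and not the Iteralign program: it works on a single short seed segment, scores ungapped matches by counting identical residues instead of using a substitution matrix, builds the ameliorated sequence by column-wise majority, and assumes every sequence is at least as long as the seed. The function names and the example data are hypothetical.

```python
from collections import Counter

def best_window(seed, seq):
    """Ungapped match: the window of seq with the most identities to seed."""
    w = len(seed)
    windows = [seq[p:p + w] for p in range(len(seq) - w + 1)]
    return max(windows, key=lambda win: sum(x == y for x, y in zip(seed, win)))

def ameliorate(seed, sequences):
    """Column-wise majority over the best ungapped match in every sequence."""
    matches = [best_window(seed, s) for s in sequences]
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*matches))

def iterate_to_consensus(seed, sequences, max_rounds=50):
    """Repeat until the ameliorated sequence stops changing.
    max_rounds is a safety cap added for this sketch."""
    for _ in range(max_rounds):
        improved = ameliorate(seed, sequences)
        if improved == seed:
            break
        seed = improved
    return seed

# Made-up sequences: the first two contain GACGT, the third contains GACCT.
sequences = ["TTGACGTAA", "CGACGTCC", "AAGACCTGG"]
consensus = iterate_to_consensus("GACCT", sequences)
print(consensus)  # GACGT
```

Starting from the segment GACCT taken from the third sequence, one round replaces the fourth residue with the majority residue G, and the second round leaves the result unchanged, which ends the iteration.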

Iteralign

Open Issue
• Iterative methods have both strengths and weaknesses.
• Pro:
  • A common characteristic of these methods is that alignment accuracy is markedly improved.
• Cons:
  • They require large amounts of computation time and memory.
  • Many parallel techniques have been proposed to address this problem; however, parallelizing the iterative alignment algorithm remains a difficult task.
• In summary, the iterative alignment strategy is a promising trend.

Conclusion
• Traditionally, the most popular approach to multiple sequence alignment has been the progressive alignment method.
• Over time, however, the iterative alignment strategy is expected to become a more suitable choice for multiple sequence alignment.
