Patterns, Profiles, and Multiple Alignment

# Patterns, Profiles, and Multiple Alignment - PowerPoint PPT Presentation

Patterns, Profiles, and Multiple Alignment. Outline: Profiles and Sequence Logos; Profile Hidden Markov Models; Aligning Profiles; Multiple Sequence Alignments by Gradual Sequence Addition; Other Ways of Obtaining Multiple Alignments; Sequence Pattern Discovery.

## PowerPoint Slideshow about 'Patterns, Profiles, and Multiple Alignment' - bracha

Presentation Transcript

Patterns, Profiles, and Multiple Alignment

OUTLINE
• Profiles and Sequence Logos
• Profile Hidden Markov Models
• Aligning Profiles
• Multiple Sequence Alignments by Gradual Sequence Addition
• Other Ways of Obtaining Multiple Alignments
• Sequence Pattern Discovery
Aligning Profiles

Comparing two PSSMs by alignment

This cannot be done by standard alignment techniques.

Consider the alignment of two columns, one from each PSSM:

Both columns contain scores rather than residues.

Use a measure of the similarity between the scores in the two columns.

The program LAMA (Local Alignment of Multiple Alignments):

Does not allow gaps in the alignment of PSSMs.

Uses the Pearson correlation coefficient as the similarity measure.

The score of each column ranges from -1 to 1.
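The column comparison described above can be sketched as follows. This is a minimal illustration, not LAMA's actual code: the function names `pearson` and `ungapped_profile_score`, the representation of a PSSM as a list of score columns, and the choice to return 0.0 for a constant column are all assumptions made for this example.

```python
from math import sqrt


def pearson(col_a, col_b):
    """Pearson correlation coefficient between two equal-length score columns."""
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    mean_a = sum(col_a) / n
    mean_b = sum(col_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(col_a, col_b))
    var_a = sum((a - mean_a) ** 2 for a in col_a)
    var_b = sum((b - mean_b) ** 2 for b in col_b)
    if var_a == 0 or var_b == 0:
        # Assumption for this sketch: a constant column is treated as uncorrelated.
        return 0.0
    return cov / sqrt(var_a * var_b)


def ungapped_profile_score(pssm_a, pssm_b, offset):
    """Sum of column correlations when pssm_b is shifted by `offset` columns.

    Each PSSM is a list of columns; no gaps are introduced, only a shift.
    """
    total = 0.0
    for i, col_a in enumerate(pssm_a):
        j = i - offset
        if 0 <= j < len(pssm_b):
            total += pearson(col_a, pssm_b[j])
    return total
```

Trying every offset and keeping the highest total gives an ungapped alignment of the two profiles under these assumptions.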

Modified pairwise dynamic programming:

Pairwise dynamic programming algorithms can be modified to find the optimal alignment of more than two sequences.

Align 3 sequences:

[Figure: dynamic programming matrix for three sequences, with axes SEQUENCE 1, SEQUENCE 2, and SEQUENCE 3]
RESULT:

The dynamic programming approach for aligning two sequences is easily extended to k sequences.

For k sequences we need to fill a k-dimensional matrix.

This is impractical for more than a few sequences, because the running time grows exponentially with k.
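To get a feel for this growth, the sketch below counts the work involved. It assumes k sequences that all have the same length n, and the usual formulation in which each cell of the k-dimensional matrix is computed from 2^k - 1 neighbouring cells (every combination of "advance or gap" across the k sequences except "gap in all"). The function name `dp_workload` is made up for this example.

```python
def dp_workload(n, k):
    """Return (cells, predecessors_per_cell) for aligning k sequences of length n."""
    cells = (n + 1) ** k          # one cell per combination of prefix lengths
    predecessors = 2 ** k - 1     # each sequence either advances or takes a gap
    return cells, predecessors


if __name__ == "__main__":
    for k in (2, 3, 5, 10):
        cells, preds = dp_workload(100, k)
        print(f"k={k}: {cells:.3e} cells, {preds} predecessors per cell")
```

Under these assumptions, three sequences of length 100 already require about a million cells, compared with about ten thousand for two.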

Progressive alignment:

Sequences are added to the alignment one at a time.

The order in which they are added can be crucial to generating an accurate alignment.

There are different ways to determine this order.

Progressive alignment (ClustalW):

Uses dynamic programming.

Uses the sum-of-pairs scoring method.

Organizes the multiple sequence alignment using a guide tree in which leaves represent sequences and internal nodes represent alignments.

Steps:
• Find the similarity matrix.
• Cluster analysis (tree construction).
• Align sequences according to the order determined by the tree.
Depending on the internal node in the tree, we may have to align:
• a sequence with a sequence
• a sequence with a profile
• a profile with a profile

In all cases we can use dynamic programming.

For the profile cases, use SP (sum-of-pairs) scoring.

Sum of Pairs Scoring:
• Consider all possible pairs of sequences; the score of a column is the sum of the scores of all its pairs.

Assume c(match) = 1, c(mismatch) = -1, and c(gap) = -2.

Also assume c(-, -) = 0 to prevent the double counting of gaps.

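The sum-of-pairs scheme with these costs can be sketched in a few lines. This is an illustrative sketch, not ClustalW's implementation (which uses substitution matrices and sequence weights); the function names `pair_score`, `column_sp`, and `alignment_sp` are made up for this example, and the alignment is assumed to be a list of equal-length strings with `-` as the gap character.

```python
from itertools import combinations


def pair_score(a, b, match=1, mismatch=-1, gap=-2):
    """Score one pair of symbols using the costs assumed in the text."""
    if a == "-" and b == "-":
        return 0                  # c(-, -) = 0: do not count a gap twice
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def column_sp(column):
    """Sum-of-pairs score of one alignment column (all unordered pairs)."""
    return sum(pair_score(a, b) for a, b in combinations(column, 2))


def alignment_sp(rows):
    """Sum-of-pairs score of a whole alignment given as equal-length rows."""
    return sum(column_sp(col) for col in zip(*rows))
```

For instance, the column `A`, `A`, `C` has pairs (A, A), (A, C), (A, C), which score 1 - 1 - 1 = -1.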

Progressive alignment (Star Alignment):

Select a sequence xc as the center of the star.

For each sequence xi among x1, …, xk with i ≠ c, perform a Needleman-Wunsch global alignment of xi with the center xc.

Aggregate the alignments with the principle “once a gap, always a gap.”

Select the center sequence:

Choose as xc (the center sequence) the sequence xi that maximizes the total of its pairwise global alignment scores against all other sequences.
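Center selection can be sketched as below, under the assumption that the quantity being maximized is the sum of a sequence's pairwise global alignment scores against all the others. The scoring values (match 1, mismatch -1, linear gap -2) and the function names `nw_score` and `choose_center` are assumptions for this example only.

```python
def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (score only, linear gap cost)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def choose_center(seqs):
    """Return (index_of_center, totals), where totals[i] sums the scores of seqs[i]
    against every other sequence."""
    totals = [
        sum(nw_score(s, t) for j, t in enumerate(seqs) if j != i)
        for i, s in enumerate(seqs)
    ]
    center = max(range(len(seqs)), key=totals.__getitem__)
    return center, totals
```

With `["ACGT", "ACGA", "TCGA"]` the totals are 2, 4, and 2, so the second sequence would be chosen as the center under these scores.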

Select the center sequence EXAMPLE:
• Compute all pairwise alignments (global alignments) and scores.
• The center is the sequence most similar to the rest.
• Build the alignment.

For highly similar sequences, this method can generate a reasonable alignment.

When the percentage identity between sequences is low, the multiple alignment obtained by star alignment can be very poor.

Other Ways of Obtaining Multiple Alignments

DIALIGN

Focuses on short ungapped local alignments (diagonals).

A complete alignment can be constructed from ungapped local alignments between pairs of sequences.

All possible diagonals between each pair of sequences are considered.
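Enumerating diagonals for one pair of sequences might look like the sketch below. It is not DIALIGN's method in detail: DIALIGN assigns each diagonal a statistically motivated weight, whereas this example uses a simple stand-in score (matches minus mismatches). The names `diagonals` and `best_diagonal` and the `min_len` parameter are hypothetical.

```python
def diagonals(a, b, min_len=2):
    """Yield (start_a, start_b, length, matches) for every ungapped segment pair."""
    for i in range(len(a)):
        for j in range(len(b)):
            max_len = min(len(a) - i, len(b) - j)
            matches = 0
            for length in range(1, max_len + 1):
                if a[i + length - 1] == b[j + length - 1]:
                    matches += 1
                if length >= min_len:
                    yield (i, j, length, matches)


def best_diagonal(a, b, min_len=2):
    """Pick the diagonal with the highest stand-in score: matches minus mismatches."""
    return max(
        diagonals(a, b, min_len),
        key=lambda d: 2 * d[3] - d[2],   # matches - (length - matches)
        default=None,
    )
```

The enumeration is brute force (roughly cubic in sequence length), which is adequate for illustration but not for real data.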

SAGA

Uses a genetic algorithm to search for the optimal alignment.

[Figure: steps in a general genetic algorithm]

[Figure: crossover operations in SAGA]

[Figure: crossover operations in SAGA (another way)]

Sequence Pattern Discovery

Patterns can be discovered in two ways:
• From multiple sequence alignments
• By searching for possible patterns in the set of sequences

eMOTIF:

Uses 20 groups of amino acids to denote amino acids that can be substituted for each other.

For every position of the alignment, determine which single group can cover the whole column.

By examining the possible column combinations, identify patterns.
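The column-covering step can be sketched as follows. The substitution groups listed here are placeholders invented for illustration and are not eMOTIF's actual groups; choosing the smallest covering group, and writing `.` for a column no group covers, are likewise assumptions of this sketch. The names `GROUPS`, `covering_group`, and `alignment_pattern` are hypothetical.

```python
# Placeholder substitution groups for illustration only (NOT the real eMOTIF groups).
GROUPS = [frozenset(g) for g in ("ILMV", "FWY", "KR", "DE", "ST", "NQ", "DENQ", "HKR")]


def covering_group(column, groups=GROUPS):
    """Return a pattern element for one column, or None if no group covers it."""
    residues = set(column)
    if len(residues) == 1:
        return next(iter(residues))          # fully conserved position
    candidates = [g for g in groups if residues <= g]
    if not candidates:
        return None
    smallest = min(candidates, key=len)      # assumption: prefer the most specific group
    return "[" + "".join(sorted(smallest)) + "]"


def alignment_pattern(rows):
    """Build a pattern string from an alignment given as equal-length rows."""
    parts = []
    for column in zip(*rows):
        element = covering_group(column)
        parts.append(element if element is not None else ".")
    return "".join(parts)
```

With the rows `KDIF`, `RELF`, `KDVA`, the first three columns are covered by the placeholder groups `KR`, `DE`, and `ILMV`, while the last column (F, F, A) is covered by none of them.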
