BI420 – Introduction to Bioinformatics. Sequence alignment. Gabor T. Marth. Department of Biology, Boston College, marth@bc.edu.

Sequence alignment – Biology

Biologically significant sequence alignment

Sequence alignment – Biology

Biologically plausible sequence alignment

Sequence alignment – Biology

Spurious alignment

Examples from: Biological sequence analysis. Durbin, Eddy, Krogh, Mitchison

Alignment types

How do we align the words: CRANE and FRAME?

CRANE
 || |
FRAME

3 matches, 2 mismatches

How do we align words that are different in length?

COELACANTH
  || |||
P-ELICAN--

COELACANTH
  || |||
-PELICAN--

5 matches, 2 mismatches, 3 gaps

In this case, if we assign +1 point for each match and -1 for each mismatch or gap, we get 5 x 1 + 2 x (-1) + 3 x (-1) = 0. This is the alignment score.
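The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name `score_alignment` and its parameter names are made up here, and "-" is assumed to be the gap character, as in the slides.

```python
def score_alignment(top, bottom, match=1, mismatch=-1, gap=-1):
    """Score two gapped rows of equal length, column by column.

    Hypothetical helper: '-' marks a gap; defaults follow the +1/-1/-1 scheme.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    score = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            score += gap        # a letter aligned against a gap
        elif x == y:
            score += match      # identical letters
        else:
            score += mismatch   # two different letters
    return score


print(score_alignment("COELACANTH", "P-ELICAN--"))  # 5 - 2 - 3
print(score_alignment("CRANE", "FRAME"))            # 3 - 2
```

The two rows must already contain their gap characters; the function only scores a given alignment and does not search for one.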

Examples from: BLAST. Korf, Yandell, Bedell

Finding the “best” alignment

COELACANTH
   | |||
PE-LICAN--
S=-2

COELACANTH
  ||
P-EL-ICAN-
S=-6

COELACANTH
PELICAN---
S=-10

COELACANTH
  || |||
P-ELICAN--
S=0
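Rather than scoring candidate alignments one at a time, the best achievable score can be computed directly. The sketch below uses the Needleman-Wunsch dynamic programming recurrence for global alignment. The slides do not name a method at this point, so the choice of algorithm is an assumption made for illustration, as are the function name `best_global_score` and its parameters. It returns only the best score, not the alignment itself.

```python
def best_global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b (Needleman-Wunsch, linear gap cost)."""
    rows, cols = len(a) + 1, len(b) + 1
    # best[i][j] = best score for aligning a[:i] with b[:j]
    best = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        best[i][0] = i * gap            # a[:i] aligned against gaps only
    for j in range(1, cols):
        best[0][j] = j * gap            # b[:j] aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            best[i][j] = max(
                best[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                best[i - 1][j] + gap,       # a[i-1] against a gap
                best[i][j - 1] + gap,       # b[j-1] against a gap
            )
    return best[-1][-1]


print(best_global_score("COELACANTH", "PELICAN"))
```

With the +1/-1/-1 scheme, no alignment of COELACANTH and PELICAN scores above 0, so the S=0 alignment shown above is optimal, although it is not the only alignment that reaches that score.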

Visualizing pair-wise alignments

http://bioinfo.pbi.nrc.ca:8090/EMBOSS/index.html

Sequence similarity and scoring

Match-mismatch-gap penalties, e.g. Match = 1, Mismatch = -5, Gap = -10
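To see how the choice of penalties changes a score, the following sketch counts the column types of an alignment once and then weights them under two schemes. The helper names `count_columns` and `weighted_score` are hypothetical, and "-" is assumed to mark a gap.

```python
def count_columns(top, bottom):
    """Return (matches, mismatches, gaps) for two gapped rows of equal length."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    matches = mismatches = gaps = 0
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":
            gaps += 1
        elif x == y:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, gaps


def weighted_score(counts, match, mismatch, gap):
    """Combine column counts with a match-mismatch-gap scheme."""
    n_match, n_mismatch, n_gap = counts
    return n_match * match + n_mismatch * mismatch + n_gap * gap


counts = count_columns("COELACANTH", "P-ELICAN--")
print(counts)                               # (5, 2, 3)
print(weighted_score(counts, 1, -1, -1))    # 0
print(weighted_score(counts, 1, -5, -10))   # -35
```

The same alignment drops from 0 to -35 under the harsher scheme because its two mismatches and three gaps now cost far more than its five matches earn.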

Scoring matrices