
## Sequence Alignment

Oct 9, 2002

Joon Lee

Genomics & Computational Biology

### Dynamic Programming

• Optimization problems: find the best sequence of decisions

• Subproblems are not independent

• Subproblems share subsubproblems

• Solve subproblem, save its answer in a table
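The "save its answer in a table" idea can be sketched with a memoized recursion. This is only an illustration, assuming the alignment scores used later in the deck (match +2, mismatch -1, gap -2); the function name `best_score` is hypothetical.

```python
from functools import lru_cache

# Assumed scores, borrowed from the scoring slides of this deck.
MATCH, MISMATCH, GAP = 2, -1, -2


def best_score(a, b):
    """Return (best global alignment score, number of subproblems solved)."""

    @lru_cache(maxsize=None)  # the cache plays the role of the DP table
    def solve(i, j):
        # Best score for aligning the prefixes a[:i] and b[:j].
        if i == 0:
            return j * GAP
        if j == 0:
            return i * GAP
        pair = MATCH if a[i - 1] == b[j - 1] else MISMATCH
        # The three branches reuse the same smaller subproblems again and again,
        # so each (i, j) is computed once and then looked up.
        return max(
            solve(i - 1, j - 1) + pair,
            solve(i - 1, j) + GAP,
            solve(i, j - 1) + GAP,
        )

    score = solve(len(a), len(b))
    return score, solve.cache_info().currsize


score, solved = best_score("GAATTCAGTTA", "GGATCGA")
print(score, solved)
```

Without the cache the recursion would revisit the same prefixes exponentially often; with it, at most (n + 1) × (m + 1) subproblems are ever solved.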


### Four Steps of DP

• Characterize the structure of an optimal solution

• Recursively define the value of an optimal solution

• Compute the value of an optimal solution in a bottom-up fashion

• Construct an optimal solution from computed information


### Sequence Alignment

Sequence 1: G A A T T C A G T T A

Sequence 2: G G A T C G A


### Align or insert gap

G A A T T C A G T T A

|   |   | |   |     |

G G A _ T C _ G _ _ A

G _ A A T T C A G T T A

|     |   | |   |     |

G G _ A _ T C _ G _ _ A


### Three Steps of SA

• Initialization: gap penalty

• Scoring: matrix fill

• Alignment: trace back


### Step 1: Initialization
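The figure for this slide did not survive, so the following is a sketch of what the initialization presumably looks like. It assumes a gap penalty of -2 and border values of -2, -4, ..., which is what the worked cell computations on the scoring slides imply. The name `init_matrix` is hypothetical.

```python
GAP = -2  # assumed gap penalty w, as used on the scoring slides


def init_matrix(a, b, gap=GAP):
    """Create a (len(a)+1) x (len(b)+1) score matrix with only the borders filled."""
    rows, cols = len(a) + 1, len(b) + 1
    S = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        S[i][0] = i * gap  # first i letters of a aligned against gaps only
    for j in range(1, cols):
        S[0][j] = j * gap  # first j letters of b aligned against gaps only
    return S


S = init_matrix("GAATTCAGTTA", "GGATCGA")
print(S[0])                      # first row
print([row[0] for row in S])     # first column
```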


### Step 2: Scoring

• $A = a_1 a_2 \dots a_n$, $B = b_1 b_2 \dots b_m$

• $S_{ij}$: score at $(i, j)$

• $s(a_i b_j)$: matching score between $a_i$ and $b_j$

• $w$: gap penalty
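With these symbols, the matrix-fill rule can be written out as follows. This is the standard global-alignment recurrence, stated here as an assumption about what the slide's figure showed; $w$ is added rather than subtracted because the deck gives the gap penalty as a negative number.

```latex
S_{ij} = \max
\begin{cases}
S_{i-1,j-1} + s(a_i b_j) & \text{align } a_i \text{ with } b_j \\
S_{i-1,j} + w            & \text{align } a_i \text{ with a gap} \\
S_{i,j-1} + w            & \text{align } b_j \text{ with a gap}
\end{cases}
\qquad
S_{00} = 0, \quad S_{i0} = i\,w, \quad S_{0j} = j\,w
```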


### Step 2: Scoring

• Match: +2

• Mismatch: -1

• Gap: -2
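To see what these three numbers do, here is a small sketch that scores an alignment that has already been written out with gaps. It assumes `_` marks a gap, as in the alignments shown earlier in the deck; the function name `alignment_score` is hypothetical.

```python
MATCH, MISMATCH, GAP = 2, -1, -2  # values from this slide
GAP_CHAR = "_"                    # gap symbol used in this deck


def alignment_score(top, bottom):
    """Sum the column scores of two equally long, gapped rows."""
    if len(top) != len(bottom):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(top, bottom):
        if x == GAP_CHAR or y == GAP_CHAR:
            total += GAP
        elif x == y:
            total += MATCH
        else:
            total += MISMATCH
    return total


print(alignment_score("GAATTCAGTTA", "GGA_TC_G__A"))    # 6 matches, 1 mismatch, 4 gaps -> 3
print(alignment_score("G_AATTCAGTTA", "GG_A_TC_G__A"))  # 6 matches, 6 gaps -> 0
```

The second alignment has the same number of matches as the first but pays for two extra gaps, which is why it scores lower.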


### Step 2: Scoring

0 + 2 = 2

-2 + (-2) = -4

-2 + (-2) = -4


### Step 2: Scoring

-2 + (-1) = -3

-4 + (-2) = -6

2 + (-2) = 0


### Step 2: Scoring

-2 + 2 = 0

2 + (-2) = 0

-4 + (-2) = -6
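The cell-by-cell arithmetic above can be reproduced with a short matrix-fill sketch. It assumes Sequence 1 indexes the rows and Sequence 2 the columns (the orientation of the original figure is not known), and the name `fill_matrix` is hypothetical.

```python
MATCH, MISMATCH, GAP = 2, -1, -2  # scores from the deck


def fill_matrix(a, b):
    """Fill the global-alignment score matrix for sequences a (rows) and b (columns)."""
    rows, cols = len(a) + 1, len(b) + 1
    S = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        S[i][0] = i * GAP
    for j in range(1, cols):
        S[0][j] = j * GAP
    for i in range(1, rows):
        for j in range(1, cols):
            pair = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            S[i][j] = max(
                S[i - 1][j - 1] + pair,  # diagonal: align a[i-1] with b[j-1]
                S[i - 1][j] + GAP,       # from above: gap in b
                S[i][j - 1] + GAP,       # from the left: gap in a
            )
    return S


S = fill_matrix("GAATTCAGTTA", "GGATCGA")
print(S[1][1])    # max(0 + 2, -2 + (-2), -2 + (-2)) = 2
print(S[2][1])    # max(-2 + (-1), 2 + (-2), -4 + (-2)) = 0
print(S[1][2])    # max(-2 + 2, -4 + (-2), 2 + (-2)) = 0
print(S[-1][-1])  # score of the best global alignment
```

The first three printed cells correspond to the three worked slides; the bottom-right cell holds the final score.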




### Step 3: Trace back

G A A T T C A G T T A

G G A _ T C _ G _ _ A

G A A T T C A G T T A

G G A T _ C _ G _ _ A
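A trace back can be sketched as a walk from the bottom-right cell to the top-left, at each step following a move that explains the cell's score. The sketch below assumes the deck's scores and returns only one optimal alignment; when several moves tie (as they do for the two alignments above), which one comes out depends on the order of the checks. The names `fill_matrix` and `traceback` are hypothetical.

```python
MATCH, MISMATCH, GAP = 2, -1, -2  # scores from the deck


def fill_matrix(a, b):
    """Score matrix for global alignment of a (rows) against b (columns)."""
    S = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        S[i][0] = i * GAP
    for j in range(1, len(b) + 1):
        S[0][j] = j * GAP
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            pair = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            S[i][j] = max(S[i - 1][j - 1] + pair, S[i - 1][j] + GAP, S[i][j - 1] + GAP)
    return S


def traceback(a, b, S):
    """Recover one optimal alignment, preferring diagonal moves on ties."""
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            if S[i][j] == S[i - 1][j - 1] + pair:
                top.append(a[i - 1])
                bottom.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and S[i][j] == S[i - 1][j] + GAP:
            top.append(a[i - 1])   # letter of a against a gap
            bottom.append("_")
            i -= 1
        else:
            top.append("_")        # gap against a letter of b
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


a, b = "GAATTCAGTTA", "GGATCGA"
top, bottom = traceback(a, b, fill_matrix(a, b))
print(top)
print(bottom)
```

Listing every optimal alignment would require branching at each tie instead of taking the first matching move.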


### Exercise

• Match: +2

• Mismatch: -1

• Gap: -2

G C A T C C G

G A T C G

G A T C G

G A T C G


### Amino acids

• Match/mismatch → Substitution matrix
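For proteins, the single match/mismatch pair is replaced by a lookup in a table of pairwise scores. The sketch below only shows the mechanics: the numbers in `TOY_MATRIX` are made up for illustration and are not taken from any real substitution matrix such as PAM or BLOSUM, and the names `TOY_MATRIX` and `s` are hypothetical.

```python
# Made-up scores for three amino acids (L, I, D), for illustration only.
# Real work would use a published matrix instead of these numbers.
TOY_MATRIX = {
    ("L", "L"): 4, ("L", "I"): 2, ("L", "D"): -4,
    ("I", "I"): 4, ("I", "D"): -3,
    ("D", "D"): 6,
}


def s(x, y, matrix=TOY_MATRIX):
    """Look up the score for aligning residues x and y (the table is symmetric)."""
    if (x, y) in matrix:
        return matrix[(x, y)]
    return matrix[(y, x)]


print(s("L", "L"))  # identical residues
print(s("I", "L"))  # similar residues score higher than dissimilar ones
print(s("D", "L"))
```

The rest of the algorithm is unchanged; only the diagonal term of the matrix fill calls `s(x, y)` instead of choosing between +2 and -1.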


### Global & Local alignment

• Global: Needleman-Wunsch Algorithm

• Local: Smith-Waterman Algorithm

From Mount, *Bioinformatics: Sequence and Genome Analysis*, Chapter 3
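The difference between the two shows up in the matrix fill. A minimal sketch of the local (Smith-Waterman) score, assuming the same +2 / -1 / -2 scores as the rest of the deck: borders start at 0, no cell may drop below 0, and the answer is the largest cell anywhere rather than the bottom-right one. The name `local_score` is hypothetical.

```python
MATCH, MISMATCH, GAP = 2, -1, -2  # assumed: same scores as the global example


def local_score(a, b):
    """Best local alignment score between any substring of a and any substring of b."""
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # borders stay at 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            pair = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            H[i][j] = max(
                0,                       # start a fresh local alignment here
                H[i - 1][j - 1] + pair,
                H[i - 1][j] + GAP,
                H[i][j - 1] + GAP,
            )
            best = max(best, H[i][j])
    return best


print(local_score("TTTACGTTT", "GGACGGG"))  # the shared ACG scores 3 x 2 = 6
```

A global alignment of the same two sequences would be penalized for the unrelated flanks, whereas the local score ignores them.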


### References

• Sequence alignment with Java applet

• http://linneus20.ethz.ch:8080/5_4_5.html
