Pairwise profile alignment

Usman Roshan

BNFO 601

- PFAM: http://pfam.sanger.ac.uk/
- Family alignments can be used to search for new members in a database

- Given a family alignment, how can we align it to a sequence?
- First, we compute a profile of the alignment.
- We then align the profile to the sequence using standard dynamic programming.
- For this, we need to define the score of aligning a profile vector to a single nucleotide or residue.
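The dynamic programming step can be sketched as a global (Needleman-Wunsch style) alignment in which each profile column plays the role of one "character". This is a minimal sketch, not a specific tool's implementation: the linear gap penalty `gap`, the function names, and the placeholder `toy_score` (+1 match, -1 mismatch, weighted by column frequencies) are assumptions made for illustration. A profile is represented here as a list of per-column frequency dictionaries.

```python
def align_profile_to_sequence(profile, seq, score, gap=-2.0):
    """Best global alignment score of a profile against a sequence.

    profile: list of frequency vectors (one dict per alignment column)
    seq:     string of nucleotides/residues
    score:   function (profile_vector, residue) -> float
    gap:     linear gap penalty (an assumption of this sketch)
    """
    n, m = len(profile), len(seq)
    # D[i][j] = best score aligning the first i columns with the first j residues
    D = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * gap
    for j in range(1, m + 1):
        D[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = max(
                D[i - 1][j - 1] + score(profile[i - 1], seq[j - 1]),  # column vs residue
                D[i - 1][j] + gap,                                    # gap in the sequence
                D[i][j - 1] + gap,                                    # gap in the profile
            )
    return D[n][m]


def toy_score(f, residue):
    """Placeholder column-vs-residue score: +1 for match, -1 for mismatch, weighted by f."""
    return sum(freq * (1.0 if a == residue else -1.0) for a, freq in f.items())


# Toy profile with three fully conserved columns: A, C, G
toy_profile = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
]
score_full = align_profile_to_sequence(toy_profile, "ACG", toy_score)
score_short = align_profile_to_sequence(toy_profile, "AG", toy_score)
```

Only the `score` argument changes between scoring schemes; the recurrence itself is the same as for pairwise sequence alignment.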

- A profile can be described by a set of vectors of nucleotide/residue frequencies.
- For each position i of the alignment, we compute the normalized frequency of each nucleotide A, C, G, and T
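As an illustration, the profile computation might look like the sketch below. The function name `compute_profile`, the toy alignment, and the choice to ignore gap characters when normalizing are assumptions of this example, not something the slides specify.

```python
from collections import Counter

ALPHABET = "ACGT"


def compute_profile(alignment, alphabet=ALPHABET):
    """Return one normalized frequency vector (dict) per alignment column.

    Gaps and other non-alphabet symbols are ignored (an assumption of this
    sketch), so each vector is normalized over the nucleotides observed in
    that column.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all rows of an alignment must have equal length")

    profile = []
    for i in range(length):
        counts = Counter(row[i] for row in alignment if row[i] in alphabet)
        total = sum(counts.values())
        # A column containing only gaps gets an all-zero vector.
        profile.append({a: (counts[a] / total if total else 0.0) for a in alphabet})
    return profile


# Hypothetical toy alignment of four sequences
toy_alignment = ["ACGT", "ACGA", "AC-T", "TCGT"]
profile = compute_profile(toy_alignment)
```

In this toy alignment the first column holds A, A, A, T, so its vector assigns 0.75 to A and 0.25 to T.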

- ClustalW/MUSCLE
- Let f be the profile vector
- Score(f,j) = sum over i of fi * S(i,j)
- where fi is the frequency of nucleotide/residue i in f and S(i,j) is the substitution scoring matrix
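A small sketch of this kind of score, assuming it is the frequency-weighted average of substitution scores, Score(f,j) = sum over i of fi * S(i,j). The +1/-1 match/mismatch matrix and the example vector are hypothetical; a real tool would use a matrix such as BLOSUM or a nucleotide scoring table.

```python
def profile_residue_score(f, j, S):
    """Frequency-weighted substitution score of residue j against profile vector f."""
    return sum(freq * S[i][j] for i, freq in f.items())


ALPHABET = "ACGT"
# Hypothetical substitution matrix: +1 for a match, -1 for a mismatch
S = {a: {b: (1.0 if a == b else -1.0) for b in ALPHABET} for a in ALPHABET}

# Example profile vector: 75% A, 25% T at this column
f = {"A": 0.75, "C": 0.0, "G": 0.0, "T": 0.25}

score_A = profile_residue_score(f, "A", S)  # 0.75*(+1) + 0.25*(-1)
score_C = profile_residue_score(f, "C", S)  # 0.75*(-1) + 0.25*(-1)
```

A residue that is common in the column scores well, and one never seen in the column scores as a mismatch against every observed residue.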

- PSI-BLAST
- Score(f,i)=log(Qi/Pi)
- Pi is the background probability of nucleotide i
- qij is a matrix of match/mismatch probabilities
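- Define gi as gi = sum over j of (fj/Pj) * qij, the pseudocount frequency of nucleotide i
- and Qi as Qi = (alpha*fi + beta*gi) / (alpha + beta), where alpha and beta weight the observed frequencies fi and the pseudocounts gi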
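The sketch below illustrates a PSI-BLAST style score under stated assumptions: it takes the pseudocount form gi = sum over j of (fj/Pj) * qij and Qi = (alpha*fi + beta*gi) / (alpha + beta), uses the natural logarithm, and picks hypothetical values for alpha, beta, the uniform background P, and the toy matrix q. None of these numbers come from the slides.

```python
import math


def pseudocount_frequencies(f, P, q):
    """g_i = sum_j (f_j / P_j) * q_ij for every symbol i."""
    return {i: sum((f[j] / P[j]) * q[i][j] for j in f) for i in f}


def estimated_frequencies(f, P, q, alpha, beta):
    """Q_i = (alpha * f_i + beta * g_i) / (alpha + beta)."""
    g = pseudocount_frequencies(f, P, q)
    return {i: (alpha * f[i] + beta * g[i]) / (alpha + beta) for i in f}


def psiblast_score(f, i, P, q, alpha, beta):
    """Score(f, i) = log(Q_i / P_i), using the natural log in this sketch."""
    Q = estimated_frequencies(f, P, q, alpha, beta)
    return math.log(Q[i] / P[i])


ALPHABET = "ACGT"
# Hypothetical uniform background probabilities
P = {a: 0.25 for a in ALPHABET}
# Hypothetical match/mismatch probabilities: entries sum to 1 and each row sums to P_i
q = {a: {b: (0.10 if a == b else 0.05) for b in ALPHABET} for a in ALPHABET}

f = {"A": 0.75, "C": 0.0, "G": 0.0, "T": 0.25}
alpha, beta = 3.0, 1.0  # hypothetical weights for observed data vs pseudocounts

Q = estimated_frequencies(f, P, q, alpha, beta)
score_A = psiblast_score(f, "A", P, q, alpha, beta)
score_C = psiblast_score(f, "C", P, q, alpha, beta)
```

Because gi is positive even when fi is zero, a nucleotide never observed in the column still gets a finite (negative) score rather than log(0).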