Sequence Alignment Programs for Global, Local, and Multiple Alignments
Learn how to run Global, Local, and Multiple Alignment programs efficiently, including sample outputs and detailed explanations. Input requirements and usage instructions are provided for each alignment type.
Global/Local/Multiple Alignments by Boyang Wei
GlobalAlignment
How to run:
• GlobalAlignment -s seq1 seq2 (-m)
• GlobalAlignment -f file1 file2 (-m)
Sample output:
rns202-15.cs.stolaf.edu% GlobalAlignment -s ABB ABABBAB -ABBA - B
GlobalAlignment
// represent each box in the matrix
struct box {
    int row;            // row index
    int col;            // column index
    int score;          // best score
    vector<box*> from;  // where does the score come from
};
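The box struct suggests a dynamic-programming matrix in which every cell remembers which neighbours produced its best score. Below is a minimal sketch of that idea, not the author's implementation. It assumes match = 2, mismatch = -1 and gap = -1 (the slides do not state the scores GlobalAlignment uses), and the names Cell, Matrix, fillGlobal and traceback are hypothetical. Predecessors are stored as (row, col) pairs instead of box* pointers so the matrix can be copied safely.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Assumed scoring; the slides do not state the values GlobalAlignment uses.
const int kMatch = 2, kMismatch = -1, kGap = -1;

// Hypothetical counterpart of the slide's `box`.
struct Cell {
    int score = 0;                          // best score for this cell
    std::vector<std::pair<int, int>> from;  // predecessor cells that give that score
};

using Matrix = std::vector<std::vector<Cell>>;

// Fill the (|a|+1) x (|b|+1) global alignment matrix; rows follow a, columns follow b.
Matrix fillGlobal(const std::string& a, const std::string& b) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    Matrix t(n + 1, std::vector<Cell>(m + 1));
    // First column and first row: only gaps are possible.
    for (int i = 1; i <= n; ++i) { t[i][0].score = i * kGap; t[i][0].from.emplace_back(i - 1, 0); }
    for (int j = 1; j <= m; ++j) { t[0][j].score = j * kGap; t[0][j].from.emplace_back(0, j - 1); }
    for (int i = 1; i <= n; ++i) {
        for (int j = 1; j <= m; ++j) {
            int diag = t[i - 1][j - 1].score + (a[i - 1] == b[j - 1] ? kMatch : kMismatch);
            int up   = t[i - 1][j].score + kGap;   // gap in b
            int left = t[i][j - 1].score + kGap;   // gap in a
            int best = std::max({diag, up, left});
            t[i][j].score = best;
            // Record every neighbour that achieves the best score.
            if (diag == best) t[i][j].from.emplace_back(i - 1, j - 1);
            if (up == best)   t[i][j].from.emplace_back(i - 1, j);
            if (left == best) t[i][j].from.emplace_back(i, j - 1);
        }
    }
    return t;
}

// Follow the first recorded predecessor from the bottom-right cell back to (0, 0).
std::pair<std::string, std::string> traceback(const Matrix& t, const std::string& a,
                                              const std::string& b) {
    std::string ra, rb;
    int i = static_cast<int>(a.size());
    int j = static_cast<int>(b.size());
    while (i > 0 || j > 0) {
        const std::pair<int, int>& p = t[i][j].from.front();
        ra += (p.first < i) ? a[i - 1] : '-';   // row moved: consume a letter of a
        rb += (p.second < j) ? b[j - 1] : '-';  // column moved: consume a letter of b
        i = p.first;
        j = p.second;
    }
    std::reverse(ra.begin(), ra.end());
    std::reverse(rb.begin(), rb.end());
    return {ra, rb};
}
```

Under these assumed scores, traceback(fillGlobal("AB", "A"), "AB", "A") yields the pair ("AB", "A-"). Because each cell keeps all tied predecessors, a fuller version could enumerate every optimal alignment instead of only the first.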
LocalAlignment
How to run:
• LocalAlignment -s seq1 seq2 (-m)
• LocalAlignment -f file1 file2 (-m)
Sample output:
rns202-15.cs.stolaf.edu% LocalAlignment -f seq1.txt seq2.txt
AC|TAC|T
 G|TAC|
LocalAlignment
Sample output with matrix printed:
rns202-15.cs.stolaf.edu% LocalAlignment -f seq1.txt seq2.txt -m
AC|TAC|T
 G|TAC|
  - A C T A C T
- 0 0 0 0 0 0 0
G 0 0 0 0 0 0 0
T 0 0 0 2 1 0 2
A 0 2 1 1 4 3 2
C 0 1 4 3 3 6 5
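The printed matrix is consistent with a Smith-Waterman style fill using match = 2, mismatch = -1 and gap = -1. Those values are inferred from the numbers in the matrix and are not stated on the slides. The sketch below rebuilds the matrix for ACTACT (columns) and GTAC (rows) under that assumption; fillLocal, bestLocal and LocalResult are hypothetical names, not the program's own.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Scores inferred from the printed matrix; not stated on the slides.
const int kMatch = 2, kMismatch = -1, kGap = -1;

// Fill the local alignment matrix. Columns follow `cols`, rows follow `rows`,
// matching the layout of the -m printout.
std::vector<std::vector<int>> fillLocal(const std::string& cols, const std::string& rows) {
    std::vector<std::vector<int>> t(rows.size() + 1, std::vector<int>(cols.size() + 1, 0));
    for (std::size_t i = 1; i <= rows.size(); ++i) {
        for (std::size_t j = 1; j <= cols.size(); ++j) {
            int diag = t[i - 1][j - 1] + (rows[i - 1] == cols[j - 1] ? kMatch : kMismatch);
            int up   = t[i - 1][j] + kGap;
            int left = t[i][j - 1] + kGap;
            t[i][j] = std::max({0, diag, up, left});  // scores never drop below zero
        }
    }
    return t;
}

struct LocalResult {
    int score;           // highest score in the matrix
    std::string top;     // aligned segment of `cols`
    std::string bottom;  // aligned segment of `rows`
};

// Trace back from the highest-scoring cell until a zero cell is reached.
LocalResult bestLocal(const std::string& cols, const std::string& rows) {
    std::vector<std::vector<int>> t = fillLocal(cols, rows);
    std::size_t bi = 0, bj = 0;
    for (std::size_t i = 0; i < t.size(); ++i)
        for (std::size_t j = 0; j < t[i].size(); ++j)
            if (t[i][j] > t[bi][bj]) { bi = i; bj = j; }

    LocalResult r{t[bi][bj], "", ""};
    std::size_t i = bi, j = bj;
    while (i > 0 && j > 0 && t[i][j] > 0) {
        int s = (rows[i - 1] == cols[j - 1]) ? kMatch : kMismatch;
        if (t[i][j] == t[i - 1][j - 1] + s) {         // came from the diagonal
            r.top += cols[j - 1]; r.bottom += rows[i - 1]; --i; --j;
        } else if (t[i][j] == t[i - 1][j] + kGap) {   // came from above: gap in cols
            r.top += '-'; r.bottom += rows[i - 1]; --i;
        } else {                                      // came from the left: gap in rows
            r.top += cols[j - 1]; r.bottom += '-'; --j;
        }
    }
    std::reverse(r.top.begin(), r.top.end());
    std::reverse(r.bottom.begin(), r.bottom.end());
    return r;
}
```

Calling bestLocal("ACTACT", "GTAC") gives score 6 with TAC aligned to TAC, which is the region between the bars in the sample output. This sketch reports only the first highest-scoring cell it encounters; how the original program handles ties is not shown on the slides.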
Global & Local Alignment
Input:
• both programs require exactly two sequences as input
• sequences are not limited to A, T, G, C
• characters do not need to be capitalized
• any characters will work
MultipleAlignment
How to run:
• MultipleAlignment -s seq1 seq2 seq3 ...
• MultipleAlignment -f file
Sample output:
rns202-15.cs.stolaf.edu% MultipleAlignment -s ATGC ATG ATCATGCATG -AT - C
MultipleAlignment
// class to store a character
// its objects represent the letters in the sequence
// NOTE! since the letter class is set up in this way,
// it will store characters other than A, T, G, C as a gap,
// so this program only works for input consisting of A, T, G, C
struct letter {
    float A;    // percentage of A in this letter
    float T;    // percentage of T in this letter
    float G;    // percentage of G in this letter
    float C;    // percentage of C in this letter
    float gap;  // percentage of gap in this letter
    ......
}
MultipleAlignment
// a self-defined string class
// to store the sequence/string as a sequence of letter objects
struct sequence {
    vector<letter> seq;       // the sequence
    vector<int> gapPosition;  // the gap positions at the end of aligning
    int prev[2];              // index of the previous two sequences
                              // (sometimes a sequence may be generated by
                              // combining two sequences)
    ......
}
MultipleAlignment
// calculate the score for two letters
// either a match, mismatch, or partial match
float calculateScore(letter l1, letter l2) {
    float matchPercent = min(l1.A, l2.A) + min(l1.T, l2.T)
                       + min(l1.G, l2.G) + min(l1.C, l2.C);
    float misMatchPercent = 1 - matchPercent;
    return matchPercent*matchScore + misMatchPercent*misMatchScore;
}
MultipleAlignment
// combine two letters to generate a new one
letter sum(letter l1, letter l2) {
    // if one of them is a gap, return the other one
    if (l1.gap == 1) return l2;
    else if (l2.gap == 1) return l1;
    // otherwise, combine the percentages
    else {
        letter l((l1.A+l2.A)/2, (l1.T+l2.T)/2, (l1.G+l2.G)/2, (l1.C+l2.C)/2,
                 (l1.gap+l2.gap)/2); // could just put 0 for this line
        return l;
    }
}
MultipleAlignment
Input:
• requires at least one sequence as input
• if read from file:
  • first line: the number of input sequences
  • remaining lines: one sequence per line