Comp. Genomics Recitation 13 Genome rearrangements Homework solutions
Exercise 1 • Two haploid, single-chromosome genomes G1 and G2 were sequenced. G1 is an ancestor of G2. G1 is represented by the unsigned permutation 1,2,…,n. • The region gi,…,gj is known as a “tough chromosomal region”. Reversal events never create breakpoints in this region.
Exercise 1 • Assume that G2 was generated from G1 by the minimal number of reversal events needed to obtain G2. • Give an upper bound on the number of reversal events that occurred during the evolution from G1 to G2.
Solution 1 • We can apply the same reversals in reverse order to G2 to obtain G1 • E.g., if a single reversal transformed G1=12345 into G2=14325, we can apply a reversal on the same indices to G2 and get G1 back • So if we show a series of k reversals that transforms G2 back into G1, then k is an upper bound
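The undo argument can be checked on a small case. The sketch below is only an illustration: the helper names `apply_reversal` and `apply_all` and the three-step `history` are hypothetical, and indices are 0-based and inclusive.

```python
def apply_reversal(perm, i, j):
    """Return perm with the segment perm[i..j] (0-based, inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def apply_all(perm, reversals):
    """Apply a sequence of (i, j) reversals from left to right."""
    for i, j in reversals:
        perm = apply_reversal(perm, i, j)
    return perm


g1 = [1, 2, 3, 4, 5]
history = [(1, 3), (0, 2), (2, 4)]   # hypothetical evolution G1 -> G2
g2 = apply_all(g1, history)

# A reversal is its own inverse, so replaying the same index pairs
# in the opposite order walks G2 back to G1 in the same number of steps.
restored = apply_all(g2, reversed(history))
print(g2, restored)
```

Because every reversal undoes itself, any sorting sequence from G2 to G1 has the same length as some sequence from G1 to G2, which is why bounding one direction bounds the other.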
Solution 1 • Genes 1,…,i-1 appear in G2 before position i or after position j. In the worst case, we need i-1 reversal operations to get these genes into their correct order. • Then we have in G2: • 1,2,…,i-1,TOUGH_REGION,REST_OF_GENES • where the TOUGH_REGION is either i,i+1,…,j or j,j-1,…,i
Solution 1 • We can fix the REST_OF_GENES region in n-j-1 reversal operations. • Adding at most one reversal to flip the TOUGH_REGION, in total we get (i-1)+1+(n-j-1)=n-(j-i)-1
Exercise 2 • A breakpoint is a location in the sequence such that the two adjacent elements are not consecutive integers, i.e. |π(k) − π(k+1)| ≠ 1. • Prove or refute: Out of n/2 reversals on the unsigned permutation 1,2,…,n, there is at least one reversal that cancels a breakpoint at some index. • A reversal operates on a subsequence. • Note that a reversal can both cancel a breakpoint and create new ones
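Solution 2 • Can you refute it? • The claim is false. • Consider the permutation (1,2,3). • (1,2,3) → (1,3,2) → (3,1,2) → (1,3,2) → … • Breakpoint between positions 1 and 2: No, Yes, Yes, Yes • Breakpoint between positions 2 and 3: No, No, No, No • No reversal in this sequence cancels a breakpoint, and the sequence can be continued indefinitely.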
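The counterexample can be replayed mechanically. This is a minimal sketch that assumes a breakpoint is a pair of adjacent elements that are not consecutive integers (no framing elements 0 and n+1); the function names are illustrative.

```python
def breakpoint_positions(perm):
    """0-based indices k where perm[k] and perm[k+1] are not consecutive integers."""
    return {k for k in range(len(perm) - 1) if abs(perm[k] - perm[k + 1]) != 1}


def apply_reversal(perm, i, j):
    """Return perm with the segment perm[i..j] (0-based, inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


perm = (1, 2, 3)
steps = [(1, 2), (0, 1), (0, 1)]   # the reversals behind the sequence on the slide
trace = [perm]
cancelled = []                     # breakpoints removed by each reversal
for i, j in steps:
    nxt = apply_reversal(perm, i, j)
    cancelled.append(breakpoint_positions(perm) - breakpoint_positions(nxt))
    perm = nxt
    trace.append(perm)

print(trace)
print(cancelled)
```

Every entry of `cancelled` is empty: the first reversal only creates a breakpoint, and the later ones keep it at the same index.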
Exercise 3 • Two reversals occur on the permutation 1,2,…,n. How many breakpoints can occur in the resulting permutation?
Solution 3 • One reversal: • 1 2 3 4 5 6 7 → 1 7 6 5 4 3 2 (one breakpoint) • 1 2 3 4 5 6 7 → 1 6 5 4 3 2 7 (two breakpoints)
Solution 3 • Two reversals: 1 2 3 4 5 6 7 → 1 6 5 4 3 2 7 → 1 2 3 4 5 6 7 (zero breakpoints)
Solution 3 • Two reversals: 1 2 3 4 5 6 7 → 1 7 6 5 4 3 2 → 3 4 5 6 7 1 2 (one breakpoint)
Solution 3 • Two reversals: 1 2 3 4 5 6 7 → 1 7 6 5 4 3 2 → 1 3 4 5 6 7 2 (two breakpoints)
Solution 3 • Two reversals: 1 2 3 4 5 6 7 → 1 6 5 4 3 2 7 → 1 6 2 3 4 5 7 (three breakpoints)
Solution 3 • Two reversals: 1 2 3 4 5 6 7 → 1 6 5 4 3 2 7 → 1 6 5 3 4 2 7 (four breakpoints)
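The case analysis can be confirmed by brute force for n=7. The sketch below assumes the same breakpoint convention as the examples (adjacent elements that differ by more than 1, with no framing elements) and counts only reversals of at least two elements; all names are illustrative.

```python
from itertools import combinations


def count_breakpoints(perm):
    """Number of adjacent pairs that are not consecutive integers."""
    return sum(1 for a, b in zip(perm, perm[1:]) if abs(a - b) != 1)


def all_reversals(perm):
    """Yield every permutation obtained by reversing a segment of length >= 2."""
    for i, j in combinations(range(len(perm)), 2):
        yield perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


def breakpoint_counts_after(n, k):
    """Set of breakpoint counts reachable from 1..n by exactly k reversals."""
    frontier = {tuple(range(1, n + 1))}
    for _ in range(k):
        frontier = {q for p in frontier for q in all_reversals(p)}
    return {count_breakpoints(p) for p in frontier}


print(sorted(breakpoint_counts_after(7, 2)))   # [0, 1, 2, 3, 4]
```

The maximum of four matches the intuition that a reversal changes adjacencies only at its two boundaries, so each reversal can add at most two breakpoints.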
DCJ Algorithm • Why does it run in linear time?
DCJ Algorithm – cont’d • dDCJ(A,B) = N – (C+I/2). • Each iteration increments either C by one or I by two. • Our genome representation allows us to find and perform each sorting operation in constant time. • The DCJ distance is never larger than N, so there are at most N iterations and the total running time is O(N).
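To make the formula concrete, here is a rough sketch that evaluates N – (C+I/2), assuming C and I count the cycles and the odd-length paths of the adjacency graph of A and B, as in the usual DCJ formulation. The genome encoding (a list of `(genes, is_circular)` pairs with signed integers) and all function names are assumptions made for this example, and it uses union-find rather than the constant-time representation the slide refers to. Both genomes must contain the same genes.

```python
from collections import Counter


def adjacencies(genome):
    """Adjacencies (2 extremities) and telomeres (1 extremity) of a genome."""
    result = []
    for genes, circular in genome:
        # For each gene: (left extremity, right extremity); 't' = tail, 'h' = head.
        ends = [((abs(g), 't'), (abs(g), 'h')) if g > 0
                else ((abs(g), 'h'), (abs(g), 't')) for g in genes]
        for (_, right), (left, _) in zip(ends, ends[1:]):
            result.append(frozenset({right, left}))
        if circular:
            result.append(frozenset({ends[-1][1], ends[0][0]}))
        else:
            result.append(frozenset({ends[0][0]}))
            result.append(frozenset({ends[-1][1]}))
    return result


def dcj_distance(a, b):
    adj_a, adj_b = adjacencies(a), adjacencies(b)
    n_genes = sum(len(genes) for genes, _ in a)
    vertices = adj_a + adj_b                      # B vertices are offset by len(adj_a)
    parent = list(range(len(vertices)))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    where_b = {ext: len(adj_a) + k for k, adj in enumerate(adj_b) for ext in adj}
    edge_sources = []
    for k, adj in enumerate(adj_a):
        for ext in adj:                           # one edge per shared extremity
            edge_sources.append(k)
            parent[find(k)] = find(where_b[ext])

    edges_in = Counter(find(u) for u in edge_sources)
    telomere_roots = {find(v) for v, adj in enumerate(vertices) if len(adj) == 1}
    cycles = sum(1 for root in edges_in if root not in telomere_roots)
    odd_paths = sum(1 for root, e in edges_in.items()
                    if root in telomere_roots and e % 2 == 1)
    return n_genes - (cycles + odd_paths // 2)


linear = [([1, 2, 3], False)]
print(dcj_distance(linear, linear))                  # identical genomes
print(dcj_distance(linear, [([1, -2, 3], False)]))   # one inversion apart
```

A component without telomeres is a cycle and a component with telomeres is a path, so counting edges per component is enough to separate odd paths from even ones.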
Question from the 5767 (2006/07) Moed A exam • A genome is a set of chromosomes, where each chromosome is a sequence of signed integers. Together, the chromosomes contain the integers 1,…,n with no repetitions. • For example, G={(1,-2,3),(4,5,6,-7)} is a genome with two chromosomes. We assume that a chromosome and its reverse with opposite signs are equivalent. • Hence (4,5,6,-7) is equivalent to (7,-6,-5,-4).
Question from the 5767 (2006/07) Moed A exam • A reversal operation reverses the order and the signs of a contiguous segment within a single chromosome. Hence, a single reversal on the first chromosome of G can produce the genome {(1,-3,2),(4,5,6,-7)}. • A translocation operation exchanges end segments of two chromosomes (one of which may be empty). For example, a translocation on G can produce the genome {(1,-2,5,6,-7),(4,3)}.
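The two operations can be written down directly and checked against the examples above. This is a minimal sketch with hypothetical helper names; chromosomes are tuples of signed integers, indices are 0-based, and only the suffix-for-suffix form of translocation is modeled.

```python
def reversal(chrom, i, j):
    """Reverse the order and flip the signs of chrom[i..j] (inclusive)."""
    return chrom[:i] + tuple(-g for g in reversed(chrom[i:j + 1])) + chrom[j + 1:]


def translocation(c1, c2, cut1, cut2):
    """Exchange the suffix of c1 starting at cut1 with the suffix of c2 starting at cut2."""
    return c1[:cut1] + c2[cut2:], c2[:cut2] + c1[cut1:]


def canonical(chrom):
    """Pick one representative of a chromosome and its sign-flipped reverse."""
    flipped = tuple(-g for g in reversed(chrom))
    return min(chrom, flipped)


G = [(1, -2, 3), (4, 5, 6, -7)]
print(reversal(G[0], 1, 2))              # (1, -3, 2)
print(translocation(G[0], G[1], 2, 1))   # ((1, -2, 5, 6, -7), (4, 3))
```

`canonical` reflects the equivalence rule: (4,5,6,-7) and (7,-6,-5,-4) map to the same representative, so two genomes can be compared by comparing their sets of canonical chromosomes.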
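Question from the 5767 (2006/07) Moed A exam • The problem is to transform a given genome into another genome using the minimum number of reversal and translocation operations. • Give an algorithm that guarantees a constant performance ratio for the problem and runs in polynomial time.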
Solution • The problem is equivalent to signed reversal. • We saw in class a 2-approximation solution that runs in polynomial time.
HW 3 question 5 • Uniform lifted alignment – a lifted alignment in which, at each level, all strings are lifted either from the right child or from the left child. • Prove that the optimal uniform lifted alignment has cost at most twice that of the optimal alignment tree. • Give a polynomial algorithm to find the optimal uniform lifted alignment.
HW 3 question 5 • Uniform lifted alignment, proof: • Assume we had the optimal tree T*. • Transform it in the following way: • To assign string at level k, consider: • Pick the minimal sum.
HW 3 – question 5 – cont’d • Assign each ‘costly’ edge (T,S) to a path in the optimal tree: • The path from leaf (T) to node (S*). • Together, these paths cover all edges of the tree. • [Figure: node S (labeled S* in the optimal tree) and leaf T]
HW 3 – question 5 – cont’d • By the triangle inequality: D(S,T) ≤ D(S,S*) + D(S*,T) • By the choice of left/right: Σ_S [D(S,S*) + D(S*,T)] ≤ Σ_S [D(S*,T) + D(S*,T)] = Σ_S 2·D(S*,T) • ⇒ A one-sided tree with cost at most twice the optimal.
HW 3 – question 5 – cont’d • Algorithm: • Preprocess pairwise sequence distances. • Try all different left/right assignments for the levels, and pick the minimal one. • Running time (n sequences of length m): • Preprocessing: O(m²n²). • Height h: 2^h different assignments. • Calculating the cost of one tree: O(n). • Total: O(m²n² + 2^h·n).
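The enumeration can be sketched as follows. This is an illustration under simplifying assumptions: the tree is a complete binary tree whose leaves are given left to right, and Hamming distance on equal-length strings stands in for the pairwise alignment distance D. All names are hypothetical.

```python
from itertools import product


def hamming(a, b):
    """Stand-in for the pairwise alignment distance D (equal-length strings)."""
    return sum(c1 != c2 for c1, c2 in zip(a, b))


def uniform_lifted_cost(choice, dist, n_leaves):
    """Cost of the tree when level k lifts from child choice[k] (0 = left, 1 = right)."""
    labels = list(range(n_leaves))        # each node is labeled by a leaf index
    cost = 0
    for side in choice:                   # bottom level first
        parents = [labels[2 * k + side] for k in range(len(labels) // 2)]
        cost += sum(dist[p][labels[2 * k]] + dist[p][labels[2 * k + 1]]
                    for k, p in enumerate(parents))
        labels = parents
    return cost


def best_uniform_lifted(leaves):
    """Try all 2^h left/right assignments and return (minimal cost, assignment)."""
    h = len(leaves).bit_length() - 1
    assert 1 << h == len(leaves), "sketch assumes a complete binary tree"
    dist = [[hamming(a, b) for b in leaves] for a in leaves]   # preprocessing
    return min((uniform_lifted_cost(c, dist, len(leaves)), c)
               for c in product((0, 1), repeat=h))


print(best_uniform_lifted(["AA", "AA", "AB", "BB"]))
```

In a complete binary tree h = log2(n), so the 2^h assignments amount to only n candidates; for other tree shapes the number of assignments depends on the height.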
HW 1 question 1 • Question 1: Explain how to compute local alignment in linear space • The linear space algorithm from the lecture is a global alignment algorithm
solution • [Figure: DP matrix of x against y; labels: local alignment, global alignment]
solution • For every cell [i,j] in the DP matrix, add a field b[i,j], the cell at which the best local alignment ending at [i,j] starts. It is updated as follows: • If the score of [i,j] is 0 then b[i,j]=(i,j) • Otherwise • If the score came from the diagonal (match or mismatch): b[i,j]=b[i-1,j-1] • If x_i is aligned to a gap: b[i,j]=b[i-1,j] • If y_j is aligned to a gap: b[i,j]=b[i,j-1]
solution • Use the linear space algorithm from class for computing the score of the optimal local alignment • At the same time the field b[i,j] can be updated for every cell • Now, “cut out” the small matrix using the cell with the optimal score [i* ,j*] and b[i* ,j*], and run Hirschberg
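The score pass with the extra field b can be sketched with two rows of memory. The scoring values (match +1, mismatch -1, gap -1) and the function name are assumptions for this example, and the final Hirschberg step on the cut-out sub-matrix is not included.

```python
def local_alignment_bounds(x, y, match=1, mismatch=-1, gap=-1):
    """Return (best score, start cell, end cell) of an optimal local alignment.

    Keeps only the previous and current DP rows. The alignment covers
    x[start_i:end_i] and y[start_j:end_j] in Python slice notation.
    """
    m = len(y)
    prev_s = [0] * (m + 1)
    prev_b = [(0, j) for j in range(m + 1)]
    best_score, best_start, best_end = 0, (0, 0), (0, 0)
    for i in range(1, len(x) + 1):
        cur_s = [0] * (m + 1)
        cur_b = [(i, 0)] + [None] * m
        for j in range(1, m + 1):
            diag = prev_s[j - 1] + (match if x[i - 1] == y[j - 1] else mismatch)
            up = prev_s[j] + gap           # x[i-1] aligned to a gap
            left = cur_s[j - 1] + gap      # y[j-1] aligned to a gap
            s = max(0, diag, up, left)
            if s == 0:
                b = (i, j)                 # a fresh alignment starts after this cell
            elif s == diag:
                b = prev_b[j - 1]
            elif s == up:
                b = prev_b[j]
            else:
                b = cur_b[j - 1]
            cur_s[j], cur_b[j] = s, b
            if s > best_score:
                best_score, best_start, best_end = s, b, (i, j)
        prev_s, prev_b = cur_s, cur_b
    return best_score, best_start, best_end


x, y = "TTACGTAA", "GGACGTCC"
score, (si, sj), (ei, ej) = local_alignment_bounds(x, y)
print(score, x[si:ei], y[sj:ej])
```

The substrings `x[si:ei]` and `y[sj:ej]` are the inputs on which a linear-space global alignment can then be run to recover the alignment itself.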