## Physical Mapping of DNA

**BIO/CS 471 – Algorithms for Bioinformatics**

Physical Mapping of DNA

**Landmarks on the genome**

• Identify the order and/or location of sequence landmarks on the DNA
• Restriction Enzyme Digests: BamHI – GGATCC
• Hybridization Mapping

[Figure: clones with probes w, x, y, z]

**Producing a map of the genome**

[Figure: genome map with landmarks labeled m, u, a, e, j, i, x, f, w, z, d]

**Restriction Fragment Mapping**

| Digest | Fragment lengths |
|---|---|
| A | 3 8 6 10 |
| B | 4 5 11 7 |
| A + B | 3 1 5 2 6 3 7 |

**Set Partitioning**

• The Set Partition Problem
• Input: X = {x1, x2, x3, … xn}
• Output: Partition of X into Y and Z such that the elements of Y and the elements of Z have the same sum
• This problem is NP-Complete
• Suppose we have X = {3, 9, 6, 5, 1}
• Can we recast the problem as a double digest problem?

**Reduction from Set Partition**

• X = {3, 9, 6, 5, 1}

[Figure: three rows 1 3 5 6 9 / 12 12 / 1 3 5 6 9, rearranged as 9 3 5 6 1 / 12 12 / 9 3 5 6 1, where 9 + 3 = 5 + 6 + 1 = 12]

**NP-Completeness**

• Suppose we could solve the Double Digest Problem in polynomial time…
• Instance of Set Partition → Convert* → Instance of Double Digest → Solve in polynomial time
• *polynomial time conversion

**Hybridization Mapping**

• The sequence of the clones remains unknown
• The relative order of the probes is identified
• The sequence of the probes is known in advance

[Figure: clones with probes w, x, y, z]

**Interval graph representation**

[Figure: intervals a, b, c, d, e and the corresponding interval graph]

• Becomes a graph coloring problem, which is (you guessed it) NP-Complete

**Simplifying Assumptions**

• Probes are unique
• Hybridize only once along the target DNA
• There are no errors
• Every probe hybridizes at every possible position on every possible clone

**Consecutive Ones Problem (C1P)**

• Rearrange the columns such that all the ones in every row are together

[Figure: 0/1 matrix with clones as rows and probes as columns]

**An algorithm for C1P**

• Separate the rows (clones) into components
• Permute the components
• Merge the permuted components
• S1 = {1, 2, 4, 5, 7, 9}, S2 = {2, 3, 4, 5, 6, 7, 8, 9}

**Partitioning clones into components**

• Component graph Gc
• Nodes correspond to clones
• Connect l_i and l_j iff: [formula missing in source]

**Component Graph**

[Figure: component graph on rows l1 to l8, with connected components α, β, γ, δ]

• Here, connected components are labeled with Greek letters.

**Assembling a component**

• Consider only row 1 of the following: [Figure: rows l1, l2, l3]
• Placing all of the ones together, we can place columns 2, 7, and 8 in any order

| Row | | {2, 7, 8} | {2, 7, 8} | {2, 7, 8} | |
|---|---|---|---|---|---|
| l1 | 0 | 1 | 1 | 1 | 0 |

**Row 2**

• Because of the way we have constructed the component, l2 will have some columns with 1's where l1 has 1's, and some where l1 does not.
• Shall we place the new 1's to the right or left?
• Doesn't matter because the reverse permutation is the same answer.

**Adding row 2**

• S1 = {2, 7, 8}, S2 = {2, 5, 7}
• Placing column 5 to the left partially resolves the {2, 7, 8} columns

| Row | | {5} | {2, 7} | {2, 7} | {8} | |
|---|---|---|---|---|---|---|
| l1 | 0 | 0 | 1 | 1 | 1 | 0 |
| l2 | 0 | 1 | 1 | 1 | 0 | 0 |

**Additional Rows**

• Select a new row k from the component such that edges (i, j) and (i, k) exist for two already added rows i and j.
• Look at the relationship between i and k, and between i and j to determine if k goes on the same side or the opposite side of j as i.

[Figure: rows k, i, j]

**Definitions**

• Let [formula missing in source]
• Place k on the same side as i if [formula missing in source]
• Else, place on the opposite side

[Figure: rows k, i, j]

**Placing Rows**

• Place k on the same side as i if [formula missing in source]
• Or: [formula missing in source]

[Figure: rows l1, l2, l3 in the roles of i, j, k]

• So we place l3 on the opposite side of l2 (relative to l1), which is the right.
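The component step and the side choice for l3 can be sketched in Python. The edge condition and the placement formula did not survive in the slide text, so this sketch assumes the usual choices: two rows are connected when their column sets overlap without one containing the other, and a new row goes on the same side as an already placed row (relative to a pivot row that overlaps both) when their shared column count is at least the smaller of the two counts involving the pivot. The function names, the value l3 = {1, 4, 7, 8}, and the extra row l4 are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def properly_overlap(a, b):
    """True when a and b share a column and neither contains the other (assumed edge rule)."""
    return bool(a & b) and not a <= b and not b <= a


def components(rows):
    """Connected components of the overlap graph, each as a sorted list of row names."""
    adj = {name: set() for name in rows}
    for u, v in combinations(rows, 2):
        if properly_overlap(rows[u], rows[v]):
            adj[u].add(v)
            adj[v].add(u)
    seen, result = set(), []
    for start in rows:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:  # depth-first search from `start`
            node = stack.pop()
            comp.append(node)
            for nxt in adj[node] - seen:
                seen.add(nxt)
                stack.append(nxt)
        result.append(sorted(comp))
    return result


def same_side(rows, pivot, placed, new):
    """Assumed rule: does `new` extend past `pivot` on the same side as `placed`?"""
    def shared(x, y):
        return len(rows[x] & rows[y])
    return shared(placed, new) >= min(shared(placed, pivot), shared(pivot, new))


rows = {
    "l1": {2, 7, 8},
    "l2": {2, 5, 7},
    "l3": {1, 4, 7, 8},  # assumed value for the third row
    "l4": {3, 6},        # hypothetical row, disjoint from the others
}
comps = components(rows)
l3_same_side_as_l2 = same_side(rows, pivot="l1", placed="l2", new="l3")
print(comps)
print(l3_same_side_as_l2)
```

Under these assumptions l1, l2 and l3 fall into one component, l4 forms its own, and the rule puts l3 on the opposite side of l2 relative to l1, which agrees with the placement stated on the slide.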
**Placing rows**

• Repeat for every row in the component

| Row | | {5} | {2} | {7} | {8} | {1, 4} | {1, 4} | |
|---|---|---|---|---|---|---|---|---|
| l1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 |
| l2 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
| l3 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 |

**Joining Components**

• New graph: Gm, the merge graph (directed)
• Nodes are connected components of Gc
• An edge joins two components iff every set in one is a subset of a set in the other

[Figure: rows l1, l2, l3 with components α and β]

**Constructing Gm**

[Figure: component graph on rows l1 to l8 with components α, β, γ, δ, and the resulting merge graph on α, β, δ, γ]

**Properties of Gm**

• All rows in γ will share the same disjoint/subset relationship with each row of α
• Different components: disjoint or subset
• Same component: shares a 1 column
• If that column matches in a row of α, then subset, else disjoint

[Figure: merge graph on α, β, δ, γ]

**Ordering components**

• Vertices without incoming edges: freeze their columns
• Process the rest in topological order.
• … is a singleton and a subset

[Figure: merge graph on α, β, δ, γ]

**Ordering Components (2)**

• Find the leftmost column with a 1
• In the current assembly, find the rows that contain all ones for that column
• Merge the columns

**Ordering Components (3)**
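The merge-graph step can be sketched in Python as follows. The edge direction is an assumption: the sketch draws an edge from the containing component to the contained one, which fits the remark that vertices without incoming edges are frozen first. The function name `merge_graph`, the component names, and the sets for `beta` and `gamma` are hypothetical. Identical rows are assumed to have been removed beforehand, since two equal singleton components would otherwise point at each other.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def merge_graph(components):
    """Map each component to the set of components that contain it.

    Assumed rule: `outer` points to `inner` when every set in `inner`
    is a subset of some set in `outer`.
    """
    preds = {name: set() for name in components}
    for outer, outer_sets in components.items():
        for inner, inner_sets in components.items():
            if outer == inner:
                continue
            if all(any(s <= t for t in outer_sets) for s in inner_sets):
                preds[inner].add(outer)
    return preds


components = {
    "alpha": [{2, 7, 8}, {2, 5, 7}, {1, 4, 7, 8}],  # rows l1, l2, l3 (l3 assumed)
    "beta": [{7}],                                   # hypothetical singleton inside alpha
    "gamma": [{3, 6}],                               # hypothetical, disjoint from alpha
}
preds = merge_graph(components)

# Components with no incoming edge are the ones whose columns get frozen first.
roots = [name for name, incoming in preds.items() if not incoming]

# TopologicalSorter takes a node -> predecessors mapping and yields
# every component after all components that contain it.
order = list(TopologicalSorter(preds).static_order())
print(roots)
print(order)
```

In this toy input `alpha` and `gamma` have no incoming edges, and `beta` is processed only after `alpha`, so its single column can be merged into an assembly that already exists.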