Constructing evolutionary trees from rooted triples

Bang Ye Wu, Dept. of Computer Science and Information Engineering, Shu-Te University

Presentation Transcript


  1. Constructing evolutionary trees from rooted triples Bang Ye Wu Dept. of Computer Science and Information Engineering Shu-Te University

  2. An evolutionary tree • A rooted tree. • Each leaf represents one species. • Internal nodes are unlabelled (inferred common ancestors). [Figure: an example tree with leaves a, b, c, d, e, f]

  3. A (rooted) triple (triplet) • An evolutionary tree of 3 species. • A constraint in an evolutionary tree construction problem. • (c(ab)): lca(b,c) = lca(c,a) is an ancestor of lca(a,b), where lca denotes the lowest common ancestor. • That is, a and b should be closer to each other than a and c or b and c. [Figure: a triple with leaves a, b, c]
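The triple condition on slide 3 can be checked mechanically. The sketch below is only an illustration: it assumes a tree is written as nested Python tuples with strings as leaves, and the names `leaves` and `satisfies` are hypothetical, not taken from the presentation. It uses the equivalent reading that (x(yz)) holds when some subtree contains y and z but not x.

```python
def leaves(tree):
    """Return the set of leaf labels below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def satisfies(tree, x, y, z):
    """Check the triple (x(yz)), assuming x, y, z are distinct leaves of tree."""
    if isinstance(tree, str):
        return False
    for child in tree:
        below = leaves(child)
        if y in below and z in below:
            # y and z stay together in this child: the triple holds if x
            # is already separated, otherwise look deeper.
            return x not in below or satisfies(child, x, y, z)
    # y and z are separated at this node while x is still below it.
    return False


# Hypothetical example tree over the species a..f.
example = ((("a", "b"), "c"), ("d", ("e", "f")))
print(satisfies(example, "c", "a", "b"))
print(satisfies(example, "a", "b", "c"))
```

Recomputing leaf sets at every level keeps the sketch short; a real implementation would precompute them or use lca queries.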

  4. A tree compatible with triples • Given a set of triples, construct a tree satisfying all the triples. • Deciding whether such a tree exists, and constructing one if it does, is solvable in polynomial time [Aho et al., 1981].
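The polynomial-time result cited on slide 4 is commonly presented as a recursive partitioning procedure (often called BUILD). The following is a minimal sketch of that idea, not the slides' own code: triples are assumed to be tuples `(x, y, z)` meaning (x(yz)), trees are nested tuples, and the function name `build` is a label chosen here.

```python
def build(species, triples):
    """Return a tree satisfying all triples over species, or None if none exists."""
    species = set(species)
    if len(species) == 1:
        return next(iter(species))

    # Union-find over the current species.
    parent = {s: s for s in species}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    # Only triples whose three species are all present still matter.
    relevant = [t for t in triples if set(t) <= species]
    for _x, y, z in relevant:
        parent[find(y)] = find(z)  # y and z must stay on the same side

    groups = {}
    for s in species:
        groups.setdefault(find(s), set()).add(s)
    if len(groups) == 1:
        return None  # no valid split at the root: the triples conflict

    children = []
    for group in groups.values():
        subtree = build(group, relevant)
        if subtree is None:
            return None
        children.append(subtree)
    return tuple(children)


print(build("abc", [("c", "a", "b")]))
print(build("abc", [("c", "a", "b"), ("a", "b", "c")]))
```

The order of children in the returned tuple depends on set iteration order, so only the shape of the tree is meaningful.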

  5. Incompatible (conflicting) triples • Two conflicting triples. • Three conflicting triples (pairwise compatible).

  6. Two optimization problems • The maximum consensus tree: the tree satisfying the maximum number of triples. • NP-hard [Jansson, 2001][Wu, to appear] • A new heuristic algorithm [this paper] • The maximum compatible set: the compatible species subset of maximum cardinality. • NP-hard [this paper]

  7. Previous heuristic: Best-One-Split-First • If a species x is split from a set V, all triples (x(v1v2)) with v1 and v2 in V will be satisfied. • Repeatedly split one species from the set, choosing the split species greedily.

  8. [Figure: Best-One-Split-First example on species a, b, c, d. c is chosen and split first, leaving {a,b,d}, and the two triples are satisfied; b is split next, leaving {a,d}.]
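The splitting procedure of slides 7 and 8 might be sketched as follows. This is an illustration under assumptions: the greedy criterion is taken to be the number of triples (x(v1v2)) that splitting x satisfies, ties are broken alphabetically, and the input triples are hypothetical (the slide's own triples are not recoverable from the transcript).

```python
def best_one_split_first(species, triples):
    """Greedy top-down heuristic: split off one species at a time."""
    remaining = set(species)
    order = []
    while len(remaining) > 1:
        def gain(x):
            # Triples (x(yz)) with y and z still in the set become satisfied.
            return sum(1 for t, y, z in triples
                       if t == x and y in remaining and z in remaining)

        best = max(sorted(remaining), key=gain)  # first maximum wins ties
        order.append(best)
        remaining.remove(best)

    # Rebuild the caterpillar tree from the last split back to the first.
    tree = remaining.pop()
    for x in reversed(order):
        tree = (x, tree)
    return tree


# Hypothetical input: (c(ab)), (c(bd)), (b(ad)).
triples = [("c", "a", "b"), ("c", "b", "d"), ("b", "a", "d")]
print(best_one_split_first("abcd", triples))
```

With this input c is split first and b second, which mirrors the order shown in the figure.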

  9. Previous heuristic: Min-Cut-Split-First • Construct an auxiliary graph: • Vertex: species. • Each edge is labeled by a set: for each triple (x(yz)), x is in the label set of edge (y,z).

  10. • A bipartition corresponds to a split in the tree. • The labels in the cut of the bipartition correspond to the triples conflicting with the split. • Repeatedly find the bipartition with minimum cut. [Figure: a min-cut; triple (c(bd)) is conflicting]
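The auxiliary graph and its cut cost can be illustrated with a small sketch. It is not the heuristic's actual min-cut routine: for brevity it enumerates every bipartition by brute force, which is only practical for a handful of species. The function names and the triple encoding `(x, y, z)` for (x(yz)) are assumptions made here.

```python
from itertools import combinations


def build_labels(species, triples):
    """Map each edge {y, z} to the set of x such that (x(yz)) is a given triple."""
    present = set(species)
    labels = {}
    for x, y, z in triples:
        if {x, y, z} <= present:
            labels.setdefault(frozenset((y, z)), set()).add(x)
    return labels


def cut_conflicts(side_a, side_b, labels):
    """Count labels on edges crossing the bipartition (triples it violates)."""
    return sum(len(xs) for edge, xs in labels.items()
               if len(edge & side_a) == 1 and len(edge & side_b) == 1)


def min_cut_bipartition(species, triples):
    """Brute-force the bipartition with the fewest conflicting triples."""
    ordered = sorted(species)
    labels = build_labels(ordered, triples)
    first, rest = ordered[0], ordered[1:]
    best = None
    for size in range(len(rest)):  # side_b always keeps at least one species
        for extra in combinations(rest, size):
            side_a = frozenset((first,) + extra)
            side_b = frozenset(ordered) - side_a
            cost = cut_conflicts(side_a, side_b, labels)
            if best is None or cost < best[0]:
                best = (cost, side_a, side_b)
    return best


print(min_cut_bipartition("abc", [("c", "a", "b")]))
```

Fixing the first species on one side avoids visiting each bipartition twice.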

  11. Previous heuristic: Best-Pair-Merge-First • Instead of top-down splitting, BPMF uses a bottom-up merging strategy. • Starting from singleton sets, we repeatedly merge pairs of sets step by step. • Scoring functions are used to evaluate which pair should be merged in each step.

  12. [Figure: Best-Pair-Merge-First example. The sets evolve as {a}, {b}, {c}, {d} → {a,d}, {b}, {c} → {a,d}, {b,c} → {a,d,b,c}.]
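A bottom-up merge loop in the spirit of slides 11 and 12 could look like the sketch below. The slides do not state the scoring functions, so `merge_score` here is purely an assumption: it adds one for each triple a merge would satisfy and subtracts one for each triple it would violate. The input triples are likewise hypothetical.

```python
from itertools import combinations


def merge_score(a, b, triples):
    """Assumed score: satisfied minus violated triples if sets a and b are merged."""
    union = a | b
    score = 0
    for x, y, z in triples:
        separated = (y in a and z in b) or (y in b and z in a)
        if separated:
            # Merging makes the new node lca(y, z); the triple holds only
            # if x lies outside the merged set.
            score += 1 if x not in union else -1
    return score


def best_pair_merge_first(species, triples):
    """Repeatedly merge the best-scoring pair of clusters into one subtree."""
    clusters = [(frozenset([s]), s) for s in sorted(species)]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: merge_score(clusters[p[0]][0], clusters[p[1]][0], triples),
        )
        merged = (clusters[i][0] | clusters[j][0], (clusters[i][1], clusters[j][1]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][1]


# Hypothetical input: (b(ad)), (c(ad)), (a(bc)).
triples = [("b", "a", "d"), ("c", "a", "d"), ("a", "b", "c")]
print(best_pair_merge_first("abcd", triples))
```

On this input the merges happen in the same order as in the figure: {a,d} first, then {b,c}, then everything.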

  13. An exact algorithm for MCTT • Dynamic programming. • F(V) = max{F(V1) + F(V2) + W(V1,V2)}, taken over all bipartitions (V1,V2) of V. • F(V): the number of satisfied triples over V. • W(V1,V2): the number of triples (x(v1v2)) with x not in V, v1 in V1, and v2 in V2. • Computed in order of increasing cardinality.
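The recurrence on slide 13 can be run directly on small inputs. The sketch below is one possible reading, not the paper's implementation: species are numbered 0..n-1, subsets are bitmasks, a triple (x(yz)) is a tuple `(x, y, z)`, and F(V) is interpreted as counting triples whose pair y, z lies inside V while x lies outside it. All function names are assumptions.

```python
def has(mask, i):
    """True if species i belongs to the subset encoded by mask."""
    return bool((mask >> i) & 1)


def crossing_weight(v1, v2, triples):
    """W(V1, V2): triples (x(yz)) with x outside V1|V2 and y, z on opposite sides."""
    v = v1 | v2
    count = 0
    for x, y, z in triples:
        separated = (has(v1, y) and has(v2, z)) or (has(v1, z) and has(v2, y))
        if separated and not has(v, x):
            count += 1
    return count


def max_satisfied_triples(n, triples):
    """Exact F(S) for the full species set S = {0, ..., n-1}."""
    full = (1 << n) - 1
    f = {}
    # Process subsets from small to large cardinality.
    for mask in sorted(range(1, full + 1), key=lambda m: bin(m).count("1")):
        if mask & (mask - 1) == 0:
            f[mask] = 0  # a single species satisfies nothing by itself
            continue
        lowest = mask & -mask
        best = 0
        sub = (mask - 1) & mask
        while sub:
            if sub & lowest:  # visit each bipartition only once
                other = mask ^ sub
                best = max(best, f[sub] + f[other] + crossing_weight(sub, other, triples))
            sub = (sub - 1) & mask
        f[mask] = best
    return f[full]


# (c(ab)) and (a(bc)) conflict, so at most one of them can be satisfied.
print(max_satisfied_triples(3, [(2, 0, 1), (0, 1, 2)]))  # → 1
```

Enumerating all bipartitions of all subsets takes exponential time, which is the motivation for limiting the number of subsets kept.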

  14. Our new heuristic algorithm (DPWP) • Derived from the exact algorithm. • The number of subsets of each cardinality is limited by a parameter K. • When K = infinity, it is just the exact algorithm. • Time-quality trade-off. • The time complexity is O(n²K²(n³+K)). • Sorry, there is a mistake in the paper.
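The slide gives only the outline of DPWP, so the following is a loose sketch of the general idea rather than the published algorithm. It assumes that, for each cardinality, candidate subsets are formed as unions of two disjoint kept subsets and that only the K best-scoring candidates are retained; all n singletons are always kept. The names `dp_with_pruning` and `crossing_weight` are hypothetical.

```python
def crossing_weight(v1, v2, triples):
    """Triples (x(yz)) with x outside v1|v2 and y, z on opposite sides (bitmasks)."""
    v = v1 | v2
    count = 0
    for x, y, z in triples:
        y1, z1 = bool((v1 >> y) & 1), bool((v1 >> z) & 1)
        y2, z2 = bool((v2 >> y) & 1), bool((v2 >> z) & 1)
        if ((y1 and z2) or (z1 and y2)) and not ((v >> x) & 1):
            count += 1
    return count


def dp_with_pruning(n, triples, k):
    """Beam-limited version of the recurrence: keep at most k subsets per size."""
    levels = {1: {1 << i: 0 for i in range(n)}}
    for size in range(2, n + 1):
        candidates = {}
        for s1 in range(1, size // 2 + 1):
            for m1, f1 in levels[s1].items():
                for m2, f2 in levels[size - s1].items():
                    if m1 & m2:
                        continue  # the two parts must be disjoint
                    score = f1 + f2 + crossing_weight(m1, m2, triples)
                    merged = m1 | m2
                    if score > candidates.get(merged, -1):
                        candidates[merged] = score
        ranked = sorted(candidates.items(), key=lambda kv: -kv[1])
        levels[size] = dict(ranked[:k])  # pruning step controlled by k
    return levels[n][(1 << n) - 1]


triples = [(2, 0, 1), (0, 2, 3)]
print(dp_with_pruning(4, triples, k=100))
print(dp_with_pruning(4, triples, k=1))
```

With a large k nothing is pruned on this tiny input and the result matches the exact optimum; a small k trades solution quality for time.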

  15. The experimental results (time)

  16. The MCST problem • Given triples over a species set S, find a subset U of S such that all given triples over U are compatible and |U| is maximum. • We show the problem is NP-hard. • The reduction is from the Feedback Vertex Set problem.

  17. The feedback vertex set problem • Feedback vertex set: a vertex subset containing at least one vertex of each cycle of the given directed graph. • In other words, removing a feedback vertex set results in an acyclic digraph.
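The second characterization on slide 17 gives a direct test: remove the candidate set and check that the remaining digraph is acyclic. A minimal sketch, assuming the graph is given as a vertex list and a list of directed edges `(u, v)`; the function name is chosen here for illustration.

```python
def is_feedback_vertex_set(vertices, edges, removed):
    """True if deleting `removed` leaves the directed graph without cycles."""
    kept = set(vertices) - set(removed)
    indegree = {v: 0 for v in kept}
    successors = {v: [] for v in kept}
    for u, v in edges:
        if u in kept and v in kept:
            successors[u].append(v)
            indegree[v] += 1

    # Kahn's algorithm: repeatedly peel off vertices with no incoming edge.
    ready = [v for v in kept if indegree[v] == 0]
    peeled = 0
    while ready:
        u = ready.pop()
        peeled += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    # Every vertex is peeled exactly when no cycle remains.
    return peeled == len(kept)


cycle = [(1, 2), (2, 3), (3, 1)]
print(is_feedback_vertex_set([1, 2, 3], cycle, {2}))    # → True
print(is_feedback_vertex_set([1, 2, 3], cycle, set()))  # → False
```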

  18. The reduction

  19. Concluding remarks • What is the approximation ratio? • The Best-One-Split-First algorithm is a 3-approximation algorithm. • A larger K gives better solutions, but we do not know a theoretical bound on the ratio.