Explore biological, practical, and computational issues in phylogenetic reconstruction, from data gathering to assessment methods. Understand how practical constraints translate to computational challenges and impact reconstruction techniques.
The Big Issues in Phylogenetic Reconstruction
Randy Linder
Integrative Biology, University of Texas
rlinder@mail.utexas.edu
Overview for Talk • Biological issues • Practical • Time constraints • Material constraints • Monetary constraints • Information constraints • Theoretical • Models: Processes of evolution • How does information move through the system • How do nucleotides evolve • How do indels evolve • Rearrangements and duplications • Alignment • Reconstruction • Comparative methodologies
Overview for Talk (cont.) • Computational issues • Theoretical • Graph theory • Algorithmic • Heuristics • Practical • Performance • Running times • Accuracy of results • “Efficiency”: accuracy of results for a given amount of data • Will not address these directly, but they will come up at various points
General Overview of a Systematist’s Work • Determine the scope of the work • Anything from kingdoms to single genera • Seek funding • Plan and travel to get materials • Costs • Politics • Time
General Overview of a Systematist’s Work (cont.) • Extract DNA (in most cases) • Sometimes easy (most animals, microbes) • Sometimes not (many plants, fungi) • Determine which DNA regions to use • Ones previously determined by other studies • Develop new ones • Amplify (and clone) regions • See if they have appropriate variation • Do so in all available specimens
General Overview of a Systematist’s Work (cont.) • Align sequences • Use available algorithms (most often some flavor of Clustal) • Hand align where algorithms are too stupid • If using model-based reconstruction method, determine model to use • Reconstruct relationships • Choose method(s) of reconstruction • Time • Computational feasibility • Quality of result • Decide if datasets can be combined
General Overview of a Systematist’s Work (cont.) • Assess quality of reconstruction • Bootstrap • Non-parametric (any method, especially MP) • Parametric (ML) • Posterior probabilities (Bayesian) • Perform comparative analyses (often) • Examples: • Assess character evolution • Test for patterns consistent with • Types of speciation • Biogeographic hypotheses • Adaptive hypotheses
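The non-parametric bootstrap listed above resamples alignment columns with replacement, re-runs the reconstruction on each pseudoreplicate, and reports how often a grouping recurs. The following is a minimal sketch of the resampling and tallying steps only; the tree-building step is left out, and the names `bootstrap_alignment` and `clade_support`, as well as the dictionary-of-strings alignment format, are assumptions made for illustration rather than anything prescribed in the talk.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one pseudoreplicate: columns drawn with replacement.

    `alignment` maps taxon name -> aligned sequence (all the same length).
    The same column indices are used for every taxon, so each resampled
    column is an intact column of the original alignment.
    """
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {
        taxon: "".join(seq[i] for i in columns)
        for taxon, seq in alignment.items()
    }


def clade_support(replicate_clades, clade):
    """Fraction of replicate trees that contain `clade`.

    `replicate_clades` is a list with one set of clades per replicate tree;
    each clade is a frozenset of taxon names (hypothetical representation).
    """
    hits = sum(1 for clades in replicate_clades if clade in clades)
    return hits / len(replicate_clades)
```

In practice each pseudoreplicate would be passed to whichever reconstruction method was used for the original data, and the clades of the resulting trees would be collected before calling `clade_support`.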
Biological Issues: Practical • Time constraints • Gathering of material for study • Finding adequate regions for analysis • Cloning (sometimes) and sequencing of information • Sometimes gathering of morphological information • Run time of phylogenetic analyses • Run time of assessments of support • Time to conduct comparative analyses
Biological Issues: Practical • Material constraints • Getting all of the taxa desired • Monetary constraints • Travel costs • Cloning and sequencing costs • Information constraints • What regions will provide the right amount of variation for my group? • What is the best sampling strategy for my group? • Do my regions for analysis avoid gene tree/species tree problems? • Does my group have any reticulation in it and how will I know?
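One rough way to approach the question of whether a region provides the right amount of variation is to count parsimony-informative sites in a pilot alignment: columns in which at least two states each occur in at least two taxa. The sketch below assumes an alignment stored as a dictionary of equal-length DNA strings and ignores gaps and ambiguity codes; the function name is hypothetical, and the count is only one of several possible measures of variation.

```python
from collections import Counter


def parsimony_informative_sites(alignment):
    """Count columns with >= 2 states that each appear in >= 2 taxa.

    `alignment` maps taxon name -> aligned DNA sequence (equal lengths).
    Characters other than A, C, G, T (gaps, ambiguity codes) are ignored.
    """
    sequences = list(alignment.values())
    count = 0
    for column in zip(*sequences):
        states = Counter(c for c in column if c in "ACGT")
        shared_states = sum(1 for n in states.values() if n >= 2)
        if shared_states >= 2:
            count += 1
    return count
```

A region with very few such sites is unlikely to resolve relationships under parsimony, whereas a very high proportion may signal saturation or alignment problems.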
How Do Practical Issues Translate to Computational Issues? • Prior to lab work • Not much, really • Maybe with design of taxon sampling strategies, because getting samples is often a major constraint • During lab work • Help in locating new regions for analysis • Help in assessing/developing regions that are free of gene tree/species tree problems
How Do Practical Issues Translate to Computational Issues? • Post lab work • Improved alignment methods • Better methods, independent of reconstruction • Methods that handle indels better • Sequence length differences • Non-repeat indels • Repeat indels • Assessment of quality of alignment in different regions • Alignment methods that simultaneously infer phylogeny • Need alignment methods that explicitly attempt to infer positional homology according to some optimality criterion
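To make "according to some optimality criterion" concrete, the classic pairwise case is global alignment by dynamic programming (Needleman-Wunsch), where the criterion is a sum of match, mismatch, and gap scores. The sketch below computes only the optimal score. The scoring values and the linear gap penalty are illustrative assumptions, and this simple scheme treats every indel position independently, which is one of the limitations the slide points to.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of sequences a and b.

    Uses a linear gap penalty and keeps only two rows of the
    dynamic-programming table. Scoring parameters are illustrative.
    """
    # Row 0: aligning an empty prefix of `a` against prefixes of `b`.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix of `a` against an empty prefix of `b`
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(
                prev[j - 1] + substitution,  # align a[i-1] with b[j-1]
                prev[j] + gap,               # gap in b
                curr[j - 1] + gap,           # gap in a
            ))
        prev = curr
    return prev[-1]
```

For example, `global_alignment_score("ACGT", "ACT")` evaluates to 1 under these defaults: three matches and one gap.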
How Do Practical Issues Translate to Computational Issues? • Post lab work • Reconstruction methods • Better models of sequence evolution • Relaxing of the rates-across-sites assumption • Models that make use of information in indels • Methods that assess reasons for phylogenetic incongruence and support for different explanations • Reticulation above the species level • Reticulation below the species level • Lineage sorting: alleles, gene duplications
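As a baseline for what "better models of sequence evolution" would improve on, the simplest nucleotide model (Jukes-Cantor, JC69) assumes equal base frequencies, equal substitution rates, and a single rate across all sites. It corrects the observed proportion of differing sites p to an estimated distance d = -(3/4) ln(1 - 4p/3). The sketch below is illustrative only; the function names are hypothetical, and columns containing gaps or ambiguity codes are simply skipped, so indel information is discarded.

```python
import math


def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned DNA sequences.

    Columns where either sequence has a non-ACGT character are skipped.
    """
    pairs = [(x, y) for x, y in zip(seq1, seq2)
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jc69_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p.

    Returns infinity when p >= 0.75, where the correction is undefined
    (the sequences look saturated under this model).
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

The corrected distance always meets or exceeds p, and the gap widens as sequences diverge, which is why the choice of model matters more for deeper divergences.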
How Do Practical Issues Translate to Computational Issues? • Post lab work • Reconstruction methods • Methods that statistically assess the tree-space of optimal and nearly optimal solutions to reconstructions • Better supertree methods • Support methods • Develop methods that assess support for alternative explanations: • Tree, network, gene tree/species tree, etc.
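Any assessment of a set of optimal and nearly optimal trees needs some way to measure how different two trees are. One common choice, offered here only as an illustration and not as the method the talk has in mind, is a Robinson-Foulds-style distance: the number of groupings found in one tree but not the other. The sketch below uses the rooted (clade-based) variant and assumes trees are written as nested tuples of taxon names, which is a hypothetical representation chosen for brevity.

```python
def clades(tree):
    """Return (leaf set, set of non-trivial clades) for a rooted tree.

    `tree` is a nested tuple of taxon-name strings, e.g.
    (("A", "B"), ("C", "D")). The clade containing all leaves is excluded.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    all_leaves = walk(tree)
    found.discard(all_leaves)
    return all_leaves, found


def rf_distance(tree1, tree2):
    """Count clades present in exactly one of two rooted trees."""
    leaves1, clades1 = clades(tree1)
    leaves2, clades2 = clades(tree2)
    if leaves1 != leaves2:
        raise ValueError("trees must have the same taxa")
    return len(clades1 ^ clades2)
```

Pairwise distances like this can be used to summarize how tightly a collection of near-optimal trees clusters, though the measure is coarse: a single misplaced taxon can change many clades at once.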