Phylogenetic Inference

Data

Optimality Criteria

Algorithms

Results

Practicalities

Chuck Staben


Our Goals

  • Infer Phylogeny

    • Optimality criteria

    • Algorithm

  • Phylogenetic inference

    • (interesting ones)

Watch Out

“The danger of generating incorrect results is inherently greater in computational phylogenetics than in many other fields of science.”

“…the limiting factor in phylogenetic analysis is not so much in the facility of software application as in the conceptual understanding of what the software is doing with the data.”

Phylogenetic Models

  • No transfer of genetic information by hybridization

  • All sequences are homologous

  • Each position in alignment homologous

  • Observed variation is valid sample from included group

  • Positions evolve independently

Steps in Analysis

  • Data Model (Alignment)

    • alignment method

    • “trimming” to a phylogenetic set

  • DNA base substitution model

  • Build Trees

    • Algorithm based vs Criterion based

    • Distance based vs Character-based
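
The "trimming" step can be illustrated with a minimal sketch that drops alignment columns containing too many gaps. It assumes the alignment is a dict of equal-length strings with "-" as the gap character; `trim_alignment` and `max_gap_fraction` are hypothetical names for illustration, not part of any package named in these slides.

```python
def trim_alignment(alignment, max_gap_fraction=0.0):
    """Drop columns whose fraction of gap characters exceeds max_gap_fraction."""
    names = list(alignment)
    seqs = [alignment[n] for n in names]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column only if its gap fraction is within the allowed limit.
    keep = [
        i for i in range(length)
        if sum(s[i] == "-" for s in seqs) / len(seqs) <= max_gap_fraction
    ]
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


example = {"taxonA": "ATG-CA", "taxonB": "ATGTCA", "taxonC": "A-GTCA"}
trimmed = trim_alignment(example)
print(trimmed)
```

With the default limit of 0.0, any column containing a gap is removed; a looser limit keeps more characters at the cost of more missing data.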

Choice of Input Data

  • Data Type

    • Aligned sequences, RFLP, morphological data…

  • Molecule of interest

    • rRNA (general purpose)

    • interesting character

  • Number/type of taxa

    • ingroup and outgroup

Informative
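
One common reading of "informative" is the parsimony-informative site: a column with at least two character states, each present in at least two taxa. The sketch below counts such columns under that assumption; the function name and the set of ignored symbols are illustrative choices, not taken from the slides.

```python
from collections import Counter


def informative_sites(seqs, ignore="-?N"):
    """Return indices of columns with >= 2 states that each occur in >= 2 taxa."""
    sites = []
    for i, column in enumerate(zip(*seqs)):
        counts = Counter(c for c in column if c not in ignore)
        if sum(1 for n in counts.values() if n >= 2) >= 2:
            sites.append(i)
    return sites


aligned = ["AAGTC", "AAGCC", "ATACC", "ATATC"]
print(informative_sites(aligned))
```

Constant columns and columns where the variant appears in only one taxon are skipped, since they cannot favor one tree over another under parsimony.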

rRNA Genes

  • Conserved across kingdoms

  • Varies within species

  • Widely sequenced, easy

  • Long, lots of characters

Duplication?

Multiple Alignment Method

  • Computer dependence

  • Phylogenetic Assumptions

  • Alignment parameters

    • (substitution matrix, gap cost)

  • Aligned features

    • primary sequence, structure

  • Optimization

    • statistical, non-statistical
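
To show how alignment parameters enter the result, here is a minimal sketch of a global (Needleman-Wunsch style) alignment score for two sequences. It assumes a simple match/mismatch scheme in place of a full substitution matrix and a linear gap cost; the default values are arbitrary illustrations.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of a and b by dynamic programming."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))
print(global_alignment_score("ACGT", "AGT", gap=-5))
```

The same pair of sequences scores differently as the gap cost changes, which is why the parameters need to be reported along with the alignment.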

Typical Alignment Method

  • CLUSTAL, then manual editing

    • Manual editing for phylogeny

    • phylogenetic assumption in guide tree

    • parameters a priori and dynamic

    • primary structure (with some “influence”)

    • optimization non-statistical

Substitution Models

  • G to A, C to T versus N to N

  • amino acid substitution

  • forwards and backwards identical?

  • site-to-site variation

Simpler model better

Estimate from "quick" tree building, observed variation
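
The difference between treating all changes alike and separating transitions (G to A, C to T) from transversions can be sketched with two standard distance corrections: Jukes-Cantor, which uses one rate for all changes, and Kimura two-parameter, which uses separate rates. The helper names below are illustrative, and the sketch assumes two aligned DNA strings in upper case.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def site_differences(s1, s2):
    """Return (transition fraction P, transversion fraction Q) over ungapped sites."""
    pairs = [(a, b) for a, b in zip(s1, s2) if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    ts = sum(
        1 for a, b in pairs
        if a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)
    )
    tv = sum(1 for a, b in pairs if a != b) - ts
    return ts / len(pairs), tv / len(pairs)


def jc69_distance(s1, s2):
    """Jukes-Cantor distance; math.log raises ValueError if sequences are saturated."""
    P, Q = site_differences(s1, s2)
    return -0.75 * math.log(1 - 4 * (P + Q) / 3)


def k2p_distance(s1, s2):
    """Kimura two-parameter distance with separate transition/transversion terms."""
    P, Q = site_differences(s1, s2)
    return -0.5 * math.log((1 - 2 * P - Q) * math.sqrt(1 - 2 * Q))


x, y = "AAAAAAAAAA", "GAAAAAAAAC"
print(jc69_distance(x, y), k2p_distance(x, y))
```

Both corrections give a distance larger than the raw proportion of differing sites, because they allow for multiple changes at the same position.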

Tree-Building Methods

  • Distance

    • UPGMA, NJ, FM, ME

  • Character

    • Maximum Parsimony (PAUP)

    • Maximum Likelihood (PHYLIP)

Acrimonious Debates

Distance Methods

  • Measure distance (dissimilarity)

  • Accurate if distances are all additive (UPGMA also requires ultrametric, i.e. clock-like, distances)

    • NEVER true over large distance

  • Methods

    • UPGMA (Unweighted Pair Group Method with Arithmetic Mean)

    • NJ (Neighbor Joining)

    • FM (Fitch-Margoliash)

    • ME (Minimum Evolution)

Most Often Wrong!

CLUSTAL
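
Whether a distance matrix meets these conditions can be checked directly. The sketch below tests the three-point (ultrametric) and four-point (additive) conditions on a symmetric matrix given as a list of lists; the two example matrices are made up for illustration.

```python
from itertools import combinations


def is_ultrametric(d, tol=1e-9):
    """Three-point condition: in every triple the two largest distances are equal."""
    for i, j, k in combinations(range(len(d)), 3):
        a, b, c = sorted((d[i][j], d[i][k], d[j][k]))
        if abs(c - b) > tol:
            return False
    return True


def is_additive(d, tol=1e-9):
    """Four-point condition: of the three pairwise sums, the two largest are equal."""
    for i, j, k, l in combinations(range(len(d)), 4):
        sums = sorted((d[i][j] + d[k][l], d[i][k] + d[j][l], d[i][l] + d[j][k]))
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Distances from a clock-like tree, then from a tree with unequal rates.
clock_like = [[0, 2, 8, 8], [2, 0, 8, 8], [8, 8, 0, 4], [8, 8, 4, 0]]
non_clock = [[0, 4, 5, 8], [4, 0, 7, 10], [5, 7, 0, 7], [8, 10, 7, 0]]
print(is_ultrametric(clock_like), is_additive(clock_like))
print(is_ultrametric(non_clock), is_additive(non_clock))
```

The second matrix fits a tree exactly but is not clock-like, which is the situation where a method that assumes a clock can go wrong.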

Which Distance Method?

  • UPGMA

    • Least accurate, most used

  • NJ

    • EXTREMELY RAPID

    • GIVES ONLY 1 TREE

  • ME and FM seem best

    • Minimize tree path lengths
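
As an illustration of why UPGMA is ranked least accurate, here is a minimal sketch of the clustering itself. It assumes a symmetric distance matrix as a list of lists and returns the topology as nested tuples without branch lengths; the example distances are invented to fit a tree in which A and B are sisters but B and D sit on long branches.

```python
def upgma(names, d):
    """Average-linkage clustering; returns the topology as nested tuples."""
    clusters = {i: (names[i], 1) for i in range(len(names))}  # id -> (subtree, size)
    dist = {
        (i, j): d[i][j]
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    next_id = len(names)
    while len(clusters) > 1:
        i, j = min(dist, key=dist.get)  # closest pair of clusters
        (ti, ni), (tj, nj) = clusters.pop(i), clusters.pop(j)
        new = {}
        for k in clusters:
            dik = dist[(min(i, k), max(i, k))]
            djk = dist[(min(j, k), max(j, k))]
            # Size-weighted mean keeps every original taxon equally weighted.
            new[(k, next_id)] = (ni * dik + nj * djk) / (ni + nj)
        dist = {p: v for p, v in dist.items() if i not in p and j not in p}
        dist.update(new)
        clusters[next_id] = ((ti, tj), ni + nj)
        next_id += 1
    return next(iter(clusters.values()))[0]


names = ["A", "B", "C", "D"]
long_branches = [[0, 6, 3, 7], [6, 0, 7, 11], [3, 7, 0, 6], [7, 11, 6, 0]]
print(upgma(names, long_branches))
```

On these distances UPGMA joins A with C first, because they are the closest pair, even though the distances fit a tree that pairs A with B. Unequal rates are enough to mislead it.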

Character Methods

  • Maximum Parsimony

    • minimal changes to produce data

    • can use different substitution models

  • Maximum Likelihood

    • turns problem “inside out”

      • coin flip analogy

    • increasingly popular
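
"Minimal changes to produce data" can be made concrete with Fitch's counting procedure for a fixed tree. This is a sketch that assumes a rooted binary tree written as nested tuples of leaf names and equal cost for every change; it scores a given tree and does not search for the best one.

```python
def fitch_score(tree, states):
    """Return (possible ancestral states, number of changes) for one site."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], states)
    right_set, right_cost = fitch_score(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: at least one change is needed at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, alignment):
    """Sum the minimum number of changes over all sites of the alignment."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch_score(tree, {name: seq[i] for name, seq in alignment.items()})[1]
        for i in range(n_sites)
    )


alignment = {"A": "AAG", "B": "AAG", "C": "ATA", "D": "ATA"}
print(parsimony_length((("A", "B"), ("C", "D")), alignment))
print(parsimony_length((("A", "C"), ("B", "D")), alignment))
```

The tree that needs fewer changes is preferred under the parsimony criterion; here the first tree is shorter than the second.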

Searching for Trees

Tree Search Algorithms

  • Exhaustive

    • VERY INTENSIVE

  • Branch and Bound

    • Compromise

  • Heuristic

    • FAST (usually start with NJ)
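
The reason exhaustive search is so intensive is the number of candidate trees. A short sketch of the standard count of unrooted, fully resolved trees for n taxa, 1 × 3 × 5 × … × (2n − 5):

```python
def count_unrooted_trees(n_taxa):
    """Number of distinct unrooted binary trees for n_taxa labeled tips."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    for k in range(3, 2 * n_taxa - 4, 2):  # odd factors 3, 5, ..., 2n - 5
        count *= k
    return count


for n in (4, 5, 10, 20):
    print(n, count_unrooted_trees(n))
```

The count passes two million at 10 taxa, which is why branch and bound or heuristic searches take over for data sets of realistic size.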

Evaluating Trees

  • Consensus Tree

  • Randomized Trees

    • Skewness tests

  • Randomized Character Data

    • Permutation tests

  • Bootstrap, Jackknife

    • resampling techniques

    • >70% support: probably correct; 50% support overestimates accuracy
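
Bootstrap resampling can be sketched as drawing alignment columns with replacement, rebuilding a tree from each replicate, and counting how often a group reappears. In the sketch below `closest_pair` is a toy stand-in for a real tree-building step, and all function names are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement, keeping the original length."""
    n_sites = len(next(iter(alignment.values())))
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Toy tree builder: report only the pair of taxa with the fewest differences."""
    names = sorted(alignment)
    best = min(
        ((a, b) for i, a in enumerate(names) for b in names[i + 1:]),
        key=lambda p: sum(x != y for x, y in zip(alignment[p[0]], alignment[p[1]])),
    )
    return {frozenset(best)}


def clade_support(alignment, build_tree, clade, replicates=100, seed=0):
    """Fraction of bootstrap replicates in which build_tree recovers the clade."""
    rng = random.Random(seed)
    hits = sum(
        clade in build_tree(bootstrap_alignment(alignment, rng))
        for _ in range(replicates)
    )
    return hits / replicates


data = {"A": "AAAAAAAA", "B": "AAAAAAAA", "C": "TTTTTTTT"}
print(clade_support(data, closest_pair, frozenset({"A", "B"})))
```

A jackknife differs only in the resampling step: it deletes a portion of the columns instead of drawing with replacement.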

Rooting Trees

  • Molecular Clock

    • Root = midpoint of the longest span

    • Almost ALWAYS WRONG

  • Extrinsic Evidence

    • e.g., select a fungus as the root for plants

      • long-branch attraction can be a problem

  • Paralog rooting

    • long branch problems
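
Midpoint rooting under a molecular clock can be sketched as finding the longest tip-to-tip path and placing the root halfway along it. The sketch assumes an unrooted tree stored as a dict mapping each node to its neighbors and branch lengths; the example tree and its lengths are invented.

```python
def farthest(tree, start):
    """Return (distance, path) to the node farthest from start."""
    best = (0.0, [start])
    stack = [(start, None, 0.0, [start])]
    while stack:
        node, parent, dist, path = stack.pop()
        if dist > best[0]:
            best = (dist, path)
        for nxt, length in tree[node].items():
            if nxt != parent:
                stack.append((nxt, node, dist + length, path + [nxt]))
    return best


def midpoint_root(tree):
    """Return (u, v, offset): the root lies on edge u-v, offset away from u."""
    any_tip = next(n for n, nbrs in tree.items() if len(nbrs) == 1)
    _, path = farthest(tree, any_tip)       # one end of the longest span
    span, path = farthest(tree, path[-1])   # the longest span itself
    walked = 0.0
    for u, v in zip(path, path[1:]):
        if walked + tree[u][v] >= span / 2:
            return u, v, span / 2 - walked
        walked += tree[u][v]


tree = {
    "A": {"X": 1.0}, "B": {"X": 3.0}, "C": {"Y": 2.0}, "D": {"Y": 6.0},
    "X": {"A": 1.0, "B": 3.0, "Y": 2.0},
    "Y": {"C": 2.0, "D": 6.0, "X": 2.0},
}
print(midpoint_root(tree))
```

In this example the root lands inside the long branch leading to D. If that branch is long because D evolves quickly and not because it diverged early, the placement is wrong, which is the weakness of the clock assumption.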

Tree Congruence

  • Tree-to-Tree Comparison

    • 2 different characters/same groups

    • Important for evaluating biological hypotheses

      • lentiviruses diverged within their current hosts only

      • plant pathogenicity has arisen many times in fungi
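
One simple way to put a number on tree-to-tree comparison is to count the groups found in one tree but not the other. The sketch below does this for rooted trees written as nested tuples over the same taxa; it is a clade-based symmetric difference in the spirit of the Robinson-Foulds distance, offered as an illustration and not as the method used in the studies mentioned above.

```python
def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree of nested tuples."""
    if isinstance(tree, str):
        return frozenset([tree]), set()
    leaves, found = frozenset(), set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    found.add(leaves)
    return leaves, found


def clade_distance(t1, t2):
    """Number of clades present in exactly one of the two trees."""
    leaves1, c1 = clades(t1)
    leaves2, c2 = clades(t2)
    if leaves1 != leaves2:
        raise ValueError("trees must contain the same taxa")
    return len(c1 ^ c2)


host_tree = ((("A", "B"), "C"), "D")
virus_tree = ((("A", "C"), "B"), "D")
print(clade_distance(host_tree, virus_tree))
```

A distance of zero means the two trees imply the same groupings; larger values indicate more conflict between, for example, a host tree and a pathogen tree.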

Common Software

  • PAUP

    • GCG

      • Pileup, Lineup, Paupsearch, Paupdisplay

    • PAUPSTAR (MACs best!)

  • PHYLIP

    • UNIX (Seqanal)

Phylogenetic Stories

  • HIV

    • complete genome accessible

    • evolution rapid

      • selection, neutralism?

    • human interest (e.g., dentist and his patients)

  • Coevolution, host and pathogen

  • Big Tree

Phylogenetic Resources

  • NCBI Taxonomy Browser

    • http://www.ncbi.nlm.nih.gov/Taxonomy/

  • RDP database

    • http://rdpwww.life.uiuc.edu/

  • “Tree of Life”

    • http://phylogeny.arizona.edu/tree/phylogeny.html

Practicalities

  • Quality of input data critical

  • Examine data from all possible angles

    • distance, parsimony, likelihood

  • Outgroup taxon critical

    • problem if outgroup shares a selective property with a subset of ingroup

  • Order of input can be problematic

    • Jumble them!
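
Jumbling can be sketched as generating several random input orders with a fixed seed, so that each order can be passed to the same tree-building program and the resulting trees compared. The function name and defaults are hypothetical.

```python
import random


def jumbled_orders(names, replicates=10, seed=1):
    """Return reproducible random orderings of the input taxa."""
    rng = random.Random(seed)
    orders = []
    for _ in range(replicates):
        order = list(names)
        rng.shuffle(order)
        orders.append(order)
    return orders


taxa = ["human", "chimp", "gorilla", "orangutan", "gibbon"]
for order in jumbled_orders(taxa, replicates=3):
    print(order)
```

If different input orders give different trees, the search is getting stuck in local optima and the result should not be trusted from a single run.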

Trees

plagiarized by Chuck Staben, 1998

Sergeant Joyce Kilmer, 1914
