Practical Session: Bayesian evolutionary analysis by sampling trees (BEAST)

Rebecca R. Gray, Ph.D.

Department of Pathology

University of Florida


  • BEAST:

    • is a cross-platform program for Bayesian MCMC analysis of molecular sequences

    • entirely orientated towards rooted, time-measured phylogenies inferred using strict or relaxed molecular clock models

    • can be used as a method of reconstructing phylogenies, but is also a framework for testing evolutionary hypotheses without conditioning on a single tree topology

    • uses MCMC to average over tree space, so that each tree is weighted proportional to its posterior probability


Citations

  • The recommended citation for this program is:

    • Drummond AJ, Rambaut A (2007) "BEAST: Bayesian evolutionary analysis by sampling trees." BMC Evolutionary Biology 7:214

  • To cite the relaxed clock model in BEAST:

    • Drummond AJ, Ho SYW, Phillips MJ & Rambaut A (2006) PLoS Biology 4, e88

  • To cite the Bayesian Skyline model in BEAST:

    • Drummond AJ, Rambaut A, Shapiro B & Pybus OG (2005) Mol Biol Evol 22, 1185-1192

  • The original MCMC paper was:

    • Drummond AJ, Nicholls GK, Rodrigo AG & Solomon W (2002) Genetics 161, 1307-1320


Basic Pipeline

  • 1) setting up the XML file (BEAUti)

  • 2) running the XML file (BEAST)

  • 3) evaluating the performance of the run (Tracer)

  • 4) comparing models, obtaining estimates of parameters (Tracer)

  • 5) summarizing the tree distribution (TreeAnnotator)

  • 6) viewing the MCC tree (FigTree)


Downloading programs

  • http://beast.bio.ed.ac.uk/Main_Page

    • Download contains BEAUti, BEAST, TreeAnnotator

  • http://beast.bio.ed.ac.uk/Tracer

  • http://beast.bio.ed.ac.uk/FigTree


Practical: Rift Valley fever virus


Epidemiology of RVF

  • The virus was first identified in 1931 in the Rift Valley of Kenya

  • Mosquito vector, primarily infects livestock

  • In 1997–1998, a major outbreak occurred in Kenya, Somalia and the United Republic of Tanzania

  • In September 2000, cases were confirmed in Saudi Arabia and Yemen (the first reported occurrence of the disease outside the African continent)


Setting up the XML file in BEAUti

  • Requires a nexus file

    • Helpful to have dates with the sample name

    • Use the finest resolution available

  • The GUI allows basic selection of parameters

  • The XML file can be manually edited to test specific hypotheses or tweak the run
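
As a minimal sketch of why it helps to carry dates in the sample name: if every taxon label ends with its sampling year, the tip dates can be recovered mechanically from the labels. The label format (name, underscore, decimal year), the example labels and the function name below are all hypothetical; they are not taken from the practical's alignment.

```python
import re

def guess_tip_date(label: str) -> float:
    """Return the sampling date encoded as the last field of a taxon label.

    Assumes (hypothetically) labels such as 'KEN_1998' or 'SA_2000.75',
    where the final underscore-separated field is a decimal year.
    """
    match = re.search(r"_(\d{4}(?:\.\d+)?)$", label)
    if match is None:
        raise ValueError(f"no date found in label: {label!r}")
    return float(match.group(1))

# Hypothetical taxon labels; real names come from the NEXUS alignment.
labels = ["KEN_1931", "KEN_1998", "SA_2000.75"]
dates = {label: guess_tip_date(label) for label in labels}

# Express each tip as time before the most recent sample.
youngest = max(dates.values())
heights = {label: youngest - date for label, date in dates.items()}
```

Using the finest date resolution available (for example, a decimal year rather than a whole year) keeps samples from the same outbreak from collapsing onto a single time point.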


BEAUti practical

  • Import alignment (g_63.nex)

  • Tip dates – use tip dates, guess dates (years since some time in the past)

  • Site models – use GTR + G, empirical base frequencies

  • Test hypothesis of strict vs. relaxed molecular clock

  • Trees – coalescent tree prior – constant size

  • 5 x 10^7 generations


BEAST

  • Open the XML file with a text editor

  • Run in BEAST

  • Check mixing of the MCMC chain

  • Open S log files in Tracer

  • Open L and G2 log files

  • What can we do about the trace?
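
Mixing is commonly judged by the effective sample size (ESS) that Tracer reports for each logged parameter. The sketch below shows one simple way to estimate an ESS from a trace, dividing the chain length by an autocorrelation time. It illustrates the idea only and is not Tracer's exact algorithm; the two synthetic traces and the function name are assumptions made for the example.

```python
import random

def effective_sample_size(trace):
    """Estimate ESS as N / (1 + 2 * sum of autocorrelations).

    The sum over lags is truncated at the first non-positive
    autocorrelation, a simple heuristic.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0:
        raise ValueError("constant trace: ESS is undefined")
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / n
        rho = cov / var
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1 + 2 * rho_sum)

rng = random.Random(1)

# Well-mixed chain: independent draws.
good = [rng.gauss(0, 1) for _ in range(2000)]

# Poorly mixed chain: each state is strongly correlated with the previous one.
poor = [0.0]
for _ in range(1999):
    poor.append(0.98 * poor[-1] + rng.gauss(0, 1))

ess_good = effective_sample_size(good)
ess_poor = effective_sample_size(poor)
```

Both traces contain 2000 samples, but the autocorrelated one carries far fewer effectively independent draws. A frequently quoted rule of thumb is to aim for ESS values above 200 before trusting an estimate.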


Proper mixing

  • First step – run chain longer

    • Open L200 files

  • Other steps to try:

    • Overparameterization – reduce model complexity

    • Check the temporal/phylogenetic signal in the data

    • Check whether the priors are inappropriate


Model testing

  • Bayes factors:

    • Compare estimates of the marginal likelihoods of the models of interest

    • 2*(ln marginal likelihood model 1 – ln marginal likelihood model 2)

    • A value > 10 indicates strong support for the alternative (more complex) model

  • Strict clock vs. relaxed clock

    • Also consider the coefficient of variation
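
The Bayes factor comparison above can be sketched as follows. The log marginal likelihoods are made-up placeholders, not results from this data set, and the function name is hypothetical; the threshold of 10 is the one given on the slide.

```python
def two_ln_bayes_factor(ln_ml_model1: float, ln_ml_model2: float) -> float:
    """Return 2 * (ln marginal likelihood of model 1 - that of model 2)."""
    return 2.0 * (ln_ml_model1 - ln_ml_model2)

# Placeholder log marginal likelihoods (assumed values, for illustration only).
ln_ml_relaxed = -5210.4  # model 1: relaxed clock (alternative, more complex)
ln_ml_strict = -5221.9   # model 2: strict clock (simpler)

statistic = two_ln_bayes_factor(ln_ml_relaxed, ln_ml_strict)

# Threshold from the slide: > 10 means strong support for the alternative.
strong_support_for_relaxed = statistic > 10
```

With these placeholder values the statistic is about 23, so the relaxed clock would be preferred. The coefficient of variation of the rates gives a complementary check: a posterior concentrated near zero suggests the data are close to clock-like.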


Summarizing tree

  • TreeAnnotator

    • Burn-in 10% (501 samples)

    • Keep median heights

    • MCC tree

  • Visualizing tree: FigTree

    • Posterior probabilities for branches

    • Median heights for clades of interest
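
To make the summary step concrete, here is a toy sketch of how a maximum clade credibility (MCC) tree can be chosen: discard the burn-in, estimate each clade's posterior probability as the fraction of the remaining trees that contain it, then pick the sampled tree with the highest product of clade probabilities. Trees are represented here simply as sets of clades, which is a hypothetical simplification (TreeAnnotator works on the trees file written by BEAST), and the four-taxon sample is invented.

```python
import math
from collections import Counter

def clade_posteriors(trees):
    """Fraction of sampled trees that contain each clade."""
    counts = Counter(clade for tree in trees for clade in tree)
    return {clade: n / len(trees) for clade, n in counts.items()}

def mcc_tree(trees, burnin_fraction=0.1):
    """Return the post-burn-in tree with the highest summed log clade credibility."""
    n_burnin = math.ceil(burnin_fraction * len(trees))
    kept = trees[n_burnin:]
    post = clade_posteriors(kept)
    # Summing logs is equivalent to maximizing the product of probabilities.
    best = max(kept, key=lambda tree: sum(math.log(post[c]) for c in tree))
    return best, post

def clade(*taxa):
    """A clade is the set of taxa below an internal node."""
    return frozenset(taxa)

# Two competing topologies for invented taxa A-D, each a set of clades.
t1 = frozenset({clade("A", "B"), clade("A", "B", "C")})
t2 = frozenset({clade("A", "B"), clade("C", "D")})

# Invented posterior sample of 10 trees; the first one is discarded as burn-in.
sample = [t2] + [t1] * 6 + [t2] * 3
best, post = mcc_tree(sample)
```

Here the clade (A, B) appears in every retained tree, so its posterior probability is 1.0, and the first topology is selected as the MCC tree. The posterior values attached to branches in FigTree are clade frequencies of this kind.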


Advanced analyses

  • Different coalescent priors

    • Parametric models (exponential, logistic)

    • Bayesian skyline plots

  • Phylogeography

    • Lemey et al., 2009, PLoS Computational Biology

  • Site-specific rates of variation


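Change in effective population size over time

[Figure: effective population size over time (y-axis: Log10 Ne, two panels), shown with the Bayesian genealogy of the G gene; labeled date estimate: 1916 (1868-1942)]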


Additional resources

  • Tutorials on the BEAST website; Google group

  • 16th International BioInformatics Workshop on Virus Evolution and Molecular Epidemiology

    • Johns Hopkins University, Baltimore

    • 29 August - 03 September 2010, Bethesda, USA

    • http://www.rega.kuleuven.be/cev/workshop/

