Data and Statistics: New methods and future challenges Phil O’Neill University of Nottingham


Professors: How they spend their time.
1. High-resolution genetic data
2. Model assessment

Presentation Transcript

Data and Statistics: New methods and future challenges
Phil O’Neill
University of Nottingham

slide7
“High-resolution genetic data”: what are they?
individual-level data on the pathogen
can be taken at single or multiple time points
high-dimensional, e.g. whole genome sequences
proportion of individuals sampled could be high or low
becoming far more common due to cost reduction
slide8
“High-resolution genetic data”: what use are they?
better inference about transmission paths
more reliable estimates of epidemiological quantities?
understand evolution of the pathogen
slide10

[Figure: example pathogen sequence, A C C C T T G G G A A A ...]

slide11
Modelling and Data Analysis methods
Two kinds of approaches exist:
1. Separate genetic and epidemic components (e.g. Volz, Rasmussen)
2. Combine genetic and epidemic components (e.g. Ypma, Worby, Morelli)
slide12
1. Separate genetic and epidemic components
e.g.
- estimate phylogenetic tree
- given the tree, fit epidemic model
or
- cluster individuals into genetically similar groups
- given the groups, fit multi-type epidemic model
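The clustering variant of the separate approach can be sketched in a few lines. This is only an illustration under stated assumptions: sequences are aligned to equal length, genetic similarity is measured by Hamming distance, and cases are grouped by single linkage at a hypothetical threshold. The function names and toy sequences are invented here and do not come from the talk.

```python
from itertools import combinations


def hamming(a, b):
    """Number of sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_by_distance(seqs, threshold):
    """Single-linkage groups: two cases are linked if distance <= threshold."""
    parent = {name: name for name in seqs}

    def find(x):
        # Follow parent pointers to the group representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if hamming(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Toy data (hypothetical): four cases, twelve sites each.
toy = {
    "case1": "ACCCTTGGGAAA",
    "case2": "ACCCTTGGGAAT",
    "case3": "ACGCTAGGCAAA",
    "case4": "ACGCTAGGCAAT",
}
clusters = cluster_by_distance(toy, threshold=1)
```

The resulting group labels would then play the role of "types" when fitting a multi-type epidemic model, which is a separate step not shown here.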
slide13
1. Separate genetic and epidemic components
+ “Simple” approach
+ Avoids complex modelling
- Ignores any relationship between transmission and genetic information
slide14
2. Combine genetic and epidemic components
e.g.
- model genetic evolution explicitly
- define model featuring both genetic and epidemic parts
slide15
2. Combine genetic and epidemic components
+ “Integrated” approach
- Is modelling too detailed?
- Initial conditions: typical sequence?
+/- Model differences between individuals instead?
slide18
“Model assessment”: why do it?
Poor fit casts doubt on conclusions from modelling
Model choice can be a tool for directly addressing questions of interest
Linear regression: y_k = a x_k + b + e_k, e_k ~ N(0, v)
Minimise distance of model mean from observed data
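For the regression case, "minimise distance of model mean from observed data" has a closed form when the distance is the sum of squared residuals. The sketch below assumes ordinary least squares; the function name and the toy data are illustrative only.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b, v_hat, residuals)."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need at least three paired observations")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = s_xy / s_xx
    b = y_bar - a * x_bar
    # Residuals: observed value minus model mean a*x + b.
    residuals = [y - (a * x + b) for x, y in zip(xs, ys)]
    # Unbiased estimate of the noise variance v (two fitted parameters).
    v_hat = sum(r * r for r in residuals) / (n - 2)
    return a, b, v_hat, residuals


# Toy data (hypothetical), roughly y = 2x + 1 with small noise.
a, b, v_hat, residuals = fit_line([0, 1, 2, 3, 4], [1.1, 2.9, 5.2, 6.8, 9.1])
```

Here the residuals are unambiguous: each is an observed value minus the fitted mean. For outbreak data the analogous quantity is much less obvious.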

slide21
For outbreak data: What are the right residuals? Should observed or unobserved data be compared to the model? (Streftaris and Gibson) Mean model may only be available via simulation Is the mean the right quantity to consider?
slide23
Simulation-based approaches to model fit:
Forward simulation: “close” to data?
Choice of summary statistics?
Close ties to ABC methods (McKinley, Neal)
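The link between forward simulation and ABC can be illustrated with a rejection sampler. This sketch rests on several assumptions that are not in the talk: a Markovian SIR model with infection rate beta*S*I and removal rate gamma*I, one initial infective, the final outbreak size as the only summary statistic, a uniform prior on beta, and a hypothetical tolerance. All names and numbers are illustrative.

```python
import random


def simulate_final_size(beta, gamma, n_susceptible, rng):
    """Final size (excluding the initial infective) via the embedded jump chain."""
    s, i = n_susceptible, 1
    while i > 0:
        # Next event is an infection with probability beta*S / (beta*S + gamma).
        p_infection = beta * s / (beta * s + gamma)
        if rng.random() < p_infection:
            s -= 1
            i += 1
        else:
            i -= 1
    return n_susceptible - s


def abc_rejection(observed, gamma, n_susceptible, n_draws, tolerance, beta_max, rng):
    """Keep prior draws of beta whose simulated final size is close to the data."""
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(0.0, beta_max)
        simulated = simulate_final_size(beta, gamma, n_susceptible, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(beta)
    return accepted


rng = random.Random(1)
accepted = abc_rejection(observed=30, gamma=1.0, n_susceptible=50,
                         n_draws=2000, tolerance=2, beta_max=0.2, rng=rng)
```

The same simulator can be reused for model fit: simulate from fitted parameter values and ask whether the observed summary looks typical of the simulated ones. Both uses depend on the choice of summary statistic.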
slide24
Approaches to model choice
Hypermodels/saturated models
Bayesian non-parametric methods
Bayesian methods e.g. RJMCMC
Mixture models
slide25
Hypermodels/saturated models
e.g. Infection rates βS or βSI or βSI^0.5 in an SIR model? Instead use βSI^α and estimate α (O’Neill and Wen)
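The point of the hypermodel is that the candidate infection rates are special cases of a single rate with one extra exponent. A minimal sketch follows, writing that exponent as `alpha`; the symbol and the function name are assumptions made here for illustration.

```python
def infection_rate(beta, s, i, alpha):
    """Infection rate beta * S * I**alpha; zero when there are no infectives."""
    if i == 0:
        # Guard: Python evaluates 0 ** 0.0 as 1.0, which would give a
        # positive rate with nobody infectious.
        return 0.0
    return beta * s * i ** alpha


# The candidate models correspond to fixed values of the exponent.
rate_beta_s = infection_rate(2.0, 10, 4, alpha=0.0)        # beta * S
rate_beta_si = infection_rate(2.0, 10, 4, alpha=1.0)       # beta * S * I
rate_beta_s_sqrt_i = infection_rate(2.0, 10, 4, alpha=0.5)  # beta * S * I**0.5
```

Estimating the exponent from data then replaces a discrete choice between models with inference about a continuous parameter.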
slide26
Bayesian non-parametric methods
e.g. Infection rate β(t)SI or β(t) in an SIR model; estimate β(t) in a Bayesian non-parametric manner using Gaussian process machinery (Kypraios, O’Neill and Xu; Knock and Kypraios)
slide28
Reversible Jump MCMC
e.g. Distinct models (usually a small number); estimate Bayes factors by running MCMC on the union of parameter spaces (O’Neill; Neal and Roberts; Knock and O’Neill)
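Once a sampler has run on the union of parameter spaces, a Bayes factor can be estimated from how often the chain visits each model. The sketch below assumes two models and a hypothetical list `model_chain` of labels (1 or 2) recorded at each iteration. It does not implement the reversible jump moves themselves.

```python
def bayes_factor_from_visits(model_chain, prior_m1=0.5):
    """Estimate B_12 as posterior odds divided by prior odds."""
    n1 = sum(1 for m in model_chain if m == 1)
    n2 = len(model_chain) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("chain never visited one of the models")
    posterior_odds = n1 / n2
    prior_odds = prior_m1 / (1.0 - prior_m1)
    return posterior_odds / prior_odds


# Hypothetical chain: three visits to model 1, one to model 2.
b12 = bayes_factor_from_visits([1, 1, 1, 2])
```

In practice the estimate is unreliable when one model is visited very rarely, which is one reason prior model probabilities are sometimes tuned and then divided out as above.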
slide29
Mixture models
e.g. Given two models (g, h), create mixture model f(x) = α g(x) + (1 − α) h(x); estimation of α enables estimation of Bayes factors (Kypraios and O’Neill)
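One way to see why the mixing weight carries information about the Bayes factor is the short derivation below. It writes the two candidate models as g and h and the mixing weight as α, assumes g(x) and h(x) denote the marginal likelihoods of the data x under each model, and assumes a prior density π(α) that is positive at both endpoints. It illustrates the connection and is not necessarily the estimator used by the cited authors.

```latex
\begin{align*}
f(x) &= \alpha\, g(x) + (1-\alpha)\, h(x), \qquad \alpha \in [0,1],\\
\pi(\alpha \mid x) &\propto \pi(\alpha)\,\bigl[\alpha\, g(x) + (1-\alpha)\, h(x)\bigr],\\
\frac{\pi(1 \mid x)}{\pi(0 \mid x)} &= \frac{\pi(1)}{\pi(0)} \cdot \frac{g(x)}{h(x)},
\qquad\text{so}\qquad
B_{gh} = \frac{g(x)}{h(x)} = \frac{\pi(1 \mid x)/\pi(0 \mid x)}{\pi(1)/\pi(0)}.
\end{align*}
```

Under a uniform prior on α the prior ratio is 1, so the Bayes factor is the ratio of the posterior density of α at its two endpoints.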