Model Identification & Model Selection

With focus on Mark/Recapture Studies


Overview

  • Basic inference from an evidentialist perspective

  • Model selection tools for mark/recapture

    • AICc & SIC/BIC

    • Overdispersed data

    • Model set size

    • Multimodel inference


DATA
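(The rows below appear to be encounter-history records in a Program MARK-style input format: a commented animal ID, a 16-occasion capture history, a frequency/group column, a binary covariate, and three continuous covariates. That column interpretation is an assumption, not stated on the slide.)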

/* 01 */ 1100000000000000 1 1 1.16 27.7 4.19;
/* 04 */ 1011000000000000 1 0 1.16 26.4 4.39;
/* 05 */ 1011000000000000 1 1 1.08 26.7 4.04;
/* 06 */ 1010000000000000 1 0 1.12 26.2 4.27;
/* 07 */ 1010000000000000 1 1 1.14 27.7 4.11;
/* 08 */ 1010110000000000 1 1 1.20 28.3 4.24;
/* 09 */ 1010000000000000 1 1 1.10 26.4 4.17;
/* 10 */ 1010110000000000 1 1 1.42 27.0 5.26;
/* 11 */ 1010000000000000 1 1 1.12 27.2 4.12;
/* 12 */ 1010101100000000 1 1 1.11 27.1 4.10;
/* 13 */ 1010101100000000 1 0 1.07 26.8 3.99;
/* 14 */ 1010101100000000 1 0 0.94 25.2 3.73;
/* 15 */ 1010101100000000 1 0 1.24 27.1 4.58;
/* 16 */ 1010101100000000 1 0 1.12 26.5 4.23;
/* 17 */ 1010101000000000 1 1 1.34 27.5 4.87;
/* 18 */ 1010101011000000 1 0 1.01 27.2 3.71;
/* 19 */ 1010101011000000 1 0 1.04 27.0 3.85;
/* 20 */ 1010101000000000 1 1 1.25 27.6 4.53;
/* 21 */ 1010101011000000 1 0 1.20 27.6 4.35;
/* 22 */ 1010101011000000 1 0 1.28 27.0 4.74;
/* 23 */ 1010101010110000 1 0 1.25 27.2 4.59;
/* 24 */ 1010101010110000 1 0 1.09 27.5 3.96;
/* 25 */ 1010101010110000 1 1 1.05 27.5 3.82;
/* 26 */ 1010101010101100 1 0 1.04 25.5 4.08;
/* 27 */ 1010101010101010 1 0 1.13 26.8 4.22;
/* 28 */ 1010101010101010 1 1 1.32 28.5 4.63;
/* 29 */ 1010101010101010 1 0 1.18 25.9 4.56;
/* 30 */ 1010101010101010 1 0 1.07 26.7 4.01;
/* 31 */ 1010101010101010 1 1 1.26 26.9 4.68;
/* 32 */ 1010101010101010 1 0 1.27 27.6 4.60;
/* 33 */ 1010101010101010 1 0 1.08 26.0 4.15;
/* 34 */ 1010101010101010 1 1 1.11 27.0 4.11;
/* 35 */ 1010101010101010 1 0 1.15 27.1 4.24;
/* 36 */ 1010101010101010 1 0 1.03 26.5 3.89;
/* 37 */ 1010101010101010 1 0 1.16 27.5 4.22;

Models carry the meaning in science

  • Model

    • Organized thought

  • Parameterized Model

    • Organized thought connected to reality


  • Science is a cyclic process of model reconstruction and model reevaluation

    • Comparison of predictions with observations/data

    • Relative comparisons are evidence


All models are false, but some are useful.

George Box


Statistical Inferences

  • Quantitative measures of the validity and utility of models

  • Social control on the behavior of scientists


Scientific Model Selection Criteria

  • Illuminating

  • Communicable

  • Defensible

  • Transferable


Common Information Criteria
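For reference, the standard forms of the criteria named elsewhere in this deck (L the maximized likelihood, k the number of parameters, n the sample size):

AIC = -2 ln(L) + 2k

AICc = -2 ln(L) + 2k + 2k(k + 1)/(n - k - 1)

SIC/BIC = -2 ln(L) + k ln(n)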


Statistical Methods are Tools

  • All statistical methods exist in the mind only, but some are useful.

    • Mark Taper


Classes of Inference

  • Frequentist Statistics - Bayesian Statistics

  • Error Statistics – Evidential Stats – Bayesian Stats


Two key frequencies in frequentist statistics

  • Frequency definition of probability

  • Frequency of error in a decision rule


Null H tests with Fisherian P-values

  • Single model only

  • P-value = probability of a discrepancy at least as great as that observed, arising by chance alone.

  • Not terribly useful for model selection


Neyman-Pearson Tests

  • 2 models

  • Null model test along a maximally sensitive axis.

  • Binary response: Accept Null or reject Null

  • Size of test (α) describes frequency of rejecting null in error.

    • Not about the data, it is about the test.

    • You support your decision because you have made it with a reliable procedure.

  • N-P tests tell you very little about relative support for alternative models.


Decisions vs. Conclusions

  • Decision-based inference is reasonable within a regulatory framework.

    • Not so appropriate for science

  • John Tukey (1960) advocated seeking to reach conclusions rather than making decisions.

    • Accumulate evidence until a conclusion is very strongly supported.

    • Treat as true.

    • Revise if new evidence contradicts.


In a conclusions framework, multiple statistical metrics are not “incompatible”

All are tools for aiding scientific thought


Statistical Evidence

  • Data-based estimate of the relative distance between two models and “truth”


Common Evidence Functions

  • Likelihood ratios

  • Differences in information criteria

  • Others available

    • E.g. Log(Jackknife prediction likelihood ratio)


Model Adequacy

  • Bruce Lindsay

  • The discrepancy of a model from truth

  • Truth is represented by an empirical distribution function.

  • A model is “adequate” if the estimated discrepancy is less than some arbitrary but meaningful level.


Model Adequacy and Goodness of Fit

  • Estimation framework rather than testing framework

  • Confidence intervals rather than testing

  • Rejection of “true model formalism”


Model Adequacy, Goodness of Fit, and Evidence

  • Adequacy does not explicitly compare models

  • Implicit comparison

  • Model adequacy interpretable as bound on strength of evidence for any better model

  • Unifies Model Adequacy and Evidence in a common framework


Model adequacy interpreted as a bound on evidence for a possibly better model

[Diagram with labels: Empirical Distribution (“Truth”), Model 1, Potentially better model, Model adequacy measure, Evidence measure.]


Goodness of fit misnomer

  • Badness of fit measures & goodness of fit tests

  • Comparison of a model to a nonparametric estimate of the true distribution.

    • G²-statistic

    • Hellinger distance

    • Pearson χ²

    • Neyman χ²


Points of interest

  • Badness of fit is the scope for improvement

  • Evidence for one model relative to another is the difference in their badness of fit (written out below).
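In symbols, restating the bullets above with F̂ the empirical distribution (“truth”) and D a badness-of-fit (discrepancy) measure:

Ev(M1 : M2) = D(F̂, M2) - D(F̂, M1)

so positive values indicate evidence favoring M1.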


ΔIC estimates differences of Kullback-Leibler Discrepancies

  • ΔIC = log(likelihood ratio) when # of parameters are equal

  • Complexity penalty is a bias correction that adjusts for the increase in apparent precision that comes with an increase in the # of parameters (see the sketch below).
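A minimal numerical sketch in Python: AICc as defined earlier, applied to two hypothetical models with assumed maximized log-likelihoods (all values here are illustrative, not taken from the example data).

def aicc(log_lik, k, n):
    """Small-sample Akaike information criterion."""
    return -2.0 * log_lik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

# Hypothetical values: two models fit to n = 35 capture histories.
n = 35
ll_1, k_1 = -120.4, 4   # model 1: log-likelihood and parameter count
ll_2, k_2 = -123.1, 4   # model 2

delta = aicc(ll_2, k_2, n) - aicc(ll_1, k_1, n)
# With equal k the penalties cancel, so delta is -2 * (ll_2 - ll_1),
# i.e. a scaled log-likelihood ratio, as the slide notes.
print(f"Delta AICc = {delta:.2f}")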


Evidence Scales

Note: cutoffs are arbitrary and vary with the scale used.


Which Information Criterion?

  • AIC? AICc ? SIC/BIC?

  • Don’t use AIC

  • 5.9 of one versus 6.1 of the other


What is sample size for complexity penalty?

  • Mark/Recapture based on multinomial likelihoods

  • Observation is a capture history, not a session


To Q or not to Q?

  • IC-based model selection assumes that a good model is in the set.

  • Over-dispersion is common in Mark/Recapture data

    • Don’t have a good model in set

    • Due to lack of independence of observations

    • Parameter point estimates generally not biased by overdispersion

    • But fit will appear too good!

    • Model selection will choose more highly parameterized models than appropriate


Quasi-likelihood approach

  • χ² goodness-of-fit test for the most general model

  • If H0 is rejected, estimate the variance inflation factor

  • ĉ = χ²/df

  • Correct the fit component of the IC & redo selection (a sketch follows below)
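A minimal sketch of these steps in Python. The χ² statistic, degrees of freedom, and model summaries are hypothetical placeholders, and the QAICc form used is the common Burnham & Anderson-style correction (whether to add a parameter for estimating ĉ itself is a convention not addressed here).

# Hypothetical goodness-of-fit results for the most general (global) model.
chi_sq = 58.3                    # chi-square GOF statistic
df = 42                          # its degrees of freedom
c_hat = max(chi_sq / df, 1.0)    # variance inflation factor; not corrected below 1

def qaicc(log_lik, k, n, c_hat):
    """Quasi-likelihood AICc: fit term deflated by c-hat, plus the usual penalties."""
    return -2.0 * log_lik / c_hat + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

# Redo selection on the corrected criterion (log-likelihoods are placeholders).
models = {"model_1": (-120.4, 4), "model_2": (-123.1, 3)}
scores = {name: qaicc(ll, k, n=35, c_hat=c_hat) for name, (ll, k) in models.items()}
best = min(scores, key=scores.get)
print(f"c-hat = {c_hat:.2f}", scores, "best:", best)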


QICs


Problems with the Quasi-likelihood Correction

  • ĉ is essentially a variance estimate.

    • Variance estimates unstable without a lot of data

  • ln(L)/ĉ is a ratio statistic

    • Ratio statistics highly unstable if the uncertainty in the denominator is not trivial

  • Unlike AICc, the bias correction is estimated rather than fixed.

    • Estimating a bias correction inflates variance!


Fixes

  • Explicitly include random component in model

    • Then redo model selection

  • Bootstrapped median ĉ

  • Model selection with Jackknifed prediction likelihood


Large or small model sets?

  • Problem: Model Selection Bias

    • When the # of models is large relative to the data size, some models will fit well just by chance

  • Small

    • Burnham & Anderson strongly advocate small model sets representing well thought out science

    • Large model sets = “data dredging”

  • Large

    • The science may not be mature

    • Small model sets may risk missing important factors


Model Selection from Many Candidates (Taper 2004)

SIC(x) = -2 ln(L) + (ln(n) + x)k


Performance of SIC(x) with a small data set

N = 50, true covariates = 10, spurious covariates = 30, all models of order ≤ 20; 1.141 × 10^14 candidate models



Chen & Chen 2009

  • M = subset size, P = # of possible terms
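A hedged sketch of the criterion, recalled from Chen & Chen's paper rather than taken from the slide: the extended BIC adds a penalty for the size of the model class being searched, with γ a tuning constant between 0 and 1.

EBIC_γ = -2 ln(L) + M ln(n) + 2γ ln C(P, M)

where C(P, M) is the number of possible models of size M drawn from P candidate terms.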


Explicit Tradeoff

  • Small model sets

    • Allows exploration of fine structure and small effects

    • Risks missing unanticipated large effects

  • Large model sets

    • Will catch unknown large effects

    • Will miss fine structure

  • Choosing a large or small model set is a principled choice that data analysts should make based on their background knowledge and needs


Akaike Weights & Model Averaging

Beware, there be dragons here!


Akaike Weights

  • “Relative likelihood of model i given the data and model set”

  • “Weight of evidence that model i most appropriate given data and model set”
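The standard definition behind these quotes, with Δ_i the IC difference between model i and the best model in a set of R models:

w_i = exp(-Δ_i / 2) / Σ_{r=1..R} exp(-Δ_r / 2)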


Model Averaging

  • “Conditional” Variance

    • Conditional on selected model

  • “Unconditional” Variance.

    • Actually conditional on entire model set


Good impulse with Huge Problems

  • I do not recommend Akaike weights

  • I do not recommend model averaging in this fashion

  • Importance of good models is diminished by adding bad models

  • Location of average influenced by adding redundant models


Model Redundancy

  • Model Space is not filled uniformly

  • Models tend to be developed in highly redundant clusters.

  • Some points in model space allow few models

  • Some points allow many


Redundant models do not add much information

[Figure: axes labeled “Model adequacy” and “Model dimension”.]


A more reasonable approach

  • Bootstrap Data

  • Fit model set & select best model

  • Estimate derived parameter θ from best model

  • Accumulate θ

Repeat within time constraints

Mean or median θ with percentile confidence intervals (see the sketch below)
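A minimal, generic sketch of this loop in Python. The data, the candidate model set (simple polynomial regressions standing in for a mark/recapture model set), and the derived parameter θ (here, the predicted response at x = 1) are all stand-ins; a real analysis would substitute the actual model fits and θ.

import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 40)
y = 2.0 + 1.5 * x + rng.normal(0.0, 0.3, size=x.size)   # illustrative data

def fit_poly(x, y, degree):
    """Fit a polynomial by least squares; return (coefficients, AICc)."""
    coefs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coefs, x)
    n, k = y.size, degree + 2                 # coefficients plus error variance
    sigma2 = np.mean(resid ** 2)
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    aicc = -2.0 * log_lik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)
    return coefs, aicc

def select_and_derive(x, y, degrees=(1, 2, 3)):
    """Fit the model set, keep the AICc-best model, return the derived theta."""
    fits = [fit_poly(x, y, d) for d in degrees]
    best_coefs, _ = min(fits, key=lambda f: f[1])
    return np.polyval(best_coefs, 1.0)        # theta = predicted y at x = 1

# Bootstrap the data, redo selection each time, and accumulate theta.
thetas = []
for _ in range(1000):                         # "repeat within time constraints"
    idx = rng.integers(0, x.size, size=x.size)
    thetas.append(select_and_derive(x[idx], y[idx]))

thetas = np.asarray(thetas)
lo, hi = np.percentile(thetas, [2.5, 97.5])
print(f"median theta = {np.median(thetas):.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")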

