
# BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS - PowerPoint PPT Presentation

BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS. Mathematical expectation: the mean, variance, standard deviation, and coefficient of variation of a random variable x, where n is the number of observations.


## PowerPoint Slideshow about 'BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS' - doyle

Presentation Transcript

CONCEPTS AND METHODS

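The mean ($\bar{x}$) of random variable x is:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

where n is the number of observations. The variance ($s^2$) is:

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The standard deviation (s) is:

$$s = \sqrt{s^2}$$

The coefficient of variation is:

$$CV = \frac{s}{\bar{x}}$$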
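As a rough sketch, these four summaries can be computed as follows, assuming the usual sample definitions (variance divided by n - 1, coefficient of variation as standard deviation over mean). The function name `summarize` and the fish lengths are hypothetical and only illustrate the arithmetic.

```python
import math


def summarize(values):
    """Return mean, sample variance, standard deviation, and CV."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance: squared deviations from the mean, divided by n - 1
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)
    sd = math.sqrt(variance)
    cv = sd / mean
    return mean, variance, sd, cv


# Hypothetical fish lengths (illustrative numbers only)
lengths = [10.0, 12.0, 14.0, 16.0, 18.0]
mean, variance, sd, cv = summarize(lengths)
print(mean, variance, round(sd, 3), round(cv, 3))
```

The coefficient of variation is unitless, which is why it is handy for comparing variability across measurements on different scales.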

The probability of an event occurring is expressed as: P(event)

The probability of the event not occurring is 1 - P(event), or P(~event).

If events are independent, the probability of events A and B both occurring is: p(A) * p(B).

The probability of capturing a single fish, given 1 is present: p(capture)

The probability of capturing 2 fish, given 2 are present:

p(capture)*p(capture) = p(capture)^2,

the probability of catching at least 1, given 2 are present:

p(capture)*(1 - p(capture)) + (1 - p(capture))*p(capture) + p(capture)^2

or:

1 - (1 - p(capture))^N

where N = number of fish present
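A quick numerical check, offered as a sketch, shows that the explicit sum for two fish agrees with the complement form. The capture probability of 0.3 is a hypothetical value, and `p_at_least_one` is an illustrative name; independence between fish is assumed, as in the text.

```python
def p_at_least_one(p_capture, n_present):
    """Probability of capturing at least one of n_present fish,
    assuming each fish is captured independently with p_capture."""
    return 1 - (1 - p_capture) ** n_present


p = 0.3  # hypothetical capture probability

# Explicit sum for 2 fish: only first caught, only second caught, both caught
explicit = p * (1 - p) + (1 - p) * p + p ** 2

print(explicit, p_at_least_one(p, 2))
```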

The probability of detecting a fish during a single event: p(detect)

The probability of detecting it on all three sampling occasions is:

p(detect)*p(detect)*p(detect) = p(detect)^3,

the probability of not catching it during any of the 3 occasions is:

(1 - p(detect))*(1 - p(detect))*(1 - p(detect)) = (1 - p(detect))^3,

and the probability of catching it on at least 1 occasion is the complement of not catching it during any of the occasions:

1 - (1 - p(detect))^3.
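The complement shortcut can also be confirmed by brute force. The sketch below enumerates every detection history over the sampling occasions and adds up the probabilities of those with at least one detection; the function name and the value p(detect) = 0.5 are hypothetical, and occasions are assumed independent.

```python
from itertools import product


def p_detect_at_least_once(p_detect, occasions):
    """Sum the probabilities of all detection histories that
    contain at least one detection (1 = detected, 0 = missed)."""
    total = 0.0
    for history in product([0, 1], repeat=occasions):
        if any(history):
            prob = 1.0
            for detected in history:
                prob *= p_detect if detected else (1 - p_detect)
            total += prob
    return total


p = 0.5  # hypothetical detection probability
print(p_detect_at_least_once(p, 3), 1 - (1 - p) ** 3)
```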

The probability a fish is present: p(present)

The probability of detecting a fish, given it is present: p(detect | present)

The probability of detecting a fish, given it is not present?

The probability a fish is present and detected:

p(detect | present) * p(present)

The probability N fish are present: p(N)

The probability of detecting at least 1 fish, given N are present: p(detect | N)

The probability N fish are present and at least 1 is detected:

p(detect | N) * p(N)

Question: if we sampled but did not detect a fish species, what are the chances it was present?

p(present | not detected)

The probability the fish species is present: p(present)

The probability it is not present: 1 - p(present)

The probability of detection, given present: p(detect | present)

The probability of detection, given not present: p(detect | not present) = 0, so p(not detected | not present) = 1

Total probability of the event of not detecting the species:

Two possibilities: (1) present but not detected and (2) not present

P(not detected| present)*P(present) + P(not detected| not present)*P(not present)

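Bayes rule: what are the chances it was present?

$$p(\text{present} \mid \text{not detected}) = \frac{p(\text{not detected} \mid \text{present})\, p(\text{present})}{p(\text{not detected} \mid \text{present})\, p(\text{present}) + p(\text{not detected} \mid \text{not present})\, p(\text{not present})}$$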

Assume 80% probability of detection:

p(not detected| present) = 1- 0.80 = 0.20

Assume 40% probability of bull trout present:

p(present) = 0.40, p(not present) = 0.60

p(not detected| not present) = 1

Now calculate:

$$\frac{0.20 \times 0.40}{0.20 \times 0.40 + 1 \times 0.60} = \frac{0.08}{0.68} = 0.118 \text{ or } 11.8\%$$
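The same calculation can be written as a small function, which makes it easy to try other detection and presence probabilities. This is a sketch: the function name is hypothetical, and it assumes, as above, that a species that is absent is never detected.

```python
def p_present_given_not_detected(p_detect, p_present):
    """Bayes rule for presence after a survey with no detection.
    Assumes no false detections: p(not detected | not present) = 1."""
    p_miss = 1 - p_detect               # p(not detected | present)
    numerator = p_miss * p_present
    denominator = numerator + 1.0 * (1 - p_present)
    return numerator / denominator


posterior = p_present_given_not_detected(p_detect=0.80, p_present=0.40)
print(round(posterior, 3))  # 0.118
```

Raising the detection probability shrinks this posterior, since a miss becomes stronger evidence of absence.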

Models and fisheries management

“True” Models

• Fundamental assumption: there is no “true” model that generates biological data

• Truth in biological sciences has essentially infinite dimension; hence, full reality cannot be revealed with finite samples.

• Biological systems are complex, with many small effects, interactions, individual heterogeneity, and environmental covariates.

• Greater amounts of data are required to model smaller effects.

• Thus all models are approximations of reality

Models = hypotheses

Models and hypotheses

• Hypotheses are unproven theories: suppositions that are tentatively accepted to explain facts or as the basis for further investigation

• Models are very explicit representations of hypotheses

• Several models can represent a single hypothesis

• Models are tools for evaluating hypotheses

Models and hypotheses: example

Hypothesis: shoal bass reproduction success is greater when there are more reproductively active adults

Y = aN

Number of young is proportional to the number of adults

Y = aN/(1+bN)

Number of young increases with the number of adults until nesting areas are saturated

Y = aN*exp(-bN)

Number of young increases until the carrying capacity of nesting and rearing areas is reached

[Figure: number of YOY (young of year) plotted against number of shoal bass for the three models Y = aN, Y = aN/(1+bN), and Y = aN*exp(-bN)]
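To see how the three candidate models differ, they can be evaluated side by side. The sketch below uses hypothetical parameter values (a = 2.0, b = 0.01) and illustrative function names; the second and third forms are commonly known as the Beverton-Holt and Ricker curves.

```python
import math


def proportional(n, a):
    """Y = aN: young proportional to adults."""
    return a * n


def saturating(n, a, b):
    """Y = aN/(1+bN): young level off as adults increase."""
    return a * n / (1 + b * n)


def dome_shaped(n, a, b):
    """Y = aN*exp(-bN): young peak at N = 1/b, then decline."""
    return a * n * math.exp(-b * n)


a, b = 2.0, 0.01  # hypothetical parameter values
for n in (10, 50, 100, 200):
    print(n,
          round(proportional(n, a), 1),
          round(saturating(n, a, b), 1),
          round(dome_shaped(n, a, b), 1))
```

At low adult numbers the three models give similar predictions, so data spanning a wide range of adult densities are needed to tell them apart.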

Tapering Effect Sizes

• In biological systems there are often large, important effects, followed by smaller effects, and then yet smaller effects.

• These effects might be sequentially revealed as sample size increases, because information content increases

• Rare events are yet more difficult to study (e.g., fire, flood, volcanism)

[Figure: effect sizes tapering from big effects to small effects]

Model selection

• Determine what is the best explanation given the data

• Determine what is the best model for predicting the response

• Two approaches in fisheries/ecology

• Null hypothesis testing

• Information theoretic approaches

Null hypothesis testing

Develop an a priori hypothesis

Deduce testable predictions (i.e., models)

Carry out suitable test (experiment)

Compare test results with predictions

Retain or reject hypothesis

Hypothesis testing example: density independence for lake sturgeon populations

Hypothesis: lake sturgeon reproduction is density independent

Prediction: there is no relation between adult density and age 0 density

Model: Y = B0

Test: measure age 0 density for various adult densities over time

Compare:

Linear regression between age 0 and adult sturgeon densities, P value = 0.1839

Using a critical α-level = 0.05, we conclude there is no significant relationship

Result: Retain the hypothesis that lake sturgeon reproduction is density independent

Model selection based on p-values

• No theoretical basis for model selection

• P-values ~ precision of estimate

• P-values strongly dependent on sample size

P(the data (or more extreme data) | model) vs. L(model | the data)

JUST SAY NO TO STATISTICAL SIGNIFICANCE TESTING FOR MODEL SELECTION

Information theory

If full reality cannot be included in a model, how do we tell how close we are to truth?

Kullback-Leibler distance, based on information theory

It measures how much information is accounted for in a model

Entropy is synonymous with uncertainty

Information theory

K-L distance (information) is represented by: I(truth | model)

It represents the information lost when the candidate model is used to approximate truth; thus SMALL values mean better fit

AIC is based on the concept of minimizing K-L distance

Akaike noticed that the maximum log likelihood, Log( L (model or parameter estimate | the data) ), was related to K-L distance


What is a maximum likelihood estimate?

It is the set of parameter values that maximizes the value of the likelihood, given the data
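A minimal sketch of the idea, assuming a binomial likelihood for a detection probability: evaluate the log-likelihood over a grid of candidate values and keep the one that gives the largest value. The data (8 detections in 10 visits) and the function name are hypothetical, and a grid search stands in for the numerical optimizers that statistical software would normally use.

```python
import math


def binomial_log_likelihood(p, successes, trials):
    """Log-likelihood of p, dropping the constant binomial coefficient."""
    return successes * math.log(p) + (trials - successes) * math.log(1 - p)


detections, visits = 8, 10  # hypothetical data

# Candidate values of p from 0.001 to 0.999
candidates = [i / 1000 for i in range(1, 1000)]
mle = max(candidates,
          key=lambda p: binomial_log_likelihood(p, detections, visits))

print(mle)  # 0.8, matching detections / visits
```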

The sum of squares in regression is also a measure of the relative fit of a model

SSE = Σ deviations^2

Akaike’s contribution was that he showed that:

AIC = -2ln(L (model | the data)) + 2K

It is based on the principle of parsimony

[Figure: squared bias and variance plotted against the number of parameters, from few to many]

Heuristic interpretation

AIC = -2ln(likelihood) + 2*K

• -2ln(likelihood): measures model lack of fit

• 2*K: penalty for increasing model size (enforces parsimony)

AIC: Small sample bias adjustment

If the ratio n/K is < 40, then use AICc

AICc = -2*ln(likelihood | data) + 2*K + (2*K*(K+1))/(n-K-1)

As n gets big….

(2*K*(K+1))/(n-K-1) = (2*K*(K+1)) / very large number

(2*K*(K+1))/(n-K-1) ≈ 0

AICc ≈ AIC
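The two formulas translate directly into code. In this sketch the function names and the log-likelihood value of -50.0 are hypothetical; K is taken to be the number of estimated parameters and n the sample size.

```python
def aic(log_likelihood, k):
    """AIC = -2 ln(L) + 2K."""
    return -2 * log_likelihood + 2 * k


def aicc(log_likelihood, k, n):
    """AIC plus the small-sample correction 2K(K+1)/(n-K-1)."""
    correction = (2 * k * (k + 1)) / (n - k - 1)
    return aic(log_likelihood, k) + correction


log_lik, k = -50.0, 3  # hypothetical fitted model

print(aic(log_lik, k))            # 106.0
print(aicc(log_lik, k, n=20))     # 107.5
print(aicc(log_lik, k, n=2000))   # close to 106.0
```

Because the correction vanishes as n grows, using AICc throughout costs nothing for large samples.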

So….

Model selection with AIC

What is model selection?

AIC by itself is relatively meaningless.

Recall that we find the best model by comparing various models and examining their relative distance to the “truth”

We do this by calculating the difference in AIC between the best fitting model (lowest AIC) and each of the other models.

Model selection uncertainty

Which model is the best?

What if you collected data at the same spot next year, next week, next door?

AIC weights: long-run interpretation vs. Bayesian.

Confidence set of models analogous to confidence intervals

Where do we get AIC?

From K, the number of parameters, and -2ln(L (model | the data))

Interpreting AIC

• Best model: the one with the lowest AICc

• Difference between the lowest AICc and each model's AICc: relative distance from truth

Interpreting AIC

AICc weight: ranges from 0 to 1, with 1 = best model

Interpreted as the relative likelihood that the model is best, given the data and the other models in the set

Interpreting AIC

The ratio of 2 weights is interpreted as the strength of evidence for one model over another

Here the best model is 0.86748/0.13056 = 6.64 times more likely to be the best model for estimating striped bass population size

Confidence model set

Analogous to a confidence interval for a parameter estimate

Using a 1/8 (0.12) rule for weight of evidence, my confidence set includes the top two models (both model likelihoods > 0.12).
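The steps from AICc values to differences, weights, evidence ratios, and a confidence set can be sketched as below. It assumes the standard definitions (relative likelihood exp(-Δ/2), weights obtained by normalizing those likelihoods to sum to 1); the model names and AICc values are hypothetical and are not the striped bass results.

```python
import math


def akaike_weights(aicc_by_model):
    """Return (deltas, relative likelihoods, weights) keyed by model name."""
    best = min(aicc_by_model.values())
    deltas = {name: value - best for name, value in aicc_by_model.items()}
    rel_lik = {name: math.exp(-0.5 * d) for name, d in deltas.items()}
    total = sum(rel_lik.values())
    weights = {name: lik / total for name, lik in rel_lik.items()}
    return deltas, rel_lik, weights


# Hypothetical AICc values for three candidate models
aicc_by_model = {"model_a": 100.0, "model_b": 102.0, "model_c": 110.0}
deltas, rel_lik, weights = akaike_weights(aicc_by_model)

# Evidence ratio: best model versus the second-ranked model
evidence_ratio = weights["model_a"] / weights["model_b"]

# Confidence set using the 1/8 rule on relative likelihoods
confidence_set = [name for name, lik in rel_lik.items() if lik > 1 / 8]

for name in aicc_by_model:
    print(name, deltas[name], round(weights[name], 4))
print(round(evidence_ratio, 2), confidence_set)
```

Note that the weights depend on which models are in the candidate set, so adding or dropping a model changes every weight.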

Linear models review

Y: response variable (dependent variable)

X: predictor variable (independent variable)

Y = b0 + b1*X + e

b0 is the intercept

b1 is the slope (parameter) associated with X

e is the residual error

Linear models review

When Y is a probability, it is bounded by 0 and 1

Y = b0 + b1*X

This can yield values < 0 or > 1, so we need to transform Y or use a link function

For probabilities, the logit link is the most useful

Logit link

$$h = \ln\left(\frac{p}{1 - p}\right)$$

h is the log odds

p is the probability of an event

Log linear models (logistic regression)

h = b0 + b1*X

h is the log odds

b0 is the intercept

b1 is the slope (parameter) associated with X

Betas are on a logit scale, and the log odds need to be back-transformed

Back transformation:

$$p = \frac{1}{1 + \exp(-h)}$$

h is the log odds

p is the probability of an event

Back transformation example

h = b0 + b1*X

b0 = - 2.5

b1 = 0.5

X = 2

Back transformation example

h = -2.5 + 0.5*2

h = -1.5

$$p = \frac{1}{1 + \exp(1.5)} = 0.18 \text{ or } 18\%$$
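The worked example can be reproduced with a pair of helper functions. This is a sketch with illustrative function names; the coefficients and X value are the ones used in the example.

```python
import math


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1 - p))


def inv_logit(h):
    """Back-transform log odds h to a probability."""
    return 1 / (1 + math.exp(-h))


b0, b1, x = -2.5, 0.5, 2
h = b0 + b1 * x
p = inv_logit(h)

print(h, round(p, 2))  # -1.5 0.18
```

Applying `logit` to the result returns the original log odds, which is a convenient check that the two functions are inverses.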

Interpreting beta estimates

Betas are on a logit scale; to interpret them, calculate odds ratios using the exponential function

b1 = 0.5

exp(0.5) = 1.65

Interpretation: for each 1 unit increase in X, the odds of the event occurring are multiplied by 1.65

For example, for each 1 inch increase in length, the odds of a fish being caught are multiplied by 1.65
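Strictly, exp(b1) multiplies the odds of the event rather than its probability. The sketch below illustrates the difference, reusing b0 = -2.5 and b1 = 0.5 from the back transformation example; the function names and the X values are hypothetical.

```python
import math

b0, b1 = -2.5, 0.5  # coefficients from the back transformation example


def probability(x):
    """Back-transformed probability at predictor value x."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)


for x in (2, 6, 10):
    p_low, p_high = probability(x), probability(x + 1)
    odds_ratio = odds(p_high) / odds(p_low)   # always exp(b1)
    prob_ratio = p_high / p_low               # changes with x
    print(x, round(odds_ratio, 2), round(prob_ratio, 2))
```

The odds ratio stays at about 1.65 for every X, while the ratio of probabilities shrinks toward 1 as the probability approaches its upper bound.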