# Confidence Intervals - PowerPoint PPT Presentation

## PowerPoint Slideshow about 'Confidence Intervals' - uriel-oneil

Presentation Transcript

• Underlying model: Unknown parameter

• We know how to calculate point estimates

• E.g. regression analysis

• But different data would change our estimates.

• So, we treat our estimates as random variables

• Want a measure of how confident we are in our estimate.

• Calculate “Confidence Interval”

• If we know how the data were sampled

• We can construct a Confidence Interval for an unknown parameter, θ.

• A 95% C.I. gives a range such that true θ is in interval 95% of the time.

• A 100(1-α)% C.I. captures true θ (1-α) of the time.

• Smaller α, more sure true θ falls in interval, but wider interval.
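The coverage claim can be checked by simulation. The following is a minimal sketch, not part of the original slides. It assumes normally distributed data, invented values for the true mean and standard deviation, samples of size 8, and the t critical value 2.365 (7 degrees of freedom) that the presentation quotes later. The function name `coverage` is hypothetical.

```python
import random
import statistics

def coverage(true_mean, true_sd, n, t_crit, trials=10_000, seed=0):
    """Fraction of t-based intervals that contain true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.mean(sample)
        # Standard error of the mean: sample sd / sqrt(n).
        se = statistics.stdev(sample) / n ** 0.5
        if mean - t_crit * se <= true_mean <= mean + t_crit * se:
            hits += 1
    return hits / trials

# Assumed values for illustration only: true mean 50, sd 15, n = 8.
# 2.365 is the 97.5th percentile of the t distribution with 7 d.f.
rate = coverage(true_mean=50.0, true_sd=15.0, n=8, t_crit=2.365)
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

With these settings the printed share should land close to 0.95. Passing a larger critical value (a smaller α) raises the share but widens every interval.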

• Lead in drinking water causes serious health problems.

• To test contamination, require a control site.

• Problems:

• Lead concentration in control site?

• Estimate 95% confidence interval

• Recall U.S. gas market question:

• By how much does gas consumption decrease when price increases?

• Our linear model:

• Estimate of β1: -.04237.

• How confident are we in this estimate?

• Construct 90% C.I. for this estimate

If Data ~ N(μ, σ²)

• Since we don’t know σ, use t-distribution.

• 95% C.I. for μ: mean ± t97.5 × s

• s is standard error of mean.

• t97.5 is critical value of t distribution

• Draw on board (Prob = 2.5%)

• Similar to Normal Distribution

• Requires “degrees of freedom”.

• df = (# data points) – (# variables).

• E.g. mean of lead concentration, 8 samples, one variable: d.f.=7.

• Higher d.f., closer t is to Normal distribution.

• Can use “Bootstrapping”.

• Draw large sample with replacement

• Calculate mean

• Repeat many times

• Draw histogram of sample means

• Calculate empirical 95% C.I.

• Requires no previous knowledge of underlying process
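The bootstrap steps above can be sketched as follows. This is an illustrative sketch rather than the presenter's own code. The measurements are invented, the function name `bootstrap_ci` is hypothetical, and two choices are assumptions: each resample has the same size as the data, and the empirical interval is read off the percentiles of the resampled means.

```python
import random
import statistics

def bootstrap_ci(data, level=0.95, reps=10_000, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        # Draw a resample of the same size as the data, with replacement.
        resample = rng.choices(data, k=len(data))
        means.append(statistics.mean(resample))
    means.sort()
    # Cut off (1 - level) / 2 of the bootstrap means in each tail.
    tail = (1 - level) / 2
    lower = means[int(tail * reps)]
    upper = means[int((1 - tail) * reps) - 1]
    return lower, upper

# Hypothetical measurements, invented for this example.
samples = [38.0, 72.0, 44.0, 60.0, 30.0, 55.0, 67.0, 46.0]
low, high = bootstrap_ci(samples)
print(f"Bootstrap 95% C.I. for the mean: [{low:.1f}, {high:.1f}]")
```

The sorted list `means` is the data behind the histogram of sample means. The interval is the range left after trimming 2.5% from each end.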

• Mean=51.39, s=5.75, t97.5=2.365

• Lower=51.39-(5.75)(2.365)

• Upper= 51.39+(5.75)(2.365)

• C.I. = [37.8,65.0]

• Using bootstrapped samples:

• C.I. = [40.8,62.08]
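The t-based interval above can be reproduced from the three quoted numbers. A minimal sketch, where the helper name `t_interval` is hypothetical:

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

# Values quoted in the slides: mean 51.39, standard error 5.75,
# t critical value 2.365 (97.5th percentile, 7 d.f.).
lower, upper = t_interval(51.39, 5.75, 2.365)
print(f"95% C.I. = [{lower:.1f}, {upper:.1f}]")  # prints 95% C.I. = [37.8, 65.0]
```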

Coefficients:

| | Value | Std. Error | t value | Pr(>\|t\|) |
|---|---|---|---|---|
| (Intercept) | -0.0898134 | 0.0507787 | -1.7687217 | 0.0867802 |
| PG | -0.0423712 | 0.0098406 | -4.3057672 | 0.0001551 |
| Y | 0.0001587 | 0.0000068 | 23.4188561 | 0.0000000 |
| PNC | -0.1013809 | 0.0617077 | -1.6429209 | 0.1105058 |
| PUC | -0.0432496 | 0.0241442 | -1.7913093 | 0.0830122 |

Residual standard error: 0.02680668 on 31 degrees of freedom

Multiple R-Squared: 0.9678838

F-statistic: 233.5615 on 4 and 31 degrees of freedom, the p-value is 0

• β1=-.04237, s=.00984

• 90% C.I.: t95=1.695 (d.f.=36-5=31)

• C.I. = [-.0591,-.0257]

• Using bootstrapped samples:

• C.I. = [-.063,-.026]
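As a check on the t-based interval, the arithmetic can be written out from the PG coefficient, its standard error, and the critical value quoted above. The hat notation is an addition here, not the slides' own:

$$
\hat{\beta}_1 \pm t_{95}\, s = -0.04237 \pm (1.695)(0.00984) \approx -0.04237 \pm 0.01668
$$

This gives an interval of roughly [-0.059, -0.026], close to the bootstrapped interval.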

• Response is probably between 2.5 gallons and 6 gallons.

• There is a 95% chance that the true average lead concentration lies in this range.

• There is a 90% chance that the true value of β1 lies in this range.

• Also can calculate “confidence region” for 2 or more variables.