## PowerPoint Slideshow about 'Confidence Intervals' - uriel-oneil


Presentation Transcript

Confidence Intervals

- Underlying model: Unknown parameter
- We know how to calculate point estimates
- E.g. regression analysis

- But different data would change our estimates.
- So, we treat our estimates as random variables

- Want a measure of how confident we are in our estimate.
- Calculate “Confidence Interval”

What is it?

- If we know how the data were sampled
- We can construct a Confidence Interval for an unknown parameter, θ.

- A 95% C.I. gives a range such that, over repeated samples, the true θ is in the interval 95% of the time.
- A 100(1-α)% C.I. captures the true θ (1-α) of the time.

- Smaller α: more sure the true θ falls in the interval, but the interval is wider.
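The coverage claim above can be checked by simulation. The sketch below is only an illustration: the true mean, standard deviation, sample size and trial count are made-up values, and σ is treated as known so that a Normal critical value can be used. It draws many samples, builds an interval from each, and counts how often the interval contains the true mean.

```python
import random
from statistics import NormalDist

def coverage(alpha, n=20, trials=5000, mu=50.0, sigma=6.0, seed=0):
    """Return (fraction of intervals containing mu, interval width).

    mu, sigma, n and trials are illustrative assumptions, not slide values.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided Normal critical value
    half_width = z * sigma / n ** 0.5        # sigma is treated as known here
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials, 2 * half_width

for alpha in (0.10, 0.05, 0.01):
    frac, width = coverage(alpha)
    print(f"alpha={alpha:.2f}  coverage={frac:.3f}  width={width:.2f}")
```

The observed coverage should sit close to 1-α in each case, and the width grows as α shrinks, which is the trade-off stated in the last bullet.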

Example 1: Lead in Water

- Lead in drinking water causes serious health problems.
- To test contamination, require a control site.
- Problems:
- Lead concentration in control site?
- Estimate 95% confidence interval

Example 2: Gas Market

- Recall U.S. gas market question:
- By how much does gas consumption decrease when price increases?
- Our linear model:
- Estimate of β1: -.04237.

- How confident are we in this estimate?
- Construct 90% C.I. for this estimate

If Data ~ N(μ, σ²)

- Since we don’t know σ, use the t-distribution.
- 95% C.I. for μ: (sample mean) ± t97.5 · s
- s is the standard error of the mean.
- t97.5 is the critical value (97.5th percentile) of the t distribution.
- Draw on board (Prob = 2.5%)
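As a rough sketch of the interval described above, the function below computes the sample mean, the standard error of the mean (sample standard deviation divided by √n), and the interval mean ± t · s. The data values are hypothetical, chosen only to make the example run. The critical value is passed in by the caller, since it depends on the degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def t_interval(data, t_crit):
    """Two-sided C.I. for the mean: xbar +/- t_crit * (standard error)."""
    n = len(data)
    xbar = mean(data)
    se = stdev(data) / sqrt(n)  # standard error of the mean
    return xbar - t_crit * se, xbar + t_crit * se

# Hypothetical sample of 8 measurements (not data from the slides).
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 4.9]

# 2.365 is the 97.5th percentile of the t distribution with 7 d.f.
low, high = t_interval(sample, t_crit=2.365)
print(f"95% C.I. for the mean: [{low:.2f}, {high:.2f}]")
```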

t-distribution

- Similar to Normal Distribution
- Requires “degrees of freedom”.
- df = (# data points) – (# estimated parameters).
- E.g. mean of lead concentration, 8 samples, one parameter (the mean): d.f.=7.

- Higher d.f., closer t is to Normal distribution.
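To see the t critical value approach the Normal one as d.f. grows, the sketch below computes t quantiles numerically. Python's standard library has no t-distribution, so this integrates the t density with Simpson's rule and inverts it by bisection. The function names, step counts and search bound are illustrative choices, and a statistics package would normally be used instead.

```python
from math import exp, lgamma, log, log1p, pi
from statistics import NormalDist

def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0, via Simpson's rule on the t density."""
    if x == 0:
        return 0.5
    # Log of the normalising constant of the t density.
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)

    def pdf(u):
        return exp(log_c - (df + 1) / 2 * log1p(u * u / df))

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def t_crit(p, df, upper=100.0):
    """Value t with P(T <= t) = p, for 0.5 < p < 1, found by bisection."""
    low, high = 0.0, upper
    for _ in range(40):
        mid = (low + high) / 2
        if t_cdf(mid, df) < p:
            low = mid
        else:
            high = mid
    return (low + high) / 2

z = NormalDist().inv_cdf(0.975)
for df in (2, 7, 31, 1000):
    print(f"d.f.={df:5d}  t97.5={t_crit(0.975, df):.3f}  (Normal: {z:.3f})")
```

With 7 d.f. this should reproduce the 2.365 used for the lead example, and for large d.f. the value settles near the Normal 1.96.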

If Distribution Unknown

- Can use “Bootstrapping”.
- Draw a sample from the data, with replacement (usually the same size as the original data)
- Calculate mean
- Repeat many times
- Draw histogram of sample means
- Calculate empirical 95% C.I.

- Requires no previous knowledge of underlying process
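The bootstrap steps listed above can be sketched as follows. This is a minimal percentile bootstrap for the mean, under assumptions the slides do not spell out: each resample has the same size as the original data, and the empirical interval is read off the sorted resample means. The data values are hypothetical.

```python
import random

def bootstrap_ci(data, level=0.95, n_resamples=5000, seed=0):
    """Percentile bootstrap C.I. for the mean of `data`."""
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)  # draw n values with replacement
        means.append(sum(resample) / n)    # mean of this resample
    means.sort()                           # sorted means = the "histogram"
    tail = (1 - level) / 2
    low = means[int(tail * n_resamples)]
    high = means[int((1 - tail) * n_resamples) - 1]
    return low, high

# Hypothetical sample of 8 measurements (not data from the slides).
sample = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4, 5.6, 4.9]
low, high = bootstrap_ci(sample)
print(f"Bootstrap 95% C.I. for the mean: [{low:.2f}, {high:.2f}]")
```

No distributional assumption enters the calculation. The interval comes entirely from the spread of the resampled means.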

Lead Concentration

- 8 lead measurements:
- Mean=51.39, s=5.75, t97.5=2.365
- Lower=51.39-(5.75)(2.365)
- Upper= 51.39+(5.75)(2.365)

- C.I. = [37.8,65.0]
- Using bootstrapped samples:
- C.I. = [40.8,62.08]

Gas Regression: S-Plus

Coefficients:

              Value       Std. Error  t value     Pr(>|t|)
(Intercept)  -0.0898134   0.0507787   -1.7687217  0.0867802
PG           -0.0423712   0.0098406   -4.3057672  0.0001551
Y             0.0001587   0.0000068   23.4188561  0.0000000
PNC          -0.1013809   0.0617077   -1.6429209  0.1105058
PUC          -0.0432496   0.0241442   -1.7913093  0.0830122

Residual standard error: 0.02680668 on 31 degrees of freedom

Multiple R-Squared: 0.9678838

F-statistic: 233.5615 on 4 and 31 degrees of freedom, the p-value is 0

Gas Price Response

- β1=-.04237, s=.00984
- 90% C.I.: t95=1.695 (d.f.=36-5=31, as in the S-Plus output)
- C.I. = [-.0591,-.0256]
- Using bootstrapped samples:
- C.I. = [-.063,-.026]
- Response is probably between 2.5 gallons and 6 gallons.
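The t-based interval for the price coefficient can be recomputed from the S-Plus output. The sketch below takes the PG estimate and standard error from the coefficient table, uses the 31 residual degrees of freedom reported there, and uses the slide's critical value of 1.695. The variable names are illustrative.

```python
# Values copied from the S-Plus coefficient table (row PG).
beta_pg = -0.0423712
se_pg = 0.0098406

# "Residual standard error ... on 31 degrees of freedom" in the output.
resid_df = 31
n_coefficients = 5
n_obs = resid_df + n_coefficients  # number of observations implied by the output

t95 = 1.695  # 95th percentile of t with 31 d.f., as given on the slide

half_width = t95 * se_pg
low, high = beta_pg - half_width, beta_pg + half_width
t_stat = beta_pg / se_pg  # should match the "t value" column

print(f"n = {n_obs}, d.f. = {resid_df}")
print(f"t value for PG: {t_stat:.3f}")
print(f"90% C.I. for the PG coefficient: [{low:.4f}, {high:.4f}]")
```

The result agrees with the t-based interval on the slide up to rounding in the last digit.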

Interpretation & Other Facts

- We are 95% confident that the true average lead concentration lies in this range: intervals built this way capture it 95% of the time.
- We are 90% confident that the true value of β1 lies in this range.
- Also can calculate “confidence region” for 2 or more variables.
