Confidence intervals

Confidence Intervals - PowerPoint PPT Presentation





PowerPoint Slideshow about ' Confidence Intervals' - uriel-oneil


Presentation Transcript
Confidence Intervals

  • Underlying model: Unknown parameter

    • We know how to calculate point estimates

    • E.g. regression analysis

  • But different data would change our estimates.

    • So, we treat our estimates as random variables

  • Want a measure of how confident we are in our estimate.

  • Calculate “Confidence Interval”


What is it?

  • If we know how the data were sampled

    • We can construct a Confidence Interval for an unknown parameter, θ.

  • A 95% C.I. gives a range such that, over repeated samples, the true θ falls in the interval 95% of the time.

  • A 100(1-α)% C.I. captures the true θ (1-α) of the time.

    • Smaller α: more sure the true θ falls in the interval, but the interval is wider.
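
The "95% of the time" statement can be checked by simulation. The sketch below is illustrative only: the true mean, standard deviation, and seed are assumed values that do not come from the slides, and the critical value 2.365 is the t value for 7 degrees of freedom used later in the lead example.

```python
import random
import statistics

def coverage(n_trials=2000, n=8, mu=50.0, sigma=15.0, t_crit=2.365, seed=0):
    """Fraction of simulated 95% t-intervals that contain the true mean mu.

    mu, sigma and seed are assumed values chosen for illustration.
    t_crit is the 97.5th percentile of the t distribution with n - 1 = 7 d.f.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Draw a fresh sample from the assumed Normal model.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.mean(sample)
        se = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
        # Count the trial as a hit if the interval covers the true mean.
        if mean - t_crit * se <= mu <= mean + t_crit * se:
            hits += 1
    return hits / n_trials

rate = coverage()
print(f"Share of intervals containing the true mean: {rate:.3f}")
```

The printed share should land close to 0.95. Each individual interval either contains the true mean or does not; the 95% describes the procedure over many samples.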


Example 1: Lead in Water

  • Lead in drinking water causes serious health problems.

  • To test contamination, require a control site.

  • Problems:

    • Lead concentration in control site?

    • Estimate 95% confidence interval


Example 2: Gas Market

  • Recall U.S. gas market question:

    • By how much does gas consumption decrease when price increases?

    • Our linear model:

    • Estimate of β1: -.04237.

  • How confident are we in this estimate?

    • Construct 90% C.I. for this estimate


If Data ~ N(μ, σ²)

  • Since we don’t know σ, use t-distribution.

  • 95% C.I. for μ: x̄ ± t97.5 · s, where x̄ is the sample mean.

  • s is standard error of mean.

  • t97.5 is critical value of t distribution

    • Draw on board (Prob = 2.5%)


t-distribution

  • Similar to Normal Distribution

  • Requires “degrees of freedom”.

  • df = (# data points) – (# estimated parameters).

    • E.g. mean of lead concentration, 8 samples, one parameter (the mean): d.f.=7.

  • Higher d.f., closer t is to Normal distribution.
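
To see how the t critical value approaches the Normal one as d.f. grows, here is a rough simulation sketch using only the Python standard library. It relies on the standard construction of a t variable as Z / sqrt(χ²/df); the function name `t_quantile`, the number of draws, and the seed are assumptions made for illustration.

```python
import random
from statistics import NormalDist

def t_quantile(df, p=0.975, n_draws=100_000, seed=1):
    """Approximate the p-quantile of a t distribution by simulation."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        z = rng.gauss(0.0, 1.0)                 # standard Normal draw
        chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square draw with df d.f.
        draws.append(z / (chi2 / df) ** 0.5)    # t-distributed draw
    draws.sort()
    return draws[int(p * n_draws)]              # empirical quantile

normal_q = NormalDist().inv_cdf(0.975)  # Normal critical value, about 1.96
q7 = t_quantile(7)      # should come out near the tabulated 2.365
q100 = t_quantile(100)  # should come out close to the Normal value

print(f"d.f.=7: {q7:.2f}   d.f.=100: {q100:.2f}   Normal: {normal_q:.2f}")
```

The simulated values are noisy in the second decimal place; in practice the critical value is read from a t table or a statistics package.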


If Distribution Unknown

  • Can use “Bootstrapping”.

    • Draw a sample from the observed data, with replacement

    • Calculate mean

    • Repeat many times

    • Draw histogram of sample means

    • Calculate empirical 95% C.I.

  • Requires no previous knowledge of underlying process
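
The bootstrap steps above can be sketched as follows. This is a minimal percentile bootstrap, which is one common way to read off an empirical C.I.; the slides do not say which variant was used. The data values are hypothetical (they are not the lead measurements), and the function name, resample count, and seed are assumptions.

```python
import random
import statistics

def bootstrap_ci(data, n_resamples=5000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of data."""
    rng = random.Random(seed)
    n = len(data)
    means = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=n)         # draw with replacement
        means.append(statistics.mean(resample))   # calculate mean
    means.sort()                                  # sorted means stand in for the histogram
    alpha = 1.0 - level
    lo = means[int((alpha / 2) * n_resamples)]          # lower empirical percentile
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]  # upper empirical percentile
    return lo, hi

# Hypothetical measurements, for illustration only.
measurements = [12.1, 9.8, 15.3, 11.0, 13.7, 10.4, 14.2, 12.9]
lo, hi = bootstrap_ci(measurements)
print(f"Bootstrap 95% C.I. for the mean: [{lo:.2f}, {hi:.2f}]")
```

Each resample here has the same size as the original data set, which is the usual choice so that the spread of the resampled means reflects the uncertainty at the actual sample size.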


Lead Concentration

  • 8 lead measurements:

    • Mean=51.39, s=5.75, t97.5=2.365

    • Lower=51.39-(5.75)(2.365)

    • Upper= 51.39+(5.75)(2.365)

  • C.I. = [37.8,65.0]

  • Using bootstrapped samples:

  • C.I. = [40.8,62.08]
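
The t-based interval for the lead data is plain arithmetic and can be reproduced directly. The helper name `t_interval` is an assumption; the three numbers are the ones given on this slide.

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

# Values from the slide: mean, standard error of the mean, t critical value (7 d.f.).
lower, upper = t_interval(51.39, 5.75, 2.365)
print(f"95% C.I. = [{lower:.1f}, {upper:.1f}]")
```

The margin is 5.75 × 2.365 ≈ 13.6, which gives the interval [37.8, 65.0] stated above.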


Gas Regression: S-Plus

Coefficients:

                 Value  Std. Error     t value  Pr(>|t|)
(Intercept) -0.0898134   0.0507787  -1.7687217 0.0867802
         PG -0.0423712   0.0098406  -4.3057672 0.0001551
          Y  0.0001587   0.0000068  23.4188561 0.0000000
        PNC -0.1013809   0.0617077  -1.6429209 0.1105058
        PUC -0.0432496   0.0241442  -1.7913093 0.0830122

Residual standard error: 0.02680668 on 31 degrees of freedom
Multiple R-Squared: 0.9678838
F-statistic: 233.5615 on 4 and 31 degrees of freedom, the p-value is 0


Gas Price Response

  • β1=-.04237, s=.00984

  • 90% C.I.: t95=1.695 (d.f.=36-5=31, as in the S-Plus output)

  • C.I. = [-.0591,-.0257]

  • Using bootstrapped samples:

  • C.I. = [-.063,-.026]

  • Response is probably between 2.5 gallons and 6 gallons.
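
The 90% interval for the price coefficient can be recomputed from the S-Plus output. In this sketch the coefficient and standard error are the unrounded PG values from the regression table, and the critical value is the one on the slide. The observation count of 36 is inferred from 31 residual degrees of freedom plus 5 estimated coefficients; it is not stated explicitly in the slides.

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

n_obs = 36           # inferred: 31 residual d.f. + 5 coefficients
n_coef = 5           # intercept, PG, Y, PNC, PUC
df = n_obs - n_coef  # residual degrees of freedom

pg_coef = -0.0423712  # PG coefficient from the S-Plus output
pg_se = 0.0098406     # its standard error
t95 = 1.695           # 95th percentile of t with 31 d.f. (value from the slide)

lower, upper = t_interval(pg_coef, pg_se, t95)
print(f"d.f. = {df}, 90% C.I. = [{lower:.4f}, {upper:.4f}]")
```

A two-sided 90% interval leaves 5% in each tail, which is why the 95th percentile of the t distribution is used rather than the 97.5th.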


Interpretation & Other Facts

  • We are 95% confident that the true average lead concentration lies in this range: intervals built this way capture the true value in 95% of repeated samples.

  • We are 90% confident that the true value of β1 lies in this range.

  • Also can calculate “confidence region” for 2 or more variables.

