
Power 16 - PowerPoint PPT Presentation

Power 16. Review. Post-Midterm Cumulative. Projects. Logistics. Put power point slide show on a high density floppy disk, or e-mail as an attachment, for a WINTEL machine. Email Llad@econ.ucsb.edu the slide-show as a PowerPoint attachment. Assignments. 1. Project choice


PowerPoint Slideshow about 'Power 16' - debra


Presentation Transcript

Power 16

• Post-Midterm

• Cumulative

• Put power point slide show on a high density floppy disk, or e-mail as an attachment, for a WINTEL machine.

• Email Llad@econ.ucsb.edu the slide-show as a PowerPoint attachment

• 1. Project choice

• 2. Data Retrieval

• 3. Statistical Analysis

• 4. PowerPoint Presentation

• 5. Executive Summary

• 6. Technical Appendix

• 7. Graphics

Power_13

• 1. Introduction: Members 1, 2, 3

• What

• Why

• How

• 2. Executive Summary: Member 5

• 3. Exploratory Data Analysis: Member 3

• 4. Descriptive Statistics: Member 3

• 5. Statistical Analysis: Member 3

• 6. Conclusions: Members 3 & 5

• Spreadsheet of the data used and its sources or, if extensive, a subsample of the data

• Descriptive Statistics and Histograms for the variables in the study

• If time series data, a plot of each variable against time

• If relevant, a plot of the dependent variable vs. each of the explanatory variables

• Statistical Results, for example regression

• Plot of the actual values, fitted values, and errors (residuals), plus other diagnostics

• Brief summary of the conclusions, meanings drawn from the exploratory, descriptive, and statistical analysis.

• Project I: Power 16

• Contingency Table Analysis: Power 14, Lab 8

• ANOVA: Power 15, Lab 9

• Survival Analysis: Power 12, Power 11, Lab 7

• Multivariate Regression: Power 11, Lab 6

• Challenger disaster

• Number of O-Rings Failing On Launch i: y_i(#) = a + b*temp_i + e_i

• Biased because of the zeros, even if the equation is divided by 6

• Two Ways to Proceed

• Tobit, non-linear estimation: y_i(#) = a + b*temp_i + e_i

• Bernoulli variable: probability models

• Probability Models: y_i(0,1) = a + b*temp_i + e_i

• OLS, Linear Probability Model, linear approximation to the sigmoid

• Probit, non-linear estimate of the sigmoid

• Logit, non-linear estimate of the sigmoid
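The three probability models above differ in how the linear index a + b*temp is turned into a probability. The sketch below only illustrates the two sigmoid links (logit and probit) next to the raw linear index. The coefficients `A_HYP` and `B_HYP` are hypothetical values chosen for illustration; they are not estimates from the launch data, and logit and probit coefficients would not normally share a scale.

```python
import math

def logistic_cdf(z):
    """Logit link: maps any real index z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Probit link: standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# HYPOTHETICAL coefficients, for illustration only (not fitted to the launch data).
A_HYP = 10.0
B_HYP = -0.15

def linear_index(temp, a=A_HYP, b=B_HYP):
    """The index a + b*temp that all three models share."""
    return a + b * temp

# The raw index is unbounded; both sigmoid links keep the probability inside (0, 1).
for temp in (31, 53, 66, 81):
    z = linear_index(temp)
    print(temp, round(z, 2), round(logistic_cdf(z), 3), round(normal_cdf(z), 3))
```

With a negative slope, both links give a higher failure probability at lower temperatures, which is the pattern the slides test for.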

• Significant Dependence on Temperature

• t-test (or z-test) on slope, H0 : b=0

• F-test

• Wald test

• Plots of Number or Probability Vs Temp.

• Label the axes

• The most frequent sins

• Did not explicitly address significance

• Did not answer part b (66°): all launches at lower temperatures had one or more o-ring failures

• Did not execute c, estimate linear probability model

• Failure of O-rings that sealed grooves on the booster rockets

• Was there any relationship between o-ring failure and temperature?

• Engineers knew that the rubber o-rings hardened and were less flexible at low temperatures

• But was there launch data that showed a problem?

• What: Was there a relationship between launch temperature and o-ring failure prior to the Challenger disaster?

• Why: Should the launch have proceeded?

• How: Analyze the relationship between launch temperature and o-ring failure

• Data

• number of o-rings that failed

• launch temperature

• Launches where there was a problem

O-rings failed   Launch temperature (F)
1                58
1                57
1                70
1                63
1                70
2                75
3                53

• All Launches

Plot of failures per observation versus temperature range shows temperature dependence:

Mean temperature for the 7 launches with o-ring failures was lower, 63.7, than for the 17 launches without o-ring failures, 72.6.
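As a quick check, the mean temperature for the failure launches can be recomputed from the seven rows listed under "Launches where there was a problem". This is a minimal sketch; the 72.6 figure for the launches without failures is taken from the slide, since those temperatures are not listed here.

```python
# (o-rings failed, launch temperature in F) for the 7 launches with a problem,
# copied from the slide's table.
failure_launches = [(1, 58), (1, 57), (1, 70), (1, 63), (1, 70), (2, 75), (3, 53)]

temps = [temp for _, temp in failure_launches]
mean_failure_temp = sum(temps) / len(temps)

# Mean for the 17 launches without failures, as stated on the slide
# (the underlying temperatures are not given in this transcript).
MEAN_NO_FAILURE_TEMP = 72.6

difference = MEAN_NO_FAILURE_TEMP - mean_failure_temp
print(round(mean_failure_temp, 1), round(difference, 1))
```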

Contingency table analysis

Probit extrapolated to 31F:

• From extrapolating the probability models to 31 F, Linear Probability, Probit, or Logit, there was a high probability of one or more o-rings failing

• From extrapolating the Number of O-rings failing to 31 F, OLS or Tobit, 3 or more o-rings would fail.

• There had been only one launch out of 24 where as many as 3 o-rings had failed.

• Decision theory argument: expected cost/benefit ratio:


Difference in mean temperatures for failures and successes

Difference in probability of one or more o-ring failures for high and low temperature ranges

Probability models: LPM (OLS), probit, logit

Number of o-ring failures per launch vs. temperature

OLS, Tobit

Contingency table analysis

ANOVA

• Challenger example

• Probability one or more o-rings fail

• Low temp: 53-62 degrees

• Medium temp: 63-71 degrees

• High temp: 72-81 degrees

• Average number of o-rings failing per launch

• Low temp: 53-62 degrees

• Medium temp: 63-71 degrees

• High temp: 72-81 degrees

• ANOVA and Regression

• (Non-Parametric Statistics)

• (Goodman Log-Linear Model)

• Salesaj = c(1)*convenience+c(2)*quality+c(3)*price+ e

• E[Salesaj | convenience=1, quality=0, price=0] = c(1) = mean for city(1)

• c(1) = mean for city(1) (convenience)

• c(2) = mean for city(2) (quality)

• c(3) = mean for city(3) (price)

• Test the null hypothesis that the means are equal using a Wald test: c(1) = c(2) = c(3)

Regression Coefficients are the City Means; F statistic

ANOVA and Regression: One-Way, Alternative Specification

• Salesaj = c(1) + c(2)*convenience+c(3)*quality+e

• E[Salesaj | convenience=0, quality=0] = c(1) = mean for city(3) (price, the omitted one)

• E[Salesaj | convenience=1, quality=0] = c(1) + c(2) = mean for city(1) (convenience)

• c(1) = mean for city(3), the omitted city

• c(2) = mean for city(1) minus mean for city(3)

• Test that the mean for city(1) = mean for city(3)

• Using the t-statistic for c(2)

ANOVA and Regression: One-Way, Alternative Specification

• Salesaj = c(1) + c(2)*convenience+c(3)*price+e

• E[Salesaj | convenience=0, price=0] = c(1) = mean for city(2) (quality, the omitted one)

• E[Salesaj | convenience=1, price=0] = c(1) + c(2) = mean for city(1) (convenience)

• c(1) = mean for city(2), the omitted city

• c(2) = mean for city(1) minus mean for city(2)

• Test that the mean for city(1) = mean for city(2)

• Using the t-statistic for c(2)
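The dummy-variable specifications above can be checked on a toy example. The sketch below fits OLS by solving the normal equations in plain Python. The sales figures are hypothetical, made up for illustration (they are not the course data), and `ols` is an illustrative helper, not a library routine.

```python
def ols(X, y):
    """OLS coefficients from the normal equations (X'X)c = X'y, by Gauss-Jordan."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting, then normalize the pivot row.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

# HYPOTHETICAL sales for three cities (1 = convenience, 2 = quality, 3 = price).
sales = [10, 12, 14, 20, 22, 24, 30, 31, 35]
city = [1, 1, 1, 2, 2, 2, 3, 3, 3]

conv = [1.0 if c == 1 else 0.0 for c in city]
qual = [1.0 if c == 2 else 0.0 for c in city]
price = [1.0 if c == 3 else 0.0 for c in city]

# No intercept, three dummies: each coefficient is a city mean.
means_spec = ols([list(r) for r in zip(conv, qual, price)], sales)

# Intercept plus convenience and quality (price omitted):
# c(1) is the mean of the omitted city, c(2) and c(3) are differences from it.
ones = [1.0] * len(sales)
omit_price_spec = ols([list(r) for r in zip(ones, conv, qual)], sales)

print(means_spec)
print(omit_price_spec)
```

With these made-up numbers the city means are 12, 22, and 32, so the second specification should return an intercept near 32 and differences near -20 and -10.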

ANOVA and Regression: Two-Way, Series of Regressions; Compare to Table 11, Lecture 15

• Salesaj = c(1) + c(2)*convenience + c(3)* quality + c(4)*television + c(5)*convenience*television + c(6)*quality*television + e, SSR=501,136.7

• Salesaj = c(1) + c(2)*convenience + c(3)* quality + c(4)*television + e, SSR=502,746.3

• Test for interaction effect: F2, 54 = [(502746.3-501136.7)/2]/(501136.7/54) = (1609.6/2)/9280.3 = 0.09

ANOVA and Regression: Two-Way, Series of Regressions

• Salesaj = c(1) + c(2)*convenience + c(3)* quality + e, SSR=515,918.3

• Test for media effect: F1, 54 = [(515918.3-502746.3)/1]/(501136.7/54) = 13172/9280.3 = 1.42

• Salesaj = c(1) +e, SSR = 614757

• Test for strategy effect: F2, 54 = [(614757-515918.3)/2]/(501136.7/54) = (98838.7/2)/(9280.3) = 5.32
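The three F-tests above share one pattern: the drop in the sum of squared residuals between two nested regressions, divided by the number of restrictions, over the mean squared error of the full model. A minimal sketch of that arithmetic, using the SSR values quoted on the slides (the function name `f_statistic` is illustrative):

```python
# Sums of squared residuals quoted on the slides.
SSR_FULL = 501136.7            # strategy, media, and interaction terms
SSR_NO_INTERACTION = 502746.3  # strategy and media only
SSR_STRATEGY_ONLY = 515918.3   # strategy dummies only
SSR_CONSTANT_ONLY = 614757.0   # intercept only
DF_ERROR = 54                  # error degrees of freedom of the full model

def f_statistic(ssr_restricted, ssr_less_restricted, num_restrictions, mse):
    """F = [(drop in SSR) / number of restrictions] / MSE of the full model."""
    return ((ssr_restricted - ssr_less_restricted) / num_restrictions) / mse

# The slides use the full model's MSE as the denominator in every test.
mse_full = SSR_FULL / DF_ERROR

f_interaction = f_statistic(SSR_NO_INTERACTION, SSR_FULL, 2, mse_full)
f_media = f_statistic(SSR_STRATEGY_ONLY, SSR_NO_INTERACTION, 1, mse_full)
f_strategy = f_statistic(SSR_CONSTANT_ONLY, SSR_STRATEGY_ONLY, 2, mse_full)

print(f_interaction, f_media, f_strategy)
```

Each statistic would then be compared with the critical value of the F distribution with (2, 54) or (1, 54) degrees of freedom.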

• Density, f(t)

• Cumulative distribution function, CDF, F(t)

• Probability you failed up to time t* = F(t*)

• Survivor Function, S(t) = 1-F(t)

• Probability you survived longer than t*, S(t*)

• Kaplan-Meier estimates:

(# at risk - # ending)/# at risk

• Applications

• Testing a new drug

• Current standard for ovarian cancer is taxol and a platinate such as cisplatin

• Previous standard was cyclophosphamide and cisplatin

• Kaplan-Meier Survival curves comparing the two regimens

• Lab 7: (# at risk - # ending)/# at risk

interrupts cell division (mitosis)

It is a cyclical hydrocarbon

342 at risk for Tc, 292 survived 1 year

Bottom panel: Gynecological Oncology Group, 196 at risk for Tc, 168 survived 1 year
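The Kaplan-Meier ratio, (# at risk - # ending)/# at risk, can be applied to the counts quoted above. This sketch treats the whole first year as a single interval and assumes no censoring within the year, which the slides do not state. The function names are illustrative.

```python
def interval_survival(at_risk, ending):
    """Fraction surviving one interval: (# at risk - # ending) / # at risk."""
    return (at_risk - ending) / at_risk

def kaplan_meier(intervals):
    """Cumulative survival: running product of the interval survival fractions.

    `intervals` is a list of (at_risk, ending) pairs in time order.
    """
    survival = 1.0
    curve = []
    for at_risk, ending in intervals:
        survival *= interval_survival(at_risk, ending)
        curve.append(survival)
    return curve

# Counts from the slides: number at risk on Tc and number surviving 1 year.
first_study = interval_survival(342, 342 - 292)
gog_study = interval_survival(196, 196 - 168)

print(round(first_study, 3), round(gog_study, 3))
```

With several intervals, `kaplan_meier` multiplies the per-interval fractions, which is how the stepwise survival curve is built up.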

• What to do when the sample of observations is not distributed normally?

• Wilcoxon Rank Sum Test for independent samples

• Data Analysis Plus

• Signs Test for Matched Pairs: Rated Data

• Eviews, Descriptive Statistics

• Wilcoxon Signed Rank Sum Test for Matched Pairs: Quantitative Data

• Eviews

• Testing the difference between the means of two populations when they are non-normal

• A New Painkiller Vs. Aspirin, Xm17-02

• 30 total ratings for both samples

• 3 ratings of 1

• 5 ratings of 2

• etc

5 30 27

Rank Sum 276.5 188.5

• E(T) = n1(n1 + n2 + 1)/2 = 15*31/2 = 232.5

• VAR(T) = n1*n2(n1 + n2 + 1)/12

• VAR(T) = 15*15*31/12 = 581.25, sT = 24.1

• For sample sizes larger than 10, T is approximately normal

• Z = [T - E(T)]/sT = (276.5 - 232.5)/24.1 = 1.83

• Null Hypothesis is that the central tendency for the two drugs is the same

• Alternative hypothesis: central tendency for the new drug is greater than for aspirin: 1-tailed test

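• Critical value for a 1-tailed test at the 5% level: z = 1.645; Z = 1.83 exceeds it, so the null hypothesis is rejected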
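The rank sum arithmetic above can be reproduced in a few lines. This is a minimal sketch of the normal approximation only, taking the rank sum T = 276.5 and the sample sizes n1 = n2 = 15 from the slides; it does not rank the raw ratings.

```python
import math

n1 = 15     # sample size behind T
n2 = 15     # second sample size (n1 + n2 + 1 = 31, as on the slide)
T = 276.5   # rank sum quoted on the slide

expected_T = n1 * (n1 + n2 + 1) / 2
var_T = n1 * n2 * (n1 + n2 + 1) / 12
sd_T = math.sqrt(var_T)

# Normal approximation to the distribution of T.
z = (T - expected_T) / sd_T

# One-tailed p-value from the standard normal upper tail.
p_one_tailed = 0.5 * math.erfc(z / math.sqrt(2.0))

CRITICAL_Z_5PCT = 1.645  # one-tailed 5% critical value
print(expected_T, var_T, round(sd_T, 1), round(z, 2), z > CRITICAL_Z_5PCT)
```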