
The Wealth of Nations

The Wealth of Nations. Jamie Brabston Matt Caulfield Mark Testa. Overview. Introduction Regression of Individual Variables Multicollinearity Multiple Regression Stepwise Regression Final Model. Introduction. Collected data for 30 countries 12 variables


Presentation Transcript


  1. The Wealth of Nations Jamie Brabston Matt Caulfield Mark Testa

  2. Overview • Introduction • Regression of Individual Variables • Multicollinearity • Multiple Regression • Stepwise Regression • Final Model

  3. Introduction • Collected data for 30 countries • 12 variables • Life expectancy, median age, population growth, population density, literacy rate, unemployment rate, oil consumption – oil production, cell phone / land line, military expenditures, area, sex ratio, external debt • Goal: create a model to predict GDP per capita

  4. Life Expectancy

  5. Life Expectancy • Analysis: R² = 0.45; the p-value is highly significant. • An outlier was identified using a leverage-residual plot and removed. • The Residuals vs. Fitted Values plot showed nonlinearity. • A Box-Cox transform was tried.
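A single-variable regression of this kind can be sketched as follows. This is a minimal illustration using only the Python standard library, not the presenters' actual analysis: the data values are made up, and the function name `fit_line` is hypothetical. It reports R² and the t statistic for the slope, which is the quantity a p-value is derived from.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x.

    Returns (intercept, slope, r_squared, t_slope).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1.0 - rss / syy
    # t statistic for H0: slope = 0, on n - 2 degrees of freedom
    se_b = math.sqrt(rss / (n - 2) / sxx)
    return a, b, r_squared, b / se_b

# Made-up illustrative values (NOT the 30-country data set from the slides).
life_expectancy = [52.0, 58.0, 63.0, 67.0, 71.0, 74.0, 77.0, 80.0]
gdp_per_capita = [1.1, 1.9, 3.2, 4.0, 8.5, 12.0, 24.0, 38.0]

a, b, r2, t = fit_line(life_expectancy, gdp_per_capita)
print(f"intercept={a:.3f} slope={b:.3f} R^2={r2:.3f} t={t:.2f}")
```

Comparing the t statistic with a t distribution on n - 2 degrees of freedom gives the p-value; a statistics package would normally do that step.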

  6. Life Expectancy - Top: Influential data points. - Bottom: Non-influential data points. - Left: Non-outliers. - Right: Outliers. Upshot: Eliminate points in the top right quadrant as influential outliers.
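The quadrant rule above can be sketched in code. This is an illustration under stated assumptions, not the presenters' procedure: the cutoffs (leverage above 2k/n with k = 2 coefficients, absolute standardized residual above 2) are common rules of thumb chosen here, the data are made up, and the function names are hypothetical.

```python
import math

def leverage_and_std_residuals(x, y):
    """Leverage (hat values) and internally standardized residuals
    for the simple regression y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - 2)  # residual variance
    leverage = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    std_res = [e / math.sqrt(s2 * (1.0 - h)) for e, h in zip(residuals, leverage)]
    return leverage, std_res

def influential_outliers(x, y, res_cut=2.0):
    """Indices in the 'top right' quadrant: high leverage AND large residual.
    Both cutoffs are assumed rules of thumb, not values from the slides."""
    leverage, std_res = leverage_and_std_residuals(x, y)
    lev_cut = 2 * 2 / len(x)  # 2 * (number of coefficients) / n
    return [i for i, (h, r) in enumerate(zip(leverage, std_res))
            if h > lev_cut and abs(r) > res_cut]

# Made-up data: the last point sits far from the others in x and off the trend.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 20.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 8.1, 9.0, 5.0]
print(influential_outliers(x, y))
```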

  7. Life Expectancy • Box-Cox Transform: $y \to (y^p - 1)/p$ • Produces a linear fit if the variables are related by a power law. This plot shows the goodness of the fit as a function of p. In this case, the optimal p is fairly small.
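One way to produce a "goodness of fit as a function of p" curve is to evaluate the profile log-likelihood of the straight-line fit over a grid of p values. The sketch below assumes that approach (the slides do not say which criterion was used), uses made-up data, and the function names are hypothetical. The p = 0 case is taken as the usual limit, log(y).

```python
import math

def box_cox(y, p):
    """Box-Cox transform (y^p - 1)/p; the p -> 0 limit is log(y). Requires y > 0."""
    if abs(p) < 1e-12:
        return math.log(y)
    return (y ** p - 1.0) / p

def profile_log_likelihood(x, y, p):
    """Gaussian log-likelihood (up to a constant) of a straight-line fit
    to the Box-Cox-transformed response, including the Jacobian term."""
    n = len(x)
    z = [box_cox(yi, p) for yi in y]
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    b = sxz / sxx
    a = mean_z - b * mean_x
    rss = sum((zi - (a + b * xi)) ** 2 for xi, zi in zip(x, z))
    return -0.5 * n * math.log(rss / n) + (p - 1.0) * sum(math.log(yi) for yi in y)

def best_p(x, y, grid):
    """Grid value of p with the highest profile log-likelihood."""
    return max(grid, key=lambda p: profile_log_likelihood(x, y, p))

# Made-up data: log(y) is linear in x plus a small alternating disturbance.
x = [float(i) for i in range(1, 13)]
y = [math.exp(0.3 * xi + (0.05 if i % 2 == 0 else -0.05)) for i, xi in enumerate(x)]
grid = [k / 10 for k in range(-10, 11)]
p_hat = best_p(x, y, grid)
print("selected p:", p_hat)
```

A coarse grid is enough to see where the curve peaks; a finer grid or a numerical optimizer could refine the estimate.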

  8. Life Expectancy • Linear regression was done on the Box-Cox-transformed data. Significant nonlinearity remained.

  9. Life Expectancy • Conclusions: Clearly, there is a significant positive relationship between per capita GDP and life expectancy. • We could not identify the precise nature of the relationship. • This prevents extrapolation and prediction.

  10. Median Age

  11. Median Age • Analysis: R² = 0.58; the p-value is highly significant. • No suspected outliers. • The Residuals vs. Fitted Values plot shows an approximately linear relationship, but the residuals deviate significantly from normality.

  12. Median Age • Box-Cox Transform gives:

  13. Median Age • The Box-Cox transform significantly improved the normality of the residual distribution. • The Box-Cox p = 0.15. • R² improved to 0.72. • Final Model: $(\mathrm{GDP}^{0.15} - 1)/0.15 = -2.1 + 0.17\,(\text{Med. Age})$
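Because the final model is stated on the Box-Cox scale, predicting GDP itself requires inverting the transform: GDP = (1 + p·z)^(1/p), where z is the fitted right-hand side. A minimal sketch, using the coefficients from the slide; the function names are hypothetical, and the units of GDP are whatever the presenters used (not stated in the transcript).

```python
P = 0.15          # Box-Cox power from the slide
INTERCEPT = -2.1  # fitted intercept from the slide
SLOPE = 0.17      # fitted coefficient on median age from the slide

def transformed_gdp(median_age):
    """Fitted value on the Box-Cox scale: (GDP^0.15 - 1)/0.15."""
    return INTERCEPT + SLOPE * median_age

def predict_gdp(median_age):
    """Invert the Box-Cox transform to get GDP on its original scale."""
    base = 1.0 + P * transformed_gdp(median_age)
    if base <= 0:
        raise ValueError("inverse transform is undefined for this median age")
    return base ** (1.0 / P)

for age in (20.0, 30.0, 40.0):
    print(age, round(predict_gdp(age), 2))
```

The back-transformed value is best read as a median-type prediction on the original scale, since the model's errors are assumed symmetric only on the transformed scale.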

  14. Population Growth

  15. Population Growth • Analysis: R² = 0.058; p-value = 0.11. • Correlation is very low, and the p-value is above both the 0.05 and 0.10 significance levels. • An outlier was found and eliminated using a leverage-residual plot.

  16. Population Growth Box-Cox Transform:

  17. Population Growth • A Box-Cox transform reduced the nonlinearity slightly and gave a significant p-value. • From this, we concluded that population growth has a slight negative relationship with GDP. • No detailed predictions are possible because significant nonlinearity remains.

  18. Population Density

  19. Population Density • Analysis: The outlier on the far right corresponds to Singapore, a country with an exceptionally high population density. • A less extreme outlier is China. Both of these data points were removed.

  20. Population Density

  21. Population Density • The p-value for the data without outliers is 0.68, far from significant. • A Box-Cox transform was attempted, but the p-value did not get close to significance. • Conclusion: Population density and GDP are essentially unrelated.

  22. Literacy Rate

  23. Final model: GDP = -3.320 + 0.0657 (literacy rate)

  24. Unemployment Rate

  25. Final model: GDP = 1.388 - 0.0236 (unemployment rate)

  26. Oil Consumption – Production

  27. Final model: GDP = -3.320 + 0.0657 (literacy rate)

  28. Cell phones vs. Landlines

  29. Final model: GDP = 1.52811 - 0.0928 (cells vs. landlines)

  30. Military Expenditures
