
Chap 2. SIMPLE LINEAR REGRESSION MODEL


Presentation Transcript


  1. Chap 2. SIMPLE LINEAR REGRESSION MODEL by Bambang Juanda

  2. Definition of Model
  • Problem formulation → model.
  • Model: an abstraction of reality expressed as a mathematical equation.
  • Econometric model: a statistical model that includes an error term:
    Y = f(X1, X2, ..., Xp) + error (2.1)
    actual data = estimate + residual
    data = systematic term + non-systematic term
    estimated Y = f(X1, X2, ..., Xp) (2.2)

  3. Sources of the Error Term
  • Measurement error and the use of proxies for the dependent variable Y and the explanatory variables X1, X2, ..., Xp.
  • An incorrect assumption about the functional form.
  • Omitted variables.
  • Unpredictable (purely random) effects.

  4. Simple Linear Regression Model
  • Relationship between two variables through a function that is linear in the parameters.
  • Population regression model: Yi = β0 + β1 Xi + εi, where Yi is the response (dependent) variable, Xi is the explanatory (independent) variable, β0 is the intercept, β1 is the slope, and εi is the random error.
  • Sample regression model: Ŷi = b0 + b1 Xi.

  5. Population Regression Model
  • Observed value: Yi = β0 + β1 Xi + εi, where εi is the random error for observation i.
  • Conditional mean (population regression line): μY|Xi = β0 + β1 Xi.
  [Figure: scatter of observed Y values against X with the population regression line.]

  6. Simple Linear Regression Equation (Example)
  Square footage and annual sales ($000) for a sample of 7 grocery stores:

  Store | Square Footage | Annual Sales ($000)
    1   |     1,726      |       3,681
    2   |     1,542      |       3,395
    3   |     2,816      |       6,653
    4   |     5,555      |       9,543
    5   |     1,292      |       3,318
    6   |     2,208      |       5,563
    7   |     1,313      |       3,760

  7. Scatter Diagram

  8. Sample Linear Regression Model
  Ŷi = b0 + b1 Xi
  • Ŷi = estimated Y for the i-th observation
  • Xi = value of X for the i-th observation
  • b0 = estimated intercept (estimate of β0); the average Y when X = 0
  • b1 = estimated slope (estimate of β1); the average difference in Y when X differs by 1 unit

  9. Ordinary Least Squares (OLS) Estimation Method
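
OLS chooses b0 and b1 to minimize the sum of squared residuals Σ(Yi - Ŷi)². The transcript does not carry the slide's formulas, so the following is a minimal sketch of the standard closed-form solution, b1 = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)² and b0 = Ȳ - b1 X̄, applied to the store data of slide 6 (plain Python, no external libraries).

```python
# Minimal sketch of the closed-form OLS estimates for the slide 6 data.
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]   # square footage
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]   # annual sales ($000)

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = s_xy / s_xx            # slope: Sxy / Sxx
b0 = y_bar - b1 * x_bar     # intercept: y_bar - b1 * x_bar

print(f"Yhat = {b0:.3f} + {b1:.4f} X")   # roughly Yhat = 1636.415 + 1.4866 X
```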

  10. The “Best” Straight Line Equation (regression output)

  Predictor   Coef     SE Coef   T      P
  Constant    1636.4   451.5     3.62   0.015
  X           1.4866   0.1650    9.01   0.000

  S = 611.752   R-Sq = 94.2%   R-Sq(adj) = 93.0%

  Analysis of Variance
  Source           DF   SS         MS         F       P
  Regression       1    30380456   30380456   81.18   0.000
  Residual Error   5    1871200    374240
  Total            6    32251656
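
A comparable summary can be reproduced in software. The sketch below assumes the statsmodels package is available; its output layout differs from the Minitab-style table above, but the coefficients, t statistics, R-squared and F should match.

```python
# Sketch of reproducing the regression summary with statsmodels (assumed dependency).
import numpy as np
import statsmodels.api as sm

x = np.array([1726, 1542, 2816, 5555, 1292, 2208, 1313], dtype=float)
y = np.array([3681, 3395, 6653, 9543, 3318, 5563, 3760], dtype=float)

X = sm.add_constant(x)        # adds the intercept column
fit = sm.OLS(y, X).fit()
print(fit.summary())          # coefficients, standard errors, t, p, R-squared, F
```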

  11. The “Best” Straight Line Equation
  Ŷi = 1636.415 + 1.487 Xi

  12. Interpretation of Coefficients
  Ŷi = 1636.415 + 1.487 Xi
  • Loose interpretation of the slope value 1.487: for an increase of 1 unit in X, the estimated Y increases by 1.487 units.
  • More precise interpretation: the average difference in annual sales between stores whose floor areas differ by 1 square foot is estimated at $1,487 per year (Y is measured in $000).
  • Implication of the estimated slope (under certain assumptions): when a store's size increases by 1 square foot, the model predicts that expected sales will increase by $1,487 per year.

  13. Assumptions of the Linear Regression Model
  • Normality of the errors
  • Homoscedasticity (constant variance) of the errors
  • Independence of the errors

  14. Variance of the Error around the Regression Line
  [Figure: error densities f(e) with equal variance at X1 and X2 around the regression line.]

  15. Estimated Standard Error
  • Property of the OLS estimator: bi ~ N(βi; σ²bi), i.e., the sampling distribution of bi is normal and centered at βi.
  • Sbi, the estimated standard error of bi, estimates the spread of this sampling distribution.
  [Figure: sampling distribution of bi centered at βi.]

  16. Inference on the Slope: t-test
  • t-test for the population slope: is there a linear relationship between X and Y?
  • Statistical hypotheses:
    H0: β1 = 0 (X cannot explain Y)
    H1: β1 ≠ 0 (X can explain Y)
  • Test statistic: t = b1 / Sb1, where Sb1 is the estimated standard error of b1, with df = n - 2.
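
The sketch below carries out this t-test using the estimates already reported in the slide 10 output (b1 = 1.4866, Sb1 = 0.1650, n = 7); scipy is assumed to be available for the t distribution.

```python
# Minimal sketch of the slope t-test from the reported estimates.
from scipy import stats

b1, s_b1, n = 1.4866, 0.1650, 7
df = n - 2

t_stat = b1 / s_b1                          # tests H0: beta1 = 0
p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value
t_crit = stats.t.ppf(0.975, df)             # alpha = 0.05, two-sided

print(t_stat, t_crit, p_value)              # about 9.01, 2.5706, 0.0003
```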

  17. Inference on the Slope: Example of the t-test
  • H0: β1 = 0, H1: β1 ≠ 0, α = .05, df = 7 - 2 = 5
  • Critical values: ±2.5706 (.025 in each tail)
  • t-test statistic: t = 1.4866 / 0.1650 = 9.01
  • Decision: reject H0 (9.01 falls in the rejection region).
  • Conclusion: there is a linear relationship; the bigger the store, the larger its sales.

  18. Confidence Interval for the Slope
  • b1 ± t(n-2) Sb1
  • From the Excel output for the grocery-store problem: we estimate with 95% confidence that the population slope lies between 1.062 and 1.911. (This confidence interval excludes 0.)
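
The interval quoted above can be checked directly from b1 ± t(n-2) Sb1 with the reported estimates; the short sketch below assumes scipy for the t quantile.

```python
# Minimal sketch of the 95% confidence interval for the slope.
from scipy import stats

b1, s_b1, n = 1.4866, 0.1650, 7
t_crit = stats.t.ppf(0.975, n - 2)   # 2.5706 for df = 5

lower, upper = b1 - t_crit * s_b1, b1 + t_crit * s_b1
print(lower, upper)                  # roughly 1.062 and 1.911
```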

  19. Level of Significance (α) and Rejection Region
  • Sampling distribution of the estimator: b1 ~ N(β1; σ²b1).
  • H0: β1 ≥ k vs H1: β1 < k — rejection region of size α in the lower tail, below the critical value.
  • H0: β1 ≤ k vs H1: β1 > k — rejection region of size α in the upper tail, above the critical value.
  • H0: β1 = k vs H1: β1 ≠ k — rejection regions of size α/2 in each tail.

  20. Assumptions of the Linear Regression Model
  • The εi are normally, independently and identically distributed for i = 1, ..., n:
    (i) εi ~ N(0; σ²)
      - Independence: Cov(εt, εs) = E(εt εs) = 0 for t ≠ s.
      - Homoscedasticity: Var(εi) = E(εi²) = σ².
    (ii) X is a fixed (non-random) variable.
  • Under these assumptions, the OLS estimates of βi are Best Linear Unbiased Estimators and are normally distributed.
  • The estimated average Y for a given Xi is normally distributed: μ̂Y|Xi ~ N(β0 + β1 Xi; σ²μ̂).
  • The predicted individual Y for a given Xi equals the estimated average, and is also normally distributed but with a larger variance: Ŷ|Xi ~ N(β0 + β1 Xi; σ²Ŷ).

  21. Interval Estimates of Forecast Values
  • Confidence interval for μY|X, the average Y at a given Xi:
    Ŷi ± t(n-2) · s · sqrt( 1/n + (Xi - X̄)² / Σ(Xi - X̄)² )
  • The width of the interval varies with the distance of Xi from the sample mean X̄.
  • s is the estimated standard error of the regression; the t value comes from the t table with df = n - 2.

  22. Interval Estimates of Forecast Values
  • Prediction interval for an individual Yi at a given Xi:
    Ŷi ± t(n-2) · s · sqrt( 1 + 1/n + (Xi - X̄)² / Σ(Xi - X̄)² )
  • The added 1 under the square root makes this interval wider than the confidence interval for the average Y, μY|X.
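
The sketch below computes both intervals from the store data: the confidence interval for the mean response and the wider prediction interval for an individual Y. The evaluation point X0 = 2000 is a hypothetical store size chosen only for illustration; scipy is assumed for the t quantile.

```python
# Minimal sketch of the mean-response CI and the individual prediction interval.
import math
from scipy import stats

x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar
s = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2))

x0 = 2000.0                                  # hypothetical store size
y0_hat = b0 + b1 * x0
t_crit = stats.t.ppf(0.975, n - 2)

se_mean = s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / s_xx)       # average Y
se_pred = s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / s_xx)   # individual Y

print(y0_hat - t_crit * se_mean, y0_hat + t_crit * se_mean)     # narrower CI
print(y0_hat - t_crit * se_pred, y0_hat + t_crit * se_pred)     # wider PI
```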

  23. Interval Estimates of Forecast Values for Different X Values
  [Figure: fitted line Ŷi = b0 + b1 Xi with the confidence band for the average Y and the wider prediction band for an individual Yi; both bands are narrowest near X̄ and widen as Xi moves away from X̄.]

  24. ANOVA: Analysis of Variance
  Can the variance of Y be explained by (the variable X in) the model?
    Yi = b0 + b1 Xi + ei
    Yi = (Ȳ - b1 X̄) + b1 Xi + ei              (since b0 = Ȳ - b1 X̄)
    (Yi - Ȳ) = b1 (Xi - X̄) + ei
    (Yi - Ȳ)² = { b1 (Xi - X̄) + ei }²
    Σ(Yi - Ȳ)² = Σ{ b1 (Xi - X̄) + ei }²
    Σ(Yi - Ȳ)² = b1² Σ(Xi - X̄)² + Σei²        (the cross-product term vanishes because Σ(Xi - X̄) ei = 0 under OLS)
    TSS = RSS + ESS
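
The decomposition can be verified numerically for the grocery-store fit; the sketch below is plain Python with the slide 6 data and should reproduce the Total SS of about 32,251,656 reported in the output.

```python
# Minimal numerical check that TSS = RSS + ESS for the grocery-store fit.
x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

y_hat = [b0 + b1 * xi for xi in x]
tss = sum((yi - y_bar) ** 2 for yi in y)                   # total SS
rss = sum((yh - y_bar) ** 2 for yh in y_hat)               # regression SS
ess = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))      # error SS

print(round(tss), round(rss + ess))   # both about 32,251,656
```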

  25. Measures of Variation: Sums of Squares
  • TSS = Σ(Yi - Ȳ)²   (total sum of squares)
  • RSS = Σ(Ŷi - Ȳ)²   (regression / explained sum of squares)
  • ESS = Σ(Yi - Ŷi)²  (error / residual sum of squares)
  [Figure: the fitted line Ŷi = b0 + b1 Xi with the three deviations shown at a point Xi.]

  26. ANOVA Table for the Simple Linear Regression Model

  Source       DF      SS                   MS
  Regression   1       RSS = Σ(Ŷi - Ȳ)²     MSR = RSS / 1
  Error        n - 2   ESS = Σ(Yi - Ŷi)²    MSE = ESS / (n - 2)
  Total        n - 1   TSS = Σ(Yi - Ȳ)²

  27. Inference on the Model: F-test
  Can the model explain the variance of Y?
  • Statistical hypotheses:
    H0: β1 = 0 (the model cannot explain the variance of Y)
    H1: β1 ≠ 0 (the model can explain the variance of Y)
  • Test statistic: F = MSR/MSE ~ F(p, n-1-p), where p is the number of independent variables; at α = 0.05 the critical value is F(1, 5) = 6.61.

  Analysis of Variance
  Source           DF   SS         MS         F       P
  Regression       1    30380456   30380456   81.18   0.000
  Residual Error   5    1871200    374240
  Total            6    32251656
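
The F statistic and its p-value follow directly from the sums of squares in the ANOVA table above; the sketch below assumes scipy for the F distribution.

```python
# Minimal sketch of the model F-test from the ANOVA sums of squares.
from scipy import stats

rss, ess = 30_380_456, 1_871_200   # regression SS and error SS
df_reg, df_err = 1, 5              # p = 1 predictor, n - 1 - p = 5

msr, mse = rss / df_reg, ess / df_err
f_stat = msr / mse                            # about 81.18
f_crit = stats.f.ppf(0.95, df_reg, df_err)    # about 6.61 at alpha = 0.05
p_value = stats.f.sf(f_stat, df_reg, df_err)

print(f_stat, f_crit, p_value)
```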

  28. Residual Analysis for Linearity
  [Figure: residuals (e) plotted against X — a curved pattern indicates a non-linear relationship; a random scatter around zero indicates linearity.]

  29. Residual Analysis for Homoscedasticity
  [Figure: standardized residuals (SR) plotted against X — a fanning spread indicates heteroscedasticity; a constant spread indicates homoscedasticity.]

  30. Residual Analysis for Independence of the Errors
  [Figure: standardized residuals (SR) plotted against X — a systematic pattern indicates dependence; a random scatter indicates independence.]
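
The plots on slides 28-30 can be drawn from the fitted model; the sketch below assumes matplotlib and uses a simple standardization (residuals divided by s) rather than the leverage-adjusted version.

```python
# Sketch of a standardized-residual plot for the grocery-store fit.
import math
import matplotlib.pyplot as plt

x = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
y = [3681, 3395, 6653, 9543, 3318, 5563, 3760]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
b0 = y_bar - b1 * x_bar

residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
std_residuals = [e / s for e in residuals]     # residuals scaled by s

plt.scatter(x, std_residuals)
plt.axhline(0, linestyle="--")
plt.xlabel("X (square footage)")
plt.ylabel("Standardized residual (SR)")
plt.show()
```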

  31. Residual Analysis: Computer Output
