1 / 88

Session 5

Session 5. Outline. Creating the Full Model “From Scratch” Excel Array Functions Qualitative Variables Dummy Variables Interaction Effects. How does Excel calculate these coefficients?. Matrix Algebra!. 4 Basic Matrix Operations. Sum Product Transpose Matrix Multiplication

dana
Download Presentation

Session 5

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Session 5

  2. Outline • Creating the Full Model “From Scratch” • Excel Array Functions • Qualitative Variables • Dummy Variables • Interaction Effects Applied Regression -- Prof. Juran

  3. How does Excel calculate these coefficients? Matrix Algebra! Applied Regression -- Prof. Juran

  4. 4 Basic Matrix Operations • Sum Product • Transpose • Matrix Multiplication • Inverting Matrices Applied Regression -- Prof. Juran

  5. Applied Regression -- Prof. Juran

  6. Sum Product Applied Regression -- Prof. Juran

  7. Notes: r = i and c = j. In other words, the two matrices must have the same number of rows as each other and the same number of columns as each other. They do not need to be square matrices (where r = c and i = j). Applied Regression -- Prof. Juran

  8. A B C D 1 7 4 6 2 5 2 11 3 4 9 3 8 5 10 12 1 6 =SUMPRODUCT(A1:C2,A4:C5) 7 208 8 Applied Regression -- Prof. Juran

  9. Transpose For matrix A, Applied Regression -- Prof. Juran

  10. Notes: • If A is an r x c matrix, then must be a c x r matrix. • A does not need to be a square matrix. Applied Regression -- Prof. Juran

  11. There is an Excel function for this purpose, called TRANSPOSE. This function is one of a special class of functions called array functions. In contrast with most other Excel functions, array functions have two important differences: • They are entered into ranges of cells, not single cells • You enter them by pressing Shift+Ctrl+Enter, not just Enter Applied Regression -- Prof. Juran

  12. Using the spreadsheet above as an example, we start by selecting the entire range A4:C6. Then type into the formula bar =TRANSPOSE(A1:C2) Press Shift+Ctrl+Enter, and curly brackets will appear round the formula (you can’t type them in). Applied Regression -- Prof. Juran

  13. Matrix Multiplication For matrices A and B, Applied Regression -- Prof. Juran

  14. Applied Regression -- Prof. Juran

  15. Applied Regression -- Prof. Juran

  16. Inverting Matrices First, define a square matrix Ij as a matrix with j rows and j columns, completely filled with zeroes, except for ones on the diagonal: This special matrix is called the identity matrix. Applied Regression -- Prof. Juran

  17. Now, for a square matrix A with j rows and j columns, there may exist a matrix called A-inverse (symbolized ) such that: Not all square matrices can be inverted, a fact that has implications for regression analysis. Applied Regression -- Prof. Juran

  18. Applied Regression -- Prof. Juran

  19. Remember: • Select the entire range A5:B6 before typing the formula. • Press Shift+Ctrl+Enter. Applied Regression -- Prof. Juran

  20. Least-Squares Regression “By Hand” Combinations of: • Basic arithmetic • These four matrix operations • The T and F functions we already know Applied Regression -- Prof. Juran

  21. Definitions Observed dependent variable data: Observed independent variable data: Applied Regression -- Prof. Juran

  22. Unknown true coefficients: Unknown true errors: Applied Regression -- Prof. Juran

  23. Estimated coefficients: Residual errors from : Applied Regression -- Prof. Juran

  24. The linear model in matrix form: A.1 A.2 A.4 Applied Regression -- Prof. Juran

  25. Regression Formulas Two special matrices: “C” matrix: A.5 “Projection” or “Hat” matrix: A.3 Applied Regression -- Prof. Juran

  26. Example: Supervisor Data Applied Regression -- Prof. Juran

  27. Example: Supervisor Data(Set Intercept to Zero) Applied Regression -- Prof. Juran

  28. Applied Regression -- Prof. Juran

  29. Bottom Part of the Output Applied Regression -- Prof. Juran

  30. The “C” Matrix Applied Regression -- Prof. Juran

  31. The “P” Matrix Applied Regression -- Prof. Juran

  32. Applied Regression -- Prof. Juran

  33. Applied Regression -- Prof. Juran

  34. We’ll come back to the rest of the “bottom” part of the output later (slide 41). Applied Regression -- Prof. Juran

  35. Middle Part of the Output Applied Regression -- Prof. Juran

  36. Applied Regression -- Prof. Juran

  37. Applied Regression -- Prof. Juran

  38. Top Part of the Output Applied Regression -- Prof. Juran

  39. Applied Regression -- Prof. Juran

  40. Now that we have the standard error of the estimate , we can finish the “bottom” part of the output. Applied Regression -- Prof. Juran

  41. cjj refers to the jth row and jth column in the C matrix; each independent variable’s standard error is calculated with a different cjj value. We used VLOOKUP here as a fancy way of doing it. Applied Regression -- Prof. Juran

  42. Juran’s Lazy Portfolio • Invest in Vanguard mutual funds under university retirement plan • No shorting • Max 8 mutual funds • Rebalance once per year • Tools used: • Excel Solver • Basic Stats (mean, stdev, correl, beta, crude version of CAPM) Decision Models -- Prof. Juran 42

  43. Decision Models -- Prof. Juran 43

  44. Decision Models -- Prof. Juran 44

  45. Categorical Variables Frequently in regression analysis we wish to study the effects of independent variables that are measured as nominal data (also known as categorical data). For example, many demographic factors such as political party, gender, ethnicity, marital status, or credit rating are measured in this way. Although regression analysis is generally designed to work with continuous data, these categorical types of variables can be studied with regression. Applied Regression -- Prof. Juran

  46. The Binary Case: The most basic version of this situation is when a variable has two possible values, such as gender. We can set up a single “dummy” variable, such as: Why not use 1 for Male and 2 for Female? Applied Regression -- Prof. Juran

  47. Multi-Category Cases: When there are k categories, we can model them with k – 1 dummy variables. For example, if we were studying the effects of different days of the week on sales, we could set up six dummy variables: Why not have a seventh variable for Saturday? Applied Regression -- Prof. Juran

  48. Example: Interest Rates Data from Session 3: Applied Regression -- Prof. Juran

  49. ANOVA results from Session 3: Applied Regression -- Prof. Juran

  50. Regression Model for Interest Rates Applied Regression -- Prof. Juran

More Related