Session 5

Session 5

Outline • Creating the Full Model “From Scratch” • Excel Array Functions • Qualitative Variables • Dummy Variables • Interaction Effects Applied Regression -- Prof. Juran

How does Excel calculate these coefficients? Matrix Algebra! Applied Regression -- Prof. Juran

4 Basic Matrix Operations • Sum Product • Transpose • Matrix Multiplication • Inverting Matrices Applied Regression -- Prof. Juran

Applied Regression -- Prof. Juran

Sum Product Applied Regression -- Prof. Juran

Notes: r = i and c = j. In other words, the two matrices must have the same number of rows as each other and the same number of columns as each other. They do not need to be square matrices (where r = c and i = j). Applied Regression -- Prof. Juran

A B C D 1 7 4 6 2 5 2 11 3 4 9 3 8 5 10 12 1 6 =SUMPRODUCT(A1:C2,A4:C5) 7 208 8 Applied Regression -- Prof. Juran

Transpose For matrix A, Applied Regression -- Prof. Juran

Notes: • If A is an r x c matrix, then must be a c x r matrix. • A does not need to be a square matrix. Applied Regression -- Prof. Juran

There is an Excel function for this purpose, called TRANSPOSE. This function is one of a special class of functions called array functions. In contrast with most other Excel functions, array functions have two important differences: • They are entered into ranges of cells, not single cells • You enter them by pressing Shift+Ctrl+Enter, not just Enter Applied Regression -- Prof. Juran

Using the spreadsheet above as an example, we start by selecting the entire range A4:C6. Then type into the formula bar =TRANSPOSE(A1:C2) Press Shift+Ctrl+Enter, and curly brackets will appear round the formula (you can’t type them in). Applied Regression -- Prof. Juran

Matrix Multiplication For matrices A and B, Applied Regression -- Prof. Juran

Inverting Matrices First, define a square matrix Ij as a matrix with j rows and j columns, completely filled with zeroes, except for ones on the diagonal: This special matrix is called the identity matrix. Applied Regression -- Prof. Juran

Now, for a square matrix A with j rows and j columns, there may exist a matrix called A-inverse (symbolized ) such that: Not all square matrices can be inverted, a fact that has implications for regression analysis. Applied Regression -- Prof. Juran

Remember: • Select the entire range A5:B6 before typing the formula. • Press Shift+Ctrl+Enter. Applied Regression -- Prof. Juran

Least-Squares Regression “By Hand” Combinations of: • Basic arithmetic • These four matrix operations • The T and F functions we already know Applied Regression -- Prof. Juran

Definitions Observed dependent variable data: Observed independent variable data: Applied Regression -- Prof. Juran

Unknown true coefficients: Unknown true errors: Applied Regression -- Prof. Juran

Estimated coefficients: Residual errors from : Applied Regression -- Prof. Juran

The linear model in matrix form: A.1 A.2 A.4 Applied Regression -- Prof. Juran

Regression Formulas Two special matrices: “C” matrix: A.5 “Projection” or “Hat” matrix: A.3 Applied Regression -- Prof. Juran

Example: Supervisor Data Applied Regression -- Prof. Juran

Example: Supervisor Data(Set Intercept to Zero) Applied Regression -- Prof. Juran

Bottom Part of the Output Applied Regression -- Prof. Juran

The “C” Matrix Applied Regression -- Prof. Juran

The “P” Matrix Applied Regression -- Prof. Juran

We’ll come back to the rest of the “bottom” part of the output later (slide 41). Applied Regression -- Prof. Juran

Middle Part of the Output Applied Regression -- Prof. Juran

Top Part of the Output Applied Regression -- Prof. Juran

Now that we have the standard error of the estimate , we can finish the “bottom” part of the output. Applied Regression -- Prof. Juran

cjj refers to the jth row and jth column in the C matrix; each independent variable’s standard error is calculated with a different cjj value. We used VLOOKUP here as a fancy way of doing it. Applied Regression -- Prof. Juran

Juran’s Lazy Portfolio • Invest in Vanguard mutual funds under university retirement plan • No shorting • Max 8 mutual funds • Rebalance once per year • Tools used: • Excel Solver • Basic Stats (mean, stdev, correl, beta, crude version of CAPM) Decision Models -- Prof. Juran 42

Decision Models -- Prof. Juran 43

Decision Models -- Prof. Juran 44

Categorical Variables Frequently in regression analysis we wish to study the effects of independent variables that are measured as nominal data (also known as categorical data). For example, many demographic factors such as political party, gender, ethnicity, marital status, or credit rating are measured in this way. Although regression analysis is generally designed to work with continuous data, these categorical types of variables can be studied with regression. Applied Regression -- Prof. Juran

The Binary Case: The most basic version of this situation is when a variable has two possible values, such as gender. We can set up a single “dummy” variable, such as: Why not use 1 for Male and 2 for Female? Applied Regression -- Prof. Juran

Multi-Category Cases: When there are k categories, we can model them with k – 1 dummy variables. For example, if we were studying the effects of different days of the week on sales, we could set up six dummy variables: Why not have a seventh variable for Saturday? Applied Regression -- Prof. Juran

Example: Interest Rates Data from Session 3: Applied Regression -- Prof. Juran

ANOVA results from Session 3: Applied Regression -- Prof. Juran

Regression Model for Interest Rates Applied Regression -- Prof. Juran

Session 5

Session 5

Presentation Transcript

Session 5

Session 5

Session 5

Session 5

Session 5

SESSION #5

Session 5

Session 5

Session 5

Session 5

Session 5

Session 5

Session 5

Session 5

Session 5

Session 5

Session 5

Session 5

Session 5