
Geometric Representation of Regression




  1. Geometric Representation of Regression

  2. ‘Multipurpose’ Dataset from class website
     • Attitude towards job
       • Higher scores indicate a more unfavorable attitude toward the company
     • Number of years worked
     • Days absent
     • 12 cases

     EMP  DAYSABS  ATTRATE  YEARS
     a        1        1        1
     b        0        2        1
     c        1        2        2
     d        4        3        2
     e        3        5        4
     f        2        5        6
     g        5        6        5
     h        6        7        4
     i        9       10        8
     j       13       11        7
     k       15       11        9
     l       16       12       10
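A minimal sketch of how this dataset could be entered in R (the data frame name dat is an assumption; the values are those listed on the slide):

     dat <- data.frame(
       EMP     = letters[1:12],                              # employees a through l
       DAYSABS = c(1, 0, 1, 4, 3, 2, 5, 6, 9, 13, 15, 16),   # days absent
       ATTRATE = c(1, 2, 2, 3, 5, 5, 6, 7, 10, 11, 11, 12),  # attitude rating
       YEARS   = c(1, 1, 2, 2, 4, 6, 5, 4, 8, 7, 9, 10)      # years worked
     )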

  3. Typical representation with response surface
     • Correlations .89 and up*
     • R² for the model = .903

                DAYSABS   ATTRATE     YEARS
     DAYSABS  1.0000000 0.9497803 0.8902164
     ATTRATE  0.9497803 1.0000000 0.9505853
     YEARS    0.8902164 0.9505853 1.0000000

     Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
     (Intercept)  -2.2630     1.0959  -2.065   0.0689 .
     ATTRATE       1.5497     0.4805   3.225   0.0104 *
     YEARS        -0.2385     0.6064  -0.393   0.7032
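Assuming the data frame dat sketched above, the output on this slide can be reproduced with base R:

     # correlation matrix of the three variables
     cor(dat[, c("DAYSABS", "ATTRATE", "YEARS")])

     # regress days absent on attitude rating and years worked
     fit <- lm(DAYSABS ~ ATTRATE + YEARS, data = dat)
     summary(fit)   # coefficient table shown above; Multiple R-squared ≈ .903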

  4. Typical representation with response surface
     • The point where the response surface crosses the y axis (DAYSABS) is the intercept in our formula
     • Holding a variable ‘constant’ is like adding a plane perpendicular to that variable’s axis
     • The process as a whole minimizes the sum of squared distances, measured vertically along the DAYSABS axis, between the observed data points and the response surface
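A short illustration of these points, assuming the model fit from the previous sketch:

     # the fitted surface evaluated at ATTRATE = 0 and YEARS = 0 is the intercept
     predict(fit, newdata = data.frame(ATTRATE = 0, YEARS = 0))

     # least squares minimizes the sum of squared vertical (DAYSABS-direction)
     # distances between the observed points and the surface
     sum(residuals(fit)^2)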

  5. Alternative
     • Given a variable, we can instead view it as a vector extending from the origin into some n-dimensional space
     • Here the space has one dimension for each individual (for these data, 12 dimensions), and the vector representing a variable’s values occupies only a single direction within that space
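A small numerical illustration of this view, assuming the data frame dat from earlier (the object name x1 is just for illustration):

     # ATTRATE as a single vector with one component per employee:
     # one direction in a 12-dimensional 'person space'
     x1 <- dat$ATTRATE
     length(x1)        # 12 components, one per case
     sqrt(sum(x1^2))   # its length (norm) in that space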

  6. Assume now two standardized variables with the same N
     • We now have 2 vectors (each with N components) emanating from the origin*
     • The cosine of the angle between them is the simple correlation of the two variables
     • If they were perfectly correlated they would occupy the same dimension (i.e. lie right on top of one another)
     [Figure: vectors X1 and X2 drawn from the origin]
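A sketch verifying this with the example data, assuming dat from earlier (the helper cosine() is a hypothetical name, not a base R function):

     # cosine of the angle between two vectors
     cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))

     x1s <- scale(dat$ATTRATE)[, 1]   # standardized ATTRATE
     x2s <- scale(dat$YEARS)[, 1]     # standardized YEARS

     cosine(x1s, x2s)                 # 0.9505853
     cor(dat$ATTRATE, dat$YEARS)      # the same value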

  7. Adding a third variable, we can again understand their simple correlations as the cosines of the respective angles they create
     • Given the plane created by X1 and X2, might we find a way to project Y onto it?
     [Figure: vector Y alongside the plane spanned by X1 and X2]
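A sketch of that projection with the example data, assuming dat and fit from the earlier sketches: projecting centered Y onto the plane spanned by the centered predictors reproduces the fitted values.

     yc <- scale(dat$DAYSABS, scale = FALSE)[, 1]               # centered Y
     X  <- scale(cbind(dat$ATTRATE, dat$YEARS), scale = FALSE)  # centered X1, X2
     yhat <- X %*% solve(t(X) %*% X) %*% t(X) %*% yc            # projection onto the plane

     all.equal(as.vector(yhat), unname(fitted(fit)) - mean(dat$DAYSABS))   # TRUE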

  8. That is in fact what multiple regression does, and this projection is the linear combination* that gives our predicted values
     • The cosine of the angle between Y and Y-hat is the multiple R, which when squared gives the proportion of variance in Y accounted for by the model containing X1 and X2
     • Regression minimizes that angle (equivalently, maximizes its cosine)
     • Partial correlations may be represented too, by creating a plane perpendicular** to one variable and projecting the others onto that plane
     • The cosine of the angle those projections create is the partial correlation
     [Figure: Y, its projection Y-hat, and the plane spanned by X1 and X2]
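Continuing the sketch (this assumes cosine(), yc, and yhat from the previous blocks, plus dat):

     # multiple R as the cosine of the angle between centered Y and Y-hat
     R <- cosine(yc, as.vector(yhat))
     R^2                    # ≈ .903, the model R-squared reported earlier

     # partial correlation of DAYSABS and ATTRATE holding YEARS constant:
     # project both onto the subspace perpendicular to YEARS (take residuals),
     # then take the cosine of the angle between the residual vectors
     ry <- residuals(lm(DAYSABS ~ YEARS, data = dat))
     rx <- residuals(lm(ATTRATE ~ YEARS, data = dat))
     cosine(ry, rx)         # equals cor(ry, rx), the partial correlation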

  9. One dichotomous predictor

  10. 2 dichotomous predictors (2x2 ANOVA)

  11. Dichotomous outcome
