Multivariate Methods

Presentation Transcript


  1. Multivariate Methods LIR 832

  2. Multivariate Methods: Topics of the Day • A. Isolating Interventions in a Multi-Causal World • B. Multivariate Probability Distributions • C. The Building Block: Covariance • D. The Next Step: Correlation

  3. A Multivariate World • Isolating Interventions in a Multi-Causal World • A. Example of problem: • Evaluate a program to reduce absences from a plant? • Is there age discrimination? • B. Types of data • Experimental • Quasi-experimental • Non-experimental • C. Need multivariate analysis to sort out causal relationships.

  4. Bi-Variate Relations: A First Run at Multivariate Methods • A. Many of the issues we are interested in are essentially about the relationship between two variables. • B. Bi-variate can be generalized to multivariate relationships • C. We learn bi-variate formally and make more intuitive reference to multivariate. • D. What do we mean by bi-variate relationship?

  5. Bi-Variate Example • Our firm has formed teams at all plants, each made up of an engineer, an accountant and a general manager, to work on several issues that are considered important in the firm. The firm has long been committed to gender diversity, and we are interested in the distribution of gender among our managerial classifications. We are particularly concerned about the distribution of gender on these teams, especially among engineers. Consider the distribution of two statistics about these three-person teams: • a. gender of the team members (X: x = number of men) • b. is the engineer a woman (Y: 0 = man, 1 = woman)

  6. Bi-Variate Example (cont.)

  7. Bi-Variate Example (cont.)

  8. Bi-Variate Example (cont.)

  9. Bi-Variate Example (cont.)

  10. Bi-Variate Example (cont.) • We can also use this information to build conditional probabilities: What is the likelihood that the engineer is a woman, given that there is exactly one man on the team?

  11. Bi-Variate Example (cont.) • What is the likelihood that the engineer is a woman, given that there is exactly one man on the team? • P(Y = 1 | X = 1) • = P(Y = 1 & X = 1 | X = 1) • = P(Y = 1 & X = 1) / P(X = 1) • = (2/8) / (3/8) = 2/3 • Note: P(Y = 1 | X = 2) is read as: • "the probability that Y is equal to 1 given that X = 2" or • "the probability that Y = 1 conditional on X = 2"

  12. Bi-Variate Example (cont.) • What is the likelihood that there is only one man, given the engineer is a woman? • P(X = 1 | Y = 1) • = P(Y = 1 & X = 1 | Y = 1) • = P(Y = 1 & X = 1) / P(Y = 1) • = (2/8) / (4/8) = 2/4 = 1/2

  13. Bi-Variate Example (cont.) • What is the likelihood that the engineer is a woman? • P(Y = 1) = 1/2 • But if we know that there are two men, we can improve our estimate: • P(Y = 1 | X = 2) • = P(Y = 1 & X = 2 | X = 2) • = P(Y = 1 & X = 2) / P(X = 2) • = (1/8) / (3/8) = 1/3 • What about calculating the likelihood of two men given the engineer is a woman?
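
The conditional probabilities on slides 11 to 13 can be checked by listing every possible team. The sketch below is illustrative only: it assumes each of the three roles is equally likely to be filled by a man or a woman, independently of the others, which is consistent with the 2/8, 3/8 and 4/8 figures on the slides but is not stated in the transcript. The function names are hypothetical. It also works out the closing question on slide 13.

```python
from fractions import Fraction
from itertools import product

# Assumption (not stated in the transcript): each of the three roles is filled
# by a man ("M") or a woman ("W") with probability 1/2, independently, so the
# 8 possible teams are equally likely. Position 0 is the engineer.
teams = list(product("MW", repeat=3))


def prob(event):
    """P(event) as the share of equally likely teams that satisfy it."""
    return Fraction(sum(1 for t in teams if event(t)), len(teams))


def cond_prob(event, given):
    """P(event | given) = P(event & given) / P(given)."""
    return prob(lambda t: event(t) and given(t)) / prob(given)


def engineer_is_woman(team):
    """Y = 1."""
    return team[0] == "W"


def men_equal(k):
    """Build the event X = k (exactly k men on the team)."""
    return lambda team: team.count("M") == k


p_y1 = prob(engineer_is_woman)                               # slide 13
p_y1_given_x1 = cond_prob(engineer_is_woman, men_equal(1))   # slide 11
p_x1_given_y1 = cond_prob(men_equal(1), engineer_is_woman)   # slide 12
p_y1_given_x2 = cond_prob(engineer_is_woman, men_equal(2))   # slide 13
p_x2_given_y1 = cond_prob(men_equal(2), engineer_is_woman)   # closing question

print(p_y1, p_y1_given_x1, p_x1_given_y1, p_y1_given_x2, p_x2_given_y1)
```

Using exact fractions rather than floats keeps the results directly comparable with the hand calculations on the slides.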

  14. Example: Gender Distribution

  15. Example: Gender Distribution • Working with Conditional Probability (all figures in percent): • P(Female) = 50.91% • P(LRHR) = 0.55% • P(Female | LRHR) = P(Female & LRHR) / P(LRHR) = 0.36% / 0.55% ≈ 65% • P(LRHR | Female) = P(LRHR & Female) / P(Female) = 0.36% / 50.91% ≈ 0.7%

  16. Independence Defined • Now that we know a bit about bi-variate relationships, we can define what it means, in a statistical sense, for two events to be independent. • If events X and Y are independent, then • Their conditional probability is equal to their unconditional probability: P(X | Y) = P(X). • The probability of both events occurring is the product of their individual probabilities: P(X, Y) = P(X) * P(Y).
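
As a rough illustration of this definition, the product rule can be tested on the three-person team example. The sketch again assumes eight equally likely teams (each role filled by a man or a woman with probability 1/2, independently), which the transcript does not state explicitly. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Assumption: 8 equally likely teams; position 0 is the engineer,
# position 1 the accountant, position 2 the general manager.
teams = list(product("MW", repeat=3))


def prob(event):
    """P(event) as the share of equally likely teams that satisfy it."""
    return Fraction(sum(1 for t in teams if event(t)), len(teams))


def independent(a, b):
    """True when P(A & B) equals P(A) * P(B)."""
    return prob(lambda t: a(t) and b(t)) == prob(a) * prob(b)


def engineer_is_woman(team):
    return team[0] == "W"


def accountant_is_woman(team):
    return team[1] == "W"


def exactly_one_man(team):
    return team.count("M") == 1


# X = 1 and Y = 1: joint probability 2/8, product (3/8) * (4/8) = 3/16.
x1_y1_independent = independent(exactly_one_man, engineer_is_woman)

# Two separate roles: joint probability 2/8, product (1/2) * (1/2) = 1/4.
roles_independent = independent(engineer_is_woman, accountant_is_woman)

print(x1_y1_independent, roles_independent)
```

The number of men on the team and the engineer's gender fail the product rule, which is why knowing X changed our estimate of P(Y = 1) earlier. The genders of two different roles pass it under the equal-likelihood assumption.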

  17. Importance of Independence • Why is independence important? • If events are independent, then we are getting unique information from each data point. If events are not independent, then • A practical example on running a survey on employee satisfaction within an establishment.

  18. Example: Employee Satisfaction

  19. Covariance • Covariance: Building Block of Multivariate Analysis • All very nice, but what we are looking for is a means of expressing and measuring the strength of association between two variables. • How closely do they move together? • Is variable A a good predictor of variable B? • We move to a slightly more complex world: no more two- and three-category variables.

  20. Example: Age and Income Data

  21. Example: Age and Income Data

  22. Example: Age and Income Data

  23. Example: Age and Income Data
      Descriptive Statistics: age, annual income

      Variable    N     Mean   Median   StDev   SE Mean
      age        23   24.565   23.000   4.251     0.886
      annual I   23    17174    10000   15712      3276

      Variable   Minimum   Maximum       Q1       Q3
      age         22.000    42.000   22.000   26.000
      annual I         0     65000     7000    25000

  24. Example: Age and Income Data

  25. Example: Age and Income Data • Adding some info to the graph…

  26. Covariance and Correlation Defined • Define Covariance and Correlation for a random sample of data: • Let our data be composed of pairs of data (Xi,Yi) where X has mean mx and Y has mean my. Then the covariance, the co-movement around their means, is defined as:

  27. Example: Covariance • We observe the relationship between the number of employees at work at a plant and the output for five days in a row:

      Attendance   Output
               8       40
               3       28
               2       20
               6       39
               4       28

      • What is the covariance of attendance and output?

  28. Example: Covariance (cont.) The covariance is positive. This suggests that when attendance is above its mean, output is also above its mean. Similarly, when attendance is below its mean, output is below its mean.
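
The worked calculation for this example is not preserved in the transcript, so the sketch below recomputes it from the five attendance and output pairs on slide 27. It assumes the usual sample covariance with an n - 1 denominator; the slide's own formula is missing, so that choice is an assumption, and the function name is hypothetical.

```python
# Five days of data from slide 27.
attendance = [8, 3, 2, 6, 4]
output = [40, 28, 20, 39, 28]


def sample_covariance(xs, ys):
    """Sample covariance: sum of (x - mean_x) * (y - mean_y), over n - 1.

    Assumption: the n - 1 denominator, as is conventional for sample data.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


cov_xy = sample_covariance(attendance, output)
print(round(cov_xy, 2))  # 19.25
```

Attendance averages 4.6 and output averages 31. The cross-products of the deviations sum to 77, giving 77 / 4 = 19.25. Dividing by n instead would give 15.4; either way the sign is positive, as the slide states.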

  29. Example: Overtime Hours and Productivity

  30. Example: Overtime Hours and Productivity

  31. Example: Overtime Hours and Productivity
      Covariances: prod-avg, week

                  prod-avg      week
      prod-avg    113.7292
      week        -49.5667   22.6667

  32. Example: Overtime Hours and Productivity (cont.)

  33. Example: Overtime Hours and Productivity (cont.)

  34. Example: Overtime Hours and Productivity (cont.)
      Covariances: prod-avg, week, week-hours

                    prod-avg       week   week-hours
      prod-avg      233.3345
      week          -51.8706    21.3986
      week-hours    -89.0777     0.0000      99.3069

  35. Example: Overtime Hours and Productivity (cont.)

  36. Example: Overtime Hours and Productivity (cont.)

  37. Example: Overtime Hours and Productivity (cont.)

  38. Correlation vs. Covariance • A limitation of covariance is that it is difficult to interpret: its units are the product of the units of the two variables, so its magnitude depends on the scales of measurement. • Thus, we need a measure that is more readily interpreted and tells us about the strength of association. • Correlation: • Population Correlation is defined as:
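
The formula on this slide is not preserved in the transcript. The standard definition divides the covariance by the product of the two standard deviations, which removes the units and bounds the result between -1 and 1. As a rough check, the sketch below applies that definition to the variances and covariances printed on slide 34. The function and variable names are hypothetical.

```python
import math

# Variances (diagonal) and covariances (off-diagonal) from the slide 34 output.
var_prod = 233.3345
var_week = 21.3986
var_hours = 99.3069
cov_prod_week = -51.8706
cov_prod_hours = -89.0777
cov_week_hours = 0.0


def correlation(cov_xy, var_x, var_y):
    """Correlation = cov(X, Y) / (sd(X) * sd(Y))."""
    return cov_xy / math.sqrt(var_x * var_y)


r_prod_week = correlation(cov_prod_week, var_prod, var_week)
r_prod_hours = correlation(cov_prod_hours, var_prod, var_hours)
r_week_hours = correlation(cov_week_hours, var_week, var_hours)

print(round(r_prod_week, 3), round(r_prod_hours, 3), round(r_week_hours, 3))
```

Rounded to three decimals, the first two values come out at -0.734 and -0.585, matching the correlations reported for the overtime-productivity example on slide 46. The zero covariance between week and week-hours gives a correlation of zero.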

  39. Correlation = 1.00

  40. Correlation = 0.94

  41. Correlation = 0.604

  42. Correlation = 0.198

  43. Correlation: Previous Examples

  44. Correlation: Previous Examples

  45. Correlation: Previous Examples

  46. Correlation: Previous Examples • Overtime-Productivity: Limit to 5 days, 10 hours:
      Correlations: prod-avg, week, week-hours

                    prod-avg     week
      week            -0.734
                       0.000

      week-hours      -0.585    0.000
                       0.000    1.000

  47. Correlations: Previous Examples

  48. Example: Correlation
