Chapter 4: Missing data mechanisms

atempleton

Presentation Transcript


  1. Chapter 4: Missing data mechanisms • Handbook: chapter 2 • Missing data patterns • Missing data mechanisms

  2. Missing data mechanisms • Missing data patterns • Describe which values are observed and which values are missing • Different patterns require different methods to deal with the missing data • Missing data mechanisms • Describe the relationship between the missingness and the variables in the dataset

  3. Missing data patterns • Univariate missing data • Y represents a group of variables that is either completely observed or completely missing for each sample element • Example: unit nonresponse • [Figure: data matrix with rows for units 1, …, N and columns X1, X2, …, Xp, Y]

  4. Missing data patterns • Monotone missing data • Data are ordered in such a way that if Yj is missing for a unit, then Yj+1, …, Yp are missing as well • Example: panel dropout (attrition) • [Figure: data matrix with rows for units 1, …, N and columns Y1, Y2, Y3, …, Yp]

  5. Missing data patterns • Arbitrary missing data • No structure or ordering in the missingness • Example: item nonresponse • [Figure: data matrix with rows for units 1, …, N and columns Y1, Y2, Y3, …, Yp; question marks mark scattered missing values]
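
The three patterns on slides 3 to 5 can be told apart mechanically from the positions of the missing values. The following is a minimal sketch, not part of the original slides: the function name `classify_pattern` is hypothetical, missing values are assumed to be coded as `None`, and the columns are assumed to be already in the order Y1, …, Yp that the monotone definition requires.

```python
from typing import List, Optional, Sequence

Row = Sequence[Optional[float]]  # None marks a missing value (assumed coding)


def classify_pattern(rows: List[Row]) -> str:
    """Return 'complete', 'univariate', 'monotone' or 'arbitrary'.

    Hypothetical helper: columns are taken in the given order Y1, ..., Yp.
    """
    # True where a value is missing, one tuple per unit.
    masks = [tuple(value is None for value in row) for row in rows]
    if not any(any(mask) for mask in masks):
        return "complete"

    p = len(masks[0])
    # Columns that are missing for at least one unit.
    ever_missing = [j for j in range(p) if any(mask[j] for mask in masks)]

    # Univariate: for every unit, that group of columns is either
    # completely observed or completely missing.
    if all(len({mask[j] for j in ever_missing}) == 1 for mask in masks):
        return "univariate"

    # Monotone: if Yj is missing for a unit, then Yj+1, ..., Yp are too.
    if all(all(mask[j + 1] for j in range(p - 1) if mask[j]) for mask in masks):
        return "monotone"

    return "arbitrary"
```

For example, `classify_pattern([[1, None, None], [1, 2, None], [1, 2, 3]])` describes panel dropout, while scattered `None` values with no ordering fall through to the arbitrary case. The univariate check runs first because a univariate pattern on the last columns also satisfies the monotone condition.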

  6. Missing data mechanisms • Any analysis of data involving item or unit nonresponse requires some assumption about the missing data mechanism • Partition Y into an observed part and an unobserved (missing) part • The missingness, recorded by an indicator R of which values are observed, is characterized by the conditional distribution of R given Y

  7. Missing Completely At Random (MCAR) • The conditional distribution of R given Y does not depend on the data at all: P(Y = missing) is unrelated to the values of Y, whether observed or missing, and to other variables X • Let X be a set of auxiliary variables, completely observed; Y a target variable, partly missing; and Z the causes of missingness unrelated to X and Y • MCAR: [formula not preserved in the transcript] • Analysis with observed units only (complete case analysis) is still valid • [Diagram: relationships among X, Z, Y and R]

  8. Missing At Random (MAR) • The conditional distribution of missingness depends on the observed data, but not on the missing values: P(Y = missing) is unrelated to the missing values after controlling for other variables X • MAR: [formula not preserved in the transcript] • MAR = MCAR within classes of X • Example: Y = income, X = property tax. Persons with high income may be less willing to reveal their income, but within classes of property tax, nonresponse on the income question is random. Income is then MAR: given property tax, the missingness does not depend on income • [Diagram: relationships among X, Z, Y and R]
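
The statement "MAR = MCAR within classes of X" can be illustrated with the income example. The sketch below is not from the original slides and uses made-up numbers throughout: two property-tax classes, hypothetical income distributions, and response probabilities that depend on the tax class only. Under these assumptions the plain respondent mean is pulled toward the class that responds more often, while averaging the respondent means within each class, weighted by class size, comes close to the full-population mean.

```python
import random
from statistics import mean


def simulate_income(n: int = 50_000, seed: int = 1):
    """Hypothetical population of (high_tax, income, responded) tuples."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        high_tax = rng.random() < 0.3                         # X, fully observed
        income = rng.gauss(80.0 if high_tax else 40.0, 10.0)  # Y, arbitrary units
        # MAR: the response probability depends on X only,
        # not on income within a property-tax class.
        responded = rng.random() < (0.4 if high_tax else 0.8)
        people.append((high_tax, income, responded))
    return people


def complete_case_mean(people):
    """Mean income over respondents only."""
    return mean(y for _, y, r in people if r)


def class_weighted_mean(people):
    """Respondent mean per X class, weighted by the full class size."""
    total = 0.0
    for cls in (False, True):
        members = [p for p in people if p[0] == cls]
        respondents = [y for _, y, r in members if r]
        total += len(members) * mean(respondents)
    return total / len(people)


if __name__ == "__main__":
    people = simulate_income()
    print("full-population mean:", mean(y for _, y, _ in people))
    print("complete-case mean:  ", complete_case_mean(people))
    print("class-weighted mean: ", class_weighted_mean(people))
```

The class weighting works here only because X is observed for everyone and the missingness is random within each class, which is exactly what the MAR assumption supplies.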

  9. Not Missing At Random (NMAR) • The distribution of the missingness cannot be simplified any further and depends on both the observed and the missing data • NMAR: [formula not preserved in the transcript] • [Diagram: relationships among X, Z, Y and R]
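
The formulas that followed "MCAR:", "MAR:" and "NMAR:" on slides 7 to 9 are not present in this transcript. One common textbook way to write the three mechanisms is sketched below. It uses the slides' symbols X, Y, Z and R and assumes the partition of Y from slide 6 is written as Y = (Y_obs, Y_mis); the original slides may have used different notation.

```latex
\begin{align*}
\text{MCAR:} \quad & P(R \mid X, Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, Z) = P(R \mid Z) \\
\text{MAR:}  \quad & P(R \mid X, Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, Z) = P(R \mid X, Y_{\mathrm{obs}}, Z) \\
\text{NMAR:} \quad & P(R \mid X, Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, Z) \text{ depends on } Y_{\mathrm{mis}} \text{ and cannot be simplified further}
\end{align*}
```

Read top to bottom, each line drops one restriction: MCAR removes all data from the conditioning set, MAR keeps only the observed data, and NMAR keeps the missing values as well.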

  10. Missing data mechanisms: an example • X = Age, Y = Work status • If the probability of providing the work status is the same for all persons in the survey, regardless of their age or work status, the data are Missing Completely At Random (MCAR). • If the probability of providing the work status varies according to the age of the respondent, but does not vary according to the work status of respondents within an age group, then the data are Missing At Random (MAR). • If the probability of providing the work status varies according to the work status within each age group, the data are Not Missing At Random (NMAR).
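
The age and work status example can be made concrete with a small simulation. The sketch below is an illustration only: the two age groups, the employment rates and every response probability are invented, and the names `response_prob` and `simulate` are hypothetical. It generates a response indicator under each mechanism and reports the response rate in each (age group, work status) cell.

```python
import random
from collections import defaultdict


def response_prob(mechanism: str, old: bool, employed: bool) -> float:
    """Hypothetical probability that a person reports their work status."""
    if mechanism == "MCAR":
        return 0.7                               # same for everyone
    if mechanism == "MAR":
        return 0.5 if old else 0.9               # depends on age only
    if mechanism == "NMAR":
        base = 0.5 if old else 0.9
        return base if employed else base - 0.3  # also depends on work status
    raise ValueError(f"unknown mechanism: {mechanism}")


def simulate(mechanism: str, n: int = 100_000, seed: int = 0):
    """Return the response rate per (old, employed) cell."""
    rng = random.Random(seed)
    counts = defaultdict(lambda: [0, 0])  # cell -> [responded, total]
    for _ in range(n):
        old = rng.random() < 0.4                          # X = age group
        employed = rng.random() < (0.3 if old else 0.8)   # Y = work status
        responded = rng.random() < response_prob(mechanism, old, employed)
        cell = counts[(old, employed)]
        cell[0] += responded
        cell[1] += 1
    return {cell: r / t for cell, (r, t) in counts.items()}


if __name__ == "__main__":
    for mechanism in ("MCAR", "MAR", "NMAR"):
        print(mechanism, simulate(mechanism))
```

In the simulated rates, MCAR gives roughly the same value in all four cells, MAR gives rates that differ between age groups but not between work statuses within an age group, and NMAR gives rates that differ by work status even within an age group. In real data the last comparison is not available, because the work status of nonrespondents is unknown.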
