
Goodness of fit in structural equation models



Presentation Transcript


  1. Goodness of fit in structural equation models Jose M. Cortina Tiffany M. Bludau George Mason University

  2. SEM and Fit • SEM is an analysis technique • We need to know whether data and model are consistent with one another • The assessment of “Model Fit” is the assessment of this consistency

  3. Outline • General discussion of fit • Observed vs. reproduced matrices • Identification • The role of chi-squared • Alternative fit indices • Which to use • Pitfalls

  4. The Two Faces of Fit • The term “Model Fit” is often used to denote overall fit • But assessment of fit comes more directly from consideration of the individual path coefficients and endogenous errors • If the coeffs linking constructs to one another are small, then the data and model are inconsistent!

  5. Overall Fit • As a whole, are the linkages in the model consistent with the relationships among the observed variables? • In one way or another, this is the question addressed by model fit indices • Specifically, fit indices compare observed and reproduced correlation matrices

  6. Reproduced matrices • Same form as the observed matrix • Contains r’s that are implied by the model • Consider a simple mediation model: X → M → Y, with both path coefficients equal to .20

  7. Observed vs. Reproduced

  Observed:                      Reproduced:
       X    M    Y                   X    M    Y
  X   1                         X   1
  M  .20   1                    M  .20   1
  Y  .10  .20   1               Y  .04  .20   1

  Lack of fit in this model stems from the discrepancy between the observed r(X,Y) = .10 and the reproduced r(X,Y) = .04
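The reproduced values follow from tracing the paths of the model. A minimal sketch of that arithmetic for the three-variable mediation example above:

```python
# Sketch: implied (reproduced) correlations for the mediation model X -> M -> Y.
# Path values are the ones shown on the slides.
a = 0.20  # path X -> M
b = 0.20  # path M -> Y

r_XM = a          # implied r(X, M): the direct path
r_MY = b          # implied r(M, Y): the direct path
r_XY = a * b      # implied r(X, Y): product of the two paths = .04

# The observed r(X, Y) is .10, so the residual is .10 - .04 = .06
print(r_XM, r_MY, r_XY)
```

That single residual (.06) is the entire source of lack of fit in this model.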

  8. Overidentification • The discrepancy is only possible because the model is “overidentified” • There are more knowns (i.e., observed r’s) than unknowns (i.e., coeffs to be estimated) • In this example, there are three knowns and two unknowns • What if we add a third path?

  9. Just identified model Adding the direct X → Y path yields estimates of .20 (X → M), .19 (M → Y), and .062 (X → Y). Here there are the same number of knowns and unknowns. Observed and reproduced matrices will be identical “Fit” is perfect

  10. Underidentified model In this model, there is one known and there are two unknowns. There are an infinite number of solutions that would reproduce the observed r perfectly

  11. To summarize • In order for a unique solution to exist, a model must be at least just identified • In order for “fit” to be relevant, a model must be overidentified • The closer a model is to being just identified, the better (and less relevant) fit will be

  12. Fit Basics: Chi-squared • Begins with the model R-squared • R²_M = 1 − (1 − R²_1)(1 − R²_2)...(1 − R²_E) • This value is computed for both the hypothesized (overidentified) model AND the just identified model • We then use these R-squareds to compute Q = (1 − R²_saturated)/(1 − R²_hypothesized) • χ² = −(N − d) ln Q

  13. Worksheet For the hypothesized (overidentified) model: R²(M) = .04, R²(Y) = .04, R²_Model = .078 For the saturated (just identified) model: R²(M) = .04, R²(Y) = .044, R²_Model = .082 Q = (1 − .082)/(1 − .078) = .9956 Assuming N = 101, χ² = −(101 − 1) ln(.9956) = .44 with 1 df
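The worksheet arithmetic can be sketched directly; the inputs below are the R² values from the slide, and small discrepancies with the slide’s .44 come only from where one rounds:

```python
import math

def model_r2(endogenous_r2s):
    # R²_M = 1 - (1 - R²_1)(1 - R²_2)...(1 - R²_E), over the endogenous variables
    prod = 1.0
    for r2 in endogenous_r2s:
        prod *= 1.0 - r2
    return 1.0 - prod

r2_hypo = model_r2([0.04, 0.04])    # hypothesized model: ~ .078
r2_sat = model_r2([0.04, 0.044])    # saturated model: ~ .082

Q = (1 - r2_sat) / (1 - r2_hypo)    # ~ .996
N, d = 101, 1
chi2 = -(N - d) * math.log(Q)       # ~ .42 unrounded; the slide's .44 rounds Q first
print(round(r2_hypo, 3), round(r2_sat, 3), round(chi2, 2))
```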

  14. Why isn’t χ² used? • It is a test statistic for badness of fit, and as such it rewards poor designs (i.e., small N) • It is sample size dependent, which means that it doesn’t give effect size information • It does, however, serve as a basis for other indices

  15. What are the alternatives? • There are dozens, among them • NFI = (χ²_n − χ²_t)/χ²_n, or (F_n − F_t)/F_n • PNFI = (df_t/df_n)NFI_t • GFI = 1 − .5 tr(S − Σ)² • AGFI = 1 − (1 − GFI_t) · [(p+q)(p+q+1)]/(2df_t) • PGFI = (df_t/df_n)GFI_t • RMR = √[2 Σ_{i≤j}(s_ij − σ_ij)² / ((p+q)(p+q+1))] • IFI = (F_n − F_t)/[F_n − df_t/(N − 1)]

  16. A few things about that slide • Note that chi-squared appears often • Note the simplicity of RMR • Note the F’s • Note that df are used to adjust AGFI, PGFI, and PNFI

  17. RMR • RMR stands for Root Mean Squared Residual • It is the square root of the average of the squared differences between the values in the observed and reproduced correlation matrices • The smaller, the better • RMSEA is computed differently, but is very similar and more commonly used • RMSEA ranges from 0 to 1, with .08 being the conventional cutoff
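As a sketch, the RMR for the mediation example can be computed from the two matrices shown earlier. One convention (assumed here) averages the squared residuals over the unique elements including the diagonal; software packages differ on this detail:

```python
import numpy as np

# Observed (S) and reproduced (Sigma) correlation matrices from the mediation example.
S = np.array([[1.00, 0.20, 0.10],
              [0.20, 1.00, 0.20],
              [0.10, 0.20, 1.00]])
Sigma = np.array([[1.00, 0.20, 0.04],
                  [0.20, 1.00, 0.20],
                  [0.04, 0.20, 1.00]])

# Average squared residuals over the unique (lower-triangle) elements, then take
# the square root. Only the r(X,Y) cell differs (.10 vs .04), so the residual
# vector is mostly zeros.
idx = np.tril_indices_from(S)          # lower triangle, diagonal included
resid = (S - Sigma)[idx]
rmr = np.sqrt(np.mean(resid ** 2))
print(round(rmr, 4))
```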

  18. Other indices • For other common indices, the larger, the better • Note the F’s, which stand for Fit Function • For the least squares family of estimators (e.g., ULS, GLS, WLS), the fit function is F(Θ) = (s − σ)′W⁻¹(s − σ), where s and σ stack the unique elements of S and Σ
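A minimal sketch of that fit function for the mediation example, with W set to the identity matrix (which reduces it to ULS, i.e., the plain sum of squared residuals):

```python
import numpy as np

# Vectorized unique off-diagonal elements of S and the reproduced matrix Sigma
# for the mediation example: r(X,M), r(X,Y), r(M,Y).
s = np.array([0.20, 0.10, 0.20])
sigma = np.array([0.20, 0.04, 0.20])

W = np.eye(len(s))               # identity weight matrix -> ULS
d = s - sigma
F = d @ np.linalg.inv(W) @ d     # (s - sigma)' W^-1 (s - sigma)
print(round(F, 4))               # with W = I, just the sum of squared residuals
```

Other estimators in the family differ only in the weight matrix W they plug into the same quadratic form.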

  19. Adjusted fit indices • One problem with most indices is that they reward lack of parsimony • This is true of GFI, for example • The AGFI includes a penalty for lack of parsimony • The PGFI includes a large penalty for lack of parsimony

  20. Other ways to distinguish • Degree of penalty for lack of parsimony is one dimension on which indices differ • There are others • As a set, these dimensions can be used to choose a set of indices that are maximally diagnostic

  21. Tanaka’s (1993) dimensions • Population-based vs. sample-based • Parsimony • Normed vs. non-normed • Absolute vs. relative • Reliance on estimation method • Sample size dependence

  22. Theoretical work • A number of theoretical papers, e.g. • Mulaik, James, Van Alstine, & Bennett, 1989 • Medsker, Williams, & Holahan, 1994 • Hu & Bentler, 1999 • Lack of empirical work • What has been done often uses simulated datasets (e.g. Marsh, Balla, & McDonald, 1988)

  23. Little guidance • The literature offers little in the way of guidance with regard to which indices should be reported • Reviewers and editors do no better • So, authors tend to report the indices that are most flattering to their models • We sought to combine Tanaka’s work with empirical work to generate the best set of indices

  24. Specifically • We conducted a meta-analysis of correlations among fit indices • We compiled studies that reported at least two indices, then computed the correlation between each pair of indices • Those indices that are least redundant with other indices offer most unique info

  25. Studies used • Multiple disciplines • Keywords: Structural equation modeling, SEM, covariance structures model, and causal model • Currently have 400+ articles collected • Eliminated articles that: • Were theoretical in nature • Did not report results

  26. Coding of the studies • Two co-authors coded all articles • Coded for: • Discipline, software used, estimation method • Sample size, degrees of freedom • Various fit indices • Coded only the final model

  27. Correlations among indices (Correlation table omitted from transcript; entries flagged at p < .001 and p < .05.) RMSEA = Root Mean Square Error of Approximation, NFI = Normed Fit Index, TLI = Tucker-Lewis Index, CFI = Comparative Fit Index, SRMR = Standardized Root Mean Square Residual, GFI = Goodness of Fit Index, AGFI = Adjusted Goodness of Fit Index.

  28. Results from factor analysis • Ran the analysis on 5 indices • Dropped the NFI and TLI • Two-factor structure accounting for 83% of the variance • Interfactor r = −.46

  29. Regressions Regressed each index onto the remaining indices • GFI: R² = .91 • AGFI: R² = .94 • SRMR: R² = .65 • RMSEA: R² = .61 • CFI: R² = .46

  30. Recommendations • Select one index from each factor • CFI rather than the GFI or AGFI (Bentler & Bonett, 1980; Marsh, Balla, & McDonald, 1988) • RMSEA or SRMR • CFI and RMSEA • Tanaka’s dimensions • Formulas • Other information

  31. Recommendations cont’d • Our study could only focus on indices that are commonly reported, and parsimony indices are not among them • We would suggest that PGFI or PNFI also be reported

  32. Tanaka’s dimensions RMSEA: • Population based • Accounts for parsimony • Normed • Absolute • Not estimation method specific • Sample size dependent CFI: • Population based • Does not account for parsimony • Normed • Relative • Not estimation method specific • Sample size dependent

  33. Formulas of the RMSEA and CFI (Formulas omitted from transcript.) • Note the comparative nature of CFI • Note how RMSEA does not account for the null model

  34. Other information • CFI • Works well with ML estimation • Works well with small sample sizes • Tends to worsen as the # of variables in a model increases • RMSEA • Does not include a comparison with a null model • Tends to improve as the # of variables increases • Known distribution • Confidence intervals available • Stable across estimation methods and sample sizes
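Since the formula slide did not survive the transcript, a sketch using the standard definitions of both indices (the numeric inputs below are hypothetical illustrations, not values from the study):

```python
import math

def rmsea(chi2_t, df_t, N):
    # Standard RMSEA: sqrt of noncentrality per df per observation; note that no
    # null model enters the formula.
    return math.sqrt(max(chi2_t - df_t, 0.0) / (df_t * (N - 1)))

def cfi(chi2_t, df_t, chi2_n, df_n):
    # Standard CFI: compares the target model's noncentrality (chi2_t - df_t)
    # against the null model's (chi2_n - df_n) -- hence its comparative nature.
    num = max(chi2_t - df_t, 0.0)
    den = max(chi2_n - df_n, chi2_t - df_t, 0.0)
    return 1.0 - num / den

# Hypothetical example: target chi2 = 4.4 on 1 df, null chi2 = 120 on 3 df, N = 101
print(round(rmsea(4.4, 1, 101), 3))
print(round(cfi(4.4, 1, 120.0, 3), 3))
```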

  35. Still in progress • Not able to account for all indices in each study coded • Plan to replicate the correlation matrices and generate missing values for coded studies • Reporting tendencies of researchers • Our plan is to show how patterns ACROSS this set of indices are diagnostic of particular pluses and minuses in a model

  36. Pitfalls • Reward for lack of parsimony • Overemphasis on overall fit • Overemphasis on absolute fit • Fit driven by measurement model • Specification searches

  37. Lack of parsimony • Models with few df generate very good values for almost all fit indices regardless of the quality of the model • In such cases, it is better to focus on the individual path coefficients

  38. Overall fit • Regardless of the magnitude of fit indices, individual path coefficients are very important • It is entirely possible to generate good indices for a bad model • For any given data set, there are many very different models that “fit”

  39. Absolute Fit • Knowledge of fit in an absolute sense is helpful but insufficient • Also helps to know how a model compares to alternatives • Relative fit indices help, but generally involve comparisons against a straw man (e.g., the null model) • Better to evaluate hypothesized model against plausible alternatives (e.g., additive model in MSEM)

  40. Decoupling measurement and structural models • Consider the following model (path diagram omitted from transcript): excluding latent variances and correlated errors, there are 21 path coefficients to be estimated. Only 1 of these is part of the structural model

  41. What happens when the ratio of meas. to struct. linkages is large? • Fit is driven largely by the measurement model • Thus, good fit can be achieved even if the latent vars. are unrelated to one another • Good fit can be impossible even if the latent vars. are strongly related to one another

  42. Anderson & Gerbing • These authors suggested a two step approach • Evaluate the measurement model in the first step (i.e., CFA). • Once a measurement model is settled upon, its values are fixed. • Only then is the structural model evaluated • Fit indices will then give a better picture of the degree to which hypotheses are supported

  43. Specification searches • If the fit of the hypothesized model is inadequate, one can conduct a specification search • This is an attempt to identify the sources of lack of fit • Modification indices are used most often

  44. Modification indices • MIs give the reduction in chi-squared that would be achieved with the addition of a given path • In many models, inferior fit is due to omission of a small number of paths • So, perhaps we should simply add these paths and move on

  45. Not so fast! • Often, the largest MI values are attached to paths for which there is no theoretical basis • A path should only be added if a theoretical case can be made for it, albeit post hoc

  46. What about correlated errors? • Often, the largest MIs are attached to paths AMONG errors (i.e., off-diagonal elements of the theta or psi matrices) • There is seldom (but not never) any theoretical basis for these, so they should not be added • Exceptions include errors attached to isomorphic variables separated by time and errors attached to variables that share components

  47. Cross-validation • Regardless of justification, spec. searches are post hoc • If N is adequate, plan for cross-validation • Separate sample into two parts at the outset • Test hypotheses on the larger part • Conduct spec search • Test modified model on the holdout sample • This reduces capitalization on chance
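The splitting step above can be sketched as follows; `data` here is a hypothetical list of case indices standing in for the rows of an actual dataset, and the 2/3 split is an illustrative choice, not a prescription from the slides:

```python
import random

# Split the sample into an exploration part (test hypotheses, run the spec
# search) and a holdout part (test the modified model), as the slide describes.
random.seed(42)                        # fixed seed so the split is reproducible
data = list(range(300))                # hypothetical case indices
random.shuffle(data)

split = int(len(data) * 2 / 3)         # larger part for hypothesis testing
exploration = data[:split]
holdout = data[split:]

print(len(exploration), len(holdout))
```

Because the modified model is evaluated on cases the specification search never saw, capitalization on chance is reduced.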

  48. Overall recommendations • Base conclusions on path coefficients as well • Ignore fit for models with few df • Choose fit indices wisely, for yourself and for others! • Beware the pitfalls • Preempt objections to spec search with cross validation • But most important….

  49. Tune in to CARMA!
