## Structural Equation Models


Asma Alfadhel, Sarah Asio, Jimmy (Yuanshan) Cheng
10/10/2013

**Outline:**
• Part I:
  • CFA
• Part II:
  • SEM
  • SEM Plots
• Part III:
  • Goodness of Fit

**Part I (CFA) - Outline:**
• Confirmatory Factor Analysis (CFA)
• Available R packages
• The lavaan package
• Model description
• Apply CFA and interpret results

**Confirmatory Factor Analysis vs. EFA**
• EFA: exploratory
  • All loadings are free to vary (the loading matrix $L$ has no zeros)
  • Assumption: $\mathrm{Cov}(F) = I$
• CFA: driven by theory, which specifies
  • The number of factors
  • Correlations between factors, $\mathrm{Cov}(F) = \Phi$
  • Which items load onto which factors
• CFA allows certain loadings to be constrained to zero
[Diagram]

**Confirmatory Factor Analysis vs. SEM**
• SEM: specifies the causality between factors
  • Directed arrows between latent variables
  • Called the structural model
• CFA: no directed arrows between latent factors
  • Called the measurement model
• CFA is frequently used as a first step to assess the proposed measurement model in a structural equation model. (Wikipedia)

**Objective of CFA:**
• $\mathrm{Cov}(Y) = L\,\mathrm{Cov}(F)\,L^{T} + \Psi$
• Assumptions: factors are uncorrelated with error terms, and error terms are uncorrelated with each other
• $\mathrm{Cov}(Y)$: the covariance of the observed variables
• $\mathrm{Cov}(F) = \Phi$: the covariance of the factors
• $\Psi$: the covariance of the error terms
• Hence $\mathrm{Cov}(Y) = L\,\Phi\,L^{T} + \Psi$
• $\Sigma = \Sigma(\theta)$: the observed covariance (left) should equal the model-implied covariance (right)
• Goal: match the implied covariance to the observed covariance

**R Packages for SEM**
• “sem” package: developed by John Fox; for a long time it was the only option in R.
• “OpenMx” package: developed by Steven Boker.
• “lavaan” package: developed by Yves Rosseel of Ghent University, Belgium.

**The “lavaan” Package:**
• lavaan is an R package for latent variable analysis:*
  • confirmatory factor analysis: function cfa()
  • structural equation modeling: function sem()
  • latent curve analysis / growth modeling: function growth()
  • general mean/covariance structure modeling: function lavaan()
  • (item response theory (IRT) models)
  • (latent class + mixture models)
  • (multilevel models)
• More information:
  • lavaan website
  • “lavaan: An R Package for Structural Equation Modeling”, Journal of Statistical Software
* http://users.ugent.be/~yrosseel/lavaan/lavaan1.pdf

**“cfa” function:**
• Description: Fit a Confirmatory Factor Analysis (CFA) model.
• Usage:

    cfa(model = NULL, data = NULL, meanstructure = "default",
        fixed.x = "default", orthogonal = FALSE, std.lv = FALSE,
        std.ov = FALSE, missing = "default", ordered = NULL,
        sample.cov = NULL, sample.cov.rescale = "default",
        sample.mean = NULL, sample.nobs = NULL, ridge = 1e-05,
        group = NULL, group.label = NULL, group.equal = "",
        group.partial = "", cluster = NULL, constraints = '',
        estimator = "default", likelihood = "default",
        information = "default", se = "default", test = "default",
        bootstrap = 1000L, mimic = "default", representation = "default",
        do.fit = TRUE, control = list(), WLS.V = NULL, NACOV = NULL,
        start = "default", verbose = FALSE, warn = TRUE, debug = FALSE)

• Arguments:
  • model: A description of the user-specified model.
  • data: An optional data frame containing the observed variables used in the model.
  • std.lv: If TRUE, the metric of each latent variable is determined by fixing its variance to 1.0. If FALSE, the metric of each latent variable is determined by fixing the factor loading of its first indicator to 1.0.
  • std.ov: If TRUE, all observed variables are standardized before entering the analysis.
  • missing: If the data contain missing values, the default behavior is listwise deletion.
    If the missing mechanism is MCAR (missing completely at random) or MAR (missing at random), the lavaan package provides case-wise (or 'full information') maximum likelihood estimation (set missing = "ML").

**Model Description**
• The dataset was collected by Sarah Asio.
• The original model consists of 12 factors and 42 observed indicators.
• For simplification, a sub-model was used; it consists of 4 factors and 23 observed variables.
• The dataset contains a sample of 381 responses from students.
• The items range in value from 1 to 6.
• Factors: Team Innovation, Team Effort, Team Learning, Team Communication

**Specifying the model: (Symbols)**
• =~ “latent variable definition”
  • latent variable =~ indicator1 + indicator2 + indicator3
  • It defines how the latent variables are 'manifested by' a set of observed variables.
• The model syntax is this short because the function takes care of several things:
  • First, by default, the factor loading of the first indicator of a latent variable is fixed to 1, thereby fixing the scale of the latent variable.
  • Second, residual variances are added automatically.
  • Third, all exogenous latent variables are correlated by default.
• Source: http://lavaan.ugent.be/tutorial/cfa.html

**Specifying the model: (Symbols)**
• ~~ “Correlation”: correlated with
  • Residual variances
  • Covariances between latent variables
• ~ “Regression”: regressed on
  • Used in specifying the SEM model.

**Specifying the model:**

    # Specify the model
    Our.model <- '
      CMM =~ CM9 + CM10 + CM11 + CM12 + CM13
      EFF =~ EF14 + EF15 + EF16 + EF17
      LN  =~ LN18 + LN19 + LN20 + LN21 + LN22 + LN23 + LN24
      INN =~ IN36 + IN37 + IN38 + IN39 + IN40 + IN41 + IN42
    '
    fit <- cfa(Our.model, data = MyData)
    summary(fit, fit.measures = TRUE)

**Missing values, Standardization, & R²**

    # Standardize latent and observed variables; use ML for missing values
    fit <- cfa(Our.model, data = MyData, std.lv = TRUE, std.ov = TRUE, missing = "ML")
    summary(fit, fit.measures = TRUE, rsquare = TRUE)

• OR

    # R-squared values without rounding
    inspect(fit, "rsquare")

    # Default scaling, with standardized estimates reported in the summary
    fit <- cfa(Our.model, data = MyData, missing = "ML")
    summary(fit, standardized = TRUE, rsquare = TRUE)

**2nd Output**

    fit <- cfa(Our.model, data = MyData, std.lv = TRUE, std.ov = TRUE, missing = "ML")
    inspect(fit, "rsquare")

**CFA Syntax in “lavaan” vs “sem”**

    install.packages("semPlot")
    library(semPlot)
    # Export the fitted model as lavaan syntax and as sem syntax
    Lavaan.model <- semSyntax(fit, "lavaan")
    Sem.model <- semSyntax(fit, "sem")

• Output: [not reproduced]

**Part II Outline**
• SEM process - Overview
• SEM Measurement models
• SEM Path diagram - Overview
• R-Code for:
  • SEM model specification
  • SEM model fitting
  • SEM Path Diagram
• Outputs for SEM model and path diagram

**Structural Equations Modeling (SEM) process**
[Process diagram not reproduced]
Source: “Factor Analysis, Path Analysis, and Structural Equations Modeling”, book extract, Jones and Bartlett Publishers. http://www.jblearning.com/samples/0763755486/55485_CH14_Walker.pdf

**SEM Measurement models**
• Endogenous measurement model: $Y = B_y Z + e_y$, where:
  • $Y$ is an $(n_y \times 1)$ vector of endogenous indicators,
  • $B_y$ is an $(n_y \times q)$ matrix of coefficients from the endogenous latent variable(s) to the endogenous indicators,
  • $Z$ is a $(q \times 1)$ vector of endogenous latent variable(s),
  • $e_y$ is an $(n_y \times 1)$ vector of errors associated with the endogenous indicators.
• Exogenous measurement model: $X = B_x U + e_x$, where:
  • $X$ is an $(n_x \times 1)$ vector of exogenous indicators,
  • $B_x$ is an $(n_x \times p)$ matrix of coefficients from the exogenous latent variables to the exogenous indicators,
  • $U$ is a $(p \times 1)$ vector of exogenous latent variable(s),
  • $e_x$ is an $(n_x \times 1)$ vector of errors associated with the exogenous indicators.
Source: “Factor Analysis, Path Analysis, and Structural Equations Modeling”, book extract, Jones and Bartlett Publishers. http://www.jblearning.com/samples/0763755486/55485_CH14_Walker.pdf

**Overall SEM Measurement & Structural models**
• Structural model for the case study: $Z = B_z U + e_z$, where:
  • $Z$ is the endogenous latent variable,
  • $U$ is a $(3 \times 1)$ vector of exogenous latent variables,
  • $B_z$ is a $(1 \times 3)$ matrix of coefficients of the exogenous variables,
  • $e_z$ is the error associated with the endogenous variable.
Source: “Factor Analysis, Path Analysis, and Structural Equations Modeling”, book extract, Jones and Bartlett Publishers. http://www.jblearning.com/samples/0763755486/55485_CH14_Walker.pdf

**Matrix representation for SEM measurement models**
• $X = B_x U + e_x$
• $Y = B_y Z + e_y$
• $Z = B_z U + e_z$

**SEM Path diagram - Overview**
• A path diagram is a graphical representation of the hypothesized relationships between the variables.
• Exogenous variables: arrows emanate from them (analogous to independent variables).
  • Here: communication, effort and learning
• Endogenous variables: arrows point to them (analogous to dependent variables).
  • Here: innovation and the measured indicators
• The remaining variables are error terms, which account for random or measurement error in the endogenous variables.
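The structural and endogenous measurement equations above can be made concrete with a small simulation. The following is a minimal base-R sketch, not the case-study analysis: the sample size, all coefficient values, the error standard deviations, the use of only two endogenous indicators, and the assumption that the exogenous factors are uncorrelated with unit variance are hypothetical choices made for illustration.

```r
# Minimal base-R sketch; every numeric value below is hypothetical.
set.seed(1)
n <- 5000                      # hypothetical sample size
p <- 3                         # exogenous latent variables (e.g. CMM, EFF, LN)

# Exogenous latent variables U, one row per respondent (n x p),
# assumed uncorrelated with unit variance, i.e. Cov(U) = I
U <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Structural model: Z = Bz U + ez, with Bz a 1 x 3 coefficient matrix
Bz <- matrix(c(0.5, 0.3, 0.2), nrow = 1)
ez <- rnorm(n, sd = 0.5)
Z  <- U %*% t(Bz) + ez         # n x 1 endogenous latent variable (e.g. INN)

# Endogenous measurement model: Y = By Z + ey, with two indicators of Z
By <- matrix(c(1.0, 0.8), ncol = 1)            # ny x q, here 2 x 1
ey <- matrix(rnorm(n * 2, sd = 0.4), nrow = n)
Y  <- Z %*% t(By) + ey         # n x 2 endogenous indicators

# Model-implied covariance of Y: By Var(Z) By' + Cov(ey)
var_Z     <- as.numeric(Bz %*% t(Bz)) + 0.5^2
implied_Y <- By %*% t(By) * var_Z + diag(0.4^2, 2)

print(round(cov(Y), 2))        # sample covariance of the indicators
print(round(implied_Y, 2))     # covariance implied by the model
```

With a large simulated sample the two printed matrices should be close, which is the matching of observed and implied covariance that model fitting aims for.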
Source: http://en.wikipedia.org/wiki/Structural_equation_modeling

**Path Diagram Node representations**
[Figure not reproduced]
Source: http://people.ucsc.edu/~zurbrigg/psy214b/09SEM3a.pdf

**R-Code for SEM model specification**

    # Specify the model: four measurement equations plus one regression
    Our.model <- '
      CMM =~ CM9 + CM10 + CM11 + CM12 + CM13
      EFF =~ EF14 + EF15 + EF16 + EF17
      LN  =~ LN18 + LN19 + LN20 + LN21 + LN22 + LN23 + LN24
      INN =~ IN36 + IN37 + IN38 + IN39 + IN40 + IN41 + IN42
      INN ~ CMM + EFF + LN
    '
    # Install and load the lavaan package
    install.packages("lavaan")
    library(lavaan)

**R-Code for SEM model fitting**

    # Fit SEM model using standardized data
    fit <- lavaan::sem(Our.model, data = SEMdata, std.lv = TRUE, std.ov = TRUE, missing = "ML")
    summary(fit, standardized = TRUE, fit.measures = TRUE, rsquare = TRUE)

• Syntax definitions:
  • std.lv: If TRUE, the metric of each latent variable is determined by fixing its variance to 1.0. If FALSE, the metric of each latent variable is determined by fixing the factor loading of its first indicator to 1.0.
  • std.ov: If TRUE, all observed variables are standardized before entering the analysis.
  • missing: If "listwise", cases with missing values are removed listwise from the data frame before analysis. If "direct", "ml" or "fiml" and the estimator is maximum likelihood, Full Information Maximum Likelihood (FIML) estimation is used, using all available data in the data frame.
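The effect of std.lv can be seen by fitting one model under both scalings. This is a minimal sketch that assumes the lavaan package is installed; because SEMdata is not available here, the PoliticalDemocracy dataset that ships with lavaan stands in for it, and the model below is that dataset's standard illustration rather than the team-innovation model.

```r
library(lavaan)

# PoliticalDemocracy ships with lavaan; it stands in for SEMdata here.
demo.model <- '
  # measurement model
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8
  # structural model
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
'

# Default scaling: the first loading of each factor is fixed to 1
fit.marker <- sem(demo.model, data = PoliticalDemocracy)

# std.lv = TRUE: latent (residual) variances are fixed to 1 instead,
# and every loading is estimated freely
fit.stdlv <- sem(demo.model, data = PoliticalDemocracy, std.lv = TRUE)

# Both scalings describe the same model, so the overall fit should agree
print(fitMeasures(fit.marker, c("chisq", "df")))
print(fitMeasures(fit.stdlv,  c("chisq", "df")))
```

The choice between the two scalings changes how loadings and variances are reported, not how well the model fits.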
• Source: http://cran.r-project.org/web/packages/lavaan/lavaan.pdf

**R-Code for SEM Path Diagram**

    # Install and load the semPlot package
    install.packages("semPlot")
    library(semPlot)
    # Plot input path diagram
    semPaths(fit, title = FALSE, curvePivot = TRUE, exoVar = FALSE, exoCov = FALSE)
    # Plot output path diagram with standardized parameters
    semPaths(fit, "std", curvePivot = TRUE, exoVar = FALSE, exoCov = FALSE)

• For more options and syntax definitions, refer to:
  • http://cran.r-project.org/web/packages/semPlot/semPlot.pdf

**Part III (Goodness of fit) - Outline**
• Introduction to fit indices
• Using R to show these indices
• Modification indices

**Goodness of fit**
• Model fit: “how the model that best represents the data reflects underlying theory”
• The population covariance matrix ($\Sigma$) should match the implied covariance matrix ($\Sigma(\theta)$)
• There is not yet agreement on:
  • Which indices to use
  • Cut-offs for the various indices
• Source: Hooper et al. (2008)

**Overview of Indices**
[Table not reproduced]
• Source: Hooper et al. (2008)

**Benchmarks Summary**
[Table not reproduced]
• Source: Hooper et al. (2008)

**Reporting Strategy**
• It is not necessary to report all indices
• Do not report only the indices that look good
• CFI, GFI, NFI, and NNFI are the most commonly reported (McDonald and Ho, 2002)
• Source: Hooper et al. (2008)

**Reporting Strategy**
• Hooper et al. (2008):
  • Chi-square, df, p-value
  • RMSEA, SRMR, CFI and one parsimony fit index
• Two-index presentation strategy (Hu and Bentler, 1999):
  • TLI and SRMR
  • RMSEA and SRMR
  • CFI and SRMR

**Modification indices**
• Purpose: improve the model fit by freeing fixed parameters
• CFA is structured by theory:
  • Each factor is measured by certain observed variables, not all of them
  • The remaining loadings are assumed to be zero
• Error correlations are assumed to be zero
  • This is just a practical standard (Westfall et al., 2012)
• Source: Wikipedia

**Freeing fixed parameters**
[Path diagram not reproduced: factors F1, F2; indicators X1, X2, X3, X4; error terms e1, e2, e3, e4]

**Modification Indices**
• Do not allow modification indices to drive the process
• Any modification should make theoretical sense
• It is good practice to assess the fit
• Source: Hooper et al. (2008)
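The reporting strategy and modification indices above can be tried in R along the following lines. This is a minimal sketch that assumes the lavaan package is installed; since the team dataset is not available here, it uses the HolzingerSwineford1939 dataset that ships with lavaan and its standard three-factor model, and the cut-off of 10 for listing modification indices is an arbitrary choice for illustration.

```r
library(lavaan)

# Three-factor CFA on a dataset that ships with lavaan (stand-in data)
HS.model <- '
  visual  =~ x1 + x2 + x3
  textual =~ x4 + x5 + x6
  speed   =~ x7 + x8 + x9
'
fit <- cfa(HS.model, data = HolzingerSwineford1939)

# Indices from the reporting strategy: chi-square, df, p-value,
# RMSEA, SRMR, CFI and TLI (TLI is the same index as NNFI)
print(fitMeasures(fit, c("chisq", "df", "pvalue",
                         "rmsea", "srmr", "cfi", "tli")))

# Modification indices, largest first. Each row is a parameter that is
# currently fixed; "mi" is the approximate drop in chi-square if it were
# freed, and "epc" is the expected value of the freed parameter.
mi <- modificationIndices(fit, sort. = TRUE, minimum.value = 10)
print(head(mi[, c("lhs", "op", "rhs", "mi", "epc")]))
```

A large modification index only flags a candidate; whether to free that parameter still depends on whether the change makes theoretical sense.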