Structural Equation Modeling

Karl L. Wuensch

Dept of Psychology

East Carolina University

Nomenclature
• AKA
• “causal modeling.”
• “analysis of covariance structure.”
• Two sets of variables
• Indicators – measured (observed, manifest) variables – diagramed within rectangles
• Latent variables – factors – diagramed within ellipses
Causal Language
• Indicators and latent variables may be classified as “independent” or “dependent”
• Even if no variables are manipulated
• Based on the causal model being tested.
• In the diagram, indicators and latent variables may be connected by arrows
• One-headed = unidirectional causal flow
• Two-headed = direction not specified
Paths
• “Dependent” variables have one-headed arrows pointing to them.
• “Independent” variables do not.
• “Dependent” variables also have residuals (are not perfectly predicted by the independents)
• Called errors (e) for observed variables
• Disturbances (d) for latent variables
Two Models
• Measurement Model – how the measured variables are related to the latent variables.
• Structural Model – how the latent variables are related to each other.
Two Variance Covariance Matrices
• The sample matrix – computed from the sample data.
• The estimated population matrix – estimated from the model.
• χ² test of the null hypothesis that the model fits the data well.
• More useful are goodness of fit estimators
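As a rough illustration of the two matrices, the sketch below computes a sample variance/covariance matrix with NumPy and builds a model-implied matrix for a one-factor model. The data and parameter values are made up, and the one-factor form (factor variance times the outer product of the loadings, plus a diagonal matrix of error variances) is the standard textbook expression rather than something stated in the slides.

```python
import numpy as np

# Hypothetical raw data: 6 cases on 3 measured variables (made-up numbers).
data = np.array([
    [2.0, 3.0, 1.0],
    [4.0, 5.0, 2.0],
    [3.0, 4.0, 4.0],
    [5.0, 7.0, 3.0],
    [6.0, 6.0, 5.0],
    [4.0, 5.0, 3.0],
])

# Sample variance/covariance matrix (rows are cases, so rowvar=False).
S = np.cov(data, rowvar=False)


def implied_cov_one_factor(loadings, factor_var, error_vars):
    """Model-implied covariance for one latent variable with uncorrelated errors."""
    lam = np.asarray(loadings, dtype=float).reshape(-1, 1)
    return factor_var * (lam @ lam.T) + np.diag(error_vars)


# Hypothetical parameter values, not estimates from a real analysis.
Sigma = implied_cov_one_factor([1.0, 1.2, 0.8], factor_var=1.5,
                               error_vars=[0.5, 0.6, 0.9])

# Residual matrix: entries near zero mean the model reproduces the sample matrix.
residuals = S - Sigma
print(residuals.round(2))
```

In a real analysis the parameter values would come from the estimation step, not be typed in by hand.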
Sample Size
• Need at least 200 cases even for a simple model.
• Rule of thumb: at least 10 cases per estimated parameter.
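The two guidelines can be combined into a quick check. Taking the larger of the two numbers is an assumption made for this sketch; the slides list the guidelines separately.

```python
def minimum_sample_size(n_estimated_parameters, floor=200, cases_per_parameter=10):
    """Larger of the fixed floor and the cases-per-parameter rule of thumb."""
    return max(floor, cases_per_parameter * n_estimated_parameters)


print(minimum_sample_size(11))  # 200: the floor dominates for a small model
print(minimum_sample_size(35))  # 350: the per-parameter rule dominates
```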
Assumptions & Problems
• Multivariate normality.
• Linear relationships
• But can include polynomial components
• A singular matrix may crash the program
• Multicollinearity can be a problem
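One way to screen for these problems before fitting is to inspect the rank and condition number of the variance/covariance matrix. The sketch below uses a made-up matrix in which the third variable is the exact sum of the first two; the variable names and values are illustrative only.

```python
import numpy as np

# Hypothetical covariance matrix for x, y, and z = x + y.
# var(x) = var(y) = 1 and cov(x, y) = 0.5, so cov(x, z) = cov(y, z) = 1.5
# and var(z) = 1 + 1 + 2(0.5) = 3.
S = np.array([
    [1.0, 0.5, 1.5],
    [0.5, 1.0, 1.5],
    [1.5, 1.5, 3.0],
])

rank = np.linalg.matrix_rank(S)
is_singular = rank < S.shape[0]
print(rank, is_singular)

# A nearly redundant variable is not exactly singular, but it shows up
# as a very large condition number.
S_near = S.copy()
S_near[2, 2] += 1e-6
print(np.linalg.cond(S_near))
```

A singular matrix cannot be inverted, which is why the program may fail; a very large condition number is a warning sign of multicollinearity.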
Simple Example from T&F

Regression Parameters
• Regression coefficients for the paths
• May be “fixed” to value 0 (no path) or 1
• Or estimated from the data (*).
Variance/Covariance Parameters
• Variances/Covariances of the “independent” variables
• May be estimated
• Or fixed, either to 1 or
• to the variance of a “marker” measured variable (by setting the path to the marker to 1).
• Variances for “dependent” latent variables are usually fixed to the variance of one of the measured variables (by setting the path to that measured variable to 1).
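The slides do not spell out why a latent variance has to be fixed, but the usual reason is that a latent variable has no scale of its own. The sketch below, with made-up loadings and variances, shows that multiplying the loadings by a constant and dividing the latent variance by its square leaves the implied covariance matrix unchanged, so one of them has to be pinned down. The function name `implied_cov` is hypothetical.

```python
import numpy as np


def implied_cov(loadings, factor_var, error_vars):
    """Implied covariance for one latent variable with uncorrelated errors."""
    lam = np.asarray(loadings, dtype=float).reshape(-1, 1)
    return factor_var * (lam @ lam.T) + np.diag(error_vars)


loadings = np.array([0.9, 1.2, 0.6])  # hypothetical values
error_vars = [0.4, 0.5, 0.7]
factor_var = 2.0

# Rescale the latent variable by an arbitrary constant c.
c = 3.0
same = np.allclose(
    implied_cov(loadings, factor_var, error_vars),
    implied_cov(c * loadings, factor_var / c**2, error_vars),
)
print(same)  # True: the data cannot tell the two scalings apart

# Marker-variable scaling: choose the scaling in which the first path is 1.
marker_loadings = loadings / loadings[0]
marker_factor_var = factor_var * loadings[0] ** 2
```

Fixing the latent variance to 1 and fixing the marker path to 1 are two ways of choosing one member of this family of equivalent solutions.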
Model Identification
• A model is “identified” if there is a unique solution for each of the estimated parameters.
• Determine the number of input data points (values in the sample variance/covariance matrix).
• This is m(m + 1)/2, where m = number of measured variables.
• For T&F’s simple model, 5(6)/2 = 15 data points.
• If the number of data points = the number of parameters to be estimated, the model is “just identified,” or “saturated,” and the fit will be perfect.
• If there are fewer data points than parameters to be estimated, the model is “under identified” and the analysis is kaput.
The “Over Identified” Model
• The number of input data points exceeds the number of parameters to be estimated.
• This is the desired situation.
• For T&F’s simple model, count the number of asterisks in the diagram. I count 11.
• 15 input data points, 11 parameters to estimate, so we have an over identified model.
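The counting rule can be sketched in a few lines. The function names are hypothetical, and the count is only a necessary condition: a model can pass it and still fail to be identified for other reasons.

```python
def n_data_points(m):
    """Number of distinct variances and covariances among m measured variables."""
    return m * (m + 1) // 2


def identification_status(m, n_parameters):
    """Return (data points, degrees of freedom, label) from the counting rule."""
    points = n_data_points(m)
    df = points - n_parameters
    if df > 0:
        label = "over identified"
    elif df == 0:
        label = "just identified"
    else:
        label = "under identified"
    return points, df, label


# T&F's simple model: 5 measured variables, 11 parameters to estimate.
print(identification_status(5, 11))  # (15, 4, 'over identified')
```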
Eleven Parameters (*)

Identification of the Measurement Model Should Be OK if
• Only one latent variable, at least three indicators, errors not correlated.
• Two or more latent variables, each has at least three indicators, errors not correlated, each indicator loads on only one latent variable, latent variables are allowed to covary.
• Two or more latent variables, one has only two indicators, errors are not correlated, each indicator loads on only one latent variable, none of the latent variables has a variance or covariance of zero.
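The "at least three indicators" condition for a single latent variable can be checked with the same counting logic. The sketch assumes the latent variance is fixed to 1 and errors are uncorrelated, so the parameters are one loading and one error variance per indicator; the function name is hypothetical.

```python
def one_factor_df(k):
    """Degrees of freedom for one latent variable with k indicators.

    Assumes the latent variance is fixed to 1 and errors are uncorrelated.
    """
    data_points = k * (k + 1) // 2
    parameters = 2 * k  # k loadings plus k error variances
    return data_points - parameters


for k in (2, 3, 4):
    print(k, one_factor_df(k))
# 2 -1  (under identified)
# 3 0   (just identified)
# 4 2   (over identified)
```

With only two indicators the latent variable cannot stand alone, which is why the two-indicator case requires that it covary with other latent variables in the model.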
Identification of the Structural Model May Be OK if
• None of the latent DVs predicts another latent DV,
• or if one does, it is recursive (unidirectional) and the disturbances are not correlated
• If there are nonrecursive relationships (an arrow from A to B and from B to A), hire an expert in SEM.
Error in Identification
• If there is a problem, the software will throw an error.
• The software may suggest a way to reach identification.
• You must tinker with the model to make it identified.
Estimation
• Maximum Likelihood most common
• An iterative procedure used to maximize fit between the sample var/cov matrix and the estimated population var/cov matrix.
• Generalized Least Squares estimation has also fared well in Monte Carlo comparisons of techniques.
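The quantity that maximum likelihood minimizes is commonly written as F = ln|Σ| − ln|S| + tr(SΣ⁻¹) − p, where S is the sample matrix, Σ the model-implied matrix, and p the number of measured variables. The slides do not give this formula; the sketch below uses the standard textbook form with made-up matrices and a hypothetical sample size, and it only evaluates the function rather than performing the iterative search.

```python
import numpy as np


def ml_discrepancy(S, Sigma):
    """Maximum likelihood discrepancy between sample and implied covariance."""
    S = np.asarray(S, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    p = S.shape[0]
    sign_s, logdet_s = np.linalg.slogdet(S)
    sign_m, logdet_m = np.linalg.slogdet(Sigma)
    if sign_s <= 0 or sign_m <= 0:
        raise ValueError("both matrices need a positive determinant")
    # trace(Sigma^-1 S) equals trace(S Sigma^-1); solve avoids an explicit inverse.
    return logdet_m - logdet_s + np.trace(np.linalg.solve(Sigma, S)) - p


# Hypothetical sample matrix and a poor model that implies no covariances.
S = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])
Sigma_independence = np.eye(3)

F_perfect = ml_discrepancy(S, S)                # essentially 0: perfect fit
F_poor = ml_discrepancy(S, Sigma_independence)  # positive: misfit

n_cases = 200  # hypothetical sample size
chi_square = (n_cases - 1) * F_poor
print(F_perfect, F_poor, chi_square)
```

An estimation routine would repeatedly adjust the free parameters that generate Σ until this value stops decreasing.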
Modifying and Comparing Models
• May simplify a model by deleting one or more parameters in it.
• The simplified model is nested within the previous model.
• Calculate the difference χ² = the difference between the two models’ chi-squares.
• df = number of parameters deleted.
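A minimal sketch of the difference test, using only the standard library. The chi-square values below are made up, and `chi2_sf` is a hypothetical helper that computes the upper-tail probability for integer degrees of freedom from the incomplete gamma recurrence; a statistics library would normally be used instead.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (integer df >= 1)."""
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)             # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))  # closed form for df = 1
    # Step up two degrees of freedom at a time until a = df / 2.
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def chi2_difference_test(chi2_simplified, chi2_full, n_deleted):
    """Difference chi-square, its df, and its p value for two nested models."""
    diff = chi2_simplified - chi2_full
    return diff, n_deleted, chi2_sf(diff, n_deleted)


# Hypothetical fit statistics: two parameters were deleted from the full model.
diff, df, p = chi2_difference_test(chi2_simplified=25.9, chi2_full=18.3, n_deleted=2)
print(round(diff, 2), df, round(p, 4))
```

A small p value suggests that the deleted parameters were contributing to fit, so the simplified model fits significantly worse.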
LM and Wald Tests
• Lagrange Multiplier Test
• Would fit be improved by estimating a parameter that is currently fixed?
• Wald Test
• Would fixing this parameter significantly reduce the fit?
• Available in SAS Calis, not in Amos
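For a single parameter, the Wald statistic is commonly computed as the squared ratio of the estimate to its standard error and referred to a chi-square distribution with 1 df. The sketch below illustrates that univariate case only, with made-up numbers; it does not reproduce the output of SAS Calis or any other program, and the Lagrange Multiplier test is not shown because it needs the derivatives of the fit function.

```python
import math


def wald_test(estimate, standard_error):
    """Univariate Wald chi-square (1 df) and p value for H0: parameter = 0."""
    w = (estimate / standard_error) ** 2
    p = math.erfc(math.sqrt(w / 2.0))  # upper tail of chi-square with 1 df
    return w, p


# Hypothetical path coefficient and standard error.
w, p = wald_test(estimate=0.42, standard_error=0.15)
print(round(w, 2), round(p, 4))
```

A significant result suggests that fixing the parameter to zero would significantly reduce fit, so the path should stay in the model.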
Reliability of Measured Variables
• Assume that the variance in the measured variable is due to variance in the latent variable (the “true scores”).
• Reliability = true variance divided by total variance (true plus error variance).
• Thus, estimated reliability = the r² between the measured variable and the latent variable.
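The two definitions above agree, as this small sketch shows. It assumes a standardized solution (measured variable and latent variable both with variance 1), so the loading is the correlation between them; the loading of 0.8 is a made-up value.

```python
def reliability(true_var, error_var):
    """True variance as a proportion of total (true plus error) variance."""
    return true_var / (true_var + error_var)


def reliability_from_loading(standardized_loading):
    """Squared correlation between a measured variable and its latent variable."""
    return standardized_loading ** 2


loading = 0.8                   # hypothetical standardized loading
true_var = loading ** 2         # variance explained by the latent variable
error_var = 1.0 - loading ** 2  # remaining variance in a standardized variable

print(round(reliability(true_var, error_var), 2))
print(round(reliability_from_loading(loading), 2))
```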