Structural Equation Modeling : A simple-complex multivariate technique

### Structural Equation Modeling: A simple-complex multivariate technique

By:

Caroline Quesnel

Carole Scherling

Nancy Wallis

SEM

• Categorization of SEM
• Used in many fields
• Path Analysis
• Confirmatory Factor Analysis
• Combining all for SEM analysis
• Steps for SEM
• Summary of SEM
Categorization of SEM
• Since many different kinds of models fall into each of the categories, structural modeling as an enterprise is very difficult to characterize.
• Major components include:
• Path analysis
• Confirmatory factor analysis

Categorization of SEM

• When SEM is useful:
• When you need to deal with latent (unobserved) constructs.
• When you have a strong theoretical background to your data (a priori hypothesis).
• When you are examining complex relationships.
Not only in social sciences…
• Social Psychology
• Structural equation modeling the use of a risk assessment instrument in child protective services
• A Structural Equation Model of Social Influences and Exposure to Media Smoking on Adolescent Smoking
• Application of structural equation modeling to evaluate the Intention of shippers to use Internet services in liner shipping.
• Human Capital and SME Internationalization: A Structural Equation Modeling Study
• Health/Medicine
• Application of Structural Equation Modeling to Health Outcomes Research
• Structural equation modeling of sleep apnea, inflammation, and metabolic dysfunction in children
Not only in social sciences…
• Neurosciences
• Connectivity exploration with structural equation modeling: an fMRI study of bimanual motor coordination
• Unified structural equation modeling approach for the analysis of multisubject, multivariate functional MRI data
• A structural equation modeling analysis of attentional control: an event-related fMRI study
• Cognition
• Validation of Cognitive Structures: A Structural Equation Modeling Approach.
• Static and Dynamic Longitudinal Structural Analyses of Cognitive Changes in Old Age

[Figure: path diagram A → B → C]

What is path analysis?
• Extension of multiple regression that lets us consider more than one dependent variable (DV) at a time and, more importantly, lets a variable be both a DV AND an independent variable (IV).
• In other words, it permits us to consider chains of association, such that A can influence B, and B in turn can affect C.
• Therefore, B is a DV because it is affected by A, but also an IV because it predicts C.
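The chain described above can be sketched with simulated data. This is only an illustration: the path weights (0.6 and 0.5), the sample size, and the helper `slope` are made up for the example, and the paths are estimated with two simple regressions rather than with an SEM program.

```python
import random

def slope(x, y):
    """Ordinary least-squares slope of y regressed on a single predictor x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Simulated chain A -> B -> C (weights 0.6 and 0.5 are assumptions).
A = [rng.gauss(0, 1) for _ in range(n)]
B = [0.6 * a + rng.gauss(0, 1) for a in A]  # here B plays the role of a DV
C = [0.5 * b + rng.gauss(0, 1) for b in B]  # here B plays the role of an IV

path_ab = slope(A, B)             # estimate of the A -> B path
path_bc = slope(B, C)             # estimate of the B -> C path
indirect_ac = path_ab * path_bc   # effect of A on C carried through B

print(round(path_ab, 2), round(path_bc, 2), round(indirect_ac, 2))
```

The estimates should land close to the weights used to generate the data, which shows B being predicted in one equation and acting as the predictor in the next.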
What is path analysis?
• New terminology:
• Exogenous variable = not influenced by any other model variable.
• Endogenous variable = influenced by other model variables.
• Є = error term of an observed variable.
• Limitation: path analysis can only handle variables that are observed.
Extension of path analysis in SEM
• Instead of being limited to drawing paths among observed variables (as in path analysis), in SEM we can draw paths among latent variables.
• What are latent variables?
• Abstract, unobserved variables indirectly measured through their effects on observable variables.
• IQ, self-esteem and depression are good examples.
Extension of path analysis in SEM
• EX: The accepted construct of general intelligence (G) can only be evaluated indirectly, since there is no direct way of observing it.
• How? If subjects perform well across a range of assessed abilities (observed variables), a higher G score is attributed. Poor performance across the battery leads to a lower G score.
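A small simulation may make the idea of a latent variable more concrete. Everything here is hypothetical: the three test names, the loadings, and the error spreads are invented, and the battery mean is only a crude stand-in for a G score, not what an SEM program would compute.

```python
import random

def corr(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(1)  # fixed seed so the run is repeatable
n = 5000

# G is latent: it drives the test scores but is never recorded itself.
G = [rng.gauss(0, 1) for _ in range(n)]

# Three hypothetical ability tests, each = loading * G + unique error.
verbal = [0.8 * g + rng.gauss(0, 0.6) for g in G]
spatial = [0.7 * g + rng.gauss(0, 0.7) for g in G]
memory = [0.6 * g + rng.gauss(0, 0.8) for g in G]

# In this simulation the tests correlate only because they share G.
r_verbal_spatial = corr(verbal, spatial)

# Crude G score: the mean of the observed tests for each subject.
battery = [(v + s + m) / 3 for v, s, m in zip(verbal, spatial, memory)]
r_g_battery = corr(G, battery)

print(round(r_verbal_spatial, 2), round(r_g_battery, 2))
```

Subjects who score well across the battery tend to be the ones with a high simulated G, which is the indirect reasoning the slide describes.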
But there’s more…
• SEM is also in part composed of a form of factor analysis called Confirmatory Factor Analysis or CFA.
• So, let’s now discuss CFA and how it differs from the more commonly encountered forms of factor analysis.
What is factor analysis (FA)?
• Originally, this technique was used to EXAMINE/EXPLORE the data till something “desired” was revealed.
• Premises:
• You have many variables and want to examine whether they can be explained by a smaller number of factors.
• There is no a priori hypothesis (it is impossible to even indicate a hunch to the program) as to which variables will cluster together on which factor.
CFA presents a revised FA…
• The major difference is that an a priori hypothesis is essential:
• you specify which variables group together as manifestations of an underlying construct, and then test whether that model fits the data.
• Like with path analysis, it can be helpful to draw hypothesized relations in a diagram.
• Most commonly used computer programs, such as LISREL (SSI, Lincolnwood, IL), AMOS (SPSS, Chicago, IL), EQS (Multivariate Software, Encino, CA), and Mplus (Muthén & Muthén, Los Angeles, CA), accept these diagrams as input.
CFA ≠ model building
• With CFA, you stipulate where you think the variables should load. Then, the program simply tells you whether your model fits the data.
• If the model does not fit, there are few clues to guide you in shuffling the variables around to make it fit the data better.
• Note: Even if the model does fit, this does not guarantee that a different arrangement of variables would not fit even better.
• Therefore, you must really use theory, knowledge, or previous research to guide your model, rather than rely on statistical criteria.
CFA diagram

[Figure: CFA diagram showing a latent variable, its observed variables, and their error terms.]

Combining both for SEM

• Instead of being limited to drawing paths among the measured variables, as we were with path analysis, we can draw paths among the latent variables.
• Each of the latent variables has ideally 3 or more associated measured variables, so that each latent variable becomes a small CFA in its own right.

Constructing diagrams

3 types of diagram symbols used in SEM:

• Rectangles: observed variables (endogenous AND exogenous);
• Circles: disturbance, or error terms;
• Ovals: latent variables.

Constructing diagrams

• The direction of the arrows between symbols is important:
• for the analyses
• as a reflection of the underlying theory of latent variables, CFA, and SEM in general.
Squared values of the path coefficient (SMR)

SMR = squared value of a path coefficient

• Interpreted like R² in multiple regression:
• how much of the variance in one variable is explained by, or is in common with, the other variable.
Error variances

Error variances = the numbers over the arrows between the error terms and the observed variables.

• NB: For each observed variable, SMR + error variance = 1 (standardized solution)
• all the variance of a variable is divided between that shared with the latent variable and error.
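The split between shared variance and error can be worked through numerically. The loadings below are hypothetical standardized path coefficients, and `smr_and_error` is a helper named only for this sketch.

```python
# Hypothetical standardized loadings for three observed variables.
loadings = {"y1": 0.80, "y2": 0.70, "y3": 0.60}

def smr_and_error(loading):
    """Return (SMR, error variance) for one standardized path coefficient."""
    smr = loading ** 2           # variance shared with the latent variable
    error_variance = 1.0 - smr   # the remainder is attributed to error
    return smr, error_variance

for name, loading in loadings.items():
    smr, err = smr_and_error(loading)
    print(f"{name}: SMR = {smr:.2f}, error variance = {err:.2f}")
```

For a loading of 0.80, this gives an SMR of 0.64 and an error variance of 0.36, which together account for all of that variable's standardized variance.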
Path coefficients

• A path coefficient is interpreted as a regression value.
• Standardized coefficients typically range from -1 to 1
• Larger absolute value = stronger association
Steps to SEM
• Assumptions
• Model specification
• Model identification
• Model estimation
• Model testing
• Model modification (usually needed)
1. Assumptions
• Sample size: First count the number of parameters to be estimated; as a rule of thumb you need about 10 participants per parameter.
• Normality: Collected data should be normally distributed (although parameter estimates are relatively robust to violations).
• Independence: observations must not be repeated measures.
• Absence of outliers
• Absence of collinearity
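The sample-size rule of thumb is simple arithmetic, sketched below. The function names and the figures passed in (12 parameters, 100 participants) are arbitrary illustrations, not values from the presentation.

```python
def minimum_sample_size(n_parameters, participants_per_parameter=10):
    """Rule-of-thumb minimum N: a fixed number of participants per parameter."""
    if n_parameters < 1:
        raise ValueError("n_parameters must be at least 1")
    return n_parameters * participants_per_parameter

def sample_is_adequate(n_participants, n_parameters):
    """True when the sample meets the 10-per-parameter rule of thumb."""
    return n_participants >= minimum_sample_size(n_parameters)

print(minimum_sample_size(12))       # 120
print(sample_is_adequate(100, 12))   # False
```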
2. Model specification
• Although no mathematics is involved, it is probably the most difficult, and most important, part.
• No computer aid is available for this step.
• Draw out paths based on theory, literature, and knowledge.
• NB: Correlations between observed variables should not be too high (ex: an individual correlation > 0.85 may cause the program to fail)
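One way to screen for overly high correlations before running a model is sketched below. The data set is tiny and made up, the helper names are hypothetical, and the 0.85 cut-off is simply the figure quoted on the slide.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def flag_high_correlations(data, threshold=0.85):
    """Return (name_a, name_b, r) for every pair with |r| above threshold."""
    names = list(data)
    flagged = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(data[a], data[b])
            if abs(r) > threshold:
                flagged.append((a, b, r))
    return flagged

# Made-up scores: y2 is almost a copy of y1, y3 is largely unrelated.
data = {
    "y1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y2": [1.1, 2.0, 2.9, 4.2, 5.1, 5.9],
    "y3": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
flagged = flag_high_correlations(data)
for a, b, r in flagged:
    print(a, b, round(r, 3))
```

Only the y1 and y2 pair is flagged here; in practice one of the two variables would be dropped or the pair combined before specifying the model.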
3. Model identification
• Problem to solve: the data may fit more than one theoretical model equally well.
• Ex: if x + y = 10, there is an infinite number of possible solutions.
• Solution: Give the program more information than you are asking from it, so that you do not try to estimate more parameters than the number of observed variables allows.
• Ex: if y is fixed at 2 and x + y = 10, then x can be solved (x = 8).
3. Model identification

Determine the maximum # of parameters the data can support.

• Formula: v(v+1)/2, where v = # of observed variables. This is the number of unique variances and covariances among the observed variables.
• This formula shows whether you are trying to estimate more parameters than the existing data allow.
• You do not want to be JUST identified (no fit indices can be computed) or UNDER identified; you are looking to be OVER identified.
• Being OVER identified essentially means that there are more available data points than parameters you are trying to estimate.
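The counting rule can be expressed as a short check. This is a sketch with illustrative numbers (4 observed variables); the function names are hypothetical, and the count is a necessary condition only, so a model that passes it can still fail to be identified for other reasons.

```python
def available_data_points(v):
    """Unique variances and covariances among v observed variables."""
    return v * (v + 1) // 2

def identification_status(v, n_free_parameters):
    """Classify a model by comparing data points with free parameters."""
    df = available_data_points(v) - n_free_parameters
    if df < 0:
        return "under-identified"
    if df == 0:
        return "just-identified"
    return "over-identified"

# Illustrative numbers only: 4 observed variables give 10 data points.
print(available_data_points(4))       # 10
print(identification_status(4, 12))   # under-identified
print(identification_status(4, 10))   # just-identified
print(identification_status(4, 8))    # over-identified
```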

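[Figure: example model. Latent variables A, B, C; observed variables X1, X2, y1, y2, y3, y4 with error terms Єx1, Єx2, Єy1, Єy2, Єy3, Єy4; disturbances dB and dC.]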

3. Model identification

d= Disturbance (error term of latent variables)

3. Model identification

Steps

• Calculate the observed variables formula: v(v+1)/2 = 6(6+1)/2 = 21

3. Model identification
• Now that the limit is known, we can count the parameters to be estimated in the example:

a) total # of variances (exogenous variables): 1

• Ex: A = 1 (the variance of A)

b) total # of d: 2

c) total # of Є: 6

[Figure: the example model again, with three paths fixed to 1.]

3. Model identification

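d) Total # of free measurement paths (latent variable to observed variable): 3

Rule of thumb: For each latent variable, set one path to its set of observed variables to “1” (it is then fixed, no longer a free parameter, and needs no estimation). With 6 such paths and 3 fixed to 1, 3 remain to be estimated.

y = b1x1 + b2x2, where y is the DV

e) Total # of structural paths: 2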

3. Model identification

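3. Now we must add up all the values:

1 + 2 + 6 + 3 + 2 = 14

Since 14 parameters to estimate is fewer than the 21 available, the model is over-identified (21 − 14 = 7 degrees of freedom).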

• Please note that this task is made much easier because AMOS will tell you whether you have the correct number of parameters.
• It will give you an error, or not run at all, if the model is under-identified.
• NB: if your model is based on theory, identification should rarely be a problem.
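For readers who want to check the count by hand rather than rely on the program, the tally for the example model can be written out as follows. The category labels are paraphrased from the slides; the dictionary layout is just one convenient way to hold them.

```python
# Counts read off the slides for the example model (6 observed variables).
free_parameters = {
    "variances of exogenous variables": 1,
    "disturbances (d)": 2,
    "error terms": 6,
    "measurement paths left free": 3,
    "structural paths": 2,
}

v = 6
data_points = v * (v + 1) // 2             # unique variances and covariances
n_free = sum(free_parameters.values())     # parameters to be estimated
degrees_of_freedom = data_points - n_free  # positive means over-identified

print(data_points, n_free, degrees_of_freedom)  # 21 14 7
```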
4. Model estimation 5. Model testing

Steps

A) Run the model using the chosen program.

B) Verify fit (Is this a good model?)

i) Chi-squared (recommended, but often of limited use on its own)

- Index for “badness” of fit: non-significant value = good model.

- Very sensitive (for example to sample size): keep the result in mind but do not rely on it alone.

ii) Other indices are:

- RMSEA: ≤ 0.08 indicates a reasonable fit; < 0.05 indicates a good fit.

- CFI: range = 0 to 1 (interpreted like a measure of association or effect size); minimal acceptable value = 0.90 (except with a significant chi-squared, which requires 0.95).

- SRMR: a residual-based index, so smaller values mean better fit; values below about 0.08 are commonly treated as acceptable.
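To show where RMSEA and CFI come from, here is a sketch that computes them from chi-squared output. The formulas are the commonly cited textbook definitions and are not given in the slides; some programs divide by N rather than N − 1. The chi-squared values, degrees of freedom, and sample size are invented, and the "poor fit" label for values above 0.08 is an assumption added for completeness.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation (one common formula)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model)
    if d_baseline == 0.0:
        return 1.0
    return 1.0 - d_model / d_baseline

def describe_rmsea(value):
    """Label an RMSEA value using the cut-offs quoted on the slide."""
    if value < 0.05:
        return "good fit"
    if value <= 0.08:
        return "reasonable fit"
    return "poor fit"  # assumption: the slide gives no label above 0.08

# Hypothetical output from an SEM program: chi-squared values, df, N.
chi2_m, df_m = 14.0, 7
chi2_b, df_b = 250.0, 15
n = 200

r = rmsea(chi2_m, df_m, n)
c = cfi(chi2_m, df_m, chi2_b, df_b)
print(round(r, 3), describe_rmsea(r), round(c, 3), c >= 0.90)
```

With these invented numbers the RMSEA falls in the "reasonable fit" band and the CFI clears the 0.90 minimum, so the two indices agree.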

4. Model estimation 5. Model testing

• Whenever you are presenting a preferred model, it is also convention to demonstrate that you have explored other models.
• It is up to the researcher to explain why the preferred should not be rejected in favour of statistically equivalent ones.
6. Model modification
• If indices indicate a poor fit, you can make post-hoc modifications to see if it is possible to achieve fit:
• Omission of variables,
• Dropping non-significant paths.
• Caveat: SEM is a knowledge-based (theory-testing) statistical tool. Therefore, applying post-hoc modifications can be poor practice in theory.
6. Model modification
• NB: Must remember that it is unreasonable to expect a structural model to fit perfectly.
• A structural model with linear relations is only an approximation and the world is unlikely to be linear.
• So instead of asking “Does the model fit perfectly?”, you must ask “Does it fit well enough to be a useful approximation of reality and a reasonable explanation of the trends in the data?”.
6. Model modification
• NB: Simply because a model fits well, it does not prove that the model is correct.
• Fit indicates that you are on the right track; however, you must acknowledge the possibility that the model could also be wrong, or that another could be even better.
• Therefore, it is a fallacy to affirm proof.

[Figure: process diagram. Build knowledge base → Gather data → Design and run the model → Do indices reveal a good fit? Yes: tentative support for theoretical model. No: review omissions; modify or reject.]

The basic logic of SEM is embodied by this process diagram.

Quick example of SEM (AMOS)

Screens of an SEM output

• http://www.creative-wisdom.com/teaching/WBI/SEM.shtml
Limitations of SEM
• If there is not enough theoretical background, the model WILL suffer.
• The model is only as good as the validated tests used in the experiment to measure the observed variables.