
Statistical Modelling Special Topic: SEM


**1. **Statistical Modelling (Special Topic: SEM). Bidin Yatim, PhD, Associate Professor in Statistics,
College of Arts and Sciences, UUM. PhD Applied Statistics (Exeter, UK); MSc Industrial Maths (Aston, UK); BSc Maths & Stats (Nottingham, UK)

**3. **Modeling and Computing Steps In SEM
Implementing SEM Using AMOS
How to draw a model using AMOS.
How to run the AMOS model and evaluate several key components of the AMOS graphics and text output, including overall model fit and test statistics for individual path coefficients.
How to modify and re-specify a non-fitting model.

**4. **SEM

**5. **Steps in conducting an SEM: Specify the full model to be tested and check identification. Note: models that appear identified on paper may prove not to be statistically or empirically identifiable due to properties of the data.
Test fit of measurement model.
Re-specify and refit measurement model if model fit statistics, etc. indicate this is necessary.
Once the measurement model is determined to have a good fit, test the fit of the theoretical model.
Re-specify and refit theoretical model if model fit statistics, etc. indicate this is necessary.

**6. **Model Specification The exercise of formally stating a model.
The step in which parameters are determined to be fixed or free. Fixed parameters are not estimated from the data and are typically fixed at zero (indicating no relationship between variables). The paths of fixed parameters are labeled numerically (unless assigned a value of zero, in which case no path is drawn) in a SEM diagram. Free parameters are estimated from the observed data and are believed by the investigator to be non-zero.
Asterisks in the SEM diagram label the paths of free parameters.
Determining which parameters are fixed and which are free in a SEM is extremely important, because it determines which parameters will be used to compare the hypothesized model with the sample variance and covariance matrix in testing the fit of the model (Step 4).
The choice of which parameters are free and which are fixed in a model is up to the researcher. This choice represents the researcher’s a priori hypothesis about which pathways in a system are important in the generation of the observed system’s relational structure (e.g., the observed sample variance and covariance matrix).

**7. **Model Identification concerns whether a unique value for each and every free parameter can be obtained from the observed data.
It depends on the model choice and the specification of fixed, constrained and free parameters.
A parameter is constrained when it is set equal to another parameter.
Models need to be overidentified in order to be estimated (Step 3 in SEM construction) and in order to test hypotheses about relationships among variables (See Ullman 1996 for a more detailed explanation of the levels of model identification).
A necessary condition for overidentification is that the number of data points (the number of distinct variances and covariances, p(p+1)/2 for p observed variables) exceeds the number of free parameters to be estimated.
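This counting rule (often called the t-rule) can be sketched in a few lines of Python; the function name and the example numbers are illustrative only:

```python
def identification_status(p_observed: int, n_free_params: int):
    """Classify a model by the counting rule: compare the number of
    distinct variances/covariances among p observed variables,
    p(p+1)/2, with the number of free parameters to be estimated."""
    data_points = p_observed * (p_observed + 1) // 2
    df = data_points - n_free_params
    if df > 0:
        status = "over-identified"
    elif df == 0:
        status = "just-identified"
    else:
        status = "under-identified"
    return data_points, df, status

# e.g. 16 observed variables and 38 free parameters
print(identification_status(16, 38))  # → (136, 98, 'over-identified')
```

Passing the counting rule is necessary but not sufficient; a model can satisfy it and still be empirically unidentifiable.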

**8. **Steps In SEM
Model specification – based on EFA / CFA
Input Data - raw data or covariance/ correlation matrix
Covariance – preferred as a test of theory (explains magnitude and pattern of relationships)
Correlation – appropriate only if interested in patterns, not in explaining total variance
Identification – based on degrees of freedom
Estimation – Maximum Likelihood or GLS
Testing fit – use Five or Six test criteria
Model Re-specification – re-think about alternative model structure

**10. **Model specification List the indicators for each factor
Describe all fixed and constrained parameters
Demonstrate that the model is identified

**11. **Gotta Fix It to 1
Lyrics by Alan Reifman (may be sung to the tune of "Fortunate Son," John Fogerty). Video of performance (Dr. Reifman, lead vocals).
You make a construct, with its loadings,
Can’t let them all be free,
So that the model’s identified,
Fixing one is the key,
It ain’t free, it ain’t free, gotta fix it to 1,
It ain’t free, it ain’t free, in AMOS, automatically done
The number of knowns in your model,
The unknowns can’t exceed,
Fixing a loading for each construct,
Will accomplish this need,
It ain’t free, it ain’t free, gotta fix it to 1,
It ain’t free, it ain’t free, in AMOS, automatically done

**12. **Parsi-Mony
Lyrics by Alan Reifman (may be sung to the tune of "Mony Mony," Bloom/Gentry/James/Cordell)
Structural models need parsimony,
Don’t want to add paths that are phony,
Put the paths you need, now that’s all right, now,
You got to keep your model lean and tight, now, ...lean and tight now,
Yeah, yeah, yeah…
If you can account (PARSIMONY), For (PARSIMONY), The data (PARSIMONY), With a (PARSIMONY), Minimum of paths (PARSIMONY),
You’ve got (PARSIMONY), You’ve got (PARSIMONY),
You want parsimony ...mo ...mo ...mony, Parsimony ...mo ...mo ...mony, Parsimony ...mo ...mo ...mony...

**13. **Identification Often thought of as being a very sticky issue
Is a fairly sticky issue
The extent to which we are able to estimate everything we want to estimate. Identification is an important issue in SEM, but it is also a rather nasty one. I shall try to give a nice overview of identification, but in doing so I run the risk of oversimplifying the issues; for much better coverage I recommend that you have a look at Bollen (1989).

**14. **x = 4
Unknown: x. Imagine that we know one thing, that x = 4, and that we want to solve this equation to find the one unknown, which is x.
This is remarkably easy: x is 4.
In this example we had one piece of information that we put into the model, and one piece of information that we took out of the model.
Because this can be solved for the one unknown, it is referred to as being just-identified.

**15. **In this example we have two unknowns, x and y, and two pieces of information that we can use to find them.
Again, this is pretty straightforward.
It can be solved exactly; there are as many pieces of information as unknowns, so it is also just-identified.

**16. **Again, we know two things, and we have two unknowns to estimate.
The equations can be solved to give us answers of:
x = 2.5
y = 1.5
This is also just-identified.

**17. **Now we have two unknowns, x and y, but we only have 1 piece of information.
It is not possible to solve this equation to provide answers for the unknown variables. (Or rather, there are an infinite number of solutions, each of which will do equally well.)

**18. ** So, if the number of things that we know is equal to the number of things that we want to know, the model is just-identified.
It is always possible to solve this type of model, and the model can never be wrong.
What we have been calling “normal” statistics are (usually) just identified. It is always possible to get a solution.

**19. **If we are trying to get more information out of a system than we are putting into that system, then it is referred to as being not identified.
x + y = 7
Contains 1 piece of information, but we are asking for two pieces of information out of the system.
This is not identified, and can never be solved.
Similarly if we have a larger set of equations:
x - y = 4
z + y = 8
We cannot solve this set of simultaneous equations, because we are asking for more information out than we put in.

**20. **If we have a smaller number of unknowns than there are pieces of information, then we have a model which is over-identified.
This is important, because it is now possible to have a model that is wrong.
The equations:
x + y = 4
x - y = 2
2x - y = 3
Cannot be solved to give a satisfactory answer: they are inconsistent, so the model is wrong. (The closest you can get, in the least-squares sense, is x = 2.43, y = 1.29.)
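The over-identified system above can be checked numerically: ordinary least squares finds the closest attainable values (a NumPy sketch):

```python
import numpy as np

# Over-identified system: three equations, two unknowns
#   x + y  = 4
#   x - y  = 2
#   2x - y = 3
A = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [2.0, -1.0]])
b = np.array([4.0, 2.0, 3.0])

# No exact solution exists; lstsq minimizes ||Ax - b||^2
sol, residual, rank, _ = np.linalg.lstsq(A, b, rcond=None)
x, y = sol
print(round(x, 2), round(y, 2))  # → 2.43 1.29
print(residual)  # nonzero sum of squared residuals: the model is "wrong"
```

The nonzero residual is exactly what makes over-identified models useful: they can fail, and therefore they can be tested.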

**21. **Identification We have information
(Correlations, means, variances)
“Normal” statistics
Use all of the information to estimate the parameters of the model
Just identified
All parameters estimated
Model cannot be wrong
Carrying out “traditional” statistical tests involves estimating as many parameters as there are pieces of information. This means that these analyses are just-identified.

**22. **Over-identification SEM
Over-identified
The model can be wrong
If a model is a theory
Enables the testing of theories. SEM models (usually) estimate fewer parameters than the number of items of information that went into the model. This means that the models can be tested, and can be found to be wrong.

**25. **So What is Identification?-1

**26. **So What is Identification?-2

**27. ** Input data Description of sample characteristics and size
Sample size – 5 to 10 per parameter
Larger samples are needed for MLE estimation and non-normal data
Suggestion – 200 observations, not less than 100 or more than 400
Description of the type of data (e.g., nominal, interval, and scale range of indicators)
Tests of assumptions
Extent and method of missing data management
Complete information method – limits sample size
All-available method – can create estimation problems
Provide correlations, means, and SDs

**29. **Model Estimation

**30. **Estimation In this step, start values of the free parameters are chosen in order to generate an estimated population covariance matrix, Σ(θ), from the model. Start values can be chosen by the researcher from prior information, by the computer programs used to build SEMs, or from multiple regression analysis (see Ullman 1996 and Hoyle 1995 for more start-value choices and further discussion).
The goal of estimation is to produce a Σ(θ) that converges upon the observed sample covariance matrix, S, with the residual matrix (the difference between Σ(θ) and S) being minimized.
Various methods can be used to generate Σ(θ). Choice of method is guided by characteristics of the data, including sample size and distribution. Most procedures used are iterative. The general form of the minimization function is:
Q = (s − σ(θ))′W(s − σ(θ))
where s = vector containing the variances and covariances of the observed variables,
σ(θ) = vector containing the corresponding variances and covariances as predicted by the model, and
W = weight matrix. (Some authors refer to Q as F.)
The weight matrix, W, in the function above corresponds to the estimation method chosen. W is chosen to minimize Q, and Q(N−1) gives the fitting function, in most cases a χ²-distributed statistic. The performance of the χ² is affected by sample size, error distribution, factor distribution, and the assumption that factors and errors are independent (Ullman 1996).

**31. **Some of the most commonly used estimation methods are: Generalized Least Squares (GLS)
F_GLS = ½ tr[([S − Σ(θ)]W⁻¹)²]
tr = trace operator, takes the sum of the elements on the main diagonal of a matrix
W⁻¹ = optimal weight matrix, must be selected by the researcher (most common choice is S⁻¹)
Maximum Likelihood (ML), most often used:
F_ML = log|Σ(θ)| − log|S| + tr(SΣ(θ)⁻¹) − p
W = Σ(θ)⁻¹ and p = number of measured variables
Asymptotically Distribution Free (ADF) Estimator
F_ADF = [s − σ(θ)]′W⁻¹[s − σ(θ)]
W, in this function, contains elements that take kurtosis into account.

**32. **Ullman (1996) and Hoyle (1995) discuss the advantages and limitations of the above estimators. ML (less problematic) and GLS are useful for normally distributed, continuous data when factors and errors are independent. Refer to handout
ADF is useful for analyzing categorical data, but has been shown to work well only with sample sizes above 2,500. Refer to handout
What if the data are categorical and the sample size is not so large? Use indicators with 4 or more categories.
Ullman indicates that the best estimator for non-normally distributed data and/or dependence among factors and errors is the Scaled ML (See Ullman 1996 for further discussion).
Whatever function is chosen, the desired result of the estimation process is to obtain a fitting function that is close to 0. A fitting function score of 0 implies that the model’s estimated covariance matrix and the original sample covariance matrix are equal.
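As an illustration of this last point, the ML fitting function can be evaluated directly: it is exactly 0 when the model-implied matrix equals the sample matrix (a sketch; the 2×2 matrices below are invented for illustration):

```python
import numpy as np

def f_ml(S, Sigma):
    """Maximum-likelihood discrepancy:
    F_ML = log|Sigma| - log|S| + tr(S Sigma^-1) - p"""
    p = S.shape[0]
    _, logdet_s = np.linalg.slogdet(S)
    _, logdet_sig = np.linalg.slogdet(Sigma)
    return logdet_sig - logdet_s + np.trace(S @ np.linalg.inv(Sigma)) - p

# Illustrative 2x2 sample covariance matrix
S = np.array([[4.0, 1.2],
              [1.2, 9.0]])

# Perfect fit: model-implied matrix equals S, so F_ML = 0
print(f_ml(S, S))  # → 0.0 (up to rounding)

# Misfit: a model forcing the covariance to zero gives F_ML > 0
Sigma0 = np.diag(np.diag(S))
print(f_ml(S, Sigma0) > 0)  # → True
```

The further Σ(θ) is from S, the larger F_ML grows, which is what the χ² test of overall fit is built on.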

**33. **Model estimation Indicate software and version
Indicate type of data matrix analyzed
Indicate estimation method used

**34. **Assessing Fit of the Model: AMOS Output Divided into
Model, Parameters, and Estimation Summary
Model Assessment
Model Misspecification

**35. **Application 1: Barbara Byrne, pp. 55–93

**36. **Parameters and Model Summary

**37. **Parameters Summary Listed are
All the variables in the model according to their categorization- observed/ unobserved, endogenous/ exogenous (All factors/ error terms unobserved)
Total number of variables in the model and for each category.
Summary of the parameters: 32 regression weights, 20 fixed and 12 estimated.
Total of 58 parameters, 38 to be estimated

**38. **Overall Summary of the Model

**39. **Overall Summary The hypothesized model is of recursive type
Sample size is 265
Info to determine identification status: p(p+1)/2 = 16(17)/2 = 136 pieces of information, 38 parameters to be estimated, hence 136 − 38 = 98 degrees of freedom
Finally, a summary of the estimation process: chi-square value of 167.153 with p-value of .000

**40. **Limitations of Fit Indices Values of fit indices indicate only the average or overall fit of a model. It is thus possible that some parts of the model may poorly fit the data.
Because a single index reflects only a particular aspect of model fit, a favorable value of that index does not by itself indicate good fit. That is why model fit is assessed based on the values of multiple indices.

**43. **Assessment of Model Fit Examine the parameter estimates
Examine the standard errors and significance of the parameter estimates.
Examine the squared multiple correlation coefficients for the equations
Examine the fit statistics
Examine the standardized residuals
Examine the modification indices

**45. **Model Assessment: Parameter Estimates Provide all parameter estimates (e.g., factor loadings, error variances, factor variances)
Include the standard errors of the parameter estimates and their statistical significance
Need to consider the clinical as well as the statistical significance of the parameter estimates

**46. **Assessing Parameter Estimates Once the model has attained an acceptable fit, individual estimates of free parameters are assessed.
Free parameters are compared to a null value, using a z-distributed statistic.
The z statistic is obtained by dividing the parameter estimate by the standard error of that estimate.
This ratio must exceed ±1.96 for the relationship to be significant.
After the individual relationships within the model are assessed, parameter estimates are standardized for final model presentation.
When parameter estimates are standardized, they can be interpreted with reference to other parameters in the model and relative strength of pathways within the model can be compared.
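The critical-ratio test described above is easy to reproduce: divide each estimate by its standard error and compare against ±1.96 (a standard-library sketch; the loading and standard error below are made-up numbers):

```python
import math

def critical_ratio(estimate: float, std_error: float):
    """z = estimate / SE; significant at the .05 level if |z| > 1.96."""
    z = estimate / std_error
    # two-tailed p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value, abs(z) > 1.96

# Hypothetical factor loading of 0.62 with standard error 0.11
z, p, significant = critical_ratio(0.62, 0.11)
print(round(z, 2), significant)  # → 5.64 True
```

AMOS reports this same quantity as the "C.R." (critical ratio) column of the estimates table.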

**47. **Assessing Fit of the Model A fitting function value close to 0 is desired for good model fit. In general, if the ratio between χ² (chi-square) and degrees of freedom is < 5, the model is a good fit (Ullman 1996).
To have confidence in the goodness-of-fit test, a sample size of 100 to 200 is recommended (Hoyle 1995). In general a model should contain 10 to 20 times as many observations as variables (Mitchell 1993).
Ullman (1996) discusses a variety of non-χ²-distributed fitting functions, called “comparative fit indices.” Hoyle (1995) refers to these as “adjunct fit indices.” Basically, these approaches compare the fit of an independence model (a model which asserts no relationships between variables) to the fit of the estimated (your) model. The result of this comparison is usually a number between 0 and 1, with 0.90 or greater accepted as indicating good fit.
Both Hoyle and Ullman suggest use of multiple indices when determining model fitness.

**48. **Measures of Fit Measures of fit are provided for three models:
Default Model – this is the model that you specified
Saturated Model – This is the most general model possible. No constraints are placed on the population moments. It is guaranteed to fit any set of data perfectly.
Independence Model – The observed variables are assumed to be uncorrelated with one another.

**49. **Summary of Models

**50. **Overall Measures of Fit NPAR is the number of parameters being estimated (q)
CMIN is the minimum value of the discrepancy function between the sample covariance matrix and the estimated covariance matrix.
DF is the number of degrees of freedom and equals p − q, where
p = the number of sample moments
q = the number of parameters estimated

**51. **Overall measures of Fit CMIN is distributed as chi-square with df = p − q
P is the probability of obtaining a discrepancy as large as CMIN with the present sample if the model is correct
CMIN/DF is the ratio of the minimum discrepancy to degrees of freedom. Values should be < 5 for correct models.

**52. **Cont….Summary of Models

**55. **Evaluating Goodness of Fit: Definitions of Some Key Statistics The χ² test assesses the overall model in terms of the discrepancy between the sample and estimated covariance matrices. If the theory is supported, the χ² value computed from the sample should be statistically non-significant (p > .05).
The Goodness-of-Fit index (GFI) is interpreted in a similar manner to R² in multiple regression. It is the amount of the variances and covariances in the sample covariance matrix that is predicted by the hypothesised model. (GFI > .90)
The Adjusted Goodness-of-Fit index (AGFI) is analogous to the adjusted R² in multiple regression analysis. (AGFI > .80)
The Root Mean Square Residual (RMSR) is the square root of the average squared amount by which the sample variances and covariances differ from the estimates obtained under the hypothesised model. The larger the RMSR, the poorer the fit.

**56. **Performing Model Analysis-I

**57. **Comparisons to a Baseline Model NFI is the Normed Fit Index. It compares the improvement in the minimum discrepancy for the specified (default) model to the discrepancy for the Independence model. A value of the NFI below 0.90 indicates that the model can be improved.

**58. **Bentler-Bonett Index or Normed Fit Index (NFI) Define a null model in which all correlations are zero:
NFI = [χ²(Null Model) − χ²(Proposed Model)] / χ²(Null Model)
A value between .90 and .95 is acceptable; above .95 is good.
A disadvantage of this index is that the more parameters, the larger the index.

**59. **Comparisons to a Baseline Model: Summary RFI is the Relative Fit Index. This index takes the degrees of freedom of the two models into account.
IFI is the Incremental Fit Index. Values close to 1.0 indicate a good fit.
TLI is the Tucker-Lewis Index, also known as the Bentler-Bonett Non-Normed Fit Index (NNFI). Values close to 1.0 indicate a good fit.
CFI is the Comparative Fit Index, also known as the Relative Noncentrality Index (RNI). Values close to 1.0 indicate a good fit.

**60. **Tucker Lewis Index or Non-normed Fit Index (NNFI) Value: [χ²/df(Null Model) − χ²/df(Proposed Model)] / [χ²/df(Null Model)]
If the index is greater than one, it is set to 1.
Values close to .90 reflect a good model fit.
For a given model, a lower chi-square to df ratio (as long as it is not less than one) implies a better fit.

**61. **Comparative Fit Index (CFI) If D = χ² − df, then:
CFI = [D(Null Model) − D(Proposed Model)] / D(Null Model)
If the index is > 1, it is set to 1; if it is < 0, it is set to 0.
A lower value for D implies a better fit
If the CFI < 1, then it is always greater than the TLI
The CFI pays a penalty of one for every parameter estimated
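The baseline-comparison indices of the last few slides can be computed directly from the chi-square values of the null and proposed models (a sketch; the χ² and df values are invented, and the TLI is implemented exactly as defined above, although a common variant subtracts 1 from the denominator):

```python
def baseline_fit_indices(chi2_null, df_null, chi2_prop, df_prop):
    """NFI, TLI and CFI as defined on the slides."""
    nfi = (chi2_null - chi2_prop) / chi2_null
    # TLI/NNFI as written above (a common variant uses
    # chi2_null/df_null - 1 in the denominator)
    tli = (chi2_null / df_null - chi2_prop / df_prop) / (chi2_null / df_null)
    d_null = chi2_null - df_null
    d_prop = chi2_prop - df_prop
    cfi = (d_null - d_prop) / d_null
    # indices outside [0, 1] are truncated to that range
    clip = lambda v: min(max(v, 0.0), 1.0)
    return clip(nfi), clip(tli), clip(cfi)

# Hypothetical chi-squares: null model 1000 (df 20), proposed 100 (df 18)
nfi, tli, cfi = baseline_fit_indices(1000, 20, 100, 18)
print(round(nfi, 3), round(tli, 3), round(cfi, 3))  # → 0.9 0.889 0.916
```

Note that all three indices improve as the proposed model's χ² shrinks relative to the null model's, which is exactly the "improvement over independence" idea described above.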

**62. **Root Mean Square Error of Approximation (RMSEA) Value: √(F̂₀/df)
F̂₀ is an estimate of the minimum value of the population discrepancy function, F̂₀ = max[(χ² − df)/(N − 1), 0].
If χ² < df for the model, RMSEA is set to 0
Good models have values of < .05; values of > .10 indicate a poor fit.
It is a parsimony-adjusted measure.
Amos provides upper and lower limits of a 90% confidence interval for the RMSEA
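Under this definition, the point estimate of RMSEA can be computed from χ², df and N (a sketch, shown here with the χ² = 167.153, df = 98, N = 265 values from the Application 1 summary earlier):

```python
import math

def rmsea(chi2: float, df: int, n: int) -> float:
    """RMSEA = sqrt(F0_hat / df), where
    F0_hat = max((chi2 - df) / (n - 1), 0).
    Returns 0 when chi2 < df, as noted above."""
    f0_hat = max((chi2 - df) / (n - 1), 0.0)
    return math.sqrt(f0_hat / df)

# Values from the Application 1 output: chi2 = 167.153, df = 98, N = 265
print(round(rmsea(167.153, 98, 265), 3))  # → 0.052

# chi2 < df → RMSEA is set to 0
print(rmsea(50.0, 98, 265))  # → 0.0
```

(The 90% confidence interval AMOS reports requires the noncentral χ² distribution and is not reproduced in this sketch.)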

**66. **A model is accepted if the following FIVE criteria are met:

**69. **Model Modification If the covariance/variance matrix estimated by the model does not adequately reproduce the sample covariance/variance matrix, hypotheses can be adjusted and the model retested. To adjust a model, new pathways are added or original ones are removed. In other words, parameters are changed from fixed to free or from free to fixed. It is important to remember, as in other statistical procedures, that adjusting a model after initial testing increases the chance of making a Type I error.

**70. **Model Modification …. Cont. The common procedures used for model modification are the Lagrange Multiplier Index (LM) and the Wald test. Both of these tests report the change in χ² value when pathways are adjusted. The LM asks whether the addition of free parameters increases model fit. This test uses the same logic as forward stepwise regression. The Wald test asks whether the deletion of free parameters increases model fit. The Wald test follows the logic of backward stepwise regression.
To adjust for increased Type I error rates, Ullman (1996) recommends using a low probability value (p < .01) when adding or removing parameters. Ullman also recommends cross-validation with other samples. Because the order in which parameters are freed can affect the choice of remaining parameters, LM should be applied before the Wald test (i.e., add all parameters before beginning to delete them) (MacCallum 1986, cited in Ullman 1996). Refer to Ullman (1996) and Hoyle (1995) for further description of these and other model modification techniques.

**71. ** SEM and Its Applications Example 2:
Data with three continuous predictor variables: education level, a socioeconomic indicator, and feelings of powerlessness measured in 1967.
There is one continuous dependent variable, feelings of powerlessness measured in 1971.
These data are simulated based on the results reported in a larger study by Wheaton, Muthén, Alwin, and Summers (1977).

**72. **If you run a multiple regression analysis in SPSS for Windows using these variables, you will obtain the following results:

**74. **Now consider the equivalent model fit in AMOS:

**75. **Evaluating Global Model Fit Using AMOS

**77. **Model Re-specification

**78. **Example 3 Felson & Bohrnstedt’s Model (1979)
Perceived ‘academic’ performance is modeled as a function of GPA and perceived attractiveness
Perceived attractiveness, in turn, is modeled as a function of perceived academic performance, height, weight, and the rating of attractiveness by children from another city.

**79. **Model A

**80. **Model B: Respecification

**86. **What to Report in Documenting Results The covariance matrix between all variables (constructs) used in the research model
Variable (construct) means and their standard deviations
Parameter Estimates, that is, the structural path coefficients and the construct loadings
Overall fit indices. One has to be very selective in the choice – absolute, parsimonious, comparative fit measures
R-square

**88. **Links to other sites on SEM and Path Analysis:
http://www2.chass.ncsu.edu/garson/pa765/structur.htm
This is an excellent web page. It covers SEM in depth, mostly focusing on goodness of fit tests and assumptions of SEM. It is very thorough and therefore lengthy. Discussions are related to AMOS, LISREL and EQS computer applications, especially in terms of capabilities in goodness of fit tests. The discussions of identification, handling of missing data, and methods for estimating path coefficients are all very helpful. The end of the page gives information in a “Frequently Asked Questions” format and also provides links to a number of SEM packages (under the question heading: Does it matter which statistical package you use for structural equation modeling?)
http://www2.chass.ncsu.edu/garson/pa765/path.htm
This web page from the above author focuses on path analysis. It is also a very useful site, with discussion of key concepts and terms, assumptions, and a “Frequently Asked Questions” section.
http://www.statsoft.com/textbook/stathome.html
This site provides a short explanation of SEM.
http://www.maths.ex.ac.uk/~jph/psy6003/pathanal.html
This site from the University of Exeter provides a brief introduction and explanation of path analysis. It is part of a set of class notes.
http://www.uic.edu/classes/idsc/ids570/ntspaths.htm
This site from the University of Illinois at Chicago also provides class notes on path analysis and SEM. It answers a general list of questions about SEM: why to use it, when to use it, how, etc.
http://www.geocities.com/CollegePark/Campus/caveat.htm
This short paragraph discusses the uncertainty of causal direction in SEM pathways and the problems caused by post hoc adjustment of models. It provides references to more in depth discussion of these topics.