Factor analysis
Factor analysis. Caroline van Baal, March 3rd 2004, Boulder. Phenotypic Factor Analysis: (approximate) description of the relations between different variables. Compare to Cholesky decomposition.

Factor analysis

Caroline van Baal

March 3rd 2004, Boulder


Phenotypic Factor Analysis

  • (Approximate) description of the relations between different variables

    • Compare to Cholesky decomposition

  • Testing of hypotheses on relations between different variables by comparing different (nested) models

    • How many underlying factors?


Factor analysis and related methods

  • Data reduction

    • Consider 6 variables:

    • Height, weight, arm length, leg length, verbal IQ, performance IQ

    • You expect the first 4 to be correlated, and the last 2 to be correlated, but do you expect high correlations between the first 4 and the last 2?


Data analysis in non-experimental designs using latent constructs

  • Principal Components Analysis

  • Triangular Decomposition (Cholesky)

  • Exploratory Factor Analysis

  • Confirmatory Factor Analysis

  • Structural Equation Models


Exploratory Factor Analysis

  • Account for covariances among observed variables in terms of a smaller number of latent, common factors

  • Includes error components for each variable

  • x = P * f + u

  • x = observed variables

  • f = latent factors

  • u = unique factors

  • P = matrix of factor loadings


[Path diagram: one latent factor, Factor 1 (IQ, “g”), marked 1, with paths to the subtests INF, SIM, VOC, COM, ARI, DIG, COD, BLC, MAZ, PIC, PIA, OBA]


[Path diagram: two latent factors, Factor 1 (verbal) and Factor 2 (performance), each marked 1, with paths to the subtests INF, SIM, VOC, COM, ARI, DIG, COD, BLC, MAZ, PIC, PIA, OBA]


EFA equations

  • C = P * D * P’ + U * U’

  • C = observed covariance matrix

    • Nvar by nvar, symmetric

  • P = factor loadings

    • Nvar by nfac, full

  • D = correlations between factors

    • Nfac by nfac, standardized

  • U = specific influences, errors

    • Nvar by nvar, diagonal
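
The covariance equation above can be evaluated numerically. The following is a minimal sketch, assuming hypothetical loadings, factor correlation and unique terms for 4 variables and 2 factors (none of these values come from the slides); the helper names `matmul`, `transpose`, `diag` and `implied_cov` are illustrative only.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def diag(values):
    """Build a diagonal matrix from a list of values."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def implied_cov(P, D, U):
    """Model-implied covariance C = P * D * P' + U * U'."""
    common = matmul(matmul(P, D), transpose(P))
    unique = matmul(U, transpose(U))
    return [[c + u for c, u in zip(rc, ru)] for rc, ru in zip(common, unique)]

# Hypothetical example values (assumptions, not estimates from any data set)
P = [[0.8, 0.0],   # nvar by nfac factor loadings
     [0.7, 0.0],
     [0.0, 0.6],
     [0.0, 0.9]]
D = [[1.0, 0.5],   # nfac by nfac standardized factor correlations
     [0.5, 1.0]]
U = diag([0.6, 0.7, 0.8, 0.4])  # nvar by nvar diagonal unique influences

C = implied_cov(P, D, U)
for row in C:
    print([round(v, 3) for v in row])
```

Variables that load on different factors are still correlated in C, but only through the factor correlation in D.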


Exploratory factor analysis

  • No prior assumption on number of factors

  • All variables load on all latent factors

  • Factors are either all correlated or all uncorrelated

  • Unique factors are uncorrelated

  • Underidentification
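
The underidentification of an exploratory model with uncorrelated factors can be illustrated numerically: rotating the loading matrix leaves the common part of the covariance matrix unchanged. The sketch below uses a hypothetical 4 by 2 loading matrix (an assumption, not taken from the slides) and picks the rotation that sets the loading of variable 1 on factor 2 to zero, which is the kind of constraint that removes the rotational freedom.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Hypothetical loadings: every variable loads on both factors
P = [[0.8, 0.1],
     [0.7, 0.2],
     [0.2, 0.6],
     [0.1, 0.9]]

# Rotation angle chosen so that the rotated loading in row 1, column 2 is zero
theta = math.atan2(P[0][1], P[0][0])
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

P_rot = matmul(P, R)

# With uncorrelated factors the common covariance is P * P'
common = matmul(P, transpose(P))
common_rot = matmul(P_rot, transpose(P_rot))

max_diff = max(abs(common[i][j] - common_rot[i][j])
               for i in range(4) for j in range(4))
print("rotated loadings:", [[round(v, 3) for v in row] for row in P_rot])
print("largest difference in P * P':", max_diff)
```

Both loading matrices reproduce the same covariances, so the data cannot distinguish between them unless a constraint is imposed.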


[Path diagram: Factor 1 (verbal) and Factor 2 (performance), each marked 1, with paths to all subtests (INF, SIM, VOC, COM, ARI, DIG, COD, BLC, MAZ, PIC, PIA, OBA); one factor loading is marked “Fix to 0”]


Confirmatory factor analysis

  • An initial model is constructed, because:

    • its elements are described by a theoretical process

    • its elements have been obtained from a previous analysis in another sample

  • The model has a specific number of factors

  • Variables do not have to load on all factors

  • Measurement errors may correlate

  • Some latent factors may be correlated, while others are not


[Two path diagrams: Factor 1 (verbal) and Factor 2 (performance), each marked 1, with paths to the subtests INF, SIM, VOC, COM, ARI, DIG, COD, BLC, MAZ, PIC, PIA, OBA]


[Two path diagrams: three latent factors VC, FD and PO, with paths to the subtests INF, SIM, VOC, COM, ARI, DIG, COD, BLC, MAZ, PIC, PIA, OBA]


CFA equations

  • x = P * f + u

  • x = observed variables, f = latent factors

  • u = unique factors, P = factor loadings

  • C = P * D * P’ + U * U’

  • C = observed covariance matrix

  • P = factor loadings

  • D = correlations between factors

  • U = diagonal matrix of errors


Structural equations models

  • The factor model x = P * f + u is sometimes referred to as the measurement model

  • The relations between latent factors can also be modeled

  • This is done in the covariance structure model, or the structural equations model

  • Higher order factor models


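Second order factor model: C = P * (A * I * A’ + B * B’) * P’ + U * U’

[Path diagram: a 2nd order factor “g” with paths to three first-order factors (F1, F2, F3; VC, FD, PO), which have paths to the subtests INF, SIM, VOC, COM, ARI, DIG, COD, BLC, MAZ, PIC, PIA, OBA]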
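
The second order equation can be evaluated in the same way as the first order one. The sketch below assumes a particular reading of the symbols that the slide does not spell out: A holds the loadings of the first-order factors on “g”, I is the 1 by 1 identity (variance of “g”), and B is a diagonal matrix of residual influences on the first-order factors. All numeric values are hypothetical, and 6 variables are used instead of 12 to keep the example short.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Assumed interpretation: A = second-order loadings (nfac by 1), I = variance of "g"
A = [[0.8], [0.6], [0.7]]
I = [[1.0]]
# Residuals chosen so that each first-order factor has variance 1
B = diag([math.sqrt(1.0 - a[0] ** 2) for a in A])

# Covariance among the first-order factors: A * I * A' + B * B'
factor_cov = add(matmul(matmul(A, I), transpose(A)), matmul(B, transpose(B)))

# Hypothetical first-order loadings: two variables per factor
P = [[0.7, 0.0, 0.0],
     [0.7, 0.0, 0.0],
     [0.0, 0.7, 0.0],
     [0.0, 0.7, 0.0],
     [0.0, 0.0, 0.7],
     [0.0, 0.0, 0.7]]
U = diag([math.sqrt(0.51)] * 6)

C = add(matmul(matmul(P, factor_cov), transpose(P)), matmul(U, transpose(U)))
print([[round(v, 3) for v in row] for row in factor_cov])
print([round(v, 4) for v in C[0]])
```

In this reading, the correlations among the first-order factors are not free parameters: they are products of the second-order loadings.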


Five steps characterize structural equation models

  • Model specification

  • Identification

    • E.g., if a factor loads on 2 variables only, multiple solutions are possible, and the factor loadings have to be equated

  • Estimation of parameters

  • Testing of goodness of fit

  • Respecification

K.A. Bollen & J. Scott Long: Testing Structural Equation Models, 1993, Sage Publications
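
When two nested models are compared in the goodness-of-fit step, a common approach is a likelihood-ratio (chi-square difference) test. The sketch below is an illustration under stated assumptions: the -2 log-likelihood values and parameter counts are hypothetical, not results from the slides, and `chi2_sf` is a small helper written here for integer degrees of freedom using only the standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df.

    Uses Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / gamma(a + 1), starting from
    Q(1, y) = exp(-y) for even df or Q(1/2, y) = erfc(sqrt(y)) for odd df.
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0 - 1e-9:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(m2ll_restricted, n_par_restricted, m2ll_general, n_par_general):
    """Chi-square difference test for a restricted model nested in a general one."""
    chi2 = m2ll_restricted - m2ll_general
    df = n_par_general - n_par_restricted
    return chi2, df, chi2_sf(chi2, df)

# Hypothetical fit results (assumed numbers for illustration only)
chi2, df, p = likelihood_ratio_test(
    m2ll_restricted=1530.4, n_par_restricted=10,   # e.g. a 1 factor model
    m2ll_general=1518.2, n_par_general=14,         # e.g. a 2 factor model
)
print(f"chi2 = {chi2:.1f}, df = {df}, p = {p:.4f}")
```

A small p-value suggests that the restrictions of the simpler model worsen the fit significantly.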


Practice

  • IQ and brain volumes (MRI)

    • 3 brain volumes: total cerebellum, grey matter, white matter

    • 2 IQ subtests: calculation, letters / numbers

  • Brain and IQ factors are correlated

  • Datafile: mri-IQ-all-twinA-5.dat


Script: phenofact.mx

    BEGIN MATRICES ;
    P FULL NVAR NFACT free ; ! factor loadings
    D STAND NFACT NFACT !free ; ! correlations between factors
    U DIAG NVAR NVAR free ; ! subtest specific influences
    M Full 1 NVAR free ; ! means
    END MATRICES ;
    BEGIN ALGEBRA;
    C= P*D*P' +U*U' ; ! variance covariance matrix
    END ALGEBRA;
    Means M /
    Covariances C /


  • In exploratory factor analysis, if nfact = 2, one of the factor loadings has to be fixed to 0 to make it an identified model

    fix P 1 2

  • In confirmatory factor analysis, specify a brain and an IQ factor

    SPECIFY P
    101 0
    102 0
    103 0
    0 204
    0 205
    0 206

  • (If a factor loads on 2 variables only, it is not possible to estimate both factor loadings. Equate them, or fix one of them to 1.)


Phenotypic Correlations: MRI-IQ, Dutch twins (A), n=111/296 pairs

  • What is the fit of a 1 factor model?

    • C = P * P’ + U * U’, P = 5x1 full, U = 5x5 diagonal

  • What is the fit of a 2 factor model?

    • Same, P = 5x2 full with 1 factor loading fixed to 0

    • (Reduction: fix first 3 factor loadings of factor 2 to 0)

  • Data suggest 2 latent factors: a brain (first 3) and an IQ factor (last 2): what is the evidence for this model?

    • Same, P = 5x2 full with 5 factor loadings fixed to 0

  • Can the 2 factor model be improved by allowing a correlation between these 2 factors?

    • C = P * D * P’ + U * U’, P = 5x2 full matrix (5 fixed), D = stand 2x2 matrix, U = 5x5 diagonal matrix
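
For the four models in this exercise, the degrees of freedom of the covariance structure can be counted by hand. The sketch below does that count for 5 variables, assuming that the means are saturated and therefore left out of the count, and that every model has 5 free unique terms in U. The function and model names are illustrative. A non-negative count is necessary for identification but does not guarantee it, for instance when a factor loads on only 2 variables.

```python
def n_statistics(nvar):
    """Number of distinct variances and covariances among nvar variables."""
    return nvar * (nvar + 1) // 2

def degrees_of_freedom(nvar, free_loadings, free_factor_corr=0):
    """Observed covariance statistics minus free parameters (loadings, factor correlations, uniques)."""
    free_params = free_loadings + free_factor_corr + nvar  # nvar diagonal elements of U
    return n_statistics(nvar) - free_params

NVAR = 5  # 3 brain volumes + 2 IQ subtests

models = {
    "1 factor (P 5x1 full)": degrees_of_freedom(NVAR, free_loadings=5),
    "2 factors, exploratory (1 loading fixed)": degrees_of_freedom(NVAR, free_loadings=9),
    "2 factors, brain and IQ (5 loadings fixed)": degrees_of_freedom(NVAR, free_loadings=5),
    "2 factors, brain and IQ, correlated": degrees_of_freedom(NVAR, free_loadings=5, free_factor_corr=1),
}

for name, df in models.items():
    print(f"{name}: df = {df}")
```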


Principal Components Analysis

  • SPSS, SAS, Mx (functions \eval, \evec)

  • Transformation of the data, not a model

  • Is used to reduce a large set of correlated observed variables (xi) to (a smaller number of) uncorrelated (orthogonal) components (ci)

  • xi is a linear function of ci


PCA path diagram

[Path diagram: five uncorrelated components c1 to c5 (variances D) with paths (P) to the observed variables x1 to x5]

  • S = observed covariances = P * D * P’


PCA equations

[Path diagram: components c1 to c5 with paths to the observed variables x1 to x5]

  • Covariance matrix S = P * D * P’, with S, P and D all q by q

  • P = full q by q matrix of eigenvectors

  • D = diagonal matrix of eigenvalues

  • P is orthogonal: P * P’ = I (identity)

  • Criteria for number of factors

    • Kaiser criterion, scree plot, %var

  • Important: models not identified!
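
The PCA equations can be checked on the smallest possible case. The sketch below decomposes a hypothetical 2 by 2 correlation matrix (an assumed example, not data from the slides) with a closed-form eigen decomposition, verifies S = P * D * P’ and P * P’ = I, and applies the Kaiser criterion (retain components with eigenvalue above 1) together with the percentage of variance explained.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenvalues and eigenvectors of the symmetric matrix [[a, b], [b, c]], assuming b != 0."""
    mean = (a + c) / 2.0
    half_gap = math.hypot((a - c) / 2.0, b)
    values = [mean + half_gap, mean - half_gap]   # largest first
    columns = []
    for lam in values:
        vx, vy = b, lam - a                       # solves (a - lam) * vx + b * vy = 0
        norm = math.hypot(vx, vy)
        columns.append((vx / norm, vy / norm))
    # P holds the eigenvectors as columns
    P = [[columns[0][0], columns[1][0]],
         [columns[0][1], columns[1][1]]]
    return values, P

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Hypothetical correlation matrix of two variables
S = [[1.0, 0.6],
     [0.6, 1.0]]

values, P = eig_sym_2x2(S[0][0], S[0][1], S[1][1])
D = [[values[0], 0.0], [0.0, values[1]]]

S_rebuilt = matmul(matmul(P, D), transpose(P))   # should reproduce S
PPt = matmul(P, transpose(P))                    # should be the identity

n_keep = sum(1 for v in values if v > 1.0)       # Kaiser criterion
pct_var = [100.0 * v / sum(values) for v in values]

print("eigenvalues:", [round(v, 3) for v in values])
print("components kept by Kaiser criterion:", n_keep)
print("% variance per component:", [round(p, 1) for p in pct_var])
```

Because all q components together reproduce S exactly, the decomposition is a transformation of the data rather than a testable model.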


Correlations: satisfaction, n=100

[Correlation table for Var 1 to Var 6, labelled “work” and “home”; entries are shown only as “++” or “0”]


PCA: Factor loadings (eigenvalues 2.89 & 1.79)


Triangular decomposition (Cholesky)

[Path diagram: five latent variables y1 to y5, each marked 1, with paths to the observed variables x1 to x5]

  • One operationalization of all PCA outcomes

  • Model is just identified! Model is saturated (df=0)


Triangular decomposition

  • S = Q * Q’ ( = P# * P#’, where P# is P * √D, since S = P * D * P’)

  • Q (5 by 5) =

      f11  0    0    0    0
      f21  f22  0    0    0
      f31  f32  f33  0    0
      f41  f42  f43  f44  0
      f51  f52  f53  f54  f55

  • Q is a lower triangular matrix

  • This is not a model! This is a transformation of the observed matrix S. Fully determinate!
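
The triangular decomposition can be computed directly from S. The sketch below is a plain standard-library implementation of the usual Cholesky algorithm, applied to a hypothetical 3 by 3 covariance matrix (an assumed example, not data from the slides), and it checks that Q * Q’ reproduces S exactly.

```python
import math

def cholesky(S):
    """Lower triangular Q with Q * Q' = S, for a symmetric positive definite S."""
    n = len(S)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(Q[i][k] * Q[j][k] for k in range(j))
            if i == j:
                Q[i][j] = math.sqrt(S[i][i] - partial)
            else:
                Q[i][j] = (S[i][j] - partial) / Q[j][j]
    return Q

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

# Hypothetical covariance matrix
S = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]

Q = cholesky(S)
S_rebuilt = matmul(Q, transpose(Q))

n_free = sum(1 for i in range(3) for j in range(i + 1))  # free elements of Q
n_stats = 3 * (3 + 1) // 2                               # distinct elements of S

for row in Q:
    print(row)
print("free elements in Q:", n_free, "distinct elements in S:", n_stats)
```

Q has as many free elements as S has distinct elements, which is why the decomposition is saturated (df = 0).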


Saturated model, # latent factors (script: phenochol.mx)

    BEGIN MATRICES ;
    Q LOWER NVAR NVAR free ; ! factor loadings
    M FULL 1 NVAR free ; ! means
    END MATRICES ;
    BEGIN ALGEBRA;
    C= Q*Q' ; ! variance covariance matrix
    K=\stnd(C) ; ! correlation matrix
    X=\eval(K) ; ! eigenvalues (i.e., variance of latent factors)
    Y=\evec(K) ; ! eigenvectors (i.e., regression coefficients)
    END ALGEBRA;
    Means M /
    Covariances C /
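
The script's comment describes K as the correlation matrix obtained from C. As a rough illustration of that standardization step (a Python sketch of the arithmetic, not Mx code), each covariance is divided by the product of the two standard deviations. The covariance matrix is a hypothetical example and `standardize` is an illustrative name.

```python
import math

def standardize(C):
    """Convert a covariance matrix to a correlation matrix."""
    n = len(C)
    sd = [math.sqrt(C[i][i]) for i in range(n)]
    return [[C[i][j] / (sd[i] * sd[j]) for j in range(n)] for i in range(n)]

# Hypothetical covariance matrix
C = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]

K = standardize(C)
for row in K:
    print([round(v, 3) for v in row])
```

The eigenvalues of K, rather than of C, are the ones the Kaiser criterion is usually applied to.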

