Presentation Transcript


  1. Orthogonal Factor Analysis Subject to Direct Sparseness Constraint on Loadings Kohei Adachi, Osaka University, Japan; Nickolay T. Trendafilov, The Open University, UK

  2. 1. Introduction Starting with the FA model, we introduce Sparse Orthogonal FA as a procedure for overcoming the problem of confirmatory FA, with five slides. 1.1. FA Model 1.2. Problem of CFA 1.3. Automatic CFA by SOFA 1.4. Differences from Sparse PCA 1.5. Remaining Parts

  3. 1.1. FA (Factor Analysis) Model The FA model with m factors can be written as X ≈ FΛ' + UΨ for a standardized n-observations × p-variables data matrix X (n × p), where F (n × m) contains the common factors, U (n × p) the unique factors, Λ (p × m) the loadings, and Ψ (p × p, diagonal) the unique variances. The aim of FA is to estimate Λ, Ψ, and Φ (the factor correlations). FA is classified into EFA (exploratory FA), without any constraint on Λ, and CFA (confirmatory FA), in which some loadings in Λ are constrained to be zero.
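
The dimensions in the model above can be checked with a short numerical sketch; the sizes n, p, m and the random matrices below are purely illustrative, not values from the presentation.

```python
# Minimal sketch of the FA model X ~ F*Lambda' + U*Psi (illustrative sizes only).
import numpy as np

rng = np.random.default_rng(0)
n, p, m = 100, 6, 2                       # observations, variables, factors
F = rng.standard_normal((n, m))           # common factor scores, n x m
U = rng.standard_normal((n, p))           # unique factor scores, n x p
Lam = rng.standard_normal((p, m))         # loading matrix Lambda, p x m
Psi = np.diag(rng.uniform(0.3, 0.8, p))   # diagonal Psi of unique variances, p x p

X_model = F @ Lam.T + U @ Psi             # model part approximating the n x p data X
print(X_model.shape)                      # -> (100, 6)
```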

  4. 1.2. Problem of CFA A CFA model is illustrated by a path diagram (variables Var.1–Var.5, factors Fac.1–Fac.2) corresponding to Λ = [λ11, 0; 0, λ22; λ31, 0; λ41, λ42; 0, λ52] (rows = variables, columns = factors), where each variable–factor pair with a nonzero loading is linked. A problem of CFA is that its users must specify the constraints on Λ a priori, i.e., how variables are linked to factors. To deal with this problem, we propose a procedure for computationally identifying the optimal CFA model among all possible models with Φ = I (identity).

  5. 1.3. Automatic CFA by SOFA We call our proposed procedure SOFA, abbreviating Sparse Orthogonal FA, as it seeks a sparse Λ including zero loadings and Φ = I is assumed. Let us write SP(Λ) for the sparseness of Λ (i.e., the number of zero loadings). Then, SOFA is formulated as: [A] minimize f(Λ, Ψ) over Λ, Ψ s.t. SP(Λ) = q for a given integer q; [B] perform [A] over q = qmin, ..., qmax to select the best q. SOFA allows us to find the optimal orthogonal CFA model among all possible ones.

  6. 1.4. Differences from Sparse PCA First, SOFA is based on the FA model X ≈ FΛ' + UΨ, not on the PCA model X ≈ FΛ', which lacks the unique part UΨ. Second, in SOFA the sparseness is constrained directly, as min over Λ, Ψ of f(Λ, Ψ) s.t. SP(Λ) = q for an integer q, without using a penalty, in contrast to existing sparse PCA procedures formulated as min over Λ of fPCA(Λ) + Penalty(Λ).

  7. 1.5. Organization of the Remaining Parts SOFA: [A] minimize f(Λ, Ψ) over Λ, Ψ s.t. SP(Λ) = q for a given integer q; [B] perform [A] over q = qmin, ..., qmax to select the best q. Section 2 (Loss Function) introduces f(Λ, Ψ); Section 3 (Algorithm) describes [A]; Section 4 (Sparseness Selection) describes [B]; Sections 5–7 present a simulation study, examples, and discussion.

  8. 2. Loss Function We present the loss function to be minimized and formulate SOFA. 2.1. What Function is Selected? 2.2. Selected Function 2.3. Formulation of SOFA

  9. 2.1. What Function is Selected? FA can be formulated with several types of loss functions. Among them, we select a function that can be rewritten as f(Λ, Ψ) = h(Ψ) + c||Λ − A||², where h(Ψ) is irrelevant to Λ, c is a positive constant, and A = (a_ij) is a given matrix. The minimization of this function over Λ = (λ_ij) s.t. SP(Λ) = q is easily attained by setting λ_ij = a_ij for the pm − q entries with the largest |a_ij| and λ_ij = 0 for the remaining q entries.
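
A hedged numpy sketch of this constrained minimizer follows; the function name sparsify and its interface are illustrative, not from the presentation.

```python
# Sketch: the Lambda minimizing ||Lambda - A||^2 subject to SP(Lambda) = q
# keeps the p*m - q entries of A with largest |a_ij| and zeroes the rest.
import numpy as np

def sparsify(A, q):
    Lam = A.copy()
    order = np.argsort(np.abs(A).ravel())  # entry indices sorted by |a_ij|, ascending
    Lam.ravel()[order[:q]] = 0.0           # zero out the q smallest entries
    return Lam

A = np.array([[0.8, 0.05], [0.02, 0.7], [0.6, -0.1]])
print(sparsify(A, q=3))   # the three smallest-|a_ij| entries become exact zeros
```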

  10. 2.2. Function Selected As such a function, we select f(F, U, Λ, Ψ) = ||X − (FΛ' + UΨ)||² (1) (de Leeuw, 2004; Unkel & Trendafilov, 2011), which can be written in the form f(F, U, Λ, Ψ) = h(Ψ) + n||Λ − A||², with h(Ψ) = ||X − (FA' + UΨ)||² and A = n⁻¹X'F. Though (1) is a function of F, U, Λ, Ψ, we show later that (1) can be minimized with updates of Λ and Ψ only.

  11. 2.3. Formulation of SOFA So, our proposed SOFA is formulated as: minimize f(F, U, Λ, Ψ) = ||X − (FΛ' + UΨ)||² subject to SP(Λ) = q (sparseness constraint), F'F = nI_m (orthogonal common factors), U'U = nI_p (orthogonal unique factors), and F'U = O_{m×p} (common factors orthogonal to unique factors).

  12. 3. Algorithm We detail the algorithm for SOFA. 3.1. Overview 3.2. Update of Λ and Ψ 3.3. Update of n⁻¹X'Z (1) 3.4. Update of n⁻¹X'Z (2) 3.5. Whole Algorithm 3.6. Multiple Starts

  13. 3.1. Overview To minimize f(F, U, Λ, Ψ) = ||X − (FΛ' + UΨ)||², we consider an ALS algorithm in which Λ, Ψ, and Z = [F, U] are alternately updated, with the common and unique factors combined in the n × (m+p) matrix Z = [F, U]. However, in Section 3.3 we show that there is no need to update Z itself, and further no need for the data matrix X, if the covariance matrix S = n⁻¹X'X is available.

  14. 3.2. Update of Λ and Ψ Minimizing ||X − (FΛ' + UΨ)||² with F, Λ, U fixed is attained by Ψ = diag(n⁻¹X'U) (1). For minimizing ||X − (FΛ' + UΨ)||² s.t. SP(Λ) = q with F, Ψ, U fixed, remember that the loss can be rewritten as h(Ψ) + n||Λ − A||², so Λ is obtained from A = n⁻¹X'F as in Section 2.1 (2). Note that (1) and (2) show that Ψ and A follow from n⁻¹X'[F, U] = n⁻¹X'Z.
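
As a hedged sketch, the two updates on this slide can be written as follows, assuming the cross-products n⁻¹X'F and n⁻¹X'U are already available and reusing the sparsify helper sketched earlier; all names are illustrative.

```python
# Sketch of the updates Psi = diag(n^{-1} X'U) and Lambda = sparsify(A, q)
# with A = n^{-1} X'F (the two blocks of n^{-1} X'Z = n^{-1} X'[F, U]).
import numpy as np

def update_lambda_psi(XtF_over_n, XtU_over_n, q):
    Psi = np.diag(np.diag(XtU_over_n))   # keep only the diagonal of n^{-1} X'U
    A = XtF_over_n                       # A = n^{-1} X'F
    Lam = sparsify(A, q)                 # sparse Lambda closest to A (earlier sketch)
    return Lam, Psi
```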

  15. 3.3. Update of n⁻¹X'Z (1) We use two slides to show how n⁻¹X'Z is updated. With Z = [F, U] and B = [Λ, Ψ], our task is min over Z of ||X − (FΛ' + UΨ)||² = ||X − ZB'||² s.t. n⁻¹Z'Z = I_{m+p}, which summarizes F'F = nI_m, U'U = nI_p, and F'U = O. This is attained using the SVD n^{-1/2}XB = P1D1Q1', with Z = n^{1/2}(P1Q1' + P2Q2'); since P2 and Q2 are arbitrary, Z is not unique, but n⁻¹X'Z is unique, as shown next.

  16. 3.4. Update of n⁻¹X'Z (2) The two equations n^{-1/2}X = n^{-1/2}XBB⁺ = P1D1Q1'B⁺ and Z = n^{1/2}(P1Q1' + P2Q2') imply that the matrix giving Ψ and Λ can be rewritten as n⁻¹X'Z = (P1D1Q1'B⁺)'(P1Q1' + P2Q2') = B⁺'Q1D1Q1', which can be obtained from the EVD B'SB = Q1D1²Q1' derived from the SVD, with S = n⁻¹X'X the sample covariance matrix.
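
A hedged numpy sketch of this computation follows, assuming B = [Λ, Ψ] has full row rank p so that the p leading eigenpairs of B'SB carry all the information; the function name is illustrative.

```python
# Sketch: n^{-1} X'Z = B^{+'} Q1 D1 Q1', obtained from the EVD B'SB = Q1 D1^2 Q1'
# using only the sample covariance matrix S = n^{-1} X'X.
import numpy as np

def cross_product_xz(S, Lam, Psi):
    p = S.shape[0]
    B = np.hstack([Lam, Psi])                       # p x (m + p)
    evals, Q = np.linalg.eigh(B.T @ S @ B)          # EVD of B'SB, ascending eigenvalues
    top = np.argsort(evals)[::-1][:p]               # p leading eigenpairs (rank of B'SB <= p)
    Q1 = Q[:, top]
    D1 = np.diag(np.sqrt(np.clip(evals[top], 0.0, None)))
    return np.linalg.pinv(B).T @ Q1 @ D1 @ Q1.T     # p x (m + p), columns for [F, U]
```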

  17. 3.5. Whole Algorithm The loss ||X − (FΛ' + UΨ)||² = ||X − (FA' + UΨ)||² + n||Λ − A||² monotonically decreases with the following algorithm: (1) initialize B = [Λ, Ψ] randomly; (2) perform the EVD B'SB = Q1D1²Q1'; (3) obtain n⁻¹X'Z = B⁺'Q1D1Q1'; (4) update Ψ; (5) obtain A to update Λ; (6) finish, or go back to (2) with B = [Λ, Ψ]. Here, we find that SOFA only needs S = n⁻¹X'X.
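
Stitching the earlier sketches together, a hedged sketch of this whole loop might look as follows; the convergence tolerance, iteration cap, and initialization are illustrative choices, and the loss is evaluated per observation (divided by n) using only S.

```python
# Sketch of the whole algorithm: alternate the EVD-based update of n^{-1} X'Z
# with the updates of Psi and the sparse Lambda, monitoring the per-n loss
# ||X - (F Lambda' + U Psi)||^2 / n = tr(S) - 2 tr((n^{-1}X'Z) B') + tr(B B').
import numpy as np

def sofa_step_A(S, m, q, n_iter=500, tol=1e-9, seed=0):
    rng = np.random.default_rng(seed)
    p = S.shape[0]
    Lam = rng.standard_normal((p, m))          # 1. random initial Lambda ...
    Psi = np.diag(rng.uniform(0.1, 1.0, p))    #    ... and Psi
    prev_loss = np.inf
    for _ in range(n_iter):
        XtZ = cross_product_xz(S, Lam, Psi)    # 2-3. EVD-based n^{-1} X'Z (earlier sketch)
        Lam, Psi = update_lambda_psi(XtZ[:, :m], XtZ[:, m:], q)   # 4-5. update Psi, Lambda
        B = np.hstack([Lam, Psi])
        loss = np.trace(S) - 2 * np.trace(XtZ @ B.T) + np.trace(B @ B.T)
        if prev_loss - loss < tol:             # 6. stop when the decrease is negligible
            break
        prev_loss = loss
    return Lam, Psi, loss
```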

  18. 3.6. Multiple Runs SOFA is sensitive to local minima. So, we take the following multiple-run procedure: (1) run the algorithm 50 times from different starts and check whether the two solutions with the lowest loss function values are equivalent; (2) if such solutions are found, finish and select them as the optimal ones; otherwise, go to (3); (3) keep running the algorithm from further different starts until the two solutions with the lowest loss function values are equivalent.
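
A hedged sketch of this multiple-run procedure, built on the sofa_step_A sketch above; the equivalence tolerance, the extra-batch size, and the cap on total runs are illustrative assumptions, and equivalence is judged here by loss values only (a simplification).

```python
# Sketch: run from many random starts until the two lowest-loss solutions are
# numerically equivalent, then return the best one.
import numpy as np

def sofa_multistart(S, m, q, first_batch=50, extra_batch=10, max_runs=500, tol=1e-6):
    results, run, batch = [], 0, first_batch
    while run < max_runs:
        for _ in range(batch):
            results.append(sofa_step_A(S, m, q, seed=run))   # (Lam, Psi, loss)
            run += 1
        results.sort(key=lambda r: r[2])                     # sort by loss value
        if abs(results[0][2] - results[1][2]) < tol:         # two equivalent lowest losses
            return results[0]
        batch = extra_batch                                  # keep adding further starts
    return results[0]                                        # fall back to the best found
```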

  19. 4. Sparseness Selection We present our sparseness selection procedure with just one slide. 4.1. Selection using BIC

  20. 4.1. Selection using BIC SOFA: [A] minimize f(Λ, Ψ) over Λ, Ψ s.t. SP(Λ) = q for a given integer q; [B] perform [A] over q = qmin, ..., qmax to select the best q. In the last section, we described [A]. For [B], we use the BIC, expressed as BIC(q) = −2 × log-likelihood − q × log n (dropping the terms constant over q, since q zero loadings reduce the number of free parameters by q). That is, [B] is formulated as: best q = argmin BIC(q) over q = qmin, ..., qmax. We empirically found that SOFA solutions were almost equivalent to ML ones, which validates using the ML-based BIC for LS-based SOFA solutions.
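
A hedged sketch of step [B]: fit SOFA for each candidate q and keep the q with the smallest BIC. The loglik argument is a placeholder for an ML-based FA log-likelihood (an assumption, since the slide does not spell it out), and the BIC expression follows the reduced form above with the q-independent terms dropped.

```python
# Sketch: best q = argmin over q of BIC(q) = -2 * log-likelihood - q * log(n),
# using the multi-start fit sketched earlier for step [A].
import numpy as np

def select_sparseness(S, m, n, q_min, q_max, loglik):
    best = None
    for q in range(q_min, q_max + 1):
        Lam, Psi, _ = sofa_multistart(S, m, q)
        bic = -2.0 * loglik(S, Lam, Psi, n) - q * np.log(n)   # terms constant in q dropped
        if best is None or bic < best[0]:
            best = (bic, q, Lam, Psi)
    return best[1], best[2], best[3]                          # selected q and its solution
```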

  21. 5. Simulation Studies We briefly report a simulation study whose purpose is to assess how well the true sparseness and parameters are recovered by SOFA. 5.1. True Parameters 5.2. Results

  22. 5.1. True Parameters We synthesized 40 true loading matrices Λ, each having one of five structures (including a simple structure and a bi-factor structure); a "?" cell in these structures had a zero or a non-zero value chosen randomly. The resulting Λ and Ψ gave 200 (= 40 × 5) correlation matrices to be analyzed by SOFA.

  23. 5.2. Recovery [Table: medians and worst 5th percentiles of the recovery indices over the 200 solutions, including the rates of correctly identified zero and non-zero loadings.] The resulting medians and worst 5th percentiles of the index values among the 200 solutions show that: (1) the true sparseness was selected well by BIC; (2, 3) the true structures were recovered well, as measured by the rates of correctly identified zeros and non-zeros; (4, 5) the true parameter values were recovered well.

  24. 6. Examples We illustrate SOFA with two famous data sets that have often been used for testing FA procedures. 6.1. Box Problem Data 6.2. Twenty-four Psy Test Data

  25. 6.1. Box Problem Data The first example is the three-factor solution for the 400 × 20 box data matrix generated following Thurstone (1940). BIC was lowest for q = 27, and the corresponding solution (shown on the slide) exhibits an exact simple structure.

  26. 6.2. Twenty-four Psy Test Data The second example is the four-factor solution for the 24 psychological test data. BIC was lowest for q = 35, and the corresponding solution (shown on the slide) has loadings with a bi-factor structure matching those found in previous studies using EFA and CFA.

  27. 7. Discussions After summarizing SOFA, we discuss its advantages over the existing CFA and EFA. 7.1. Summary 7.2. SOFA vs CFA 7.3. SOFA vs EFA (Rotation)

  28. 7.1. Summary We propose SOFA, formulated as: [A] minimize f(Λ, Ψ) over Λ, Ψ s.t. SP(Λ) = q for a given integer q; [B] perform [A] over q = qmin, ..., qmax to select the best q. For [A], we developed an ALS algorithm for minimizing ||X − (FΛ' + UΨ)||² s.t. SP(Λ) = q and n⁻¹[F, U]'[F, U] = I, which can be carried out with only the sample covariances available. For [B], we propose selecting the sparseness q using BIC. Numerical studies demonstrated that SOFA successfully selects q, obtains the sparse structure in Λ, and estimates Λ and Ψ.

  29. 7.2. SOFA vs CFA SOFA overcomes the problem of CFA that the locations of zero loadings must be specified by users: SOFA computationally finds the optimal CFA model. However, SOFA solutions are restricted to orthogonal ones, so an oblique version of SOFA remains to be considered in future studies.

  30. 7.3. SOFA vs EFA (Rotation) As compared with SOFA, two drawbacks are found in EFA, in which a loading matrix Λ0 is rotated so that the resulting Λ0T has a quasi-sparse structure; this term implies that Λ0T cannot include exact zero loadings. [1] Users must resort to viewing some loadings as approximately zero, which is subjective. [2] Rotation does not involve the original data, i.e., a function of Λ0T only is optimized, in contrast to SOFA, in which the FA model with the sparseness constraint is optimally fitted to the data to find the sparse structure underlying them.
