Topic Outline. Motivation Representing/Modeling Causal Systems Estimation and Updating Model Search Linear Latent Variable Models Case Study: fMRI. Discovering Pure Measurement Models. Richard Scheines Carnegie Mellon University. Ricardo Silva* University College London.

Topic Outline

Motivation

Representing/Modeling Causal Systems

Estimation and Updating

Model Search

Linear Latent Variable Models

Case Study: fMRI

Discovering Pure Measurement Models

Richard Scheines, Carnegie Mellon University

Ricardo Silva*, University College London

Clark Glymour and Peter Spirtes, Carnegie Mellon University

Outline
  • Measurement Models & Causal Inference
  • Strategies for Finding a Pure Measurement Model
  • Purify
  • MIMbuild
  • Build Pure Clusters
  • Examples
    • Religious Coping
    • Test Anxiety
Goals:
  • What Latents are out there?
  • Causal Relationships Among Latent Constructs

[Figure: two candidate causal diagrams over Relationship Satisfaction and Depression, separated by "or" and followed by "or ?"]

Needed:

Ability to detect conditional independence among latent variables

Lead and IQ

[Figure: Parental Resources (PR) causes both Lead Exposure and IQ; e2 and e3 are the error terms of Lead Exposure and IQ]

PR ~ N(m=10, s = 3)

Lead = 15 -.5*PR + e2, with e2 ~ N(m=0, s = 1.635)

IQ = 90 + 1*PR + e3, with e3 ~ N(m=0, s = 15)

Lead _||_ IQ | PR

Pseudorandom sample: N = 2,000

[Figure: Parental Resources causes Lead Exposure and IQ]

Regression of IQ on Lead, PR
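
The regression on this slide can be reproduced roughly as follows. This is only a sketch: it assumes that "s" in the Lead and IQ equations is a standard deviation, the seed is arbitrary, and the helper `ols` is a name introduced here for illustration. With PR in the regression, the Lead coefficient should be statistically indistinguishable from zero; without PR, Lead picks up a spurious negative association with IQ.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
n = 2000

# Structural equations from the Lead and IQ slide.
# ASSUMPTION: "s" is read as a standard deviation.
pr = rng.normal(10, 3, n)
lead = 15 - 0.5 * pr + rng.normal(0, 1.635, n)
iq = 90 + 1.0 * pr + rng.normal(0, 15, n)


def ols(y, *regressors):
    """Return OLS coefficients (intercept first) and their standard errors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se


# IQ on Lead and PR: Lead is screened off by PR.
beta, se = ols(iq, lead, pr)
t_lead = beta[1] / se[1]
print(f"IQ ~ Lead + PR: Lead coefficient = {beta[1]:.3f}, t = {t_lead:.2f}")

# IQ on Lead alone: the confounder is omitted.
beta_marg, se_marg = ols(iq, lead)
print(f"IQ ~ Lead:      Lead coefficient = {beta_marg[1]:.3f}")
```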

Measuring the Confounder

[Figure: X1, X2, X3, with errors e1, e2, e3, are measured indicators of Parental Resources, which causes Lead Exposure and IQ]

X1 = g1* Parental Resources + e1

X2 = g2* Parental Resources + e2

X3 = g3* Parental Resources + e3

PR_Scale = (X1 + X2 + X3) / 3

Scales don't preserve conditional independence

[Figure: X1, X2, X3 measure Parental Resources, which causes Lead Exposure and IQ]

PR_Scale = (X1 + X2 + X3) / 3

Indicators Don’t Preserve Conditional Independence

[Figure: X1, X2, X3 measure Parental Resources, which causes Lead Exposure and IQ]

Regress IQ on: Lead, X1, X2, X3
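
Why neither the scale nor the raw indicators screen off Lead from IQ can be checked at the population level, without sampling noise. The sketch below builds the covariance matrix implied by the Lead and IQ equations plus three indicators, then reads regression coefficients off that matrix. The loadings g1..g3 and the measurement-error standard deviation are hypothetical values chosen for illustration (the slides do not give them), and "s" is again assumed to be a standard deviation.

```python
import numpy as np

g = np.array([1.0, 1.0, 1.0])   # HYPOTHETICAL loadings g1..g3
meas_sd = 3.0                   # HYPOTHETICAL sd of measurement errors e1..e3

# Independent sources: PR, Lead error, IQ error, and three measurement errors.
src_var = np.array([3.0**2, 1.635**2, 15.0**2] + [meas_sd**2] * 3)

# Each row expresses one variable as a linear combination of the sources.
PR, LEAD, IQ, X1, X2, X3, SCALE = range(7)
B = np.zeros((7, 6))
B[PR, 0] = 1.0
B[LEAD, [0, 1]] = [-0.5, 1.0]       # Lead = -.5*PR + error
B[IQ, [0, 2]] = [1.0, 1.0]          # IQ = 1*PR + error
for k in range(3):
    B[X1 + k, 0] = g[k]             # Xk = gk*PR + ek
    B[X1 + k, 3 + k] = 1.0
B[SCALE] = B[X1:X3 + 1].mean(axis=0)  # PR_Scale = (X1 + X2 + X3) / 3

cov = B @ np.diag(src_var) @ B.T    # implied population covariance matrix


def coefs(y, xs):
    """Population regression coefficients of y on the variables in xs."""
    xs = list(xs)
    return np.linalg.solve(cov[np.ix_(xs, xs)], cov[xs, y])


b_true = coefs(IQ, [LEAD, PR])[0]           # conditioning on the latent itself
b_scale = coefs(IQ, [LEAD, SCALE])[0]       # conditioning on the scale
b_ind = coefs(IQ, [LEAD, X1, X2, X3])[0]    # conditioning on the indicators

print(f"Lead coefficient given PR:         {b_true:.4f}")
print(f"Lead coefficient given PR_Scale:   {b_scale:.4f}")
print(f"Lead coefficient given X1, X2, X3: {b_ind:.4f}")
```

Only conditioning on PR itself drives the Lead coefficient to zero; the scale and the indicators are noisy proxies, so part of the confounding leaks through.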

Structural Equation Models Work

[Figure: X1, X2, X3 measure the latent Parental Resources, which causes Lead Exposure and IQ; one edge coefficient is labeled b]

  • Structural Equation Model
  • (p-value = .499)
  • Lead and IQ “screened off” by PR
Local Independence / Pure Measurement Models

  • For every measured item xi and every other measured item xj:
  • xi _||_ xj | latent parent of xi
Strategies
  • Find a Locally Independent Measurement Model
  • Correctly specify the MM, including deviations from Local Independence
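Tetrad Constraints
  • Fact: given a graph with this structure

[Figure: a single latent L with four measured children W, X, Y, Z, loadings $\lambda_1, \ldots, \lambda_4$ and errors $\epsilon_1, \ldots, \epsilon_4$]

$W = \lambda_1 L + \epsilon_1$

$X = \lambda_2 L + \epsilon_2$

$Y = \lambda_3 L + \epsilon_3$

$Z = \lambda_4 L + \epsilon_4$

  • it follows that

$\mathrm{Cov}(W,X)\,\mathrm{Cov}(Y,Z) = (\lambda_1\lambda_2\sigma_L^2)(\lambda_3\lambda_4\sigma_L^2) = (\lambda_1\lambda_3\sigma_L^2)(\lambda_2\lambda_4\sigma_L^2) = \mathrm{Cov}(W,Y)\,\mathrm{Cov}(X,Z)$

$\sigma_{WX}\sigma_{YZ} = \sigma_{WY}\sigma_{XZ} = \sigma_{WZ}\sigma_{XY}$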
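
The tetrad constraints for a single latent with four indicators can be verified numerically. In this sketch the loadings, the latent variance, and the error variances are hypothetical values; any nonzero choice gives the same result, because each product of covariances reduces to the product of all four loadings times the squared latent variance.

```python
import numpy as np

lam = np.array([0.9, 0.7, 1.2, 0.5])      # HYPOTHETICAL loadings for W, X, Y, Z
var_L = 2.0                               # HYPOTHETICAL variance of the latent L
err_var = np.array([1.0, 0.5, 0.8, 1.5])  # HYPOTHETICAL error variances

# Implied covariance matrix of (W, X, Y, Z) under the one-factor model.
cov = var_L * np.outer(lam, lam) + np.diag(err_var)

W, X, Y, Z = range(4)
t1 = cov[W, X] * cov[Y, Z]
t2 = cov[W, Y] * cov[X, Z]
t3 = cov[W, Z] * cov[X, Y]

print(t1, t2, t3)  # the three products agree up to floating-point error
```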

Early Progenitors

Charles Spearman (1904)

Statistical Constraints → Measurement Model Structure

[Figure: a single latent g with measured variables m1, m2, r1, r2]

rm1 * rr1 = rm2 * rr2

Impurities/Deviations from Local Independence defeat tetrad constraints selectively

[Figure: two impure measurement models over x1, x2, x3, x4, each annotated with the three tetrad constraints below]

$\rho_{x_1,x_2}\,\rho_{x_3,x_4} = \rho_{x_1,x_3}\,\rho_{x_2,x_4}$

$\rho_{x_1,x_2}\,\rho_{x_3,x_4} = \rho_{x_1,x_4}\,\rho_{x_2,x_3}$

$\rho_{x_1,x_3}\,\rho_{x_2,x_4} = \rho_{x_1,x_4}\,\rho_{x_2,x_3}$
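
A small numerical illustration of "selectively": the sketch below takes one latent with four indicators and adds a single impurity, correlated errors between x1 and x2. The loadings and the size of the error covariance are hypothetical. The two constraints that involve the x1,x2 pair fail, while the third still holds, which is the kind of pattern that lets an impurity be localized.

```python
import numpy as np


def tetrad_differences(cov):
    """Differences for the three tetrad constraints over items 0..3 (x1..x4)."""
    s = cov
    d1 = s[0, 1] * s[2, 3] - s[0, 2] * s[1, 3]  # x1x2*x3x4 vs x1x3*x2x4
    d2 = s[0, 1] * s[2, 3] - s[0, 3] * s[1, 2]  # x1x2*x3x4 vs x1x4*x2x3
    d3 = s[0, 2] * s[1, 3] - s[0, 3] * s[1, 2]  # x1x3*x2x4 vs x1x4*x2x3
    return d1, d2, d3


lam = np.array([1.0, 0.8, 0.6, 1.1])     # HYPOTHETICAL loadings
pure = np.outer(lam, lam) + np.eye(4)    # latent variance 1, error variances 1

impure = pure.copy()
impure[0, 1] += 0.3                      # HYPOTHETICAL correlated errors
impure[1, 0] += 0.3                      # between x1 and x2

pure_d = tetrad_differences(pure)
impure_d = tetrad_differences(impure)
print("pure:  ", np.round(pure_d, 6))
print("impure:", np.round(impure_d, 6))
```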

Purify

[Figure: True Model and Initially Specified Measurement Model]

Purify

Iteratively remove the item whose removal most improves measurement model fit (tetrads or χ²); stop when confirmatory fit is acceptable

[Figure: Remove x4; Remove z2]

Purify

[Figure: Detectably Pure Subset of Items and Detectably Pure Measurement Model]
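
The greedy loop described on the Purify slides can be sketched in a heavily simplified form. This is not the Purify procedure as implemented in Tetrad: it assumes a single cluster, works on a population covariance matrix, and replaces the statistical fit test with a hypothetical misfit score (the sum of squared tetrad differences) and a numerical tolerance. The function names `tetrad_misfit` and `purify` are introduced here for illustration.

```python
from itertools import combinations

import numpy as np


def tetrad_misfit(cov, items):
    """Sum of squared tetrad differences over all 4-item subsets of items."""
    total = 0.0
    for a, b, c, d in combinations(items, 4):
        p1 = cov[a, b] * cov[c, d]
        p2 = cov[a, c] * cov[b, d]
        p3 = cov[a, d] * cov[b, c]
        total += (p1 - p2) ** 2 + (p1 - p3) ** 2 + (p2 - p3) ** 2
    return total


def purify(cov, items, tol=1e-12):
    """Greedily drop the item whose removal most reduces the misfit."""
    items = list(items)
    while len(items) > 4 and tetrad_misfit(cov, items) > tol:
        drop = min(
            items,
            key=lambda i: tetrad_misfit(cov, [j for j in items if j != i]),
        )
        items.remove(drop)
    return items


# HYPOTHETICAL example: six indicators of one latent, where items 0 and 1
# have correlated errors (an impurity).
lam = np.array([1.0, 0.9, 0.8, 0.7, 0.6, 0.5])
cov = np.outer(lam, lam) + np.eye(6)
cov[0, 1] += 0.3
cov[1, 0] += 0.3

kept = purify(cov, range(6))
print("kept items:", kept)
```

In this example, removing either of the two impure items restores all tetrad constraints among the rest, so the loop stops after one removal.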

How a pure measurement model is useful

1. Consistently estimate covariances/correlations among latents, then test conditional independence with the estimated latent correlations

2. Test for conditional independence among latents directly

2. Test conditional independence relations among latents directly

Question: L1 _||_ L2 | {Q1, Q2, ..., Qn}?

[Figure: structural model in which b21 is the coefficient relating L1 and L2]

b21 = 0 ⇔ L1 _||_ L2 | {Q1, Q2, ..., Qn}
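
The link between b21 and conditional independence can be illustrated from a latent covariance matrix. The sketch below uses a hypothetical linear model with a single conditioning latent Q that causes both L1 and L2, and no edge between L1 and L2. It computes the coefficient on L1 in the regression of L2 on L1 and Q, together with the partial correlation of L1 and L2 given Q. In a linear model the two vanish together; reading a zero partial correlation as conditional independence additionally assumes normality.

```python
import numpy as np

# HYPOTHETICAL latent model: Q -> L1 (coefficient a), Q -> L2 (coefficient b),
# unit error variances, and no edge between L1 and L2.
var_Q, a, b = 1.0, 0.8, -0.6
L1, L2, Q = range(3)
cov = np.array([
    [a * a * var_Q + 1.0, a * b * var_Q,       a * var_Q],
    [a * b * var_Q,       b * b * var_Q + 1.0, b * var_Q],
    [a * var_Q,           b * var_Q,           var_Q],
])


def regression_coef(cov, y, x, given):
    """Coefficient on x when y is regressed on x and the variables in given."""
    preds = [x, *given]
    beta = np.linalg.solve(cov[np.ix_(preds, preds)], cov[preds, y])
    return beta[0]


def partial_corr(cov, i, j, given):
    """Partial correlation of i and j given the variables in given."""
    idx = [i, j, *given]
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


b21 = regression_coef(cov, y=L2, x=L1, given=[Q])
rho_given_q = partial_corr(cov, L1, L2, [Q])
rho_marginal = partial_corr(cov, L1, L2, [])

print(f"b21 = {b21:.6f}, partial corr = {rho_given_q:.6f}, "
      f"marginal corr = {rho_marginal:.4f}")
```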

MIMbuild

Input:

- Purified Measurement Model

- Covariance matrix over set of pure items

MIMbuild: PC algorithm with independence tests performed directly on latent variables

Output: Equivalence class of structural models over the latent variables
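
To make the "PC algorithm on latent variables" step more concrete, here is a minimal sketch of the adjacency phase of PC only. It is not MIMbuild itself: it assumes the covariance matrix among the latents is already available, it replaces the statistical independence test with a numerical tolerance on population partial correlations, and it omits edge orientation. The names `pc_skeleton` and `partial_corr` are introduced here for illustration.

```python
from itertools import combinations

import numpy as np


def partial_corr(cov, i, j, cond):
    """Partial correlation of i and j given the variables in cond."""
    idx = [i, j, *cond]
    prec = np.linalg.inv(cov[np.ix_(idx, idx)])
    return -prec[0, 1] / np.sqrt(prec[0, 0] * prec[1, 1])


def pc_skeleton(cov, tol=1e-8):
    """Adjacency phase of PC: start complete, remove edges found independent."""
    p = cov.shape[0]
    adj = {i: set(range(p)) - {i} for i in range(p)}
    size = 0  # size of the conditioning sets tried in this pass
    while any(len(adj[i]) - 1 >= size for i in adj):
        for i in range(p):
            for j in sorted(adj[i]):
                others = sorted(adj[i] - {j})
                for cond in combinations(others, size):
                    if abs(partial_corr(cov, i, j, cond)) < tol:
                        adj[i].discard(j)
                        adj[j].discard(i)
                        break
        size += 1
    return {frozenset((i, j)) for i in adj for j in adj[i]}


# HYPOTHETICAL latent structure: a chain L1 -> L2 -> L3 with unit error variances.
B = np.array([
    [1.0,  0.0, 0.0],   # L1 = e1
    [0.7,  1.0, 0.0],   # L2 = 0.7*L1 + e2
    [0.35, 0.5, 1.0],   # L3 = 0.5*L2 + e3, written in terms of e1, e2, e3
])
latent_cov = B @ B.T

skeleton = pc_skeleton(latent_cov)
print(sorted(tuple(sorted(edge)) for edge in skeleton))
```

The L1, L3 edge is removed because the two are independent given L2, leaving the two adjacencies of the chain.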

Goal 2: What Latents are out there?
  • How should they be measured?

Latents and the clustering of items they measure imply tetrad constraints differentially

Build Pure Clusters (BPC)

Input:

- Covariance matrix over set of original items

BPC:

1) Cluster (complicated Boolean combinations of tetrads)

2) Purify

Output: Equivalence class of measurement models over a pure subset of original items

Build Pure Clusters
  • Qualitative Assumptions:
    • Two types of nodes: measured (M) and latent (L)
    • M ↛ L (measured don’t cause latents)
    • Each m ∈ M measures (is a direct effect of) at least one l ∈ L
    • No cycles involving M
  • Quantitative Assumptions:
    • Each m ∈ M is a linear function of its parents plus noise
    • P(L) has second moments, positive variances, and no deterministic relations
Build Pure Clusters

Output - provably reliable (pointwise consistent):

Equivalence class of measurement models over a pure subset of M

For example:

[Figure: True Model and the corresponding Output]

Build Pure Clusters

Measurement models in the equivalence class are at most refinements, but never coarsenings or permuted clusterings.

[Figure: Output]

Build Pure Clusters
  • Algorithm Sketch:
    1) Use particular rank (tetrad) constraints on the measured correlations to find pairs of items mj, mk that do NOT share a single latent parent
    2) Add a latent for each subset S of M such that no pair in S was found NOT to share a latent parent in step 1
    3) Purify
    4) Remove latents with no children
Case Studies

Stress, Depression, and Religion (Lee, 2004)

Test Anxiety (Bartholomew, 2002)

Case Study: Stress, Depression, and Religion
  • Master's students (N = 127), 61-item survey (Likert scale)
  • Stress: St1 - St21
  • Depression: D1 - D20
  • Religious Coping: C1 - C20

[Figure: Specified Model, p = 0.00]

Case Study: Stress, Depression, and Religion

  • Assume Stress temporally prior
  • MIMbuild to find Latent Structure

[Figure: resulting latent structure, p = 0.28]

Case Study: Test Anxiety

Bartholomew and Knott (1999), Latent Variable Models and Factor Analysis

12th Grade Males in British Columbia (N = 335)

20-item survey (Likert scale items): X1 - X20

[Figure: Exploratory Factor Analysis model]

Case Study: Test Anxiety

[Figure: Build Pure Clusters model]

Case Study: Test Anxiety

[Figure: the Build Pure Clusters and Exploratory Factor Analysis models side by side; the fit p-values shown are 0.00 and 0.47]

Case Study: Test Anxiety

MIMbuild: p = .43

Scales: No Independencies or Conditional Independencies (uninformative)
Limitations
  • In simulation studies, requires large sample sizes to be reliable (~ 400-500).
  • 2 pure indicators must exist for a latent to be discovered and included
  • Moderately computationally intensive (O(n^6)).
  • No error probabilities.
Open Questions/Projects
  • IRT models?
  • Bi-factor model extensions?
  • Appropriate incorporation of background knowledge
References
  • Tetrad: www.phil.cmu.edu/projects/tetrad_download
  • Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition, MIT Press.
  • Pearl, J. (2000). Causality: Models, Reasoning, and Inference, Cambridge University Press.
  • Silva, R., Glymour, C., Scheines, R., and Spirtes, P. (2006). “Learning the Structure of Linear Latent Variable Models,” Journal of Machine Learning Research, 7, 191-246.
  • Silva, R., Scheines, R., Glymour, C., and Spirtes, P. (2003). “Learning Measurement Models for Unobserved Variables,” in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence, U. Kjaerulff and C. Meek, eds., Morgan Kaufmann.