Bootstrap Confidence Intervals for Three-way Component Methods

Presentation Transcript
Slide 1

Bootstrap Confidence Intervals for Three-way Component Methods

Henk A.L. Kiers

University of Groningen

The Netherlands


Slide 2

[Figure: three-way data array X, with SUBJECTS i = 1, ..., I; VARIABLES j = 1, ..., J; OCCASIONS k = 1, ..., K]


Slide 3

  • Three-way Methods:

    • Tucker3: $X_a = A G_a (C^\top \otimes B^\top) + E_a$

    • A ($I \times P$), B ($J \times Q$), C ($K \times R$): component matrices

    • $G_a$: matricized version of G ($P \times Q \times R$), the core array

    • CP = Candecomp/Parafac: $X_a = A G_a (C^\top \otimes B^\top) + E_a$ with G ($R \times R \times R$) superdiagonal
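The matricized model formula can be checked numerically. The following is a minimal sketch, assuming that $X_a$ is the $I \times JK$ unfolding in which the variable index runs fastest within each occasion (an assumption about the unfolding order, which the slide does not state). The helper names `matricize` and `tucker3_structural` are hypothetical, and the identity $(C \otimes B)^\top = C^\top \otimes B^\top$ is used to form the Kronecker factor.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 5, 4, 3          # subjects, variables, occasions
P, Q, R = 3, 2, 2          # numbers of components per mode

A = rng.standard_normal((I, P))
B = rng.standard_normal((J, Q))
C = rng.standard_normal((K, R))
G = rng.standard_normal((P, Q, R))      # Tucker3 core array


def matricize(T):
    """Unfold an n1 x n2 x n3 array into n1 x (n3*n2), second index running fastest."""
    n1, n2, n3 = T.shape
    return T.transpose(0, 2, 1).reshape(n1, n3 * n2)


def tucker3_structural(A, B, C, G):
    """Structural part A G_a (C' kron B') of the matricized model (no residuals)."""
    return A @ matricize(G) @ np.kron(C, B).T


# Tucker3: compare the matricized formula with the elementwise definition.
Xa = tucker3_structural(A, B, C, G)
X_direct = np.einsum("ip,jq,kr,pqr->ijk", A, B, C, G)

# CP as a special case: R x R x R superdiagonal core with ones on the diagonal.
G_cp = np.zeros((R, R, R))
G_cp[np.arange(R), np.arange(R), np.arange(R)] = 1.0
A_cp = A[:, :R]                          # CP needs the same number of columns in each mode
Xa_cp = tucker3_structural(A_cp, B, C, G_cp)
X_cp_direct = np.einsum("ir,jr,kr->ijk", A_cp, B, C)
```

With the superdiagonal core, the Tucker3 expression reduces to a sum of R rank-one terms, which is the CP model.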

  • Practice:

    • three-way methods applied to sample from population

    • goal: results should pertain to population


Slide 4

  • Example (Kiers & Van Mechelen 2001):

    • scores of 140 individuals on 14 anxiousness response scales in 11 different situations

    • Tucker3 with P=6, Q=4, R=3 (41.1%)

    • Rotation: B, C, and core to simple structure



Slide 6

Component matrix C


Slide 7

Core


Slide 8

Is solution stable?

Is solution ‘reliable’? Would it also hold for population?

Kiers & Van Mechelen report split-half stability results:

Split-half results: rather global stability measures


Slide 9

  • How can we assess degree of stability/reliability of individual results?

  • confidence intervals (CI) for all parameters

  • not readily available

  • derivable under rather strong assumptions (e.g., normal distributions, full identification)

  • alternative:

  • BOOTSTRAP


Slide 10

  • BOOTSTRAP

    • distribution free

    • very flexible (CI’s for ‘everything’)

    • can deal with nonunique solutions

    • computationally intensive


Slide 11

Bootstrap procedure:

  • Analyze sample data X ($I \times J \times K$) by desired method → sample outcomes $\theta$ (e.g., A, B, C and G)

  • Repeat for b=1:500

    • draw sample with replacement from I slabs of X → $X_b$ ($I \times J \times K$)

    • analyze bootstrap sample $X_b$ in same way as sample → outcomes $\theta_b$ (e.g., $A_b$, $B_b$, $C_b$ and $G_b$)

  • For each individual parameter $\theta$:

    • 500 values available

    • range of middle 95% of values → “95% percentile interval” ($\approx$ Confidence Interval)
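The procedure above can be sketched in a few lines. This is only an illustration under stated assumptions: `bootstrap_percentile_ci` and `mean_slab` are hypothetical names, and `mean_slab` (the mean over subjects) is a stand-in for an actual Tucker3 or CP analysis, used so that the example stays short and runnable. Any function that maps an $I \times J \times K$ array to an array of outcome parameters could be passed instead.

```python
import numpy as np


def bootstrap_percentile_ci(X, estimator, n_boot=500, level=0.95, seed=0):
    """Percentile intervals obtained by resampling the I subject slabs of X."""
    rng = np.random.default_rng(seed)
    n_subjects = X.shape[0]
    theta_hat = np.asarray(estimator(X))            # sample outcomes
    boot = np.empty((n_boot,) + theta_hat.shape)
    for b in range(n_boot):
        idx = rng.integers(0, n_subjects, size=n_subjects)  # draw I slabs with replacement
        boot[b] = estimator(X[idx])                 # analyze X_b in the same way as X
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(boot, [alpha, 1.0 - alpha], axis=0)
    return theta_hat, lower, upper


def mean_slab(X):
    """Hypothetical stand-in estimator: J x K matrix of means over subjects."""
    return X.mean(axis=0)


# Random data with the dimensions of the anxiety example (140 x 14 x 11).
X = np.random.default_rng(1).standard_normal((140, 14, 11))
theta_hat, lower, upper = bootstrap_percentile_ci(X, mean_slab)
```

Each entry of `lower` and `upper` bounds the middle 95% of the 500 bootstrap values of the corresponding parameter.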


Slide 12

  • Basic idea of bootstrap:

    • distribution in sample = nonparametric maximum likelihood estimate of population distribution

    • draw samples from estimated population distribution, just as actual sample drawn from population

  • From which mode do we resample?

  • Answer: mimic how we sampled from population

    • sample subjects from population → resample A-mode


Slide 13

  • Three questions:

  • How to deal with transformational nonuniqueness?

    • Lots of possibilities, depends on interpretation

  • Are bootstrap intervals good approximations of confidence intervals?

    • Not too bad

  • How to deal with computational problems (if any)?

    • Simple effective procedure


Slide 14

  • 1. How to deal with transformational nonuniqueness?

  • identify solution completely

  • identify solution up to permutation/reflection

  • → (first two options) for CP and Tucker3

  • identify solution up to orthogonal transformations

  • identify solution up to arbitrary nonsingular transformations

  • → (last two options) only for Tucker3


Slide 15

Identify solution completely:

→ uniquely defined outcome parameters $\theta$

→ bootstrap straightforward (CI’s directly available)

CP and Tucker3 (principal axes or simple structure)

- solution identified up to scaling/permutation

Both cases:

- further identification needed


Slide 16

Identify solution up to permutation/reflection (does not affect fit)

→ outcome parameters $\theta_b$ may differ much, but maybe only due to ordering or sign

→ bootstrap CI’s unrealistically broad!

→ how to make $\theta_b$’s comparable?

Solution:

→ reorder and reflect columns in (e.g.) $B_b$, $C_b$ such that $B_b$, $C_b$ optimally resemble B, C
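One way to carry out such a matching is sketched below. It assumes a least-squares criterion (the slide only says "optimally resemble") and searches all column permutations, which is feasible only for a small number of components. The function name `match_perm_reflect` is hypothetical.

```python
import itertools
import numpy as np


def match_perm_reflect(B_boot, B_ref):
    """Reorder and reflect columns of B_boot to minimize ||B_matched - B_ref||^2."""
    n_comp = B_ref.shape[1]
    # cross[r, s] = inner product of reference column r and bootstrap column s
    cross = B_ref.T @ B_boot
    best_perm, best_score = None, -np.inf
    for perm in itertools.permutations(range(n_comp)):
        # With optimal signs, the least-squares loss decreases with sum of |inner products|.
        score = sum(abs(cross[r, s]) for r, s in enumerate(perm))
        if score > best_score:
            best_perm, best_score = perm, score
    perm = np.array(best_perm)
    signs = np.sign(cross[np.arange(n_comp), perm])
    signs[signs == 0] = 1.0                      # leave orthogonal columns unreflected
    return B_boot[:, perm] * signs, perm, signs


# Small check: permute and reflect a reference matrix, then recover it.
B_ref = np.random.default_rng(0).standard_normal((6, 3))
B_boot = B_ref[:, [2, 0, 1]] * np.array([1.0, -1.0, 1.0])
B_matched, perm, signs = match_perm_reflect(B_boot, B_ref)
```

The returned `perm` and `signs` can then be applied to the other bootstrap outcomes. In CP the same permutation applies to every component matrix, and a reflection in one mode has to be compensated in another mode so that the fitted values do not change.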


Slide 17

  • Completely identified

    • pros: direct bootstrap CI’s

    • cons: takes orientation, order (too?!) seriously (e.g., two equally strong components → unstable order)

  • Identified up to perm./refl.

    • pros: more realistic solution

    • cons: cannot fully mimic sample & analysis process


Slide 18

Intermezzo

What can go wrong when you take orientation too seriously?

Two-way example data: $100 \times 8$ data set

PCA: 2 components

Eigenvalues: 4.04, 3.96, 0.0002, … (first two close to each other)

PCA (unrotated) solutions for variables (a, b, c, d, e, f, g, h)

bootstrap 95% confidence ellipses*

*) thanks to program by Patrick Groenen (procedure by Meulman & Heiser, 1983)


Slide 19

What caused these enormous ellipses?

Look at loadings for data and some bootstraps:

… leading to standard errors: ...


Slide 20

Conclusion: solutions very unstable, hence: loadings seem very uncertain

However ….

Configurations of subsamples very similar

So: We should’ve considered the whole configuration!


Slide 21

  • Identify solution up to orthogonal transformations

  • Tucker3 solution with A, B, C columnwise orthonormal:

  • → any rotation gives same fit (counterrotation of core)

  • → outcome parameters $\theta_b$ may differ much, but maybe only due to coincidental ‘orientation’

  • → bootstrap CI’s unrealistically broad

  • Make $\theta_b$’s comparable:

  • → rotate $B_b$, $C_b$, $G_b$ such that they optimally resemble B, C, G

  • How?

    • minimize $f_1(T) = \|B_b T - B\|^2$ and $f_2(U) = \|C_b U - C\|^2$

    • counterrotate core: $G_b (U \otimes T)$

    • minimize $f_3(S) = \|S G_b - G\|^2$ (with $G_b$ the counterrotated, matricized core)

    • use $B_b^* = B_b T$, $C_b^* = C_b U$, $G_b^* = S G_b$ to determine 95% CI’s (comparable across bootstraps)
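The three minimizations are orthogonal Procrustes problems, which have a closed-form solution via the singular value decomposition. The sketch below assumes that T, U and S are restricted to orthonormal matrices and that the core is stored in matricized form ($P \times RQ$, with the B-mode index running fastest). The function names are hypothetical, and this is one possible reading of the steps rather than the program used for the talk.

```python
import numpy as np


def procrustes_rotation(M_boot, M_ref):
    """Orthonormal T minimizing ||M_boot @ T - M_ref||^2 (orthogonal Procrustes)."""
    left, _, right_t = np.linalg.svd(M_boot.T @ M_ref)
    return left @ right_t


def match_orthogonal(B_b, C_b, Ga_b, B, C, Ga):
    """Rotate a bootstrap Tucker3 solution towards the sample solution."""
    T = procrustes_rotation(B_b, B)                  # minimizes f1(T)
    U = procrustes_rotation(C_b, C)                  # minimizes f2(U)
    Ga_rot = Ga_b @ np.kron(U, T)                    # counterrotate the matricized core
    S = procrustes_rotation(Ga_rot.T, Ga.T).T        # minimizes f3(S) = ||S Ga_rot - Ga||^2
    return B_b @ T, C_b @ U, S @ Ga_rot


def random_orthonormal(n_rows, n_cols, rng):
    """Matrix with orthonormal columns from a QR decomposition of random data."""
    q, _ = np.linalg.qr(rng.standard_normal((n_rows, n_cols)))
    return q


# Check: a rotated copy of a solution is mapped back onto the original.
rng = np.random.default_rng(0)
P, Q, R = 4, 3, 2
B = random_orthonormal(6, Q, rng)
C = random_orthonormal(5, R, rng)
Ga = rng.standard_normal((P, R * Q))
T0, U0, S0 = (random_orthonormal(n, n, rng) for n in (Q, R, P))
B_star, C_star, Ga_star = match_orthogonal(
    B @ T0, C @ U0, S0 @ Ga @ np.kron(U0, T0), B, C, Ga
)
```

In a bootstrap run, `match_orthogonal` would be applied to every bootstrap solution before the percentile intervals are computed.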


Slide 22

  • Notes:

  • first choose orientation of sample solution (e.g., principal axes or other)

  • order of rotations (first B and C, then G): somewhat arbitrary, but may have effect


Slide 23

Identify solution up to nonsingular transformations

....analogously.....

→ transform $B_b$, $C_b$, $G_b$ so as to optimally resemble B, C, G


Slide 24

  • Expectation:

  • the more transformational freedom used in bootstraps

  • → the smaller the CI’s

  • Example:

    • anxiety data set (140 subjects, 14 scales, 11 situations)

    • apply 4 bootstrap procedures

    • compare bootstrap se’s of all outcomes


Slide 25

Some summary results:

Bootstrap Method     mean se (B)   mean se (C)   mean se (G)
Principal Axes          .085          .101          3.84
Simple Structure        .085          .093          2.77
Orthog Matching         .059          .088          2.20
Oblique Matching        .055          .076          2.17


Slide 26

Now what CI’s did we find for Anxiety data?

Plot of confidence ellipses for first two and last two B components



Slide 28

A bit small....

Confidence intervals for Highest Core Values


Slide 30

  • 2. Are bootstrap intervals good approximations of Confidence Intervals?

  • 95% CI should contain population values in 95% of samples

  • → “coverage” should be 95%

  • Answered by SIMULATION STUDY

  • Set up:

    • construct population with Tucker3/CP structure + noise

    • apply Tucker3/CP to population → population parameters

    • draw samples from population

    • apply Tucker3/CP to sample and construct bootstrap CI’s

    • check coverage: how many CI’s contain population parameter
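The logic of such a coverage check can be illustrated on a much simpler problem. The sketch below replaces the Tucker3/CP analysis with the sample mean of normally distributed data, an assumption made purely to keep the example small, so the number it produces says nothing about three-way methods. The function names and the settings (200 samples, 200 bootstraps) are hypothetical and far smaller than the design of the study.

```python
import numpy as np


def percentile_ci(sample, statistic, n_boot, rng, level=0.95):
    """Bootstrap percentile interval for a scalar statistic of a 1-D sample."""
    n = sample.shape[0]
    idx = rng.integers(0, n, size=(n_boot, n))       # one row of indices per bootstrap
    boot = np.array([statistic(sample[row]) for row in idx])
    alpha = (1.0 - level) / 2.0
    return np.quantile(boot, alpha), np.quantile(boot, 1.0 - alpha)


def estimate_coverage(population_value, draw_sample, statistic,
                      n_samples=200, n_boot=200, seed=0):
    """Proportion of samples whose bootstrap interval contains the population value."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_samples):
        sample = draw_sample(rng)
        lower, upper = percentile_ci(sample, statistic, n_boot, rng)
        hits += int(lower <= population_value <= upper)
    return hits / n_samples


# Stand-in population: standard normal, population mean 0, samples of size 50.
coverage = estimate_coverage(0.0, lambda rng: rng.standard_normal(50), np.mean)
```

A coverage close to 0.95 indicates that the percentile intervals behave like 95% confidence intervals for this statistic.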


Slide 31

  • Design of simulation study:

    • noise: low, medium, high

    • sample size (I): 20, 50, 100

    • 6 size conditions: (J=4,8, K=6,20, core: $2 \times 2 \times 2$, $3 \times 3 \times 3$, $4 \times 3 \times 2$)

  • Other Choices

    • number of bootstraps: always 500

    • number of populations: 10

    • number of samples: 10

    • Each cell: $10 \times 10 \times 500 = 50000$ Tucker3 or CP analyses (full design: $3 \times 3 \times 6 = 54$ conditions)


Slide 32

Here are the results (should be close to 95%)


Slide 33

Some details: ranges of values per cell in design (and associated se’s)

  • Some cells really low coverage

  • Most problematic cases in conditions with small I (I=20)


Slide 34

3. How to deal with computational problems (if any)?

Is there a problem?

Computation times per 500 bootstraps:

(Note: largest data size: $100 \times 8 \times 20$)

CP: min 4 s, max 452 s

Tucker3 (SimpStr): min 3 s, max 30 s

Tucker3 (OrthogMatch): min 1 s, max 23 s

Problem most severe with CP


Slide 35

How to deal with computational problems for CP?

  • Idea: Start bootstraps from sample solution

  • Problem: May detract from coverage

  • Tested by simulation:

  • CP with 5 different starts per bootstrap

  • vs

  • Fast bootstrap procedure


Slide 36

  • Results:

  • Fast method about 6 times faster (as expected)

  • Coverage

  • Optimal method: B: 95.5% C: 95.1%

  • Fast method: B: 95.3% C: 94.7%

  • Time gain enormous

  • Coverage hardly different


Slide 37

  • Conclusion & Discussion

  • Bootstrap CI’s seem reasonable

  • Matching makes intervals smaller

  • Computation times for Tucker3 acceptable, for CP can be decreased by starting each bootstrap in sample solution


Slide 38

some first tests show that this works

  • Conclusion & Discussion

  • What do bootstrap CI’s mean in case of matching?

  • 95% confidence for each value?

    • chance capitalization

    • ignores dependence of parameters (they vary together)

  • Show dependence by bootstrap movie...!?!

  • Develop joint intervals (hyperspheres)...?

  • Sampling from two modes (e.g., A and C) ?

some first tests show that this does not work

