Circular analysis in systems neuroscience
Circular analysis in systems neuroscience – with particular attention to cross-subject correlation mapping

Nikolaus Kriegeskorte

Laboratory of Brain and Cognition, National Institute of Mental Health


Collaborators

Chris I Baker

W Kyle Simmons

Patrick SF Bellgowan

Peter Bandettini


Overview

Part 1: General introduction to circular analysis in systems neuroscience (synopsis of Kriegeskorte et al. 2009)

Part 2: Specific issue: selection bias in cross-subject correlation mapping (following up on Vul et al. 2009)


[Diagram, built up over several slides: data, analysis, results, and assumptions. The final builds are titled "Circular inference".]


How do assumptions tinge results?

– Through variants of selection!

  • Weighting (continuous selection)

  • Elimination (binary selection)

  • Sorting (multiclass selection)


Elimination (binary selection)

[Diagram: data, analysis, results; assumptions: selection criteria]


Example 1: Pattern-information analysis


Experimental design (Simmons et al. 2006)

  • TASK (property judgment): “Animate?”, “Pleasant?”

  • STIMULUS (object category)


Pattern-information analysis

  • define ROI by selecting ventral-temporal voxels for which any pairwise condition contrast is significant at p<.001 (uncorr.)

  • perform nearest-neighbor classification based on activity-pattern correlation

  • use odd runs for training and even runs for testing


Results

[Figure: decoding accuracy (axis 0 to 1, chance level 0.5) for stimulus (object category) and task (judged property)]


[Figure: decoding accuracy relative to chance level for stimulus and task, for fMRI data and for data from a Gaussian random generator. Panels: using all data to select ROI voxels; using only training data to select ROI voxels. Annotation: "...but we used cleanly independent training and test data!"]


Conclusion for pattern-information analysis

The test data must not be used in either...

  • training a classifier (continuous weighting) or

  • defining the ROI (binary weighting)
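The effect described above can be reproduced with a small simulation. This is a minimal sketch, not the analysis from the talk: it uses two conditions, picks the `n_selected` voxels with the largest |t| instead of thresholding at p<.001, and all names and parameter values (`simulate_decoding`, `n_runs`, `n_voxels`, ...) are assumptions made for illustration. The data are pure Gaussian noise, so the true decoding accuracy is 0.5.

```python
import numpy as np

def simulate_decoding(select_on_all_data, n_sims=200, n_runs=8,
                      n_voxels=2000, n_selected=50, seed=0):
    """Mean decoding accuracy on pure-noise data (true accuracy is 0.5)."""
    rng = np.random.default_rng(seed)
    accuracies = []
    for _ in range(n_sims):
        # Gaussian noise, shape (runs, conditions A/B, voxels): no real effect.
        data = rng.standard_normal((n_runs, 2, n_voxels))
        train, test = data[0::2], data[1::2]  # odd runs (1st, 3rd, ...) / even runs
        # ROI selection: t statistic of the A - B contrast across runs.
        sel = data if select_on_all_data else train
        diff = sel[:, 0, :] - sel[:, 1, :]
        t = diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(len(diff)))
        roi = np.argsort(-np.abs(t))[:n_selected]
        # Nearest-neighbor classification by activity-pattern correlation.
        train_patterns = train.mean(axis=0)[:, roi]
        test_patterns = test.mean(axis=0)[:, roi]
        corr = np.corrcoef(np.vstack([test_patterns, train_patterns]))[:2, 2:]
        predicted = corr.argmax(axis=1)
        accuracies.append(np.mean(predicted == np.array([0, 1])))
    return float(np.mean(accuracies))

acc_all = simulate_decoding(select_on_all_data=True)
acc_train_only = simulate_decoding(select_on_all_data=False)
print(f"ROI from all data:           {acc_all:.2f}")
print(f"ROI from training data only: {acc_train_only:.2f}")
```

In both cases the classifier itself never sees the test runs. Only the voxel selection differs, which is enough to push the accuracy on noise far above chance.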


Data selection is key to many conventional analyses.

Can it entail similar biases in other contexts?


Example 2: Regional activation analysis


ROI definition is affected by noise

[Figure labels: true region, independent ROI, overfitted ROI; ROI-average activation; overestimated effect]
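A sketch of how a noise-affected ROI overestimates the effect, assuming a simple setup that is not taken from the talk: a map with a small true region, unit-variance Gaussian noise, and an ROI defined by thresholding one noisy measurement. The function name and all parameter values are hypothetical.

```python
import numpy as np

def roi_average_effect(n_sims=500, n_voxels=1000, n_true=100,
                       true_effect=0.5, threshold=1.0, seed=0):
    rng = np.random.default_rng(seed)
    signal = np.zeros(n_voxels)
    signal[:n_true] = true_effect          # the "true region"
    same_data, independent_data = [], []
    for _ in range(n_sims):
        # Two independent noisy measurements of the same activation map.
        map_a = signal + rng.standard_normal(n_voxels)
        map_b = signal + rng.standard_normal(n_voxels)
        roi = map_a > threshold            # ROI defined on measurement A
        same_data.append(map_a[roi].mean())         # circular estimate
        independent_data.append(map_b[roi].mean())  # independent estimate
    return float(np.mean(same_data)), float(np.mean(independent_data))

circular_avg, independent_avg = roi_average_effect()
print(f"ROI average on selection data:   {circular_avg:.2f}")
print(f"ROI average on independent data: {independent_avg:.2f}")
```

The ROI average computed on the selection data exceeds even the true effect inside the true region, while the average computed on independent data stays below it because the ROI also contains null voxels.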


Data sorting

[Diagram: data, analysis, results; assumptions: sorting criteria]


Set-average tuning curves

...for data sorted by tuning

[Figure: response as a function of stimulus parameter (e.g. orientation); noise data]


Set-average activation profiles

...for data sorted by activation

[Figure: ROI-average fMRI response for conditions A, B, C, D; noise data]
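The sorting effect can be illustrated with noise data. In this sketch (function name and sizes are assumptions), units are sorted by the condition with the largest response and the set averages are computed either on the same data or on an independent noise data set.

```python
import numpy as np

def set_average_profiles(n_units=4000, n_conditions=4, seed=0):
    rng = np.random.default_rng(seed)
    # Two independent noise data sets (units x conditions); no true tuning.
    sort_data = rng.standard_normal((n_units, n_conditions))
    fresh_data = rng.standard_normal((n_units, n_conditions))
    preferred = sort_data.argmax(axis=1)   # sort units by "preferred" condition
    circular = np.array([sort_data[preferred == c].mean(axis=0)
                         for c in range(n_conditions)])
    independent = np.array([fresh_data[preferred == c].mean(axis=0)
                            for c in range(n_conditions)])
    return circular, independent

circular, independent = set_average_profiles()
print(np.round(circular, 2))      # row c: profile of the set "preferring" condition c
print(np.round(independent, 2))   # same sets, profiles from independent data
```

With the same data used for sorting and averaging, every set shows a clear peak at its "preferred" condition. With independent data the profiles are flat.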


To avoid selection bias, we can...

...perform a nonselective analysis, e.g. whole-brain mapping (no ROI analysis)

OR

...make sure that selection and results statistics are independent under the null hypothesis, because they are either:

  • inherently independent, e.g. independent contrasts

  • or computed on independent data


Does selection by an orthogonal contrast vector ensure unbiased analysis?

c_selection = [1 1]^T

c_test = [1 -1]^T

orthogonal contrast vectors 

ROI-definition contrast: A+B

ROI-average analysis contrast: A-B


– No, there can still be bias.

Orthogonality of the contrast vectors is not sufficient. The design and noise dependencies matter.
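One way the design can matter, sketched under assumptions that are not from the talk: conditions A and B have different numbers of trials, so their estimates have different variances. Selection by A+B and testing by A-B then use orthogonal contrast vectors, yet the two statistics are correlated under the null. The function name and trial counts are hypothetical.

```python
import numpy as np

def selected_test_contrast(n_trials_a, n_trials_b, n_voxels=20000,
                           top_fraction=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Null data: conditions A and B have the same (zero) true response.
    est_a = rng.standard_normal((n_trials_a, n_voxels)).mean(axis=0)
    est_b = rng.standard_normal((n_trials_b, n_voxels)).mean(axis=0)
    selection = est_a + est_b        # c_selection = [1 1]^T
    test = est_a - est_b             # c_test = [1 -1]^T
    roi = selection >= np.quantile(selection, 1.0 - top_fraction)
    return float(test[roi].mean())   # ROI-average A - B; unbiased value is 0

balanced = selected_test_contrast(25, 25)
unbalanced = selected_test_contrast(10, 40)
print(f"balanced design:   {balanced:+.3f}")
print(f"unbalanced design: {unbalanced:+.3f}")
```

In the balanced case the covariance of A+B and A-B is var(A) - var(B) = 0 and the ROI-average test contrast stays near zero. In the unbalanced case var(A) > var(B), so selecting for large A+B favors voxels where A is large, and A-B is biased upward.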


Circular analysis

Pros

  • highly sensitive

  • widely accepted (examples in all high-impact journals)

  • doesn't require independent data sets

  • grants scientists independence from the data

  • allows smooth blending of blind faith and empiricism

  • the error that beautifies results

  • confirms even incorrect hypotheses

  • improves chances of high-impact publication

Cons

[can’t think of any right now]


Part 2: Specific issue: selection bias in cross-subject correlation mapping (following up on Vul et al. 2009)


Motivation

Vul et al. (2009) posed a puzzle:

Why are the cross-subject correlations found in brain mapping so high?

Selection bias is one piece of the puzzle.

But there are more pieces and we have yet to put them all together.


Overview

  • List and discuss six pieces of the puzzle.

    (They don't all point in the same direction!)

  • Suggest some guidelines for good practice.


Six pieces synopsis

  • Cross-subject correlation estimates are very noisy.

  • Bin or within-subject averaging legitimately increases correlations.

  • Selecting among noisy estimates yields large biases.

  • False-positive regions are highly likely for a whole-brain mapping thresholded at p<.001, uncorrected.

  • Reported correlations are high, but not highly significant.

  • Studies have low power for finding realistic correlations in the brain if multiple testing is appropriately accounted for.


Vul et al. 2009

[Equations not recovered; labels: noise-free correlation, population]

The geometric mean of the reliabilities is an upper bound on the population correlation.

The reliabilities provide no bound on the sample correlation.
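The equations on this slide did not survive extraction. The bound it refers to is presumably the classical attenuation relation, written here in notation chosen for this note: $\rho_{xy}$ is the population correlation between the two measured variables, $\rho_{x^{*}y^{*}}$ the noise-free correlation, and $\mathrm{rel}_x$, $\mathrm{rel}_y$ the reliabilities of the two measures.

```latex
\rho_{xy} \;=\; \rho_{x^{*}y^{*}} \,\sqrt{\mathrm{rel}_x \,\mathrm{rel}_y}
\qquad\Longrightarrow\qquad
|\rho_{xy}| \;\le\; \sqrt{\mathrm{rel}_x \,\mathrm{rel}_y}
\quad\text{since } |\rho_{x^{*}y^{*}}| \le 1 .
```

The bound concerns the population value only. A sample correlation computed from a finite number of subjects scatters around $\rho_{xy}$ and can exceed $\sqrt{\mathrm{rel}_x\,\mathrm{rel}_y}$.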


Piece 1

Sample correlations across small numbers of subjects are very noisy estimates of population correlations.


Cross-subject correlation estimates are very noisy

[Figure labels: correlation 0.65; 95%-confidence interval; 10 subjects]
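How wide such an interval is can be checked with the Fisher z-transformation. Using this approximation is an assumption of this sketch; the slide does not state how its interval was computed.

```python
import math
from statistics import NormalDist

def correlation_confidence_interval(r, n_subjects, level=0.95):
    """Approximate confidence interval for a Pearson correlation (Fisher z)."""
    z = math.atanh(r)                          # Fisher z-transform of r
    se = 1.0 / math.sqrt(n_subjects - 3)       # standard error of z
    half_width = NormalDist().inv_cdf(0.5 + level / 2.0) * se
    return math.tanh(z - half_width), math.tanh(z + half_width)

low, high = correlation_confidence_interval(0.65, 10)
print(f"r = 0.65, n = 10: 95% CI approx. [{low:.2f}, {high:.2f}]")
```

For r = 0.65 and 10 subjects the interval runs from roughly 0.03 to 0.91, that is, from almost no correlation to a very strong one.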


Piece 2

The more we average (reducing noise but not signal), the higher correlations become.


Bin-averaging inflates correlations




Subjects are like bins...

For each subject, all data is averaged to give one number.
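A sketch of the averaging effect, under assumed values that are not from the talk: a noise-free subject-level correlation of 0.8, measurement noise with standard deviation 2 on every trial, and an unrealistically large number of subjects so that the estimate itself is stable. The function name and parameters are hypothetical.

```python
import numpy as np

def observed_correlation(n_trials, n_subjects=2000, true_r=0.8,
                         noise_sd=2.0, seed=0):
    rng = np.random.default_rng(seed)
    # Noise-free subject-level values with population correlation true_r.
    cov = [[1.0, true_r], [true_r, 1.0]]
    latent = rng.multivariate_normal([0.0, 0.0], cov, size=n_subjects)
    # Each trial adds independent measurement noise; average trials per subject.
    noise = rng.standard_normal((n_subjects, 2, n_trials)) * noise_sd
    measured = latent + noise.mean(axis=2)
    return float(np.corrcoef(measured[:, 0], measured[:, 1])[0, 1])

r_single = observed_correlation(n_trials=1)
r_averaged = observed_correlation(n_trials=50)
print(f"1 trial per subject:   r = {r_single:.2f}")
print(f"50 trials per subject: r = {r_averaged:.2f}")
```

Averaging shrinks the noise variance but leaves the signal variance unchanged, so the observed correlation moves toward the noise-free value. This increase is legitimate and involves no selection.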

Take-home message

Cross-subject correlation estimates are expected to be...

  • high (averaging all data for each subject)

  • noisy (low number of subjects)

So what's Ed fussing about? We don't need selection bias to explain the high correlations, right?


Piece 3

Selecting the maximum among noisy estimates yields large selection biases.


Expected maximum correlation selected among null regions

[Figure labels: expected maximum correlation; bias; 16 subjects]
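The size of this bias can be estimated by simulation. The sketch below assumes independent null regions and Gaussian data. The function name and the region counts are illustrative, not values read off the figure.

```python
import numpy as np

def expected_max_null_correlation(n_regions, n_subjects=16, n_sims=200, seed=0):
    """Average of the maximum sample correlation among null regions."""
    rng = np.random.default_rng(seed)
    maxima = np.empty(n_sims)
    for i in range(n_sims):
        behavior = rng.standard_normal(n_subjects)
        brain = rng.standard_normal((n_regions, n_subjects))  # population r = 0
        b = (behavior - behavior.mean()) / behavior.std()
        z = (brain - brain.mean(axis=1, keepdims=True)) / brain.std(axis=1, keepdims=True)
        r = z @ b / n_subjects            # Pearson r for every region
        maxima[i] = r.max()
    return float(maxima.mean())

for n_regions in (1, 10, 100, 500):
    print(n_regions, round(expected_max_null_correlation(n_regions), 2))
```

With a single region the expected correlation is near zero. The expected maximum grows with the number of regions searched, although every population correlation is zero.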


Piece 4

False-positive regions are likely to be found in whole-brain mapping using p<.001, uncorrected.


Mapping with p<.001, uncorrected

Global null hypothesis is true

(population correlation = 0 in all brain locations)
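A back-of-the-envelope check, assuming the tests are independent (an idealization; the number of effectively independent locations in a real brain map is not given here, and the counts below are illustrative):

```python
def familywise_false_positive_rate(n_tests, alpha=0.001):
    """P(at least one false positive) for independent tests under the global null."""
    return 1.0 - (1.0 - alpha) ** n_tests

for n_tests in (100, 500, 5000):
    rate = familywise_false_positive_rate(n_tests)
    print(f"{n_tests} tests: P(at least one false positive) = {rate:.2f}, "
          f"expected count = {n_tests * 0.001:.1f}")
```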


Piece 5

Reported correlations are high, but not highly significant.


Reported correlations are high, but not highly significant

[Figure: correlation thresholds as a function of the number of subjects, one-sided and two-sided, for p<0.05, p<0.01, p<0.001, and p<0.00001]

What correlations would we expect under the global null hypothesis? (assuming each study reports the maximum of 500 independent brain locations)
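Threshold curves of this kind can be recomputed from the null distribution of the sample correlation. The sketch assumes bivariate normal data, for which the null density of r is proportional to (1 - r^2)^((n-4)/2), and integrates that density numerically on a grid. The function name and grid size are choices made for this illustration.

```python
import numpy as np

def critical_r(n_subjects, p, two_sided=False, grid=200001):
    """Smallest sample correlation significant at level p (requires n_subjects >= 4)."""
    r = np.linspace(-1.0, 1.0, grid)
    density = (1.0 - r**2) ** ((n_subjects - 4) / 2.0)  # null density of r (unnormalized)
    cdf = np.cumsum(density) / density.sum()
    alpha = p / 2.0 if two_sided else p
    return float(r[np.searchsorted(cdf, 1.0 - alpha)])

for n in (8, 16, 36):
    row = [round(critical_r(n, p, two_sided=True), 2) for p in (0.05, 0.01, 0.001)]
    print(n, row)
```

With few subjects, the correlation needed to reach even p<0.05 is already large, which is why a high reported correlation is not necessarily a highly significant one.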


Piece 6

Most of the studies have low power for finding realistic correlations with whole-brain mapping if multiple testing is appropriately accounted for.

see also: Yarkoni 2009


Numbers of subjects in studies reviewed by Vul et al. (2009)

[Figure: number of correlation estimates as a function of the number of subjects (axis ticks: 4, 8, 16, 36, 60, 100)]


Power

In order to find a single region with across-subject correlation of 0.7 in the brain, we would need about 36 subjects.

[Figure labels: power; 16 subjects; about 36 subjects]
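A rough power calculation along these lines, under assumptions that are not taken from the slide: a one-sided test, Bonferroni correction over 500 independent locations at a familywise level of 0.05, and the Fisher z approximation for the sampling distribution of r. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(true_r, n_subjects, n_locations=500, alpha=0.05):
    """Approximate power to detect one region with population correlation true_r."""
    norm = NormalDist()
    se = 1.0 / math.sqrt(n_subjects - 3)                  # SE of Fisher z
    z_crit = norm.inv_cdf(1.0 - alpha / n_locations) * se  # Bonferroni threshold in z units
    return 1.0 - norm.cdf((z_crit - math.atanh(true_r)) / se)

for n in (16, 36):
    print(f"{n} subjects: power approx. {approx_power(0.7, n):.2f}")
```

Under these assumptions, power is well below one half with 16 subjects and reaches a conventional level only around 36 subjects, which agrees with the figures quoted on the slide.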


Take-home message

Whole-brain cross-subject correlation mapping with 16 subjects does not work.

Need at least twice as many subjects.


Conclusions

Unless much larger numbers of subjects are used, whole-brain cross-subject correlation mapping suffers from either:

  • very low power to detect true regions (if we carefully correct for multiple comparisons)

  • very high rates of false-positive regions (otherwise)

    If analysis is circular, selection bias is expected to be high here (because selection occurs among noisy estimates).

...in other words,

it doesn't work.


Suggestions

  • Design study to have enough power to detect realistic correlations. (Need either anatomical restrictions or large numbers of subjects.)

  • Consider studying trial-to-trial rather than subject-to-subject effects.

  • Correct for multiple testing to avoid false positives.

  • Avoid circularity: Use a leave-one-subject-out procedure to estimate regional cross-subject correlations.

  • Report correlation estimates with error bars.
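One possible reading of the leave-one-subject-out suggestion, sketched on null data. The slides do not spell out the procedure, so the details here are assumptions: for each held-out subject, the peak region is selected using the remaining subjects only, and the held-out subject's value in that region enters the final correlation. All names and sizes are hypothetical.

```python
import numpy as np

def pearson_r(brain, behavior):
    """Correlation of each row of `brain` (regions x subjects) with `behavior`."""
    b = (behavior - behavior.mean()) / behavior.std()
    z = (brain - brain.mean(axis=1, keepdims=True)) / brain.std(axis=1, keepdims=True)
    return z @ b / len(behavior)

def circular_vs_loso(n_sims=100, n_regions=200, n_subjects=16, seed=0):
    rng = np.random.default_rng(seed)
    circular, loso = [], []
    for _ in range(n_sims):
        behavior = rng.standard_normal(n_subjects)
        brain = rng.standard_normal((n_regions, n_subjects))  # global null
        # Circular: select the peak region and report its correlation.
        circular.append(pearson_r(brain, behavior).max())
        # Leave-one-subject-out: select without the held-out subject.
        held_out = np.empty(n_subjects)
        for s in range(n_subjects):
            keep = np.arange(n_subjects) != s
            best = pearson_r(brain[:, keep], behavior[keep]).argmax()
            held_out[s] = brain[best, s]
        loso.append(np.corrcoef(held_out, behavior)[0, 1])
    return float(np.mean(circular)), float(np.mean(loso))

circular_r, loso_r = circular_vs_loso()
print(f"circular estimate:              {circular_r:.2f}")
print(f"leave-one-subject-out estimate: {loso_r:.2f}")
```

Under the global null the circular estimate is large, while the leave-one-subject-out estimate averages close to zero. Individual leave-one-out estimates remain noisy with 16 subjects, so error bars are still needed.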

