
T tests, ANOVAs and regression

Tom Jenkins

Ellen Meierotto

SPM Methods for Dummies 2007



Objectives

  • Types of error

  • Probability distribution

  • Z scores

  • T tests

  • ANOVAs


Error

  • Null hypothesis

  • Type 1 error (α): false positive

  • Type 2 error (β): false negative



Z scores

  • Standardised normal distribution

  • µ = 0, σ = 1

  • Z scores: 0, 1, 1.65, 1.96

  • Need to know population standard deviation

Z = (x - μ)/σ for one point compared to the population.
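As a rough illustration of the formula above, the sketch below computes a Z score and its one-sided tail probability under the standard normal distribution. The population mean, standard deviation and observed value are invented for the example, and `z_score` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Distance of x from the population mean, in population standard deviations."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical population (illustrative values only): mean 100, s.d. 15.
z = z_score(129.4, mu=100.0, sigma=15.0)

# One-sided tail probability of a value at least this large under the standard normal.
p_one_sided = 1.0 - NormalDist().cdf(z)

print(f"z = {z:.2f}, one-sided p = {p_one_sided:.3f}")
```

A Z score of about 1.96 leaves roughly 2.5% of the distribution in the upper tail, which is why it appears in the list of reference values above.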


T tests

  • Comparing means

  • 1 sample t

  • 2 sample t

  • Paired t



2 sample t tests

Pooled standard error of the mean
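A minimal sketch of a two-sample t statistic built on a pooled standard error, assuming both groups share the same variance. The formula used is the standard textbook one, not one taken from the slide, and the data and the name `pooled_t` are made up for illustration.

```python
import math
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Two-sample t statistic using a pooled variance (equal variances assumed)."""
    n1, n2 = len(sample1), len(sample2)
    # Pooled variance: sample variances weighted by their degrees of freedom.
    sp2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    # Pooled standard error of the difference between the two means.
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / se
    return t, n1 + n2 - 2

# Invented data for two independent groups.
group1 = [5.0, 6.0, 7.0, 8.0, 9.0]
group2 = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, df = pooled_t(group1, group2)
print(f"t({df}) = {t_stat:.2f}")
```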





T tests in SPM: Did the observed signal change occur by chance or is it stat. significant?

  • Recall GLM. Y= X β + ε

  • β1 is an estimate of signal change over time attributable to the condition of interest

  • Set up contrast (cᵀ) [1 0] for β1: (1×β1 + 0×β2 + … + 0×βn) / s.d.

  • Null hypothesis: cᵀβ = 0, i.e. no significant effect at each voxel for condition β1

  • Contrast [1 -1]: is the difference between 2 conditions significantly non-zero?

  • t = cᵀβ / sd[cᵀβ] (one-sided)
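The contrast t statistic can be sketched with ordinary least squares for a single time series. This is not SPM's code: it assumes independent errors, is restricted to two regressors so the matrix inverse can be written out by hand, and uses an invented design matrix and signal. `contrast_t` is a hypothetical name.

```python
import math

def contrast_t(X, y, c):
    """t = c'b / sd[c'b] for a GLM y = Xb + e with exactly two regressors."""
    n = len(y)
    # Entries of the symmetric 2x2 matrix X'X and of the vector X'y.
    s11 = sum(r[0] * r[0] for r in X)
    s12 = sum(r[0] * r[1] for r in X)
    s22 = sum(r[1] * r[1] for r in X)
    g1 = sum(r[0] * yi for r, yi in zip(X, y))
    g2 = sum(r[1] * yi for r, yi in zip(X, y))
    det = s11 * s22 - s12 * s12
    inv = [[s22 / det, -s12 / det], [-s12 / det, s11 / det]]  # (X'X)^-1
    # Least squares estimates b = (X'X)^-1 X'y.
    b = [inv[0][0] * g1 + inv[0][1] * g2, inv[1][0] * g1 + inv[1][1] * g2]
    # Residual variance with n - 2 degrees of freedom.
    rss = sum((yi - (r[0] * b[0] + r[1] * b[1])) ** 2 for r, yi in zip(X, y))
    sigma2 = rss / (n - 2)
    # Var(c'b) = sigma^2 * c'(X'X)^-1 c.
    cvc = sum(c[i] * inv[i][j] * c[j] for i in range(2) for j in range(2))
    effect = c[0] * b[0] + c[1] * b[1]
    return effect / math.sqrt(sigma2 * cvc), b

# Invented example: column 1 is an off/on condition, column 2 is a constant term.
X = [[0, 1], [0, 1], [0, 1], [1, 1], [1, 1], [1, 1]]
y = [1.0, 2.0, 3.0, 5.0, 6.0, 7.0]
t_value, betas = contrast_t(X, y, c=[1, 0])  # contrast [1 0] tests beta1
print(f"beta1 = {betas[0]:.2f}, t = {t_value:.2f}")
```

With the contrast [1 0], only β1 contributes to the numerator, so the statistic asks whether the condition effect is large relative to its standard deviation.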


ANOVA

  • Variances not means

  • Total variance= model variance + error variance

  • Results in an F score, corresponding to a p value

F = model variance / error variance
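A minimal sketch of the F ratio for a one-way design, assuming independent groups. The three groups are invented numbers and `one_way_f` is a hypothetical name. It also returns the two sums of squares so the partition of the total can be checked.

```python
from statistics import mean

def one_way_f(groups):
    """F = model (between-group) variance / error (within-group) variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Model (between-groups) sum of squares.
    ss_model = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error (within-groups) sum of squares.
    ss_error = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_model = len(groups) - 1
    df_error = len(all_values) - len(groups)
    f = (ss_model / df_model) / (ss_error / df_error)
    return f, ss_model, ss_error

# Invented data: three groups of three observations.
groups = [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0], [9.0, 10.0, 11.0]]
f_value, ss_model, ss_error = one_way_f(groups)

# Total sum of squares around the grand mean, for comparison with the partition.
values = [v for g in groups for v in g]
ss_total = sum((v - mean(values)) ** 2 for v in values)
print(f"F = {f_value:.1f}, model + error = {ss_model + ss_error:.1f}, total = {ss_total:.1f}")
```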


Partitioning the variance

Total = Model (between groups) + Error (within groups)

[Figure: Group 1 and Group 2, shown in three panels]


T vs F tests

  • F tests: any differences between multiple groups, interactions

  • Have to determine where the differences are post hoc

  • SPM T: one-tailed (con)

  • SPM F: two-tailed (ess)


Conclusions

  • T tests describe how unlikely it is that experimental differences are due to chance

  • The higher the t score, the smaller the p value, and the less likely the difference is due to chance

  • Can compare sample with population or 2 samples, paired or unpaired

  • ANOVA/F tests are similar but use variances instead of means and can be applied to more than 2 groups and other more complex scenarios


Acknowledgements

  • MfD slides 2004-2006

  • Van Belle, Biostatistics

  • Human Brain Function

  • Wikipedia


Correlation and Regression


Topics covered

  • Is there a relationship between x and y?

  • What is the strength of this relationship?

    • Pearson’s r

  • Can we describe this relationship and use it to predict y from x?

    • Regression

  • Is the relationship we have described statistically significant?

    • F- and t-tests

  • Relevance to SPM

    • GLM


Relationship between x and y

  • Correlation describes the strength and direction of a linear relationship between two variables

  • Regression tells you how well a certain independent variable predicts a dependent variable

  • CORRELATION ≠ CAUSATION

    • In order to infer causality: manipulate independent variable and observe effect on dependent variable


Scattergrams

[Figure: scattergrams of Y against X showing positive correlation, negative correlation and no correlation]


Variance vs. Covariance

  • Do two variables change together?

Variance ~ ΔX * ΔX

Covariance ~ ΔX * ΔY


Covariance

  • When X ↑ and Y ↑: cov(x,y) = pos.

  • When X ↑ and Y ↓: cov(x,y) = neg.

  • When no constant relationship: cov(x,y) = 0


Example Covariance

    xi    yi    xi - x̄    yi - ȳ    (xi - x̄)(yi - ȳ)
    0     3     -3        0         0
    2     2     -1        -1        1
    3     4     0         1         0
    4     0     1         -3        -3
    6     6     3         3         9

    x̄ = 3, ȳ = 3, Σ (xi - x̄)(yi - ȳ) = 7

What does this number tell us?
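The example can be reproduced with a few lines of Python, using the five (x, y) pairs from the example. The sketch assumes the sample covariance with an n - 1 divisor, and `covariance` is a hypothetical helper. Rescaling x shows why the raw number is hard to interpret: the covariance changes with the units even though the relationship does not.

```python
def covariance(xs, ys):
    """Sample covariance: sum of cross-products of deviations, divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cross = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    return cross / (n - 1)

x = [0, 2, 3, 4, 6]
y = [3, 2, 4, 0, 6]

cov_xy = covariance(x, y)                            # cross-products sum to 7, n - 1 = 4
cov_rescaled = covariance([10 * xi for xi in x], y)  # same data, x in different units

print(cov_xy, cov_rescaled)
```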


Example of how covariance value relies on variance


Pearson’s R

  • Covariance on its own does not really tell us anything, because its size depends on the units and spread of x and y

    • Solution: standardise this measure

  • Pearson’s R: standardise by adding the standard deviations to the equation: r = cov(x,y) / (sx sy)
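A small sketch of the standardisation step, assuming the usual definition r = cov(x,y) / (sx·sy). The n - 1 divisors cancel, so the code works directly with sums of squares. `pearson_r` is a hypothetical helper and the data are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance divided by the product of the standard deviations."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    syy = sum((b - y_bar) ** 2 for b in ys)
    # The (n - 1) factors in cov, sx and sy cancel, leaving this ratio.
    return sxy / math.sqrt(sxx * syy)

x = [0, 2, 3, 4, 6]
y = [3, 2, 4, 0, 6]

r = pearson_r(x, y)
r_rescaled = pearson_r([10 * xi for xi in x], y)  # changing units leaves r unchanged

print(round(r, 2), round(r_rescaled, 2))
```

Unlike the covariance, r always lies between -1 and 1 and does not change when either variable is rescaled.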


Basic assumptions

  • Normal distributions

  • Variances are constant and not zero

  • Independent sampling – no autocorrelations

  • No errors in the values of the independent variable

  • All causation in the model is one-way (not necessary mathematically, but essential for prediction)


Pearson’s R: degree of linear dependence


Limitations of r

  • r is actually r̂

    • r = true r of whole population

    • r̂ = estimate of r based on data

  • r is very sensitive to extreme values:


In the real world…

  • r is never 1 or –1

  • interpretations for correlations in psychological research (Cohen)

    Correlation   Negative          Positive
    Small         -0.29 to -0.10    0.10 to 0.29
    Medium        -0.49 to -0.30    0.30 to 0.49
    Large         -1.00 to -0.50    0.50 to 1.00


Regression

  • Correlation tells you if there is an association between x and y but it doesn’t describe the relationship or allow you to predict one variable from the other.

  • To do this we need REGRESSION!


Best-fit Line

  • Aim of linear regression is to fit a straight line, ŷ = ax + b, to data that gives best prediction of y for any value of x

  • This will be the line that minimises distance between data and fitted line, i.e. the residuals

[Figure: scatter plot with fitted line ŷ = ax + b; a = slope, b = intercept, ŷ = predicted value, yi = true value, ε = residual error]


Least Squares Regression

  • To find the best line we must minimise the sum of the squares of the residuals (the vertical distances from the data points to our line)

Model line: ŷ = ax + b

a = slope, b = intercept

Residual (ε) = y - ŷ

Sum of squares of residuals = Σ(y – ŷ)²

  • we must find values of a and b that minimise

    Σ(y – ŷ)²


Finding b

  • First we find the value of b that gives the min sum of squares

  • Trying different values of b is equivalent to shifting the line up and down the scatter plot

[Figure: fitted line shifted up and down the scatter plot for different values of b, with residuals ε]


Finding a

  • Now we find the value of a that gives the min sum of squares

  • Trying out different values of a is equivalent to changing the slope of the line, while b stays constant

[Figure: fitted line with different slopes a through the same intercept b]


Minimising sums of squares

  • Need to minimise Σ(y – ŷ)²

  • ŷ = ax + b

  • so need to minimise:

    Σ(y - ax - b)²

  • If we plot the sums of squares for all different values of a and b we get a parabola, because it is a squared term

  • So the min sum of squares is at the bottom of the curve, where the gradient is zero.

[Figure: sums of squares (S) plotted against values of a and b; min S lies where the gradient = 0]


The maths bit

  • So we can find a and b that give min sum of squares by taking partial derivatives of Σ(y - ax - b)² with respect to a and b separately

  • Then we set these to zero and solve to give us the values of a and b that give the min sum of squares
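The two steps above can be written out as follows. This is a sketch of the standard derivation rather than the slide's own working. The index i and the bar notation for means are introduced here, and the expression for a is obtained by substituting b = ȳ − a x̄ into the second condition.

```latex
\begin{aligned}
S(a,b) &= \sum_i (y_i - a x_i - b)^2 \\
\frac{\partial S}{\partial b} &= -2 \sum_i (y_i - a x_i - b) = 0
  \;\Rightarrow\; b = \bar{y} - a\bar{x} \\
\frac{\partial S}{\partial a} &= -2 \sum_i x_i (y_i - a x_i - b) = 0
  \;\Rightarrow\; a = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}
\end{aligned}
```

The ratio for a is the covariance of x and y divided by the variance of x, which can equivalently be written as r·sy/sx.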


The solution

  • Doing this gives the following equations for a and b:

    a = r sy / sx

    r = correlation coefficient of x and y

    sy = standard deviation of y

    sx = standard deviation of x

  • You can see that:

    • A low correlation coefficient gives a flatter slope (small value of a)

    • Large spread of y, i.e. high standard deviation, results in a steeper slope (high value of a)

    • Large spread of x, i.e. high standard deviation, results in a flatter slope (low value of a)


The solution cont.

  • Our model equation is ŷ = ax + b

  • This line must pass through the mean so:

    ȳ = ax̄ + b, i.e. b = ȳ – ax̄

  • We can put our equation for a into this giving:

    b = ȳ – (r sy / sx) x̄

    r = correlation coefficient of x and y

    sy = standard deviation of y

    sx = standard deviation of x

  • The smaller the correlation, the closer the intercept is to the mean of y
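A minimal sketch of the closed-form solution, taking a = r·sy/sx and b = ȳ − a·x̄ with sample standard deviations. `fit_line` is a hypothetical helper and the five data points are illustrative. The last line checks that the fitted line passes through the point of means.

```python
from statistics import mean, stdev

def fit_line(xs, ys):
    """Least squares line via a = r * sy / sx and b = y_bar - a * x_bar."""
    x_bar, y_bar = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    # Pearson's r from the sample covariance and standard deviations.
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys)) / (len(xs) - 1)
    r = cov / (sx * sy)
    slope = r * sy / sx
    intercept = y_bar - slope * x_bar
    return slope, intercept, r

x = [0, 2, 3, 4, 6]
y = [3, 2, 4, 0, 6]

a, b, r = fit_line(x, y)
print(f"y_hat = {a:.2f} x + {b:.2f} (r = {r:.2f})")

# The line passes through (x_bar, y_bar).
passes_through_mean = abs(a * mean(x) + b - mean(y)) < 1e-9
```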


Back to the model

  • We can calculate the regression line for any data, but the important question is:

    How well does this line fit the data, or how good is it at predicting y from x?


How good is our model?

  • Total variance of y:

    sy² = ∑(y – ȳ)² / (n - 1) = SSy / dfy

  • Variance of predicted y values (ŷ):

    sŷ² = ∑(ŷ – ȳ)² / (n - 1) = SSpred / dfŷ

This is the variance explained by our regression model

  • Error variance:

    serror² = ∑(y – ŷ)² / (n - 2) = SSer / dfer

This is the variance of the error between our predicted y values and the actual y values, and thus is the variance in y that is NOT explained by the regression model


How good is our model cont.

  • Total variance = predicted variance + error variance

    sy² = sŷ² + ser²

  • Conveniently, via some complicated rearranging

    sŷ² = r² sy²

    r² = sŷ² / sy²

  • so r² is the proportion of the variance in y that is explained by our regression model


How good is our model cont.

  • Insert r² sy² into sy² = sŷ² + ser² and rearrange to get:

    ser² = sy² – r² sy²

    = sy² (1 – r²)

  • From this we can see that the greater the correlation the smaller the error variance, so the better our prediction
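The partition can be checked numerically. The sketch below works with sums of squares (SSy, SSpred, SSer), for which the decomposition holds exactly for a least squares line; the variances are these sums divided by their degrees of freedom. `variance_partition` is a hypothetical helper and the data are illustrative.

```python
def variance_partition(xs, ys):
    """Return (SS_y, SS_pred, SS_er) for the least squares line through the data."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    sxx = sum((a - x_bar) ** 2 for a in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    y_hat = [slope * a + intercept for a in xs]
    ss_y = sum((b - y_bar) ** 2 for b in ys)                # total
    ss_pred = sum((p - y_bar) ** 2 for p in y_hat)          # explained by the model
    ss_er = sum((b - p) ** 2 for b, p in zip(ys, y_hat))    # unexplained (residual)
    return ss_y, ss_pred, ss_er

x = [0, 2, 3, 4, 6]
y = [3, 2, 4, 0, 6]

ss_y, ss_pred, ss_er = variance_partition(x, y)
r_squared = ss_pred / ss_y  # proportion of the variation in y explained by the line

print(f"SSy = {ss_y:.2f}, SSpred = {ss_pred:.2f}, SSer = {ss_er:.2f}, r^2 = {r_squared:.4f}")
```

For these data r is 0.35, so only about 12% of the variation in y is explained by the line and the rest remains in the error term.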


Is the model significant?

  • i.e. do we get a significantly better prediction of y from our regression equation than by just predicting the mean?

  • F-statistic:

    F(dfŷ, dfer) = sŷ² / ser² = ... (complicated rearranging) ... = r² (n - 2) / (1 – r²)

  • And it follows that (because F = t²):

    t(n-2) = r √(n - 2) / √(1 – r²)

So all we need to know are r and n!
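A short sketch of the two statistics as functions of r and n alone, taking F(1, n-2) = r²(n-2)/(1-r²) and t(n-2) = r√(n-2)/√(1-r²) for simple linear regression. The function names and the example values r = 0.35, n = 5 are illustrative.

```python
import math

def f_from_r(r, n):
    """F statistic with (1, n - 2) degrees of freedom for simple linear regression."""
    return r ** 2 * (n - 2) / (1 - r ** 2)

def t_from_r(r, n):
    """t statistic with n - 2 degrees of freedom; carries the sign of r."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r, n = 0.35, 5
f_value = f_from_r(r, n)
t_value = t_from_r(r, n)

print(f"F(1, {n - 2}) = {f_value:.4f}, t({n - 2}) = {t_value:.4f}, t^2 = {t_value ** 2:.4f}")
```

Squaring t recovers F, which is why the t-test on the slope and the F-test on the model give the same answer when there is a single predictor.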


General Linear Model

  • Linear regression is actually a form of the General Linear Model where the parameters are a, the slope of the line, and b, the intercept.

    y = ax + b +ε

  • A General Linear Model is just any model that describes the data in terms of a straight line


Multiple regression

  • Multiple regression is used to determine the effect of a number of independent variables, x1, x2, x3 etc., on a single dependent variable, y

  • The different x variables are combined in a linear way and each has its own regression coefficient:

    y = a1x1 + a2x2 + … + anxn + b + ε

  • The a parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y.

  • i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for
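As an illustration, the coefficients a1 … an and the intercept b can be estimated by solving the least squares normal equations. The sketch below is pure Python with a small Gauss-Jordan solver. The helper names, the two predictors and the noiseless y values are all invented for the example, and it assumes the predictors are not collinear.

```python
def solve(A, v):
    """Solve A w = v by Gauss-Jordan elimination with partial pivoting."""
    n = len(v)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [mr - factor * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def multiple_regression(columns, y):
    """Return [a1, ..., an, b] minimising the sum of squared residuals."""
    # Design matrix rows: one value per x variable, plus a constant 1 for the intercept b.
    rows = [list(vals) + [1.0] for vals in zip(*columns)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty)

# Invented predictors; y is generated without noise as 2*x1 - 1*x2 + 3.
x1 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [1.0, 0.0, 2.0, 1.0, 3.0, 0.0]
y = [2.0 * u - 1.0 * v + 3.0 for u, v in zip(x1, x2)]

a1, a2, b = multiple_regression([x1, x2], y)
print(f"a1 = {a1:.3f}, a2 = {a2:.3f}, b = {b:.3f}")
```

Because y was built from known coefficients, the fit should recover them up to rounding error; with real, noisy data each coefficient instead reflects the contribution of its x variable with the others held fixed.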


SPM

  • Linear regression is a GLM that models the effect of one independent variable, x, on ONE dependent variable, y

  • Multiple Regression models the effect of several independent variables, x1, x2 etc., on ONE dependent variable, y

  • Both are types of General Linear Model

  • GLM can also allow you to analyse the effects of several independent x variables on several dependent variables, y1, y2, y3 etc., in a linear combination

  • This is what SPM does and will be explained soon…