
Computing in Archaeology - PowerPoint PPT Presentation

Computing in Archaeology. Session 11. Correlation and regression analysis. © Richard Haddlesey www.medievalarchitecture.net. Lecture aims. To introduce correlation and regression techniques. The scattergram.

Presentation Transcript

Computing in Archaeology

Session 11. Correlation and regression analysis

• To introduce correlation and regression techniques

• In correlation, we are always dealing with paired scores, and so values of the two variables taken together will be used to make a scattergram

• Quantities of New Forest pottery recovered from sites at varying distances from the kilns

Here we can see that the quantity of pottery decreases as distance from the source increases

Here we see that the taller a pot, the wider the rim

Again, the further from the source, the smaller the quantity of artefacts

Arched relationship (non-monotonic)

Here we see that the first molar increases in size as the animal grows and is then worn down as the animal gets older

• This shows us that scattergrams are the most important means of studying relationships between two variables
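
As a small illustration of why the scattergram should be inspected first, the sketch below computes a linear correlation coefficient for an arched data set. The data values and the function name `pearson_r_deviations` are invented for this example and do not come from the lecture; the coefficient is computed from deviations about the means, which is one standard way of writing it.

```python
from math import sqrt

def pearson_r_deviations(xs, ys):
    """Linear correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Invented arched (non-monotonic) data: y rises, peaks at x = 5, then falls.
ages = [1, 2, 3, 4, 5, 6, 7, 8, 9]
heights = [16 - (age - 5) ** 2 for age in ages]

r_arched = pearson_r_deviations(ages, heights)
print(round(r_arched, 6))
```

For this symmetric arch the coefficient is essentially zero even though the two variables are clearly related, which is something only the scattergram would reveal.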

• Regression differs from other techniques we have looked at so far in that it is concerned not just with whether or not a relationship exists, or the strength of that relationship, but with its nature

• In regression analysis we use an independent variable to estimate (or predict) the values of a dependent variable

y = f(x)

• y = y axis (in this case the dependent variable)

• f = function (of x)

• x = x axis (the independent variable)

y = x, y = 2x, y = x²

• y = a + bx

• Where y is the dependent variable, x is the independent variable, and the coefficients a and b are constants, i.e. they are fixed for a given data set

• If x = 0 then the equation reduces to y = a, so a represents the point where the regression line crosses the y axis (the intercept)

• The b constant defines the slope (or gradient) of the regression line

• Thus for the pottery quantity in relation to distance from source, b represents the amount by which pottery quantity decreases for each unit increase in distance from the source
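
The sketch below shows one way the constants a and b might be estimated and then used for prediction. It assumes the usual least-squares estimates, which the slides above do not spell out; the function name `fit_line`, the distances (taken here to be in km) and the sherd counts are all invented for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: change in y for each unit increase in x.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: the value of y where the line crosses the y axis (x = 0).
    a = sum_y / n - b * sum_x / n
    return a, b

# Invented figures: distance from the kilns (km) and quantity of pottery.
distance_km = [0, 10, 20, 30, 40]
pottery_quantity = [100, 80, 60, 40, 20]

a, b = fit_line(distance_km, pottery_quantity)
predicted_at_25_km = a + b * 25  # use the independent variable to predict y
print(a, b, predicted_at_25_km)
```

With these made-up figures the slope b is negative, matching the decrease in pottery quantity with distance described above, and a is the estimated quantity at the source itself.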

• 1 correlation coefficient

• r

• -1 to +1

• 2 significance

• nominal – in name only

• ordinal – forming a sequence

• interval – a sequence with fixed distances

• ratio – fixed distances with a datum point

• nominal

• ordinal: Spearman’s Rank Correlation Coefficient

• interval and ratio: Product-Moment Correlation Coefficient
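
The choice of coefficient by level of measurement can be written as a small lookup. This is only a sketch: the function name `choose_coefficient` is hypothetical, and it assumes that the weaker of the two variables' levels decides which coefficient is appropriate.

```python
# Levels of measurement, ordered from weakest to strongest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

def choose_coefficient(level_x, level_y):
    """Suggest a correlation coefficient from the two levels of measurement."""
    # Assumption: the weaker of the two levels governs the choice.
    weakest = min(LEVELS.index(level_x), LEVELS.index(level_y))
    if weakest >= LEVELS.index("interval"):
        return "product-moment correlation coefficient"
    if weakest == LEVELS.index("ordinal"):
        return "Spearman's rank correlation coefficient"
    return None  # nominal data: neither coefficient covered here applies

print(choose_coefficient("ratio", "ratio"))
print(choose_coefficient("ratio", "ordinal"))
```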

Correlation Coefficient

Variables: length (cm) and width (cm); n = 20

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}} = +0.67$$
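
The formula for r can be followed step by step in code. The sketch below uses five invented length and width measurements, not the twenty values from the lecture, and the function name `product_moment_r` is an assumption made for this example.

```python
from math import sqrt

def product_moment_r(xs, ys):
    """Product-moment correlation coefficient from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Invented measurements in cm, for illustration only.
lengths = [2, 4, 6, 8, 10]
widths = [1, 3, 2, 5, 4]

r_example = product_moment_r(lengths, widths)
print(r_example)
```

The result always lies between -1 and +1; a positive value such as this one means that longer objects tend also to be wider.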

H0 : true correlation coefficient = 0

H1 : true correlation coefficient ≠ 0

Assumptions: both variables approximately normally distributed

Sample statistics needed: n and r

Test statistic: TS = r

Table: product moment correlation coefficient table.
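
The table lookup in this test can be sketched in code. Instead of a printed product-moment table, the example derives the critical value of r from a critical t value with n - 2 degrees of freedom, which is an equivalent route. The 5% two-tailed significance level is an assumption (the slides do not state a level), the value 2.101 is taken from standard t tables for 18 degrees of freedom, and the function names are hypothetical.

```python
from math import sqrt

def critical_r(t_critical, n):
    """Convert a critical t value (n - 2 degrees of freedom) into a critical r."""
    df = n - 2
    return t_critical / sqrt(t_critical ** 2 + df)

def is_significant(r, n, t_critical):
    """Reject H0 (true correlation = 0) when |r| exceeds the critical value."""
    return abs(r) > critical_r(t_critical, n)

n = 20
r = 0.67
T_CRIT_5PCT_DF18 = 2.101  # assumed level: two-tailed 5%, 18 degrees of freedom

r_crit = critical_r(T_CRIT_5PCT_DF18, n)
reject_h0 = is_significant(r, n, T_CRIT_5PCT_DF18)
print(round(r_crit, 3), reject_h0)
```

Under these assumptions the test statistic r = +0.67 exceeds the critical value, so H0 would be rejected at the 5% level.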

H0 : true correlation coefficient = 0

H1 : true correlation coefficient ≠ 0

Assumptions: both variables at least ordinal

Sample statistics needed: n and rs

Test statistic: TS = rs

Table: Spearman’s rank correlation coefficient table.
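
The sample statistic rs needed for this test can be computed as in the sketch below. It assumes the common formula rs = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of each pair, and it assumes there are no tied values. The formula is not given on the slides above, and the data and function names are invented for illustration.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman's rank correlation coefficient for untied data."""
    n = len(xs)
    # d is the difference between the two ranks of each pair.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Invented measurements in cm, for illustration only.
lengths = [10, 12, 15, 18, 21]
widths = [5, 4, 8, 7, 9]

rs_example = spearman_rs(lengths, widths)
print(rs_example)
```

Because only the ranks are used, the same code applies to any pair of variables measured at the ordinal level or above.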