## Computing in Archaeology

Session 11. Correlation and regression analysis
© Richard Haddlesey www.medievalarchitecture.net

**Lecture aims**
• To introduce correlation and regression techniques

**The scattergram**
• In correlation, we are always dealing with paired scores, so the values of the two variables taken together are used to make a scattergram

**Example**
• Quantities of New Forest pottery recovered from sites at varying distances from the kilns

**Negative correlation**
Here we can see that the quantity of pottery decreases as distance from the source increases

**Positive correlation**
Here we see that the taller a pot, the wider the rim

**Curvilinear monotonic relation**
Again, the further from the source, the smaller the quantity of artefacts

**Arched relationship (non-monotonic)**
Here we see that the first molar grows with age and is then worn down as the animal gets older

**Scattergram**
• This shows us that scattergrams are the most important means of studying relationships between two variables

**REGRESSION**
• Regression differs from the other techniques we have looked at so far in that it is concerned not just with whether or not a relationship exists, or with the strength of that relationship, but with its nature
• In regression analysis we use an independent variable to estimate (or predict) the values of a dependent variable

**Regression equation**
$y = f(x)$
• y = y axis (in this case the dependent variable)
• f = function (of x)
• x = x axis

**y = f(x)**
$y = x$
$y = 2x$
$y = x^2$

**General linear equations**
• $y = a + bx$
• Where y is the dependent variable, x is the independent variable, and the coefficients a and b are constants, i.e. they are fixed for a given data set

**Therefore:**
• If x = 0 then the equation reduces to y = a, so a represents the point where the regression line crosses the y axis (the intercept)
• The constant b defines the slope, or gradient, of the regression line
• Thus, for pottery quantity in relation to distance from source, b represents the amount by which pottery quantity decreases for each unit of distance from the source

**CORRELATION**
• 1 correlation coefficient
• r
• -1 to +1
• 2 significance

**Levels of measurement:**
• nominal – in name only
• ordinal – forming a sequence
• interval – a sequence with fixed distances
• ratio – fixed distances with a datum point

**Levels of measurement and correlation coefficients:**
• nominal
• ordinal: Spearman’s Rank Correlation Coefficient
• interval and ratio: Product-Moment Correlation Coefficient

**The Product-Moment Correlation Coefficient**
Sample: 20 bronze spearheads, with length (cm) and width (cm) recorded for each (n = 20)

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^2 - (\sum x)^2\right]\left[n\sum y^2 - (\sum y)^2\right]}}$$

For the spearhead sample, r = +0.67

**Test of product moment correlation coefficient**
H0 : true correlation coefficient = 0
H1 : true correlation coefficient ≠ 0
Assumptions: both variables approximately random
Sample statistics needed: n and r
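
**Worked sketch (not part of the original slides)**
The product-moment formula for r and the linear equation y = a + bx can both be computed from the same five sums (Σx, Σy, Σxy, Σx², Σy²). The following is a minimal Python sketch under stated assumptions: the function names are illustrative, the least-squares formulas for a and b are the standard ones rather than anything shown on the slides, and the distance and quantity figures are made up for illustration (they are not the New Forest or spearhead data). The `t_statistic` helper is one common way to turn n and r into a test statistic with n − 2 degrees of freedom; the slides do not say which procedure the lecture used.

```python
import math


def _sums(xs, ys):
    """Return n and the five sums used by both formulas."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return n, sx, sy, sxy, sxx, syy


def pearson_r(xs, ys):
    """Product-moment correlation coefficient, using the slide's formula."""
    n, sx, sy, sxy, sxx, syy = _sums(xs, ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return numerator / denominator


def linear_fit(xs, ys):
    """Least-squares intercept a and slope b for y = a + bx (standard formulas)."""
    n, sx, sy, sxy, sxx, _ = _sums(xs, ys)
    b = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    a = (sy - b * sx) / n
    return a, b


def t_statistic(r, n):
    """Assumed approach: t = r * sqrt((n - 2) / (1 - r^2)), with n - 2 d.f."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


if __name__ == "__main__":
    # Hypothetical data: distance from kiln and sherd count at five sites.
    distance = [1, 2, 3, 4, 5]
    quantity = [10, 8, 7, 4, 2]
    r = pearson_r(distance, quantity)
    a, b = linear_fit(distance, quantity)
    print(f"r = {r:.2f}, a = {a:.2f}, b = {b:.2f}")
    print(f"t = {t_statistic(r, len(distance)):.2f}")
```

A negative b together with a negative r would correspond to the pottery case above, where quantity falls as distance from the source increases.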