Correlation and percentages. Association between variables can be explored using counts: are high counts of bone needles associated with high counts of end scrapers? Similar questions can be asked using percent-standardized data.
matrix(round(rnorm(100, 50, 15)), nrow=10)
[Figure: panels titled "%s (10 vars.)", "%s (5 vars.)", "%s (3 vars.)", "%s (2 vars.)"]
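A minimal sketch of what the percentage panels may be illustrating, assuming they compare correlations among percent-standardized random data as the number of variables shrinks. The seed, the helper `mean_pct_cor`, and the choice of the mean off-diagonal correlation as a summary are assumptions made for illustration, not taken from the slides. One fact holds regardless of the data: with only two variables the percentages must sum to 100, so their correlation is exactly -1.

```r
# Closed-sum sketch: independent random "counts" converted to row percentages.
set.seed(1)                                    # arbitrary seed (assumption)
counts <- matrix(round(rnorm(100, 50, 15)), nrow = 10)

# Hypothetical helper: mean off-diagonal correlation among the first k
# columns, after re-expressing each row as percentages of its own total.
mean_pct_cor <- function(m, k) {
  sub <- m[, 1:k, drop = FALSE]
  pct <- 100 * sub / rowSums(sub)              # each row now sums to 100
  r <- cor(pct)
  mean(r[upper.tri(r)])
}

pct_cors <- sapply(c(10, 5, 3, 2), function(k) mean_pct_cor(counts, k))
names(pct_cors) <- c("10 vars", "5 vars", "3 vars", "2 vars")
print(round(pct_cors, 2))
```

The underlying counts are independent by construction, so any consistently negative correlation among the percentages comes from the standardization, not from the data.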
> hull1 <- chull(x, y)
> plot(x, y)
> polygon(x[hull1], y[hull1])
> abline(lm(y[-hull1] ~ x[-hull1]))
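The hull-removal idea can be tried on simulated data. This is a sketch only: `x` and `y` here are made-up values, and the names `fit_all`, `fit_peel`, `xin`, and `yin` are introduced for illustration. It compares the regression on all points with the regression after dropping the points on the convex hull (the outermost points).

```r
# Simulated data (assumption: the slides' x and y are not available here)
set.seed(2)
x <- rnorm(50, 100, 20)
y <- 10 + 0.5 * x + rnorm(50, 0, 10)

hull1 <- chull(x, y)            # indices of the points on the convex hull

fit_all <- lm(y ~ x)            # regression using every point
xin <- x[-hull1]                # points remaining after removing the hull
yin <- y[-hull1]
fit_peel <- lm(yin ~ xin)       # regression without the hull points

plot(x, y)
polygon(x[hull1], y[hull1])     # outline the hull
abline(fit_all, lty = 2)        # dashed: all points
abline(fit_peel)                # solid: hull points removed

print(rbind(all = coef(fit_all), peeled = coef(fit_peel)))
```

If the two lines differ noticeably, the fit is being driven by a few extreme points.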
old.par <- par(no.readonly = TRUE)
plot(DIST, DENSITY, log="y")
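A sketch of why a log-scaled y-axis can help, using simulated values in place of the DIST and DENSITY data (which are not reproduced here). The exponential-decay form and its parameters are assumptions chosen only so that the example has something to show; `fit_log` is an illustrative name.

```r
# Simulated stand-ins for DIST and DENSITY (assumption: exponential decay)
set.seed(3)
DIST <- seq(1, 50, length.out = 40)
DENSITY <- 200 * exp(-0.08 * DIST) * exp(rnorm(40, 0, 0.2))

old.par <- par(no.readonly = TRUE)   # save the current graphics settings
par(mfrow = c(1, 2))                 # two plots side by side
plot(DIST, DENSITY)                  # raw scale
plot(DIST, DENSITY, log = "y")       # log-scaled y-axis
par(old.par)                         # restore the saved settings

# If the log-scale plot looks linear, regress log(DENSITY) on DIST
fit_log <- lm(log(DENSITY) ~ DIST)
print(coef(fit_log))
```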
half of the variation is explained by the regression…
half of the variation in y is explained by variation in x…
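The "variation explained" reading of r² can be checked directly. The sketch below uses simulated data (an assumption; any x and y would do) and computes r² by hand as 1 minus the ratio of residual to total sum of squares, then compares it with the value reported by `lm` and with the squared correlation coefficient.

```r
# Simulated data for illustration only
set.seed(4)
x <- rnorm(60, 50, 10)
y <- 5 + 0.7 * x + rnorm(60, 0, 7)

fit <- lm(y ~ x)

ss_total <- sum((y - mean(y))^2)     # total variation in y
ss_resid <- sum(resid(fit)^2)        # variation left over after the regression
r2_by_hand <- 1 - ss_resid / ss_total

print(c(by_hand     = r2_by_hand,
        from_lm     = summary(fit)$r.squared,
        cor_squared = cor(x, y)^2))
```

For a simple regression with one predictor, all three values agree.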
Basin of Mexico
How are these variables related?
Do any make sense as dependent or independent variables?
r² = .75
y = 35.4 + .66x
SIZE = 35.38 + .66*AGLAND
> resSize <- frmdat$size - (35.4 + .66 * frmdat$agland)
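Residuals computed from hand-typed coefficients are only as precise as the rounding. A sketch of the same calculation taken from the fitted model follows; the `frmdat` built here is simulated (an assumption, since the Basin of Mexico data are not included), and `lmSize`, `b`, and `resByHand` are illustrative names.

```r
# Simulated stand-in for frmdat (assumption: not the real survey data)
set.seed(5)
frmdat <- data.frame(agland = runif(30, 10, 200))
frmdat$size <- 35.4 + 0.66 * frmdat$agland + rnorm(30, 0, 20)

lmSize <- lm(size ~ agland, data = frmdat)
b <- coef(lmSize)                    # unrounded intercept and slope

# Residual = observed value minus the value predicted by the line
resByHand <- frmdat$size - (b[1] + b[2] * frmdat$agland)
resSize <- resid(lmSize)             # the same thing, extracted directly

print(summary(resSize))
```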
SIZE = -29 + 98 * PROD
r² = .69
What have we “explained” about site size??
1 = total variance observed in dependent variable (x0)
variance in x0 explained by x1, by itself…
variance in x0 unexplained by x1…
variance in x0 explained by x2, by itself…
variance in x0 unexplained by x2…
(total variance in x0 explained by x1, that is not explained by x2…)
partial correlation coefficient:
proportion of variance in x0 explained by x1, that is not explained by x2…
variance in x0 explained by x1 and x2, both separately, and together…
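The partial correlation described above can be computed two ways, sketched here on simulated x0, x1, and x2 (the data and the names `r01.2`, `r01.2_resid`, and `share` are assumptions for illustration). The first uses the standard first-order formula; the second correlates what is left of x0 and x1 after x2 has been regressed out of each. The squared partial correlation equals the share of the variance in x0 left unexplained by x2 that x1 then accounts for.

```r
# Simulated variables (assumption): x1 and x2 are correlated predictors of x0
set.seed(6)
n  <- 200
x2 <- rnorm(n)
x1 <- 0.6 * x2 + rnorm(n)
x0 <- 0.5 * x1 + 0.4 * x2 + rnorm(n)

r01 <- cor(x0, x1)
r02 <- cor(x0, x2)
r12 <- cor(x1, x2)

# Partial correlation of x0 and x1, controlling for x2 (formula)
r01.2 <- (r01 - r02 * r12) / sqrt((1 - r02^2) * (1 - r12^2))

# Same quantity from residuals: remove x2 from both x0 and x1, then correlate
r01.2_resid <- cor(resid(lm(x0 ~ x2)), resid(lm(x1 ~ x2)))

# Squared partial correlation as a share of the variance x2 left unexplained
R2_full <- summary(lm(x0 ~ x1 + x2))$r.squared
share <- (R2_full - r02^2) / (1 - r02^2)

print(c(formula = r01.2, residuals = r01.2_resid, squared = r01.2^2, share = share))
```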
SIZE = -1.8 + .42*AGLAND + 50*PROD
prod = productivity index
Bprod <- (prod - mean(prod)) / sd(prod)
lmBeta <- lm(Bsize ~ Bagland + Bprod)
size = .55*agland + .43*prod
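A sketch of the standardized (beta) regression, assuming Bsize and Bagland are z-scores built the same way as Bprod. The data are simulated, and the helper `z` and the names `lmRaw` and `betaFromRaw` are introduced for illustration. It also shows that each beta weight is the raw slope rescaled by sd(predictor) / sd(response), which is why beta weights can be compared across predictors measured in different units.

```r
# Simulated stand-ins for size, agland and prod (assumption)
set.seed(7)
n <- 40
agland <- runif(n, 10, 200)
prod   <- runif(n, 0.3, 1.2)
size   <- -1.8 + 0.42 * agland + 50 * prod + rnorm(n, 0, 15)

# Hypothetical helper: convert a variable to z-scores
z <- function(v) (v - mean(v)) / sd(v)
Bsize   <- z(size)
Bagland <- z(agland)
Bprod   <- z(prod)

lmRaw  <- lm(size ~ agland + prod)         # slopes in original units
lmBeta <- lm(Bsize ~ Bagland + Bprod)      # slopes in standard-deviation units

# Beta weights recovered from the raw slopes
betaFromRaw <- coef(lmRaw)[-1] * c(sd(agland), sd(prod)) / sd(size)

print(rbind(from_lmBeta = coef(lmBeta)[-1], from_raw = betaFromRaw))
```

The intercept of the standardized model is zero (up to rounding error), which is why the standardized equation is written without one.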