What we Measure vs. What we Want to Know

"Not everything that counts can be counted, and not everything that can be counted counts." - Albert Einstein


Scales, Transformations, Vectors and Multi-Dimensional Hyperspace

  • All measurement is a proxy for what is really of interest - The Relationship between them

  • The scale of measurement and the scale of analysis and reporting are not always the same - Transformations

  • We often make measurements that are highly correlated - Multi-component Vectors



Gulls Variables


Scree Plot


Output

> summary(gulls.pca2)

Importance of components:

                          Comp.1     Comp.2     Comp.3
Standard deviation     1.8133342 0.52544623 0.47501980
Proportion of Variance 0.8243224 0.06921464 0.05656722
Cumulative Proportion  0.8243224 0.89353703 0.95010425

> gulls.pca2$loadings

Loadings:

        Comp.1 Comp.2 Comp.3 Comp.4
Weight  -0.505 -0.343  0.285  0.739
Wing    -0.490  0.852 -0.143  0.116
Bill    -0.500 -0.381 -0.742 -0.232
H.and.B -0.505 -0.107  0.589 -0.622
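As a reading aid for the printed output, the short Python sketch below checks two properties of the numbers shown: each column of loadings has length close to 1 and the columns are close to mutually orthogonal, and the cumulative proportion is the running sum of the proportions of variance. The values are copied from the output; the variable names are illustrative only, and small discrepancies are expected because the loadings are printed to three decimals.

```python
from itertools import combinations
from math import sqrt

# Loadings copied from the printed output (rows: Weight, Wing, Bill, H.and.B).
loadings = {
    "Comp.1": [-0.505, -0.490, -0.500, -0.505],
    "Comp.2": [-0.343, 0.852, -0.381, -0.107],
    "Comp.3": [0.285, -0.143, -0.742, 0.589],
    "Comp.4": [0.739, 0.116, -0.232, -0.622],
}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Each column should have length close to 1 (values are rounded to 3 d.p.).
norms = {name: sqrt(dot(col, col)) for name, col in loadings.items()}

# Each pair of columns should be close to orthogonal (dot product near 0).
cross = {(a, b): dot(loadings[a], loadings[b]) for a, b in combinations(loadings, 2)}

# The cumulative proportion is the running sum of the proportions of variance.
proportions = [0.8243224, 0.06921464, 0.05656722]
cumulative = []
running = 0.0
for p in proportions:
    running += p
    cumulative.append(running)

for name, n in norms.items():
    print(f"{name}: length {n:.3f}")
print("largest |dot product|:", round(max(abs(v) for v in cross.values()), 4))
print("cumulative proportions:", [round(c, 4) for c in cumulative])
```

All four loadings on Comp.1 are close to -0.5, so the first component is roughly an equally weighted combination of the four measurements.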


Bi-Plot




Indirect Gradient Analysis

  • Environmental gradients are inferred from species data alone

  • Three methods:

    • Principal Component Analysis - linear model

    • Correspondence Analysis - unimodal model

    • Detrended CA - modified unimodal model




PCA gradient - site/species biplot

[Figure legend: standard; biodynamic & hobby; nature]



Approaches

  • Use single responses in linear models of environmental variables

  • Use axes of a multivariate dimension reduction technique as responses in linear models of environmental variables

  • Constrain the multivariate dimension reduction into the factor space defined by the environmental variables



Constrained? Environmental Variables


Working with the Variability that we Can Explain

  • Start with all the variability in the response variables.

  • Replace the original observations with their fitted values from a model employing the environmental variables as explanatory variables (discarding the residual variability).

  • Carry out gradient analysis on the fitted values.
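The second step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used for the figures: the data are made up, there is a single environmental variable (`moisture`, a hypothetical name), and each species is fitted by ordinary least squares. It shows that the total variability splits into an explained part carried by the fitted values and a discarded residual part.

```python
from statistics import fmean


def fit_line(x, y):
    """Least-squares fitted values of y on a single explanatory variable x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [my + slope * (xi - mx) for xi in x]


def sum_sq(values):
    """Sum of squared deviations from the mean."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


# Hypothetical data: one environmental variable and two species at six sites.
moisture = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
species = {
    "sp_a": [2.0, 3.0, 5.0, 4.0, 7.0, 8.0],
    "sp_b": [9.0, 7.0, 8.0, 4.0, 5.0, 2.0],
}

# Replace the observations by their fitted values; the residuals are dropped.
fitted = {name: fit_line(moisture, y) for name, y in species.items()}

total = sum(sum_sq(y) for y in species.values())
explained = sum(sum_sq(f) for f in fitted.values())
residual = sum(
    sum((yi - fi) ** 2 for yi, fi in zip(species[name], fitted[name]))
    for name in species
)
print(f"explained share of total variability: {explained / total:.3f}")
```

With a single explanatory variable the fitted values of every species are linear functions of that variable, so a gradient analysis of the fitted values can recover only one axis; more explanatory variables allow more constrained axes.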


Unconstrained/Constrained

  • Unconstrained ordination axes correspond to the directions of the greatest variability within the data set.

  • Constrained ordination axes correspond to the directions of the greatest variability of the data set that can be explained by the environmental variables.


Direct Gradient Analysis

  • Environmental gradients are constructed from the relationship between species and environmental variables

  • Three methods:

    • Redundancy Analysis - linear model

    • Canonical (or Constrained) Correspondence Analysis - unimodal model

    • Detrended CCA - modified unimodal model


Dune Data Unconstrained


Dune Data Constrained


How Similar are Objects/Samples/Individuals/Sites?


Similarity approaches, or what do we mean by similar?


Different types of data

example

Continuous data : height

Categorical data

ordered (ordinal) : growth rate

very slow, slow, medium, fast, very fast

not ordered (nominal) : fruit colour

yellow, green, purple, red, orange

Binary data : fruit / no fruit


Different scales of measurement

example

Large Range : soil ion concentrations

Restricted Range : air pressure

Constrained : proportions

Large numbers : altitude

Small numbers : attribute counts

Do we standardise measurement scales to make them equivalent? If so what do we lose?
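One way to see what is at stake is to compare distances before and after standardising. The Python sketch below uses made-up values for two of the scales listed above, altitude (large numbers) and a proportion (constrained to 0 to 1); the variable names and data are assumptions for illustration. On the raw scale altitude dominates the Euclidean distance, while after converting each variable to z-scores both contribute comparably.

```python
from math import sqrt
from statistics import fmean, stdev

# Hypothetical sites: altitude in metres and a proportion between 0 and 1.
altitude = [120.0, 480.0, 950.0, 1400.0]
cover = [0.10, 0.90, 0.15, 0.85]


def zscores(values):
    """Centre on the mean and divide by the sample standard deviation."""
    m, s = fmean(values), stdev(values)
    return [(v - m) / s for v in values]


def euclidean(p, q):
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


raw = list(zip(altitude, cover))
std = list(zip(zscores(altitude), zscores(cover)))

# Distance from the first site to every other site, on each scale.
raw_d = [euclidean(raw[0], site) for site in raw[1:]]
std_d = [euclidean(std[0], site) for site in std[1:]]
print("raw:", [round(d, 2) for d in raw_d])
print("standardised:", [round(d, 2) for d in std_d])
```

In this example the nearest neighbour of the first site changes: on the raw scale it is the second site (closest in altitude), and after standardising it is the third site (similar cover). What is lost is any information carried by the original spread of each variable, since every variable is forced to the same variance.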


Similarity matrix

We define a similarity between units – like the correlation between continuous variables.

(also can be a dissimilarity or distance matrix)

A similarity can be constructed as an average of the similarities between the units on each variable.

(can use weighted average)

This provides a way of combining different types of variables.
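The averaging idea can be sketched as follows. This is an illustrative construction rather than a specific published coefficient: continuous variables are scored as one minus the absolute difference divided by an assumed observed range, categorical and binary variables are scored by simple matching, and the per-variable scores are combined in a (possibly weighted) average. All names and data are hypothetical.

```python
def continuous_sim(a, b, value_range):
    """1 when equal, 0 when as far apart as the observed range allows."""
    return 1.0 - abs(a - b) / value_range


def match_sim(a, b):
    """1 when the two categories (or binary states) agree, else 0."""
    return 1.0 if a == b else 0.0


def combined_similarity(u, v, ranges, weights=None):
    """Weighted average of per-variable similarities between units u and v."""
    names = list(u)
    weights = weights or {name: 1.0 for name in names}
    total = 0.0
    for name in names:
        if name in ranges:  # treated as a continuous variable
            s = continuous_sim(u[name], v[name], ranges[name])
        else:  # treated as a categorical or binary variable
            s = match_sim(u[name], v[name])
        total += weights[name] * s
    return total / sum(weights[name] for name in names)


# Hypothetical plants described by the variable types from the earlier slide.
plant_a = {"height": 1.2, "fruit_colour": "red", "has_fruit": True}
plant_b = {"height": 1.8, "fruit_colour": "red", "has_fruit": False}
ranges = {"height": 2.0}  # assumed observed range of height over all units

print(round(combined_similarity(plant_a, plant_b, ranges), 3))  # 0.567
```

Passing `weights`, for example `{"height": 2.0, "fruit_colour": 1.0, "has_fruit": 1.0}`, gives the weighted average mentioned above.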


Distance metrics

[Figure: points A and B, shown in two panels]

relevant for continuous variables:

Euclidean

city block or Manhattan

(also many other variations)
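For two points A and B the two metrics named above differ only in how the coordinate differences are combined. A minimal Python sketch, with made-up coordinates:

```python
from math import sqrt


def euclidean(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """City-block distance: sum of the absolute differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


A = (0.0, 0.0)
B = (3.0, 4.0)
print(euclidean(A, B))  # 5.0
print(manhattan(A, B))  # 7.0
```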


Similarity coefficients for binary data

[Figure: the four binary outcomes 0,0 / 1,0 / 0,1 / 1,1, shown in two panels]

simple matching

count if both units 0 or both units 1

Jaccard

count only if both units 1

(also many other variants, e.g. Bray-Curtis)

simple matching can be extended to categorical data
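The difference between the two coefficients is whether joint absences (0,0) count as agreement. A short Python sketch with made-up presence/absence records; returning 0.0 when neither unit has any presence is a convention chosen here, since the Jaccard coefficient is undefined in that case.

```python
def simple_matching(u, v):
    """Share of variables on which the two units agree (0,0 and 1,1 both count)."""
    agree = sum(1 for a, b in zip(u, v) if a == b)
    return agree / len(u)


def jaccard(u, v):
    """Joint presences divided by variables present in at least one unit."""
    both = sum(1 for a, b in zip(u, v) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(u, v) if a == 1 or b == 1)
    return both / either if either else 0.0  # assumed convention for all-zero pairs


# Hypothetical presence/absence of six species at two sites.
site_1 = [1, 1, 0, 0, 1, 0]
site_2 = [1, 0, 0, 0, 1, 1]

print(round(simple_matching(site_1, site_2), 3))  # 0.667
print(jaccard(site_1, site_2))                    # 0.5

# Simple matching works unchanged on categorical labels.
print(simple_matching(["red", "green"], ["red", "purple"]))  # 0.5
```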


A Distance Matrix


Uses of Distances

Distance/dissimilarity can be used to:

  • Explore dimensionality in data using principal coordinate analysis (PCO or PCoA)

  • Provide a basis for clustering/classification


UK Wet Deposition Network


Grouping methods


Cluster Analysis


Clustering methods

  • hierarchical

    • divisive

      • put everything together and split

      • monothetic / polythetic

    • agglomerative

      • keep everything separate and join the most similar points (classical cluster analysis)

  • non-hierarchical

    • k-means clustering
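As an illustration of the non-hierarchical branch, here is a minimal Python sketch of k-means in its usual alternating form: assign each point to the nearest centre, move each centre to the mean of its members, and repeat until nothing changes. The data and the hand-picked starting centres are assumptions; in practice the result depends on the starting centres, so several starts are usually tried.

```python
from math import dist
from statistics import fmean


def k_means(points, centres, max_iter=100):
    """Alternate assignment and update steps until the centres stop moving."""
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(max_iter):
        # Assignment step: index of the nearest centre for each point.
        labels = [
            min(range(len(centres)), key=lambda k: dist(p, centres[k]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for k, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centres.append(tuple(fmean(axis) for axis in zip(*members)))
            else:
                new_centres.append(old)  # an empty cluster keeps its old centre
        if new_centres == centres:
            break
        centres = new_centres
    return labels, centres


# Hypothetical two-dimensional data with two visible groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centres = k_means(points, centres=[points[0], points[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
```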


Agglomerative hierarchical

Single linkage or nearest neighbour

finds the minimum spanning tree:

shortest tree that connects all points

  • chaining can be a problem


Agglomerative hierarchical

Complete linkage or furthest neighbour

  • compact clusters of approximately equal size.

  • (makes compact groups even when none exist)


Agglomerative hierarchical

Average linkage methods

  • between single and complete linkage
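The three linkage rules differ only in how the distance between two clusters is summarised from the point-to-point distances: the minimum (single), the maximum (complete), or the mean (one common form of average linkage). The Python sketch below is a deliberately naive implementation for small data sets, with made-up one-dimensional points spaced at slowly increasing gaps so that the chaining behaviour of single linkage is visible.

```python
from itertools import combinations
from math import dist
from statistics import fmean

LINKAGES = {
    "single": min,      # nearest neighbour
    "complete": max,    # furthest neighbour
    "average": fmean,   # mean of all between-cluster distances
}


def agglomerate(points, linkage="single"):
    """Naive agglomerative clustering; returns merges as (members, height)."""
    combine = LINKAGES[linkage]
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        best = None
        for a, b in combinations(clusters, 2):
            d = combine([dist(points[i], points[j]) for i in a for j in b])
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((tuple(sorted(a + b)), d))
    return merges


# Hypothetical points on a line with gaps of 1.0, 1.1, 1.2 and 1.3.
points = [(0.0,), (1.0,), (2.1,), (3.3,), (4.6,)]
for name in LINKAGES:
    print(name, [members for members, _ in agglomerate(points, name)])
```

With these points single linkage adds one point at a time to a single growing cluster, which is the chaining problem noted above, while complete linkage first forms the separate groups (0, 1) and (2, 3).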


From Alexandria to Suez


Hierarchical Clustering


Hierarchical Clustering


Hierarchical Clustering


Building and testing models

Basically you just approach this in the same way as for multiple regression – so there are the same issues of variable selection, interactions between variables, etc.

However, the basis of any statistical test that relies on distributional assumptions is more problematic, so there is much greater use of randomisation tests and permutation procedures to evaluate the statistical significance of results.
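A permutation procedure can be sketched with a single response and a single environmental variable. The Python example below is an illustration on made-up data, not the test used for any result in these slides: it shuffles the response many times, recomputes the correlation each time, and reports how often a shuffled correlation is at least as extreme as the observed one. The function names, the fixed seed, and the choice of 999 permutations are assumptions.

```python
import random
from statistics import fmean


def correlation(x, y):
    """Pearson correlation coefficient."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=999, seed=1):
    """Share of arrangements with |r| at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(correlation(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(correlation(x, shuffled)) >= observed:
            hits += 1
    # The observed arrangement counts as one of the possible permutations.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: an environmental variable and a species abundance.
moisture = [1, 2, 3, 4, 5, 6, 7, 8]
abundance = [2, 3, 3, 5, 6, 6, 8, 9]

p = permutation_p_value(moisture, abundance)
print(f"observed r = {correlation(moisture, abundance):.3f}, permutation p = {p:.3f}")
```

Shuffling breaks any real link between response and environment while keeping both sets of values intact, so no distributional assumption is needed; the smallest p-value this procedure can return is 1 / (n_perm + 1).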


Some Examples


Part of Fig 4.


What Technique?


Raw Data


Linear Regression


Two Regressions


Principal Components


Models of Species Response

There are (at least) two models:-

Linear - species increase or decrease along the environmental gradient

Unimodal - species rise to a peak somewhere along the environmental gradient and then fall again
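The two shapes can be written as simple functions of position along the gradient. In the Python sketch below the linear model is a straight line; for the unimodal model a Gaussian-shaped curve is assumed, which is one common choice and not something these slides specify. The parameter names (`peak`, `optimum`, `tolerance`) and all values are illustrative.

```python
from math import exp


def linear_response(x, intercept, slope):
    """Abundance changes at a constant rate along the gradient."""
    return intercept + slope * x


def unimodal_response(x, peak, optimum, tolerance):
    """Assumed Gaussian shape: maximum `peak` at `optimum`, width `tolerance`."""
    return peak * exp(-((x - optimum) ** 2) / (2 * tolerance ** 2))


gradient = [0, 2, 4, 6, 8, 10]
linear = [linear_response(x, intercept=1.0, slope=0.5) for x in gradient]
unimodal = [unimodal_response(x, peak=10.0, optimum=4.0, tolerance=2.0) for x in gradient]

print("linear:  ", [round(v, 2) for v in linear])
print("unimodal:", [round(v, 2) for v in unimodal])
```

Over a short stretch of gradient on one side of the optimum, the unimodal curve is close to monotonic, which is why the choice between the two models depends on how much of the gradient has been sampled.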


Linear


Unimodal


Non-metric multidimensional scaling

NMDS maps the observed dissimilarities onto an ordination space by trying to preserve their rank order in a low number of dimensions (often 2) – but the solution is linked to the number of dimensions chosen

it is like a non-linear version of PCO

define a stress function and look for the mapping with minimum stress

(e.g. the sum of squared residuals from a monotonic regression of the distances in NMDS space on the original dissimilarities)

this needs an iterative process, so try many different starting points; convergence is not guaranteed
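The stress of one candidate mapping can be computed as in the Python sketch below. It assumes one common definition (Kruskal's stress-1, which divides the squared residuals by the sum of squared map distances before taking the square root) and fits the monotonic regression with the pool-adjacent-violators algorithm. It evaluates a given configuration only; the iterative search over configurations is not shown, and ties among dissimilarities are simply left in their input order.

```python
from math import sqrt


def isotonic_fit(values):
    """Pool adjacent violators: least-squares non-decreasing fit to `values`."""
    blocks = []  # each block is [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while neighbouring block means are out of order.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


def stress(dissimilarities, distances):
    """Stress-1 for paired lists of dissimilarities and map distances."""
    order = sorted(range(len(dissimilarities)), key=lambda i: dissimilarities[i])
    d = [distances[i] for i in order]
    d_hat = isotonic_fit(d)
    num = sum((a - b) ** 2 for a, b in zip(d, d_hat))
    den = sum(a ** 2 for a in d)
    return sqrt(num / den)


# Hypothetical values for four pairs of sites.
observed = [0.2, 0.5, 0.6, 0.9]
print(stress(observed, [1.0, 2.0, 3.0, 4.0]))            # 0.0, rank order preserved
print(round(stress(observed, [1.0, 3.0, 2.0, 4.0]), 4))  # 0.1291, one pair out of order
```

Because only the rank order of the dissimilarities enters the calculation, any mapping that preserves that order has zero stress, whatever the actual distances are.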


Procrustes rotation

used to compare two separate ordinations graphically
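For two-dimensional ordinations the best-fitting rotation has a closed form, which the Python sketch below uses. It is a simplified illustration under stated assumptions: both configurations are centred, only rotation is allowed (no scaling or reflection, which fuller Procrustes analyses usually permit), and the data are made up by rotating and shifting a known configuration so that the answer can be checked.

```python
from math import atan2, cos, degrees, radians, sin
from statistics import fmean


def centre(points):
    """Shift a configuration so that its centroid is at the origin."""
    cx = fmean(p[0] for p in points)
    cy = fmean(p[1] for p in points)
    return [(x - cx, y - cy) for x, y in points]


def rotate(points, theta):
    """Rotate points anticlockwise by theta radians about the origin."""
    c, s = cos(theta), sin(theta)
    return [(c * x - s * y, s * x + c * y) for x, y in points]


def procrustes_rotation(target, other):
    """Rotate `other` onto `target` (both 2-D) after centring both."""
    t, o = centre(target), centre(other)
    a = sum(tx * ox + ty * oy for (tx, ty), (ox, oy) in zip(t, o))
    b = sum(ty * ox - tx * oy for (tx, ty), (ox, oy) in zip(t, o))
    theta = atan2(b, a)  # angle minimising the summed squared distances
    fitted = rotate(o, theta)
    residual = sum(
        (tx - fx) ** 2 + (ty - fy) ** 2 for (tx, ty), (fx, fy) in zip(t, fitted)
    )
    return theta, fitted, residual


# Hypothetical ordinations: the second is the first rotated by -30 degrees and shifted.
ordination_1 = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
ordination_2 = [(x + 5.0, y - 2.0) for x, y in rotate(ordination_1, radians(-30))]

theta, fitted, residual = procrustes_rotation(ordination_1, ordination_2)
print(f"rotation = {degrees(theta):.1f} degrees, residual sum of squares = {residual:.6f}")
```

Plotting the centred target and the fitted points on the same axes, with a line joining each pair, gives the graphical comparison; the residual sum of squares summarises how far apart the two ordinations remain after rotation.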

