
Chapter 13


marambula

Presentation Transcript


  1. Chapter 13 Multivariate Analysis BCB 702: Biostatistics http://hei.unige.ch/~elkhou99/imageSC7.JPG

  2. What is Multivariate Analysis? • Usually involves situations where there are two or more dependent (response) variables • Examines the relationships or interactions among these variables • Takes into account the fact that: • Variables may not be independent of each other • Performing multiple comparisons increases the risk of making a Type I error • Simply running a series of separate univariate tests ignores both problems and can give misleading results
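The Type I error point can be made concrete with a small calculation. The sketch below assumes the univariate tests are independent and all use the same significance level, which is a simplification; the function name `familywise_error_rate` is illustrative and not from the slides.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one Type I error across n independent tests,
    each run at significance level alpha (independence is assumed)."""
    return 1.0 - (1.0 - alpha) ** n_tests


for k in (1, 3, 10):
    print(k, round(familywise_error_rate(0.05, k), 3))
# prints:
# 1 0.05
# 3 0.143
# 10 0.401
```

With three response variables tested separately at 0.05, the chance of at least one false positive is already about 14% under these assumptions.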

  3. Types of Multivariate Tests Include: • Multivariate Analysis of Variance (MANOVA) • Discriminant Function Analysis (DFA) • Principal Components Analysis (PCA) • Factor Analysis • Cluster Analysis • Canonical Correlation Analysis • Multidimensional Scaling

  4. MANOVA • Extension of ANOVA • Examines two or more response variables • Combines the response variables into a single new variable that maximises the differences between the treatment group means • Yields a multivariate test statistic, most commonly Wilks’ lambda (a value between 0 and 1, where smaller values indicate larger group differences), which is converted to an approximate F value • If the overall test is significant, we can then go on to examine which of the individual variables contributed to the significant effect
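As a rough illustration of where Wilks' lambda comes from, the sketch below computes it for a one-way design as det(E) / det(E + H), where E is the within-group and H the between-group sums-of-squares-and-cross-products matrix. It assumes NumPy is available; the data are synthetic (three groups of 10 with three responses), not the values from any real study, and the conversion to an approximate F value is left out.

```python
import numpy as np


def wilks_lambda(X: np.ndarray, groups: np.ndarray) -> float:
    """Wilks' lambda for a one-way MANOVA.

    X has shape (n_samples, n_responses); groups holds one label per row.
    """
    grand_mean = X.mean(axis=0)
    p = X.shape[1]
    E = np.zeros((p, p))  # within-group (error) SSCP matrix
    H = np.zeros((p, p))  # between-group (hypothesis) SSCP matrix
    for g in np.unique(groups):
        Xg = X[groups == g]
        centered = Xg - Xg.mean(axis=0)
        E += centered.T @ centered
        d = (Xg.mean(axis=0) - grand_mean).reshape(-1, 1)
        H += len(Xg) * (d @ d.T)
    return float(np.linalg.det(E) / np.linalg.det(E + H))


# Synthetic data: group means differ in the 2nd and 3rd responses only.
rng = np.random.default_rng(0)
groups = np.repeat(np.array([0, 1, 2]), 10)
shift = np.array([[0.0, 0.0, 0.0], [0.0, 1.5, 1.0], [0.0, 3.0, 2.0]])
X = rng.normal(size=(30, 3)) + shift[groups]

lam = wilks_lambda(X, groups)
print(f"Wilks' lambda = {lam:.3f}")
```

A value close to 1 means the groups explain little of the multivariate variation; a value close to 0 means they explain most of it.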

  5. MANOVA: Example • A researcher has collected a certain species of lizard from three different island populations. Each island represents a different eco-zone. He wishes to test whether lizards from different islands differ in their morphology and abilities, so he collects 10 lizards from each island and measures their body length, limb length and running speed. • Independent variable: • Island of origin • Dependent variables: • Body length • Limb length • Running speed http://www.flickr.com/photos/wyscan/14739853/

  6. MANOVA: Example • From the analysis, we get: • The model shows a significant difference among lizards from the three islands (p < 0.001)

  7. MANOVA: Example • Limb length and running speed differ significantly between lizards from different islands. There is no significant difference in body length

  8. Discriminant Function Analysis • Discriminant Function Analysis (DFA) is used to determine which variables predict naturally occurring groups in data • Uses several independent variables and one non-metric (grouping) dependent variable • Effectively MANOVA in reverse • DFA organises the original independent variables into a set of canonical (discriminant) functions, which are linear combinations of the original variables

  9. Discriminant Function Analysis • The first canonical function explains the most between-group variation in the data set, the second canonical function explains the most of the variation that is left over, and so on • Three steps: • Look for an overall significant effect using a multivariate test (Wilks’ lambda) • Examine the independent variables individually for differences in mean by group • Classify observations into groups
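The construction of the canonical functions and the classification step can be sketched as follows. This is a minimal illustration assuming NumPy: the functions are taken as eigenvectors of W⁻¹B (W within-group, B between-group SSCP matrices), and observations are assigned to the nearest group centroid in canonical space. The names `fit_dfa` and `classify` and the data are hypothetical, and the significance test is omitted.

```python
import numpy as np


def fit_dfa(X, groups):
    """Return eigenvalues and coefficients of the canonical functions."""
    labels = np.unique(groups)
    n, p = X.shape
    grand_mean = X.mean(axis=0)
    W = np.zeros((p, p))  # pooled within-group SSCP
    B = np.zeros((p, p))  # between-group SSCP
    for g in labels:
        Xg = X[groups == g]
        centered = Xg - Xg.mean(axis=0)
        W += centered.T @ centered
        d = (Xg.mean(axis=0) - grand_mean).reshape(-1, 1)
        B += len(Xg) * (d @ d.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(W, B))
    eigvals, eigvecs = eigvals.real, eigvecs.real
    # At most min(p, number of groups - 1) functions carry information.
    order = np.argsort(eigvals)[::-1][: min(p, len(labels) - 1)]
    V = eigvecs[:, order]
    # Scale each function so its pooled within-group variance is 1.
    V = V / np.sqrt(np.sum(V * (W @ V), axis=0) / (n - len(labels)))
    return eigvals[order], V


def classify(X, groups, V):
    """Assign each row to the nearest group centroid in canonical space."""
    labels = np.unique(groups)
    scores = X @ V
    centroids = np.array([scores[groups == g].mean(axis=0) for g in labels])
    dists = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
    return labels[np.argmin(dists, axis=1)]


# Synthetic data: 3 groups of 20, 4 measured variables, well separated.
rng = np.random.default_rng(1)
groups = np.repeat(np.arange(3), 20)
means = np.array([[0.0, 0, 0, 0], [6.0, 0, 0, 0], [0.0, 6.0, 0, 0]])
X = rng.normal(size=(60, 4)) + means[groups]

eigvals, V = fit_dfa(X, groups)
predicted = classify(X, groups, V)
accuracy = float(np.mean(predicted == groups))
print("eigenvalues:", eigvals, "accuracy:", accuracy)
```

The accuracy here is measured on the same data used to fit the functions, so it is optimistic; a held-out sample or cross-validation would give a fairer estimate.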

  10. DFA: Example • Populations of a sunflower species grow at four sites (two in riparian habitat and two in serpentine habitat) that differ in soil chemistry and water availability. Various measures of soil chemistry were taken in order to determine which of these variables can be used to distinguish among sites. (Sambatti & Rice, 2006) • Independent variables: • Ca • Mg • P • Organic matter (OM) • pH • Dependent variable: • Site http://en.wikipedia.org/wiki/Image:Sunflowers.jpg

  11. DFA: Example (canonical centroid plot) • The overall model was significant (p < 0.001), meaning that sites differ in soil nutrients • First canonical axis: The riparian habitats (particularly R1) have more OM and a lower pH • Second canonical axis: The two serpentine habitats (S1 and S2) have lower levels of Ca and P and slightly higher levels of Mg than riparian sites

  12. Principal Components Analysis • The goal of PCA is to reduce complex data sets containing a large number of variables to a lower dimension in order to see the relationships among variables more clearly • It computes a new set of composite variables called principal components (PCs), each a linear combination of the original variables • Each PC explains a certain proportion of the variation in the data set, with PC1 explaining the most variation, PC2 the next most, and so on
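One common way to obtain principal components is an eigendecomposition of the covariance matrix, sketched below with NumPy. The data are synthetic (three variables driven by one hidden factor plus noise), and the function name `pca` is illustrative.

```python
import numpy as np


def pca(X):
    """Return explained-variance proportions, loadings, and PC scores."""
    Xc = X - X.mean(axis=0)                 # centre each variable
    cov = np.cov(Xc, rowvar=False)          # covariance matrix of variables
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]       # largest variance first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    scores = Xc @ eigvecs                   # observations in PC coordinates
    return explained, eigvecs, scores


# Synthetic data: 100 observations, 3 correlated variables.
rng = np.random.default_rng(2)
latent = rng.normal(size=(100, 1))
X = latent @ np.array([[1.0, 0.8, 0.6]]) + 0.3 * rng.normal(size=(100, 3))

explained, loadings, scores = pca(X)
print("proportion of variation per PC:", np.round(explained, 3))
```

If the variables are on very different scales, the same steps are usually applied to standardised variables, which is equivalent to working from the correlation matrix.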

  13. Factor Analysis • Similar to Principal Components Analysis • Used to uncover underlying trends and relationships in large and complex data sets • Works on a correlation matrix of variables • Combines original variables into a smaller set of factors • Variables are correlated with each other due to their correlation with a common factor

  14. Cluster Analysis • Cluster analysis encompasses a number of different methods • Used to organize or group data according to similarities • There is no real dependent variable; cluster analysis does not attempt to explain why groups (clusters) exist • Often used in species taxonomy
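As one example of the many clustering methods, the sketch below implements single-linkage agglomerative clustering in plain Python: every object starts as its own cluster, and the two closest clusters are merged repeatedly. The choice of single linkage, Euclidean distance, and the five example points are assumptions made for illustration.

```python
from math import dist


def single_linkage(points, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    Cluster-to-cluster distance is the smallest distance between any
    pair of members (single linkage). Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
print(single_linkage(points, 2))
# prints [[0, 1, 2], [3, 4]]
```

Recording the order and distance of each merge, rather than stopping at a fixed number of clusters, gives the information needed to draw a dendrogram.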

  15. Canonical Correlation Analysis • Used when variables fall naturally into two groups (a group of dependent variables and a group of independent variables) • Tries to determine if there are linear relationships between the two sets of variables • It creates functions for each group, such that the correlation between the functions of each group is maximised • In this way, a combination of variables from the first group predicts a combination of variables from the second group
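The canonical correlations themselves can be computed in a few lines. The sketch below assumes NumPy and uses a QR-then-SVD approach: each centred variable set is reduced to an orthonormal basis, and the singular values of the product of the two bases are the canonical correlations. The data and the function name `canonical_correlations` are illustrative, and the coefficients of the canonical functions are not computed.

```python
import numpy as np


def canonical_correlations(X, Y):
    """Canonical correlations between two sets of variables.

    X: (n, p) and Y: (n, q), both assumed to have full column rank.
    Returned in decreasing order.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)  # orthonormal basis for the X variables
    Qy, _ = np.linalg.qr(Yc)  # orthonormal basis for the Y variables
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return np.clip(s, 0.0, 1.0)  # guard against rounding just above 1


# Synthetic data: the first Y variable is an exact linear combination
# of two X variables; the second Y variable is unrelated noise.
rng = np.random.default_rng(3)
X = rng.normal(size=(50, 3))
Y = np.column_stack([X[:, 0] + X[:, 1], rng.normal(size=50)])

r = canonical_correlations(X, Y)
print("canonical correlations:", r)
```

In this constructed case the first canonical correlation is 1 up to rounding, because one combination of the X variables predicts one combination of the Y variables perfectly.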

  16. Multidimensional Scaling • Analyses pairwise similarities (or dissimilarities) between variables (objects) • Classical (metric) scaling is only applicable to continuous data; non-metric variants work from the rank order of the dissimilarities • Plots objects graphically to provide a visual representation of the pattern of proximity among them • Objects plotted close together are relatively similar to each other, while objects plotted far apart are relatively dissimilar
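A minimal sketch of classical (metric) multidimensional scaling is given below, assuming NumPy and a matrix of pairwise Euclidean distances as input. The squared distances are double-centred and the leading eigenvectors give the plotting coordinates; the function name `classical_mds` and the six random objects are illustrative.

```python
import numpy as np


def classical_mds(D, n_dims=2):
    """Coordinates whose pairwise distances approximate the matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    B = -0.5 * J @ (D ** 2) @ J              # double-centred squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def pairwise_distances(P):
    """Euclidean distance between every pair of rows of P."""
    return np.linalg.norm(P[:, None, :] - P[None, :, :], axis=2)


# Six objects with known 2-D positions; only their distances are passed on.
rng = np.random.default_rng(4)
points = rng.normal(size=(6, 2))
D = pairwise_distances(points)

coords = classical_mds(D, n_dims=2)
print(np.round(coords, 3))
```

The recovered coordinates match the original configuration only up to rotation, reflection, and translation, so it is the distances between plotted objects that carry meaning, not the axes.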
