Plotting Multivariate Data

Harry R. Erwin, PhD

School of Computing and Technology

University of Sunderland


Resources

  • Everitt, B S, and G Dunn (2001), Applied Multivariate Data Analysis, London: Arnold.

  • Everitt, B S (2005), An R and S-PLUS® Companion to Multivariate Analysis, London: Springer.


Edward Tufte’s Recommendations

  • Show the data

  • Induce the viewer to think about the substance of the data

  • Avoid distorting what the data have to say

  • Present many numbers in a small space

  • Make large data sets coherent

  • Encourage comparison

  • Reveal the data at several levels of detail

  • Serve a clear purpose

  • Be closely integrated with the statistical and verbal descriptions of the data

    • Tufte, E R (2001), The Visual Display of Quantitative Information, Graphics Press.


Tufte’s Points

  • Graphics reveal data.

  • Graphics can be more precise and revealing than conventional statistics.

  • Anscombe’s data

    • Anscombe, F J (1973) “Graphs in Statistical Analysis”, American Statistician, 27:17-21.

  • All four data sets are described by the same linear model.


The Anscombe Graphics
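
The slide's figure is not reproduced in this transcript. A minimal sketch of how it could be redrawn, assuming the `anscombe` data frame that ships with base R (columns x1..x4 and y1..y4), might look like this. The axis limits and line style are illustrative choices, not taken from the slide.

```r
# 'anscombe' ships with base R (package datasets): columns x1..x4, y1..y4.
fits <- lapply(1:4, function(i) {
  f <- reformulate(paste0("x", i), response = paste0("y", i))
  lm(f, data = anscombe)
})

# One row per data set; the four fitted lines are essentially identical.
coefs <- t(sapply(fits, coef))
colnames(coefs) <- c("intercept", "slope")
print(round(coefs, 2))

# Four scatterplots on common axes, each with its own fitted line.
op <- par(mfrow = c(2, 2))
for (i in 1:4) {
  plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
       xlab = paste0("x", i), ylab = paste0("y", i),
       xlim = c(3, 20), ylim = c(2, 14))
  abline(fits[[i]], lty = 2)
}
par(op)
```

The coefficient table shows why the summary statistics alone are not enough: only the plots reveal how different the four data sets are.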


Ways of Looking at Data

  • Scatterplots

    • Demonstration

  • “The convex hull of bivariate data”

    • Demonstration

  • Chiplot

    • Demonstration

  • Bivariate Boxplot

    • Demonstration


And More Multivariate Graphics

  • Bivariate Densities

    • Demonstration

  • Other Variables in a Scatterplot

    • Demonstration

  • Scatterplot Matrix

    • Demonstration of pairs

  • 3-D Plots

    • Demonstration

  • Conditioning Plots

    • Demonstration


Demonstration

  • Launch R

  • Set the working directory to Statistics/RSPCMA/Data

  • airpoll<-source("chap2airpoll.dat")$value

  • Review exercises on pages 19-22


Convex Hull of Bivariate Data

  • Scatterplots are often examined alongside the correlation coefficient of two variables.

  • They help detect outliers, which can distort the coefficient.

  • Convex hull trimming (dropping the points on the convex hull before computing the correlation) gives a robust estimate of the correlation coefficient.

  • Demonstration

    • attach(airpoll)

    • cor(SO2, Mortality)


Robust Estimation of the Correlation

  • hull<-chull(SO2, Mortality) # finds the convex hull

  • plot(SO2, Mortality, pch=1)

  • polygon(SO2[hull],Mortality[hull], density=15, angle=30)

  • cor(SO2[-hull],Mortality[-hull])

  • Here the trimmed and untrimmed correlations are almost identical, which is unusual.
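
The airpoll file is not bundled with R, so the following is a self-contained sketch of the same convex hull trimming on simulated data. The helper name `hull_trimmed_cor`, the seed, and the two artificial outliers are assumptions made for illustration; only `chull` and `cor` come from the slides.

```r
# Correlation after discarding the points that lie on the convex hull.
hull_trimmed_cor <- function(x, y) {
  hull <- chull(x, y)          # indices of the hull vertices
  cor(x[-hull], y[-hull])
}

# Simulated stand-in data (an assumption; these are not the airpoll values).
set.seed(42)
x <- rnorm(60)
y <- 0.5 * x + rnorm(60)

# Two artificial outliers that pull against the underlying trend.
x <- c(x, 6, -6)
y <- c(y, -6, 6)

r_all  <- cor(x, y)
r_trim <- hull_trimmed_cor(x, y)
cat("all points:", round(r_all, 3), " hull removed:", round(r_trim, 3), "\n")

# Same picture as on the slide: scatterplot with the hull shaded.
hull <- chull(x, y)
plot(x, y, pch = 1)
polygon(x[hull], y[hull], density = 15, angle = 30)
```

Because extreme points necessarily sit on the hull, removing the hull is a cheap way to see how much a few points drive the coefficient.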


Chiplot

  • A way of augmenting the scatterplot to spot dependence/independence.

  • See Statistics/RSPCMA/functions.txt

  • chiplot(SO2,Mortality,vlabs=c("SO2", "Mortality"))

  • For independent data, the points will be scattered in a horizontal band centered around 0.

  • Departure from independence here is shown by the points missing from (-0.25,0.25).
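
The `chiplot` function comes from the course's functions file and is not part of base R. As a rough sketch of what a chi-plot computes, the code below follows the usual Fisher and Switzer construction on simulated independent data. The helper name `chi_plot_stats` is hypothetical, and the 1.78/sqrt(n) guide lines are the approximate 95% limits usually quoted for this plot; treat both as assumptions rather than as the course's implementation.

```r
# Chi-plot statistics: for each point, compare the joint empirical
# distribution with the product of the two marginal ones.
chi_plot_stats <- function(x, y) {
  n <- length(x)
  Fx <- Gy <- Hxy <- numeric(n)
  for (i in seq_len(n)) {
    others <- setdiff(seq_len(n), i)
    Fx[i]  <- mean(x[others] <= x[i])
    Gy[i]  <- mean(y[others] <= y[i])
    Hxy[i] <- mean(x[others] <= x[i] & y[others] <= y[i])
  }
  s      <- sign((Fx - 0.5) * (Gy - 0.5))
  lambda <- 4 * s * pmax((Fx - 0.5)^2, (Gy - 0.5)^2)
  chi    <- (Hxy - Fx * Gy) / sqrt(Fx * (1 - Fx) * Gy * (1 - Gy))
  keep   <- is.finite(chi)     # drops points where the denominator is zero
  data.frame(lambda = lambda[keep], chi = chi[keep])
}

# Simulated independent data (an assumption, not the airpoll values).
set.seed(1)
x <- rnorm(100)
y <- rnorm(100)
st <- chi_plot_stats(x, y)

plot(st$lambda, st$chi, xlim = c(-1, 1), ylim = c(-1, 1),
     xlab = "lambda", ylab = "chi")
abline(h = c(-1, 1) * 1.78 / sqrt(length(x)), lty = 2)  # approximate guide lines
```

For independent data most points should fall between the dashed lines; a systematic drift above or below them suggests dependence.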


Bivariate Boxplot

  • Two-dimensional analogue of the boxplot

  • A pair of concentric ellipses: the inner ellipse (the “hinge”) holds half the data, and the outer ellipse (the “fence”) identifies outliers.

  • Regression lines of x on y and y on x are shown.

    • bvbox(cbind(SO2,Mortality), xlab="SO2", ylab="Mortality")

  • Cleaned up (more robust):

    • bvbox(cbind(SO2,Mortality), xlab="SO2", ylab="Mortality", method="O")
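
The `bvbox` function is defined in the course's functions file, not in base R. To illustrate the idea of a hinge and a fence, here is a simplified, non-robust stand-in that draws two concentric ellipses from the sample mean and covariance on simulated data. This is not the bvbox algorithm: the helper `ellipse_points`, the chi-square radii, and the 99% fence cut-off are all assumptions chosen for the sketch, and the regression lines are omitted.

```r
# Points on the ellipse {p : mahalanobis(p, center, cov_mat) = radius^2}.
ellipse_points <- function(center, cov_mat, radius, n = 200) {
  theta <- seq(0, 2 * pi, length.out = n)
  unit_circle <- cbind(cos(theta), sin(theta))
  # chol() returns upper-triangular R with t(R) %*% R == cov_mat
  pts <- radius * unit_circle %*% chol(cov_mat)
  sweep(pts, 2, center, "+")
}

# Simulated correlated data (an assumption, not the airpoll values).
set.seed(7)
xy <- cbind(x = rnorm(80), y = rnorm(80))
xy[, "y"] <- 0.6 * xy[, "x"] + 0.8 * xy[, "y"]

center  <- colMeans(xy)
cov_mat <- cov(xy)

# Under bivariate normality, squared Mahalanobis distance is chi-square(2).
r_hinge <- sqrt(qchisq(0.50, df = 2))   # holds about half the data
r_fence <- sqrt(qchisq(0.99, df = 2))   # assumed cut-off, not bvbox's rule

hinge <- ellipse_points(center, cov_mat, r_hinge)
fence <- ellipse_points(center, cov_mat, r_fence)

plot(xy, xlim = range(fence[, 1], xy[, 1]), ylim = range(fence[, 2], xy[, 2]))
lines(hinge)
lines(fence, lty = 2)

# Flag points outside the fence as candidate outliers.
outside <- mahalanobis(xy, center, cov_mat) > r_fence^2
points(xy[outside, , drop = FALSE], pch = 19)
```

A robust version would replace the sample mean and covariance with estimates that are less sensitive to the outliers the plot is meant to find.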


Bivariate Densities

  • The goal of examining a scatterplot is to identify clusters and outliers.

  • Humans are not particularly good at this, so graphical aids help.

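  • Overlaying a bivariate density estimate on the scatterplot makes clusters and outliers easier to see.

  • Histograms are too rough for this, though; a smoothed density estimate works better.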


Demo of Bivariate Density

  • den1<-bivden(SO2,Mortality)

  • persp(den1$seqx, den1$seqy, den1$den, xlab="SO2", ylab="Mortality", zlab="Density", lwd=2)

  • plot(SO2, Mortality)

  • contour(den1$seqx, den1$seqy, den1$den, lwd=2, nlevels=20, add=TRUE)
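
The `bivden` function comes from the course's functions file. For readers without it, a product Gaussian kernel estimate can be sketched in base R as below. The helper name `kde2d_sketch`, the use of `bw.nrd` bandwidths, the grid size, and the built-in `faithful` data are all assumptions standing in for the course material.

```r
# Bivariate kernel density estimate with a product Gaussian kernel,
# evaluated on an n-by-n grid that extends 3 bandwidths past the data.
kde2d_sketch <- function(x, y, n = 50, hx = bw.nrd(x), hy = bw.nrd(y)) {
  gx <- seq(min(x) - 3 * hx, max(x) + 3 * hx, length.out = n)
  gy <- seq(min(y) - 3 * hy, max(y) + 3 * hy, length.out = n)
  # kx[a, i] is the kernel centred on x[i], evaluated at grid point gx[a]
  kx <- outer(gx, x, function(g, xi) dnorm(g, mean = xi, sd = hx))
  ky <- outer(gy, y, function(g, yi) dnorm(g, mean = yi, sd = hy))
  list(x = gx, y = gy, z = kx %*% t(ky) / length(x))
}

# Built-in 'faithful' data used as a stand-in for airpoll (an assumption).
den <- kde2d_sketch(faithful$eruptions, faithful$waiting)

# Perspective view of the estimated density surface.
persp(den$x, den$y, den$z, xlab = "eruptions", ylab = "waiting",
      zlab = "Density", theta = 30, phi = 30)

# Scatterplot with density contours overlaid.
plot(faithful$eruptions, faithful$waiting,
     xlab = "eruptions", ylab = "waiting")
contour(den$x, den$y, den$z, nlevels = 10, add = TRUE)
```

The contours make the two clusters in this data set much easier to pick out than the raw points alone.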


Adding a Third Variable to the Scatterplot

  • The bubble plot

  • plot(SO2, Mortality, pch=1, lwd=2, ylim=c(700,1200), xlim=c(-5,300)) # basic scatterplot.

  • symbols(SO2, Mortality, circles=Rainfall, inches=0.4, add=TRUE, lwd=2) # adding Rainfall to each point.
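
A self-contained variant of the bubble plot, assuming the built-in `trees` data (Girth, Height, Volume) in place of airpoll, could look like this. The square-root scaling is a design choice added here, not something the slides do: `circles` sets the radius, so taking the square root makes the bubble area, rather than the radius, proportional to the third variable.

```r
# 'trees' ships with base R: Girth, Height and Volume of 31 cherry trees.
# circles= gives radii, so sqrt() makes bubble AREA proportional to Volume.
radius <- sqrt(trees$Volume)

# Basic scatterplot of the first two variables.
plot(trees$Girth, trees$Height, pch = 1, xlab = "Girth", ylab = "Height")

# Overlay one circle per point; inches= sets the size of the largest bubble.
symbols(trees$Girth, trees$Height, circles = radius, inches = 0.3,
        add = TRUE)
```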


Scatterplot Matrix

  • pairs(airpoll)

  • To add regression lines

    • pairs(airpoll,panel=function(x,y) {

      abline(lsfit(x,y)$coef,lwd=2)

      lines(lowess(x,y),lty=2,lwd=2)

      points(x,y)})

  • For 3D graphics, use cloud from the lattice package (load it first with library(lattice))

    • cloud(Mortality~SO2+Rainfall)


Conditioning Plots

  • coplot(Mortality~SO2|Popden)

  • To add a local regression fit

    coplot(Mortality~SO2|Popden, panel=function(x,y,col,pch)

    panel.smooth(x,y,span=1))
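
To see how a conditioning plot slices the conditioning variable, it can help to inspect the intervals directly with `co.intervals`, which is what `coplot` uses when no intervals are supplied. The sketch below assumes the built-in `quakes` data instead of airpoll; the choice of 4 intervals and 25% overlap is illustrative.

```r
# Built-in 'quakes' data used as a stand-in for airpoll (an assumption).
# Each row of 'iv' is the depth range covered by one panel of the coplot.
iv <- co.intervals(quakes$depth, number = 4, overlap = 0.25)
print(iv)

# Same intervals, now used to draw the conditioning plot with a smooth fit.
coplot(lat ~ long | depth, data = quakes, number = 4, overlap = 0.25,
       panel = function(x, y, ...) panel.smooth(x, y, span = 1, ...))
```

Overlapping intervals mean some points appear in two adjacent panels, which smooths the transition from one panel to the next.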


Conclusions

  • The purpose of graphics is to aid your intuition.

  • Explore them: the appropriate graphics reflect your questions and the structure of the data.

  • Next week: graphic presentations to avoid, because they mislead you and your audience.

  • Look at the books by Edward Tufte in the library.

