
STATS 330: Lecture 4

Graphics:

Doing it in R

330 lecture 4

Housekeeping

My contact details….

Plus much else on course web page

www.stat.auckland.ac.nz/~lee/330/

Or via Cecil

Today’s lecture: R for graphics

Aim of the lecture:

To show you how to use R to produce the plots shown in the last few lectures

Getting data into R
  • In 330, as in most practice, data comes in 2 main forms
    • As a text file
    • As an Excel spreadsheet
  • Need to convert from these formats to R
  • Data in R is organized in data frames
    • Row by column arrangement of data (as in Excel)
    • Variables are columns
    • Rows are cases (individuals)
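The row-by-column layout can be sketched with a tiny data frame. The name toy.df and its values are invented for illustration only; they are not the course data.

```r
# Invented toy values, for illustration only (not the course data)
toy.df <- data.frame(
  height = c(70, 65, 63),        # each column is a variable
  volume = c(10.3, 10.3, 10.2)
)

dim(toy.df)      # number of rows (cases) and columns (variables)
names(toy.df)    # the variable names
toy.df[2, ]      # the second case (row)
toy.df$volume    # one variable (column), as a vector
```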

Text files to R
  • Suppose we have the data in the form of a text file
  • Edit the text file (use Notepad or similar) so that
    • The first row consists of the variable names
    • Each row of data (i.e. data on a complete case) corresponds to one line of the file
  • Suppose data fields are separated by spaces and/or tabs
  • Then, to create a data frame containing the data, we use the R function read.table
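The steps above can be tried without a real data file. This is a minimal sketch that writes a small temporary file first; the file contents and the name example.df are invented for illustration.

```r
# Write a small text file to a temporary location.
# The values are invented; only the layout matters.
tmp <- tempfile(fileext = ".txt")
writeLines(c("diameter height volume",   # first row: variable names
             "8.3 70 10.3",              # fields separated by spaces ...
             "8.6\t65\t10.3",            # ... or by tabs
             "8.8 63 10.2"), tmp)

# header = TRUE: the first line holds the variable names.
# The default sep = "" treats any run of spaces or tabs as one separator.
example.df <- read.table(tmp, header = TRUE)
str(example.df)
```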

Example: the cherry tree data

Suppose we have a text file called cherry.txt (probably created using Notepad or maybe Word, but saved as a text file)

First line: variable names

Data for each tree on a separate line, separated by “white space” (spaces or tabs)

Creating the data frame

In R, type

cherry.df = read.table(file.choose(), header=TRUE)

and press the return key. This brings up a dialog for selecting the file cherry.txt containing the data: click on the file to select it, then click to load the data.

Check all is OK!
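A few quick checks that could be run after reading the data. This is a sketch only: it assumes R's built-in trees data (31 black cherry trees) is an adequate stand-in for cherry.txt, with columns renamed to the variable names used in this lecture.

```r
# Stand-in for the course file (an assumption): R's built-in 'trees'
# data, renamed to match the variable names used in the lecture.
cherry.df <- data.frame(diameter = trees$Girth,
                        height   = trees$Height,
                        volume   = trees$Volume)

dim(cherry.df)      # number of cases and variables read
head(cherry.df)     # the first few cases
str(cherry.df)      # type of each variable; measurements should be numeric
summary(cherry.df)  # ranges, to spot impossible values
```

If a measurement column shows up as character or factor in str, the file probably contains a stray non-numeric entry.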

Getting data from a spreadsheet (1)

Create the spreadsheet in Excel

Save it as Comma Delimited Text (CSV)

This is a text file with all cells separated by commas

File is called cherry.csv

Getting data from a spreadsheet (2)

In R, type

cherry.df = read.table(file.choose(), header=TRUE, sep=",")

and proceed as before
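R also provides read.csv, a wrapper around read.table whose defaults include header=TRUE and sep=",". A minimal sketch comparing the two on a small temporary file (the file contents and the names a.df and b.df are invented for illustration):

```r
# A small CSV file with invented values
tmp <- tempfile(fileext = ".csv")
writeLines(c("diameter,height,volume",
             "8.3,70,10.3",
             "8.6,65,10.3"), tmp)

a.df <- read.table(tmp, header = TRUE, sep = ",")  # form used on the slide
b.df <- read.csv(tmp)                              # shorter equivalent here

all.equal(a.df, b.df)   # compare the two data frames
```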

Data frames and variables
  • Suppose we have read in data and made a data frame
  • At this point R doesn’t know about the variables in the data frame, so we can’t use e.g. the variable diameter in R commands
  • We need to say

attach(cherry.df)

to make the variables in cherry.df visible to R.

  • Alternatively, say cherry.df$diameter
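The options can be compared side by side. This sketch also shows with(), a standard R function not mentioned on the slide, and again assumes the built-in trees data as a stand-in for the course file.

```r
# Stand-in for the course file (an assumption): built-in 'trees' data
cherry.df <- data.frame(diameter = trees$Girth,
                        height   = trees$Height,
                        volume   = trees$Volume)

mean(cherry.df$diameter)          # $ picks one variable out of the data frame
with(cherry.df, mean(diameter))   # with() evaluates an expression inside it

attach(cherry.df)                 # make the variables visible by name
mean(diameter)
detach(cherry.df)                 # undo the attach when finished
```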

Scatterplots

In R, there are two distinct sets of functions for graphics: one for ordinary graphics and one for Trellis graphics.

E.g. for scatterplots, we use either plot (ordinary R) or xyplot (Trellis).

In the next few slides, we discuss plot.

Simple plotting

plot(height, volume,
     xlab="Height (feet)",
     ylab="Volume (cubic feet)",
     main="Volume versus height for 31 black cherry trees")

i.e. label the axes (give units if possible) and give a title.

Alternative form of plot

plot(volume ~ height,
     xlab="Height (feet)",
     ylab="Volume (cubic feet)",
     main="Volume versus height for 31 black cherry trees",
     data=cherry.df)

There is no need to attach the data frame with this form. Note the reversal of x and y: the formula is written y ~ x, whereas plot(x, y) takes x first.

Colours, points, etc

Type ?par for more info.

par(bg="darkblue")

plot(height, volume,
     xlab="Height (feet)",
     ylab="Volume (cubic feet)",
     main="Volume versus height for 31 black cherry trees",
     pch=19, fg="white",
     col.axis="lightblue", col.main="white",
     col.lab="white", col="white", cex=1.3)

Lines
  • Suppose we want to join up the points for each rat on the rats plot (see the data on the next slide).
  • We could try

plot(day, growth, type="l")

but this won’t work.

  • Points are plotted in the order they appear in the data frame and each point is joined to the next, so the last point for one rat gets joined to the first point for the next rat.

Rats: the data
> rats.df
   growth group rat change day
1     240     1   1      1   1
2     250     1   1      1   8
3     255     1   1      1  15
4     260     1   1      1  22
5     262     1   1      1  29
6     258     1   1      1  36
7     266     1   1      2  43
8     266     1   1      2  44
9     265     1   1      2  50
10    272     1   1      2  57
11    278     1   1      2  64
12    225     1   2      1   1
13    230     1   2      1   8
... more data

Don’t want this!

Solution

Various solutions, but one is to plot each line separately, using subsetting

plot(day, growth, type="n")   # type="n": draw axes and labels only

lines(day[rat==1], growth[rat==1])

lines(day[rat==2], growth[rat==2])

and so on … (boring!), or (better)

for(j in 1:16){
  lines(day[rat==j], growth[rat==j])
}
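A self-contained sketch of the same idea, using a small invented data set (rats.toy, 3 rats with 4 measurements each) in place of rats.df. Looping over unique() rather than 1:16 is a variation that avoids hard-coding the number of rats.

```r
# Invented growth curves for 3 rats, 4 measurement days each
rats.toy <- data.frame(
  rat    = rep(1:3, each = 4),
  day    = rep(c(1, 8, 15, 22), times = 3),
  growth = c(240, 250, 255, 260,
             225, 230, 250, 255,
             245, 250, 260, 262)
)

# Empty plot: sets up axes and labels from the full range of the data
plot(growth ~ day, data = rats.toy, type = "n",
     xlab = "Day", ylab = "Growth")

# One line per rat, each drawn from that rat's rows only
for (j in unique(rats.toy$rat)) {
  one.rat <- rats.toy[rats.toy$rat == j, ]
  lines(one.rat$day, one.rat$growth)
}
```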

Indicating groups

We want to plot the litters in different colours and add a legend. Rats 1-8 are litter 1, rats 9-12 are litter 2 and rats 13-16 are litter 3.

plot(day, growth, type="n")

# col sets the colour of each line
for(j in 1:8) lines(day[rat==j], growth[rat==j], col="white")     # litter 1

for(j in 9:12) lines(day[rat==j], growth[rat==j], col="yellow")   # litter 2

for(j in 13:16) lines(day[rat==j], growth[rat==j], col="purple")  # litter 3

Legend

legend(13, 380,
       legend = c("Litter 1", "Litter 2", "Litter 3"),
       col = c("white", "yellow", "purple"),
       lwd = c(2, 2, 2),
       horiz = TRUE,
       cex = 0.7)

(Type ?legend for a full explanation of these parameters)
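Colouring by group and adding the legend can be combined in one runnable sketch. The data set rats.toy, its litter column and the colour vector litter.cols are invented for illustration. The colours differ from the slide so that the lines show up on R's default white background, and the legend is placed with the keyword "topleft" instead of x, y coordinates.

```r
# Invented data: 4 rats in 3 litters, 3 measurement days each
rats.toy <- data.frame(
  rat    = rep(1:4, each = 3),
  litter = rep(c(1, 1, 2, 3), each = 3),
  day    = rep(c(1, 8, 15), times = 4),
  growth = c(240, 250, 255,  225, 230, 250,
             245, 250, 260,  230, 244, 251)
)
litter.cols <- c("black", "orange", "purple")   # one colour per litter

plot(growth ~ day, data = rats.toy, type = "n")
for (j in unique(rats.toy$rat)) {
  one.rat <- rats.toy[rats.toy$rat == j, ]
  # Look up the line colour from this rat's litter number
  lines(one.rat$day, one.rat$growth,
        col = litter.cols[one.rat$litter[1]], lwd = 2)
}

legend("topleft",
       legend = paste("Litter", 1:3),
       col = litter.cols, lwd = 2,
       horiz = TRUE, cex = 0.7)
```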

Points and text

x=1:25

y=1:25

plot(x,y, type="n")

points(x, y, pch=1:25, col="red", cex=1.2)

Points and text (3)

x=1:26

y=1:26

plot(x,y, type="n")

text(x,y, letters, col="blue", cex=1.2)

Trellis
  • Must load the lattice package (R’s Trellis graphics library) first

library(lattice)

  • General form of trellis plots

xyplot(y~x|W*Z, data=some.df)

  • No need to attach the data frame: trellis functions pick out the variables from the data frame given in the data argument
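A concrete instance of the general form, as a sketch only. It uses R's built-in mtcars data rather than course data, with cyl and am playing the roles of the conditioning variables W and Z.

```r
library(lattice)

# y ~ x | W * Z : one panel for each combination of the two
# conditioning variables (built-in 'mtcars' data, not course data)
p <- xyplot(mpg ~ wt | factor(cyl) * factor(am), data = mtcars,
            xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")

# Trellis functions return an object; printing it draws the plot.
# At the command line the printing happens automatically.
print(p)
```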

Main trellis functions
  • dotplot: for dotplots; use when x is categorical and y is continuous
  • bwplot: for boxplots; use when x is categorical and y is continuous
  • xyplot: for scatterplots; use when both x and y are continuous
  • equal.count: use to turn a continuous conditioning variable into groups
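One call to each of the first three functions, as a minimal sketch on R's built-in iris data (chosen only because it has both a categorical and several continuous variables; it is not course data).

```r
library(lattice)

# x categorical (Species), y continuous (Sepal.Length)
p.dot <- dotplot(Sepal.Length ~ Species, data = iris)
p.box <- bwplot(Sepal.Length ~ Species, data = iris)

# both x and y continuous
p.xy <- xyplot(Sepal.Length ~ Sepal.Width, data = iris)

print(p.dot)
print(p.box)
print(p.xy)
```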

Changing background colour

To change trellis background to white

trellis.par.set(background = list(col="white"))

To change plotting symbols

trellis.par.set(plot.symbol = list(pch=16, col="red", cex=1))
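The current settings can be inspected with trellis.par.get, which makes it easy to confirm that a change took effect. A minimal sketch; note that these settings belong to the current graphics device, so they are applied again whenever a new device is opened.

```r
library(lattice)

trellis.par.set(background = list(col = "white"))
trellis.par.set(plot.symbol = list(pch = 16, col = "red", cex = 1))

# Read the settings back to check them
trellis.par.get("background")$col
trellis.par.get("plot.symbol")$pch
```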

Equal.count

xyplot(volume~height|diameter, data=cherry.df)

Equal.count (2)

diam.gp<-equal.count(diameter,number=4,overlap=0)

xyplot(volume~height|diam.gp, data=cherry.df)
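To see what equal.count actually produces, the intervals can be printed before plotting. A sketch assuming the built-in trees data as a stand-in for the course file, with columns renamed to the lecture's variable names.

```r
library(lattice)

# Stand-in for the course file (an assumption): built-in 'trees' data
cherry.df <- data.frame(diameter = trees$Girth,
                        height   = trees$Height,
                        volume   = trees$Volume)

# Four non-overlapping diameter ranges with roughly equal counts
diam.gp <- equal.count(cherry.df$diameter, number = 4, overlap = 0)
levels(diam.gp)   # the interval used for each panel

p <- xyplot(volume ~ height | diam.gp, data = cherry.df)
print(p)
```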

Changing plotting symbols

To change plotting symbols

trellis.par.set(plot.symbol = list(pch=16, col="red", cex=1))

Non-trellis version

coplot(volume~height|diameter, data=cherry.df)

Non-trellis version (2)

coplot(volume~height|diameter, data=cherry.df, number=4, overlap=0)
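The intervals that coplot uses can be computed directly with co.intervals and passed in through given.values. This is a sketch of an alternative way to get the same kind of plot, again assuming the built-in trees data as a stand-in for the course file; diam.int is a name made up here.

```r
# Stand-in for the course file (an assumption): built-in 'trees' data
cherry.df <- data.frame(diameter = trees$Girth,
                        height   = trees$Height,
                        volume   = trees$Volume)

# One row per interval: lower and upper diameter limits
diam.int <- co.intervals(cherry.df$diameter, number = 4, overlap = 0)
diam.int

# Supply the intervals explicitly instead of number and overlap
coplot(volume ~ height | diameter, data = cherry.df,
       given.values = diam.int)
```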

Other useful functions
  • Regular R
    • scatterplot3d (3d scatter plot, load library scatterplot3d)
    • contour, persp (draws contour plots, surfaces)
    • pairs
  • Trellis
    • cloud (3d scatter plot)
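A brief sketch of three of the regular R functions listed above. The tree data is the built-in trees data set, used as an assumed stand-in for the course file, and the surface z is an invented smooth function, since contour and persp need values on a regular grid.

```r
# Stand-in for the course file (an assumption): built-in 'trees' data
cherry.df <- data.frame(diameter = trees$Girth,
                        height   = trees$Height,
                        volume   = trees$Volume)

# Scatterplot matrix: every variable plotted against every other
pairs(cherry.df)

# An invented surface evaluated on a 30 x 30 grid
x <- seq(-2, 2, length.out = 30)
y <- seq(-2, 2, length.out = 30)
z <- outer(x, y, function(x, y) exp(-(x^2 + y^2)))

contour(x, y, z)                       # contour plot
persp(x, y, z, theta = 30, phi = 25)   # perspective view of the surface
```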
