Bare-Bones R

A Brief Introductory Guide

Thomas P. Hogan

University of Scranton

2010 All Rights Reserved

Citation and Usage

This set of PowerPoint slides is keyed to Bare-Bones R: A Brief Introductory Guide, by Thomas P. Hogan, SAGE Publications, 2010.

All are welcome to use and/or adapt the slides without seeking further permission but with the usual professional acknowledgment of source.

Part 1: Base R
  • 1-1 What is R
    • A computer language, with orientation toward statistical applications
    • Relatively new
    • Growing rapidly in use
1-2 R’s Ups and Downs
  • Plusses
    • Completely free, just download from Internet
    • Many add-on packages for specialized uses
    • Open source
  • Minuses
    • Obscure terms, intimidating manuals, odd symbols, inelegant output (except graphics)
1-3 Getting Started: Loading R
  • Have Internet connection
  • Go to http://cran.r-project.org/
  • R for Windows screen, click “base”
  • Find, click on download R
  • Click Run, OK, or Next for all screens
  • End up with R icon on desktop
Downloading Base R [Figs 1.1 – 1.4]
  • Click on Windows
  • Then in next screen, click on “base”
  • Then screens for Run, OK, or Next
  • And finally “Finish”
    • will put R icon on desktop
What You Should Have When Clicking on the R Icon: Rgui and R Console, Ending with the R Prompt (>) [Fig 1.5]
The R prompt (>)
  • > This is the “R prompt.” It says R is ready to take your command.
1-4 Using R as Calculator
  • Enter these after the prompt and observe the output:

>2+3

>2^3+(5)

>6/2+(8+5)

>2 ^ 3 + (5)

More as Calculator
  • You can copy and paste, but don’t include the >
  • Use # at end of command for notes, e.g.

> (22+ 34+ 18+ 29+ 36)/5 #Calculating the average, aka mean

  • R as calculator: Not very useful
1-5 Creating a Data Set
  • > Scores = c(22, 34, 18, 29, 36)

c means “concatenate” in R

– in plain English “treat as data set”

  • Now do:

>Scores

R will print the data set
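
A minimal sketch tying the calculator and data-set slides together: the average computed by hand, then the same five values stored with c(). The name manual_mean is made up for this illustration, and <- is used for assignment, which does the same job as =.

```r
# Average computed the long way, as on the calculator slide
manual_mean <- (22 + 34 + 18 + 29 + 36) / 5

# The same five values stored as a data set (a vector) with c()
Scores <- c(22, 34, 18, 29, 36)

Scores           # typing the name prints the stored values
length(Scores)   # how many values were stored
mean(Scores)     # matches manual_mean
```

Once the values live in a variable, every later command can reuse them without retyping the numbers.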

Important Rules
  • We created a variable
  • Variable names are case sensitive
  • No blanks in name

(can use _ or . to join words, but not -)

  • Start with a letter (cap or lc)
  • Can use <- instead of =
Another variable
  • Create SCORES, using <-

> SCORES<-c(122, 134, 118, 129, 124)

  • NB: SCORES different than Scores

Check with

>SCORES

>Scores

Non-numeric Data
  • Enclose in quotes, single or double
  • Separate entries with comma
  • Example:

> names = c("Mary", "Tom", "Ed", "Dan", "Meg")

Saving Stuff
  • To exit: either X or quit ( )
  • Brings up this screen:
  • Do what you want: Yes or No
    • Do Yes,
    • then re-open R, get Scores & names
Special Note on Saving
  • Previous slide assumes you control computer
  • If not, use File, Save Workspace, name file, click Save
  • Works much like saving a file in Microsoft
  • To retrieve, do File, Load Workspace, find file, click Open
1-6 Using R Functions: Simple Stuff
  • Commands for mean, sd, summary

(NB: function names case sensitive)

    • mean(Scores)
    • sd(Scores)
    • summary(Scores)
  • Command for correlation
    • cor(Scores,SCORES)
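
As a sketch, the four commands above can be run on the two variables created earlier. The block re-creates both variables so it can be pasted on its own; r is just a name chosen here to hold the correlation.

```r
Scores <- c(22, 34, 18, 29, 36)
SCORES <- c(122, 134, 118, 129, 124)

mean(Scores)       # arithmetic mean
sd(Scores)         # sample standard deviation (n - 1 in the denominator)
summary(Scores)    # minimum, quartiles, median, mean, maximum

r <- cor(Scores, SCORES)   # Pearson correlation is the default
r
```

Typing Mean(Scores) or SD(Scores) instead would fail, because function names are case sensitive.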
R functions
  • A zillion of ‘em
  • R’s big strength, most common use
  • For examples:
    • Help
      • R functions (text)
      • Enter name of a function (e.g., sd)
    • Yields lots (!) of information
1-7 Reading in Larger Data Sets
  • In Excel, enter (or download) the SATGPA20 file
  • Save as .xls
  • Then save as Text (tab delimited) file
    • Will have .txt extension
… Larger Data Sets: The read.table Command
  • Now read into R like this:

>SATGPA20R=read.table("E:/R/SATGPA20.txt", header =T)

  • Need exact path, in quotes
  • header = T
    • T or TRUE, F or FALSE
    • Depends on opening line of file
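
The effect of header can be seen without the SATGPA20 file. The sketch below writes a small tab-delimited file to a temporary location and reads it back both ways; the three rows are made-up stand-ins (not the real data), and demo, demo_path, with_header and no_header are names invented for the illustration.

```r
# Made-up stand-in rows; the real SATGPA20 file is not reproduced here
demo <- data.frame(SATV = c(500, 620, 480),
                   SATM = c(530, 600, 510),
                   GPA  = c(2.9, 3.6, 2.7))

# Write a tab-delimited text file to a temporary location
demo_path <- tempfile(fileext = ".txt")
write.table(demo, file = demo_path, sep = "\t", row.names = FALSE, quote = FALSE)

# header = TRUE: the first line of the file holds the variable names
with_header <- read.table(demo_path, header = TRUE)
names(with_header)

# header = FALSE: the first line is treated as data; names become V1, V2, V3
no_header <- read.table(demo_path, header = FALSE)
names(no_header)
nrow(no_header)   # one more row than with_header
```

If the opening line of your file contains variable names, header = T is the setting you want.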
The file.choose ( ) command
  • At > enter file.choose ( )
  • Accesses your system’s files, much like Open in Microsoft
  • Find the file, click on it
  • R prints the exact path in R Console
  • Can copy and paste into read.table
Checking what you’ve got:
  • Enter

>SATGPA20R

  • Then

>colMeans(SATGPA20R) # mean of each column; in current R, mean( ) on a whole data frame returns NA

  • Try

>mean (GPA)

The attach Command
  • To access individual variables, do this:

>attach(SATGPA20R)

  • Now try:

>mean(GPA)

The data.frame Command
  • Let’s create these 3 variables with c

> IQ = c(110, 95, 140, 89, 102)

> CS = c(59, 40, 62, 40, 55)

> WQ = c(2, 4, 5, 1, 3)

  • Then put them together with:

>All_Data = data.frame(IQ, CS, WQ)

  • Check with:

>colMeans(All_Data) # mean of each column
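
A short sketch of the same steps, with two ways to reach a single column without attach. In current versions of R, mean() applied to a whole data frame returns NA with a warning, so the sketch uses colMeans() for the column means; colMeans(), $ and with() are base R functions not named on the slide, and col_means is a name chosen here.

```r
IQ <- c(110, 95, 140, 89, 102)
CS <- c(59, 40, 62, 40, 55)
WQ <- c(2, 4, 5, 1, 3)
All_Data <- data.frame(IQ, CS, WQ)

str(All_Data)                     # 5 observations of 3 variables

col_means <- colMeans(All_Data)   # mean of every column at once
col_means

# Single columns without attach(): use $ or with()
mean(All_Data$IQ)
with(All_Data, mean(CS))
```

The $ form is handy when two data sets contain variables with the same name, since it states which data set is meant.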

1-8 Getting Help
  • >help(sd)
  • >example(sd)
  • On R Console:

Help

R functions (text)

Enter function name, click OK

Reminder: function names case sensitive

R’s “function” terms

R language: function(arguments)

Plain English: Do this (to this)

or Do this (to this, with these conditions)

1-9 Dealing with Missing Data
  • NB: It’s a pain in R!
  • Key items
    • In data, enter NA for a missing value
    • In (most) commands, use na.rm=T
Examples for missing data

>Data=c(2,4,6,NA,10)

>mean(Data, na.rm=T)

  • Add to the SATGPA20 file

21 1 NA NA NA 3.14

23 2 1 NA NA 2.86

Etc. and create new file SATGPA25R

  • Then

>colMeans(SATGPA25R, na.rm=T)

  • Note exception for cor function (use="complete.obs")
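
A minimal sketch of both cases: na.rm for mean, and use for cor. The vectors x and y are made up for the illustration.

```r
Data <- c(2, 4, 6, NA, 10)
mean(Data)                  # NA: one missing value makes the result missing
mean(Data, na.rm = TRUE)    # 5.5: mean of the four observed values
sum(is.na(Data))            # number of missing values

# cor() takes use = instead of na.rm =
x <- c(1, 2, 3, NA, 5)
y <- c(2, 4, NA, 8, 10)
cor(x, y)                         # NA
cor(x, y, use = "complete.obs")   # uses only the pairs with both values present
```

Here only three pairs are complete, so the second correlation rests on three observations, not five.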
1-10 Using R Functions: Hypothesis tests
  • Be sure you have an active data set (SATGPA25R), using attach if needed
  • Then, to test male vs. female on SATM:

>t.test(SATM~SEX) # note tilde~

  • Examples of changing defaults:

>t.test(SATM~SEX, var.equal=TRUE, conf.level=0.99)
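
The two calls can be tried without the SATGPA25R file. In this sketch the scores are made up, the 1/2 coding of SEX is an assumption, and demo, welch and pooled are names chosen for the illustration. The data = argument is used in place of attach.

```r
# Made-up scores for illustration; the 1/2 coding of SEX is an assumption
demo <- data.frame(
  SEX  = c(1, 1, 1, 1, 2, 2, 2, 2),
  SATM = c(520, 610, 480, 570, 540, 630, 500, 590)
)

# Default: Welch t-test, which does not assume equal variances
welch <- t.test(SATM ~ SEX, data = demo)

# Changed defaults: pooled variance and a 99% confidence interval
pooled <- t.test(SATM ~ SEX, data = demo, var.equal = TRUE, conf.level = 0.99)

welch$p.value
pooled$conf.int     # carries the confidence level as an attribute
pooled$parameter    # degrees of freedom: n1 + n2 - 2 for the pooled test
```

Saving the result in a variable lets you pull out single pieces, such as the p value, instead of reading them off the printed output.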

Hypothesis tests: Chi-square
  • Using SEX and State variables in SATGPA25R
  • chisq.test (SEX, State)
1-11 R Functions for Commonly Used Statistics

function          calculates this

mean ( )          mean
median ( )        median
mode ( )          mode (NB: in R, mode ( ) reports the storage type, e.g. "numeric", not the most frequent value)
sd ( )            standard deviation
range ( )         range (returned as minimum and maximum)
IQR ( )           interquartile range
min ( )           minimum value
max ( )           maximum value
cor ( )           correlation
quantile ( )      percentile
t.test ( )        t-test
chisq.test ( )    chi-square

NB1: See notes in text for details

NB2: R contains many more functions
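
Two entries in the table behave differently from what their labels suggest: mode() reports how the data are stored, and range() returns two numbers. The sketch below shows both, plus one possible way to get the most frequent value; the vector x and the names counts and stat_mode are made up for the illustration.

```r
x <- c(2, 4, 4, 5, 7, 4, 5)

mode(x)            # "numeric": the storage mode, not the most frequent value

# One way to get the most frequent value(s): tabulate, then pick the top count
counts <- table(x)
stat_mode <- as.numeric(names(counts)[counts == max(counts)])
stat_mode          # 4

range(x)           # two numbers: minimum and maximum
diff(range(x))     # the range as a single number: 7 - 2 = 5

quantile(x, probs = c(0.25, 0.50, 0.75))   # quartiles; the 0.50 value equals median(x)
```

If several values tie for the highest count, stat_mode holds all of them.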

1-12 Two Commands for Managing Your Files

> ls ( )

Will list the objects (data sets, variables, functions) currently in your workspace

> rm (file)

Insert object name in place of file; this will remove that object from the workspace

NB: R has many such commands
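
A small sketch of the two commands in action. The object temp_obj is hypothetical and exists only for this demonstration.

```r
temp_obj <- c(1, 2, 3)    # hypothetical object created only for this demo
"temp_obj" %in% ls()      # TRUE: it is now in the workspace

rm(temp_obj)              # remove it by name
"temp_obj" %in% ls()      # FALSE: it is gone
```

rm cannot be undone, so check the name with ls first.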

1-13 R Graphics
  • R graphs: good, simple
  • Let’s start with hist and boxplot with the SATGPA25R file

>hist(SATM)

>boxplot(SATM)

>boxplot(SATV, SATM)

  • R Graphics window opens, need to minimize to get R Console
More Graphics: plot
  • Create these variables

>RS=c(12,14,16,18,25)

>MS=c(10,8,16,12,20)

  • Now do this:

>plot(RS, MS)

Line of Best Fit
  • Do these for the RS and MS variables:

> lm(MS~RS) # lm means linear model

> res=lm(MS~RS) # res stores the fitted model

> abline(res) # read as ‘a-b’ line
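
The three commands only work in this order, since abline adds a line to a plot that already exists. A sketch of the full sequence follows; fit is a name chosen here in place of res, and coef() and residuals() are base R functions not named on the slide.

```r
RS <- c(12, 14, 16, 18, 25)
MS <- c(10, 8, 16, 12, 20)

plot(RS, MS)          # the scatterplot must exist before abline() can add to it
fit <- lm(MS ~ RS)    # regress MS on RS
abline(fit)           # draws intercept + slope * RS over the scatterplot

coef(fit)             # intercept -0.74, slope 0.82
residuals(fit)        # observed MS minus the value on the line
```

The slope says that each additional point on RS goes with 0.82 more points on MS, on average, in these five cases.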

Controlling Your Graphics: A Brief Look
  • R has many (often obscure) ways for controlling graphics; we’ll look at a few
  • Basically, we’ll change “defaults”

Examples (try each one):

  • Limits (ranges) for X and Y axes

>plot(RS, MS, xlim = c(5,25), ylim = c(5,25))

Controlling Graphs: More Examples
  • Plot characters:

>plot(RS, MS, pch=3)

  • Line widths

>plot(RS, MS, pch=3, lwd=5)

  • Axis labels

>plot(RS, MS, xlab = "Reading Score", ylab = "Math Score")

  • You can put them all together in one command
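
Put together, the options from the last two slides might look like the sketch below. The main = title is an addition for illustration and does not appear on the slides.

```r
RS <- c(12, 14, 16, 18, 25)
MS <- c(10, 8, 16, 12, 20)

plot(RS, MS,
     xlim = c(5, 25), ylim = c(5, 25),   # axis limits
     pch  = 3,                           # plotting character 3 is a plus sign
     lwd  = 5,                           # line width used to draw the symbols
     xlab = "Reading Score",
     ylab = "Math Score",
     main = "Math vs. Reading")          # main = is an addition, not on the slides
```

The arguments after RS and MS can go in any order, because each one is named.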
Part 2: R Commander
  • 2-1 What is R Commander?
    • Point and click version of R
    • Uses (and prints) base R commands
  • Loading: Easy – it’s just a package
    • See next slide
Loading Rcmdr
  • On R Gui, top menu bar

click Packages,

then Install package(s).

Pick a CRAN mirror site (nearby), click OK.

From the list of packages, scroll to Rcmdr,

highlight it, click OK

  • After it loads, do these:
      • Check with: >library ( )
      • Activate with: >library (Rcmdr)
Rcmdr’s extra packages
  • Scary message when first activating Rcmdr:
  • Just click Yes – and take a break
The R Commander Window
  • You get the R Commander window with
    • Script window
    • Output window

(incl Submit button)

    • Message window
2-2 R Commander Windows and Menus
  • File
  • Edit
  • Data **
  • Statistics ** Most important for us
  • Graphs **
  • Models
  • Distributions
  • Tools
  • Help
Our Lesser Used Menus
  • File [Table 2.1]
    • Much like in Microsoft
    • Manage files
  • Edit [Table 2.2]
    • Much like in Microsoft
    • Can do with right click of mouse
Our Lesser Used Menus (cont)
  • Models

Mostly more advanced stats

  • Distributions
  • Tools
    • Load packages
    • Options – change output defaults
  • Help
    • Searchable index
    • R Commander manual
2-3 The Data Menu (very important) (Submenus for creating/getting data sets)
  • New data set – create new data set
  • Load data set – only for existing .rda data
  • Import data – import from various file types
  • Data in packages – not important for us
Data Menu (cont.) (Submenus for managing data sets)
  • Active data set
    • Do stuff with current data set
  • Manage variables in active data set
    • Do stuff with variables in current data set
New data set [Fig. 2.3]
  • Click on it, brings up spreadsheet
  • Name it SampleData
New data set (cont)
  • Enter these data:

var1 var2 var3

2 1 5

5 4 7

3 7 8

6 8 9

9 2 9

  • Then kill window with X
  • Note: SampleData in Active Data Set
Now Try These
  • View active data set
  • Edit active data set
  • In Script window, type*
    • colMeans(SampleData)
    • sapply(SampleData, sd)
    • mean(var1) [gives error message]
    • attach(SampleData)
    • mean(var1)

* When typing do not include >, do hit Submit

Changing “var” names
  • Data

Manage variables in active data set

Rename variables

Change names to Rater1, Rater2, Rater3

  • Then check with

colMeans(SampleData)

mean(Rater1)

Compute new variable
  • Data

Manage variables in active data set

Compute new variable

  • Give name to new variable, call it Total
  • In ‘Expression to compute’, enter Rater1+Rater2+Rater3
  • Check with
    • View data set
    • colMeans(SampleData)
Import data (very important submenu)
  • Allows importing from
    • .txt file
    • SPSS file
    • Excel file
    • Several others
  • Try it with a .txt file
    • (must already exist; try with SATGPA25.txt)
Convert Numeric Variables to Factors
  • Recall types of scales (esp. nominal)
  • Rcmdr assumes numeric
  • To convert to nominal (factor)
    • Data, then Manage variables in active data set, and Convert numeric variables to factors. Highlight the variable you want to convert, click OK. In the next window, give labels to the levels of the variable.
    • Try with SEX and State in SATGPA25R
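
In base R the same conversion is done with factor(). The sketch below uses made-up codes; which number stands for which group is an assumption, and SEX_f is a name chosen for the illustration.

```r
# Hypothetical codes: which number stands for which group is an assumption
SEX <- c(1, 2, 2, 1, 2)

mean(SEX)   # runs, but the average of category codes is meaningless

SEX_f <- factor(SEX, levels = c(1, 2), labels = c("Male", "Female"))
SEX_f
table(SEX_f)       # counts per category
is.factor(SEX_f)   # TRUE
```

Once a variable is a factor, R treats it as a set of categories in tables, graphs and tests.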
2-4 The Statistics Menu
  • Obviously very important
  • Most pretty clear how to do
    • Some go beyond intro stats
    • Some surprises on what’s where
  • We’ll just sample some of them
  • Put SATGPA25R in Active data set
Statistics: Summaries (Try each of these with SATGPA25R in Data set, observe output)
  • Active data set (see next slide)
  • Numerical summaries (see next slide)
  • Frequency distributions
  • Summaries
  • Count missing observations
  • Table of statistics
  • Correlation matrix
  • Correlation test
  • Shapiro-Wilk test of normality
Getting started on Stat menu
  • Statistics – Summaries - Active data set
  • Statistics – Summaries – Numerical summaries
  • Etc. with others
Statistics: Means (Try t-test, ANOVA)
  • Single-sample t-test
  • Independent samples t-test (TRY*)
  • Paired t-test
  • One-way ANOVA (TRY*)
  • Multi-way ANOVA

* With SATGPA25R

Two-Way Table (chi-square) [Fig 2.9]
  • Statistics - Contingency tables - Two-Way table
2-5 The Graphics Menu
  • All pretty intuitive (if you know the graph)
  • Try with SATGPA25R
    • Pie: State
    • Histogram: SATM
    • Boxplot by group: SATM by SEX
    • Scatterplot: GPA from SATV
Changing Graphs Appearance
  • Rcmdr Graphs uses defaults
  • Change them in Script window
  • Use commands given earlier
  • Many ways to do; not terribly intuitive
  • See example on next slide
Changing Graphs Defaults: Example
  • Histogram of GPA (with defaults):

Hist(SATGPA25R$GPA, scale="frequency", breaks="Sturges", col="darkgray")

[copy, paste, change, Submit]

Hist(SATGPA25R$GPA, scale="frequency", breaks=4, col="black", lwd=3)

2-6 The Distributions Menu: Two Quick Examples
  • Distributions

Continuous distributions

Normal distribution

Normal probabilities [insert -1.5]

  • Distributions

Continuous distributions

t distribution

t probabilities [insert 1.71, df 28]
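
The same two probabilities can be obtained in base R with pnorm and pt. This is a sketch of the equivalent commands, assuming the menu defaults of mean 0, SD 1 and lower tail; p_z and p_t are names chosen for the illustration.

```r
# Normal probabilities for z = -1.5 (standard normal, lower tail)
p_z <- pnorm(-1.5, mean = 0, sd = 1, lower.tail = TRUE)
p_z                      # about 0.0668

# t probabilities for t = 1.71 with 28 degrees of freedom (lower tail)
p_t <- pt(1.71, df = 28, lower.tail = TRUE)
p_t

# Upper-tail areas
1 - p_z
pt(1.71, df = 28, lower.tail = FALSE)
```

Lower tail means the area to the left of the value entered; choose upper tail for the area to the right.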

Part 3: Some Other Stuff

Supplementary, Not Essential, Brief

  • 3-1 A Few Other Ways to Enter Data
  • 3-2 Exporting R Results
  • 3-3 Bonus: Build Your Own Functions
  • 3-4 An Example of an Add-on Package
  • 3-5 Keeping Up to Date
  • 3-6 Going Further: Selected References
3-1 A Few Other Ways to Enter Data
  • From Word, a few rules
    • One space between entries
    • NA for missing data
    • Save as Plain text (.txt)
    • Access with read.table
From Word: Example
  • Sample data

Age Pop Looks

18 5 65

20 1 13

21 6 34

NA 9 60

21 7 98

  • Save as APL.txt on E drive, folder R
  • Read in as:

>APL = read.table("E:/R/APL.txt", header=T)

Checking from Word
  • Do these:
    • >APL
    • >colMeans(APL)
    • >mean (Pop) [gives error]
    • >attach (APL)
    • >mean (Pop)
From SPSS file
  • Be sure you have foreign library
    • Check with: > library ( ) [if needed, load]
    • Activate with: > library (foreign)
  • Have an SPSS file FinalData,

which we’ll put into FinalR,

using read.spss and

to.data.frame like this

>FinalR = read.spss('E:/Project/FinalData.sav', to.data.frame = T)

3-2 Exporting R Results
  • For most intro applications, you’ll be content with output on R Console or Rcmdr Output window
  • You can copy and paste to Word
    • Hint: Use monospaced font for better alignment
  • Can also save to a variety of formats from Base R or Rcmdr
Exporting Stats from Base R
  • Stats to an Excel file
    • R object = function(data set)
    • MYMEANS = colMeans(SATGPA20R)
    • Save MYMEANS as a .csv file

Then

    • write.csv(MYMEANS, file="exact path")
    • write.csv(MYMEANS, file="E:/R/MYMEANS.csv")
    • Can access MYMEANS.csv with Excel
    • Can read it, in R, with read.csv("E:/R/MYMEANS.csv")
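
A sketch of the full round trip, using a temporary file so it runs on any computer. The data frame demo reuses the IQ and CS values from the data.frame slide; demo, out_path and back are names chosen for the illustration, and colMeans() computes the column means.

```r
demo <- data.frame(IQ = c(110, 95, 140, 89, 102),
                   CS = c(59, 40, 62, 40, 55))
MYMEANS <- colMeans(demo)

out_path <- tempfile(fileext = ".csv")   # stand-in for a path such as "E:/R/MYMEANS.csv"
write.csv(MYMEANS, file = out_path)

# read.csv() needs the file path in quotes, not the object name
back <- read.csv(out_path, row.names = 1)
back
```

The row.names = 1 setting tells R that the first column of the file holds the labels (IQ, CS), not data.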
Exporting Graphs from Base R
  • Easy in R Graphics window and

works same for base R and Rcmdr

  • Right click on the graph
  • Copy as metafile (and paste wherever)
  • Save as metafile (and save wherever)
Exporting from R Commander
  • Easy, works much like in Word
  • After running a stat,
    • Go to File menu, Save output as, give file a name and destination, click Save
  • Note file saved as a .txt file
  • Saving graphs: Same as from Base R
3-3 Bonus: Build Your Own Functions
  • You can custom-make a function and save it for future use
  • Example: function to get mean of a data set + 2 times its SD

> weirdstat = function(x) mean(x) + (2*sd(x))

  • Now try:

>weirdstat(GPA)

  • Function names get saved like data sets and they are case sensitive
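
Longer functions are usually written with braces and named arguments. The sketch below is a variation on weirdstat; the name weirdstat2 and the arguments k and na.rm are additions for illustration, not part of the slide.

```r
# Same idea as weirdstat, with the multiplier exposed as an argument.
# 'k' and 'na.rm' are additions for illustration, not part of the slide.
weirdstat2 <- function(x, k = 2, na.rm = FALSE) {
  mean(x, na.rm = na.rm) + k * sd(x, na.rm = na.rm)
}

Scores <- c(22, 34, 18, 29, 36)
weirdstat2(Scores)                        # mean + 2 SD, same as the one-line version
weirdstat2(Scores, k = 3)                 # mean + 3 SD
weirdstat2(c(Scores, NA), na.rm = TRUE)   # missing value ignored
```

Giving k a default of 2 means the function behaves like the original unless you ask for something else.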
3-4 An Example of an Add-on Package
  • Getting Info about Packages (need Internet)
    • Take it slowly
    • Go to Task Views in

http://cran.r-project.org/

  • Gives categories of packages (23 now)
  • Click on link for a category
  • Package names: usually cryptic, often obscure
  • To see what’s in a package:
    • Click on its link
    • Look at its Reference Manual
Installing an Add-on Package
  • Follow usual steps for download
    • Be sure to activate with >library(pkg)
    • Download psychometric package
  • Using an Add-on Package
    • Basically a collection of functions
  • Examples with psychometric package
    • r.nil(r, n)
    • rdif.nul(r1, r2, n1, n2)
3-5 Keeping Up to Date
  • All parts of R (base, Commander, add-on packages) periodically updated
  • Check the CRAN site (http://cran.r-project.org/) for updates
  • Update by downloading new version

(need Internet connection for this)

3-6 Selected References
  • Key URLs
    • R home: http://www.r-project.org/
    • Download: http://cran.r-project.org/
    • For many other introductions to R:

http://cran.r-project.org/other-docs.html

References (cont)
  • Some ‘Official’ books – online as pdfs
    • Fox, J. (2005). Getting started with the R Commander
    • R Development Core Team (2009). R Data Import/Export version 2.9.0.
    • Venables, W. N., Smith, D. M., & the R Development Core Team (2009). An introduction to R. Notes on R: A programming environment for data analysis and graphics version 2.9.0.
References (cont)
  • Some other books
    • Dalgaard, P. (2008). Introductory statistics with R (2nd ed.). New York: Springer.
    • Everitt, B. S., & Hothorn, T. (2006). A handbook of statistical analyses using R. Boca Raton, FL: Chapman and Hall.
    • Murrell, P. (2005). R graphics. Boca Raton, FL: Chapman and Hall.
To cite use of R
  • To cite the use of R for statistical work, R documentation recommends the following:

R Development Core Team (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

  • Get the latest citation by typing citation ( ) at the > prompt in the R Console.