
Module 3 R Script Basics


Presentation Transcript


  1. Module 3: R Script Basics
     Learning: Do, See & Hear, Read
     The PowerPoint must be in slide show mode to play the videos and follow the hyperlinks.

  2. Module 3 R Script Basics: Goals
     • Systematically start you on your R learning curve
     • Introduce essential functions
     • Demonstrate working R scripts
     • Have you run and edit R scripts through assignments
     • Provide building blocks for your own scripts

  3. Essential Tasks and Functions Covered in This Module
     You can get the complete R documentation for any function from the R Console with help("function") or ?function. R is case-sensitive, so the function is help(), not Help(). For example:
        > help("for")
        > ?read.table

  4. Module 3 R Scripts Menu
     • R's File Path & Name Conventions
     • How to Read a Data File
     • Working with R Scripts
     • Working with Vectors & data.frames
     • Working with Dates
     • Subsetting & Factors
     Press a hyperlink to go to its topic slide; press a video box to start the video.

  5. Video 3-1: R's File Path

  6. R's File Path and Name Conventions
     Issue
     • Windows and R handle forward (/) and backward (\) slashes differently
     • Windows path: "C:\Learn_R\Mod_3_R_Scripts"
     • R treats \ inside a string as an escape character
     • Paths must be adjusted to R's conventions
     R's 2 valid path formats
     • "C:/Learn_R/Mod_3_R_Scripts" or
     • "C:\\Learn_R\\Mod_3_R_Scripts"
     • R lets you use / or \\, but not a single \
     Get the file path interactively
     • R's choose.files() function (available in R for Windows) brings up the Select File window
     • Lets users interactively select a data file
     • Copy and paste the correctly formatted file name into your script
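
The two path formats on this slide can be checked from the console. The sketch below is a minimal illustration, not part of the course scripts: it converts a backslash path to the forward-slash form with gsub() and builds the same path with file.path(). The variable names are made up for the example, and the folder does not need to exist.

```r
# Path used only for illustration; the folder does not need to exist.
win_path <- "C:\\Learn_R\\Mod_3_R_Scripts"   # each \\ is stored as one backslash

# Convert backslashes to forward slashes. fixed = TRUE makes gsub() treat
# the pattern as a literal backslash rather than a regular expression.
r_path <- gsub("\\", "/", win_path, fixed = TRUE)

# file.path() joins path components with forward slashes.
built_path <- file.path("C:", "Learn_R", "Mod_3_R_Scripts")

print(r_path)                          # "C:/Learn_R/Mod_3_R_Scripts"
print(identical(r_path, built_path))   # TRUE
```

Writing a single backslash, as in "C:\Learn_R", fails because R tries to read \L as an escape sequence.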

  7. Assignment 3-1: choose.files()
     • Start an R session
     • In the R Console, open the script: "C:/Learn_R/Mod_3_R_Script/Ex_Scr_3_1_choose_file.R"
     • Save the script as: "C:/Learn_R_Mod_3_R_Script/Practice_3_1_choose.R"
     • Edit the script to read the data file: "C:\\Learn_R\\Mod_3_R_Script_Basics\\Data_3_1_GISS_1980_By_year.txt"
     Expected Result: shown as an image on the slide

  8. Video 3-2: How to Read a Data File

  9. How to Read a Data File (Txt, CSV, Web Based)
     read.table()
     • May be the single most frequent function you will use
     • Goal is to read data from a source file and assign it to a data.frame
     • Web-based files: simply specify the URL rather than a path, e.g. link <- "http://….."
     Example R script (each ? is a placeholder to fill in for your file)
        ## Ex_Scr_3_2_read_file.R
        ## Script to read data file, list contents
        ## STEP 1: SETUP - Source File
        rm(list=ls())
        link <- "??"     # path or URL of the source file
        my_data <- read.table(link,
                     skip = ?, sep = "?", dec = ".",
                     row.names = NULL, header = ?,
                     colClasses = c("??", "??"),
                     comment.char = "#",
                     na.strings = c("", "*", "-", -99.9, -999.9),
                     col.names = c("??", "??"))
        my_data
     Tip
     • Use Notepad to look at the data file
     • Print out the first few lines of the file
     • Use the printout to answer the ??
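
To see how the read.table() arguments interact, the sketch below fills in the template for a tiny inline sample instead of a file. The sample values are invented, and the text = argument stands in for the link so the block runs without any data file.

```r
# Invented sample standing in for a small comma-separated data file.
sample_text <- "# year, anomaly (invented values)
1980,0.26
1981,-99.9
1982,0.14"

my_data <- read.table(text = sample_text,          # text = replaces the file path
                      skip = 0, sep = ",", dec = ".",
                      header = FALSE,               # first data line is not a header
                      colClasses = c("numeric", "numeric"),
                      comment.char = "#",           # lines starting with # are ignored
                      na.strings = c("", "*", "-", "-99.9", "-999.9"),
                      col.names = c("yr", "anom"))

print(my_data)   # the -99.9 entry is read as NA
str(my_data)     # both columns are numeric
```

Because -99.9 is listed in na.strings, the 1981 value is read as a missing value rather than as a real anomaly.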

  10. Assignment 3-2: Read Data File
     • Start an R session
     • In the R Console, open the script: "C:/Learn_R/Mod_3_R_Script/Ex_Scr_3_2_read_file.R"
     • Save the script as: "C:/Learn_R_mod-3_R_Script/Practice_3_2_read.R"
     • Edit the script to read the data file: "C:\\Learn_R\\Mod_3_R_Script_Basics\\Data_3_1_GISS_1980_by_year.txt"
     Expected Result
        ## Practice_3_2_read_file.R
        ## Script to read data file, list contents
        ## STEP 1: SETUP - Source File
        rm(list=ls())
        link <- choose.files()
        my_data <- read.table(link,
                     skip = 0, sep = ",", dec = ".",
                     row.names = NULL, header = FALSE,
                     colClasses = c("numeric", "numeric"),
                     comment.char = "#",
                     na.strings = c("", "*", "-", -99.9, -999.9),
                     col.names = c("yr", "anom"))
        my_data

  11. Video 3-3: Working with R Scripts

  12. Working with R Scripts
     Let's look at the simple, structured R script below. Many of our R scripts will handle a similar set of tasks:
     • Define the source data file
     • Read the data & assign it to a data.frame
     • Manipulate the data
     • Produce charts/reports
        ## Ex_Scr_3_3_work_w_scripts.R
        ## Script to read file, produce plot
        ## STEP 1: SETUP & SOURCE FILE
        rm(list=ls()); par(las=1)
        link <- choose.files()
        ## STEP 2: READ DATA
        my_data <- read.table(link,
                     sep = ",", dec = ".", skip = 0,
                     row.names = NULL, header = TRUE,
                     colClasses = c("numeric", "numeric"),
                     na.strings = c("", "*", "-", -99.99, 99.9, 999.9),
                     col.names = c("Var1", "Var2"))
        ## STEP 3: MANIPULATE DATA
        Title <- "Ex_scr_3_3.R Example Output\n Description of Data Set"
        ## STEP 4: CREATE PLOT
        plot(Var2 ~ Var1, data = my_data, type = "l", col = "red", main = Title)
     Things to notice
     • Extensive comments (#s)
     • Delineation of steps
     • Uses several arguments in the read.table() and plot() functions
     • Indentation of arguments
     • This script can be edited and reused for similar tasks

  13. Working with R Scripts
        ## Ex_Scr_3_3_work_w_scripts.R
        ## Script to read 2 variable data file, produce XY plot
        ## STEP 1: SETUP & SOURCE FILE
        rm(list=ls()); par(las=1)
        link <- choose.files()
        ## STEP 2: READ DATA
        my_data <- read.table(link,
                     sep = ",", dec = ".", skip = 0,
                     row.names = NULL, header = TRUE,
                     colClasses = c("numeric", "numeric"),
                     na.strings = c("", "*", "-", -99.99, 99.9, 999.9),
                     col.names = c("Var1", "Var2"))
        ## STEP 3: MANIPULATE DATA
        Title <- "Ex_scr_3_3.R Example Output\n Description of Data Set"
        ## STEP 4: CREATE PLOT
        plot(Var2 ~ Var1, data = my_data, type = "l", col = "red", main = Title)
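
The four-step structure can be tried without the course files. The sketch below is an assumption-laden stand-in for Ex_Scr_3_3: it writes a temporary CSV with invented values in place of choose.files(), and it draws to a null graphics device so that it runs without opening a window.

```r
## STEP 1: SETUP & SOURCE FILE
# A temporary file with invented values replaces choose.files() so the
# sketch runs without user interaction.
link <- tempfile(fileext = ".csv")
writeLines(c("Var1,Var2", "1,2.5", "2,3.1", "3,*", "4,4.0"), link)

## STEP 2: READ DATA
my_data <- read.table(link,
                      sep = ",", dec = ".", skip = 0,
                      header = TRUE,
                      colClasses = c("numeric", "numeric"),
                      na.strings = c("", "*", "-"),
                      col.names = c("Var1", "Var2"))

## STEP 3: MANIPULATE DATA
Title <- "Sketch Output\nInvented Data Set"

## STEP 4: CREATE PLOT
pdf(NULL)                 # null device: draw without opening a window or file
par(las = 1)              # horizontal axis labels, as in the course script
plot(Var2 ~ Var1, data = my_data, type = "l", col = "red", main = Title)
invisible(dev.off())      # close the device
unlink(link)              # remove the temporary file
```

In an interactive session, drop the pdf(NULL) and dev.off() lines to see the plot on screen.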

  14. Assignment 3-3: Edit R Script
     Source data file: "C:\\Learn_R\\Mod_3_R_Script_Basics\\Data_3_2_CO2_by_month.txt"
     • Go to the Desktop
     • Press the R shortcut
     • In the R GUI, File > Open, select C:\Learn_R\Mod_3_R_Script_Basics\Ex_Scr_3_3_work_w_scripts.R
     • Save as: C:\Learn_R\Mod_3_R_Script_Basics\Practice_3_3_work_w_script.R
     • Edit the Practice_3_3_work_w_script.R file
     • Edit the comment at the top
     • Edit col.names: c("yr_frac", "CO2")
     • Edit Title line 2: "Monthly CO2 (ppmv) Mauna Loa,Hawaii"
     • Change col = "blue"
     • Save changes to Practice_3_3_…
     • Control A + Control R to run
     Script to edit
        ## Ex_Scr_3_3_work_w_scripts.R
        ## Script to read file, produce plot
        ## STEP 1: SETUP & SOURCE FILE
        par(las=1)
        link <- choose.files()
        ## STEP 2: READ DATA
        my_data <- read.table(link,
                     sep = ",", dec = ".", skip = 0,
                     row.names = NULL, header = TRUE,
                     colClasses = c("numeric", "numeric"),
                     na.strings = c("", "*", "-", -99.99, 99.9, 999.9),
                     col.names = c("Var1", "Var2"))
        ## STEP 3: MANIPULATE DATA
        Title <- "Ex_scr_3_3.R Example Output\nDescription of Data Set"
        ## STEP 4: CREATE PLOT
        plot(Var2 ~ Var1, data = my_data, type = "l", col = "red", main = Title)

  15. Assignment 3-3: Expected Result
     Practice_3_3_work_w_scripts.R script
        ## Practice_3_3_work_w_scripts.R
        ## Script to read file, produce plot
        ## STEP 1: SETUP - Source File
        par(las=1)
        link <- choose.files()
        ## STEP 2: READ DATA
        my_data <- read.table(link,
                     sep = ",", dec = ".", skip = 0,
                     row.names = NULL, header = TRUE,
                     colClasses = c("numeric", "numeric"),
                     na.strings = c("", "*", "-", -99.99, 99.9, 999.9),
                     col.names = c("Yr_frac", "CO2"))
        ## STEP 3: MANIPULATE DATA
        Title <- "Practice_3_3_work_w_scripts.R Example Output\nMonthly CO2 (ppmv) Mauna Loa,Hawaii"
        ## STEP 4: CREATE PLOT
        plot(CO2 ~ Yr_frac, data = my_data, type = "l", col = "blue", main = Title)

  16. Video 3-4: Vectors and data.frames

  17. Vectors and data.frames: What You Need to Know
     Vector data types
     • Numeric (2.67)
     • Character ("John Smith")
     • Logical (TRUE)
     • Factor ("Male")
     • All items in a vector must be the same data type; R will coerce all vector items to a single type
     Vector names
     • data.frame[[column number]]
     • data.frame$col.name
     data.frame & vector indexes [ ]
     • df[c] - column number c in the data.frame (returned as a one-column data.frame)
     • df[r,c] - row & column in the data.frame
     • v[r] - item number r in the vector
     Other points
     • Calculated variables are vectors
     • Vectors are dynamic
     • Every vector in a data.frame must have the same number of rows
     • nrow() counts the number of data rows in a data.frame
     • length() counts the number of items in a vector
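
The indexing forms on this slide are easiest to compare side by side. The sketch below uses a small data.frame with invented values; all object names are made up for the illustration.

```r
# Invented values for illustration.
df <- data.frame(yr   = c(1980, 1981, 1982),
                 anom = c(0.26, 0.32, 0.14))

col_by_name  <- df$anom     # vector, selected by column name
col_by_index <- df[[2]]     # the same vector, selected by column number
one_cell     <- df[3, 2]    # row 3, column 2
sub_df       <- df[2]       # single bracket keeps a one-column data.frame

# A calculated variable is itself a vector and can be added as a new column.
df$anom_x100 <- df$anom * 100

# Mixing types in c() coerces every item to one type (here: character).
mixed <- c(1, "a", TRUE)

print(nrow(df))             # 3 rows in the data.frame
print(length(col_by_name))  # 3 items in the vector
print(class(mixed))         # "character"
```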

  18. Functions that Create Vectors: c(), seq(), rep()
     (In addition to the read.table() function)
     • c() - "combine" function
          my_animals <- c("dog", "cat", "rabbit")
          my_num <- c(1,8,11.2, 13,6, 19.13)
     • seq() - "sequence" function: uniformly spaced series
          my_numbers <- seq(a, b, inc)
          a - start value
          b - end value
          inc - increment; 1 is default
          num <- seq(3,17,2)    # (3,5,7,9,11,13,15,17)
          uniform <- 5:9        # (5,6,7,8,9)
     • rep() - "replicate" function
          my_repeat_num <- rep(q, n)
          q - number or character to be replicated
          n - number of replications
          my_rep <- rep("abc", 3)    # ("abc","abc","abc")
     3 ways to enter vector items
     • Itemize:
          var_type <- c("character", "numeric", "numeric", "logical", "numeric", "numeric")
     • Combine c() & rep():
          var_type <- c("character", rep("numeric", 2), "logical", rep("numeric", 2))
     • Combine c() & seq():
          x <- c(seq(1,10,2), 11,14,18,19)
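
The three vector-building functions can be run together to confirm that the itemized and combined forms give the same result. This is a minimal sketch using the slide's own examples; the names itemized and combined are introduced only for the comparison.

```r
my_animals <- c("dog", "cat", "rabbit")     # c() combines items into a vector
odds       <- seq(1, 9, by = 2)             # 1 3 5 7 9
span       <- 5:9                           # shorthand for seq(5, 9)
my_rep     <- rep("abc", times = 3)         # "abc" "abc" "abc"

# Itemized and c() + rep() forms of the same vector.
itemized <- c("character", "numeric", "numeric", "logical", "numeric", "numeric")
combined <- c("character", rep("numeric", 2), "logical", rep("numeric", 2))
print(identical(itemized, combined))        # TRUE

# c() + seq(): a regular series followed by individual values.
x <- c(seq(1, 10, 2), 11, 14, 18, 19)
print(x)                                    # 1 3 5 7 9 11 14 18 19
print(length(x))                            # 9
```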

  19. How to Make Basic Vector Calculations: sum(), max(), min(), mean(), median()
        # If you may have missing values, use na.rm = TRUE;
        # NAs must be removed to get a valid answer
        max(vector, na.rm = TRUE)
        min(vector, na.rm = TRUE)
        mean(vector, na.rm = TRUE)
        median(vector, na.rm = TRUE)
        sum(vector, na.rm = TRUE)
     • summary(vector) prints quartiles, mean, min, max
     • summary(data.frame) prints a summary for each column
     • quantile(x, 0.9) finds the 90th percentile
     • rnorm(n1, m, d) generates n1 random numbers with mean m and sd d
     Example
        > r <- rnorm(10,100,5)    # creates vector with 10 random nos, mean = 100, sd = 5
        > r_mean <- mean(r)       # calculate mean of vector r
        > r_mean                  # output r_mean to console
        [1] 98.16317
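
The effect of na.rm is clearest on a vector that actually contains a missing value. The sketch below uses invented numbers; set.seed() is added only so the random example is repeatable and is not part of the slide.

```r
vals <- c(4, 8, NA, 15, 16, 23, 42)    # invented values with one missing item

print(mean(vals))                      # NA: one missing item spoils the result
print(mean(vals, na.rm = TRUE))        # 18
print(median(vals, na.rm = TRUE))      # 15.5
print(sum(vals, na.rm = TRUE))         # 108
print(max(vals, na.rm = TRUE))         # 42
print(min(vals, na.rm = TRUE))         # 4

# quantile() (singular) returns a named value; 0.9 asks for the 90th percentile.
q90 <- quantile(vals, 0.9, na.rm = TRUE)
print(q90)

summary(vals)                          # quartiles, mean, min, max and NA count

set.seed(1)                            # makes the random draw repeatable
r <- rnorm(10, mean = 100, sd = 5)     # 10 random numbers, mean 100, sd 5
r_mean <- mean(r)
print(r_mean)
```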

  20. which() Returns Rows with a Specific Value in a Vector
     which(x == ??) returns the index of the row(s) of vector x that meet the criterion ??
        # Find index of max() value
        vals <- c(1,3,2,68,11,13,19,8,49,4)
        my_max <- max(vals)
        which_val <- which(vals == my_max)
        cat(c("Max = ", my_max, "val #", which_val), fill = 30)
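
which() returns every matching position, which matters when a value is repeated. The sketch below changes the slide's vector slightly (68 appears twice) to show this, and compares it with which.max(), a base R function not mentioned on the slide.

```r
vals <- c(1, 3, 2, 68, 11, 13, 68, 8, 49, 4)   # 68 appears twice

all_max   <- which(vals == max(vals))   # every position holding the maximum
first_max <- which.max(vals)            # only the first such position
big_idx   <- which(vals > 10)           # positions of items greater than 10
big_vals  <- vals[big_idx]              # the items themselves

print(all_max)     # 4 7
print(first_max)   # 4
print(big_idx)     # 4 5 6 7 9
print(big_vals)    # 68 11 13 68 49
```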

  21. attach() data.frame
     • To use a vector in a data.frame, you must include the data.frame name:
     • data.frame$col.name or
     • data.frame[[column number]] or
     • use the attach(data.frame) function, which adds the data.frame to R's search path
     • Vectors in an attached data.frame can be accessed by name
     • Saves having to use data.frame$ before the vector name
     • detach(data.frame): it is a good idea to remove the data.frame from the search path when done
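
The sketch below compares the three ways of reaching a column: the $ form, attach()/detach(), and with(). with() is not covered on the slide; it is shown here as a common base R alternative that gives the same by-name access without changing the search path. Values are invented.

```r
df <- data.frame(yr = c(1980, 1981, 1982), anom = c(0.26, 0.32, 0.14))

# 1. Explicit data.frame name.
m1 <- mean(df$anom)

# 2. attach() puts df on the search path, so anom is found by name.
attach(df)
m2 <- mean(anom)
detach(df)                 # remove df from the search path when done

# 3. with() evaluates one expression inside df; nothing is left attached.
m3 <- with(df, mean(anom))

print(c(m1, m2, m3))       # three identical values
print("df" %in% search())  # FALSE: df is no longer on the search path
```

One caution with attach(): the attached vectors are copies, so later changes to df are not seen through the attached names.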

  22. Assignment 3-4: Working with a Vector
     • Start with a new script file
     • Save as: C:\Learn_R\Mod_3_R_Script_Basics\Practice_3_4_vectors.R
     • Create the vals vector: c(1,3,5,7,21,4,12.2,19.12,21)
     • Make these calculations:
          summary(vals)
          length(vals)
          mean(vals)
          which(vals==max(vals))
     Expected Result: shown as an image on the slide

  23. Video 3-5: Working with Dates

  24. Working with Dates: What You Need to Know
     • R, like Excel, treats dates in a special way
     • R stores a date as the number of days since Jan. 1, 1970: dates before 1/1/70 are negative, dates after 1/1/70 are positive
     • Read dates as a "character" vector
     • Use as.Date() to convert to a date vector:
          my_date <- as.Date(char_v, "%m/%d/%y")
     • Input dates must include day, month and year, in any order
     • The format string in as.Date(char_v, "%m/%d/%y") specifies how the dates are organized
          %d - day of month (01-31)
          %m - month number (01-12)
          %b - month abbreviation (Jan)
          %B - full month name (January)
          %y - 2 digit year (08)
          %Y - 4 digit year (2008)
     • Be sure to specify any delimiters in the dates: / , - *
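
The day-count storage and the format codes can be verified directly. The sketch below uses invented dates and only numeric format codes, because %b and %B depend on the computer's language settings.

```r
char_v  <- c("01/15/2008", "12/31/1969", "01/01/1970")   # invented dates
my_date <- as.Date(char_v, format = "%m/%d/%Y")          # %Y = 4 digit year

print(my_date)               # "2008-01-15" "1969-12-31" "1970-01-01"
print(class(my_date))        # "Date"
print(as.numeric(my_date))   # days since 1970-01-01; the last two are -1 and 0

# A 2 digit year needs %y instead of %Y.
short <- as.Date("01/15/08", format = "%m/%d/%y")
print(short)                 # "2008-01-15"

# If the format does not match the text, the result is NA rather than an error.
bad <- as.Date("2008-01-15", format = "%m/%d/%Y")
print(bad)                   # NA

# format() turns a Date back into text in any layout.
print(format(my_date[1], "%d-%m-%Y"))   # "15-01-2008"
```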

  25. Reading Date Character Data: Converting to R Dates
     Data file example (image on the slide): dates are character strings
        ## Script to demonstrate character date input & conversion to R date
        ## STEP 1: SETUP - Source File
        link <- "C:\\Learn_R\\Mod_3_R_Script_Basics\\Data_3_3_GISS_by_month.txt"
        ## STEP 2: READ DATA
        my_data <- read.table(link,
                     sep = ",", dec = ".", skip = 1,
                     row.names = NULL, header = FALSE,
                     colClasses = c("character", "numeric", "factor"),
                     na.strings = c("", "*", "-", -99.99, 99.9, 999.9),
                     col.names = c("char_date", "T_anom", "Enso_f"))
        ## STEP 3: Convert character dates to R dates, then get month values
        r_date <- as.Date(my_data$char_date, "%m/%d/%Y")
        r_mo <- months(r_date)
        ## STEP 4: New data.frame - add r_date & r_mo vectors
        my_data_1 <- data.frame(my_data, r_date, r_mo)
        attach(my_data_1)
        head(my_data_1)
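
The same read-then-convert pattern can be tried without the GISS file. The sketch below reads a few invented rows through the text = argument and extracts the month as a number with format(), which is an assumption-free alternative to months() when month names in the local language are not wanted.

```r
# Invented rows standing in for the monthly data file; the first line is a
# header that skip = 1 jumps over, as in the course script.
sample_text <- "date,anomaly,enso
1/15/2008,0.23,N
2/15/2008,0.31,N
3/15/2008,0.67,L"

my_data <- read.table(text = sample_text,
                      sep = ",", dec = ".", skip = 1, header = FALSE,
                      colClasses = c("character", "numeric", "factor"),
                      col.names = c("char_date", "T_anom", "Enso_f"))

# Convert the character dates, then pull out the month number.
r_date   <- as.Date(my_data$char_date, "%m/%d/%Y")
r_mo_num <- as.integer(format(r_date, "%m"))   # 1, 2, 3 regardless of locale

# Add the new vectors to a new data.frame.
my_data_1 <- data.frame(my_data, r_date, r_mo_num)
head(my_data_1)
str(my_data_1)    # char_date is character, r_date is Date
```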

  26. Assignment 3-5: Working with Dates
     • Start an R session
     • In the R Console, open the script: "C:/Learn_R/Mod_3_R_Script/Ex_Scr_3_5_Date_conv.R"
     • Run the script to read the data file: "C:\\Learn_R\\Mod_3_R_Script_Basics\\Data_3_3_GISS_by_month.txt"
     Things to notice
     • Creation of new data.frame
     • Use of attach() function
     • Use of as.Date()
     • Use of head()
     Print out & read the R documentation for as.Date() & months():
        ?as.Date
        ?months

  27. Video 3-6: Functions to Subset & Summarize Data

  28. subset(): Subset Data and Calculate Summary Values
     R lets you quickly define subsets of data and calculate summary statistics for the subset.
     Goal: calculate the average temperature anomaly for the 1930s decade
     • subset() function
          dec_subset <- subset(df, vector == ?)
     • Approach:
     • Calculate the decade for each row
     • Subset the rows with decade == 1930
     • Calculate the average for the subset
        which_decade <- 1930
        decade <- as.integer(my_data$yr/10)*10
        my_data <- data.frame(my_data, decade)
        attach(my_data)
        decade_subset <- subset(my_data, decade == which_decade)
        decade_avg <- mean(decade_subset$anom)
        cbind(which_decade, decade_avg)
     Data file example: shown as an image on the slide
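
The three-step approach can be followed on a handful of rows. The sketch below uses invented years and anomalies in place of the GISS file; floor() plays the same role as as.integer() in the slide's decade calculation.

```r
# Invented values; the real script reads these from the GISS data file.
my_data <- data.frame(yr   = c(1928, 1930, 1935, 1939, 1941),
                      anom = c(-0.10, -0.20, -0.10, 0.00, 0.10))

# Step 1: calculate the decade for each row (1935 -> 1930).
my_data$decade <- floor(my_data$yr / 10) * 10

# Step 2: keep only the rows of the chosen decade.
which_decade  <- 1930
decade_subset <- subset(my_data, decade == which_decade)

# Step 3: average the subset.
decade_avg <- mean(decade_subset$anom)

print(decade_subset)                      # the 1930, 1935 and 1939 rows
print(cbind(which_decade, decade_avg))    # 1930 and -0.1
```

The subset() call is equivalent to the logical index my_data[my_data$decade == which_decade, ].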

  29. for(i in a:b) { }: How to Use a for Loop & subset()
     What if we want the average for each decade?
     • Combine a for loop & subset()
          for (i in a:b) { subset(df, vector == ?) }
     • Approach:
     • Calculate the decade for each row
     • Subset the rows by decade
     • Calculate the average for each decade subset
        ## STEP 3: CALC DECADE MEANS
        decade <- as.integer(my_data$yr/10)*10
        my_data <- data.frame(my_data, decade)
        attach(my_data)
        dec_list <- seq(1880, 2000, 10)
        num_dec <- length(dec_list)
        dec_avg <- numeric(num_dec)       # one slot per decade
        for(i in 1:num_dec){
           dec_subset <- subset(my_data, decade == dec_list[i])
           dec_avg[i] <- mean(dec_subset$anom, na.rm = TRUE)
        }
        cbind(dec_list, dec_avg)
     Data file: shown as an image on the slide
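
The loop can be traced on a few rows. The sketch below uses invented values with one missing anomaly, and it derives the decade list from the data with unique() instead of the slide's fixed seq(1880, 2000, 10); that substitution is an assumption made so the example has no empty decades.

```r
# Invented values; one anomaly is missing to show the effect of na.rm.
my_data <- data.frame(yr   = c(1880, 1885, 1890, 1893, 1899, 1900),
                      anom = c(-0.2, -0.4, -0.3, NA, -0.1, 0.0))
my_data$decade <- floor(my_data$yr / 10) * 10

dec_list <- sort(unique(my_data$decade))   # 1880 1890 1900
dec_avg  <- numeric(length(dec_list))      # one slot per decade

for (i in seq_along(dec_list)) {
  # Rows belonging to the i-th decade, then their mean anomaly.
  dec_subset <- subset(my_data, decade == dec_list[i])
  dec_avg[i] <- mean(dec_subset$anom, na.rm = TRUE)
}

print(cbind(dec_list, dec_avg))   # averages: -0.3, -0.2, 0.0
```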

  30. tapply(): How to Summarize Data by Factor
     Another way to get the average for all decades?
     • Step 1: create the decade_f factor with as.factor(decade)
     • Step 2: summarize by decade_f with tapply(x, INDEX, FUN)
     • tapply() applies FUNction (mean, max, etc.) to the cells in x for each level of the factor INDEX
        ## STEP 3: CALC DECADE MEANS
        decade <- as.integer(my_data$yr/10)*10
        decade_f <- as.factor(decade)
        my_data <- data.frame(my_data, decade_f)
        attach(my_data)
        dec_avg <- tapply(anom, INDEX = decade_f, mean)
        cbind(dec_avg)
     Data file: shown as an image on the slide
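
tapply() produces the per-decade averages in one call. The sketch below uses invented values with one missing anomaly and passes na.rm = TRUE through to mean(); it refers to the columns with $ rather than attach(), which is a stylistic choice for the example, not the slide's method.

```r
# Invented values; one anomaly is missing.
my_data <- data.frame(yr   = c(1880, 1885, 1890, 1893, 1899, 1900),
                      anom = c(-0.2, -0.4, -0.3, NA, -0.1, 0.0))

# Step 1: decade for each row, stored as a factor.
my_data$decade_f <- as.factor(floor(my_data$yr / 10) * 10)

# Step 2: apply mean() to anom within each level of decade_f.
# Extra arguments such as na.rm = TRUE are passed on to mean().
dec_avg <- tapply(my_data$anom, INDEX = my_data$decade_f,
                  FUN = mean, na.rm = TRUE)

print(levels(my_data$decade_f))   # "1880" "1890" "1900"
print(cbind(dec_avg))             # one row per decade: -0.3, -0.2, 0.0
```

The result is named by factor level, so a single decade can be picked out with dec_avg["1890"].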

  31. Assignment 3-6: subset() & mean()
     • Start an R session
     • In the R Console, open the script: "C:/Learn_R/Mod_3_R_Script/Ex_Scr_3_6_subset_data_mean.R"
     • Run the script to read the data file: "C:\\Learn_R\\Mod_3_R_Script_Basics\\Data_3_4_GISS_By_year.csv"
     • Edit which_decade to 1940 & rerun the script
     Expected Result: shown as an image on the slide
     Print out & read the R documentation for subset() & mean():
        ?subset
        ?mean

  32. Assignment 3-7: as.factor() & tapply()
     • Start an R session
     • In the R Console, open the script: "C:/Learn_R/Mod_3_R_Script/Ex_Scr_3_8_factor_tapply.R"
     • Run the script to read the data file: "C:\\Learn_R\\Mod_3_R_Script_Basics\\Data_3_4_GISS_By_year.csv"
     Things to notice
     • Creation of new data.frame
     • Use of attach() function
     • Use of as.factor()
     • Use of tapply()
     Print out & read the R documentation for as.factor() & tapply():
        ?factor
        ?tapply
