
STAT 534: Statistical Computing



  1. STAT 534: Statistical Computing
     Hari Narayanan (harin@uw.edu)

  2. Course objectives
     • Write programs in R and C tailored to the specifics of the statistics problems you want to solve
     • Familiarize yourself with:
        • optimization techniques
        • Markov chain Monte Carlo (MCMC)

  3. Logistics
     • Class:
        • Tuesdays and Thursdays, 12:00pm – 1:20pm
     • Office hours:
        • Thursday 2:30pm – 4pm (Padelford B-301) or by appointment
     • Textbooks:
        • Robert & Casella: Introducing Monte Carlo Methods with R
        • Kernighan & Ritchie: C Programming Language
     • Evaluation:
        • 4 assignments
        • 2 quizzes
        • Final project

  4. Introduction to R
     • R is a scripting language for statistical data manipulation and analysis
     • R is the successor of S/S-PLUS
     • R is a standard tool for professional statisticians
     • R is free and available on the major platforms (Windows, Unix, Mac)
     • It is a general-purpose, object-oriented language
     • It is an interpreted programming language

  5. Getting R
     • Main website: http://cran.r-project.org/
     • About 25 standard packages come with the default download; many more contributed packages can be obtained from the main website
     • Development environment/GUI:
        • RStudio: http://www.rstudio.com/

  6. First R interactive session
     • Type interactive commands at the prompt:
         > 2+3
         [1] 5
         > 2==4
         [1] FALSE
         > 5/0
         [1] Inf
         > 0/0
         [1] NaN
     • Note that R is case sensitive
     • Getting help:
         > help(FALSE)
         > ?FALSE
     • Ending the session:
         > q()
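The special values in the session above can be tested for explicitly. The following is a minimal sketch, not part of the original slides; the variable names are illustrative only.

```r
# Results of the three arithmetic expressions from the session above
x <- c(2 + 3, 5 / 0, 0 / 0)

finite_flags   <- is.finite(x)    # TRUE only for ordinary numbers
infinite_flags <- is.infinite(x)  # TRUE only for Inf or -Inf
nan_flags      <- is.nan(x)       # TRUE only for NaN

# Case sensitivity: 'value' and 'Value' are two different objects
value <- 1
Value <- 2
same_object <- identical(value, Value)  # FALSE
```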

  7. R workspaces
     • R creates and manipulates objects: variables, arrays of numbers, lists of character strings, functions, and structures built from these components:
         > a = 4
         > b = 5
         > objects()   # list all the objects in this workspace
         [1] "a" "b"
         > ls()        # same as objects()
         [1] "a" "b"
         > rm(a)       # remove an object from this workspace
         > ls()
         [1] "b"
     • Objects of the current session are stored in .RData in the current folder (when the workspace is saved on exit), and the command history is stored in .Rhistory
     • These are reloaded every time you start R from the same directory
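Objects can also be written to and restored from a file explicitly, rather than relying on the default .RData. This sketch assumes base R's save() and load(); the temporary file name and the variable ws_file are hypothetical choices for illustration.

```r
# Two objects, as in the workspace example above
a <- 4
b <- 5

# Save them to a temporary file instead of the default .RData
ws_file <- tempfile(fileext = ".RData")
save(a, b, file = ws_file)

# Remove the objects and confirm that 'a' is gone from this environment
rm(a, b)
gone <- !exists("a", inherits = FALSE)

# Restore both objects from the file
load(ws_file)
restored <- exists("a", inherits = FALSE) && exists("b", inherits = FALSE)
```

Calling save.image() with no arguments writes the whole workspace to .RData in the current directory, which is the file R reloads at startup.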

  8. Assignment
     • There are multiple ways to assign values (primitive values or the results of evaluating an expression) to variables:
         > a = 2 + 3
         > a <- 2 + 3
         > 2 + 3 -> a
         > assign("a", 2+3)
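One reason to use assign() rather than the operators is that the variable name can be computed at run time. A small sketch, with the names x1, x2, x3 chosen purely for illustration:

```r
# Create x1, x2, x3 holding 1, 4, 9; the names are built as strings
for (i in 1:3) {
  assign(paste0("x", i), i^2)
}

# get() is the counterpart of assign(): look an object up by its name
second <- get("x2")
```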

  9. Vectors
     • Created using the c (concatenation) function:
         > v = c(1,2)
         > v
         [1] 1 2
     • A number by itself is considered a vector of length 1
     • No nesting: vectors passed to c are flattened
         > u = c(-4, v, 4)
         > u
         [1] -4  1  2  4
     • Missing values are written NA: c(1, NA, 4)
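Missing values propagate through most computations, so it helps to see how they are detected and skipped. A brief sketch using base R functions; the variable names are illustrative.

```r
v <- c(1, NA, 4)

n_elements <- length(v)          # NA still counts as an element
missing    <- is.na(v)           # which elements are missing
total_raw  <- sum(v)             # NA: the missing value propagates
total      <- sum(v, na.rm = TRUE)   # drop NA before summing
average    <- mean(v, na.rm = TRUE)  # mean of the two observed values

# A single number is a vector of length 1
scalar_length <- length(7)
```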

  10. Operations on vectors
     • Regular arithmetic operations apply (+, -, *, /, ^). The shorter vector is recycled to match the needed length:
         > a = c(1,2)      # recycled to 1 2 1 when combined with b
         > b = c(1,2,3)
         > r = 3*a+b-1
         Warning message:
         In 3 * a + b :
           longer object length is not a multiple of shorter object length
         > r
         [1] 3 7 5
     • Other math functions can be applied element-wise: sqrt, log, ...
     • Other functions: max, min, length, sum, prod, mean, var, sort
         > sort(c(4,3,7))
         [1] 3 4 7
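When the longer length is an exact multiple of the shorter one, recycling happens silently. The sketch below, with illustrative names, makes the recycled vector explicit using rep() so the two results can be compared.

```r
a <- c(1, 2)
b <- c(10, 20, 30, 40)

# Implicit recycling: a is used as 1 2 1 2, with no warning
s_implicit <- a + b

# The same computation with the recycling written out
a_full     <- rep(a, length.out = length(b))
s_explicit <- a_full + b

# A single number is recycled across the whole vector
doubled <- 2 * b

# Math functions work element by element
roots <- sqrt(c(1, 4, 9))
```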

  11. Logical operations
     • Operators <, <=, >, >=, ==, &, |, !
         > a = c(2,4)
         > r1 = a>3
         > r1
         [1] FALSE  TRUE
         > r2 = a>4
         > r2
         [1] FALSE FALSE
         > r1 & r2
         [1] FALSE FALSE
         > r1 | r2
         [1] FALSE  TRUE
         > ! r1
         [1]  TRUE FALSE
     • Logical values can be used in arithmetic operations: FALSE is coerced to 0, TRUE to 1. Conversely, in logical operations 0 is treated as FALSE and any other number as TRUE
         > r1 + 1
         [1] 1 2
         > c(2,3) & c(0,1)
         [1] FALSE  TRUE
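Because TRUE counts as 1 and FALSE as 0, sums and means of logical vectors give counts and proportions. A short sketch with made-up data; any() and all() are base R helpers not mentioned on the slide.

```r
x <- c(2, 4, 7, 1, 9)
big <- x > 3                 # FALSE TRUE TRUE FALSE TRUE

count_big <- sum(big)        # number of elements greater than 3
share_big <- mean(big)       # proportion of elements greater than 3

some_above_8 <- any(x > 8)   # is at least one element above 8?
all_positive <- all(x > 0)   # are all elements positive?
```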

  12. Generating vectors
     • The : operator (high precedence in an expression)
         > a = 3
         > 1:a
         [1] 1 2 3
         > 1:a+1               # same as (1:a)+1
         [1] 2 3 4
         > 1:(a+1)
         [1] 1 2 3 4
     • The seq function
         > seq(from=2, to=4)   # named arguments; same as seq(2,4)
         [1] 2 3 4
         > seq(to=4, from=2)
         [1] 2 3 4
         > seq(from=2, length=3)
         [1] 2 3 4
     • The rep function
         > a = c(1,2)
         > rep(a, times=2)
         [1] 1 2 1 2
         > rep(a, each=2)
         [1] 1 1 2 2
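A few further variations on the same functions, offered as a sketch rather than as course material: a non-integer step, a fixed output length, and the common pitfall of 1:n when n is 0. The variable names are illustrative.

```r
n <- 3
shifted <- 1:n - 1                    # ':' binds tighter than '-': (1:n) - 1
shorter <- 1:(n - 1)                  # parentheses change the end point

by_quarter <- seq(0, 1, by = 0.25)         # fixed step
five_steps <- seq(2, 4, length.out = 5)    # fixed number of points

uneven <- rep(c(1, 2), times = c(3, 1))    # repeat each element a given number of times

# Pitfall: 1:0 counts down instead of being empty; seq_len(0) is empty
counts_down <- 1:0
empty       <- seq_len(0)
```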

  13. Manipulating vector data
     • Simple indexing:
         > a = c(2,3,8)
         > a[1]
         [1] 2
         > a[5]            # out of range
         [1] NA
         > a[-1]           # everything except the first element
         [1] 3 8
     • More complex:
         > a[1:2]
         [1] 2 3
         > a[a>2 & a<7]
         [1] 3
         > a[c(1,1)]
         [1] 2 2
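Index expressions can also appear on the left-hand side of an assignment. The sketch below uses the same vector as the slide; which() is a base R function that converts a logical index into positions.

```r
a <- c(2, 3, 8)

positions <- which(a > 2)   # positions of the elements greater than 2
last_only <- a[-c(1, 2)]    # drop the first two elements

a[a > 2] <- 0               # overwrite the selected elements: a is now 2 0 0
a[5] <- 1                   # assigning past the end extends the vector with NA
```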

  14. Matrices
     • Associating a dimension vector with a vector allows it to be treated by R as an array/matrix:
         > a = c(2,3,8)
         > attributes(a)
         NULL
         > dim(a) = c(3,1)
         > a
              [,1]
         [1,]    2
         [2,]    3
         [3,]    8
         > attributes(a)
         $dim
         [1] 3 1
     • The matrix function fills column by column unless byrow=TRUE:
         > matrix(c(1,2,3,4,5,6), nrow=2)
              [,1] [,2] [,3]
         [1,]    1    3    5
         [2,]    2    4    6
         > matrix(c(1,2,3,4,5,6), nrow=2, byrow=TRUE)
              [,1] [,2] [,3]
         [1,]    1    2    3
         [2,]    4    5    6
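Since a matrix is a vector with a dim attribute, changing dim reshapes it without touching the underlying data. A minimal sketch with illustrative names:

```r
m <- matrix(1:6, nrow = 2)     # 2 x 3, filled column by column
shape_before <- dim(m)

flat <- as.vector(m)           # the underlying data, in storage order

dim(m) <- c(3, 2)              # reshape to 3 x 3? no: to 3 rows, 2 columns
shape_after <- dim(m)
corner      <- m[3, 1]         # third element of the first column
next_column <- m[1, 2]         # first element of the second column
```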

  15. Operations on matrices
     • Addition/subtraction/element-wise multiplication: +, -, *
     • Matrix multiplication: %*%
     • Transpose: function t(), e.g. t(matrix(c(1,2),nrow=1))
     • diag function:
        • If the argument is a number, we get an identity matrix
            > diag(2)
                 [,1] [,2]
            [1,]    1    0
            [2,]    0    1
        • If the argument is a vector, we get a diagonal matrix with the elements of the vector
            > diag(c(1,2))
                 [,1] [,2]
            [1,]    1    0
            [2,]    0    2
        • If the argument is a matrix, we get the elements of its main diagonal
            > m
                 [,1] [,2]
            [1,]    3    5
            [2,]    4    6
            > diag(m)
            [1] 3 6
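The difference between * and %*% is easy to overlook, so here is a small side-by-side sketch. The matrix A and the other names are chosen for illustration only.

```r
A  <- matrix(c(1, 2, 3, 4), nrow = 2)   # columns are (1, 2) and (3, 4)
Id <- diag(2)                           # 2 x 2 identity matrix

elementwise <- A * A      # each entry squared
product     <- A %*% A    # true matrix product
unchanged   <- A %*% Id   # multiplying by the identity returns A
transposed  <- t(A)       # rows and columns swapped
main_diag   <- diag(A)    # elements on the main diagonal
```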

  16. Indexing
     • Similar to indexing vectors, except we have an indexing vector for every dimension:
         > m
              [,1] [,2] [,3]
         [1,]    1    3    5
         [2,]    2    4    6
         > m[2,2]
         [1] 4
         > m[c(2),c(2)]            # indexing vectors
         [1] 4
         > m[c(1,2),c(2)]          # first 2 rows and 2nd column
         [1] 3 4
         > m[c(1,2),c(2,3)]
              [,1] [,2]
         [1,]    3    5
         [2,]    4    6
         > m[c(1,2),c(2,3)] = 0
         > m
              [,1] [,2] [,3]
         [1,]    1    0    0
         [2,]    2    0    0
         > m[c(TRUE,FALSE),TRUE]   # keep the first row and all columns
         [1] 1 0 0
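Selecting a single row or column returns a plain vector, as in the last example above. The sketch below shows the empty-index shorthand and the drop argument of `[`, which keeps the result a matrix; the variable names are illustrative.

```r
m <- matrix(1:6, nrow = 2)

first_row  <- m[1, ]                   # empty column index: all columns
second_col <- m[, 2]                   # empty row index: all rows

row_vector <- m[1, , drop = FALSE]     # stays a 1 x 3 matrix
row_shape  <- dim(row_vector)

large <- m[m > 4]                      # a logical matrix index returns a vector
```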
