##### Statistical Analysis


**Statistical Analysis**

Programming in R

**Vectors and assignment**

• The simplest data structure is the numeric vector.

• Type at the command line:

    > x <- c(10.4, 5.6, 3.1, 6.4, 21.7)

• Type x at the command line to see the result:

    > x
    [1] 10.4  5.6  3.1  6.4 21.7

**c() is a function**

• The function c() takes an arbitrary number of vector arguments and concatenates them.

    > y <- c(x, 0, x)
    > y
     [1] 10.4  5.6  3.1  6.4 21.7  0.0 10.4  5.6  3.1  6.4 21.7

**Vector arithmetic**

• +, -, *, /, ^
• log, exp, sin, cos, tan, sqrt, …
• max, min, range
• length, sum, prod

**Mean and variance in R**

    > mean(x)
    [1] 9.44
    > var(x)
    [1] 53.853

**Mean and variance in R (continued)**

• mean(x) can be written as:

    > sum(x)/length(x)
    [1] 9.44

• var(x) can be written as:

    > sum((x-mean(x))^2)/(length(x)-1)
    [1] 53.853

**Two-sample t-statistic**

    twosam <- function(y1, y2) {
      n1 <- length(y1); n2 <- length(y2)   # sample sizes
      yb1 <- mean(y1); yb2 <- mean(y2)     # sample means
      s1 <- var(y1); s2 <- var(y2)         # sample variances
      # pooled variance of the two samples
      s <- ((n1-1)*s1 + (n2-1)*s2)/(n1+n2-2)
      # t-statistic based on the pooled variance s
      tst <- (yb1-yb2)/sqrt(s*(1/n1+1/n2))
      tst
    }

Copy and paste the above statements onto the command line in R.

**Should look like this:**

    > twosam <- function(y1, y2) {
    +   n1 <- length(y1); n2 <- length(y2)   # sample sizes
    +   yb1 <- mean(y1); yb2 <- mean(y2)     # sample means
    +   s1 <- var(y1); s2 <- var(y2)         # sample variances
    +   # pooled variance of the two samples
    +   s <- ((n1-1)*s1 + (n2-1)*s2)/(n1+n2-2)
    +   # t-statistic based on the pooled variance s
    +   tst <- (yb1-yb2)/sqrt(s*(1/n1+1/n2))
    +   tst
    + }

**Test your function by calling it:**

    > tstat <- twosam(x, x+1)
    > tstat
    [1] -0.2154592

**Generating regular sequences**

• 1:30 is the same as c(1, 2, 3, …, 29, 30).
• The : operator has higher priority than the arithmetic operators *, /, + and -, so it is evaluated before them.
For example:

    > 2*1:5
    [1]  2  4  6  8 10

**Factors**

    > codons <- c("GCA","GCC","GCG","GCU","UGC","UGU")
    > codons
    [1] "GCA" "GCC" "GCG" "GCU" "UGC" "UGU"
    > aminoacids <- c("Ala","Ala","Ala","Ala","Cys","Cys")
    > aminoacids
    [1] "Ala" "Ala" "Ala" "Ala" "Cys" "Cys"
    > aaf <- factor(aminoacids)
    > aaf
    [1] Ala Ala Ala Ala Cys Cys
    Levels: Ala Cys
    > ii <- tapply(codons, aaf, print)
    [1] "GCA" "GCC" "GCG" "GCU"
    [1] "UGC" "UGU"

**Arrays**

    > x <- array(1:20, dim=c(4,5))
    > x
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    5    9   13   17
    [2,]    2    6   10   14   18
    [3,]    3    7   11   15   19
    [4,]    4    8   12   16   20

**Arrays (continued)**

    > x <- array(0, dim=c(4,5))
    > x
         [,1] [,2] [,3] [,4] [,5]
    [1,]    0    0    0    0    0
    [2,]    0    0    0    0    0
    [3,]    0    0    0    0    0
    [4,]    0    0    0    0    0

**Indexing arrays**

    > i <- array(c(1:3,3:1), dim=c(3,2))
    > i
         [,1] [,2]
    [1,]    1    3
    [2,]    2    2
    [3,]    3    1
    > x <- array(1:20, dim=c(4,5))
    > x
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    5    9   13   17
    [2,]    2    6   10   14   18
    [3,]    3    7   11   15   19
    [4,]    4    8   12   16   20
    > x[i]
    [1] 9 6 3
    > x[i] <- 0
    > x
         [,1] [,2] [,3] [,4] [,5]
    [1,]    1    5    0   13   17
    [2,]    2    0   10   14   18
    [3,]    0    7   11   15   19
    [4,]    4    8   12   16   20
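The index-matrix session above can be read as "each row of i is one (row, column) address into x". The sketch below restates that idea as a script. It is only an illustration: it builds the index matrix with cbind() instead of array() (a substitution, not what the slides use), and the names picked and zeros are hypothetical helpers introduced here.

```r
# A 4 x 5 array filled column by column, as in the slides.
x <- array(1:20, dim = c(4, 5))

# Index matrix: one (row, column) pair per row.
# cbind() is an assumption of this sketch; the slides use array(c(1:3, 3:1), dim = c(3, 2)).
i <- cbind(1:3, 3:1)

# Indexing with a two-column matrix extracts one element per row of i.
picked <- x[i]
print(picked)   # 9 6 3, i.e. x[1,3], x[2,2], x[3,1]

# Assigning through the same index matrix changes only those three cells.
x[i] <- 0L

# which(..., arr.ind = TRUE) goes the other way: from a condition back to
# (row, col) addresses, listed in column-major order.
zeros <- which(x == 0, arr.ind = TRUE)
print(zeros)
```

Note that which() reports the addresses in storage (column-major) order, so the rows of zeros come out as (3,1), (2,2), (1,3), the reverse of the order in which i listed them.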