
Two Color Microarrays


aneko

Presentation Transcript


  1. Two Color Microarrays EPP 245/298 Statistical Analysis of Laboratory Data

  2. Two-Color Arrays
     • Two-color arrays are designed to account for variability in slides and spots by using two samples on each slide, each labeled with a different dye.
     • If a spot is too large, for example, both signals will be too big, and the difference or ratio will eliminate that source of variability.
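
A minimal sketch of why the ratio removes spot-level variability, assuming the spot effect is purely multiplicative. The abundances and the `spot_size` factors are made-up illustrative values, not data from the lecture.

```r
# Hypothetical true abundances of one gene in the two samples on a slide
true_cy5 <- 2000
true_cy3 <- 1000

# Assumed multiplicative spot effect: a larger spot inflates both channels
spot_size <- c(0.5, 1, 2, 4)

cy5 <- spot_size * true_cy5   # observed red-channel signal per spot
cy3 <- spot_size * true_cy3   # observed green-channel signal per spot

# Each channel varies 8-fold across spots, but the log ratio is constant
log_ratio <- log2(cy5 / cy3)
print(log_ratio)
```

Under this assumption the spot factor cancels in the ratio, so every spot reports the same two-fold difference.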

  3. Dyes
     • The most common dye set is Cy3 (green) and Cy5 (red), which absorb maximally at approximately 550 nm and 649 nm and emit at approximately 570 nm and 670 nm, respectively (red light ~ 700 nm, green light ~ 550 nm).
     • The dyes are excited with lasers at 532 nm (Cy3, green) and 635 nm (Cy5, red).
     • The emissions are read through filters using a CCD device.

  7. File Format
     • A slide scanned with Axon GenePix produces a file with extension .gpr that contains the results: http://www.axon.com/gn_GenePix_File_Formats.html
     • This file contains 29 rows of headers followed by 43 columns of data (in our example files).
     • For full analysis one may also need a .gal file that describes the layout of the arrays.
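
The header-then-data layout can be sketched with a small mock file. This is only an illustration: the file contents are invented, only 4 of the 43 columns are included, and `sep = "\t"` assumes the data section is tab-delimited.

```r
# Build a small mock file: 29 header lines, then a quoted column-name row
# and two data rows (invented values, 4 columns only)
tmp <- tempfile(fileext = ".gpr")
header_lines <- paste0("header line ", 1:29)
col_names <- '"Block"\t"Name"\t"F635 Median"\t"F532 Median"'
data_rows <- c('1\t"gene A"\t1200\t800', '1\t"gene B"\t300\t450')
writeLines(c(header_lines, col_names, data_rows), tmp)

# skip = 29 jumps over the header block; sep = "\t" is an assumption
gpr <- read.table(tmp, header = TRUE, skip = 29, sep = "\t")

# read.table makes names syntactic: "F635 Median" becomes "F635.Median"
print(names(gpr))
print(gpr$F635.Median - gpr$F532.Median)
```

Because column names are rewritten on import, selecting columns by position, as the lecture code does, avoids depending on the converted names.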

  8. "Block" "Column" "Row" "Name" "ID" "X" "Y" "Dia." "F635 Median" "F635 Mean" "F635 SD" "B635 Median" "B635 Mean" "B635 SD" "% > B635+1SD" "% > B635+2SD" "F635 % Sat."

  9. "F532 Median" "F532 Mean" "F532 SD" "B532 Median" "B532 Mean" "B532 SD" "% > B532+1SD" "% > B532+2SD" "F532 % Sat."

  10. "Ratio of Medians (635/532)" "Ratio of Means (635/532)" "Median of Ratios (635/532)" "Mean of Ratios (635/532)" "Ratios SD (635/532)" "Rgn Ratio (635/532)" "Rgn R² (635/532)" "F Pixels" "B Pixels" "Sum of Medians" "Sum of Means" "Log Ratio (635/532)" "F635 Median - B635" "F532 Median - B532" "F635 Mean - B635" "F532 Mean - B532" "Flags"

  11. Analysis Choices
      • Mean or median foreground intensity
      • Background corrected or not
      • Log transform (base 2, e, or 10) or glog transform
      • Log is compatible only with no background correction, because background-corrected intensities can be zero or negative
      • Glog is best with background correction
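
A short sketch of the difference between the two transforms on background-corrected values. It assumes one common form of the generalized log, `log(y + sqrt(y^2 + lambda))`; the value of `lambda` and the intensities are illustrative only.

```r
# Generalized log; lambda is an assumed, illustrative tuning constant
glog <- function(y, lambda) log(y + sqrt(y^2 + lambda))

# Background-corrected intensities can be zero or negative
y <- c(-50, 0, 50, 1e6)

plain <- suppressWarnings(log(y))   # NaN and -Inf for the first two values
g <- glog(y, lambda = 100)          # finite everywhere

print(plain)
print(g)
```

For large intensities this form behaves like `log(2 * y)`, so it differs from the ordinary log only by a constant there, while remaining defined at zero and below.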

  12. d41 <- read.table("037841.gpr", header=TRUE, skip=29)
      d41 <- d41[, c(4,5,9,10,12,13,18,19,21,22)]
      d50 <- read.table("037850.gpr", header=TRUE, skip=29)
      d50 <- d50[, c(4,5,9,10,12,13,18,19,21,22)]
      d46 <- read.table("037846.gpr", header=TRUE, skip=29)
      d46 <- d46[, c(4,5,9,10,12,13,18,19,21,22)]
      d47 <- read.table("037847.gpr", header=TRUE, skip=29)
      d47 <- d47[, c(4,5,9,10,12,13,18,19,21,22)]
      d48 <- read.table("037848.gpr", header=TRUE, skip=29)
      d48 <- d48[, c(4,5,9,10,12,13,18,19,21,22)]
      d49 <- read.table("037849.gpr", header=TRUE, skip=29)
      d49 <- d49[, c(4,5,9,10,12,13,18,19,21,22)]
      d43 <- read.table("037843.gpr", header=TRUE, skip=29)
      d43 <- d43[, c(4,5,9,10,12,13,18,19,21,22)]

  13. dataprep <- function(method="median", bc=FALSE) {
        # cvec weights, in column order: foreground median, foreground mean,
        # background median, background mean
        if ((method=="median") && bc)  cvec <- c(1,0,-1,0)
        if ((method!="median") && bc)  cvec <- c(0,1,0,-1)
        if ((method=="median") && !bc) cvec <- c(1,0,0,0)
        if ((method!="median") && !bc) cvec <- c(0,1,0,0)
        d41a <- as.matrix(d41[,3:6]) %*% cvec    # 635 nm channel
        d41b <- as.matrix(d41[,7:10]) %*% cvec   # 532 nm channel
        d50a <- as.matrix(d50[,3:6]) %*% cvec
        d50b <- as.matrix(d50[,7:10]) %*% cvec
        d46a <- as.matrix(d46[,3:6]) %*% cvec
        d46b <- as.matrix(d46[,7:10]) %*% cvec
        ... ... ... ... ... ... ... ... ... ...
        # d45 is assumed here: the transcript reads d43 on these lines and in
        # the last two cbind() columns, which would duplicate that slide
        d45a <- as.matrix(d45[,3:6]) %*% cvec
        d45b <- as.matrix(d45[,7:10]) %*% cvec
        alldata <- cbind(d41a,d41b,d50a,d50b,d46a,d46b,d47a,d47b,
                         d48a,d48b,d49a,d49b,d43a,d43b,d44a,d44b,d42a,d42b,d45a,d45b)
        return(alldata)
      }
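
The role of `cvec` can be seen on a toy matrix. This is a sketch only: `spots`, its values, and the helper `pick_cvec` are hypothetical, and the column order (foreground median, foreground mean, background median, background mean) is taken from the column subset kept when the files are read.

```r
# Two made-up spots, four columns per channel in the assumed order
spots <- matrix(c(1200, 1250, 100, 110,
                   300,  320,  90,  95),
                nrow = 2, byrow = TRUE,
                dimnames = list(NULL, c("Fmed", "Fmean", "Bmed", "Bmean")))

# Hypothetical helper mirroring the cvec choice in dataprep()
pick_cvec <- function(method = "median", bc = FALSE) {
  fg <- if (method == "median") c(1, 0) else c(0, 1)  # which foreground summary
  bg <- if (bc) -fg else c(0, 0)                      # subtract matching background
  c(fg, bg)
}

# Matrix product with the weight vector picks and combines the columns
median_bc <- spots %*% pick_cvec("median", bc = TRUE)   # Fmed - Bmed
mean_raw  <- spots %*% pick_cvec("mean",   bc = FALSE)  # Fmean only

print(median_bc)
print(mean_raw)
```

Expressing the choice as a weight vector means one matrix product handles all four combinations of summary statistic and background correction.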

  14. alldata <- dataprep(method="median", bc=FALSE)
      rownames(alldata) <- d41[,1]
      dye <- as.factor(rep(c("Cy5","Cy3"), 10))
      slide <- as.factor(rep(1:10, each=2))
      treat <- c(1,0,0,1,0,1,1,0,0,3,3,0,0,3,3,0,0,1,1,0)
      geneID <- d41[,1:2]

  15. Array normalization
      • Array normalization is meant to increase the precision of comparisons by adjusting for variations that affect entire arrays.
      • Without normalization, the analysis would be valid, but possibly less sensitive.
      • However, a poor normalization method will be worse than none at all.

  16. Possible normalization methods
      • We can equalize the mean or median intensity across arrays by adding a correction term or multiplying by one.
      • We can use different normalizations at different intensity levels (intensity-based normalization), for example by lowess or quantiles.
      • We can normalize for other things such as print tips.
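
The lecture names quantile-based normalization without showing code. The following is a base R sketch of one common variant, in which every array is forced to share the same sorted values; the function name `quantile_normalize` is hypothetical, and the toy matrix reuses the values from the lecture's normalization example.

```r
# Toy data: rows are genes, columns are arrays
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80), ncol = 4)

# Hypothetical helper: replace each value by the mean, across arrays,
# of the values holding the same rank
quantile_normalize <- function(x) {
  ranks  <- apply(x, 2, rank, ties.method = "first")  # rank within each array
  sorted <- apply(x, 2, sort)                         # each array sorted ascending
  ref    <- rowMeans(sorted)                          # reference distribution
  out    <- apply(ranks, 2, function(r) ref[r])       # map ranks back to reference
  dimnames(out) <- dimnames(x)
  out
}

qn <- quantile_normalize(normex)
print(qn)
```

After this step all arrays have identical distributions, which is a much stronger adjustment than matching means or medians alone.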

  17. Example for Normalization

  18. > normex <- matrix(c(1100,110,80,900,95,65,425,85,55,550,110,80),ncol=4)
      > normex
           [,1] [,2] [,3] [,4]
      [1,] 1100  900  425  550
      [2,]  110   95   85  110
      [3,]   80   65   55   80
      > group <- as.factor(c(1,1,2,2))
      > anova(lm(normex[1,] ~ group))
      Analysis of Variance Table
      Response: normex[1, ]
                Df Sum Sq Mean Sq F value  Pr(>F)
      group      1 262656  262656  18.888 0.04908 *
      Residuals  2  27812   13906
      ---
      Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

  19. > anova(lm(normex[2,] ~ group))
      Analysis of Variance Table
      Response: normex[2, ]
                Df Sum Sq Mean Sq F value Pr(>F)
      group      1   25.0    25.0  0.1176 0.7643
      Residuals  2  425.0   212.5
      > anova(lm(normex[3,] ~ group))
      Analysis of Variance Table
      Response: normex[3, ]
                Df Sum Sq Mean Sq F value Pr(>F)
      group      1   25.0    25.0  0.1176 0.7643
      Residuals  2  425.0   212.5
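
With many genes, the same one-way ANOVA would be run row by row. A minimal sketch of that loop on the example matrix; the helper name `row_pvalue` is hypothetical.

```r
# Example data and grouping from the slides
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80), ncol = 4)
group <- as.factor(c(1, 1, 2, 2))

# One-way ANOVA p-value for a single gene (one row of the matrix)
row_pvalue <- function(y, g) anova(lm(y ~ g))[["Pr(>F)"]][1]

# Apply the test to every row
pvals <- apply(normex, 1, row_pvalue, g = group)
print(round(pvals, 4))
```

The three p-values agree with the ANOVA tables above: only the first gene is significant at the 0.05 level before normalization.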

  20. Additive Normalization by Means

  21. > cmn <- colMeans(normex)   # assumed definition (per-array means); it reproduces the output below
      > mn <- mean(cmn)
      > normex - rbind(cmn,cmn,cmn)+mn
               [,1]   [,2]   [,3]     [,4]
      cmn 974.58333 851.25 541.25 607.9167
      cmn -15.41667  46.25 201.25 167.9167
      cmn -45.41667  16.25 171.25 137.9167
      > normex.1 <- normex - rbind(cmn,cmn,cmn)+mn
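
The same additive adjustment can be written with `sweep()`, which avoids stacking copies of the mean vector by hand. This sketch assumes `cmn` holds the column (array) means, an assumption that reproduces the values printed on the slide.

```r
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80), ncol = 4)

cmn <- colMeans(normex)   # one mean per array (column); assumed definition
mn  <- mean(cmn)          # grand mean of the array means

# Subtract each array's mean from its column, then add back the grand mean
normex.1 <- sweep(normex, 2, cmn, "-") + mn

print(normex.1)
print(colMeans(normex.1))  # every array now has mean mn
```

Note that additive normalization can produce negative intensities, as in rows 2 and 3 of the first array, which rules out a plain log transform afterwards.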

  22. > mn <- mean(cmn)
      > anova(lm(normex.1[1,] ~ group))
      Analysis of Variance Table
      Response: normex.1[1, ]
                Df Sum Sq Mean Sq F value  Pr(>F)
      group      1 114469  114469  23.295 0.04035 *
      Residuals  2   9828    4914
      > anova(lm(normex.1[2,] ~ group))
      Analysis of Variance Table
      Response: normex.1[2, ]
                Df  Sum Sq Mean Sq F value  Pr(>F)
      group      1 28617.4 28617.4  23.295 0.04035 *
      Residuals  2  2456.9  1228.5
      > anova(lm(normex.1[3,] ~ group))
      Analysis of Variance Table
      Response: normex.1[3, ]
                Df  Sum Sq Mean Sq F value  Pr(>F)
      group      1 28617.4 28617.4  23.295 0.04035 *
      Residuals  2  2456.9  1228.5

  23. Multiplicative Normalization by Means

  24. > normex*mn/rbind(cmn,cmn,cmn)
               [,1]      [,2]      [,3]      [,4]
      cmn 779.16667 775.82547 687.33407 679.13851
      cmn  77.91667  81.89269 137.46681 135.82770
      cmn  56.66667  56.03184  88.94912  98.78378
      > normex.2 <- normex*mn/rbind(cmn,cmn,cmn)
      > anova(lm(normex.2[1,] ~ group))
      Response: normex.2[1, ]
                Df Sum Sq Mean Sq F value   Pr(>F)
      group      1 8884.9  8884.9  453.71 0.002197 **
      Residuals  2   39.2    19.6
      > anova(lm(normex.2[2,] ~ group))
      Response: normex.2[2, ]
                Df Sum Sq Mean Sq F value   Pr(>F)
      group      1 3219.7  3219.7  696.33 0.001433 **
      Residuals  2    9.2     4.6
      > anova(lm(normex.2[3,] ~ group))
      Response: normex.2[3, ]
                Df  Sum Sq Mean Sq F value  Pr(>F)
      group      1 1407.54 1407.54  57.969 0.01682 *
      Residuals  2   48.56   24.28

  25. Multiplicative Normalization by Medians
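
The transcript gives only the title of this slide. By analogy with the mean-based version, a median-based multiplicative normalization might look like the sketch below; the names `cmed`, `md`, and `normex.3`, and the choice of the mean of the medians as the common target, are assumptions rather than the lecture's own code.

```r
normex <- matrix(c(1100, 110, 80, 900, 95, 65, 425, 85, 55, 550, 110, 80), ncol = 4)

cmed <- apply(normex, 2, median)   # per-array medians
md   <- mean(cmed)                 # assumed common target level

# Rescale each array (column) so that its median equals the target
normex.3 <- sweep(normex, 2, md / cmed, "*")

print(normex.3)
print(apply(normex.3, 2, median))  # all arrays now share the same median
```

Medians are less sensitive than means to a few very bright spots, so the scale factors here are driven by the middle gene rather than by the large first-row intensities.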
