
Developing GUI Microarray Analysis Tools

Developing GUI Microarray Analysis Tools. Keith Satterley Bioinformatics, WEHI, Nov. 15 2005. Overview. 1. R, Environment, tools & resources. 2. Graphical tools. 3. LimmaGUI and AffylmGUI. 4. Example Analysis. 5. Resources available. 6. Future Developments.



  1. Developing GUI Microarray Analysis Tools Keith Satterley Bioinformatics, WEHI, Nov. 15 2005

  2. Overview. 1. R, Environment, tools & resources. 2. Graphical tools. 3. LimmaGUI and AffylmGUI. 4. Example Analysis. 5. Resources available. 6. Future Developments. The Walter and Eliza Hall Institute of Medical Research

  3. The R Project for Statistical Computing • R is a free software language and environment for statistical computing and graphics, released under the GNU General Public License. • It compiles and runs on a wide variety of platforms, including Unix variants, Windows and MacOS. • S was developed by John Chambers and colleagues at Bell Labs. R can be considered a different implementation of S. • R was initially written by Robert Gentleman and Ross Ihaka of the Statistics Department of the University of Auckland. • Since mid-1997 a large group of individuals has contributed to R by sending code and bug reports. • The R URL is http://www.r-project.org/

  4. The R Project for Statistical Computing • R has an effective data handling and storage facility. • It has a suite of operators for calculations on arrays, in particular matrices. • It provides a vast number of useful statistical tools, many of which have been painstakingly tested. • It produces publication-quality graphics in a variety of formats, including JPEG, PostScript, EPS, PDF and BMP. • It has a well-developed, simple and effective programming language.
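As a small illustration of the array operators mentioned on this slide, the sketch below uses only base R. The matrices A and B are made-up values chosen for illustration, not data from the talk.

```r
# Two small matrices (matrix() fills column by column)
A <- matrix(c(2, 0, 1, 3), nrow = 2)
B <- matrix(1:4, nrow = 2)

elementwise <- A * B        # element-by-element product
product     <- A %*% B      # matrix product
A_inv       <- solve(A)     # matrix inverse
col_means   <- colMeans(B)  # mean of each column

print(product)
```

Note that `*` and `%*%` are different operators: the first multiplies matching elements, the second performs matrix multiplication.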

  5. The R Project for Statistical Computing • R allows users to add additional functionality by defining new functions. • C, C++ and Fortran code can be linked and called at run time. • R can be extended (easily) via packages. • There are about eight packages supplied with the R distribution and many more are available through the CRAN family of Internet sites
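To make "defining new functions" concrete, here is a minimal sketch of a user-defined function. The name ma_values and the intensities are hypothetical; the function converts red and green intensities to the log-ratio M and average log-intensity A commonly used for two-colour microarray data.

```r
# Hypothetical helper: convert red/green intensities to M and A values
ma_values <- function(red, green) {
  stopifnot(length(red) == length(green), all(red > 0), all(green > 0))
  data.frame(
    M = log2(red) - log2(green),        # log-ratio
    A = (log2(red) + log2(green)) / 2   # average log-intensity
  )
}

# Made-up intensities for three spots
ma <- ma_values(red = c(1024, 512, 256), green = c(256, 512, 1024))
print(ma)
```

Once defined, such a function is called like any built-in R function.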

  6. Resources for R • Frequently Asked Questions: http://www.ci.tuwien.ac.at/%7Ehornik/R/R-FAQ.html • Archives: CRAN (see next slide). • Mailing lists: r-help@lists.r-project.org, r-devel@lists.r-project.org, r-sig-mac@stat.math.ethz.ch • Bug-tracking system: http://bugs.r-project.org/

  7. Resources for R • CRAN = Comprehensive R Archive Network. • CRAN is a network of FTP and web servers around the world that store identical, up-to-date versions of code and documentation for R.

  8. CRAN Mirrors
     • Australia: http://cran.au.r-project.org/ PlanetMirror, Brisbane; http://cran.ms.unimelb.edu.au/ University of Melbourne
     • Austria: http://cran.at.r-project.org/ Technische Universitaet Wien
     • Brasil: http://cran.br.r-project.org/ Universidade Federal do Paraná; http://www.insecta.ufv.br/CRAN/ Federal University of Vicosa; http://cran.fiocruz.br/ Oswaldo Cruz Foundation, Rio de Janeiro; http://lmq.esalq.usp.br/CRAN/ University of Sao Paulo, Piracicaba; http://www.vps.fmvz.usp.br/CRAN/ University of Sao Paulo, Sao Paulo
     • Canada: http://cran.stat.sfu.ca/ Simon Fraser University, Burnaby; http://probability.ca/cran/ University of Toronto
     • China: http://www.lmbe.seu.edu.cn/CRAN/ Southeast University, Nanjing
     • Denmark: http://cran.dk.r-project.org/ dotsrc.org, Aalborg
     • France: http://cran.fr.r-project.org/ CICT, Toulouse; http://cran.univ-lyon1.fr/ Dept. of Biometry & Evol. Biology, University of Lyon; http://mirror.internet.tp/cran/ Boese Internet, Paris
     • Germany: http://cran.r-mirror.de/ Stefan Drees, Berlin; http://pangora.org/cran/ Pangora GmbH, Hamburg; http://cran.miscellaneousmirror.org/ Miscellaneousdata.de, Koeln; http://umfragen.sowi.uni-mainz.de/CRAN/ University of Mainz; http://cran.mirrorplus.org/ mirrorplus.org, Muenchen
     • Hungary: http://cran.hu.r-project.org/ Semmelweis University
     • Italy: http://cran.arsmachinandi.it/ Ars Machinandi, Arezzo; http://microarrays.unife.it/CRAN/ Universita di Ferrara; http://rm.mirror.garr.it/mirrors/CRAN/ Garr Mirror, Milano; http://dssm.unipa.it/CRAN/ Universita degli Studi di Palermo
     • Israel: http://cran.active.co.il/ Activetech Ltd, Tel-Aviv
     • Japan: ftp://ftp.u-aizu.ac.jp/pub/lang/R/CRAN University of Aizu; http://cran.md.tsukuba.ac.jp/ University of Tsukuba
     • Korea: http://bibscvs.snu.ac.kr/R/ Seoul National University
     • Netherlands: http://cran.nedmirror.nl/ Nedmirror, Amsterdam
     • Poland: http://novum.am.lublin.pl/CRAN/ Skubiszewski Medical University, Lublin; http://r.meteo.uni.wroc.pl/ University of Wroclaw
     • Portugal: http://cran.pt.r-project.org/ Universidade do Porto
     • Slovenia: http://www.fastmirrors.org/cran/ Fastmirrors.org, Besnica; http://www.wsection.com/cran/ Wsection.com, Ljubljana
     • South Africa: http://cbio.uct.ac.za/CRAN/ University of Cape Town; http://cran.za.r-project.org/ Rhodes University
     • Spain: http://cran.es.r-project.org/ Spanish National Research Network, Madrid
     • Switzerland: http://cran.ch.r-project.org/ ETH Zuerich; http://www.imsv.unibe.ch/cran/ Universitaet Bern; http://cran.prokmu.com/ Prokmu Hosting, Bern
     • Turkey: http://godel.cs.bilgi.edu.tr/mirror/cran/ Istanbul Bilgi University
     • Taiwan: http://cran.cs.pu.edu.tw/ Providence University, Taichung; http://cran.csie.ntu.edu.tw/ National Taiwan University, Taipei
     • UK: http://cran.uk.r-project.org/ University of Bristol; http://www.sourcekeg.co.uk/cran/ Sourcekeg, London
     • USA: http://cran.cnr.Berkeley.edu University of California, Berkeley, CA; http://cran.stat.ucla.edu/ University of California, Los Angeles, CA; http://cran.ssds.ucdavis.edu/ University of California, Davis, CA; http://rh-mirror.linux.iastate.edu/CRAN/ Iowa State University, Ames, IA; http://www.biometrics.mtu.edu/CRAN/ Michigan Technological University, Houghton, MI; http://cran.wustl.edu/ Washington University, St. Louis, MO; http://www.ibiblio.org/pub/languages/R/CRAN/ University of North Carolina, Chapel Hill, NC; http://cran.us.r-project.org/ Pair Networks, Pittsburgh, PA; http://lib.stat.cmu.edu/R/CRAN/ Statlib, Carnegie Mellon University, Pittsburgh, PA; http://cran.hostingzero.com/ Hosting Zero, Dallas, TX; http://cran.fhcrc.org/ Fred Hutchinson Cancer Research Center, Seattle, WA

  9. CRAN Mirrors – 475 packages

  10. Resources for R • Features of R. • Graphical abilities. • Package System. • Objects in R.

  11. Graphical Capabilities in R • On Unix (including Mac OS X), X11 is used. • On MS Windows, the native Windows graphics system is used. • This is not a GUI, but a graphics device for plotting and drawing. • There are high-level, low-level and interactive plotting commands. • plot(x) is a high-level command. • If x is a time series, it produces a time-series plot. • If x is a numeric vector, it produces a plot of the values in the vector against their index in the vector. • If x is a complex vector, it produces a plot of imaginary versus real parts of the vector elements.
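The three behaviours of plot(x) listed on this slide can be tried with the sketch below. The data values are made up, and the plots are sent to a throwaway PDF device (pdf(NULL)) so that the code also runs without X11 or a Windows display; in an interactive session those two device lines can be dropped.

```r
# Draw to a null PDF device so no display or output file is needed
pdf(NULL)

ts_data  <- ts(c(5, 7, 6, 9, 8, 11), start = 2000)           # time series
num_data <- c(3, 1, 4, 1, 5, 9)                              # numeric vector
cpx_data <- complex(real = 1:4, imaginary = c(2, 0, -1, 3))  # complex vector

plot(ts_data)   # time-series plot
plot(num_data)  # values against their index
plot(cpx_data)  # imaginary parts against real parts

dev.off()
```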

  12. Graphical Capabilities in R • Low-level plotting commands can be used to add extra information (such as points, lines or text) to the current plot. • abline(a, b) adds a line of slope b and intercept a to the current plot. • title(main, sub) adds a title main to the top of the current plot and, optionally, a sub-title sub at the bottom.
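A minimal sketch of low-level commands layered on a high-level plot follows. The x and y values and the title strings are invented for illustration, and the null PDF device is again only there so the code runs without a display.

```r
pdf(NULL)                     # throwaway device; omit in an interactive session

x <- 1:10
y <- 1 + 2 * x                # made-up points lying on a straight line

plot(x, y)                    # high-level command: starts a new plot
abline(a = 1, b = 2)          # low-level: line with intercept 1 and slope 2
text(3, 15, "y = 1 + 2x")     # low-level: add a text label at (3, 15)
title(main = "Low-level additions", sub = "abline(), text() and title()")

dev.off()
```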

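  13. An R command line Example
     • library(limma)                                          # load the limma package
     • setwd("C:/aaa-R/swirl/")                                # move to the directory holding the Swirl files
     • getwd()                                                 # confirm the working directory
     • list.files()                                            # list the files found there
     • targets <- readTargets("SwirlTargetsFile.txt")          # read the targets file
     • targets                                                 # display it
     • RG <- read.maimages(targets$FileName, source="spot")    # read the SPOT image-analysis output
     • RG                                                      # display the red/green intensity object
     • par(fg="yellow", bg="green")                            # set foreground and background colours
     • plot(RG$R, lwd=3)                                       # plot the red-channel intensities
     • abline(2000, 1, lwd=5, col="black")                     # add a line with intercept 2000 and slope 1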

  14. R Graphics

  15. R Graphics (cont.)

  16. Bioconductor Graphics

  17. R Packages • Packages provide a mechanism for loading code and attached documentation. • Packaging automatically checks the package and creates various documentation files from one source. • It creates distributable Windows binary (.zip), Mac binary (.tgz) or source (.tar.gz) files. • Packages can specify packages they depend on or suggest.

  18. R Packages (cont.) • install.packages() can install a package and all its dependencies (and their dependencies…): the essential ones, the suggested ones (which may be needed for examples etc.), or both.
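The choice between essential and suggested dependencies is made with the dependencies argument of install.packages(). The wrapper below is a hypothetical sketch (install_with_deps is not part of R, and "somePackage" is a placeholder name); it is defined but not called, so nothing is downloaded. The comments describe the argument as documented in current versions of R.

```r
# Hypothetical wrapper around install.packages(); not called here
install_with_deps <- function(pkg, include_suggested = FALSE) {
  # dependencies = NA   : essential dependencies only (the default)
  # dependencies = TRUE : additionally install the packages that pkg suggests
  deps <- if (include_suggested) TRUE else NA
  install.packages(pkg, dependencies = deps)
}

# Example call, left commented out because it needs a network connection:
# install_with_deps("somePackage", include_suggested = TRUE)
```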

  19. Objects in R • The entities R operates on are technically known as objects. • The class of an object determines how it will be treated by what are known as generic functions. • For example print, plot or summary will react according to what sort of object they are called to work on.
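How a generic function reacts to the class of its argument can be sketched with a small S3 example. The class name "spot" and its fields are hypothetical, chosen only to illustrate dispatch; they are not classes from limma or Bioconductor.

```r
# Hypothetical S3 class "spot": red/green intensities for one microarray spot
new_spot <- function(red, green) {
  structure(list(red = red, green = green), class = "spot")
}

# Methods that the generics print() and summary() select for class "spot"
print.spot <- function(x, ...) {
  cat("spot: R =", x$red, " G =", x$green, "\n")
  invisible(x)
}
summary.spot <- function(object, ...) {
  c(M = log2(object$red / object$green))   # log-ratio of the two channels
}

s <- new_spot(red = 800, green = 200)
print(s)                        # dispatches to print.spot
spot_summary <- summary(s)      # dispatches to summary.spot
vec_summary  <- summary(c(1, 2, 3))  # same generic, default method for numbers
```

The same call, summary(), gives a log-ratio for a "spot" object and a five-number-style summary for a plain numeric vector.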

  20. Bioconductor • The URL is http://www.bioconductor.org/ • Bioconductor is an open source and open development software project for the analysis and comprehension of genomic data. • The Bioconductor core team is based primarily at the Fred Hutchinson Cancer Research Center. • It aims to promote high-quality documentation and reproducible research. • It aims to provide access to a wide range of powerful statistical and graphical methods for the analysis of genomic data.

  21. Bioconductor • R and the R package system are the main vehicles for designing and releasing software. • Bioconductor has a commitment to full open source discipline. All contributions are expected to exist under an open source license such as GPL2 or BSD.

  22. Bioconductor • Features of the Bioconductor site. • Packages – code • Packages – metadata • Version management system

  23. Bioconductor Packages
     • 140 code packages listed:
     • aCGH 1.4.0: Classes and functions for Array Comparative Genomic Hybridization data.
     • affxparser 1.2.0: Affymetrix File Parsing SDK
     • affy 1.8.1: Methods for Affymetrix Oligonucleotide Arrays
     • affycomp 1.6.0: Graphics Toolbox for Assessment of Affymetrix Expression Measures
     • affydata 1.6.0: Affymetrix Data for Demonstration Purpose
     • affylmGUI 1.4.0: GUI for affy analysis using limma package
     • affypdnn 1.4.0: Probe Dependent Nearest Neighbours (PDNN) for the affy package
     • affyPLM 1.6.0: Methods for fitting probe-level models
     • affyQCReport 1.8.0: QC Report Generation for affyBatch objects
     • altcdfenvs 1.4.0: alternative cdfenvs
     • ~~~~~~
     • limma 2.2.0: Linear Models for Microarray Data
     • limmaGUI 1.6.0: GUI for limma package
     • ~~~~~~
     • vsn 1.8.0: Variance stabilization and calibration for microarray data
     • webbioc 1.2.0: Bioconductor Web Interface
     • widgetInvoke 1.2.0: Evaluation widgets for functions
     • widgetTools 1.6.0: Creates an interactive tcltk widget
     • xcms 1.2.0: LC/MS and GC/MS Data Analysis
     • PLUS 250 metadata packages, from:
     • ag 1.10.0: Affymetrix Arabidopsis Genome Array Annotation Data (ag)
     • agahomology 1.10.0: A data package containing annotation data for agahomology
     • to:
     • zebrafishcdf 1.10.0: zebrafishcdf
     • zebrafishprobe 1.10.0: Probe sequence data for microarrays of type zebrafish

  24. Bioconductor – uses the Subversion version management system • Subversion: http://svnbook.red-bean.com/en/1.1/svn-book.html • Subversion is a free, open-source version control system (it replaces CVS). • That is, Subversion manages files and directories over time. • Subversion clients can access their repository across networks, which allows the version repository to be accessed by many users simultaneously.

  25. Bioconductor – Version management system • Subversion uses a Copy-Modify-Merge solution, rather than a Lock-Modify-Unlock procedure. • It remembers every change ever written to it. • A client can ask historical questions like, “What did this directory contain last Wednesday?” or “Who was the last person to change this file, and what changes did they make?”

  26. Graphical User Interfaces • These items are known as widgets. • Tcl/Tk is a tool for creating and interacting with widgets. • Tcl/Tk runs on Unix, Windows and Mac OS X.

  27. Tcl/Tk • Tcl/Tk needs to be installed on the computer as well as R. • There are prewritten libraries of Tcl/Tk tools, e.g. TkTable. • The R package tcltk needs to be installed in R. • The tcltk R package is an interface between the R language and Tcl/Tk commands.

  28. GUI Programs • On Windows, Tcl/Tk talks to the MS Windows graphical window system. • On Unix (and Mac), Tcl/Tk talks to the X Window System, hence X11 must be started first. • 1. Run X11 on Unix and Mac. • 2. Load the R package tcltk using library(tcltk), or load a GUI package, for example library(affylmGUI) (affylmGUI will automatically load tcltk).

  29. R tcltk example
     • This can be used to test if tcltk (or Tcl/Tk) is working correctly:
     • > library(tcltk)
     • > tt <- tktoplevel()
     • > lbl <- tklabel(tt, text="Hello, World!")
     • > tkpack(lbl)
     • > but <- tkbutton(tt, text="OK")
     • > tkpack(but)
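In the test above the OK button does nothing when pressed. As a hedged extension, the sketch below attaches an R function to the button so that pressing it closes the window. The function name hello_window is hypothetical; the window is only created in an interactive session where Tcl/Tk is available, since it needs a display.

```r
# Hypothetical helper: the slide's test window with a working OK button
hello_window <- function() {
  tt  <- tcltk::tktoplevel()                    # new top-level window
  tcltk::tkwm.title(tt, "tcltk test")           # window title
  lbl <- tcltk::tklabel(tt, text = "Hello, World!")
  but <- tcltk::tkbutton(tt, text = "OK",
                         command = function() tcltk::tkdestroy(tt))  # close on click
  tcltk::tkpack(lbl, but)                       # place both widgets in the window
  invisible(tt)
}

# Only open the window when a user is present and Tcl/Tk support exists
if (interactive() && capabilities("tcltk")) hello_window()
```

Passing an R function as the command argument is how tcltk connects a widget to R code.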

  30. R tcltk testing tools
     • To check the path that Tcl/Tk uses to find libraries:
     • > tclvalue("auto_path")
     • [1] "{C:\\R\\rw2020\\R-2.2.0/Tcl/lib/tcl8.4} C:/R/rw2020/R-2.2.0/Tcl/lib ./lib C:/R/rw2020/R-2.2.0/Tcl/lib/tk8.4 C:/R/rw2020/R-2.2.0/library/tcltk/exec"
     • To add an extra path to search, use:
     • > addTclPath("C:/bin")
     • > tclvalue("auto_path")
     • [1] "{C:\\R\\rw2020\\R-2.2.0/Tcl/lib/tcl8.4} C:/R/rw2020/R-2.2.0/Tcl/lib ./lib C:/R/rw2020/R-2.2.0/Tcl/lib/tk8.4 C:/R/rw2020/R-2.2.0/library/tcltk/exec C:/bin"
     • For a list of package commands:
     • > ls("package:tcltk")

  31. Help Commands in R
     • help(mean)                  # help window on the mean function
     • ?mean                       # same as help(mean)
     • help.search("regression")   # help files with alias, concept or title matching 'regression', using fuzzy matching
     • help.start()                # browser into the R docs
     • The browser shows links into the R Language Definition, Installation & Administration of R, package writing, package documentation, FAQs etc.

  32. Some Useful R Commands for the GUI user!
     • getwd()                          # get working directory
     • setwd()                          # set working directory
     • list.files()                     # list files in working directory
     • ls()                             # list objects in workspace
     • rm(list=ls())                    # remove all objects (recommended at start of a session)
     • savehistory(file="History.txt")  # save the command history to a file
     • source(file="C:/path/to/filename/file.R", echo=TRUE)  # reads commands from file.R and executes them
     • installed.packages()             # detailed info on all packages installed
     • summary(RG)                      # displays basic data about object RG
     • library(limmaGUI)                # loads limmaGUI package
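Several of these commands can be tried together in a short session. The sketch below works in R's temporary directory so that it does not touch any real files; the file name demo_commands.R and its contents are invented for the example.

```r
old_wd <- setwd(tempdir())          # switch to a scratch directory, remembering the old one
getwd()                             # confirm where we are

script <- "demo_commands.R"         # hypothetical script name
writeLines(c("x <- c(2, 4, 6)", "x_mean <- mean(x)"), script)

found <- script %in% list.files()   # the new file appears in the listing
source(script, echo = TRUE)         # read the commands from the file and execute them
ls()                                # x and x_mean are now in the workspace

setwd(old_wd)                       # return to the original directory
```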

  33. Cross Platform Issues • Installation issues are varied. • MS Windows – able to be installed in C:\R by an ordinary user. • Unix – can be installed by a user, but this leads to duplicate installations if multiple users do so. • Mac OS X – special procedures necessary.

  34. LimmaGUI • limmaGUI is a Graphical User Interface (GUI) based on R-Tcl/Tk for the exploration and linear modelling of data from two-colour spotted microarray experiments, especially the assessment of differential expression in complex experiments. • Swirl Example Analysis.
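To give a feel for what "linear modelling" of two-colour data means, the sketch below fits an ordinary least-squares model to each gene of a simulated log-ratio matrix using base R only. It is not limmaGUI's code: the data are simulated, the four-array dye-swap design vector is an assumption made for the example, and limma's own lmFit() and eBayes() functions add empirical Bayes moderation that is not shown here.

```r
set.seed(1)
design  <- c(-1, 1, -1, 1)      # assumed dye-swap layout for four arrays
n_genes <- 5

# Simulated log-ratios (rows = genes, columns = arrays); gene 1 is given a true effect of 2
M <- matrix(rnorm(n_genes * 4, sd = 0.1), nrow = n_genes)
M[1, ] <- M[1, ] + 2 * design

# Least-squares estimate of the log fold change for every gene at once
logFC  <- as.vector(M %*% design) / sum(design^2)

# Residual variance and ordinary t-statistic per gene (3 residual degrees of freedom)
resid  <- M - outer(logFC, design)
s2     <- rowSums(resid^2) / (length(design) - 1)
t_stat <- logFC / sqrt(s2 / sum(design^2))

round(cbind(logFC, t_stat), 2)
```

The gene with the simulated effect stands out with the largest estimated log fold change, which is the kind of differential-expression ranking the GUI presents.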

  35. AffylmGUI • AffylmGUI enables the user to perform quality assessment, low-level analysis and linear modeling of data from Affymetrix GeneChips®, with the ultimate goal of identifying differentially expressed genes. • Estrogen Example Analysis

  36. WEHI website Resources
     • WEHI Bioinformatics home page: http://bioinf.wehi.edu.au/
     • Microarray Data Analysis: http://bioinf.wehi.edu.au/marray/index.html
     • LIMMA: Linear Models for Microarray Data: http://bioinf.wehi.edu.au/limma/index.html
     • limmaGUI: http://bioinf.wehi.edu.au/limmaGUI/
     • affylmGUI: http://bioinf.wehi.edu.au/affylmGUI/
     • James Wettenhall's Bioinformatics Home Page: http://bioinf.wehi.edu.au/folders/james/
     • R-Tcl/Tk Examples, Worked Examples for limma/affylmGUI at http://bioinf.wehi.edu.au/limmaGUI/R/library/limmaGUI/doc/DocIndex.html

  37. Future Directions for AffylmGUI • additional plots to aid in quality assessment of a set of chips, including RNA degradation plots; • calculation and display of QC parameters recommended by Affymetrix (Affymetrix, 2004), such as percent present, ratios of 3’/5’ expression for hybridization controls and the like; • fitting of mixed linear models where there is technical replication; • support for other single-channel platforms.

  38. Future Directions for LimmaGUI • additional plots to aid in quality assessment of a set of chips; • fitting of mixed linear models where there is technical replication; • fitting of mixed linear models where there is biological replication; • ?

  39. Acknowledgments • James Wettenhall • Gordon Smyth • Ken Simpson • Terry Speed • Bioinformatics – many seminars on microarrays!

  40. The Walter and Eliza Hall Institute of Medical Research
