Data Management and Analysis issues in Microarray Data


  1. Data Management and Analysis issues in Microarray Data Aditya Phatak Persistent Systems Pvt. Ltd. http://www.persistent.co.in

  2. Roadmap • Microarray technology basics • Gene expression data analysis • Microarray data management • GeneChip Analysis Core at Washington University • Function Express (at WashU)

  3. Microarray Technology Basics

  4. Elementary Concepts Cell -> Chromosome -> DNA -> mRNA -> Proteins -> Function • Every cell of the body contains a full set of chromosomes and identical genes • Only a fraction of these genes are “switched on” or “expressed” • Gene expression is a highly complex and regulated process

  5. Life Scientists Want to… • Identify genes that are involved in various diseases. • Find differentially expressed genes (“targets”) • Reveal new patterns of coordinated gene expression • Find co-regulated genes • Find genes responsible for “biological pathways”. • Uncover new categories of genes

  6. DNA Microarrays • Microarrays allow biologists to analyze the expression of hundreds of genes within a cell in a single experiment, quickly and efficiently • Microarrays can be used to measure gene expression within a single sample or to compare gene expression between two different tissue samples, such as healthy and diseased tissue

  7. DNA Microarrays: Technical Foundation • A set of unique probes (usually short, single-stranded DNA sequences) is immobilized as single spots on a solid surface (chemically modified glass chips) • mRNA is extracted from cell or tissue samples. • A cDNA target is generated from the mRNA sample. It is labeled with a fluorescent dye (e.g. Cy3 or Cy5) or a radioactive label. • The target is incubated with the array, and each probe will bind its complementary target molecule if present.

  8. An example

  9. A DNA Microarray Experiment • Prepare a DNA chip using chosen probe DNAs • Generate a hybridization solution containing a mixture of fluorescently labeled cDNAs • Hybridize the mixture with the DNA chip • Detect cDNA intensity using laser technology and store the data in a computer • Analyze the data using computational methods

  10. Types of Microarrays • Two kinds of samples are co-hybridized on the array (e.g. cDNA arrays) • Only one sample is hybridized and comparisons are made between arrays (e.g. Affymetrix oligonucleotide arrays) Need to deal with different data formats.

  11. Gene Expression Data Analysis

  12. Issues With Output Data • Data Quality • Distinguish false positives from true positives • Replicate chips • Use independent methods to validate results • Dye effects • Position effects Replication is essential

  13. Preprocessing Tasks • Adjusting data • Filter out genes that are not expressed in any experiment • Log-transform data: replace each data value X by log2(X) • Data normalization • Intensities are scaled/normalized to a selected chip so that multiple chips can be compared • Uses data from a set of controls that have been “spiked” into the DNA and that have an average expression ratio of 1.
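
The preprocessing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `preprocess`, the `floor` threshold used to decide that a gene is "not expressed", and the choice of scaling each chip by the mean intensity of a control set (or of all retained genes when no controls are given) are not taken from the slides.

```python
import math

def preprocess(chips, reference=0, floor=1.0, controls=None):
    """Filter, scale to a reference chip, and log2-transform raw intensities.

    chips: list of dicts mapping gene name -> raw intensity, one dict per chip.
    floor: assumed cut-off; a gene at or below it on every chip is dropped.
    controls: optional gene names used to compute the scale factor
              (hypothetical stand-in for spiked-in controls).
    """
    genes = sorted(chips[0])
    # Step 1: drop genes that are not expressed on any chip.
    kept = [g for g in genes if any(chip[g] > floor for chip in chips)]
    basis = controls if controls is not None else kept

    def mean_intensity(chip):
        return sum(chip[g] for g in basis) / len(basis)

    ref_mean = mean_intensity(chips[reference])
    result = []
    for chip in chips:
        # Step 2: scale the chip so its mean matches the reference chip.
        scale = ref_mean / mean_intensity(chip)
        # Step 3: log2-transform; values are clamped at the floor to avoid log2(0).
        result.append({g: math.log2(max(chip[g], floor) * scale) for g in kept})
    return result
```

After this step, a gene measured at 200 on a chip that is globally twice as bright as the reference ends up with the same log2 value as a gene measured at 100 on the reference chip.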

  14. Analysis Issues • Identify genes that are involved in various diseases. • Find differentially expressed genes (“targets”) • e.g. find genes that are overexpressed in 6 out of 7 tumor samples versus 8 out of 10 normal samples by five-fold or more • Reveal new patterns of coordinated gene expression • Find co-regulated genes • Find genes responsible for “biological pathways”. • Uncover new categories of genes
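
The "6 out of 7 versus 8 out of 10" criterion above can be read in more than one way. The sketch below assumes one reading: a gene qualifies when at least `min_tumor` tumor samples are each at least `fold` times higher than each of the `min_normal` lowest normal samples. The function name and this interpretation are assumptions, and the values are taken to be raw (not log-transformed) intensities.

```python
def overexpressed(tumor, normal, fold=5.0, min_tumor=6, min_normal=8):
    """Return True if enough tumor samples exceed enough normal samples.

    tumor, normal: raw intensities of one gene across samples.
    """
    # The min_normal lowest normal values are all <= this baseline.
    baseline = sorted(normal)[min_normal - 1]
    # Count tumor samples that exceed the baseline by the required fold.
    hits = sum(1 for value in tumor if value >= fold * baseline)
    return hits >= min_tumor
```

A count-based rule like this tolerates a few outlier samples in either group, which is the point of asking for 6 of 7 rather than all 7.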

  15. Data Mining: Extracting Meaningful Patterns • Supervised methods: you have a priori knowledge of the biological system and are looking for specific patterns, e.g. neighbourhood analysis, supervised tree harvesting • Unsupervised methods: identify patterns that you could not necessarily have been aware of beforehand, e.g. hierarchical clustering, K-means clustering, SOM, PCA
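
As an illustration of one of the unsupervised methods named above, here is a minimal sketch of agglomerative hierarchical clustering. The choices of average linkage and of 1 minus the Pearson correlation as the distance are assumptions made for the example, not something the slides specify, and the function names are hypothetical. Profiles must not be constant, since the correlation is undefined for them.

```python
import math

def pearson_distance(x, y):
    """1 - Pearson correlation between two expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 1.0 - cov / (sx * sy)

def hierarchical_clusters(profiles, n_clusters):
    """Merge the closest clusters until n_clusters remain.

    profiles: dict mapping gene name -> list of expression values.
    """
    clusters = [[gene] for gene in sorted(profiles)]

    def linkage(c1, c2):
        # Average linkage: mean pairwise distance between the two clusters.
        dists = [pearson_distance(profiles[a], profiles[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Stopping at a fixed number of clusters is a simplification; a full implementation would record every merge so the result can be drawn as a dendrogram.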

  16. Example of Hierarchical Clustering

  17. Statistical Analysis • Ad hoc approaches (e.g. “fold change”) do not consider the variability of measurements • Statistical analysis gives a more “sensitive” and “selective” analysis • Provides an estimate of confidence that the observed gene expression pattern did not occur by chance • Rank the genes by statistical scores
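
To make the contrast with fold change concrete, the sketch below ranks genes by a two-sample t-statistic, which divides the difference in group means by an estimate of its variability. The Welch form of the statistic and the function names are assumptions for illustration; the slides do not name a specific test. Each group needs at least two replicates with non-zero variance.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t-statistic for two groups of log-scale expression values."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error

def rank_genes(data):
    """Rank genes by |t|, largest first.

    data: dict mapping gene name -> (values in group A, values in group B).
    """
    scores = {gene: welch_t(a, b) for gene, (a, b) in data.items()}
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)
```

With this ranking, a gene with a small but highly reproducible change can outrank a gene with a larger mean difference that varies widely between replicates, which a plain fold-change cut-off would not capture.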

  18. Microarray Data Management

  19. Sharing Gene Expression Data • Goals • Facilitates comparisons between experiments • Improves analysis • Confidence in results • Conduct multivariate analysis of data generated by multiple researchers • Don’t penalize those who share

  20. Tracking All Aspects of Microarray Experiments • An array experiment has many steps • RNA preparation • Array fabrication, Array platform • Scanner setting • Image Analysis • Use of integrated laboratory information management system (LIMS) • Common protocols and language for data sharing • MIAME: Minimum Information About a Microarray Experiment (from MGED)

  21. Sharing Paradigms • What to share • Raw images (TIFF) • Extracted raw spot intensity values with background measurements • Processed data such as avg. intensity values • List of genes that show clear differential expression

  22. Protection of Intellectual Property • Most array experiments identify dozens of genes of interest, only a few of which can be studied by one lab • Some results might provide substantial intellectual property rights to pharma companies • Open question: which data should be shared, and when?

  23. GeneChip Analysis Core at Washington University

  24. Architecture of GeneChip Core [Figure: architecture diagram. Recoverable labels: DNA Samples (Control, Experiment); Hybridize probe to Array; Wash; Scan; Image and Data; Image Format and Upload; Gene Expression Database; Web/Application Server; Web-based Data Analysis Tools; Gene Annotation Databases (UniGene, Locus Link, GO)]

  25. Function Express

  26. Why Function Express? • Existing analysis software provides clustering algorithms • These packages lack gene annotation • It is not possible to visualize genes based on functional classification, chromosomal localization or tissue expression, e.g. “Give me genes that are transcription factors, are expressed in pancreas and are located on chromosome 1p31” • Integration of gene annotation with clustering techniques is vital to understanding the underlying biological process
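
The example request above (transcription factors, expressed in pancreas, located on 1p31) amounts to a conjunctive filter over gene annotation. The sketch below shows the idea with an in-memory SQLite table. The table layout, column names, and function name are hypothetical and are not the Function Express schema; a real annotation store would typically hold one gene's tissues and functions in separate tables.

```python
import sqlite3

def find_genes(rows, molecular_function, tissue, cytoband):
    """Return gene names matching all three annotation criteria.

    rows: iterable of (gene, molecular_function, tissue, cytoband) tuples
          in a flat, illustrative layout.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE annotation ("
        "gene TEXT, molecular_function TEXT, tissue TEXT, cytoband TEXT)"
    )
    conn.executemany("INSERT INTO annotation VALUES (?, ?, ?, ?)", rows)
    cursor = conn.execute(
        "SELECT DISTINCT gene FROM annotation "
        "WHERE molecular_function = ? AND tissue = ? AND cytoband = ? "
        "ORDER BY gene",
        (molecular_function, tissue, cytoband),
    )
    genes = [row[0] for row in cursor]
    conn.close()
    return genes
```

Called with placeholder rows such as `("geneA", "transcription factor", "pancreas", "1p31")`, the function returns only the genes that satisfy all three conditions at once.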

  27. Features of Function Express • Annotates genes on chips/experiment automatically • Annotation is updated periodically • Allows examination of gene expression across different experiments conducted on different arrays and on different species

  28. Gene Annotation in Function Express • Provides annotation from the UniGene, LocusLink and GO databases. • These databases are updated frequently • Uses the HomoloGene database to get cross-species annotation

  29. Cross-Species Investigation Seeing how genes that show differential expression in one experiment on one organism (say mouse) correlate with genes from another experiment done in another organism (say human) • Find out more about interesting genes • Validation
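
A cross-species comparison of this kind reduces to mapping one organism's gene list onto the other through an ortholog table and intersecting the results. The sketch below assumes the ortholog table is already available as a plain dictionary; the function name and the identifiers used in the usage note are placeholders, not real gene or database records.

```python
def shared_orthologs(mouse_hits, human_hits, mouse_to_human):
    """Human genes that are differentially expressed in both experiments.

    mouse_hits: differentially expressed mouse genes.
    human_hits: differentially expressed human genes.
    mouse_to_human: dict mapping a mouse gene to its human ortholog
                    (hypothetical stand-in for a homology lookup).
    """
    # Translate mouse genes to human identifiers, skipping unmapped genes.
    mapped = {mouse_to_human[g] for g in mouse_hits if g in mouse_to_human}
    return sorted(mapped & set(human_hits))
```

For example, with `mouse_to_human = {"m1": "h1", "m2": "h2"}`, mouse hits `["m1", "m2", "m3"]` and human hits `["h2", "h9"]`, only `h2` is supported by both experiments.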

  31. Microarray Data Schema

  32. Gene Annotation Schema

  33. View maintenance using MQO [Figure: queries Q1, Q2, Q3 over experiment data (append only) and annotation data (update deltas may or may not be available); the gene annotation databases (UniGene, Locus Link, GO) are updated frequently]

  34. Screenshots of Function Express

  36. To create an experiment, the user enters an experiment name and the chips included in the analysis, along with an abscissa value and x-axis label for each chip.

  37. A comparison of raw (left panel) versus mean-standard deviation centered (right panel) data demonstrates that the transformation reveals similar patterns of gene regulation
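
Mean-standard deviation centering, as mentioned on this slide, is commonly computed per gene by subtracting the gene's mean across chips and dividing by its standard deviation. The sketch below assumes that per-gene definition and the sample standard deviation; the function name is hypothetical. A profile needs at least two values and must not be constant.

```python
from statistics import mean, stdev

def center_profile(values):
    """Mean-center one gene's profile and scale it to unit standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation
    return [(v - m) / s for v in values]
```

Two genes whose raw intensities differ by a factor of 100 but rise and fall together, such as `[2, 4, 6]` and `[200, 400, 600]`, map to the same centered profile, which is why similar regulation patterns become visible after the transformation.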

  38. The query generator allows the user to create virtually any combination of logical queries, using a simple GUI.

  39. The Gene Inspector (A), Gene Annotation (B), Comments (C), and Chip data Inspector (D) windows are shown. • Each window is updated when the probe selection changes in the Spreadsheet window.

  40. Function Express Client

