Microarray Data Analysis Day 2

donoma

Presentation Transcript


  1. Microarray Data Analysis Day 2

  2. Microarray Data Process/Outline
     • Experimental Design
     • Image Analysis – scan to intensity measures (raw data)
     • Normalization – “clean” data
     • More “low level” analysis – fold change, ANOVA, (Z-score) – data filtering
     • Data mining – how to interpret > 6000 measures
     • Databases
     • Software
     • Techniques – clustering, pattern recognition, etc.
     • Comparing to prior studies, across platforms?
     • Validation
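The "low level" filtering step in the outline can be sketched in a few lines. This is a minimal illustration, not the course's actual procedure: the gene IDs and intensities are made up, the Z-score is computed on log2 ratios, and the cutoff of 2.0 is an assumed value.

```python
import math
from statistics import mean, stdev

def log2_ratios(treated, control):
    """Per-gene log2(treated / control) for paired intensity lists."""
    return [math.log2(t / c) for t, c in zip(treated, control)]

def z_scores(values):
    """Standardize values: (x - mean) / sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def filter_by_z(gene_ids, values, cutoff=2.0):
    """Keep (gene, Z) pairs whose absolute Z-score exceeds the cutoff."""
    zs = z_scores(values)
    return [(g, round(z, 2)) for g, z in zip(gene_ids, zs) if abs(z) > cutoff]

# Hypothetical intensities for illustration only (not from the course data).
genes = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
control = [100, 120, 90, 110, 105, 95, 100, 100]
treated = [105, 118, 92, 108, 100, 97, 102, 800]

ratios = log2_ratios(treated, control)
hits = filter_by_z(genes, ratios, cutoff=2.0)
print(hits)  # only g8 (an 8-fold change) passes the filter
```

With only eight genes the largest possible Z-score is limited by the sample size, so a real data set of several thousand records would normally be filtered with a cutoff chosen for that distribution.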

  3. Today we will be using Spotfire software to filter and search your data. 10928 records in Spotfire:
     - 5999 S. pombe specific
     - 166 Affy controls
     - 5763 S. cerevisiae specific

  4. [Figure values: 6603, 4377, 1407, 819] The Affy detection oligonucleotide sequences are frozen at the time of synthesis. How does this impact downstream data analysis?

  5. Biology and Data Mining

  6. Subcellular localization provides a simple goal for genome-scale functional prediction: determine how many of the ~6000 yeast proteins go into each compartment.

  7. Subcellular localization, a standardized aspect of function. Compartments: cytoplasm, nucleus, membrane, ER, extracellular (secreted), Golgi, mitochondria.

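  8. "Traditionally" subcellular localization is "predicted" by sequence patterns. Compartments: cytoplasm, nucleus, membrane, ER, extracellular (secreted), Golgi, mitochondria. Sequence patterns: NLS (nucleus), TM-helix (membrane), HDEL (ER), import signal (mitochondria), signal sequence (extracellular/secreted).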

  9. Subcellular localization is associated with the level of gene expression. [Figure: expression level in copies/cell for each compartment: cytoplasm, nucleus, membrane, ER, extracellular (secreted), Golgi, mitochondria]

  10. Combine expression information and sequence patterns to predict localization. [Figure: expression level in copies/cell for each compartment, together with the sequence patterns NLS (nucleus), TM-helix (membrane), HDEL (ER), import signal (mitochondria), signal sequence (extracellular/secreted)]

  11. Major Objective: Discover a comprehensive theory of life’s organization at the molecular level
      • The major actors of molecular biology: the nucleic acids, DeoxyriboNucleic Acid (DNA) and RiboNucleic Acids (RNA)
      • The central dogma of molecular biology???
      • Proteins are very complicated molecules with 20 different amino acids.
      [Figure labels: epigenetics, RNA editing, post-translational modification, translational regulation]

  12. [Figure labels: Biology Application Domain; Experiment Design and Hypothesis; Microarray Experiment; Image Analysis; Data Analysis; Data Mining; Validation; Data Warehouse; Artificial Intelligence (AI); Knowledge discovery in databases (KDD); Statistics]

  13. Higher Level Microarray Data Analysis
      • Clustering and pattern detection
      • Data mining and visualization
      • Linkage between gene expression data and gene sequence/function/metabolic pathways databases
      • Discovery of common sequences in co-regulated genes
      • Meta-studies using data from multiple experiments

  14. Scatter plot of all genes in a simple comparison of two controls (A) and two treatments (B: high vs. low glucose), showing changes in expression greater than 2.2-fold and 3-fold.
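The fold-change thresholds on the scatter plot can be expressed as a small classification step. The sketch below assumes replicates are averaged before comparison and reports down-regulation as a negative fold change; both choices, and the example intensities, are illustrative assumptions rather than the settings used for the slide.

```python
from statistics import mean

def signed_fold_change(control_reps, treatment_reps):
    """Fold change between replicate means.

    Positive when the treatment mean is higher; when it is lower, the
    ratio is inverted and negated (so a halving is reported as -2.0).
    """
    c, t = mean(control_reps), mean(treatment_reps)
    return t / c if t >= c else -(c / t)

def classify(fc, low=2.2, high=3.0):
    """Label a signed fold change against the two thresholds."""
    magnitude = abs(fc)
    if magnitude > high:
        return f"over {high:g}-fold"
    if magnitude > low:
        return f"over {low:g}-fold"
    return "below threshold"

# Hypothetical replicate intensities: (two controls, two treatments).
examples = {
    "geneA": ([100, 110], [400, 440]),
    "geneB": ([200, 200], [80, 80]),
    "geneC": ([100, 100], [150, 150]),
}
for gene, (ctrl, trt) in examples.items():
    fc = signed_fold_change(ctrl, trt)
    print(gene, fc, classify(fc))
```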

  15. Types of Clustering
      • Hierarchical
        • Link similar genes, build up to a tree of all
      • Self Organizing Maps (SOM)
        • Split all genes into similar sub-groups
        • Finds its own groups (machine learning)
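The "link similar genes, build up to a tree of all" idea can be sketched as agglomerative clustering. This is a minimal version under stated assumptions: Euclidean distance and average linkage are choices made here, not necessarily what Spotfire uses, and the gene names and expression profiles are hypothetical.

```python
import math
from itertools import combinations

def euclidean(a, b):
    """Straight-line distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def average_linkage(members1, members2, profiles):
    """Mean pairwise distance between the genes of two clusters."""
    dists = [euclidean(profiles[i], profiles[j])
             for i in members1 for j in members2]
    return sum(dists) / len(dists)

def hierarchical_cluster(profiles):
    """Merge the two closest clusters until one tree remains.

    Returns the tree as nested tuples of gene names.
    """
    # Each cluster is (tree, list of member gene names).
    clusters = [(name, [name]) for name in profiles]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(
                clusters[p[0]][1], clusters[p[1]][1], profiles),
        )
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical log-ratio profiles over three conditions.
profiles = {
    "geneA": [1.0, 2.0, 3.0],
    "geneB": [1.1, 2.1, 2.9],
    "geneC": [-2.0, -1.0, 0.0],
    "geneD": [-2.2, -0.8, 0.2],
}
tree = hierarchical_cluster(profiles)
print(tree)  # (('geneA', 'geneB'), ('geneC', 'geneD'))
```

This brute-force version recomputes every pairwise distance at each merge, which is fine for a handful of genes but too slow for thousands; it is meant only to show how the tree is built.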

  16. Cluster by color/expression difference

  17. Self Organizing Maps
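A self-organizing map splits genes into sub-groups by letting a small set of map nodes compete for each expression profile. The sketch below is a deliberately tiny one-dimensional SOM under several assumptions: node weights start at evenly spaced data points (so the run is reproducible), the learning rate and neighborhood radius decay linearly, and the profiles are hypothetical.

```python
import math

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def best_matching_unit(nodes, x):
    """Index of the node whose weight vector is closest to x."""
    return min(range(len(nodes)), key=lambda k: sq_dist(nodes[k], x))

def train_som(samples, n_nodes=2, epochs=20, lr0=0.5, sigma0=1.0):
    """Train a 1-D SOM and return the list of node weight vectors."""
    # Deterministic start: copy evenly spaced samples as initial weights.
    step = (len(samples) - 1) / max(n_nodes - 1, 1)
    nodes = [list(samples[round(k * step)]) for k in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs        # decays from 1 toward 0
        lr = lr0 * frac                    # learning rate
        sigma = max(sigma0 * frac, 0.1)    # neighborhood radius, floored
        for x in samples:
            bmu = best_matching_unit(nodes, x)
            for k, w in enumerate(nodes):
                # Nodes close to the winner on the map move with it.
                h = math.exp(-((k - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return nodes

# Hypothetical two-condition profiles: two induced and two repressed genes.
profiles = {
    "geneA": [2.0, 2.1],
    "geneB": [2.2, 1.9],
    "geneC": [-2.0, -2.1],
    "geneD": [-1.9, -2.2],
}
nodes = train_som(list(profiles.values()))
assignment = {g: best_matching_unit(nodes, p) for g, p in profiles.items()}
print(assignment)
```

After training, each gene is assigned to its closest node; genes with similar profiles should land on the same node, which is the sense in which the map "finds its own groups".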

  18. Public Databases
      • Gene expression data is an essential aspect of annotating the genome
      • Publication and data exchange for microarray experiments
      • Data mining/Meta-studies
      • Common data format – XML
      • MIAME (Minimum Information About a Microarray Experiment)

  19. The 3 Gene Ontologies
      • Molecular Function = elemental activity/task
        • the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
      • Biological Process = biological goal or objective
        • broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
      • Cellular Component = location or complex
        • subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme
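One way to picture the three ontologies is as three independent annotation lists attached to each gene. The sketch below is only an illustration: the term names are the examples from the slide, while the gene IDs, their annotations, and the `genes_by_term` helper are hypothetical.

```python
from collections import defaultdict

# The three ontology names follow the slide; everything else is made up.
ONTOLOGIES = ("molecular_function", "biological_process", "cellular_component")

annotations = {
    "geneX": {
        "molecular_function": ["ATPase activity"],
        "biological_process": ["mitosis"],
        "cellular_component": ["nucleus"],
    },
    "geneY": {
        "molecular_function": ["carbohydrate binding"],
        "biological_process": ["purine metabolism"],
        "cellular_component": ["telomere"],
    },
    "geneZ": {
        "molecular_function": ["ATPase activity"],
        "biological_process": ["mitosis"],
        "cellular_component": ["nucleus", "RNA polymerase II holoenzyme"],
    },
}

def genes_by_term(annotations, ontology):
    """Invert the annotations: map each term in one ontology to its genes."""
    if ontology not in ONTOLOGIES:
        raise ValueError(f"unknown ontology: {ontology}")
    index = defaultdict(list)
    for gene, terms in annotations.items():
        for term in terms.get(ontology, []):
            index[term].append(gene)
    return dict(index)

by_component = genes_by_term(annotations, "cellular_component")
print(by_component["nucleus"])  # ['geneX', 'geneZ']
```

Grouping a list of co-regulated genes this way is a simple starting point for asking whether a cluster is enriched for a particular function, process, or location.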

  20. One Last Note
      • Microarrays are “cutting edge” technology
      • You now have experience doing a technique that most Ph.D.s have never done
      • Looks great on a resume…
