MDMS-A Web Tool to Manage & Analyze Gene Expression Microarray Data

Sachin Mathur

Overview
  • Steps in analysis of Gene Expression Microarray Data
    • Preprocessing
    • Filtering
    • Statistical Analysis
    • Machine Learning & Data Mining (Clustering)
    • Functional Analysis
  • Data Analysis features in MDMS
  • Workflow in MDMS
  • Analysis of Early Lung Development dataset using MDMS
  • MDMS Demo
Steps in Microarray Data Analysis
  • Image Quantification & Quality Control → Preprocessing → Filtering → Statistical Analysis → Machine Learning → Functional Analysis
  • Data analysis means deriving a knowledge base from the data and mining information from that knowledge base
Steps in Microarray Data Analysis
  • Image Quantification
    • Check for artifacts, Segmentation
    • Extraction of expression values of genes
  • Preprocessing
    • Background Correction
    • Normalization
    • Summarization
    • Methods: MAS5, RMA, GC-RMA, dChip

www.swegene.org/SWEGENE_microarray_eng.php?Id=18
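
As an illustration of the normalization step, here is a minimal sketch of quantile normalization, the normalization used within RMA. It is not the Bioconductor implementation: the function name and the list-of-lists input are assumptions made for this example, and ties are broken by position rather than averaged.

```python
def quantile_normalize(arrays):
    """Force every array to share the same empirical distribution.

    `arrays` is a list of equal-length lists, one list of probe
    intensities per array. Ties are broken by position (a simplification).
    """
    n_probes = len(arrays[0])
    # For each array, the probe indices ordered from lowest to highest signal.
    orders = [sorted(range(n_probes), key=lambda i: a[i]) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(a[order[k]] for a, order in zip(arrays, orders)) / len(arrays)
        for k in range(n_probes)
    ]
    # Write the reference values back in each array's original rank order.
    normalized = []
    for order in orders:
        out = [0.0] * n_probes
        for rank, original_index in enumerate(order):
            out[original_index] = reference[rank]
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

After normalization every array contains the same set of values, so remaining differences between arrays reflect rank changes of individual probes rather than overall intensity shifts.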

Steps in Microarray Data Analysis
  • Filtering
    • About 10%-50% of the genome is not expressed in a given tissue
    • Aim is to isolate the genes that are expressed
    • Also improves the accuracy of statistical significance tests
    • Specific & Non-specific filtering
      • Filter on Presence/Absence calls
      • Filter on expression signal, Variability in gene expression
Steps in Microarray Data Analysis
  • Statistical Analysis
    • Many genes will be expressed to perform many routine tasks in the cell
    • Aim is to isolate genes responsible for phenotypic variation
    • Interesting vs. random variation
    • Various significance tests, e.g. t-test, ANOVA
    • Multiple Testing Correction
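
To make the per-gene significance test concrete, the sketch below computes Welch's t statistic for each gene and ranks genes by its magnitude. This is an illustrative assumption rather than the method used in MDMS: the function names and the dictionary layout are invented for the example, and converting the statistic to a p-value (which needs the t distribution) is left out.

```python
import math
from statistics import mean, variance


def welch_t(control, treated):
    """Welch's t statistic for one gene, given log2 expression per group."""
    standard_error = math.sqrt(
        variance(control) / len(control) + variance(treated) / len(treated)
    )
    return (mean(treated) - mean(control)) / standard_error


def rank_genes(expression):
    """Sort genes by |t|, largest first.

    `expression` maps a gene name to a (control, treated) pair of lists.
    """
    scores = {gene: welch_t(c, t) for gene, (c, t) in expression.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    data = {
        "geneA": ([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]),
        "geneB": ([5.0, 5.5, 6.0], [5.2, 5.4, 6.1]),
    }
    for gene, t in rank_genes(data):
        print(gene, round(t, 3))
```

Because thousands of genes are tested at once, the resulting p-values would still need a multiple testing correction before any gene is called significant.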
Steps in Microarray Data Analysis
  • Machine Learning Approaches ~ Data Mining
    • Small changes in gene expression, which may not be statistically significant by themselves, can collectively regulate an important pathway
    • Statistical analysis is limited by having few replicates and by fitting approximate models to the data
    • Aim is to find significant patterns in the data set.
      • Periodic, Time-lagged, cyclic
    • Machine Learning approaches mine data for information using computational and statistical techniques (e.g. clustering)
Functional Analysis
  • Functional Analysis
    • Given a statistically significant pattern or list of significant genes, how significant is it biologically?
    • Aim is to find genes that are responsible for the phenotypic condition
    • Extracting annotations and finding functionally similar genes.
      • Gene Ontology
    • Gene set enrichment, relating genes to known pathways

http://cardioserve.nantes.inserm.fr/ptf-puce/images/camembert_go.gif
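
One common way to quantify gene set enrichment is a one-sided hypergeometric test: given how many genes on the array carry an annotation, how surprising is the number seen in the selected list? The sketch below is a generic illustration under that assumption; the function and parameter names are invented here and do not come from the slides.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """P(X >= overlap) under the hypergeometric distribution.

    population: number of genes on the array
    annotated:  genes in the population carrying the annotation (e.g. a GO term)
    selected:   size of the significant gene list
    overlap:    annotated genes found in the significant list
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    # math.comb returns 0 when k > n, so impossible terms vanish on their own.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 10 genes, 3 annotated, 3 selected, all 3 annotated.
    print(enrichment_p_value(10, 3, 3, 3))
```

A small p-value suggests the annotation is over-represented in the list; when many annotations are tested, these p-values also need a multiple testing correction.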

Data Analysis Features in MDMS
  • All data analysis features in MDMS are implemented through Bioconductor packages (http://www.bioconductor.org)
    • Covers many aspects of data analysis for Gene-Expression, SNP, Custom made arrays
    • Many different tests for quality control, preprocessing, filtering, statistical analysis, machine learning and functional analysis
    • Large user community, helpful mailing lists, used by many labs in many countries
    • Tutorials are available on the website and hands-on training is also available.
    • Broader coverage of data analysis aspects than other available packages
    • Open Source
Data Analysis Features in MDMS
  • MDMS supports Affymetrix Gene Expression arrays
  • No Image Quantification (usually done at microarray facility)
  • Quality Control
    • 3’/5’ bias
    • % Detection calls
    • Background signals
    • Correlation coefficients between arrays
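
The correlation check between arrays can be sketched as follows. This is a hypothetical illustration, not the MDMS code: the 0.9 threshold, the use of the median correlation, and the function names are all assumptions chosen for the example.

```python
import math
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length signal lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def flag_outlier_arrays(arrays, threshold=0.9):
    """Return names of arrays whose median correlation with the rest is low.

    `arrays` maps an array name to its list of expression values.
    The threshold is an arbitrary illustrative value.
    """
    flagged = []
    for name, values in arrays.items():
        others = [pearson(values, v) for other, v in arrays.items() if other != name]
        if median(others) < threshold:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    chips = {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [2.0, 4.0, 6.0, 8.5],
        "d": [1.1, 2.0, 3.2, 3.9],
        "c": [4.0, 1.0, 3.0, 2.0],
    }
    print(flag_outlier_arrays(chips))
```

The median is used instead of the mean so that a single poor array does not drag down the score of every other array it is compared with.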
MDMS - Preprocessing
  • Preprocessing
    • MAS5 – Default Affymetrix normalization
    • RMA – Robust Multi-array Average
    • GC-RMA, dChip (Li-Wong)
    • MAS5 and RMA are highly recommended
    • Available literature shows significant advantages of RMA over MAS5
MDMS - Filtering
  • Filtering
    • Expression value cut-off
      • Eg. All genes > 200
    • Detection calls
      • Eg. All genes that are detected as Present
    • Fold Change
      • Eg. All genes with a fold change > 2 or < -2
    • Inter-Quartile Range (1st & 3rd quartiles)
      • For genes that show higher variability
  • All analysis is done on a log 2 scale
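
The filters listed above can be sketched as simple per-gene predicates. The details are assumptions made for illustration: the slides do not say whether the expression cut-off applies to the mean or to every sample (the mean is used here), the input is assumed to be raw-scale signal, and the function names are invented. Fold change and variability are computed on the log2 scale, where a 2-fold change corresponds to a difference of 1.

```python
import math
from statistics import mean, quantiles


def above_cutoff(signals, cutoff=200.0):
    """Expression cut-off on raw signal; uses the mean across samples (assumption)."""
    return mean(signals) > cutoff


def log2_fold_change(control, treated):
    """Difference of mean log2 signals: +1 is 2-fold up, -1 is 2-fold down."""
    return mean(math.log2(v) for v in treated) - mean(math.log2(v) for v in control)


def passes_fold_change(control, treated, fold=2.0):
    """True when the gene changes by more than `fold` in either direction."""
    return abs(log2_fold_change(control, treated)) > math.log2(fold)


def log2_iqr(signals):
    """Inter-quartile range (3rd minus 1st quartile) of the log2 signals."""
    q1, _, q3 = quantiles([math.log2(v) for v in signals], n=4)
    return q3 - q1


if __name__ == "__main__":
    control, treated = [100.0, 110.0], [400.0, 420.0]
    print(above_cutoff(control + treated))
    print(round(log2_fold_change(control, treated), 3))
    print(round(log2_iqr(control + treated), 3))
```

Genes with a larger inter-quartile range vary more across samples, so an IQR threshold keeps the more variable genes without reference to the experimental groups.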
MDMS – Statistical Analysis
  • Significance Tests
    • LIMMA (Linear Models for Microarray Data)
    • SAM (Significance Analysis of Microarrays)
    • EBAM (Empirical Bayes Analysis of Microarrays)
    • Correction for Multiple Testing
      • FDR, Bonferroni, Holm’s correction
  • Machine Learning
    • Clustering
      • Hierarchical Clustering, K-Means, Self Organizing Maps.
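
The three multiple testing corrections named above can be sketched in a few lines each. This is a plain-Python illustration, not the Bioconductor code, and it assumes that "FDR" refers to the Benjamini-Hochberg procedure, which the slides do not state. Each function returns adjusted p-values in the input order.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment (controls the family-wise error rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is scaled by m, the next by m - 1, and so on.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg step-up adjustment (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    p = [0.01, 0.04, 0.03, 0.005]
    print(bonferroni(p))
    print(holm(p))
    print(benjamini_hochberg(p))
```

Bonferroni is the most conservative of the three, Holm is never less powerful than Bonferroni, and the FDR adjustment is usually the least strict, which is why it is popular for gene lists.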
MDMS-Functional Analysis
  • Functional Analysis through GOAPhAR
    • Gene Annotation
    • Protein Annotation
    • Biological Pathways
    • Gene Ontology Annotation
    • Protein Interaction Evidence
  • All gene lists generated using the data analysis options can be saved in the database for future use. These can also be downloaded as text files.
MDMS-WORKFLOW
  • [Workflow diagram; labels: Microarray Core, User, Data Repository, Software, Rat2302, Hg133U, MDMS, Database, Preprocessing, Filtering, Statistical Analysis, Machine Learning, GOAPhAR, Annotation]
Data Analysis Example
  • Data set specifications (GSE3541)
  • The aim of the study is to find genes involved in early lung development.
  • Mechanical stress was applied to fetal type II epithelial cells taken from 19-day-old rat embryos
  • Data set Processing
    • Data was preprocessed by MAS5
    • Filtering: expression > 200; change between pairs of control & experiment samples > 50 (75% filtered)
    • SAM statistical method was used to find significant genes (92 genes, 63 up and 29 down-regulated)
    • 34 up-regulated genes were selected for further analysis
Biological Significance of Clusterings
  • K-Means was applied to the 34 genes, with K = 2, 3, 4, …, 29
  • Random clusterings were generated for K = 2, 3, 4, …, 29 to compare the K-Means clusterings against random ones
  • Biological significance scores were calculated for all clusterings.
  • A z-score and P-value were calculated for each K value
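
The comparison against random clusterings can be sketched as below. The slides do not define the biological significance score or how the P-value was obtained, so everything specific here is an assumption: `shared_annotation_score` is a hypothetical stand-in score, the P-value is an empirical one, and all function names are invented for the example.

```python
import random
from statistics import mean, stdev


def random_clustering(genes, k, rng):
    """Split genes into k non-empty clusters at random."""
    shuffled = genes[:]
    rng.shuffle(shuffled)
    clusters = [[g] for g in shuffled[:k]]  # seed every cluster with one gene
    for g in shuffled[k:]:
        rng.choice(clusters).append(g)
    return clusters


def shared_annotation_score(clusters, annotation):
    """Stand-in score: fraction of within-cluster gene pairs sharing an annotation."""
    same = total = 0
    for cluster in clusters:
        for i in range(len(cluster)):
            for j in range(i + 1, len(cluster)):
                total += 1
                same += annotation[cluster[i]] == annotation[cluster[j]]
    return same / total if total else 0.0


def z_score_and_p(observed, random_scores):
    """Z-score of the observed score and an empirical one-sided p-value."""
    mu, sd = mean(random_scores), stdev(random_scores)
    z = (observed - mu) / sd if sd > 0 else 0.0  # guard: all random scores equal
    at_least = sum(s >= observed for s in random_scores)
    return z, (1 + at_least) / (1 + len(random_scores))


if __name__ == "__main__":
    rng = random.Random(0)
    genes = list("abcdefgh")
    annotation = {g: ("transport" if g in "abcd" else "synthesis") for g in genes}
    observed = shared_annotation_score([list("abcd"), list("efgh")], annotation)
    randoms = [
        shared_annotation_score(random_clustering(genes, 2, rng), annotation)
        for _ in range(200)
    ]
    print(z_score_and_p(observed, randoms))
```

Repeating this for each K gives one z-score and P-value per K, which is the shape of result the slide describes.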
Biological Significance of Clusterings
  • The study found that genes related to amino acid synthesis, amino acid transport and sodium ion transport contributed to lung development.
  • 1 gene for sodium ion transport
  • 4 genes for amino acid transport were found in 2 clusters
  • 4 genes for amino acid synthesis were found in 2 clusters
MDMS
  • Demonstration - Using MDMS to analyze data
MDMS
  • Questions, comments, suggestions