
MDMS - A Web Tool to Manage & Analyze Gene Expression Microarray Data




Sachin Mathur


Overview

  • Steps in analysis of Gene Expression Microarray Data

    • Preprocessing

    • Filtering

    • Statistical Analysis

    • Machine Learning & Data Mining (Clustering)

    • Functional Analysis

  • Data Analysis features in MDMS

  • Workflow in MDMS

  • Analysis of Early Lung Development dataset using MDMS

  • MDMS Demo


Steps in Microarray Data Analysis

  • Image Quantification & Quality Control

  • Preprocessing

  • Filtering

  • Statistical Analysis

  • Machine Learning

  • Functional Analysis

Analysis of data ~ deriving a knowledge base from the data and mining information from that knowledge base


Steps in Microarray Data Analysis

  • Image Quantification

    • Check for artifacts, Segmentation

    • Extraction of expression values of genes

  • Preprocessing

    • Background Correction

    • Normalization

    • Summarization

    • MAS5, RMA, GC-RMA, DChip

www.swegene.org/SWEGENE_microarray_eng.php?Id=18


Steps in Microarray Data Analysis

  • Filtering

    • About 10%-50% of the genome is not expressed in a given tissue

    • Aim is to isolate the genes that are expressed

    • Filtering also improves the accuracy of statistical significance tests

    • Specific & Non-specific filtering

      • Filter on Presence/Absence calls

      • Filter on expression signal, Variability in gene expression


Steps in Microarray Data Analysis

  • Statistical Analysis

    • Many genes will be expressed to perform many routine tasks in the cell

    • Aim is to isolate genes responsible for phenotypic variation

    • Interesting Vs Random

    • Various significance tests ~ t-test, ANOVA

    • Multiple Testing Correction
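To make the multiple testing step concrete, here is a minimal sketch of three common p-value adjustments: Bonferroni, Holm, and a false discovery rate (FDR) adjustment. The Benjamini-Hochberg procedure is assumed for FDR, since the slides do not say which FDR method is meant, and the p-values are made up for illustration.

```python
def bonferroni(pvals):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down adjustment: the i-th smallest p-value is scaled by
    (m - i), and adjusted values are kept non-decreasing."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, (m - rank) * pvals[i])
        running_max = max(running_max, value)
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up adjustment (assumed FDR method)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping adjusted values monotone.
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        value = pvals[i] * m / (rank + 1)
        running_min = min(running_min, value)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
pvals = [0.01, 0.04, 0.03, 0.005]
print("Bonferroni:", bonferroni(pvals))
print("Holm:      ", holm(pvals))
print("BH (FDR):  ", benjamini_hochberg(pvals))
```

Bonferroni and Holm control the chance of any false positive, while the FDR adjustment is less strict and is often preferred when thousands of genes are tested at once.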


Steps in Microarray Data Analysis

  • Machine Learning Approaches ~ Data Mining

    • Small changes in gene expression that are not statistically significant by themselves can collectively regulate an important pathway

    • Statistical analysis is limited by having few replicates and by fitting approximate models to the data

    • Aim is to find significant patterns in the data set.

      • Periodic, Time-lagged, cyclic

    • Machine Learning approaches mine data for information ~ data mining using computational and statistical techniques (e.g. clustering)


Functional Analysis

  • Functional Analysis

    • Given a statistically significant pattern or list of significant genes, how significant is it biologically?

    • Aim is to find genes that are responsible for the phenotypic condition

    • Extracting annotations and finding functionally similar genes.

      • Gene Ontology

    • Gene set enrichment, relating genes to known pathways
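One common way to score gene set enrichment is a hypergeometric over-representation test. The sketch below assumes that test. The slides do not state which method MDMS or GOAPhAR uses, and the function name and counts are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, selected, overlap):
    """Probability of seeing at least `overlap` annotated genes when
    `selected` genes are drawn without replacement from `population`
    genes, of which `annotated` carry the annotation of interest."""
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 genes on the array, 4 annotated to a pathway,
# 3 genes in the significant list, 2 of which are annotated.
print(enrichment_p_value(population=10, annotated=4, selected=3, overlap=2))
```

A small p-value suggests the annotation appears in the gene list more often than chance alone would explain. With many annotations tested, these p-values need multiple testing correction as well.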

http://cardioserve.nantes.inserm.fr/ptf-puce/images/camembert_go.gif


Data Analysis Features in MDMS

  • All data analysis features in MDMS are implemented through Bioconductor (http://www.bioconductor.org)

    • Covers many aspects of data analysis for Gene-Expression, SNP, Custom made arrays

    • Many different tests for quality control, preprocessing, filtering, statistical analysis, machine learning and functional analysis

    • Large user community, helpful mailing lists, used by many labs in many countries

    • Tutorials are available on the website and hands-on training is also available.

    • Better than all available packages in terms of coverage of data analysis aspects.

    • Open Source


Data Analysis Features in MDMS

  • MDMS supports Affymetrix Gene Expression arrays

  • No Image Quantification (usually done at microarray facility)

  • Quality Control

    • 3’/5’ bias

    • % Detection calls

    • Background signals

    • Correlation coefficients between arrays


MDMS - Preprocessing

  • Preprocessing

    • MAS5 – Default Affymetrix normalization

    • RMA – Robust Multi-array Average

    • GC-RMA, dChip (Li-Wong)

    • MAS5 and RMA are highly recommended

    • Available literature shows significant advantages of RMA over MAS5
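As an illustration of what normalization does, here is a minimal sketch of quantile normalization, the normalization step used within RMA. It is not the full RMA method, which also includes background correction and summarization, and it is not the MDMS or Bioconductor code. The function name and values are hypothetical.

```python
def quantile_normalize(arrays):
    """Force every array (chip) to share the same value distribution.

    `arrays` is a list of equal-length lists, one list per array.
    Ties are broken by position, which is a simplification.
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(values) for values in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [
        sum(values[k] for values in sorted_arrays) / len(arrays)
        for k in range(n_probes)
    ]
    normalized = []
    for values in arrays:
        order = sorted(range(n_probes), key=lambda i: values[i])
        out = [0.0] * n_probes
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized


# Two hypothetical arrays measuring the same three probes.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
for row in quantile_normalize(arrays):
    print(row)
```

After normalization each array contains the same set of values, only in a different order, so differences between arrays reflect rank changes rather than overall intensity shifts.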


MDMS - Filtering

  • Filtering

    • Expression value cut-off

      • Eg. All genes > 200

    • Detection calls

      • Eg. All genes that are detected as Present

    • Fold Change

      • Eg. All genes with a fold change > 2 or < -2

    • Inter-Quartile Range (1st & 3rd quartiles)

      • For genes that show higher variability

  • All analysis is done on a log2 scale
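The filters listed above can be sketched as follows. This is an illustration under stated assumptions, not the MDMS implementation: the cut-off is applied to every sample, fold change is computed from group means, and the gene names, values, and detection calls are made up.

```python
import math
from statistics import mean, quantiles


def passes_expression_cutoff(values, cutoff=200.0):
    """Assumption: the gene must exceed the cut-off in every sample."""
    return all(v > cutoff for v in values)


def all_present(calls):
    """Detection calls are one letter per sample: P, M, or A."""
    return all(c == "P" for c in calls)


def log2_fold_change(control, treated):
    return math.log2(mean(treated) / mean(control))


def passes_fold_change(control, treated, fold=2.0):
    """Fold > 2 or < -2 is the same as |log2 ratio| > 1."""
    return abs(log2_fold_change(control, treated)) > math.log2(fold)


def interquartile_range(log2_values):
    """Spread between the 1st and 3rd quartiles; larger means more variable."""
    q1, _median, q3 = quantiles(log2_values, n=4)
    return q3 - q1


# Hypothetical raw expression values and detection calls.
genes = {
    "geneA": {"control": [250, 260, 240], "treated": [600, 640, 620], "calls": "PPPPPP"},
    "geneB": {"control": [300, 310, 290], "treated": [320, 300, 310], "calls": "PPPPPP"},
    "geneC": {"control": [50, 60, 55], "treated": [400, 420, 410], "calls": "AAAPPP"},
}

selected = []
for name, g in genes.items():
    values = g["control"] + g["treated"]
    if (passes_expression_cutoff(values)
            and all_present(g["calls"])
            and passes_fold_change(g["control"], g["treated"])):
        selected.append(name)

iqr_by_gene = {
    name: interquartile_range([math.log2(v) for v in g["control"] + g["treated"]])
    for name, g in genes.items()
}

print("Selected:", selected)
print("IQR (log2):", iqr_by_gene)
```

In this example only geneA passes all three filters: geneB changes too little, and geneC falls below the expression cut-off and has Absent calls in the control samples.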


MDMS – Statistical Analysis

  • Significance Tests

    • LIMMA (Linear Models for Microarray Data)

    • SAM (Significance Analysis of Microarrays)

    • EBAM (Empirical Bayes Analysis of Microarrays)

    • Correction for Multiple Testing

      • FDR, Bonferroni, Holm’s correction

  • Machine Learning

    • Clustering

      • Hierarchical Clustering, K-Means, Self Organizing Maps.
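To show how one of the listed clustering methods groups expression profiles, here is a minimal K-Means sketch (Lloyd's algorithm). It is only an illustration, not the MDMS or Bioconductor code. For reproducibility it starts from the first k profiles, which is an assumption; practical implementations use random or smarter starting points. The profiles are made up.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(profiles, k, max_iter=100):
    """Return (labels, centroids) for a list of equal-length profiles."""
    # Assumption: deterministic start from the first k profiles.
    centroids = [list(p) for p in profiles[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each profile joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical two-condition profiles for six genes.
profiles = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],
]
labels, centroids = k_means(profiles, k=2)
print(labels)
print(centroids)
```

The number of clusters K has to be chosen in advance, which is why it is common to run the method over a range of K values and compare the results.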


MDMS-Functional Analysis

  • Functional Analysis through GOAPhAR

    • Gene Annotation

    • Protein Annotation

    • Biological Pathways

    • Gene Ontology Annotation

    • Protein Interaction Evidence

  • All gene lists generated using the data analysis options can be saved in the database for future use. They can also be downloaded as text files.


MDMS Workflow

[Workflow diagram; labels as extracted: Microarray Core, User, Data Repository, Software, Rat2302, Hg133U, MDMS, Database, Preprocessing, Filtering, Statistical Analysis, Machine Learning, GOAPhAR, Annotation]


Data Analysis Example

  • Data set specifications (GSE3541)

    • The aim of the study is to find genes involved in early lung development.

    • Mechanical stress was applied to fetal type II epithelial cells taken from 19-day-old rat embryos

  • Data set processing

    • Data were preprocessed with MAS5

    • Filtering: expression > 200, invariant change between pairs of control & experiment samples > 50 (75% filtered)

    • The SAM statistical method was used to find significant genes (92 genes: 63 up-regulated and 29 down-regulated)

    • 34 up-regulated genes were selected for further analysis


Biological Significance of Clusterings

  • K-Means was applied to the 34 genes, with K = 2, 3, 4, …, 29

  • Random clusterings were generated for K = 2, 3, 4, …, 29 to compare the K-Means clusterings against random ones

  • Biological significance scores were calculated for all clusterings.

  • A z-score and P-value were calculated for each K value
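The comparison against random clusterings can be sketched as a z-score with a one-sided P-value. The slides do not define the biological significance score or the exact test, so this sketch assumes that higher scores are better and that the random scores are roughly normal. All numbers are hypothetical.

```python
from math import erfc, sqrt
from statistics import mean, stdev


def z_score_and_p_value(observed, random_scores):
    """Compare one observed clustering score with scores of random
    clusterings generated for the same K."""
    mu = mean(random_scores)
    sigma = stdev(random_scores)
    z = (observed - mu) / sigma
    # One-sided upper-tail probability under a normal approximation.
    p = 0.5 * erfc(z / sqrt(2))
    return z, p


# Hypothetical scores for one value of K.
random_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
z, p = z_score_and_p_value(6.0, random_scores)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Repeating this for every K gives one z-score and P-value per K, which makes it possible to see for which numbers of clusters the K-Means result is more biologically coherent than chance.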


Biological Significance of Clusterings

  • The study found that genes related to amino acid synthesis, amino acid transport and sodium ion transport contributed to lung development.

  • 1 gene for sodium ion transport

  • 4 genes for amino acid transport were found in 2 clusters

  • 4 genes for amino acid synthesis were found in 2 clusters


MDMS

  • Demonstration - Using MDMS to analyze data


MDMS

  • Questions, comments, suggestions

