
Gene Expression meets Gene Ontology: A novel statistical method for Microarray analysis

Vasanth Singan

Advisors: Dr. John Colbourne & Dr. Haixu Tang


OUTLINE

  • Introduction
  • Background
  • Challenge
  • Previous Work
  • Methodology
  • Results
  • Future Works
INTRODUCTION
  • Gene expression profiling is providing breakthroughs in medical and fundamental biology research.
  • Many statistical approaches have been developed to analyze microarray results and identify genes that are regulated under experimental conditions.
  • Most of the statistical approaches do not consider the existing biological knowledge. We explore the possibility of using existing knowledge to improve the analysis.
BACKGROUND

Microarray:

1. cDNA or Spotted Array

2. High Density Oligonucleotide Array

Drosophila Microarray Experiment

[Figure: diagram labeled ovo, otu, Sxl]

The Drosophila gene called ovo (shavenbaby) is required in the germline for sex determination and for female-specific germline viability and differentiation.

OVO regulates its own transcription and the transcription of the gene otu.

OVO-B is a transcriptional activator and is sufficient for female fertility.

OVO-A is a transcriptional repressor which, when misexpressed, results in dominant-negative female sterility.

Drosophila Microarray Experiment
  • Goal: To identify additional genes in the germline pathway by probing for both direct and indirect targets of ovo using microarrays.
  • This microarray analysis searched for differentially expressed genes in dissected ovaries from ovo mutants compared to wild type.
  • Microarrays are printed with ~15k spots. PCR primers designed by Incyte Genomics amplify 93% of genes in annotation version 1.0 and 75% in version 3.1.
Significance Analysis of Microarrays (SAM)

SAM computes a statistic d_i for each gene i, measuring the strength of the relationship between gene expression and the response variable.

It uses repeated permutations of the data to determine significance.

SAM produces a ranked list of genes based on the expression levels.
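
The SAM steps above can be illustrated with a minimal sketch for a two-class, unpaired design. It is a simplified stand-in, not the SAM software itself: the function names, the fixed fudge factor `s0=0.1`, and the per-gene permutation p-value are assumptions made for illustration (SAM chooses s0 from the data and reports false discovery rates rather than per-gene p-values).

```python
import random
from statistics import mean


def sam_statistic(group_a, group_b, s0=0.1):
    """Relative difference d = (mean_b - mean_a) / (s + s0) for one gene.

    s is the pooled standard error of the difference in means;
    s0 is a small positive constant (an arbitrary value here).
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = mean(group_a), mean(group_b)
    ss = sum((x - mean_a) ** 2 for x in group_a)
    ss += sum((x - mean_b) ** 2 for x in group_b)
    s = ((1 / n_a + 1 / n_b) * ss / (n_a + n_b - 2)) ** 0.5
    return (mean_b - mean_a) / (s + s0)


def permutation_pvalues(expr, labels, n_perm=200, s0=0.1, seed=0):
    """expr: {gene: [value per sample]}; labels: 0/1 class per sample.

    Returns observed d per gene, a permutation p-value per gene,
    and the genes ranked by |d| (largest first).
    """
    rng = random.Random(seed)

    def d_for(values, lab):
        a = [v for v, l in zip(values, lab) if l == 0]
        b = [v for v, l in zip(values, lab) if l == 1]
        return sam_statistic(a, b, s0)

    observed = {g: d_for(v, labels) for g, v in expr.items()}
    exceed = {g: 0 for g in expr}
    for _ in range(n_perm):
        perm = labels[:]
        rng.shuffle(perm)  # scramble the sample labels
        for g, v in expr.items():
            if abs(d_for(v, perm)) >= abs(observed[g]):
                exceed[g] += 1
    # Add-one correction keeps p-values strictly positive.
    pvals = {g: (exceed[g] + 1) / (n_perm + 1) for g in expr}
    ranked = sorted(expr, key=lambda g: abs(observed[g]), reverse=True)
    return observed, pvals, ranked
```

Each gene is scored on its own here, which is exactly the independence assumption questioned in the problem statement that follows.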

Problem: Most statistical analyses treat each gene independently. In reality, genes are co-regulated, and there are many examples where individual genes do not meet statistical cut-off values yet may be significant when their expression profiles are measured as a group.

CHALLENGE

How to integrate existing knowledge about gene relations to improve tests of significance in microarray analysis?

Previous Work

1. Sung Geun Lee, Jung Uk Hur, and Yang Seok Kim. A graph-theoretic modeling on GO space for biological interpretation of gene clusters. Bioinformatics 2004 20: 381-388. Advance Access published on January 22, 2004.

2. Barry R Zeeberg, et al. GoMiner: a resource for biological interpretation of genomic and proteomic data. Genome Biology 2003.

3. Sung Geun Lee, Wan Seon Lee, and Yang Seok Kim. GOODIES: GO Based Data Mining Tool for Characteristic Attribute Interpretation on a Group of Biological Entities. Genome Informatics 14: 675-676 (2003).

4. Boris Adryan and Reinhard Schuh. Gene ontology-based clustering of gene expression data. Bioinformatics, Advance Access published on April 29, 2004.

5. Peter N. Robinson, Andreas Wollstein, Ulrike Böhme, and Brad Beattie. Ontologizing gene-expression microarray data: characterizing clusters with Gene Ontology. Bioinformatics, Advance Access published on February 5, 2004.

Gene Ontology (GO)

  • Structured, controlled vocabularies (ontologies)
  • DAG (Directed Acyclic Graph); an edge links Node A to Node B by an Is_a / Part_of relation

[Figure: example GO hierarchy]

GO:01 Biological Process
    GO:02 Development
        GO:04 Cell differentiation
    GO:03 Behavior
        GO:05 Locomotory behavior
        GO:06 Reproductive behavior
    . . . .

Annotation

  • GO:01: Genes a, b, c, d, k, l
  • GO:02: Genes d, k, l
  • GO:03: Genes a, b, c, d
  • GO:04: Gene k
  • GO:05: Gene d
  • GO:06: Genes a, c
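
The gene sets on this slide are consistent with annotations being inherited upward through the DAG: a gene annotated to a node also counts toward every ancestor of that node. The sketch below illustrates that propagation. The parent edges (GO:02 and GO:03 under GO:01, GO:04 under GO:02, GO:05 and GO:06 under GO:03) and the `DIRECT` annotations are assumptions, chosen so that propagation reproduces the gene sets listed on the slide.

```python
# Child -> parents edges (Is_a / Part_of) for the toy DAG; assumed layout.
PARENTS = {
    "GO:01": [],
    "GO:02": ["GO:01"],
    "GO:03": ["GO:01"],
    "GO:04": ["GO:02"],
    "GO:05": ["GO:03"],
    "GO:06": ["GO:03"],
}

# Hypothetical direct annotations (most specific terms only).
DIRECT = {
    "GO:02": {"d", "l"},
    "GO:03": {"b"},
    "GO:04": {"k"},
    "GO:05": {"d"},
    "GO:06": {"a", "c"},
}


def propagate(parents, direct):
    """Return {node: genes annotated to the node or any of its descendants}."""
    full = {node: set() for node in parents}
    for node, genes in direct.items():
        stack, seen = [node], set()
        while stack:  # walk from the node up to the root(s)
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            full[current] |= genes
            stack.extend(parents[current])
    return full


if __name__ == "__main__":
    for node, genes in sorted(propagate(PARENTS, DIRECT).items()):
        print(node, sorted(genes))
```

Because a node can have several parents in a DAG, the walk tracks visited nodes so that shared ancestors are only processed once per starting node.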
Methodology

  • Ranked list of genes from SAM: Gene 01, Gene 02, Gene 03, Gene 04, ..., Gene n
  • Gene Ontology DAG nodes: Node 01, Node 02, Node 03, Node 04, ..., Node m

Iterative Refinement

1. Input: Rank List of Genes from SAM

2. Task I: Compute significance of GO Nodes

3. Task II: Compute significance of Genes

4. Repeat Task I and Task II for N iterations

5. Output: Ranked List of Genes and Nodes
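
The alternating structure of this loop can be sketched as follows. The slides do not give the update rules at this point, so the ones used here are placeholders chosen only to make the loop runnable: a node's score is the mean score of its member genes, and a gene's score is a weighted blend of its initial score and the mean score of the nodes that contain it. The function name `refine` and the `weight` parameter are hypothetical.

```python
def refine(gene_scores, node_members, n_iter=5, weight=0.5):
    """Alternate between scoring nodes (Task I) and rescoring genes (Task II).

    gene_scores: {gene: initial score, higher = more significant}
    node_members: {node: set of genes annotated to that node}
    Returns (genes ranked by final score, nodes ranked by final score).
    """
    initial = dict(gene_scores)
    genes = dict(gene_scores)
    nodes = {}
    for _ in range(n_iter):
        # Task I (placeholder rule): node score = mean score of its genes.
        nodes = {}
        for node, members in node_members.items():
            scored = [genes[g] for g in members if g in genes]
            if scored:
                nodes[node] = sum(scored) / len(scored)
        # Task II (placeholder rule): blend initial score with node support.
        updated = {}
        for gene, start in initial.items():
            support = [nodes[n] for n, m in node_members.items()
                       if gene in m and n in nodes]
            if support:
                node_mean = sum(support) / len(support)
                updated[gene] = (1 - weight) * start + weight * node_mean
            else:
                updated[gene] = start  # no GO annotation: score unchanged
        genes = updated
    ranked_genes = sorted(genes, key=genes.get, reverse=True)
    ranked_nodes = sorted(nodes, key=nodes.get, reverse=True)
    return ranked_genes, ranked_nodes
```

In this toy version a weakly scoring gene can move up the ranking when it sits in a node full of strongly scoring genes, which is the kind of group effect the method is after. Genes without any annotation keep their initial score.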

Methodology - I (Log-Likelihood)

Task I: For each node N, find the log-likelihood and probability of it being differentially expressed.

Task II: For each gene i, find the posterior probability of it being differentially expressed.
Inferences from Methodology - I
  • Tests against scrambled input show only marginal significance.
  • The distribution of probabilities of genes within a node is not significantly different from that of the scrambled data set.
  • Noise is high in lowly expressed genes.
  • Nodes with too few or too many genes are affected by their relatively small proportion of significant genes.
Methodology - II (Rank-Based Permutation Test)

Task I: For each node N, find the E-value based on the average rank of genes.

Task II: For each gene i, find the posterior probability based on the E-value of the nodes.
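
Task I of this method can be sketched as a permutation test on average rank. The details below are assumptions, since the slides do not specify them: the null distribution is built by drawing random gene sets of the same size as the node, the p-value is the fraction of draws whose average rank is at least as good as the observed one, and the E-value is that p-value multiplied by the number of nodes tested. Task II is not sketched because the slides do not say how the posterior is derived from the E-values.

```python
import random


def node_evalues(ranked_genes, node_members, n_perm=1000, seed=0):
    """Return {node: (average rank, E-value)} for each annotated node.

    ranked_genes: genes ordered from most to least significant (rank 1 first).
    node_members: {node: set of genes annotated to that node}.
    """
    rng = random.Random(seed)
    rank = {gene: i + 1 for i, gene in enumerate(ranked_genes)}
    all_ranks = list(rank.values())
    n_nodes = len(node_members)
    result = {}
    for node, members in node_members.items():
        ranks = [rank[g] for g in members if g in rank]
        if not ranks:
            continue  # node has no genes on the array
        observed = sum(ranks) / len(ranks)
        hits = 0
        for _ in range(n_perm):
            # Random gene set of the same size as the node.
            sample = rng.sample(all_ranks, len(ranks))
            if sum(sample) / len(sample) <= observed:
                hits += 1
        p_value = (hits + 1) / (n_perm + 1)
        # Assumed definition: expected number of nodes this good by chance.
        result[node] = (observed, p_value * n_nodes)
    return result
```

Sampling sets of matching size means each node is compared against its own null distribution, so node size does not bias the test toward large or small nodes. The smallest attainable p-value is 1 / (n_perm + 1), so `n_perm` bounds how small an E-value can be reported.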
Drosophila Microarray Experiment

[Figure: ranked list of genes and ranked list of nodes]

RESULTS

1. Functional categories (GO nodes) that are enriched with genes that are up-regulated / down-regulated.

2. A ranked list of genes with associated scores representing how significantly these genes are up-regulated / down-regulated.

FUTURE WORKS
  • Cut-off value for genes without GO annotations
  • Jack-knife analysis
  • Analyze additional data sets
ACKNOWLEDGEMENTS

Dr. John Colbourne (CGB)

Dr. Haixu Tang

Center for Genomics and Bioinformatics

Genome Informatics Laboratory