Learning rule-based models from gene expression time profiles annotated with Gene Ontology terms

Learning rule-based models from gene expression time profiles annotated with Gene Ontology terms. Jan Komorowski and Astrid Lägreid. Joint work with. Torgeir R. Hvidsten, Herman Midelfart, Astrid Lægreid and Arne K. Sandvik. Selected Challenges in Gene-expression Analysis.

Presentation Transcript

Learning rule-based models from gene expression time profiles annotated with Gene Ontology terms

Jan Komorowski and Astrid Lägreid

Joint work with
  • Torgeir R. Hvidsten, Herman Midelfart, Astrid Lægreid and Arne K. Sandvik

J. Komorowski and A. Lägreid

Selected Challenges in Gene-expression Analysis
  • Function similarity corresponds to expression similarity, but:
    • Functionally correlated genes may be expression-wise dissimilar (e.g. anti-coregulated)
    • Genes usually have multiple functions
    • Measurements may be approximate and contradictory
  • Can we obtain clusters of biologically related genes?
  • Can we build models that classify unknown genes into functional classes, that are human-legible, and that handle approximate and often contradictory data?
  • How can we re-use biological knowledge?

Data
  • Data material
    • Serum starved fibroblasts, 8,613 genes
      • Added serum to medium at time = 0
      • Used starved fibroblasts as reference
      • Measured gene activity at various time points
    • 493 genes found to be differentially expressed
  • Results
    • 278 genes known (3 repeats)
    • 212 genes unknown (uncharacterized)
    • 211 genes given hypothetical function with 88% quality


Fibroblast - serum response

[Figure: time axis with points 0, 1, 4, 8, 24; cell states quiescent, non-proliferating, proliferating; serum is added and samples are taken for microarray analysis.]


Processes

[Figure: time axis with points 0, 1, 4, 8, 24; cell states quiescent, non-proliferating, proliferating; processes shown: cell cycle re-entry, stress response, protein synthesis, organelle biogenesis, transcription, cell motility, lipid synthesis.]


Dynamic processes

[Figure: time axis with points 0, 1, 4, 8, 24; cell states quiescent, non-proliferating, proliferating; response classes: immediate early, delayed immediate early, intermediate, late; stages: primary, secondary, tertiary.]


Protein appears after the transcript

[Figure: time axis with points 0, 1, 4, 8, 24; cell states quiescent, non-proliferating, proliferating; stages: primary, secondary, tertiary.]


Protein dynamics are not always similar to transcript dynamics

[Figure: time axis with points 0, 1, 4, 8, 24; labels: gene, transcript, protein.]

Molecular mechanisms of transcriptional response

[Figure: labels are serum (= signal), immediate early response factors, immediate early response genes, delayed immediate early response genes, secondary transcription factors, intermediate/late response genes, effectors (= cellular response).]


The dynamics of cellular processes

[Figure: time axis with points 1, 4, 8, 24; cell states quiescent, non-proliferating, proliferating; processes shown: stress response, cell motility, cell adhesion, DNA synthesis, energy metabolism, protein synthesis, cell cycle regulation, lipid synthesis, cell proliferation (negative regulation).]

Methodology

1. Mining functional classes from an ontology

2. Extracting features for learning

3. Inducing minimal decision rules using rough sets

0 - 4(Increasing) AND 6 - 10(Decreasing) AND 14 - 18(Constant) => GO(cell proliferation)

4. The function of unknown genes is predicted using the rules
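
To make the form of such a rule concrete, here is a minimal Python sketch of how the example rule could be checked against one gene's time profile. The slides do not define the templates, so the definition used here (net change between the interval end points compared against a tolerance), the `TOLERANCE` value, the profile layout, and the example numbers are all assumptions made for illustration, not the authors' method.

```python
from typing import Dict, List, Tuple

Profile = Dict[int, float]        # time label -> expression value (assumed layout)
Condition = Tuple[int, int, str]  # (interval start, interval end, template name)

TOLERANCE = 0.5  # assumed threshold; the slides do not give one


def template_of(profile: Profile, start: int, end: int) -> str:
    """Label an interval by the net change between its end points (assumed definition)."""
    change = profile[end] - profile[start]
    if change > TOLERANCE:
        return "Increasing"
    if change < -TOLERANCE:
        return "Decreasing"
    return "Constant"


def rule_fires(profile: Profile, conditions: List[Condition]) -> bool:
    """A rule fires only if every interval matches its template (logical AND)."""
    return all(template_of(profile, s, e) == name for s, e, name in conditions)


# The rule quoted on the slide; its conclusion is GO(cell proliferation).
cell_proliferation_rule = [
    (0, 4, "Increasing"),
    (6, 10, "Decreasing"),
    (14, 18, "Constant"),
]

# Made-up profile, for illustration only.
example_profile = {0: 0.0, 4: 1.2, 6: 1.1, 10: 0.2, 14: 0.1, 18: 0.3}
fires = rule_fires(example_profile, cell_proliferation_rule)
```

With this made-up profile all three conditions hold, so the rule would assign the gene to cell proliferation. Rough-set rule induction itself (step 3) is not sketched here.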

Gene Ontology

Biological processes from GO

  • Amino acid and derivative metabolism
  • Protein targeting
  • Energy pathways
  • DNA metabolism
  • Lipid metabolism
  • Transport
  • Ion homeostasis
  • Intracellular traffic
  • Organelle organization and biogenesis
  • Cell death
  • Cell motility
  • Stress response
  • Cell surface receptor linked signal transduction
  • Oncogenesis
  • Cell cycle
  • Cell adhesion
  • Intracellular signaling cascade
  • Developmental processes
  • Blood coagulation
  • Circulation

Hierarchical Clustering of the Fibroblast Data

It’s not a cluster!

Template-based feature synthesis

12 measurement points, 55 possible intervals of length >2
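
The count of 55 can be reproduced under one reading of "length >2": an interval is a pair of measurement points (start, end) that covers more than two points, so adjacent pairs are excluded. The short Python sketch below enumerates them by index; the 0-based indexing is an assumption made for illustration.

```python
from itertools import combinations

N_POINTS = 12  # number of measurement points stated on the slide

# Keep every (start, end) index pair that spans at least three points,
# i.e. drop the 11 pairs of directly adjacent points.
intervals = [
    (start, end)
    for start, end in combinations(range(N_POINTS), 2)
    if end - start >= 2
]

n_intervals = len(intervals)  # 66 pairs in total minus 11 adjacent pairs
```

Each such interval is a candidate place to test a template (for example Increasing, Decreasing or Constant), which is where the features for learning come from.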

Examples of template definitions

Rule example 1

Rule example 2

Classification using template-based rules

[Figure: a stack of IF … THEN … rules, one of which is shown in full, with the label "+4".]

IF 0 - 4(Constant) AND 0 - 10(Increasing) THEN GO(prot. met. and mod.) OR …

Votes are normalized and processes with vote fractions higher than a selection-threshold are chosen as predictions
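
The voting step can be sketched in Python as follows. The slide states only that votes are normalized and compared with a selection threshold; how many votes each fired rule casts, the class names, and the threshold value used below are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

# A fired rule is represented as (predicted classes, number of votes it casts).
FiredRule = Tuple[List[str], int]


def predict(fired_rules: Iterable[FiredRule], threshold: float) -> Dict[str, float]:
    """Return the classes whose normalized vote fraction exceeds the threshold."""
    votes: Counter = Counter()
    for classes, weight in fired_rules:
        for go_class in classes:
            votes[go_class] += weight
    total = sum(votes.values())
    if total == 0:
        return {}  # no rule fired, so no prediction is made
    return {c: v / total for c, v in votes.items() if v / total > threshold}


# Made-up example: three fired rules casting 4, 2 and 1 votes.
fired = [
    (["protein metabolism and modification", "transport"], 4),
    (["protein metabolism and modification"], 2),
    (["cell cycle"], 1),
]
predictions = predict(fired, threshold=0.3)
```

In this made-up example the vote fractions are 6/11, 4/11 and 1/11, so two classes pass a threshold of 0.3 and the gene receives two predicted functions. Raising the threshold trades coverage for precision.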

Cross validation estimates Iyer et al.

  • A: Coverage 84%, Precision 50%
  • B: Coverage 71%, Precision 60%
  • C: Coverage 39%, Precision 90%

Coverage = TP/(TP+FN)

Precision = TP/(TP+FP)
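
As a small worked example of the two definitions, the Python sketch below computes both measures from raw counts. The counts are made up for illustration and are not taken from the presentation. Coverage as defined here is the quantity often called recall or sensitivity.

```python
def coverage(tp: int, fn: int) -> float:
    """Coverage = TP / (TP + FN): share of true annotations that were predicted."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp: int, fp: int) -> float:
    """Precision = TP / (TP + FP): share of predictions that were correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Made-up counts for illustration only.
tp, fn, fp = 30, 20, 10
example_coverage = coverage(tp, fn)    # 30 / 50
example_precision = precision(tp, fp)  # 30 / 40
```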

Cross validation estimates Cho et al.

Coverage: 58%

Precision: 61%

Coverage = TP/(TP+FN)

Precision = TP/(TP+FP)

Protein Metabolism and Modification

[Figure legend]
  • A – annotations
  • B – false negatives
  • C – false positives
  • D – true positives
  • E – pred. unknown gene

Re-classification of the Known Genes

Co-classifications for the Unknown Genes

Conclusions
  • Our methodology
    • Incorporates background biological knowledge
    • Handles noise and incompleteness in the microarray data well
    • Can be objectively evaluated
    • Predicts multiple functions per gene
    • Can reclassify known genes and provide possible new functions of the known genes
    • Can provide hypotheses about the function of unknown genes
  • Experimental work needs to be done to confirm our predictions
