Simple methods for peak detection in times series microarray data
Simple Methods for Peak Detection in Times Series Microarray Data. I. Azzini R. Dell’Anna F. Ciocchetta F. Demichelis A. Sboner Bioinformatics Group, SRA, ITC-Irst Department of Information and C.T. Trento University, Italy E. Blanzieri A. Malossini


Simple Methods for Peak Detection in Times Series Microarray Data

I. Azzini R. Dell’Anna

F. Ciocchetta F. Demichelis

A. Sboner

Bioinformatics Group, SRA, ITC-Irst; Department of Information and C.T., Trento University, Italy

E. Blanzieri A. Malossini

Department of Information and Communication Technology

Trento University


Preliminary Analysis

  • Visual inspection of images

    • There are blurs on the images

  • We used alternative software for image analysis

    • TIGR SpotFinder

    • ScanAlyze

  • We reapplied the GenePixPro 3.0 quality control algorithm on a sample of images

  • Conclusions

    • The preliminary analysis produced no evidence against the reliability of the original measurements.

    • We use QC_Dataset for further analysis


Our analysis problem

  • To detect and characterize genes that present peaks and singularities over the time series.

  • Motivations:

    • Primary: Peak genes could play an intriguing role

    • Secondary: artifact detection


Our approach

  • Detection of spike genes

    • Apply a series of simple methods based on discrete derivative and integral.

  • Characterization of genes

    • Functional Classification using Multiclass SVM
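The slides do not define the individual scoring methods, so the following is only a minimal sketch of what "simple methods based on discrete derivative and integral" could look like. Both function names, the `width` parameter, and the use of the median as a baseline are assumptions made for illustration, not the authors' M1-M6.

```python
from statistics import median

def derivative_peak_score(x):
    """Hypothetical derivative-based score: the largest 'up then down' pattern.

    For each interior time point t, take the rise into t (backward difference)
    and the fall out of t (negated forward difference); a peak needs both.
    """
    best = 0.0
    for t in range(1, len(x) - 1):
        rise = x[t] - x[t - 1]
        fall = x[t] - x[t + 1]
        best = max(best, min(rise, fall))
    return best

def integral_peak_score(x, width=3):
    """Hypothetical integral-based score: the largest area above the series
    median inside a sliding window (rectangle-rule sum)."""
    base = median(x)
    best = 0.0
    for start in range(len(x) - width + 1):
        area = sum(v - base for v in x[start:start + width])
        best = max(best, area)
    return best

# Toy series (made up for illustration): an isolated peak and a steady ramp.
peak = [0, 0, 0, 5, 0, 0, 0]
ramp = [0, 1, 2, 3, 4, 5, 6]
print(derivative_peak_score(peak), derivative_peak_score(ramp))
print(integral_peak_score(peak), integral_peak_score(ramp))
```

In this toy example the derivative score ignores the ramp, while the integral score alone still responds to it, which is one reason a combination of scores may be preferable to any single one.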


Outline of the talk

  • Preliminary analysis

  • The analysis problem

  • Methods for peak detection

  • M-SVM for oligo classification

  • Results

  • Discussion


QC_Dataset

Our notation:

$X_{o,t} = E(o,t)$


Missing value management (data imputation)

  • Up to 2 adjacent missing values were replaced by interpolation

  • Oligos with more than 2 adjacent missing values were discarded

  • Extrapolation for TP1 and TP48 (For functional classification)
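The imputation rule above can be sketched as follows. The slides do not say which interpolation was used, so linear interpolation is an assumption here, as are the function name `impute` and the use of `None` to mark a missing value. Gaps at the first or last time point are left untouched in this sketch, since the extrapolation scheme is not described.

```python
def impute(series, max_gap=2):
    """Fill interior runs of up to `max_gap` missing values (None) by linear
    interpolation between the neighbouring observed values.

    Returns a new list, or None when a longer run of missing values is found,
    meaning the oligo would be discarded.
    """
    x = list(series)
    n = len(x)
    i = 0
    while i < n:
        if x[i] is not None:
            i += 1
            continue
        j = i
        while j < n and x[j] is None:  # find the end of the missing run
            j += 1
        if j - i > max_gap:
            return None                # too many adjacent missing values
        if i > 0 and j < n:            # interior gap: interpolate
            left, right = x[i - 1], x[j]
            span = j - (i - 1)
            for k in range(i, j):
                x[k] = left + (k - (i - 1)) / span * (right - left)
        i = j                          # boundary gaps stay as None
    return x

print(impute([1.0, None, None, 4.0]))        # two adjacent gaps are filled
print(impute([1.0, None, None, None, 5.0]))  # three adjacent gaps: discarded
```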


Methods for peak detection

None of the methods is 100% precise and 100% accurate



Methods for peak detection

The combination of M1, M2 and M3 is less prone to detecting ramps instead of peaks




Detection procedure

  • Each method M1-M6 scores the oligos.

  • We selected the oligos that were ranked among the first ten by at least one method
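The selection rule above amounts to taking the union of each method's top-ranked oligos. A minimal sketch, assuming each method returns a score per oligo and that a higher score means "more peak-like" (the slides do not state the direction of the ranking); the function name, the oligo labels and the scores are made up for illustration.

```python
def select_top_ranked(scores_by_method, k=10):
    """Return the oligos ranked among the first k by at least one method.

    scores_by_method maps a method name to a dict {oligo_id: score};
    higher scores are assumed to rank first.
    """
    selected = set()
    for scores in scores_by_method.values():
        ranked = sorted(scores, key=scores.get, reverse=True)
        selected.update(ranked[:k])
    return selected

# Hypothetical scores for three oligos under two methods, with k=1.
toy = {
    "M1": {"oligo_a": 3.0, "oligo_b": 1.0, "oligo_c": 2.0},
    "M2": {"oligo_a": 0.0, "oligo_b": 5.0, "oligo_c": 1.0},
}
print(sorted(select_top_ranked(toy, k=1)))
```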


Detection procedure

  • We discarded oligos whose signal-to-noise ratio is less than 2

    • This S/N threshold is higher than the one adopted in the original work

    • We need such a filter to discard extremely noisy signals

  • We visually inspected all the oligos in the table and discarded those that do not present peaks


Detection procedure: Selected genes

opfblob0072

n128_25

f65819_1

m364_2

m12963_1

n159_34

ks244_7

n128_61

opfm60504

l1_28 ET

ks75_15 ET

c154

b593

b597

n176_5

opfh0008

opfblob0105

b541

n132_108

m50253_2

ks1030_4 OM

n128_33

f71224_1

opfh0022

e17542_1


Functional Classification (M-SVM)

  • Multiclass Support Vector Machine

    • Pairwise classification: N(N-1)/2 binary classifiers for N classes.

    • Majority vote

  • Scheme for replacing missing values

  • Trained on data of Table S2

    • 530 samples and 14 functional classes

    • Leave-one-out (LOO) accuracy is 73%

  • We applied the classifiers to the complete_dataset and scored the results depending on the voting.
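The pairwise scheme with majority vote can be sketched as follows. With 14 classes it needs 14·13/2 = 91 binary classifiers. The `decide` callable stands in for a trained binary SVM; here it is replaced by a toy nearest-centroid rule purely for illustration, and all names and values are hypothetical.

```python
from collections import Counter
from itertools import combinations

def n_pairwise(n_classes):
    """Number of binary classifiers in a pairwise (one-vs-one) scheme."""
    return n_classes * (n_classes - 1) // 2

def pairwise_vote(x, classes, decide):
    """Classify x by majority vote over all class pairs.

    decide(a, b, x) must return either a or b; it stands in for the binary
    classifier trained on classes a and b. Returns the winning class and its
    number of votes, which can serve as a crude confidence score.
    """
    votes = Counter(decide(a, b, x) for a, b in combinations(classes, 2))
    winner, count = votes.most_common(1)[0]
    return winner, count

# Toy stand-in for the binary classifiers: pick the nearer 1-D centroid.
centroids = {"A": 0.0, "B": 5.0, "C": 10.0}

def nearest(a, b, x):
    return a if abs(x - centroids[a]) <= abs(x - centroids[b]) else b

print(n_pairwise(14))
print(pairwise_vote(4.0, list(centroids), nearest))
```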









Significant peaks or artifacts?

  • We tested:

    • Data Quality (from preliminary analysis)

    • We discarded oligos with low signal-to-noise ratios

    • The peaks have different widths and amplitudes (not consistent with a synchronization-induced artifact)


How are the peaks distributed over time?

  • Plasmodium falciparum goes through different phases during the 48-hour intraerythrocytic developmental cycle (IDC): Ring, Trophozoite, Schizont

  • The peaks that we detected seem to concentrate at specific time points.

  • We used the Kolmogorov-Smirnov test to rule out a uniform distribution
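A one-sample Kolmogorov-Smirnov test against a uniform distribution can be sketched in a few lines. The slides do not give the exact variant or the peak times, so the sketch below makes several assumptions: the time axis is treated as continuous on [0, 48] hours, the p-value uses the asymptotic Kolmogorov series (rough for small samples), and the peak times are invented for illustration.

```python
import math

def ks_uniform(times, lo=0.0, hi=48.0, terms=100):
    """One-sample KS test of `times` against Uniform(lo, hi).

    Returns (D, p) where D is the maximum distance between the empirical CDF
    and the uniform CDF, and p is the asymptotic p-value
    2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 n D^2), truncated at `terms`.
    """
    xs = sorted(times)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = (x - lo) / (hi - lo)                 # uniform CDF at x
        d = max(d, i / n - f, f - (i - 1) / n)   # check both sides of the step
    lam2 = n * d * d
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam2)
                  for k in range(1, terms + 1))
    return d, min(max(p, 0.0), 1.0)

# Invented peak times (hours) clustered around the middle of the cycle.
clustered = [16, 17, 17, 18, 18, 18, 19, 19, 19, 19,
             20, 20, 20, 20, 21, 21, 21, 22, 22, 22]
d, p = ks_uniform(clustered)
print(f"D = {d:.3f}, p = {p:.2g}")
```

A small p-value, as for the invented clustered sample, argues against the peaks being spread uniformly over the cycle.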


How are the peaks distributed over time?



Discussion

  • The peaks are not distributed uniformly over time

  • A (possibly interesting) high number of peaks occurs near a transition phase.


Conclusions

  • We presented

    • Methods for peak detection

    • Functional classification via M-SVM

  • The peaks are not distributed uniformly over time




Biological Interpretation

  • Critical issue about our analysis


Selected genes

opfblob0072

n128_25

f65819_1

m364_2

m12963_1

n159_34

ks244_7

n128_61

opfm60504

l1_28

ks75_15

c154

b593

b597

n176_5

opfh0008

opfblob0105

b541

n132_108

m50253_2

ks1030_4

n128_33

f71224_1

opfh0022

e17542_1



Per-gene slides (one figure each): n128_25, f65819_1, m364_2, m12963_1, n159_34, ks244_7, n128_61, opfm60504, l1_28, ks75_15, c154, b593, b597, n176_5, opfh0008, b541, n132_108, m50253_2, ks1030_4, n128_33, f71224_1, opfh0022, e17542_1

