ENCODE Gene Prediction Workshop - 2005

CSTminer

G. Pesole

(F. Mignone)


CSTminer and CPS computation I

CPS computation

  • CSTminer:

  • compares evolutionarily related sequences

  • identifies Conserved Sequence Tags (CSTs)

  • assigns a Coding Potential Score (CPS) by quantifying the peculiar evolutionary dynamics of coding sequences at both the nucleotide and amino acid level.

[Figure: homologous sequences, BLAST-like alignment, HSP]


Definition of CPS cutoff I

[Figure: distribution of CPS values for two sets of CSTs (x-axis: CPS; y-axis: % CSTs)]

24,600 CSTs (≥ 5%): average CPS = 8.32 (± 0.99)

184,046 CSTs (≥ 5%): average CPS = 5.43 (± 0.79)


Definition of CPS cutoff II

Less than 1% of coding CSTs have CPS ≤ 6.41

Less than 1% of non coding CSTs have CPS ≥ 7.66

[Figure: CPS axis with cutoffs at 6.41 and 7.66; regions labelled H-COD, L-COD and Non coding]
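The two cutoffs (6.41 and 7.66) split the CPS axis into three classes. A minimal sketch of that classification follows. The slide text does not state which side of each cutoff belongs to which class, so the mapping here (CPS ≥ 7.66 is H-COD, CPS ≤ 6.41 is Non coding, anything in between is L-COD) is an assumption, as are the function name and the handling of values exactly on a cutoff.

```python
# Cutoff values taken from the slide; the class on each side is an assumption.
NON_CODING_CUTOFF = 6.41
H_COD_CUTOFF = 7.66


def classify_cps(cps: float) -> str:
    """Assign a CST to H-COD, L-COD or Non coding from its CPS (assumed mapping)."""
    if cps >= H_COD_CUTOFF:
        return "H-COD"        # assumed: high-confidence coding
    if cps <= NON_CODING_CUTOFF:
        return "Non coding"   # assumed: below the lower cutoff
    return "L-COD"            # assumed: between the two cutoffs


if __name__ == "__main__":
    for value in (8.32, 7.0, 5.43):
        print(value, classify_cps(value))
```

Under this assumed mapping, the two average CPS values reported for the benchmark sets (8.32 and 5.43) fall on opposite sides of the L-COD band.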


Prediction of “novel” human genes by comparing mouse syntenic regions of Chr15, Chr21 and Chr22

[Figure panels: H. sapiens Chr 22, H. sapiens Chr 21, H. sapiens Chr 15]


CST annotation

[Figure: CSTs annotated as L-COD and H-COD]


CST annotation

[Figure: a known gene with Exon1, Exon2 and Exon3, showing an intergenic CST, an intronic CST and an exonic CST]
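The three annotation classes (intergenic, intronic, exonic) can be illustrated with a small coordinate check against a known gene. This is only a sketch: the slides do not give the actual overlap rules, so the inclusive-interval convention, the "any overlap with an exon counts as exonic" rule, and all names below are assumptions.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive coordinates (assumed convention)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two inclusive intervals share at least one position."""
    return a[0] <= b[1] and b[0] <= a[1]


def annotate_cst(cst: Interval, gene: Interval, exons: List[Interval]) -> str:
    """Label a CST relative to one known gene (hypothetical rule set)."""
    if not overlaps(cst, gene):
        return "intergenic"
    if any(overlaps(cst, exon) for exon in exons):
        return "exonic"
    return "intronic"


if __name__ == "__main__":
    gene = (100, 1000)
    exons = [(100, 200), (400, 500), (900, 1000)]
    for cst in [(10, 50), (150, 180), (250, 300)]:
        print(cst, annotate_cst(cst, gene, exons))
```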


Genome annotation of Coding CSTs

984 coding CSTs in intergenic regions, 423 CSTs in intronic regions


Clustered intergenic/coding CSTs may represent novel genes

≥ 4 clustered coding CSTs (>90% genes)

Typical gene (average L: 57 kbp)


Cluster Definition

Step I: preclusters definition

[Figure: CST start positions (CSTstart i, CSTstart i+1, …) along the genomic sequence, grouped into preclusters (pc)]

Step II: clusters building

[Figure: clusters along the genomic sequence]
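The two-step procedure is only shown graphically, so the exact rules are not recoverable from the transcript. As a rough stand-in, the sketch below groups CST start positions into preclusters whenever consecutive starts lie within a maximum gap, then keeps groups with at least four CSTs, echoing the "≥ 4 clustered coding CSTs" criterion. The `max_gap` parameter, its value, and both function names are hypothetical and not taken from CSTminer.

```python
from typing import List


def build_preclusters(starts: List[int], max_gap: int) -> List[List[int]]:
    """Step I (assumed rule): chain sorted CST starts separated by <= max_gap."""
    groups: List[List[int]] = []
    for start in sorted(starts):
        if groups and start - groups[-1][-1] <= max_gap:
            groups[-1].append(start)
        else:
            groups.append([start])
    return groups


def build_clusters(preclusters: List[List[int]], min_csts: int = 4) -> List[List[int]]:
    """Step II (assumed rule): keep groups containing at least min_csts CSTs."""
    return [group for group in preclusters if len(group) >= min_csts]


if __name__ == "__main__":
    cst_starts = [100, 5000, 9000, 12000, 500000, 505000]
    pcs = build_preclusters(cst_starts, max_gap=10000)  # hypothetical gap value
    print(build_clusters(pcs))
```

A gap-based grouping like this would also reproduce the limitation listed in the conclusions: two proximal genes closer than the gap end up in one cluster.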


Clustered intergenic/coding CSTs:

Supporting features

-> 301 Clustered CSTs (out of 984 intergenic CSTs)

-> 25 Clusters

20/25 Genscan/Twinscan

20/25 RefSeq, TrEMBL, UniGene

18/25 ESTs

19/25 Mouse Ensembl genes

11/25 Human Ensembl genes (new release)

4 unsupported clusters


CST cluster

15P1 corresponds to a newly annotated gene


Intronic CSTs may represent novel splicing isoforms

CST 22_E_936


Conclusion

  • What CSTminer does:

  • Detects Coding-conserved regions

  • With CST clustering, it is possible to detect coding gene regions

  • May support any other kind of gene predictions

  • May identify splice variants

What CSTminer doesn’t do:

-Doesn’t detect gene structure and exon boundaries

-May merge proximal genes

What we can do next:

-improve cutoff definition

-multi species comparison

-improve clustering definition

Pros

-No annotation or known sequences (mRNAs, ESTs…) required

-Easy to automate

-No manual work

-very fast


CPS computation - II

for f = 1, 2, 3, -1, -2, -3

The CPS computation requires a given amount of genetic divergence between aligned CSTs (i.e. Ka>0 & Ks>0). But what is the minimum divergence needed to obtain a reliable CPS? … And what CPS cutoff discriminates coding from non coding?
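The index f = 1, 2, 3, -1, -2, -3 reads as the three forward and three reverse reading frames. The CPS formula itself is not preserved in this transcript, so the sketch below only shows how codons could be enumerated per frame. The convention that negative frames are read on the reverse complement, and the function names, are assumptions.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def codons_in_frame(seq: str, f: int) -> list:
    """List the complete codons of seq in frame f (1, 2, 3, -1, -2, -3).

    Assumed convention: negative frames start from the 5' end of the
    reverse-complement strand.
    """
    if f not in (1, 2, 3, -1, -2, -3):
        raise ValueError("frame must be one of 1, 2, 3, -1, -2, -3")
    strand = seq if f > 0 else reverse_complement(seq)
    offset = abs(f) - 1
    return [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]


if __name__ == "__main__":
    for frame in (1, 2, 3, -1, -2, -3):
        print(frame, codons_in_frame("ATGGCCTAA", frame))
```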


Definition of CST minimum divergence

To assess the minimum divergence for reliable CPS computation and to define optimal cutoff values we used two benchmark datasets (coding and non coding).

[Figure: benchmark results at divergence levels labelled 2, 3 and 5 % (axis: % of divergence)]
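A minimum-divergence filter of this kind could be sketched as follows. The definition of divergence used here (percentage of mismatching columns among ungapped aligned positions) and the 5% default, read from the "≥ 5%" labels on the benchmark sets, are assumptions. The slides do not say how divergence was measured.

```python
def percent_divergence(a: str, b: str) -> float:
    """Percentage of mismatches between two aligned sequences.

    Assumed definition: columns containing a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * mismatches / compared


def divergent_enough(a: str, b: str, min_percent: float = 5.0) -> bool:
    """True if the aligned pair reaches the (assumed) minimum divergence."""
    return percent_divergence(a, b) >= min_percent


if __name__ == "__main__":
    print(percent_divergence("ACGTACGTAC", "ACGTACGTAT"))
    print(divergent_enough("ACGTACGTAC", "ACGTACGTAC"))
```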
