Computational identification of promoters and first exons in the human genome

Ramana V Davuluri, Ivo Grosse & Michael Q. Zhang. Nature Genetics 29:412-417 2001.


Identify and characterize all the genes in the human genome

3,300,000 kb: ~30,000 genes

Genscan, FGENES and MZEF


Sequencing Progress

  • Draft: 51.4%
  • Finished: 47.1%
  • Total: 98.5%


[Figure: a gene made up of exons and introns.]


[Figure: structure of a gene and its mRNA. Labels: promoter; exons 1 to 3; GT and AG splice sites; ATG start codon; TAA, TAG and TGA stop codons; polyadenylation signal AATAAA; mRNA with a 5' m7G cap and a 3' poly(A) tail; ORESTES; dbEST; globin gene.]


Polyadenylation

Chemical modifications at both ends of the mRNA.

[Figure: the transcript carries a 5' m7G cap and the polyadenylation signal AAUAAA, followed by a GU-rich sequence. An endonuclease cleaves the transcript downstream of AAUAAA, and poly(A) polymerase then adds the poly(A) tail to the new 3' end.]


[Figure: splice-site consensus sequences at the exon 1 / intron / exon 2 boundaries. The intron is tens to 10,000 nt long, begins with GT and ends with AG, with a pyrimidine-rich (C/T) tract upstream of the 3' AG.]
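The GT...AG boundary rule shown in the figure can be illustrated with a small scan. This is only a minimal sketch and not a splice-site predictor: the function name `candidate_introns` is hypothetical, and the length bounds simply reuse the "tens to 10,000 nt" range from the slide.

```python
def candidate_introns(seq, min_len=10, max_len=10_000):
    """Return (start, end) pairs where seq[start:end] begins with GT and ends with AG.

    Hypothetical helper: it only applies the GT...AG rule and a length filter,
    it does not score the consensus or the pyrimidine tract.
    """
    seq = seq.upper()
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    pairs = []
    for d in donors:
        for a in acceptors:
            end = a + 2  # exclusive end, just past the AG
            if min_len <= end - d <= max_len:
                pairs.append((d, end))
    return pairs
```

Almost every such pair is a false positive in real genomic sequence, which is why a probabilistic score per GT site is needed on top of the plain dinucleotide rule.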


[Figure: gene structure showing the promoter, exons 1 to 3, GT and AG splice sites, the ATG start codon, the TAA, TAG and TGA stop codons, and the polyadenylation signal AATAAA.]


[Figure: two kinds of first exon, with the transcription start marked +1. In a partially coding first exon the ATG start codon lies inside exon 1; in a non-coding first exon the ATG lies in a downstream exon. Non-coding exon: 40%.]




  • mRNAs and 5' UTRs were aligned with genomic sequences;
  • each first exon was retrieved with 500 bases on either side;
  • redundant and ambiguous sequences were removed.

FEdb: 2,139

Splice-donor sites (GT)

For every GT site, the program computes the probability that it is a splice-donor site.

P(donor site|GT) > 0.4

Promoter

P(promoter|window) > 0.4

[Figure: regions around the ATG and the candidate GT sites, labelled 500 bp, 70 bp and 1,500 bp.]


First exon

The same pipeline with its final step added: donor sites with P(donor site|GT) > 0.4 are combined with promoter windows with P(promoter|window) > 0.4, and the region between them is accepted as a first exon when P(exon|all) > 0.5.

[Figure: first exon between the promoter and the GT donor site, with regions labelled 500 bp and 1,500 bp.]
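The three thresholds can be read as a cascade of filters. The sketch below shows that control flow only and is not the published implementation. The scoring functions `p_donor`, `p_promoter` and `p_exon` are hypothetical callables supplied by the caller, and the defaults `window=570` (500 + 70) and `upstream=1500` are one reading of the figure labels, not values stated in the slide text.

```python
DONOR_CUTOFF = 0.4     # P(donor site|GT) threshold from the slide
PROMOTER_CUTOFF = 0.4  # P(promoter|window) threshold from the slide
EXON_CUTOFF = 0.5      # P(exon|all) threshold from the slide


def predict_first_exons(seq, p_donor, p_promoter, p_exon,
                        window=570, upstream=1500):
    """Return (promoter_start, donor_pos, exon_prob) for accepted candidates.

    p_donor(seq, gt_pos), p_promoter(seq, start, end) and
    p_exon(seq, promoter_start, gt_pos) are assumed to return probabilities.
    """
    seq = seq.upper()
    hits = []
    for gt in range(len(seq) - 1):
        if seq[gt:gt + 2] != "GT":
            continue
        if p_donor(seq, gt) <= DONOR_CUTOFF:
            continue  # step 1: reject unlikely donor sites
        first = max(0, gt - upstream)
        # step 2: slide a promoter window over the region upstream of the GT
        for start in range(first, gt - window + 1):
            if p_promoter(seq, start, start + window) <= PROMOTER_CUTOFF:
                continue
            # step 3: score the region between promoter window and donor site
            prob = p_exon(seq, start, gt)
            if prob > EXON_CUTOFF:
                hits.append((start, gt, prob))
    return hits
```

Ordering the filters this way means the more expensive promoter and exon scores are computed only for GT sites that already look like donor sites.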


Results

  • First-exon database (FEdb): 2,139
  • Partially coding: 1,315 (61%), 348 bp
  • Non-coding: 824 (39%), 151 bp
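As a quick arithmetic check, the two categories should add up to the FEdb total and reproduce the quoted percentages. The snippet uses only the counts given on the slide; the variable names are illustrative.

```python
# Counts taken from the slide
counts = {"partially coding": 1315, "non-coding": 824}

total = sum(counts.values())
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total)   # 2139
print(shares)  # {'partially coding': 61.5, 'non-coding': 38.5}
```

To one decimal place the shares are 61.5% and 38.5%, which the slide reports as 61% and 39%.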


  • First exon and CpG islands

CpG score = GC% / total window

[Figure: GC% computed in successive windows across a first exon (ATG and GT marked) and the 500 bp on either side; the label 201 bp appears to mark the window size.]
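A windowed GC profile of the kind plotted in the figure could be computed as follows. This is a sketch under stated assumptions: the 201-bp default is taken from the figure label, the function names are hypothetical, and the value returned is the plain G+C fraction per window, which is not necessarily the exact CpG score used by the authors.

```python
def gc_fraction(window):
    """Fraction of G and C bases in a sequence window (0.0 for an empty window)."""
    window = window.upper()
    if not window:
        return 0.0
    return (window.count("G") + window.count("C")) / len(window)


def sliding_gc(seq, size=201, step=1):
    """GC fraction for every full window of `size` bases, moving by `step`."""
    return [gc_fraction(seq[i:i + size])
            for i in range(0, len(seq) - size + 1, step)]
```

For example, `sliding_gc("GCGCATAT", size=4, step=2)` returns `[1.0, 0.5, 0.0]`. A sequence shorter than the window yields an empty list.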



  • First exon and CpG islands

[Figure: promoter and exon 1 on an axis marked -200, 0 and +1, with the values 93.8% and 76.3%.]


FirstEF

FirstEF predicts the first exon and the promoter region using different discriminant functions organized as a decision tree.

It uses probabilistic models designed to find splice-donor sites and promoter regions, both CpG-related and non-CpG-related.

For every splice-donor site and every promoter region, FirstEF decides whether the intervening region can be a first exon, based on a set of quadratic discriminant functions.
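To make the term concrete, a quadratic discriminant function scores a feature vector against a Gaussian model of each class and compares the scores. The sketch below is a generic two-class example and not FirstEF's actual functions: the features are placeholders, the helper names are hypothetical, and a diagonal covariance is assumed to keep the code dependency-free (a full covariance matrix is the more general form).

```python
import math


def fit_class(samples):
    """Per-feature mean and variance for one class (diagonal covariance assumed)."""
    n = len(samples)
    dims = len(samples[0])
    mean = [sum(s[d] for s in samples) / n for d in range(dims)]
    # small constant keeps the variance strictly positive
    var = [sum((s[d] - mean[d]) ** 2 for s in samples) / n + 1e-9
           for d in range(dims)]
    return mean, var


def qdf_score(x, mean, var, prior):
    """Gaussian log-density up to a constant, plus the log prior."""
    quad = sum((xi - m) ** 2 / v for xi, m, v in zip(x, mean, var))
    log_det = sum(math.log(v) for v in var)
    return -0.5 * quad - 0.5 * log_det + math.log(prior)


def posterior(x, pos, neg, prior_pos=0.5):
    """P(positive class | x) from the two discriminant scores."""
    sp = qdf_score(x, *pos, prior_pos)
    sn = qdf_score(x, *neg, 1.0 - prior_pos)
    top = max(sp, sn)  # subtract the maximum for numerical stability
    ep, en = math.exp(sp - top), math.exp(sn - top)
    return ep / (ep + en)
```

Because each class has its own variances, the decision boundary is quadratic in the features rather than linear. A posterior like this is what a cutoff such as 0.4 or 0.5 would be applied to.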


  • Systematic validation analysis.

Cross-validation (FEdb)
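Cross-validation on a database such as FEdb means repeatedly holding out part of the entries for testing and training on the rest. The sketch below shows one common way to generate such splits. The slide does not state the number of folds, so `k=10` is an assumption, as are the function name and the fixed seed.

```python
import random


def k_fold_indices(n_items, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)      # reproducible shuffle
    folds = [idx[i::k] for i in range(k)]  # k nearly equal folds
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Example: splits for the 2,139 FEdb entries
splits = list(k_fold_indices(2139, k=10))
```

Each entry appears in exactly one test fold, so every first exon is scored once by a model that was not trained on it.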


  • The program was run on the complete sequences of Chr 21 and Chr 22.