
Bioinformatics Capstone

Distribution of Introns among Full Length cDNA

By Xin Hong

Advisors: Dr. Michael Lynch and Dr. Sun Kim

Main Points
  • Motivation
  • Background
  • Data sources
  • Method
  • Results and discussion
Motivation
  • Genomic sequences
  • Full length cDNA project
  • Gene prediction programs do not include UTR regions.
  • UTR structure and function, and NMD theory.
Definition of UTRs and Introns
  • 5’UTR sequences were defined as the mRNA region spanning from the cap site to the start codon (excluded).
  • 3’UTR sequences were defined as the mRNA region spanning from the stop codon (excluded) to the poly(A) start site.
  • The coding region begins with the initiation codon, which is normally ATG. It ends with one of three termination codons: TAA, TAG or TGA.

[Figure: genomic sequence, pre-mRNA (segments labeled 1, 2, 3) and mRNA; the mRNA is divided into 5’UTR, CDS (from AUG to UAA) and 3’UTR]
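The definitions above can be sketched as a small splitting routine. This is only an illustration, not the code used in the project: the function name `split_cdna`, the 0-based half-open coordinates and the toy sequence are all assumptions, and poly(A) trimming is left out.

```python
from typing import NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


class Regions(NamedTuple):
    utr5: str
    cds: str
    utr3: str


def split_cdna(cdna: str, cds_start: int, cds_end: int) -> Regions:
    """Split a cDNA into 5'UTR, CDS and 3'UTR.

    cds_start: 0-based index of the A of the start codon (hypothetical convention).
    cds_end:   0-based index just past the stop codon (half-open interval).
    The start and stop codons belong to the CDS, so both UTRs exclude them.
    """
    cdna = cdna.upper()
    cds = cdna[cds_start:cds_end]
    if len(cds) % 3 != 0 or not cds.startswith("ATG") or cds[-3:] not in STOP_CODONS:
        raise ValueError("annotated CDS is not a complete open reading frame")
    return Regions(utr5=cdna[:cds_start], cds=cds, utr3=cdna[cds_end:])


# Toy sequence invented for illustration (not real data).
regions = split_cdna("GGCACCATGGCTTGTTAAGCTTTAAAA", 6, 18)
print(regions)
```

Rejecting CDS annotations that do not start with ATG or end with a stop codon is a design choice for the sketch; a real pipeline might instead flag such sequences for manual review.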

Function of UTRs
  • Translational control
  • mRNA subcellular localization
  • mRNA stability

Pesole, 2001

Nonsense-Mediated Decay (NMD)
  • NMD is an mRNA surveillance mechanism that leads to selective degradation of transcripts containing premature termination codons.
  • An mRNA is immune to NMD if translation terminates less than 50–55 nucleotides upstream of the 3′-most exon–exon junction (the junction left by splicing of the last intron), or anywhere downstream of it.

[Figure: the genomic sequence is transcribed to pre-mRNA (5’ to 3’) and post-transcriptionally processed to mRNA with 5’ UTR, CDS (AUG to UAA) and 3’ UTR; exon-exon junctions (EEJ) are marked, with a 50-55nt window indicated at the 3’-most EEJ and NMD]
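The 50–55 nt rule on this slide can be written as a simple predicate. This is a minimal sketch under stated assumptions: the function name, the coordinate convention, the default threshold of 55 nt (the upper end of the quoted range) and the treatment of intronless transcripts as immune are illustrative choices, not part of the presentation.

```python
from typing import Sequence


def is_nmd_immune(stop_end: int, junctions: Sequence[int], threshold: int = 55) -> bool:
    """Predict whether an mRNA escapes NMD under the 50-55 nt rule.

    stop_end:  mRNA coordinate just past the stop codon, i.e. where
               translation terminates (hypothetical 0-based convention).
    junctions: mRNA coordinates of the exon-exon junctions, each given as
               the index of the first base of the downstream exon.
    threshold: distance in nt; 55 is an assumed default.
    """
    if not junctions:
        # Assumption: without any exon-exon junction the rule cannot fire.
        return True
    last_junction = max(junctions)  # the 3'-most exon-exon junction
    distance_upstream = last_junction - stop_end
    # A negative distance means termination lies downstream of the junction.
    return distance_upstream < threshold


print(is_nmd_immune(300, [100, 200]))  # stop codon downstream of the last junction
print(is_nmd_immune(300, [100, 320]))  # 20 nt upstream of the last junction
print(is_nmd_immune(300, [100, 400]))  # 100 nt upstream of the last junction
```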
Objectives
  • To explore introns in the UTR regions.
  • To find the rules governing intron distribution within the UTR regions.
  • To compare intron distribution between the UTRs and the CDS.
  • To compare intron distribution rules among different species.
Data source
  • Full length cDNA sequences
    • MGC (Mammalian Gene Collection): mammals
    • BDGP: fruit fly
    • KOME: plant
  • Genomic sequences
    • GenBank
    • Ensembl
  • CDS prediction (Furuno et al. 2003)
    • ProCrest
    • rsCDS
    • NCBI predictor
    • DECODER
    • Experiment
Method
  • Align the cDNA sequences to the genomic sequence.
  • What about gaps, overlaps and even polymorphisms?
  • BLAST, MegaBLAST ...
  • sim4, gap2, Spidey, BLAT and GeneSeqer

Jim Kent - the Blat Rap

Steps
  • Clean the full-length cDNA and genomic sequences.
  • Split each cDNA into three parts: 5’UTR, CDS and 3’UTR.
  • Align the cDNA to the genomic sequence with BLAT.
  • Parse the BLAT result to get the locations of exons and introns.
  • Get the sequences of the exons and introns.
  • Check that the summed exon lengths equal the cDNA length, and remove suspect candidates that fail.
  • Calculate the average cDNA length, the average number of introns per cDNA, etc.
  • Compare the intron distribution of the 5’UTR, CDS and 3’UTR regions.
  • Compare the intron distribution rules among different species.
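The parsing and exon-sum check in the steps above might look roughly like the following sketch, which reads one line of BLAT's tab-separated PSL output. The function name and the example line are hypothetical. As a simplification, every gap between aligned blocks on the genome is treated as an intron, although a real pipeline would need to separate short indels from true introns.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)


def parse_psl_blocks(psl_line: str) -> Tuple[List[Interval], List[Interval], bool]:
    """Return (exons, introns, exons_cover_cdna) for one PSL alignment line."""
    fields = psl_line.rstrip("\n").split("\t")
    if len(fields) != 21:
        raise ValueError("expected 21 tab-separated PSL columns")
    q_size = int(fields[10])  # qSize: length of the cDNA (query)
    # blockSizes and tStarts are comma-separated lists with a trailing comma.
    block_sizes = [int(x) for x in fields[18].split(",") if x]
    t_starts = [int(x) for x in fields[20].split(",") if x]
    exons = [(start, start + size) for start, size in zip(t_starts, block_sizes)]
    # Simplification: each genomic gap between consecutive blocks is an intron.
    introns = [(exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1)]
    return exons, introns, sum(block_sizes) == q_size


# Hypothetical alignment: a 300 nt cDNA aligned in three blocks to "chr1".
example = "\t".join([
    "300", "0", "0", "0", "0", "0", "2", "880", "+",
    "cdna1", "300", "0", "300", "chr1", "100000", "1000", "2180",
    "3", "100,120,80,", "0,100,220,", "1000,1500,2100,",
])
exons, introns, complete = parse_psl_blocks(example)
print(exons, introns, complete)
```

Sequences for which `complete` is false would be the "suspect candidates" to set aside; note that an unaligned poly(A) tail alone would also make the check fail.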
Introns Do Exist in UTRs
  • Introns do exist in UTRs.
  • However, taking Arabidopsis as an example, 80% of 5’UTR sequences and 90% of 3’UTR sequences have no introns.
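Percentages like these can be derived from a list of per-sequence intron counts. The sketch below shows one way to do so; the function names are hypothetical and the counts are toy values invented for illustration, not data from this study.

```python
from collections import Counter
from typing import Dict, Sequence


def intron_count_histogram(counts: Sequence[int]) -> Dict[int, int]:
    """Map each intron number to how many sequences have that many introns."""
    return dict(sorted(Counter(counts).items()))


def fraction_without_introns(counts: Sequence[int]) -> float:
    """Fraction of sequences whose region contains no intron."""
    if not counts:
        raise ValueError("no sequences given")
    return sum(1 for c in counts if c == 0) / len(counts)


# Toy intron counts for ten 5'UTR sequences (invented, not real results).
toy_utr5_counts = [0, 0, 0, 0, 1, 0, 2, 0, 0, 0]
histogram = intron_count_histogram(toy_utr5_counts)
intronless = fraction_without_introns(toy_utr5_counts)
print(histogram, intronless)
```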
Introns in CDS
  • 80% of CDS sequences have introns.
Intron number: UTRs vs. CDS
  • Most CDS sequences have introns, but most UTR sequences do not.

[Figure: axes labeled "Number of sequences" and "Number of introns"]

Introns in UTR

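  • Introns occur throughout the 5’UTR and 3’UTR, but they are not evenly distributed.
  • If n introns were evenly distributed, the expected relative location of the k-th intron would be k/(n+1), i.e. a spacing of 1/(n+1) between neighbors.

[Figure: axes labeled "Intron number" and "Number of introns"]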
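The even-distribution baseline can be made concrete by reading the slide's 1/(number of introns + 1) as the spacing between evenly placed introns, so that the k-th of n introns is expected at relative position k/(n+1). The sketch below compares that expectation with observed relative positions; the function names and the example coordinates are assumptions for illustration only.

```python
from typing import List, Sequence


def expected_positions(n_introns: int) -> List[float]:
    """Relative positions (0 to 1) of n evenly spaced introns: k/(n+1)."""
    return [k / (n_introns + 1) for k in range(1, n_introns + 1)]


def relative_positions(intron_sites: Sequence[int], region_length: int) -> List[float]:
    """Observed intron sites (nt offsets within the region) scaled to 0..1."""
    return [site / region_length for site in sorted(intron_sites)]


expected = expected_positions(3)
observed = relative_positions([100, 50], 200)  # hypothetical sites in a 200 nt UTR
print(expected, observed)
```

Comparing `observed` with `expected_positions(len(observed))` across many sequences would show whether introns cluster toward one end or the center of a region.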


Introns in UTR

  • The number of introns increases as the sequence length increases.
  • For human 5’UTRs, there is on average one intron per 100nt.
  • Introns of the 3’UTR tend to concentrate toward the center of the 3’UTR.

[Figures: axes labeled "Location of introns", "Length of sequences" and "Number of introns"]


Introns in CDS

  • Introns occur throughout the CDS.
  • For human, when there is more than one intron, the interval between two introns is about 140nt. (In other words, the average exon in the CDS is 140nt.)
  • Introns are shifted toward the 5’ end.

Intron distribution: UTRs vs. CDS

Human as example:

  • The frequency of introns in the 5’UTR is higher than in the CDS.
  • The frequency of introns in the CDS is higher than in the 3’UTR.

[Figures: axis labeled "Number of introns"]


Different species: UTRs vs. CDS

  • The number of introns increases with sequence length in both UTRs and CDS.
  • 5’UTR sequences shorter than 100nt don’t have introns in human, mouse, rat, Arabidopsis and fruit fly.
  • CDS sequences shorter than 800nt don’t have introns in human, mouse, Arabidopsis and fruit fly. For rat this boundary is 500nt.
  • The fruit fly sequence length increases faster than that of the other species in both UTRs and CDS.

[Figures: axis labeled "Number of introns"]

Different species: UTRs vs. CDS
  • For all five species, most UTRs don’t have introns.
  • For all five species, most CDS have introns.
  • The intron distribution rule holds for human, mouse, rat, Arabidopsis and fruit fly.

[Figures: axes labeled "Number of sequences" and "Number of introns"]

Summary
  • Introns do exist in UTRs.
  • The intron distributions in the 5’UTR, CDS and 3’UTR differ within the same organism.
  • The intron distribution rules are shared by human, mouse, rat, Arabidopsis and fruit fly.
    • 5’UTR sequences shorter than 100nt don’t have introns in human, mouse, rat, Arabidopsis and fruit fly.
    • CDS sequences shorter than 800nt don’t have introns in human, mouse, Arabidopsis and fruit fly; for rat the boundary is 500nt.
    • The fruit fly fl-cDNA sequence length increases faster than that of the other species in both UTRs and CDS.
Future work
  • NMD widely exists among different species.
  • The reason why most UTRs don’t have introns.
  • The reason why intron frequency decreases from 5’ to 3’ along the full-length cDNA.
References

Lynch M, Kewalramani A. (2003) Messenger RNA surveillance and the evolutionary proliferation of introns. Mol Biol Evol. 20(4):563-571.

Mignone F, Gissi C, Liuni S, Pesole G. (2002) Untranslated regions of mRNAs. Genome Biol. 3(3):reviews0004.1-0004.10.

Pesole G, Grillo G, Larizza A, Liuni S. (2000) The untranslated regions of eukaryotic mRNAs: structure, function, evolution and bioinformatics tools for their analysis. Brief Bioinform. 1(3):236-249.

Kent WJ. (2002) BLAT - the BLAST-like alignment tool. Genome Res. 12(4):656-664.

Furuno M, Kasukawa T, Saito R, Adachi J, Suzuki H, Baldarelli R, Hayashizaki Y, Okazaki Y. (2003) CDS annotation in full-length cDNA sequence. Genome Res. 13(6B):1478-1487.

Strausberg RL et al. (2002) Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proc Natl Acad Sci U S A. 99(26):16899-16903.

http://www.ncbi.nlm.nih.gov

Acknowledgement

Dr. Michael Lynch

Dr. Sun Kim

Dr. Douglas G. Scofield