
Alternative Splicing Hedi Hegyi, PhD @ Institute of Enzymology, Budapest http://www.enzim.hu/~hegyi/ Szeged University, Biochemistry Course Oct 31, 2007



Presentation Transcript


Slide1 l.jpg

Alternative Splicing

Hedi Hegyi, PhD @ Institute of Enzymology, Budapest

http://www.enzim.hu/~hegyi/

Szeged University, Biochemistry Course, Oct 31, 2007


Slide2 l.jpg

Scientific American 2005/04


Slide3 l.jpg

Scientific American 2005/04

  • Spring of 2000. Molecular biologists placing dollar bets: how many genes in human genome?

  • 90,000? 153,000? (C. elegans: 19,500; maize: 40,000)

  • 35,000? 30,000?

  • A paltry 25,000!


Slide4 l.jpg

C-value paradox: Complexity does not correlate with genome size. (C.A. Thomas, Jr, 1971)

Amoeba dubia: 6.7 × 10^11 bp

Homo sapiens: 3.4 × 10^9 bp
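To put the two genome sizes on one scale, the following minimal sketch computes how many times larger the Amoeba dubia genome is. Only the two base-pair figures come from the slide; the dictionary and function names are illustrative assumptions.

```python
# Genome sizes quoted on the slide, in base pairs.
genome_size_bp = {
    "Amoeba dubia": 6.7e11,
    "Homo sapiens": 3.4e9,
}

def fold_difference(larger: str, smaller: str) -> float:
    """Return how many times larger one genome is than the other."""
    return genome_size_bp[larger] / genome_size_bp[smaller]

ratio = fold_difference("Amoeba dubia", "Homo sapiens")
print(f"Amoeba dubia genome is about {ratio:.0f} times the human genome")
```

The ratio comes out at roughly two hundred, which is the point of the C-value paradox: genome size alone says little about organismal complexity.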


Slide5 l.jpg

N-value paradox: Complexity does not correlate with gene number.

~25,000 genes

~26,000 genes

~50,000 genes


Discovery of alternative splicing l.jpg

Discovery of Alternative Splicing

  • First predicted by Walter Gilbert in 1978

  • First discovered for an immunoglobulin heavy chain gene in 1980 (Edmund Choi, Michael Kuehl & Randolph Wall, Nature 286, 776-779)

  • Alternative splicing gives two forms of the protein with different C-termini:

    • One form is shorter and secreted

    • The other stays anchored in the plasma membrane via its C-terminus


Slide7 l.jpg

Alternative splicing of the mouse immunoglobulin μ heavy chain gene

S: signal peptide; V: variable region; C: constant region

Red: untranslated region; Green: membrane anchor; Yellow: end of coding region for the secreted form


Splicing the spliceosome l.jpg

Splicing & the spliceosome

Structure

  • 60S dynamic structure – a large complex consisting of ~ 150 proteins

  • Five small nuclear RNAs (U1, U2, U4, U5 & U6)

  • RNAs assemble with proteins to form snRNPs (“snurps”)

  • Protein splicing factors

  • Assembly of the spliceosome requires ATP

Splicing defects

  • An estimated 15% of all genetic diseases are associated with mutated splice sites

Figure legend: green globule, RNA polymerase; yellow globule, spliceosome


Snrnas l.jpg

snRNAs


Slide10 l.jpg

Secondary structure of snRNAs

[Figure: secondary structures of the U1, U2, U4, U5 and U6 snRNAs]

Orange: interaction with 5’ splice site

Green: interaction with branch site

Blue: interaction between U2 and U6

Tan: Sm-binding site (PuAU4–6GPu, where U4–6 is a run of 4 to 6 uridines), flanked by two stem-loop structures


U1 snrna l.jpg

U1 snRNA

  • Contains conserved sequence complementary to 5’ splice site of nuclear mRNA introns

  • Contains pseudouridine (Ψ)

Base pairing between the 5’ splice site and U1 snRNA:

    5’-[upstream exon]-GUAAGU-------3’   (pre-mRNA, 5’ splice site)
                       ::::::
                  3’---CAUUCA---cap-5’   (U1 snRNA)
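The pairing between the 5’ splice site and U1 snRNA can be checked position by position. This is a minimal sketch: the function name is an assumption made for this example, and only the two six-base sequences come from the slide.

```python
# Watson-Crick partners for RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def paired_positions(strand_5to3: str, partner_3to5: str) -> list[bool]:
    """For two antiparallel strands written so that position i faces position i,
    report whether each facing pair is a Watson-Crick pair."""
    return [PAIR[a] == b for a, b in zip(strand_5to3, partner_3to5)]

splice_site = "GUAAGU"  # start of the intron, written 5'->3' (from the slide)
u1_segment = "CAUUCA"   # U1 snRNA segment, written 3'->5' (from the slide)

pairs = paired_positions(splice_site, u1_segment)
print(pairs)
```

Every position pairs, which is what lets U1 recognize the 5’ splice site by base complementarity. Writing the U1 segment 3’ to 5’ avoids reversing one strand before comparing.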


Splice site recognition l.jpg

Splice-site recognition

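Consensus elements (| marks an exon-intron boundary):

    upstream exon ---AG|GUAAGU-----------A--------(Py)nNCAG|G downstream exon

5’ splice site: AG|GUAAGU; branch site: A; 3’ splice site: (Py)nNCAG|G

Branch site to 3’ splice site: ~20–50 nt

Branch site in yeast: often 5’-UACUAAC-3’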
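The three recognition elements can be located in a sequence with simple string searches. The sketch below uses a made-up toy pre-mRNA (not a real gene), with the intron in lowercase, and assumes that the branch-point adenosine is the sixth base of the yeast UACUAAC motif.

```python
import re

# Hypothetical toy pre-mRNA: uppercase = exons, lowercase = intron.
pre_mrna = "CAG" + "guaagu" + "c" * 30 + "uacuaac" + "u" * 25 + "cag" + "GUC"

# Extract the intron (the first lowercase run) and normalize its case.
intron = re.search(r"[a-z]+", pre_mrna).group().upper()

starts_with_gu = intron.startswith("GU")  # 5' splice site dinucleotide
ends_with_ag = intron.endswith("AG")      # 3' splice site dinucleotide

branch = re.search("UACUAAC", intron)
# Assumption: the branch-point A is the sixth base of UACUAAC (offset 5).
branch_a_index = branch.start() + 5
# Number of bases separating the branch-point A from the last intron base.
distance_to_3ss = len(intron) - 1 - branch_a_index

print(starts_with_gu, ends_with_ag, distance_to_3ss)
```

In this toy intron the branch point sits 29 nt upstream of the 3’ splice site. Real splice-site prediction uses position weight matrices rather than exact matches, so this shows only the layout of the elements.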


Splice site conservation l.jpg

Splice Site Conservation

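[Figure: sequence conservation at the splice junctions of an exon (E) / intron (I) array drawn 5’ to 3’, showing the donor (5’) splice site and the acceptor (3’) splice site]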


Splice site conservation14 l.jpg

Splice Site Conservation

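[Figure: sequence conservation at the splice junctions of an exon (E) / intron (I) array drawn 5’ to 3’, showing the donor (5’) splice site and the acceptor (3’) splice site]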


Splicing mechanism l.jpg

Splicing mechanism

[Figure: pre-mRNA (Exon 1, intron, Exon 2) with the 5’ splice site (GU), branch site (A) and 3’ splice site (AG); U1 and U2 snRNPs on the pre-mRNA; U4, U5 and U6 snRNPs; ATP]


Slide16 l.jpg

Factors Playing a Role in Exon Recognition

1. Evolution appears to have weakened splice sites

S. cerevisiae: derived from 253 introns; only 3% of genes contain introns. No alternative splicing.

S. pombe: derived from 4,697 S. pombe genes; approximately 43% of all genes contain introns. Intron retention.

Derived from 49,778; nearly 100% contain introns. 75% alternative splicing.

- GTCCATTCA - 5' U1


Slide17 l.jpg

Factors Playing a Role in Exon Recognition

Exon Recognition is complex

Exon Definition

Intron Definition

Complexity means multiple points of possible regulation and that exons could be skipped by failing to get all the pieces in place

The ability to form or disrupt these interactions is thought to play a key role in alternative splicing!!!

Nature, Vol. 418, p. 236, 2002


Slide18 l.jpg

Intron statistics

Species     Average    Average     Average       Average    % exon
            exon No.   intron No.  length (kb)   kb mRNA    per gene
Yeast       1          0           1.6           1.6        100
Nematode    4          3           4.0           3.0        75
Fruit fly   4          3           11.3          2.7        24
Chicken     9          8           13.9          2.4        17
Mammals     7          6           16.6          2.2        13

Human genes              Median    Mean
Size of internal exons   122 bp    145 bp
Number of exons          7         8.8
Size of introns          1023 bp   3356 bp
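The "% exon per gene" column appears to be the average mRNA length divided by the average gene length. That reading is an assumption about the table, not something stated on the slide, but a short sketch with the slide's own numbers reproduces every value in the column.

```python
# (average gene length in kb, average mRNA length in kb), as listed on the slide.
gene_stats = {
    "Yeast": (1.6, 1.6),
    "Nematode": (4.0, 3.0),
    "Fruit fly": (11.3, 2.7),
    "Chicken": (13.9, 2.4),
    "Mammals": (16.6, 2.2),
}

def percent_exon(gene_kb: float, mrna_kb: float) -> int:
    """Share of the gene that ends up in mature mRNA, as a whole percent."""
    return round(100 * mrna_kb / gene_kb)

exon_share = {species: percent_exon(*lengths) for species, lengths in gene_stats.items()}
for species, share in exon_share.items():
    print(f"{species}: {share}% exon")
```

The trend is the one the slide emphasizes: genes get longer from yeast to mammals while mRNA length stays roughly constant, so the added length is mostly intron.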


Slide19 l.jpg

Recognizing Exons - Are Splice Sites Enough?

Most mammalian genes contain more than one intron

Extreme Examples:

Collagen Gene - 50 exons and a 40 kb precursor RNA

DMD Gene - 79 exons, 2.3 Mb precursor, intron 20 is 180 kb

Neurexin - has a 0.5 Mb intron!

Most genes are uninterrupted in yeast, but most genes are interrupted in flies and mammals


Slide20 l.jpg

Support for the Exon Definition Model

Exon size is conserved in vertebrates

Intron size is not

Recognition requires a consistent target size


Slide21 l.jpg

Additional Cis Elements and Trans-Acting Factors

- Splicing has Enhancers and Silencers

- They function to modulate spliceosome formation

Recruiting SR proteins can stabilize E complex formation

Blocking snRNP or protein interactions can prevent E complex formation

ESE - Exonic Splicing Enhancer

ESS - Exonic Splicing Silencer

ISE - Intronic Splicing Enhancer

ISS - Intronic Splicing Silencer


Slide22 l.jpg

Models of Exon Recognition - Final Points

Trans-Factor Interaction with Exon differs from Interaction with Introns


Slide23 l.jpg

Types of Alternative Splicing

[Figure: relative frequencies of the types of alternative splicing: 38%, 18%, 8%, 3%; remaining 33%]

Cell 126:37 (2006)

Nature Reviews Genetics 5:773 (2004)


How prevalent is alternative splicing l.jpg

How Prevalent is Alternative Splicing?

No one really knows for sure.

EST database estimates: between 35% and 60% of protein-coding genes have alternative mRNAs

Caveat: these databases contain sequences derived from aberrant as well as alternative splicing, they are typically biased toward the 3' and 5' ends, and they hold too few sequences to infer frequency

Therefore, database mining may overestimate the rate of alternative splicing


Slide25 l.jpg

74%

Array-Based Numbers

Science 302, 2141-44 (2003)


Slide26 l.jpg

Genome-Wide Survey of Human Alternative Pre-mRNA Splicing with Exon Junction Microarrays (Science, 2003)

10,000 multi-exon human genes in 52 tissues

Conclusion: 74% of multi-exon human genes are alternatively spliced


Slide27 l.jpg

Number of Splicing Isoforms per Gene by EST Comparison

3.8

Harrington et al. Nature Genetics 36:916 (2004)


Regulation of by alternative splicing l.jpg

Regulation of/by Alternative Splicing

  • Sex determination in Drosophila involves 3 regulatory genes that are differentially spliced in females versus males; 2 of them affect alternative splicing

    1. Sxl (Sex-lethal) - promotes alternative splicing of tra pre-mRNA (exon 2 is skipped) and of its own pre-mRNA (exon 3 is skipped)

    2. Tra (transformer) - promotes alternative splicing of dsx (last 2 exons are excluded)

    3. Dsx (doublesex) - alternatively spliced form of dsx needed to maintain female state

Fig. 14.38


Slide29 l.jpg

Alternative splicing in Drosophila maintains the female state

Alternative splicing

Sxl and Tra are SR proteins

Tra binds exon 4 in dsx mRNA causing it to be retained in mature mRNA.


Slide30 l.jpg

Known Roles of Alternative Splicing (Stamm et al., Gene 344:1-20, 2005)

  • Introduction of stop codons

    • 25-35% of alternative splicing events introduce stop codons that either produce truncated proteins or regulate mRNA stability through the nonsense-mediated decay (NMD) pathway


Nonsense mediated decay l.jpg

Nonsense Mediated Decay

  • A surveillance mechanism that selectively degrades nonsense mRNAs

  • Regulates gene expression by alternative splicing

  • Transcripts containing a PTC (premature termination codon) are degraded rapidly

  • One third of alternative transcripts contain premature termination codons

Brenner, SE et al, PNAS January 7, 2003 vol. 100 no. 1 189–192
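How a premature termination codon might be flagged computationally can be sketched as follows. The slides do not give a rule; this example assumes the commonly cited rule of thumb that a stop codon lying more than about 50 nt upstream of the last exon-exon junction is treated as premature. The function names, the threshold parameter and the toy transcripts are all hypothetical.

```python
from typing import Optional

STOP_CODONS = {"UAA", "UAG", "UGA"}

def first_stop_index(mrna: str) -> Optional[int]:
    """Index of the first in-frame stop codon, reading from position 0."""
    for i in range(0, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None

def has_ptc(mrna: str, last_junction: int, min_distance: int = 50) -> bool:
    """Flag a transcript whose first in-frame stop codon starts more than
    `min_distance` nt upstream of the last exon-exon junction (assumed rule)."""
    stop = first_stop_index(mrna)
    if stop is None:
        return False
    return last_junction - stop > min_distance

# Toy transcripts whose last exon-exon junction is at position 100.
normal_isoform = "AUG" + "GCU" * 40 + "UAA" + "GCU" * 5  # stop lies in the last exon
ptc_isoform = "AUG" + "GCU" * 5 + "UGA" + "GCU" * 40     # early stop from an alternative exon

print(has_ptc(normal_isoform, last_junction=100))
print(has_ptc(ptc_isoform, last_junction=100))
```

Only the second transcript is flagged. A real analysis would take the reading frame and junction positions from annotated transcript models rather than from fixed offsets.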


Slide32 l.jpg

Known Roles of Alternative Splicing (Stamm et al., Gene 344:1-20, 2005)

  • Add new protein parts

    • 75% of alternative splicing involves the protein-coding region; in addition to truncations, it can change the overall protein sequence


Slide33 l.jpg

Known Roles of Alternative Splicing (Stamm et al., Gene 344:1-20, 2005)

  • Consequences of new protein parts:

    • Alter protein binding properties, e.g. receptor/ligand

    • Alter intracellular localization, e.g. membrane insertion

    • Alter extracellular localization, e.g. secretion

    • Alter enzymatic or signaling activities, e.g. TK truncations

    • Alter protein stability, e.g. inclusion of cleavage sites

    • Insertion of post-translational modification domains

    • Change ion channel properties, e.g. slo


Slide34 l.jpg

Known Roles of Alternative Splicing (Stamm et al., Gene 344:1-20, 2005)

  • Coordinated Regulation of Biological Events

  • Potassium channel activity associated with hearing (slo)

  • Muscle contraction

  • Neurite (axon or dendrite) growth

  • Cell differentiation

  • Apoptosis

  • Neuron development (Dscam) (TIBS 31:581-588, 2006)


Slide35 l.jpg

The Power of Alternative RNA Splicing

Drosophila DSCAM gene codes for an axon guidance receptor

The final mRNA chooses 24 exons from 115 possibilities

(20 constitutive exons and 4 alternatively spliced ones)

Protein regions labelled in the figure: Ig loop 3, Ig loop 4, Ig loop 7, transmembrane domain

Exon 4: 12 alternatives

Exon 6: 48 alternatives

Exon 9: 33 alternatives

Exon 17: 2 alternatives

12 × 48 × 33 × 2 = 38,016 possible mRNAs

The genome has only 14,800 genes!
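The DSCAM count is a product over independent choices, one alternative per variable exon. A minimal sketch of that arithmetic follows; the variable names are illustrative and the numbers are those on the slide.

```python
from math import prod

# Alternatives available at each variable exon position (from the slide).
dscam_alternatives = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

# One alternative is chosen per position, so the counts multiply.
possible_mrnas = prod(dscam_alternatives.values())

genes_in_genome = 14_800  # gene count quoted on the slide
isoforms_per_gene_count = possible_mrnas / genes_in_genome

print(possible_mrnas)
print(f"{isoforms_per_gene_count:.1f} times the number of genes in the genome")
```

The product assumes the four choices are made independently, which is what the slide's multiplication implies.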


Evolutionary overview of alternative splicing l.jpg

Evolutionary Overview of Alternative Splicing

  • Introns unlikely to have been derived from ancient genes

  • Multi-intron genes probably predated alternative splicing

  • Most eukaryotes have introns, but alternative splicing is prevalent only in multicellular organisms

  • S. cerevisiae has only 253 introns (3% of its genes), and only 6 genes have 2 introns

  • S. pombe: 43% of its genes have introns (usually 40-75 nt)

  • S. cerevisiae and S. pombe have NO alternative splicing


Slide37 l.jpg

Finding Alternatively Spliced Exons

  • Compare cDNA & genomic DNA sequences

  • Compare ESTs & genomic DNA sequences

  • Compare protein & genomic sequences
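Once expressed sequences have been aligned to the genome, the comparison reduces to comparing exon coordinates between transcripts of the same gene. The sketch below is a simplified illustration of detecting a skipped (cassette) exon. The coordinates are invented, and real pipelines must also handle alternative splice sites, partial ESTs and alignment errors.

```python
Exon = tuple[int, int]  # (start, end) genomic coordinates; hypothetical values below

def cassette_exons(transcript_a: list[Exon], transcript_b: list[Exon]) -> list[Exon]:
    """Exons of transcript_a that are missing from transcript_b although
    both of their neighbours in transcript_a are present in transcript_b."""
    in_b = set(transcript_b)
    skipped = []
    for prev, exon, nxt in zip(transcript_a, transcript_a[1:], transcript_a[2:]):
        if exon not in in_b and prev in in_b and nxt in in_b:
            skipped.append(exon)
    return skipped

# Exon coordinates of two aligned cDNAs from the same (made-up) gene.
long_isoform = [(100, 200), (300, 380), (500, 650), (800, 900)]
short_isoform = [(100, 200), (300, 380), (800, 900)]

print(cassette_exons(long_isoform, short_isoform))
```

Requiring both flanking exons to be shared keeps the function from reporting exons that are merely outside the range covered by a short EST.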


Large scale multiple alignment of expressed sequences l.jpg

Large-scale multiple alignment of expressed sequences

  • Databases:

    • tens of thousands of mRNAs

    • millions of ESTs

  • From large-scale alignments: 60-80% of all human genes undergo alternative splicing.


Alu elements l.jpg

    Alu elements

    • Length = ~300 bp

    • Repetitive: > 1,400,000 times in the human genome

    • Constitute >10% of the human genome

    • Found mostly in intergenic regions and introns

    • Propagate in the genome through retroposition (RNA intermediates).


    Alu elements can be divided into subfamilies l.jpg

    Alu elements can be divided into subfamilies

    The subfamilies are distinguished by ~16 diagnostic positions.


    Alu containing exons l.jpg

    Alu-containing exons

    • Out of 1,182 alternatively spliced cassette exons, 62 have a significant hit to an Alu sequence.

    • Out of 4,151 constitutively spliced exons, none has a significant hit to an Alu sequence.

    Conclusion: all Alu-containing exons in this set are alternatively spliced.

    Graur et al., Genome Res. (2002)
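The contrast between the two exon classes can be made explicit by turning the counts into rates. This is a minimal sketch using only the four numbers on the slide; the variable names are illustrative, and no statistical test is implied.

```python
from fractions import Fraction

# (exons with a significant Alu hit, exons examined), as quoted on the slide.
alternative = (62, 1182)
constitutive = (0, 4151)

alt_rate = Fraction(*alternative)
const_rate = Fraction(*constitutive)

# Alu hits expected among constitutive exons if they had the alternative-exon rate.
expected_constitutive_hits = float(alt_rate * constitutive[1])

print(f"alternative exons with Alu hit: {float(alt_rate):.1%}")
print(f"constitutive exons with Alu hit: {float(const_rate):.1%}")
print(f"expected constitutive hits at the same rate: {expected_constitutive_hits:.0f}")
```

About 5% of the alternative cassette exons carry an Alu sequence. At that rate one would expect a couple of hundred hits among the constitutive exons, yet none was observed.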


    The minus strand of alu elements contains near splice sites l.jpg

    The minus strand of Alu elements contains “near” splice sites

    • The minus strand of Alu contains ~3 sites that resemble the acceptor recognition site:

      Consensus acceptor site: YYYYYYNCAG/R

      Alu-J (127-114): TTTTTTGtAG/A

    • The minus strand of Alu contains ~9 sites that resemble the consensus donor site:

      Consensus donor site: CAG/GTRAGT

      Alu-J (25-17): CAG/GTGtGA
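How "near" these Alu sites are to the consensus can be measured by counting positions that the consensus does not allow. The sketch below uses the standard IUPAC codes (R = purine, Y = pyrimidine, N = any base); the function name is an assumption for this example, and the sequences are those on the slide.

```python
# IUPAC nucleotide codes used in the consensus sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}

def mismatches(consensus: str, site: str) -> list[int]:
    """0-based positions where `site` is not allowed by `consensus`.
    The '/' marking the exon-intron boundary is ignored and case is folded."""
    consensus = consensus.replace("/", "")
    site = site.replace("/", "").upper()
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    return [i for i, (c, s) in enumerate(zip(consensus, site)) if s not in IUPAC[c]]

acceptor_mismatches = mismatches("YYYYYYNCAG/R", "TTTTTTGtAG/A")
donor_mismatches = mismatches("CAG/GTRAGT", "CAG/GTGtGA")

print(acceptor_mismatches)
print(donor_mismatches)
```

A site with few mismatches needs only a point mutation or two to become a functional splice site, which is the premise behind Alu exonization.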


    Slide43 l.jpg

    Alu exonization

    The AGs selected in the 3’ splice sites (3’SSs) of Alu-derived exons are underlined

    3 genetic diseases:

    1. COL4A3 - Alport syndrome

    2. GUSB - Sly syndrome

    3. OAT - OAT deficiency

    Lev-Maor G, et al, Science, 2003.


    Slide44 l.jpg

    Factors Playing a Role in Exon Recognition

    - Sequence features of alternatively regulated exons are different from constitutive exons.

    - These features are conserved between species.

    Sorek, Genome Res 14:1617 (2004)


    Slide45 l.jpg

    Large-scale identification of alternative splice variants of human gene transcripts using 56,419 cDNAs

    Distribution of the length difference between the alternative splicing variants

    Takeda, J.-i. et al. Nucl. Acids Res. 2006 34:3917-3928; doi:10.1093/nar/gkl507


    Large scale identification of human alternative splic e variants l.jpg

    Large-scale identification of human alternative splice variants

    (A)‘motif-changed’

    (B)‘subcellular localization-changed’

    (C)‘transmembrane domain-changed’


    Slide47 l.jpg

    Alternative splicing databases (1,560,000 hits in google)

    • Alternative Splicing & Transcript Diversity database (ASTD): http://www.ebi.ac.uk/astd/

    • SpliceMiner (querying EVDB, the Evidence Viewer Database): http://discover.nci.nih.gov/spliceminer/

    • Hollywood: http://hollywood.mit.edu

    • Human Alternative Splicing Database (HASDB): http://www.bioinformatics.ucla.edu/~splice/HASDB/

    • Putative Alternative Splicing Database (PALS db): http://palsdb.ym.edu.tw/


    Slide48 l.jpg

    Structure of ASTD

    Databases are integrated, cross-linked and available through a variety of interface tools

    ASTD data are integrated with Ensembl genome annotation

    Stamm, S. et al. Nucl. Acids Res. 2006 34:D46-D55


    Spliceminer ncbi l.jpg

    Spliceminer (NCBI)

    Querying EVDB (Evidence Viewer DB). Composite of five separate interactive queries. Each query corresponds to a different Affymetrix HG-U133A probe. The composite permits facile comparison of the exons that are targeted by each of the probes. For example, the probes for exons 16 and 18 uniquely identify the splice variants NM_006487 and NM_006485, respectively.


    Slide50 l.jpg

    Evolutionarily conserved and diverged alternative splicing events show different expression and functional profiles (Kan, NAR, 2005)

    Kan, Z. et al. Nucl. Acids Res. 2005 33:5659-5666; doi:10.1093/nar/gki834


    Slide51 l.jpg

    Evolutionarily conserved and diverged alternative splicing events show different expression and functional profiles (Kan et al, NAR, 2005)

    Alternative splicing events in 10818 pairs of human and mouse genes

    43% (8921) of mouse alternative splices could be found in the human genome but not in human transcripts

    Only 7% of human alternative splices are conserved in mouse transcripts

    5 of 11 tested mouse predictions were observed in human tissues

    Diverged alternative splicing is more prevalent in cancerous cell-lines


    Slide52 l.jpg

    Evolutionarily conserved and diverged alternative splicing events show different expression and functional profiles (Kan et al, NAR, 2005)


    Slide53 l.jpg

    Microarray expression of alternatively spliced human-mouse pairs (ASP) of genes in different tissues (Kan et al, 2005)

    • level of conserved alternative splicing most elevated in brain

    • diverged alternative splicing is the most enriched in testis


    Functional transcripts for the brain periphery and brain and receptors l.jpg

    Corticotropin-releasing hormone receptor 2 (CRHR2) alternative splices

    Functional transcripts for the α, β (brain, periphery) and γ (brain) receptors

    [Figure: the α, β and γ receptor isoforms, with the extracellular domain and cytoplasmic domain labelled]

    Catalano et al, Molecular Endocrinology. First published December 18, 2002 as doi:10.1210/me.2002-0302


    Http www ensembl org l.jpg

    http://www.ensembl.org


    The implications of alternative splicing in the encode protein complement cont d l.jpg

    The implications of alternative splicing in the ENCODE protein complement (cont’d)

    Fig. 2. The potential effect of splicing on protein structure. Four splice isoforms mapped onto the nearest structural templates. Structures are colored in purple where the sequence of the splice isoform is missing. (a) Hemoglobin, (b) SET domain-containing protein 3, (c) Mitochondrial cysteine desulfurase, (d) Eukaryotic initiation factor 6.


    How many genes and transcripts in human genome l.jpg

    How many genes and transcripts in human genome?

    • Ensembl NCBI 35 release (Dec, 2005): 33,869 transcripts derived from 22,218 genes

    • Ensembl NCBI 36 release (May, 2006): 48,851 transcripts derived from 23,710 genes
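The two releases are easier to compare as transcripts per gene. A minimal sketch with the figures quoted above; the dictionary layout is an illustrative choice.

```python
# (transcripts, genes) per Ensembl release, as quoted on the slide.
releases = {
    "NCBI 35 (Dec 2005)": (33_869, 22_218),
    "NCBI 36 (May 2006)": (48_851, 23_710),
}

transcripts_per_gene = {
    name: transcripts / genes for name, (transcripts, genes) in releases.items()
}

for name, value in transcripts_per_gene.items():
    print(f"{name}: {value:.2f} transcripts per gene")
```

The gene count rose by under 7% between the releases while the transcript count rose by about 44%, so nearly all of the growth reflects newly annotated alternative transcripts.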


    How many genes and transcripts in human genome cont d l.jpg

    How many genes and transcripts in human genome? Cont’d

    • Major source of uncertainty: nucleic acid-based identification of putative proteins in humans and other organisms.

    • Most proteins have not been seen as proteins per se, but their existence has been inferred from genomic DNA, cDNA and ESTs.


    Surveillance systems l.jpg

    Surveillance systems

    • NMD (nonsense-mediated decay) controls alternative splices that cause a premature stop codon

    • ERAD (Endoplasmic Reticulum Associated Degradation) eliminates AS (alternative splicing) isoforms with no stable 3D structure


    Computer modeled surveillance l.jpg

    Computer-modeled surveillance?

    • Selecting predicted proteins with domains of abnormal length

      • Study Pfam-A domains derived from Swissprot proteins

      • 8129 Pfam-A domain families currently in Pfam

      • How many Pfam-A families can be used for such a surveillance?


    20 most irregular pfams in human swissprot proteins l.jpg

    20 most irregular Pfams in human Swissprot proteins


    Nested and overlapping domains l.jpg

    Nested and overlapping domains


    Nested and truncated domains l.jpg

    Nested and truncated domains

    Existing proteins (PFAM models)

    - Matrix metalloproteinase 11

    - Matrix metalloproteinase 9

    Putative proteins (PFAM models)

    - Glycosyl hydrolase domain

    - Peptidoglycan-binding domain+DUF


    Computer modeled surveillance cont d l.jpg

    Computer-modeled surveillance? Cont'd

    • Select those predicted proteins with domains of abnormal length (>=40% of domain missing). Use for filtering:

      • 8129 Pfam-A domain families currently in Pfam

      • 2752 Pfam-A domain families in human Swissprot proteins

      • How many of these 2752 Pfam-A families are regular enough to be used for such a surveillance? Answer: 2529
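The length filter can be sketched as a coverage check on each domain match. This is only an illustration of the ">=40% of domain missing" criterion; the `DomainHit` record, its field names and the example hits are hypothetical and do not reflect the Pfam data format or the author's implementation.

```python
from dataclasses import dataclass

@dataclass
class DomainHit:
    """One domain match on a predicted protein (all fields hypothetical)."""
    family: str
    model_length: int    # length of the domain model, in residues
    matched_length: int  # residues of the model covered by the match

def is_abnormal(hit: DomainHit, max_missing: float = 0.40) -> bool:
    """True when at least `max_missing` of the domain is absent from the match."""
    missing_residues = hit.model_length - hit.matched_length
    return missing_residues >= max_missing * hit.model_length

hits = [
    DomainHit("family_A", model_length=200, matched_length=190),  # 5% missing
    DomainHit("family_B", model_length=150, matched_length=80),   # ~47% missing
]
flagged = [hit.family for hit in hits if is_abnormal(hit)]
print(flagged)

# Share of the human Swissprot Pfam-A families that the slide counts as regular.
regular_share = 2529 / 2752
print(f"{regular_share:.1%} of families are regular enough to use")
```

Restricting the filter to families with a regular length is what keeps naturally variable domains from being flagged as truncated.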


    Ensembl proteome coverage with regular pfam a domains l.jpg

    Ensembl Proteome Coverage with regular Pfam-A domains*

    * Belonging to one of the 2529 regular Pfam-A families


    Structural studies of as l.jpg

    Structural Studies of AS

    Disordered AS regions

    Structured AS regions

    Pyrophosphorylase

    RAC1

    Tumor necrosis factor

    Sulphotransferase

    Glutathione S-transferase


    Parallel paradigms l.jpg

    Parallel Paradigms

    Catalysis

    AA seq → 3-D Structure → Function

    Signaling

    AA seq → Disordered Ensemble → Function


    Alternative splicing and intrinsic disorder l.jpg

    Alternative Splicing and Intrinsic Disorder

    • Find proteins with both ordered and disordered regions.

    • Find mRNA alternative splicing information for these proteins and map it to the ordered and disordered regions.

    • Do alternatively spliced regions of mRNA code more often for ordered protein or for disordered protein?

    Romero PR et al. Proc Natl Acad Sci U S A. 2006 May 30;103(22):8390-5


    Studying the relationship intrinsic disorder as l.jpg

    Studying the Relationship: Intrinsic Disorder and AS

    Sources: ASG (AS Gallery); SwissProt (VarSplic); DisProt, a database of proteins with experimentally determined structure and disorder (www.disprot.org)

    ASED dataset (Alternative Splicing & Experimental Disorder):

    • 46 proteins

    • 74 characterized AS regions

    • >19,000 characterized residues, 35% ID

    Romero PR et al. Proc Natl Acad Sci U S A. 2006 May 30;103(22):8390-5


    Results on ased l.jpg

    Results on ASED

    Distribution of structurally characterized AS regions

    Romero PR et al. Proc Natl Acad Sci U S A. 2006 May 30;103(22):8390-5


    Take home message l.jpg

    Take-home Message

    • Alternative splicing evolved simultaneously with multicellular organisms

    • It increases the functional diversity and complexity of an organism

    • Not all of alternative splicing is functional (e.g. Alu-exonization)

    • It is a factor in genetic diseases (cancer, etc.)

    • It is strongly associated with protein disorder


    Slide75 l.jpg

    The End

    …to get updates on this lecture and other related, updated info, go to http://www.enzim.hu/~hegyi/

