[VI]. Post-Transcriptional Processing and Post-Transcriptional Control of Gene Expression • Processing of eukaryotic pre-mRNA • About 60% of human genes give rise to spliced mRNAs • Eukaryotic cells have evolved RNA surveillance mechanisms that prevent incorrectly processed RNAs from being transported out of the nucleus • Regulation of pre-mRNA processing • RNA editing • Macromolecular transport across the nuclear envelope • Cytoplasmic mechanisms of post-transcriptional regulation • Processing of rRNA and tRNA
Processing of Eukaryotic Pre-mRNA • Capping of mRNA • Splicing of mRNA • Polyadenylation of mRNA
Overview of mRNA Processing in Eukaryotes • Processing of pre-mRNA is co-transcriptional
Structure of the 5’ Methylated Cap • Methyl groups from S-adenosylmethionine are added to the N7 position of the cap G and to the 2’ oxygen of the ribose at the 5’ end of the nascent RNA
Synthesis of 5’-Cap on Eukaryotic mRNAs • Capping occurs shortly after initiation of transcription: 7-methyl-G is added to the 5’ end of the nascent RNA when the transcript is about 25-30 nucleotides long • The enzyme involved is a dimeric capping enzyme associated with the phosphorylated carboxyl-terminal domain (CTD) of Pol II, so capping is specific for transcripts produced by Pol II • The γ-phosphate is removed from the 5’ end of the nascent RNA, a GMP is added in a 5’-5’ triphosphate linkage, and methyl groups from S-adenosylmethionine are added to the N7 position of the G and to the 2’ oxygen of the ribose at the 5’ end of the nascent RNA • Capping of the nascent transcript is coupled to elongation, so that all transcripts are capped • The cap protects the mRNA from degradation by 5’ exonucleases
Functions of Capping • In prokaryotes, the Shine-Dalgarno sequence, located about 10 bases upstream of the AUG in polycistronic mRNA, binds to 16S rRNA to initiate translation • In eukaryotes, the initiating AUG lies within the consensus sequence GCC(A/G)CCAUGG (the Kozak sequence) • In eukaryotes, the 5’ end of the mRNA is capped. The cap, once bound by the cap-binding complex (CBC), protects the mRNA from being degraded by RNases • After the mRNA has been transported out of the nucleus, the CBC is replaced by eIF4E, and the complex binds to the 40S ribosomal subunit to initiate translation • The CBC contains RNA-binding proteins, encoded in mammals by the CBC20 and CBC80 genes
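The Kozak context described above can be scored with a few lines of Python. This is a minimal sketch, assuming the slide's consensus is meant to read GCC(A/G)CCAUGG; the function name `kozak_matches` and the simple match count are illustrative choices, not an established scoring scheme.

```python
# Kozak consensus written with R for "A or G" (assumed reading of the slide).
KOZAK = "GCCRCCAUGG"

def kozak_matches(mrna, aug_index):
    """Count how many of the 10 consensus positions match around an AUG.

    aug_index is the 0-based position of the A of the candidate AUG.
    """
    start = aug_index - 6
    end = aug_index + 4
    if start < 0 or end > len(mrna):
        raise ValueError("not enough flanking sequence around the AUG")
    window = mrna[start:end].upper().replace("T", "U")
    if window[6:9] != "AUG":
        raise ValueError("aug_index does not point at an AUG")
    # A position matches if it equals the consensus base, or is a purine at R.
    return sum(
        1
        for base, ref in zip(window, KOZAK)
        if base == ref or (ref == "R" and base in "AG")
    )

print(kozak_matches("GCCACCAUGGCU", 6))   # perfect context
print(kozak_matches("UUUUUUAUGCUU", 6))   # only the AUG itself matches
```

A higher count suggests a context closer to the consensus; real initiation efficiency depends most strongly on a few positions, which this equal-weight count does not model.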
Coupling Transcription with the 5’ Capping • After transcription initiates and the first few bases of the RNA are synthesized, RNA polymerase II pauses and continues transcription only once the nascent RNA has been capped. Capping is essential for recruitment of the P-TEFb kinase, which is required for transcriptional elongation • GT: guanylyltransferase • MT: 7-methyltransferase • RPB1 of Pol II contains repeats of Tyr-Ser-Pro-Thr-Ser-Pro-Ser at its C-terminus
Polyadenylation • The RNA is cleaved at a site downstream of AAUAAA and upstream of a G/U-rich sequence. These sequences are recognized by CPSF (cleavage and polyadenylation specificity factor) and CstF (cleavage stimulation factor) • The cleavage is endonucleolytic • The left (upstream) fragment is then polyadenylated • The right (downstream) fragment is degraded
3’ Cleavage and Polyadenylation of Pre-mRNA Are Tightly Coupled • Eukaryotic mRNAs are polyadenylated, except histone mRNAs • Poly(A) is added to the 3’ end of the mRNA after endonucleolytic cleavage of the longer RNA transcript • The poly(A) signal is an AAUAAA sequence located 10-35 nucleotides upstream of the poly(A) tail • A second poly(A) signal (a G/U-rich or U-rich sequence), about 50 nucleotides downstream of the cleavage site, is needed for efficient cleavage and polyadenylation • CPSF (cleavage and polyadenylation specificity factor): a 360 kDa complex consisting of four different polypeptides • CStF: cleavage stimulatory factor • CF: cleavage factor • PAP: poly(A) polymerase • PABPII: stimulates polyadenylation
The 3′ mRNA End Processing Is Critical for Transcriptional Termination • RNA polymerases I and III terminate transcription upon reaching a termination signal in the DNA • Cleavage of the pre-mRNA occurs at a site downstream of AAUAAA and upstream of a G/U-rich sequence. These sequences are recognized by CPSF (cleavage and polyadenylation specificity factor) and CstF (cleavage stimulation factor)
Polyadenylation Enhances the Stability of the mRNA • Polyadenylation protects the mRNA from degradation from the 3’ end • Standard histone mRNAs are not polyadenylated; during S phase they are stabilized by a hairpin loop at the 3’ end
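The positional rules for the two poly(A) signals can be turned into a simple scan. The sketch below is illustrative only: the function name `find_polya_candidates` is hypothetical, and treating "G/U-rich" as a run of at least five consecutive G or U bases is an assumption made for the example, not a definition from the slides.

```python
import re

# Assumption: a run of >= 5 consecutive G/U bases counts as "G/U-rich".
GU_RICH = re.compile(r"[GU]{5,}")

def find_polya_candidates(rna, min_gap=10, max_gap=35, downstream=50):
    """Find AAUAAA signals followed by a G/U-rich element.

    The cleavage site is expected 10-35 nt after AAUAAA, with the
    G/U-rich element within about 50 nt downstream of the cleavage site.
    """
    rna = rna.upper().replace("T", "U")
    candidates = []
    for match in re.finditer(r"(?=AAUAAA)", rna):
        signal_end = match.start() + 6
        search_from = signal_end + min_gap
        search_to = signal_end + max_gap + downstream
        gu = GU_RICH.search(rna, search_from, search_to)
        if gu:
            candidates.append({
                "signal_start": match.start(),
                "gu_start": gu.start(),
                # cleavage must fall after the gap and before the G/U element
                "cleavage_window": (search_from,
                                    min(signal_end + max_gap, gu.start())),
            })
    return candidates

example = "GGCC" + "AAUAAA" + "C" * 20 + "GUGUUGUGU" + "CC"
print(find_polya_candidates(example))
```

The scan reports a window rather than a single cleavage position because the slide gives ranges, not an exact site.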
hnRNP Proteins: a diverse set of proteins with conserved RNA-binding domains associated with pre-mRNA • During processing, pre-mRNAs are associated with many nuclear proteins, the major components of hnRNPs (heterogeneous nuclear ribonucleoprotein particles) • Nuclear RNA molecules are collectively referred to as hnRNA (heterogeneous nuclear RNA), i.e., pre-mRNA and other nuclear RNAs of various sizes • The hnRNA with its associated proteins can be visualized by immunostaining • hnRNP proteins range in size from 34 to 120 kDa. They were isolated by irradiating cultured cells with a high dose of UV light (which cross-links proteins to the RNA they contact), preparing nuclear extracts, passing the extracts through an oligo-dT cellulose column, recovering the bound proteins, and characterizing them • hnRNP proteins have a modular structure: one or more RNA-binding domains and at least one domain that is believed to interact with other proteins • Reading List VI: • hnRNP complex • Heterogeneous ribonucleoprotein particles
Functions of the hnRNP Proteins • Interaction of the pre-mRNA with hnRNP proteins prevents the pre-mRNA from forming secondary structure, making it accessible for interaction with other RNA molecules or proteins • The pre-mRNA-hnRNP protein complex makes the pre-mRNA a more uniform substrate for further processing • hnRNP proteins A1, C, and D bind preferentially to the pyrimidine-rich sequences at the 3’ ends of introns • This observation suggests that different hnRNP proteins bind to different RNA sequences that specify RNA splicing or cleavage/polyadenylation, and contribute to the structure recognized by RNA-processing factors • Other studies suggest that hnRNP proteins may function in the transport of mRNA to the cytoplasm
Conserved RNA Binding Motifs (I): RRM Domain & Its Interaction with RNA • RBD: RNA-binding domain containing 81 amino acid residues • The RNA recognition motif (RRM), also called the RNP motif or the RNA-binding domain (RBD), is the most common RNA-binding domain in hnRNP proteins. This ~80-residue domain contains two highly conserved sequences (RNP1 and RNP2) found in organisms from yeast to human • The RRM domain consists of a four-stranded β sheet flanked on one side by two α helices. The conserved RNP1 and RNP2 sequences lie side by side on the two central β strands, and their side chains make multiple contacts with a single-stranded region of RNA
Conserved RNA Binding Motifs (II) • RGG box: the other RNA-binding motif found in hnRNP proteins, containing five Arg-Gly-Gly (RGG) repeats with several interspersed aromatic amino acids. This motif is similar to the RNA-binding domains of the HIV Tat protein • KH motif: a 45-residue motif found in the hnRNP K protein and several other RNA-binding proteins; commonly two or more copies of the KH motif are interspersed with RGG repeats • The 3D structure of the KH domain is similar to that of the RRM domain but smaller. It consists of a three-stranded β sheet supported on one side by two α helices • RNA binds to the KH motif by interacting with a hydrophobic surface formed by the α helices and one β strand
Splicing: RNA-DNA Hybridization Shows That Introns Are Spliced from Pre-mRNA • (a) EcoRI fragment of adenovirus DNA containing the hexon gene. The gene contains four short exons and three introns • (b) Electron micrograph (left) and schematic drawing (right) of the hybrid between DNA and RNA • Introns in the pre-mRNA are removed by RNA splicing • Richard Roberts and Phillip Sharp were awarded the Nobel Prize in 1993 for the discovery of splicing of precursor mRNA • For long transcription units, splicing of introns in the nascent RNA begins before transcription of the entire unit is completed • Reading List VI: (i) R loop mapping; (ii) Nobel Lecture by Phillip Sharp; (iii) Mapping of viral RNA with viral DNA
Consensus Sequences around 5’ and 3’ Splice Sites in Vertebrate Pre-mRNAs • Figure labels: donor (5’ splice site), acceptor (3’ splice site) • The location of specific splice sites can be determined by comparing the genomic and corresponding cDNA sequences • A pyrimidine-rich region (~15 bases) is located just upstream of the 3’ splice site • About 30-40 nucleotides at each end of an intron are necessary for splicing to occur at normal rates • Donor splice site: GU; acceptor splice site: AG. This is termed the GU/AG rule • Assigned Reading: • Intron splicing
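The GU/AG rule and the pyrimidine-rich tract can be checked programmatically for a candidate intron. The following is a minimal sketch; the function name `check_intron` and the choice to report a pyrimidine fraction (rather than apply a cutoff) are assumptions made for illustration.

```python
def check_intron(intron, tract_len=15):
    """Check a candidate intron against the GU/AG rule.

    Returns whether it starts with GU, ends with AG, and the fraction of
    pyrimidines (C/U) in the ~15 bases just upstream of the terminal AG.
    """
    seq = intron.upper().replace("T", "U")
    # Bases immediately upstream of the 3' splice-site AG.
    tract = seq[-(tract_len + 2):-2]
    py_fraction = sum(base in "CU" for base in tract) / len(tract) if tract else 0.0
    return {
        "donor_GU": seq.startswith("GU"),
        "acceptor_AG": seq.endswith("AG"),
        "py_fraction": py_fraction,
    }

intron = "GUAAGU" + "A" * 20 + "UACUAAC" + "UCUUUCCUUUCUCUU" + "AG"
print(check_intron(intron))
```

Because GU and AG dinucleotides occur very often by chance, a check like this can only rule candidates out; comparing genomic and cDNA sequences remains the way to locate real splice sites.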
Two Transesterification Reactions in Splicing of Exons • Two sequential transesterification reactions are required to remove the intron sequence • Introns are removed as a lariat-like structure in which the 5’ G of the intron is joined in an unusual 2’,5’-phosphodiester bond to an adenosine near the 3’ end of the intron. This A is called the branch point • Since the number of phosphodiester bonds in the RNA does not change during splicing, the transesterification reactions themselves do not require an input of energy
Two Steps of Transesterification Reactions • The first step of the splicing reaction is a nucleophilic attack by the 2’-OH of the branch-point A on the 5’ splice site • The left exon takes the form of a linear molecule • The right intron-exon molecule forms a lariat, in which the 5’ terminus generated at the end of the intron is simultaneously transesterified to become linked by a 2’-5’ bond to a base within the intron. The target base is an A in a sequence called the branch site • In the second step, the free 3’-OH of the exon released by the first reaction attacks the bond at the 3’ splice site, forming a 3’-5’ phosphodiester bond between the two exons • The consensus sequence of the branch site in yeast is UACUAAC; in multicellular eukaryotes it is less well conserved, with only a preference for purines or pyrimidines at each position
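The bookkeeping behind the two transesterification steps can be followed in a small simulation. This is a sketch under stated assumptions: the function name `splice`, the index conventions, and the toy sequence are all hypothetical, and the model only tracks which pieces exist and how many phosphodiester bonds they contain.

```python
def splice(pre_mrna, donor, branch, acceptor):
    """Model the two transesterification steps of splicing.

    donor:    index of the first intron base (the G of GU)
    branch:   index of the branch-point A inside the intron
    acceptor: index just past the last intron base (after AG)
    Returns the spliced mRNA, the excised intron, and the total number of
    phosphodiester bonds before splicing, after step 1, and after step 2.
    """
    exon1 = pre_mrna[:donor]
    intron = pre_mrna[donor:acceptor]
    exon2 = pre_mrna[acceptor:]
    if not (intron.startswith("GU") and intron.endswith("AG")):
        raise ValueError("intron does not follow the GU/AG rule")
    if not (donor < branch < acceptor) or pre_mrna[branch] != "A":
        raise ValueError("branch point must be an A inside the intron")

    bonds_start = len(pre_mrna) - 1

    # Step 1: the branch-point 2'-OH attacks the 5' splice site.
    # Products: free exon 1, and an intron-exon 2 lariat with one extra 2'-5' bond.
    bonds_step1 = (len(exon1) - 1) + (len(intron) + len(exon2) - 1) + 1

    # Step 2: the 3'-OH of exon 1 attacks the 3' splice site.
    # Products: joined exons, and the lariat intron (linear bonds + one 2'-5' bond).
    mrna = exon1 + exon2
    bonds_step2 = (len(mrna) - 1) + (len(intron) - 1) + 1

    return mrna, intron, (bonds_start, bonds_step1, bonds_step2)

pre = "AUGGCC" + "GUAAGUUACUAACUUUUCUUAG" + "GGAUAA"
branch_a = pre.index("UACUAAC") + 5   # the second A of UACUAAC, used as branch point
print(splice(pre, 6, branch_a, 28))
```

The three bond counts come out equal, which is the arithmetic behind the statement that the number of phosphodiester bonds is unchanged by splicing.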
Structure of snRNA Molecule • U1 snRNP contains the core Sm proteins, three U1-specific proteins (U1-70K, U1A and U1C) and U1 snRNA • U1 snRNA contains several domains: • Sm-binding site: interacts with the common snRNP proteins • Stem-loop structures: bind to proteins that are unique to U1 snRNP • U1 snRNA interacts with the 5’ splice site by base pairing between its single-stranded 5’ terminus and a stretch of four to six bases of the 5’ splice site • Mutations in the 5’ splice site and in U1 snRNA can be used to test the importance of base pairing between the 5’ splice site and the snRNA
snRNAs Base-Pairing with Pre-mRNA During Splicing • Splicing requires the presence of small nuclear RNAs (snRNAs) • Five U-rich snRNAs, designated U1, U2, U4, U5 and U6, participate in pre-mRNA splicing • These RNAs are 107 to 210 nucleotides long and are associated with 6 to 10 proteins in small nuclear ribonucleoprotein particles (snRNPs)
Spliceosome • Small cytoplasmic RNAs (scRNA; scyrps): RNAs that are present in the cytoplasm (and sometimes are also found in the nucleus) • Small nuclear RNA (snRNA; snurps): one of many small RNA species confined to the nucleus; several of them are involved in splicing or other RNA processing reactions • Small nucleolar RNA (snoRNA): a small nuclear RNA that is localized in the nucleolus • The five snRNPs involved in splicing are U1, U2, U5, U4, and U6 • Together with some additional proteins (splicing factors), the snRNPs form the spliceosome (~12 MDa) • The figure on the left shows the number of proteins present in the spliceosome
Formation of the Commitment Complex • SR proteins: proteins that play a critical role in the formation of spliceosomes. These splicing factors contain one or two RNA recognition motifs (RRMs) and an RS domain rich in Arg/Ser repeats • U1 snRNP initiates splicing by binding to the 5′ splice site by means of an RNA-RNA pairing reaction • The commitment complex (or E complex) contains U1 snRNP bound at the 5′ splice site and the protein U2AF bound to a pyrimidine tract between the branch site and the 3′ splice site
Intron Definition/Exon Definition (I) • In multicellular eukaryotic cells, where splicing signals are highly variable, SR proteins play an essential role in initiating the formation of the splicing commitment complex • In yeast, nearly all intron-containing genes are interrupted by a single small intron; the 5’ and 3’ splice sites are recognized by U1 snRNP, BBP and Mud2. This is referred to as “intron definition” • The figure on the right shows exon definition, which applies to genes with short exons and long introns. More details are given in the next slide
Intron Definition/Exon Definition (II) • The intron definition mechanism also applies to the splicing of small introns in multicellular eukaryotic cells • Many multicellular eukaryotic genes possess long introns containing many sequences that resemble real splice sites. To ensure correct recognition of the splice sites, the mechanism of “exon definition” is employed • In exon definition, the U2AF heterodimer (U2 snRNP auxiliary factor) binds to the 3’ splice site and U1 snRNP base pairs with the 5’ splice site downstream of the exon sequence. This process may be aided by SR proteins that bind to specific exon sequences between the 3’ and downstream 5’ splice sites. The mechanism that then ensures correct splicing is not known
The Spliceosome Assembly Pathway • The E (commitment) complex progresses to the pre-spliceosome (the A complex) in the presence of ATP • Recruitment of the U5 and U4/U6 snRNPs converts the pre-spliceosome to the mature spliceosome (the B1 complex) • The B1 complex is next converted to the B2 complex, in which U1 snRNP is released to allow U6 snRNA to interact with the 5′ splice site • U4 dissociates from U6 snRNP to allow U6 snRNA to pair with U2 snRNA and form the catalytic center for splicing • Both transesterification reactions take place in the activated spliceosome (the C complex) • The second transesterification joins the exons and releases the intron as a lariat • The splicing reaction is reversible at all steps
Splicing Utilizes a Series of Base Pairing Reactions between snRNAs and Splice Sites • U1 pairs with 5’ splice site • U2 pairs with the branch site • U6 pairs with 5’ splice site • U5 is close to both exons
Mutation that Affects the Binding of SR Proteins to the Exonic Splicing Enhancer • A mutation that prevents an SR protein from binding to an exonic splicing enhancer results in skipping of the exon during splicing. The resulting mRNA, which lacks the exon, will be degraded or translated into a protein with abnormal function • For example, in spinal muscular atrophy, a disease that causes childhood mortality, a single-base change in the SMN2 pre-mRNA prevents an SR protein from binding to the exonic splicing enhancer and causes exon skipping, leading to low production of functional SMN protein from SMN2. In childhood, low levels of SMN protein lead to low viability of spinal cord motor neurons, resulting in death • Approximately 15% of the single base-pair mutations that cause human genetic diseases interfere with proper exon definition. Some of these mutations lead to the use of alternative splice sites; others result in exon skipping because an SR protein fails to bind to the exonic splicing enhancer (a six-base motif)
Splicing Can Also Occur Between AU and AC • A minority of introns begin with AU and end with AC, as opposed to the GU and AG found in most introns • Removal of these introns involves lariat formation and U5 snRNA, but the other U snRNAs involved differ between the two cases • AU-AC splicing has been reported to take place in the cytoplasm, which would suggest that this type of splicing may have a different function from GU-AG splicing in the nucleus
Discovery of Ribozymes • Thomas Cech discovered an RNA molecule with enzymatic activity that could remove the intervening sequence in Tetrahymena rRNA; this discovery led to a Nobel Prize in 1989 • Self-splicing introns (ribozymes): introns that can be spliced out in vitro in the absence of protein splicing factors • Figure: hammerhead ribozyme • Reading List VI: • Cech’s Nobel Prize lecture • Hammerhead ribozyme
Self-Splicing of Group II and Group I Introns • Two types of self-splicing introns are known: group I introns (present in nuclear rRNA genes of protozoans) and group II introns (present in protein-coding genes and some rRNA and tRNA genes in mitochondria and chloroplasts of plants and fungi) • Group II introns excise themselves from RNA by an autocatalytic splicing event (autosplicing or self-splicing) • The splice junctions and mechanism of splicing of group II introns are similar to those of nuclear introns • A group II intron folds into a secondary structure that generates a catalytic site resembling the structure of the U6-U2-nuclear intron complex • Group I introns also excise themselves from RNA by an autocatalytic splicing event
Comparison of Self-Splicing Group II Introns and U snRNAs in Spliceosome • Figure labels: first transesterification, second transesterification • Two types of self-splicing introns have been discovered: group I introns (present in nuclear rRNA genes of protozoans) and group II introns (present in protein-coding genes and some rRNA and tRNA genes in mitochondria and chloroplasts of plants and fungi)
Self-Splicing Group II Introns Provide Clues to the Evolution of snRNAs • Even though the precise sequences of group II introns are not highly conserved, they fold into a conserved, complex secondary structure containing numerous stem-loops • The chemistry of self-splicing by a group II intron is similar to that of pre-mRNA splicing. This observation led to the hypothesis that “snRNAs function analogously to the stem-loops in the secondary structure of group II introns” • According to this hypothesis, one would expect the similar 3D structures presented in the previous slide • Introns in ancient pre-mRNAs may have evolved from group II self-splicing introns through progressive loss of internal RNA structures, which concurrently evolved into trans-acting snRNAs that perform the same functions • Support for this type of evolutionary model comes from experiments with group II intron mutants in which domains V and I are deleted. The resulting mutants fail to self-splice • A maturase enzyme may function to stabilize the 3D structure for self-splicing of group II introns in vivo
Nuclear Exonucleases Degrade RNA That Is Processed Out of Pre-mRNA • The spliced-out introns are degraded by exonucleases from the 5’ or 3’ end • The 2’,5’-phosphodiester bond in the newly excised intron is first cut by a debranching enzyme, converting the lariat to a linear structure • The predominant decay pathway of the linear RNA molecule is 3’ to 5’, carried out by 11 exonucleases that associate with one another in a large protein complex called the “exosome” • Other proteins in the exosome are RNA helicases that disrupt base pairing and protein-RNA interactions • The exosome also degrades improperly processed or improperly polyadenylated pre-mRNA • Nuclear cap-binding complex: proteins that bind to the 5’ cap to protect the capped RNA from being destroyed by exonucleases
Chain Elongation by RNA Pol II Is Coupled to the Presence of RNA-Processing Factors • The carboxyl-terminal domain (CTD) of RNA Pol II is composed of multiple repeats of a seven-residue sequence. When fully extended, the CTD of the yeast enzyme, for instance, is about 65 nm long • The CTD of human RNA Pol II is about twice as long • The long CTD allows multiple proteins to associate with a single RNA Pol II molecule • For instance, the enzyme that adds the 5’ cap to nascent transcripts, as well as RNA splicing and polyadenylation factors, is associated with the phosphorylated CTD. As a consequence, these processing factors are present at high local concentrations when splice sites and poly(A) signals are transcribed by the polymerase, enhancing the rate and specificity of RNA processing. Deletion of the CTD reduces transcription and processing of RNA • The association of RNA splicing factors with the phosphorylated CTD stimulates transcription elongation
Coupling of Transcription and RNA Processing • Transcription and RNA processing are coupled in the nucleus: • Capping • Release from pausing • Splicing • Polyadenylation • Recruitment of these various factors is closely coupled to the phosphorylation of the CTD of RNA polymerase II (see next slide for details) • CE: capping enzyme; SC: splicing complex; PC: polyadenylation complex; ph: phosphorylation
Methylation of Lysine at H3 Stimulates RNA Splicing • Trimethylation of the lysine at position 4 of histone H3 not only results in a more open chromatin structure but also stimulates transcriptional elongation and enhances RNA splicing • Both the CBC and the CPSF polyadenylation complex can interact with the spliceosome (S), thereby linking these different post-transcriptional events
Splicing Facilitates Transport of mRNA to the Cytoplasm • Splicing can occur during or after transcription • The transcription and splicing machineries are physically and functionally integrated • Splicing is connected to mRNA export and stability control • Exon junction complex (EJC): a protein complex that assembles at exon-exon junctions during splicing and assists in RNA transport, localization, and degradation; it recruits several protein complexes for mRNA transport • REF (Aly): a key protein that mediates mRNA transport by interacting with TAP (Mex67p)
The EJC Complex Couples Splicing with NMD • Splicing in the nucleus can influence mRNA translation in the cytoplasm. • Nonsense-mediated mRNA decay (NMD) – A pathway that degrades an mRNA that has a nonsense mutation prior to the last exon.
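The NMD criterion stated above, a stop codon located before the last exon, can be expressed as a short check. This is a simplified sketch: the function names and toy exon sequences are hypothetical, and distance rules relative to the last exon junction that real NMD models use are deliberately not modeled.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def first_stop(mrna, start=0):
    """Return the index of the first in-frame stop codon, or None."""
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return i
    return None

def is_nmd_target(exons, start=0):
    """True if the first in-frame stop codon ends before the last exon begins.

    exons: list of exon sequences in 5'-to-3' order (already spliced together
    in this order); start: index of the A of the start codon.
    """
    mrna = "".join(exons)
    stop = first_stop(mrna, start)
    if stop is None or len(exons) < 2:
        return False
    last_junction = len(mrna) - len(exons[-1])
    return stop + 3 <= last_junction

normal = ["AUGGCU", "GCAGGA", "CCUUAA"]   # stop codon in the last exon
mutant = ["AUGUAA", "GCAGGA", "CCUUAA"]   # nonsense codon in the first exon
print(is_nmd_target(normal), is_nmd_target(mutant))
```

The check relies on exon boundaries because, in the cell, the EJCs deposited at exon-exon junctions are what mark a stop codon as premature.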
Regulation of Pre-mRNA Processing • Macromolecular Transport Across the Nuclear Envelope
Regulation of Pre-mRNA Processing • Alternative splicing is the principal mechanism for regulating mRNA processing • Comparison of the genomic sequences of genes with the sequences of expressed sequence tags (ESTs) from cDNAs reveals that many genes have complex transcription units, capable of producing several different mRNAs through different combinations of exons • More than 60% of all transcription units in the human genome are complex • Although examples of cleavage at alternative poly(A) sites of pre-mRNA are known, alternative splicing of different exons is the more common mechanism for expressing different proteins from one complex transcription unit • Such alternative processing pathways are usually regulated in a cell-type-specific or developmental-stage-specific manner. Example: different fibronectin isoforms are produced in fibroblasts and hepatocytes
Different Modes of Alternative Splicing • Specific exons or exonic sequences may be excluded from or included in the mRNA products by using alternative splice sites • Alternative splicing contributes to the structural and functional diversity of gene products
Effect of Alternative Splicing on Gene Expression (I) • Alternative splicing can affect gene expression in the cell in at least two ways • It can create structural diversity of gene products by including or omitting some coding sequences or by creating alternative reading frames for the protein. Example: the CaMKIIδ gene, which encodes kinases. Differential splicing of the kinase pre-mRNA results in production of three different kinases that are localized to different sites in the cell and phosphorylate different substrates • Another example is that alternatively spliced products may exhibit opposite functions. This applies to a number of genes involved in apoptosis: one form promotes apoptosis and the other protects cells from apoptosis
Alternative Splicing of Primary Transcripts Where Both 5’ and 3’ Ends of Transcripts Are Identical • In the muscle troponin T gene, the same precursor mRNA can be spliced in up to 64 different ways in different muscle cell types, owing to the presence of tissue-specific splicing factors • Evidence that tissue-specific splicing factors control the differential removal of exons 4-8 in different muscle cells comes from the following experiment: when the muscle troponin T gene is expressed in non-muscle cell types, exons 4-8 are completely removed, resulting in only one mature mRNA
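The figure of 64 splicing patterns can be reproduced by enumeration. The sketch below treats exons 4-8 as five independently included or skipped cassette exons (32 combinations) and assumes the remaining factor of two comes from one mutually exclusive exon pair, labeled here as exons 16 and 17; that second assumption is not stated on the slide and is included only so the arithmetic reaches 64.

```python
from itertools import product

def troponin_isoforms(cassette=(4, 5, 6, 7, 8), exclusive=(16, 17)):
    """Enumerate splicing patterns as tuples of the variable exons retained.

    cassette:  exons that are each independently included or skipped
    exclusive: assumed mutually exclusive pair (exactly one is used)
    """
    isoforms = []
    for keep in product([False, True], repeat=len(cassette)):
        included = tuple(e for e, k in zip(cassette, keep) if k)
        for choice in exclusive:
            isoforms.append(included + (choice,))
    return isoforms

patterns = troponin_isoforms()
print(len(patterns))   # 2**5 cassette combinations x 2 exclusive choices
```

In this model, the non-muscle result described above corresponds to the patterns in which none of exons 4-8 is retained.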
Expression of Dscam Isoforms in Drosophila Retinal Neurons • The most extreme example of regulated alternative RNA processing is the expression of the Dscam gene in Drosophila • The Dscam gene encodes a set of proteins in Drosophila neurons • Mutations in this gene interfere with the normal connections made by the axons of the retinal neurons with neurons in a specific region of the brain • There are 95 alternatively spliced exons that could be combined to generate about 38,000 possible isoforms • These results raise the possibility that the expression of different Dscam isoforms through regulated RNA splicing helps to specify the tens of thousands of different specific synaptic connections made between retinal and brain neurons • In other words, correct wiring of neurons in the brain may depend on regulated RNA splicing
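The relation between 95 alternative exons and roughly 38,000 isoforms is a product, not a sum. The following sketch assumes the commonly cited organization of Dscam into four clusters of mutually exclusive exons with 12, 48, 33 and 2 alternatives; these cluster sizes are not given on the slide and are supplied here as an assumption from the literature.

```python
from math import prod

# Assumed cluster sizes (number of mutually exclusive alternatives per cluster).
DSCAM_CLUSTERS = {"exon 4": 12, "exon 6": 48, "exon 9": 33, "exon 17": 2}

def count_alternative_exons(clusters):
    """Total number of alternative exons across all clusters."""
    return sum(clusters.values())

def count_isoforms(clusters):
    """One exon is chosen per cluster, so the choices multiply."""
    return prod(clusters.values())

print(count_alternative_exons(DSCAM_CLUSTERS))  # alternative exons in total
print(count_isoforms(DSCAM_CLUSTERS))           # possible combinations
```

Under these assumptions the totals are 95 exons and 38,016 combinations, consistent with the rounded figure of 38,000 on the slide.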
Expression of Ca++-Activated K+ Channel Isoform mRNAs in Vertebrate Hair Cells Is Another Example • In the inner ear of vertebrates, individual “hair cells”, which are ciliated neurons, respond most strongly to a specific frequency of sound • In birds and reptiles, the tuning of hair cells is affected by the opening of K+ channels in response to changes in Ca++ concentration. Channel opening determines the frequency with which the membrane potential oscillates, and hence the frequency to which the cell is tuned • The slo gene encodes the channel and is expressed as multiple, alternatively spliced mRNAs • Suggested Reading List VI: • Distribution of Ca++-activated K+ channel isoforms along the tonotopic gradient
Slo proteins encoded by different alternatively spliced mRNAs open the Ca++-activated K+ channel at different Ca++ concentrations • Hair cells with different response frequencies express different Slo channel proteins depending on their position along the length of the cochlea • There are 8 regions in the slo mRNA where alternative exons are utilized, permitting the expression of 576 possible isoforms • RT-PCR analysis of slo mRNAs from individual hair cells has shown that each hair cell expresses a mixture of different alternatively spliced slo mRNAs, with different forms predominating in different cells according to their position along the cochlea • In rat, splicing at one of the alternative splice sites in the slo pre-mRNA is suppressed when a specific protein kinase is activated by depolarization of the neurons. This observation suggests that a splicing repressor specific for this site may be activated when it is phosphorylated by this protein kinase • These observations suggest that modification of splicing factors may play a significant role in modulating neuron function • Suggested Reading List VI: Distribution of Ca++-activated K+ channel isoforms along the tonotopic gradient
Effect of Alternative Splicing on Gene Expression (II) • Alternative splicing may affect various properties of the mRNA by including or omitting certain regulatory RNA elements, which may significantly alter the half-life of the mRNA. This form of splicing depends on the presence of a splicing factor • The figure shows models by which an alternative splicing factor can affect splicing by binding to a cis-acting element. In (a), the factor acts by promoting the use of the weaker of the two potential splice sites, while in (b) it acts by inhibiting use of the stronger of the two sites so that the other, weaker site is used • Sex determination in Drosophila is the best-characterized example