[V]. Process of Transcription and Transcriptional Control of Gene Expression
• RNA polymerases and initiation of transcription
• Transcriptional elongation and termination
• Gene promoters, enhancers and silencers
• DNA binding by transcription factors
• Activation and repression of transcription
• Regulation at transcriptional elongation
• Regulation of transcription by RNA polymerases I and III
Overview of Control of Gene Expression
Regulation at the transcriptional level:
• Regulation of initiation of transcription
• Chromatin-mediated transcriptional control
• Interaction of activators and repressors with the transcription complex
Regulation at the post-transcriptional level:
• Regulation of alternative splicing, leading to production of multiple protein isoforms
• Regulation of transport of mRNA into the cytoplasm
Regulation at the translational or post-translational level:
• Modification of the translational apparatus or specific protein factors
• MicroRNAs
• RNA interference (RNAi or siRNA)
• Cytoplasmic polyadenylation
• mRNA degradation
• Localization of mRNA in the cytoplasm
Structure of Protein Coding Gene
[Figure labels: regulatory region (regulatory cis elements) with distal promoter and proximal promoter; TATA box at -25 to -35 bp; transcription start site (+1); structural gene]
Two key features of transcription control:
• Transcription factors (proteins) bind to regulatory regions of genes, which results in changes in chromatin structure
• Specific proteins that bind to a gene’s regulatory sequences determine where transcription will start and either activate or repress its transcription
Conventions for Describing RNA Transcription Regulatory region Direction of transcription (RNA synthesis)
Polymerization of Ribonucleotides by RNA Polymerase during Transcription
• In transcription, the sequence of the RNA strand is copied from one strand of the DNA
• Each ribonucleotide is added at the 3’ end of the growing RNA strand, i.e., the RNA strand grows in the 5’ to 3’ direction
• The DNA strand used as the template is read in the 3’ to 5’ direction
• The position where RNA polymerase starts to make RNA is denoted +1. Downstream sequence lies toward the 5’ end of the template strand, while upstream sequence lies toward its 3’ end
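The strand directions above can be checked with a minimal sketch. The function name `transcribe` and the nine-base template are hypothetical, chosen only for illustration; the code assumes the template is supplied in 3’ to 5’ order, so the RNA comes out in 5’ to 3’ order.

```python
# Pairing rules between a DNA template base and the RNA base added opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA, written 5'->3', copied from a template written 3'->5'.

    Reading the template 3'->5' and appending each complementary
    ribonucleotide to the 3' end of the RNA gives the RNA in 5'->3' order.
    """
    rna = []
    for base in template_3_to_5.upper():
        rna.append(DNA_TO_RNA[base])  # each new nucleotide joins the 3' end
    return "".join(rna)


template = "TACGGATTC"  # hypothetical template strand, written 3'->5'
print(transcribe(template))  # AUGCCUAAG
```

The RNA has the same sequence as the non-template (coding) DNA strand, with U in place of T.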
RNA Polymerase in Prokaryotes
• In prokaryotes, there is only one type of RNA polymerase, which is responsible for the synthesis of rRNA, tRNA and mRNA
The Structure of Bacterial RNA Polymerase
• Generally speaking, RNA polymerases of similar structure are found in bacteria, archaea and eukaryotic cells
• Bacterial RNA polymerase consists of two large subunits (β’ and β), two small subunits (α), one sigma (σ) factor, and one omega (ω) subunit
Stages of Transcription
• RNA polymerase binds to the promoter region
• Requires transcription factors to locate the promoter region
• Melts the double-stranded DNA (12 to 14 base pairs) to expose the template strand
• The template strand enters the active site of RNA polymerase, which catalyzes formation of a phosphodiester bond between the first two ribonucleoside triphosphates, which are complementary to the template strand at the transcription start site
• Strand elongation
• Transcription termination
Stages in Transcription (I) Promoter; Transcription factors; Transcription bubble
Stages in Transcription (II)
Rate of elongation: ~1000 nt/min at 37 °C
Active RNA Polymerase in Bacterial Cells
• For active transcription in eubacteria, the RNA polymerase core enzyme must bind a protein, the σ factor (σ70), to form the complete complex (holoenzyme)
• The σ factor (σ70) binds to the promoter DNA at the -10 element (six bases) and the -35 element (six bases) and positions the core enzyme to initiate transcription at the +1 position
• Consensus: -35 element TTGACA, then a spacer of 16 – 18 bp, then -10 element (TATA box) TATAAT
• Base frequencies (%): -35 element T82 T84 G78 A65 C54 A43; -10 element T80 A93 T45 A60 T96
• The σ factor (σ70) acts as an initiation factor for transcription: it is released from RNA polymerase once the first few bases are transcribed and is not required for elongation
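The promoter layout recognized by σ70 can be turned into a simple sequence scan. The sketch below is illustrative only: it assumes the six-base consensus TTGACA for the -35 element and TATAAT for the -10 element, with a 16 to 18 bp spacer, and the function names and the mismatch threshold are hypothetical. Real promoter prediction uses weighted frequencies rather than a plain mismatch count.

```python
MINUS_35 = "TTGACA"  # -35 element consensus (assumed six-base form)
MINUS_10 = "TATAAT"  # -10 element consensus


def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sigma70_promoters(seq, max_mismatches=2, spacers=(16, 17, 18)):
    """Return (pos_35, spacer, pos_10, total_mismatches) for candidate promoters.

    Positions are 0-based indices into seq of the first base of each element.
    Each element may differ from its consensus by up to max_mismatches bases.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(MINUS_35) + 1):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatches:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            box10 = seq[j:j + 6]
            if len(box10) < 6:  # ran off the end of the sequence
                continue
            m10 = mismatches(box10, MINUS_10)
            if m10 <= max_mismatches:
                hits.append((i, spacer, j, m35 + m10))
    return hits


# Hypothetical test sequence: perfect elements separated by a 17 bp spacer.
demo = "GG" + MINUS_35 + "C" * 17 + MINUS_10 + "GG"
print(find_sigma70_promoters(demo, max_mismatches=0))
```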
Gene Organization, Transcription & Translation in Prokaryotes
• Genes encoding proteins devoted to a single metabolic goal are most often arranged in a contiguous array
• Such an arrangement of genes in a functional group is called an “operon”
• Transcription of the genes in an operon results in one long mRNA (polycistronic mRNA)
• Translation of the polycistronic mRNA gives rise to different peptides
[Figure: polycistronic mRNA; translation initiates at five different sites]
Gene Organization, Transcription and Translation in Eukaryotes
• Polycistronic mRNA is generally not found in eukaryotes
• Genes devoted to a single metabolic pathway are physically separated
• Each gene contains exons and introns
• A precursor mRNA is initially transcribed from the gene, and the intron regions are spliced out to form a mature mRNA
• Introns are rare in bacteria and archaea and uncommon in unicellular eukaryotes such as yeast
Eukaryotic RNA Polymerases
In eukaryotes, three different RNA polymerases are found: RNA polymerases I, II and III.
Assigned Reading: Structural basis of eukaryotic gene transcription. FEBS Letters 579: 879-903, 2005.
Eukaryotic RNA Polymerases
• All eukaryotic RNA polymerases have ~12 subunits and are complexes of ~500 kD
• Some subunits are common to all three RNA polymerases
• The largest subunit in RNA polymerase II has a CTD (carboxy-terminal domain) consisting of multiple repeats of seven amino acids (Tyr-Ser-Pro-Thr-Ser-Pro-Ser)
RNA Polymerases of Prokaryotes & Eukaryotes
• The three eukaryotic RNA polymerases, which transcribe different RNAs, share some common features:
• Each of the polymerases has two large subunits (β’-like and β-like)
• Each contains two α-like subunits and one ω-like subunit; the α-like subunits in Pol I and Pol III are identical
• All contain identical common subunits
Three RNA Polymerases in Eukaryotes
• The three different RNA polymerases can be separated by chromatography on a DEAE-cellulose column
• Polymerase II is very sensitive to α-amanitin (1 µg/ml), polymerase III is less sensitive to α-amanitin (10 µg/ml), and polymerase I is not sensitive to α-amanitin
• RNA pol I: pre-rRNA
• RNA pol II: pre-mRNA, snRNA, miRNA
• RNA pol III: tRNA, 5S rRNA, snRNA U6, 7S RNA
[Figure: separation of RNA polymerases on a DEAE-cellulose column]
Transcription by RNA Polymerase I (1)
• RNA polymerase I has a bipartite promoter
• Transcription of rRNA genes by RNA polymerase I is relatively simple. rRNA genes are arranged in tandem repeats separated by non-transcribed spacers
• RNA polymerase I exists as a holoenzyme that contains additional factors. The promoter of an rRNA gene consists of two separate regions. The core promoter surrounds the start site, extending from -45 to +20
• The core promoter region contains an A-T-rich site and a G-C-rich site. The efficiency of the core promoter is enhanced by the upstream promoter element (UPE or UCE). The UPE is another G-C-rich sequence, extending from -180 to -107
• UBF: upstream binding factor, the transcription factor that binds to the UPE
Transcription by RNA Polymerase I (2)
• To initiate transcription, the TATA-binding protein (TBP) is required. Although the same TBP is required for initiating transcription by RNA polymerase II and RNA polymerase III, it does not bind to DNA directly in this case
• A protein factor, upstream binding factor (UBF), recognizes and binds to the upstream promoter element. This complex recruits other transcription factors (SL1, which contains TBP) and RNA polymerase I
• In this way the entire transcription initiation complex is formed and transcription is initiated
Transcription by RNA Polymerase III
• RNA polymerase III has two types of promoters:
• Internal promoters (types 1 and 2; 5S rRNA and tRNA genes) contain consensus sequences located within the transcription unit and cause initiation to occur at a fixed distance upstream
• Upstream promoters contain three short consensus sequences (Oct, PSE and TATA) upstream of the startpoint that are bound by transcription factors (promoters for snRNA genes belong to this type)
Internal type 1 pol III promoters
Internal type 2 pol III promoters
Binding Site of Transcription Factor on 5S rRNA Gene Can Be Localized by DNase I Footprinting Assay
• DNase I footprinting assay: when DNA is bound by a protein, the region of the DNA that is bound by the protein is resistant to digestion by DNase I. Using this technique, it is possible to map the region of the DNA that binds to any protein factor
• Using this technique, the interactions of the transcription factors TFIIIA, TFIIIB and TFIIIC that allow RNA polymerase III to transcribe the 5S rRNA gene were determined
• TFIIIA, TFIIIB and TFIIIC are required for the formation of the transcription pre-initiation complex for genes that are transcribed by RNA polymerase III
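The logic of reading a footprint can be sketched in a few lines: positions that DNase I cuts in naked DNA but not in protein-bound DNA mark the protected region. The function name and the cut positions below are hypothetical toy values, not data from the 5S rRNA experiments, and real footprints are read from band intensities rather than from a clean list of positions.

```python
def protected_regions(cuts_free, cuts_bound):
    """Return contiguous (start, end) runs cut in free DNA but not in bound DNA.

    cuts_free  : positions cleaved by DNase I in the absence of protein
    cuts_bound : positions cleaved when the protein is bound
    """
    missing = sorted(set(cuts_free) - set(cuts_bound))
    regions = []
    for pos in missing:
        if regions and pos == regions[-1][1] + 1:
            regions[-1][1] = pos         # extend the current protected run
        else:
            regions.append([pos, pos])   # start a new protected run
    return [tuple(r) for r in regions]


# Toy example: naked DNA is cut at every position 1..20;
# with protein bound, positions 8..14 are no longer cut.
free_lane = list(range(1, 21))
bound_lane = list(range(1, 8)) + list(range(15, 21))
print(protected_regions(free_lane, bound_lane))  # [(8, 14)]
```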
Internal Type 1 and Type 2 Promoters Use Different Transcription Factors
• Type 2 promoter: TFIIIC binds to both boxA and boxB. This enables TFIIIB to bind at the start site
• Type 1 promoter: TFIIIA must bind at boxA to enable TFIIIC to bind at boxC. Following this, TFIIIB and TBP bind at the start site to allow pol III to initiate transcription
• TFIIIA and TFIIIC are assembly factors whose sole role is to assist the binding of the positioning factor TFIIIB at the correct location. Once TFIIIB is bound, TFIIIA and TFIIIC can dissociate
[Figures: type 2 pol III promoter; type 1 pol III promoter]
Type 3 Pol III Promoter
• In this type of promoter, there are three upstream elements: TATA, PSE and Oct. These elements are also present in snRNA genes transcribed by pol II, and they function similarly for pol II and pol III
• TBP (a component of TFIIIB) binds to the TATA box, and TFIIIB then recruits pol III to the startpoint. This forms the pre-initiation complex
• Efficiency of transcription is much increased by the PSE (proximal sequence element) and Oct (8-base-pair binding sequence) elements
Transcription by RNA Polymerase II
• Genes encoding proteins are transcribed by RNA polymerase II
• RNA polymerase II requires general transcription factors, called TFIIX (where X is a letter identifying the individual factor), to initiate transcription
• RNA polymerase II promoters frequently have a short conserved sequence, Py₂CAPy₅ (the initiator, Inr), at the startpoint
• The TATA box is a common component of RNA polymerase II promoters and consists of an A-T-rich octamer located ~25 bp upstream of the startpoint
RNA Polymerases in All Promoters Are Positioned by Factors Containing TBP
• TBP is a component of the positioning factor that is required for each type of RNA polymerase to bind its promoter
• The factor for RNA polymerase II is TFIID, which consists of TBP and ~14 TAFs, with a total mass of ~800 kD
• TBP binds to the TATA box in the minor groove of DNA
• TBP forms a saddle around the DNA and bends it by ~80°
Saddle-Like Structure of TBP Bound to TATA Box
• TFIID is composed of the TATA-binding protein (TBP) and a group of evolutionarily conserved proteins known as TBP-associated factors, or TAFs
• TBP has a saddle-like structure that binds to the TATA box and bends the DNA, as shown in the figure
• TFIIA binds to TBP at the N-terminus of TBP, and TFIIB binds at the C-terminus
• TAFII250, one of the TBP-associated factors, has histone acetyltransferase activity; it can acetylate histones, leading to modulation of chromatin structure
Consensus Sequence of TATA Box
• Transcription of genes by RNA polymerase II is regulated by binding of multiple transcription factors to the transcription-control regions:
• Promoter region
• Other transcription control elements
• The TATA box, initiators and CpG islands function as promoters in eukaryotic DNA
• The TATA box is found 25-35 bp upstream of the transcription start site
Initiators and CpG Islands
• Some genes do not contain a TATA box. Instead they contain an initiator element (Inr) with the consensus sequence 5’-Y-Y-A(+1)-N-T/A-Y-Y-3’, where C is usually at the -1 position, A is at the transcription start site (+1), and Y is a pyrimidine (C or T)
• CpG islands: some genes contain a CG-rich stretch of 20 to 50 nucleotides within ~100 bp upstream of the start site. Since CG is statistically under-represented in the vertebrate genome, a CG-rich stretch (CpG island) upstream of the start site is a distinctly nonrandom distribution. Therefore, the presence of a CpG island in genomic DNA suggests that it may contain a transcription-initiation region
• At promoters that lack a TATA box, TBP does not bind to DNA directly. Instead it is recruited by an initiator-binding protein (IN) bound to the initiator element, and TFIIB and RNA polymerase II are then recruited. Hence TBP plays a central role in the formation of the transcription initiation complex
Assigned Reading: Weight matrix description of four eukaryotic RNA polymerase II promoter…..
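The statement that CG is under-represented can be made quantitative with an observed/expected CpG ratio. The formula used below, (number of CG × length) / (number of C × number of G), is a commonly used measure and is an assumption brought in for illustration; it is not taken from these slides. The function name is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for a DNA sequence.

    Expected CG count assumes C and G occur independently:
    expected = (count_C * count_G) / length.
    A ratio near 1 means CG is as frequent as chance predicts;
    a ratio well below 1 means CG is under-represented.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cg = seq.count("CG")  # the dinucleotide CG cannot overlap itself
    if c == 0 or g == 0:
        return 0.0
    return cg * n / (c * g)


print(cpg_obs_exp("CGCGCGCG"))  # CG-rich toy sequence
print(cpg_obs_exp("CCAGGTCAGG"))  # same bases present, no CG dinucleotide
```

Scanning a genomic region in windows and flagging windows with a high ratio is one simple way to look for candidate CpG islands upstream of a start site.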
The Role of TBP in Formation of Transcription Initiation Complex
Transcription of promoters by RNA polymerase II involves the recruitment of TBP (with its associated factors (X), forming the TFIID complex) to the promoter. This may occur by direct binding of TBP to the TATA box where this is present (a), or by protein-protein interaction with a factor (IN) bound to the initiator element (Inr) in promoters lacking a TATA box (b)
The Role of TBP in Transcription by RNA Polymerase I & III
• TBP is a basic transcription factor for transcription by RNA polymerases I and III, since it is one of the components of SL1 and TFIIIB
• TFIIIB is a complex of TBP and two other proteins, Bdp1 and Brf1
• Transcription of promoters by RNA polymerase III involves the recruitment of TBP (with its associated factors (Y), forming TFIIIB) to the promoter. This is achieved by protein-protein interactions with TFIIIA and TFIIIC, or with TFIIIC alone, in the case of promoters lacking a TATA box (as in a), or by direct binding to the TATA box (as in b)
Therefore, all three RNA polymerases require TBP to initiate formation of the transcription pre-initiation complex, which in turn recruits the RNA polymerase to the promoter to initiate transcription.
BTC: Basal Transcriptional Complex
Initially, it was believed that each organism would have only one TBP protein. It has now become clear that other TBP-like proteins exist in multicellular organisms. Multicellular organisms contain TBP, a TBP-like protein (TLF), and TRF1 (in insects) or TRF3 (in vertebrates).
Transcription of Protein Coding Genes by RNA Polymerase II
• The TFIID-DNA complex recruits TFIIA and TFIIB (as in b)
• Following this step, RNA polymerase II and TFIIF are recruited to form the pre-transcription initiation complex
• TFIIE and TFIIH are subsequently recruited to the complex to form the transcription initiation complex
• TFIIH contains a kinase activity that phosphorylates the C-terminal Tyr-Ser-Pro-Thr-Ser-Pro-Ser repeats of RPB1, the largest subunit of RNA polymerase II. This allows transcription to occur, and RNA polymerase II and TFIIF move along the DNA molecule to continue elongation of the RNA molecule
• Therefore, phosphorylation of RNA polymerase II is critical for transcription to produce an RNA product
Formation of Initiation Complex of Transcription of Protein Encoding Genes
• TFIID: contains TBP and many TAFs
• B: TFIIB
• F: TFIIF
• TFIIE
• TFIIH: contains protein kinase activity
• TFIIF contains two subunits: RAP74 (ATP-dependent helicase) and RAP38 (similar to the regions of bacterial σ factor that contact the core polymerase)
Pre-Initiation Complex of Transcription
• TFIIB helps position RNA polymerase II
• Other transcription factors bind to the complex in a defined order, extending the length of the protected region on DNA
• When RNA polymerase II binds to the complex, it initiates transcription
Modification of the RNA Polymerase II CTD Heptapeptide During Transcription
• Phosphorylation of the CTD is required for release of pol II from the promoter so that pol II can proceed with transcription
• Phosphorylation of serine 5 in the CTD by TFIIH is required for capping of the mRNA
• Phosphorylation of serine 2 in the CTD by P-TEFb can recruit SCAFs, which are essential for splicing and for polyadenylation of the mRNA
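The serine numbering refers to positions within the heptapeptide Tyr-Ser-Pro-Thr-Ser-Pro-Ser. A small sketch makes the mapping explicit. The names `HEPTAD`, `KINASE_TARGETS` and `target_positions` are hypothetical, and the three-repeat CTD is an arbitrary toy length, not the real repeat number.

```python
HEPTAD = "YSPTSPS"  # Tyr-Ser-Pro-Thr-Ser-Pro-Ser in one-letter code

# Heptad position (1-based) phosphorylated by each kinase, as stated above.
KINASE_TARGETS = {"TFIIH": 5, "P-TEFb": 2}


def target_positions(n_repeats: int, kinase: str):
    """Return the 1-based CTD residue numbers a kinase targets in n_repeats heptads."""
    pos_in_heptad = KINASE_TARGETS[kinase]
    ctd = HEPTAD * n_repeats
    sites = [repeat * len(HEPTAD) + pos_in_heptad for repeat in range(n_repeats)]
    # Sanity check: every targeted residue must be a serine.
    assert all(ctd[site - 1] == "S" for site in sites)
    return sites


print(target_positions(3, "TFIIH"))   # serine 5 of each repeat: [5, 12, 19]
print(target_positions(3, "P-TEFb"))  # serine 2 of each repeat: [2, 9, 16]
```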
Promoter Clearance and Elongation
• TFIIH has several activities: (i) ATPase, (ii) helicases of both polarities, (iii) kinase activity that phosphorylates the CTD tail of pol II (serine 5 of the heptapeptide)
• In addition, TFIIH may play a role in elongation. The interaction of TFIIH with DNA downstream of the start site is required for pol II to escape from the promoter
• TFIIH is also involved in repair of damage to DNA
• For pol II to move downstream from the start site, hydrolysis of ATP by TFIIE and melting of the supercoiled DNA by TFIIH (XPB subunit) are required
• For successful elongation of the initiated transcript, a kinase (P-TEFb) is required to phosphorylate serine 2 of the CTD
• The phosphorylation pattern of the CTD is dynamic during elongation and is controlled by multiple protein kinases and phosphatases. Transcription factors associate with pol II when the CTD is unphosphorylated and dissociate from pol II when the CTD is phosphorylated
Elongation of Transcription
• Following the initiation of transcription, synthesis of RNA continues for 20-30 bases and RNA polymerase II then pauses
• The pause is released when serine 2 of the C-terminal heptapeptide (Tyr-Ser-Pro-Thr-Ser-Pro-Ser) of RNA polymerase II is phosphorylated
• The phosphorylation of serine 2 is closely linked to the modification of the free 5’ end of the nascent RNA molecule by addition of a modified G nucleotide, a process known as capping. Capping promotes the binding of the P-TEFb (positive transcription elongation factor b) kinase, which phosphorylates serine 2 on RNA polymerase II, allowing elongation to proceed
Addition of Poly(A) Tail to the mRNA
The initial RNA transcript is cleaved downstream of the polyadenylation signal (AAUAAA), and a poly(A) tail is added to the free 3’ end. Hence, the 3’ end of the mature mRNA lies significantly upstream of the transcriptional termination site.
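Cleavage and polyadenylation can be sketched as a string operation on the primary transcript. Only the AAUAAA signal comes from the text above; the function name, the distance from the signal to the cleavage site (`offset`) and the tail length (`tail_len`) are illustrative assumptions with arbitrary default values.

```python
POLY_A_SIGNAL = "AAUAAA"


def cleave_and_polyadenylate(pre_mrna: str, offset: int = 15, tail_len: int = 20) -> str:
    """Cut the transcript downstream of the first AAUAAA and add a poly(A) tail.

    offset   : hypothetical number of bases kept after the signal
    tail_len : hypothetical number of A residues added to the new 3' end
    Transcripts without a signal are returned unchanged.
    """
    idx = pre_mrna.find(POLY_A_SIGNAL)
    if idx == -1:
        return pre_mrna
    cut = min(len(pre_mrna), idx + len(POLY_A_SIGNAL) + offset)
    return pre_mrna[:cut] + "A" * tail_len


# Toy transcript: everything beyond the cleavage site is discarded,
# so the mature 3' end lies upstream of where transcription stopped.
primary = "GGG" + POLY_A_SIGNAL + "C" * 30
print(cleave_and_polyadenylate(primary, offset=10, tail_len=5))
```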
Stages in the Transcriptional Process of Protein Coding Genes
• Initiation of transcription
• Stalling of transcription
• Elongation
• Termination and polyadenylation
Some protein coding genes contain a TATA box located approximately 30 bases upstream of the transcription start site. Other protein coding genes use an initiator (Inr) sequence with the consensus 5’-YCANTYY-3’, in which the A residue is the first base transcribed (Y = C or T; N = any nucleotide)
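The Inr consensus 5’-YCANTYY-3’ translates directly into a regular expression. The sketch below is a minimal illustration with a hypothetical function name and test sequence; it reports the 0-based index of the A residue, which the consensus places at +1. A match to the consensus only marks a candidate start site.

```python
import re

# Y = C or T, N = any nucleotide. The lookahead lets overlapping matches be found.
INR_PATTERN = re.compile(r"(?=([CT]CA[ACGT]T[CT][CT]))")


def find_inr_start_sites(seq: str):
    """Return 0-based indices of the A (+1) residue for every Inr consensus match."""
    # The A is the third base of the seven-base consensus, hence the +2.
    return [m.start() + 2 for m in INR_PATTERN.finditer(seq.upper())]


demo = "GGGTCAGTCCAAA"  # hypothetical sequence containing one Inr match (TCAGTCC)
sites = find_inr_start_sites(demo)
print(sites)                     # [5]
print([demo[i] for i in sites])  # ['A']
```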
Regulatory Elements of a Gene
Regulatory elements of a gene in eukaryotes are often many bases to kilobases away from the transcription start site of the gene
Promoter: a DNA sequence that specifies where RNA polymerase binds and initiates transcription of the gene
Transcription factors: protein factors necessary for transcription
Transcription factor binding sites: sites where transcription factors bind to regulate transcription. These binding sites are also called cis-acting elements and are usually located many bases upstream or downstream of the site of initiation of transcription (or promoter)
Transcription from a single promoter may be regulated by binding of multiple transcription factors to alternative cis-acting elements, permitting complex control of gene expression
The Promoter Structure of a Typical RNA Polymerase II Transcribed Gene
More precisely, the promoter region should be described as:
• Core or basal promoter (proximal promoter): serves to recruit the basal transcriptional complex to initiate basal transcription
• Upstream promoter elements (distal promoter or regulatory region): elements that regulate the rate of transcription
Identification of Promoter-Proximal Elements that Regulate Eukaryotic Genes
There are elements in the promoter-proximal regions that regulate the expression of genes. These elements are termed “promoter-proximal elements” or “promoter-proximal transcription regulatory elements”
How are “promoter-proximal elements” determined?
• 5’ deletion analysis, to determine the region that may contain the transcription regulatory site(s)
• Linker scanning mutations, to pinpoint the sequence with regulatory function. In this analysis, a series of constructs with contiguous overlapping mutations are assayed for their effect on expression of a reporter gene or production of a specific mRNA
The promoter-proximal regulatory region of the thymidine kinase (tk) gene was determined by this analysis
Reading List: An efficient protocol for linker scanning mutagenesis
Determination of Transcription Control Sequence
• TTR: transthyretin, a protein that transports thyroid hormone in the blood and in the cerebrospinal fluid that surrounds the brain and spinal cord
• TTR is expressed in hepatocytes and in the choroid plexus in the brain
• The control elements (cis-acting elements) that control transcription of the TTR gene are identified by the experiments outlined on the left of this slide
• Reporter gene: a gene that is used to report the activity of a promoter, e.g., green fluorescent protein gene (GFP), luciferase (lux), β-galactosidase (lacZ)
• Two sites upstream of the transcription start site are important (~ -2.1 to -1.8 kb and ~ -200 bp)
• Reading List:
• Deletion analysis identifies a region upstream of a gene that regulates the expression of the gene
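The reasoning behind 5’ deletion analysis can be sketched as follows: if reporter activity falls sharply when the 5’ end of the construct moves from one endpoint to the next, a control element probably lies between those two endpoints. The function name, the 50% threshold and all numbers below are invented for illustration; they are not measurements from the TTR experiments.

```python
def regions_with_control_elements(results, drop_fraction=0.5):
    """Return (longer_end, shorter_end) intervals where reporter activity drops.

    results       : list of (five_prime_end, reporter_activity) pairs, with
                    endpoints given as negative positions relative to +1
    drop_fraction : hypothetical threshold; an interval is reported when the
                    shorter construct keeps less than this fraction of the
                    activity of the next longer construct
    """
    ordered = sorted(results)  # most negative endpoint (longest construct) first
    regions = []
    for (end_a, act_a), (end_b, act_b) in zip(ordered, ordered[1:]):
        if act_a > 0 and act_b < act_a * drop_fraction:
            regions.append((end_a, end_b))
    return regions


# Invented reporter activities (arbitrary units) for a deletion series.
toy_series = [(-1000, 100), (-800, 100), (-600, 30), (-400, 30), (-200, 28), (-50, 3)]
print(regions_with_control_elements(toy_series))  # [(-800, -600), (-200, -50)]
```

Deletion analysis only narrows an element down to an interval; pinpointing it within that interval is the job of linker scanning mutations.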
Linker Scanning Mutation Analysis
• To pinpoint the exact location of a regulatory element, linker scanning mutation analysis should be conducted
Reading List: An efficient protocol for linker scanning mutagenesis…..
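The construct series used in linker scanning can be generated mechanically: each mutant replaces one window of the wild-type region with a linker of the same length, so spacing is preserved while the local sequence is scrambled. The sketch below is illustrative; the function name, the toy region and the linker sequence are hypothetical. With the default step the windows are contiguous, and a smaller step gives overlapping windows.

```python
def linker_scan(region: str, linker: str, step=None):
    """Return (start, mutant_sequence) pairs for a linker scanning series.

    Each mutant has len(linker) bases replaced by the linker, beginning at
    'start' (0-based). The mutant keeps the same total length as the region.
    step defaults to len(linker), giving contiguous, non-overlapping windows.
    """
    width = len(linker)
    step = step or width
    mutants = []
    for start in range(0, len(region) - width + 1, step):
        mutant = region[:start] + linker + region[start + width:]
        mutants.append((start, mutant))
    return mutants


# Toy region and hypothetical linker sequence.
for start, mutant in linker_scan("AAAAAAAA", "GG"):
    print(start, mutant)
```

Each mutant would then be placed upstream of a reporter gene; a mutant that loses reporter expression marks a window containing a regulatory element.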
Enhancers
• In eukaryotic promoters there is a CAAT box at -75 and a GGGCGG box at -90. These boxes increase the efficiency of the promoter, and they can function in either orientation
• Enhancer: a cis-acting element located at a distance from the start site of transcription that can influence gene expression in either orientation
• Example: a sequence at -100 upstream of the histone H2 gene that is essential for high-level expression of the histone H2 gene
• An enhancer can be (i) many bases upstream of the core promoter, (ii) many bases behind the transcription unit, or (iii) inside the transcription unit
• An enhancer works in either orientation
Interactions of Promoter Regulatory Elements and Regulatory Factors
• Promoter regulatory elements act by binding factors that affect chromatin structure and/or influence transcription directly
• As shown in (a), binding of the glucocorticoid receptor-hormone complex to the Glucocorticoid Response Element (GRE) results in displacement of a nucleosome and generates a DNase I-hypersensitive site, allowing easy access to the gene for transcription
• A second example: binding of HSF (heat shock factor) to the HSE (heat shock element) directly activates transcription
How could these factors be discovered?
• Chromatin immunoprecipitation
• Yeast two-hybrid system
Chromatin Immunoprecipitation
• This technique can be used to detect protein-DNA interactions in the native chromatin context in vivo. The associated DNA is purified and analyzed, either by identifying a specific sequence by PCR or by labeling the DNA and applying it to a tiling array to detect genome-wide interactions
• This assay can be used to test whether a specific gene interacts with a transcription factor
• Reading List: ChIP assay, an overview
Using Yeast Two-hybrid System to Detect cDNAs Encoding Interacting Proteins
• The yeast two-hybrid system exploits the flexibility in activator structures to identify genes whose products bind to a specific protein of interest
• For example, to identify the cDNA encoding the GH receptor, the strategy is:
• Bait hybrid: DNA-binding domain (which binds the UAS) + linker sequence + GH
• Fish hybrid: cDNA (fish domain) + linker + activation domain (which activates the HIS reporter gene)
• Introduce both the bait hybrid and the fish hybrid into yeast cells and screen for the expected phenotype