Outline • Central Dogma: DNA → RNA → Protein • Gene transcription in prokaryotes. • Regulation of transcription in prokaryotes. • Gene transcription in eukaryotes. • Gene regulatory protein recognition of specific DNA sequences. • Processing eukaryotic transcripts and delivering them to the ribosomes for translation. • A proposal for a unified theory of gene expression.
Cells Contain Three Major Classes of RNA • mRNA, rRNA, tRNA and snRNA all participate in protein synthesis. • All of these RNAs are synthesized from DNA templates by DNA-dependent RNA polymerases in a process called transcription. • Only mRNAs direct the synthesis of proteins. • Transcription is tightly regulated in all cells. • Only 3% of genes in a typical eukaryotic cell are undergoing transcription at any given moment. • The metabolic conditions and growth status of the cell dictate which gene products are needed at any moment.
29.1 Gene Transcription in Prokaryotes • In prokaryotes, virtually all RNA is synthesized by a single species of DNA-dependent RNA polymerase. • RNA polymerases link NTPs (ATP, GTP, CTP, and UTP) in the order specified by base pairing with a DNA template. • The RNA polymerase moves along the DNA template strand in the 3'-5' direction and the RNA chain grows 5'-3' during transcription. • Subsequent hydrolysis of PPi to inorganic phosphate by pyrophosphatases makes the polymerase reaction thermodynamically favorable.
Identifying Transcription Start Sites Transcription is initiated in prokaryotes by an RNA polymerase holoenzyme. It has the subunit composition: α2ββ'σ. α = scaffold and regulation β = part of polymerase active site β' = binds to and unwinds DNA σ = binds to promoter for initiation. Note: there is a nonstoichiometric subunit, ω, of unknown function in the holoenzyme that our author does not discuss, making: α2ββ'ωσ.
Identifying Transcription Start Sites • The core polymerase is α2ββ' or α2ββ'ω. • The core polymerase (without σ) can transcribe DNA into RNA, but cannot initiate transcription. • Binding of the σ subunit allows the polymerase to recognize the DNA sequences that act as promoters. (E. coli has a number of different σ subunits, which recognize different promoters.) • Promoters are nucleotide sequences that identify the location of transcription start sites, where transcription begins. • RNA polymerases do not require a primer.
Conventions Used in Expressing the Sequences of Nucleic Acids and Proteins • Certain conventions are used in describing information transfer from DNA to protein: • The strand of duplex DNA that is read by RNA polymerase is termed the template strand. • The strand not read is the nontemplate strand. • The template is read by the RNA polymerase moving 3'-5' along the template DNA strand, so the RNA product, the transcript, grows in the 5'-3' direction. • In prokaryotes, polycistronic transcripts are common (1 promoter/several genes).
Conventions Used in Expressing the Sequences of Nucleic Acids and Proteins By convention, when the order of nucleotides in DNA is shown as a single strand, it is the 5'-3' sequence of the nontemplate strand that is shown; in dsDNA, it is the top strand.
Upstream                                            Downstream
5' ------ nontemplate strand (sense strand) ------ 3'
3' ------ template strand (antisense strand) ------ 5'
There is no position 0 in DNA numbering; the transcription start site is +1. The nontemplate strand is the coding strand. The RNA formed has the same sequence as the sense strand (with U in place of T) and is synthesized from the antisense strand.
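The strand conventions above can be illustrated with a minimal Python sketch. The function names and the example sequence are hypothetical and chosen only for illustration; the sketch assumes the template strand is supplied written 3'-5', so the transcript comes out written 5'-3'.

```python
# Illustrative sketch of the strand conventions; names and sequence are hypothetical.

DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (written 5'->3') made by reading the template strand 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5.upper())


def nontemplate_from_template(template_3to5: str) -> str:
    """Return the nontemplate (sense, coding) strand, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in template_3to5.upper())


template = "TACGGTCAT"  # hypothetical template strand, written 3'->5'
rna = transcribe(template)
sense = nontemplate_from_template(template)

# The transcript matches the sense strand except that U replaces T.
print(rna)    # AUGCCAGUA
print(sense)  # ATGCCAGTA
```

Comparing the two printed strings shows why the nontemplate strand is the one conventionally written out: it reads like the RNA.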
The Process of Transcription Has Four Stages Transcription can be divided into four stages: • Binding of RNA polymerase holoenzyme to template DNA at promoter sites. • Initiation of polymerization. • Chain elongation. • Chain termination. • These are the same steps that applied in replication.
Binding of Polymerase to Template DNA • The holoenzyme (α2ββ'σ) binds nonspecifically to DNA with low affinity and migrates downstream looking for a σ promoter region. Kd = 10⁻⁶ to 10⁻⁹ M. • The σ subunit recognizes the promoter sequence and locks on. The holoenzyme and promoter form a "closed promoter complex" (DNA is not unwound). Kd = 10⁻⁶ to 10⁻⁹ M. • Polymerase then unwinds about 12 base pairs in the -9 to +3 region to form an "open promoter complex". Kd = 10⁻¹⁴ M.
RNA Polymerase and DNA • α2ββ'σ + random DNA t1/2 = 3 sec • α2ββ'σ + open promoter t1/2 = 2-3 hours • α2ββ' + random DNA t1/2 = 60 min • RNA polymerase binding protects a nucleotide sequence spanning the region from -70 to +20. • Promoters recognized by the σ factor typically consist of a 40 bp region on the 5'-side of the transcription start site (+1).
Prokaryotic Promoter Regions Within the promoter are two consensus sequence elements: The -35 region, (consensus TTGACA). The σ subunit appears to bind here. The more the -35 region sequence corresponds to the consensus sequence of the σ subunit, the greater is the efficiency of gene transcription. The Pribnow box near -10, (consensus TATAAT). This region is ideal for unwinding. It is rich in A and T, which only form two H bonds per base pair. This is also called the TATA box.
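The idea that a promoter works better the more closely it matches the consensus can be sketched as a simple match count. This is only an illustration: the scoring function is an assumption of this sketch (a plain count of matching positions, not a measured transcription efficiency), and the example hexamers are hypothetical.

```python
# Illustrative sketch: count matches to the two consensus hexamers named above.
# The scoring scheme and example hexamers are hypothetical.

MINUS_35_CONSENSUS = "TTGACA"
PRIBNOW_CONSENSUS = "TATAAT"


def consensus_matches(hexamer: str, consensus: str) -> int:
    """Number of positions at which the hexamer equals the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def promoter_score(minus35: str, minus10: str) -> int:
    """Total matches (0-12) across the -35 region and the Pribnow box."""
    return (consensus_matches(minus35, MINUS_35_CONSENSUS)
            + consensus_matches(minus10, PRIBNOW_CONSENSUS))


# Hypothetical promoter elements, one mismatch at -35 and two at -10.
print(promoter_score("TTTACA", "TATGTT"))  # 9
print(promoter_score("TTGACA", "TATAAT"))  # 12, a perfect consensus match
```

A real promoter's strength also depends on features this count ignores, such as the spacing between the two elements.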
The Nucleotide Sequences of Representative E. coli Promoters Figure 29.4 Consensus sequences for the -35 region, the Pribnow box, and the initiation site are shown at the bottom. The numbers represent the percent occurrence of the indicated base. In this figure, sequences are aligned relative to the Pribnow box.
Initiation of Polymerization • RNA polymerase requires supercoiled dsDNA and Mg²⁺. It copies only one DNA strand. • RNA polymerase has two binding sites for NTPs. • The initiation site prefers to bind ATP and GTP (most RNAs begin with a purine at the 5'-end). • The elongation site binds the second incoming NTP. • The 3'-OH of the first NTP attacks the α-P of the second to form a new phosphoester bond (eliminating PPi). • Unwinding and synthesis of the first residues is slow. When a 6-10 unit oligonucleotide has been made, the sigma subunit dissociates, completing "initiation".
Prokaryotic Initiation and Elongation Figure 29.3 Sequence of events in the initiation and elongation phases of transcription as it occurs in prokaryotes.
Prokaryotic Initiation and Elongation Figure 29.3 Numbering in this region starts with the base at the transcription start site, which is designated +1.
Chain Elongation • The core polymerase (without σ) is the elongation enzyme. NusA protein comes in after σ dissociation to prevent early termination. • RNA polymerase is accurate: only about 1 error in 10⁴ to 10⁶ bases. • This error rate is acceptable, since many transcripts are made from each gene. • Elongation rate is 20-50 bases per second. It is slower in G/C-rich regions and faster in A/T-rich regions. • Topoisomerases precede and follow polymerase to relieve supercoiling so the bubble size is constant.
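To put the elongation rate and error rate in perspective, the arithmetic can be sketched as follows. The 3000-nucleotide transcript length is a hypothetical example value; the rates are the ranges quoted above.

```python
# Illustrative arithmetic using the elongation and error rates quoted above.
# The transcript length is a hypothetical example value.

def transcription_time_s(length_nt: int, rate_nt_per_s: float) -> float:
    """Seconds needed to synthesize a transcript at a constant elongation rate."""
    return length_nt / rate_nt_per_s


def expected_errors(length_nt: int, error_rate_per_nt: float) -> float:
    """Expected number of misincorporated bases in one transcript."""
    return length_nt * error_rate_per_nt


length = 3000  # nucleotides, hypothetical gene

slowest = transcription_time_s(length, 20)  # 150 s at 20 nt/s
fastest = transcription_time_s(length, 50)  # 60 s at 50 nt/s

worst_case = expected_errors(length, 1e-4)  # about 0.3 errors per transcript
best_case = expected_errors(length, 1e-6)   # about 0.003 errors per transcript

print(f"time: {fastest:.0f}-{slowest:.0f} s")
print(f"expected errors per transcript: {best_case:.3f}-{worst_case:.1f}")
```

Even at the high end of the error range, most copies of such a transcript would carry no error, which is consistent with the point that the error rate is tolerable when many transcripts are made per gene.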
Supercoiling Versus Transcription (a) If the RNA polymerase followed the template strand around the axis of the DNA duplex, no supercoiling of the DNA would occur but the RNA chain would be wrapped around the double helix once every 10 bp. This possibility seems unlikely because it would be difficult to untangle the transcript from the DNA duplex. (b) Alternatively, gyrases and topoisomerases lead and follow the bubble to remove the torsional stresses induced by transcription.
Chain Termination • Two types of transcription termination mechanisms operate in bacteria: • Rho termination factor (a protein): • Rho is an ATP-dependent helicase. • It binds at a specific recognition sequence in the transcript upstream of the termination site. • It then moves along the RNA transcript, finds the "transcription bubble", unwinds the RNA from the bubble and releases the RNA chain. • It is likely that the RNA polymerase stalls in a G:C-rich termination region, allowing rho factor to overtake it.
Intrinsic Termination • Intrinsic termination (hairpin): • In this case, termination is determined by specific sequences (termination sites) in the DNA. • Termination sites consist of 3 structural features. • Inverted repeats, rich in G:C, which form a stable stem-loop structure (hairpin) in the RNA transcript. • A nonrepeating segment that punctuates the inverted repeats. • A run of 6-8 A in the DNA template, coding for U in the transcript.
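The three structural features listed above can be turned into a naive pattern search over a transcript. This is a rough sketch, not a validated terminator predictor: the function name, the fixed stem length, the loop-length range, the G:C threshold, and the example sequence are all assumptions made for illustration.

```python
# Naive sketch of an intrinsic-terminator search; all thresholds are hypothetical.

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def find_terminator(rna, stem=6, min_loop=3, max_loop=8, min_u=6, min_gc=0.6):
    """Return (stem_start, loop_length) of the first hairpin + U-run, or None."""
    rna = rna.upper()
    for i in range(len(rna)):
        left = rna[i:i + stem]
        if len(left) < stem:
            break
        # Feature 1: the inverted repeat should be rich in G:C.
        gc_fraction = sum(base in "GC" for base in left) / stem
        if gc_fraction < min_gc:
            continue
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the right arm, after the loop
            right = rna[j:j + stem]
            # Features 1 and 2: right arm pairs with left arm across the loop.
            if right != reverse_complement(left):
                continue
            # Feature 3: a run of U immediately after the hairpin.
            if rna[j + stem:j + stem + min_u] == "U" * min_u:
                return i, loop
    return None


# Hypothetical transcript: G:C-rich stem, 4-nt loop, then eight U residues.
example = "CC" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUUUU"
print(find_terminator(example))  # (2, 4)
```

Real terminators tolerate mismatches and variable stem lengths, so a practical search would score candidate hairpins rather than require an exact inverted repeat.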
Intrinsic Termination Figure 29.7 Transcription termination by rho factor. The transcript forms a hairpin structure using the GC rich inverted repeat which gives tight binding. The short polyAT segment at the end of the transcription unit produces polyU in the transcript. These are loosely held and the transcript dissociates. NusA assists in pausing at the termination site.
29.2 Regulation of Transcription in Prokaryotes • Operon: A segment of DNA that is transcribed as a single mRNA strand (which may be polycistronic) and that includes the promoter and operator. An operon is also called a transcription unit. • Promoter: Region of DNA where initiation occurs. Unique for a given transcription unit. • Operator: A DNA sequence close to the promoter that regulates the start of transcription. • Regulatory proteins work with operators to control transcription of the genes.
The General Organization of Operons Figure 29.8 Operons consist of transcriptional control regions and a set of related structural genes, all organized in a contiguous linear array along the chromosome. The transcriptional control regions are the promoter and the operator, which lie next to, or overlap, each other, upstream from the structural genes they control. Operators may lie at various positions relative to the promoter, either upstream or downstream. Expression of the operon is determined by access of RNA polymerase to the promoter, and occupancy of the operator by regulatory proteins influences this access. Induction activates transcription from the promoter; repression prevents it.
Transcription of Operons is Controlled by Induction and Repression • Increased synthesis of enzymes in response to the presence of a metabolite is induction. • Decreased synthesis in response to a metabolite is repression. • Some substrates induce enzyme synthesis even though the enzymes can’t metabolize the substrate - these are gratuitous inducers - such as IPTG (isopropyl β-thiogalactoside).
IPTG is a Gratuitous Inducer Figure 29.10 The structure of IPTG (isopropyl β-thiogalactoside).
Lactose is an Inducer of the lac Operon Figure 29.9 The structure of lactose, a β-galactoside. Metabolism of lactose depends on hydrolysis into its component sugars, glucose and galactose, by the enzyme β-galactosidase. Lactose availability induces the synthesis of this enzyme by activating transcription of the lac operon.
The lac Operon is a Paradigm of Operons • lacI mutants express the genes needed for lactose metabolism constitutively, whether or not an inducer is present. • The structural genes of the lac operon are controlled by negative regulation. • The lacI gene product is the lac repressor, a tetrameric protein. • The lac operator is a palindromic DNA segment. • lac repressor: a tetramer that has a DNA-binding domain at the N-terminus; the C-terminus binds inducer.
The lac Operon Figure 29.11 The operon consists of two transcription units. In one unit, there are three structural genes, lacZ, lacY, and lacA, under control of the promoter, plac, and the operator O. In the other unit, there is a regulator gene, lacI, with its own promoter, placI.
The Mode of Action of lac Repressor Figure 29.12 The structure of the lac repressor tetramer. Lactose is the substrate for the proteins encoded by lacZ, lacY and lacA. Isopropyl β-thiogalactoside is an in vitro inducer. Allolactose is the in vivo inducer.
The Mode of Action of lac Repressor Figure 29.12 The structure of the lac repressor tetramer, with bound IPTG (purple) is also shown.
Nucleotide Sequence of the lac Operator Figure 29.13 This sequence comprises 36 bp showing nearly palindromic symmetry. The inverted repeats that constitute this approximate twofold symmetry are shaded in rose. The bases are numbered relative to the +1 transcription start site. The G:C base pair at position +11 represents the axis of symmetry. In vitro studies show that bound lac repressor protects a 26 bp region from -5 to +21 against nuclease digestion. Bases that interact with bound lac repressor are indicated below the operator.
Lac Repressor Is a Negative Regulator of the lac Operon * Kb = binding constant. †ratio of the two values.
Catabolite Activator Protein Provides Positive Control of the lac Operon • Some promoters require an accessory protein to speed transcription. • Catabolite activator protein or CAP is one such protein, a dimer of 22.5 kD peptides. • N-terminus binds cAMP; C-terminus binds DNA. • Binding of CAP-(cAMP)2 to DNA assists formation of closed promoter complex. • Catabolite repression is a global cell control based on a favored substrate. It ensures that the operons necessary for metabolism of alternative energy sources (the lac and gal operons) remain repressed until the supply of glucose is exhausted.
The Mechanism of Catabolite Repression and CAP Action Figure 29.14 The mechanism of catabolite repression and CAP action. Glucose promotes catabolite repression by lowering cAMP levels through control of phosphorylation/ dephosphorylation. cAMP is necessary for CAP binding near promoters of operons whose gene products are involved in the metabolism of alternative energy sources such as lactose, galactose, and arabinose.
The Mechanism of Catabolite Repression and CAP Action Figure 29.14 The mechanism of catabolite repression and CAP action. The binding sites for the CAP-(cAMP)2 complex are consensus DNA sequences containing the conserved pentamer TGTGA and a less well conserved inverted repeat, TCANA (where N is any nucleotide).
Summary of Control of the lac Operon
Glucose | Repression/Activation | Lactose | LR/I | Transcription
+       | CR                    | -       | LR   | None
+       | CR                    | +       | I    | Slow
-       | CAP                   | -       | LR   | None
-       | CAP                   | +       | I    | Rapid
Catabolite repression overrides inducer but CAP:cAMP does not override LR.
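The four rows of this summary can be expressed as a small decision function. This is a sketch of the logic only: the function name and the boolean inputs are hypothetical, and the output labels simply repeat the transcription column of the summary.

```python
# Illustrative sketch of the lac operon control logic summarized above.
# Function name and boolean inputs are hypothetical simplifications.

def lac_transcription(glucose: bool, lactose: bool) -> str:
    """Return the qualitative transcription level of the lac operon."""
    if not lactose:
        # No inducer: lac repressor stays bound to the operator,
        # and CAP-(cAMP)2 cannot override it.
        return "none"
    if glucose:
        # Inducer releases the repressor, but glucose keeps cAMP low,
        # so CAP does not activate the promoter (catabolite repression).
        return "slow"
    # Inducer present and glucose absent: repressor released, CAP active.
    return "rapid"


for glucose in (True, False):
    for lactose in (False, True):
        level = lac_transcription(glucose, lactose)
        print(f"glucose={glucose!s:5} lactose={lactose!s:5} -> {level}")
```

Treating the inputs as booleans hides the graded nature of the real system, but it makes clear that rapid transcription requires both conditions at once.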
Negative and Positive Control Systems are Fundamentally Different Negative and positive control systems operate in fundamentally different ways. Genes under negative control are transcribed unless they are turned off by the presence of a repressor protein. Often, transcription activation is merely the release from negative control. In contrast, genes under positive control are expressed only in presence of an active regulator protein.
Negative and Positive Control Systems are Fundamentally Different Figure 29.16 Control circuits governing the expression of genes.
Attenuation is a Prokaryotic Mechanism for Post-Transcriptional Regulation of Expression • In addition to repression, expression of the trp operon is controlled by transcription attenuation. • Unlike the mechanisms discussed thus far, attenuation regulates transcription after it has begun and is coordinated with translation. • Attenuation is any regulatory mechanism that manipulates transcription termination or transcription pausing to regulate gene transcription downstream. • In prokaryotes, transcription and translation are coupled, and the translating ribosome is affected by the formation and persistence of secondary structure in the mRNA.
DNA:Protein and Protein:Protein Interactions are Essential to Transcription Regulation • DNA:protein interactions are a central feature in transcriptional control. • The DNA sites where regulatory proteins bind commonly display at least partial dyad symmetry or inverted repeats. • DNA-binding proteins themselves are generally even-numbered oligomers (dimers, tetramers, etc.) that have innate twofold rotational symmetry. • Protein:protein interactions are an essential component of transcriptional activation. • Proteins that activate transcription work through protein:protein contacts with RNA polymerase.
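Partial dyad symmetry can be quantified by comparing a site with its own reverse complement. The sketch below is illustrative only: the function names are hypothetical, the example sites are made-up sequences rather than real operators, and the "fraction of matching positions" measure is an assumption of this sketch.

```python
# Illustrative sketch: measure dyad symmetry of a DNA site.
# Function names and example sites are hypothetical.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def dyad_symmetry_fraction(site: str) -> float:
    """Fraction of positions where the site equals its reverse complement.

    1.0 means a perfect palindrome (perfect inverted repeat with no spacer).
    For an odd-length site, the central base can never match, because a base
    is never its own complement, so the maximum is below 1.0.
    """
    site = site.upper()
    rc = reverse_complement(site)
    return sum(a == b for a, b in zip(site, rc)) / len(site)


print(dyad_symmetry_fraction("TGTGAGCGCTCACA"))  # 1.0, perfect dyad symmetry
print(dyad_symmetry_fraction("GAATTA"))          # 4 of 6 positions match
```

A site with perfect symmetry presents the same sequence to each subunit of a dimeric or tetrameric protein, which fits the observation that such proteins have twofold rotational symmetry.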
DNA Looping Allows Multiple DNA-Binding Proteins to Interact With One Another • Because transcription must respond to a variety of regulatory signals, multiple proteins are essential for appropriate regulation of gene expression. • These regulatory proteins are the sensors of cellular circumstances. • They communicate this information to the genome by binding at specific nucleotide sequences. • But DNA is a one-dimensional polymer, with limited space for proteins to bind. • DNA looping permits additional proteins to convene at the initiation site and to exert their influence on creating and activating the initiation complex.
DNA Looping Allows Multiple DNA-Binding Proteins to Interact With One Another Figure 29.22 Formation of a DNA loop delivers DNA-bound transcriptional activator to RNA polymerase positioned at the promoter. Protein: protein interactions between the transcriptional activator and RNA polymerase activate transcription.
29.3 Gene Transcription in Eukaryotes There are three classes of RNA polymerases (I, II and III) which transcribe rRNA, mRNA and tRNA genes, respectively. Pol I is in the nucleolus and transcribes rRNA genes. Pol II is in the nucleoplasm and makes hnRNA (pre-mRNA) for proteins and some snRNA. Pol III is in the nucleoplasm and makes tRNA, 5S rRNA, U6 snRNA and some others. All 3 are large, multimeric proteins (500-700 kD).
29.3 Gene Transcription in Eukaryotes All have 2 large subunits with sequences similar to β and β' in E. coli RNA polymerase, so the catalytic site may be conserved. All three need transcription factors. These are different except for TATA binding protein (TBP). Pol II is most sensitive to α-amanitin, an octapeptide from Amanita phalloides ("destroying angel mushroom"). Pol III is less sensitive to α-amanitin. Pol I is insensitive to the toxin.
Sensitivity to α-Amanitin Distinguishes the Three Classes Figure 29.23 The structure of α-amanitin, one of a series of toxic compounds known as amatoxins that are found in the mushroom Amanita phalloides.
29.3 Gene Transcription in Eukaryotes • With three categories of polymerases acting on three sets of genes for three RNAs, there are also at least three categories of promoters that are used to maintain specificity. • Eukaryotic promoters are different from prokaryotic promoters. • All three eukaryotic RNA polymerases interact with their promoters via transcription factors. • Transcription factors are DNA-binding proteins that recognize and accurately initiate transcription at specific promoter sequences.
RNA Polymerase II Transcribes Protein-Coding Genes • RNA Pol II must be capable of transcribing a great diversity of genes, but must also function at any moment only on the genes whose products are appropriate to the needs of the cell. • The RNA Pol II enzymes from yeast and humans are homologous. The structure of RNA Pol II from yeast is known and consists of 12 polypeptides. • The 12 subunits of yeast RNA Pol II (RPB1 - RPB12) are listed in Table 29.2. • RNA polymerases adopt a claw-like structure, to grasp the DNA duplex.
RNA Polymerase II Transcribes Protein-Coding Genes • The CTD of Pol II (RPB1) contains many repeats of the heptad sequence: YSPTSPS • CTD = carboxy terminal domain • NTD = amino terminal domain • This sequence in the CTD has many OH groups (on Tyr, Ser, and Thr side chains), which are potential phosphorylation sites. Only RNA Pol II whose CTD is NOT phosphorylated can initiate transcription. • The TATA box (TATAAA) is a consensus promoter element.