Intro • What causes gene products to be synthesized in some cells under some conditions, but not in others? A large part of research in molecular biology is aimed at trying to determine this. We are going to talk about a few basic systems, but a whole course could easily be devoted to the subject. • For most genes, the essential regulation point is transcription: whether the gene is transcribed or not. Regulation also occurs at other points: availability of the DNA to be transcribed at all, whether the mRNA is translated, stability of the mRNA, how quickly the protein is degraded, etc.
Gene Regulation in Prokaryotes • The first system of gene regulation that was understood was the lac operon in E. coli, worked out by Francois Jacob and Jacques Monod in 1962. Many other prokaryotic genes are regulated in a similar fashion, and the basic principles carry over into eukaryotes. • The lac operon codes for enzymes involved in the degradation of lactose. Lactose is a disaccharide that can be used as food in the absence of glucose. A lac- mutant is a chemoauxotroph that can’t use lactose.
Structure of the lac Operon • The lac operon consists of 3 protein-coding genes plus associated control regions. • The 3 genes are called z, y, and a. lacZ codes for the enzyme beta-galactosidase, which splits lactose into glucose plus galactose. lacY codes for a “permease” protein that allows lactose to enter the cell, and lacA codes for an enzyme that acetylates lactose. Together these three genes are called the “structural genes”. We will mainly focus on lacZ. • All 3 genes of the lac operon are transcribed on the same messenger RNA. Ribosomes translate the 3 proteins independently. This is a feature of prokaryotes that is only very rarely seen in eukaryotes, where 1 gene per mRNA is the rule.
Control Regions • Near the lac operon is another gene, called lacI, or just “i”. It codes for the lac repressor protein, which plays an essential role in lac operon control. The lac repressor gene is expressed “constitutively”, meaning that it is always on (but at a low level). It is a completely separate gene, producing a different mRNA than the lac operon. • Just upstream from the transcription start point in the lac operon are two regions called the operator (o) and the promoter (p). Neither region codes for protein: they act as binding sites on the DNA for important proteins. • The promoter is the site where RNA polymerase binds to start transcription. Promoters are found upstream from all protein-coding genes. • The operator is where the actual control occurs.
Control • The lac repressor protein (made by lacI) has 2 states: it can either bind to lactose (technically, to a lactose derivative called allolactose) or it can bind to the operator region of the lac operon. • In the presence of lactose, the repressor binds to it, and the repressor-lactose complexes float freely in the cytoplasm away from the DNA. In this situation, RNA polymerase can bind to the promoter, and the gene is transcribed. It makes beta-galactosidase which digests the lactose. • In the absence of lactose, the repressor binds to the operator DNA. The repressor is a large molecule, and when it is bound to the operator, RNA polymerase is blocked from reaching the promoter. The lac operon is not transcribed, and no beta-galactosidase is made. • If lactose appears, the operon is said to be “induced”. The lactose binds to the repressor, which then falls off the operator and allows transcription to occur.
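The two-state repressor logic on this slide can be written as a very small sketch. It assumes an all-or-none model with a wild-type repressor and operator; the function names are illustrative only and do not come from the lecture.

```python
def repressor_on_operator(lactose_present: bool) -> bool:
    """The wild-type repressor sits on the operator only when there is
    no lactose (allolactose) for it to bind instead."""
    return not lactose_present


def lac_transcribed(lactose_present: bool) -> bool:
    """RNA polymerase can reach the promoter only when the operator is free."""
    return not repressor_on_operator(lactose_present)


for lactose in (False, True):
    state = "ON (induced)" if lac_transcribed(lactose) else "OFF (repressed)"
    print(f"lactose present: {lactose} -> lac operon {state}")
```

This covers the repressor alone; mutant alleles and the effect of glucose are not modeled here.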
Genetic Analysis with Mutants • Jacob and Monod developed their model of lac regulation through the use of mutants. Later, biochemical techniques showed that their model was correct. • They isolated two main types of mutant: lac- mutants, which can’t use lactose as a food source, and “constitutive” mutants, in which the lac operon is always on regardless of external conditions. • They then tested these mutants alone and in combination with each other, with extensive use of merodiploids (partial diploids) to test for dominance. In particular, they created and used several F’ strains.
Constitutive Mutants • There are two ways of making a mutant strain where the lac operon is always on, regardless of whether lactose is present or not. • One way is an i- mutation: the lacI gene does not produce a functional repressor protein. Since there is no repressor to bind to the operator, RNA polymerase is never inhibited, and the lac operon is always transcribed. • i- mutants are recessive: an i+ / i- heterozygote has normal gene regulation, because the wild type allele produces a normal repressor.
Operator Mutants • The operator is a region of DNA upstream from the structural genes that binds the repressor protein. It doesn’t make a protein product. • Mutations in the operator can only affect the gene to which it is attached. Such mutants are said to act in cis, or to be “cis-dominant”. • In contrast, repressor mutants make a protein which can move freely through the cell to any copy of the gene. Repressor mutants are “trans-acting”. (This terminology comes from cis and trans in organic chemistry.) • Many operator mutants are constitutive, oc. The operator is mutated so that the repressor can no longer bind to it. Transcription occurs and the lac operon is on when no repressor is bound to the operator. • Demonstrating cis-dominance: oc z+ / o+ z- is always on: the normal lacZ allele is attached to the constitutive operator. In contrast, oc z- / o+ z+ has normal regulation. The constitutive operator is attached to a defective lacZ gene, so expression of this gene is not detected, while the normal operator is attached to a normal lacZ gene, giving normal gene regulation.
Lac- Mutants • Most mutations in the 3 structural genes of the lac operon, lacZ, lacY, and lacA, affect only that particular gene. Thus, a lacZ- mutant is usually lacY+ and lacA+. • However, mutations in the control regions usually affect all 3 genes simultaneously. • One control mutation that affects all 3 genes is the super-repressed mutation, iS. The repressor protein made by this mutant binds very tightly to the operator and does not bind to lactose. The effect is that the lac operon is always repressed, even when lactose is present. • iS mutants are dominant: an iS / i+ merodiploid shows the super-repressed phenotype: it is always off. A mixture of normal repressor proteins and super-repressor proteins will end up with the super-repressor sticking permanently to all copies of the operator that are present in the cell. In contrast, i- mutants are recessive.
Cis-Acting oS Mutants • The operator can also be mutated to a super-repressed state, in which the operator binds so tightly to the repressor that it never gets released. • oS mutants act in cis. For example, an oS z+ / o+ z- strain is always off, because the only functional lacZ gene is attached to the super-repressed operator. An oS z- / o+ z+ strain shows normal regulation.
Summary of Lac Control Mutants • lacI = repressor gene, makes repressor protein. Mutants act in trans. • i+ = normal regulation: ON in presence of lactose, OFF in absence of lactose. • i- = no repressor made, gene always ON, recessive. • is = super-repressor permanently bound to operator. Gene always OFF, dominant. • lacO = operator region, controls expression of the attached lacZ, Y and A genes. Mutants act in cis only. • o+ = normal expression of attached structural genes. • oc = constitutive, repressor unable to bind to operator. Lac genes in cis are always ON. • os = super-repressed operator. Normal repressor protein stays permanently bound. The attached lac genes are always OFF.
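The cis/trans rules summarized above can be turned into a small phenotype predictor for haploids and merodiploids. This is only a sketch of the slide's rules, not a biochemical model: the `Copy` type and function names are hypothetical, expression is treated as all-or-none, and one case the slides do not cover (an os operator in a cell with no repressor at all) is assumed here to be ON.

```python
from typing import NamedTuple


class Copy(NamedTuple):
    """One copy of the lac region (chromosome or F' plasmid)."""
    i: str  # lacI allele: "+", "-", or "s" (super-repressor)
    o: str  # operator allele: "+", "c" (constitutive), or "s" (super-repressed)
    z: str  # lacZ allele: "+" or "-"


def copy_expresses_lacz(copy: Copy, copies: tuple, lactose: bool) -> bool:
    """Does this copy make functional beta-galactosidase?"""
    if copy.z == "-":
        return False                 # defective lacZ: nothing detectable
    if copy.o == "c":
        return True                  # cis: no repressor can bind this operator
    # Repressor proteins act in trans: pool the lacI alleles of every copy.
    i_alleles = {c.i for c in copies}
    has_super = "s" in i_alleles
    has_normal = "+" in i_alleles
    if copy.o == "s":
        # ASSUMPTION: os is OFF whenever any repressor exists, ON if none does.
        return not (has_super or has_normal)
    if has_super:
        return False                 # super-repressor never lets go
    if has_normal:
        return lactose               # wild-type: induced by lactose
    return True                      # i- only: no repressor, always on


def lacz_phenotype(*copies: Copy) -> str:
    """Classify as 'normal', 'always on', or 'always off'."""
    without = any(copy_expresses_lacz(c, copies, False) for c in copies)
    with_lactose = any(copy_expresses_lacz(c, copies, True) for c in copies)
    if without and with_lactose:
        return "always on"
    if with_lactose:
        return "normal"
    return "always off"


# Cis-dominance examples taken from the operator-mutant slides:
print(lacz_phenotype(Copy("+", "c", "+"), Copy("+", "+", "-")))  # always on
print(lacz_phenotype(Copy("+", "c", "-"), Copy("+", "+", "+")))  # normal
print(lacz_phenotype(Copy("s", "+", "+"), Copy("+", "+", "+")))  # always off
```

Each genotype is passed as one `Copy` per chromosome or plasmid, so a haploid strain is a single argument.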
A Few Examples • What is the lacZ (beta-galactosidase) phenotype of these genotypes: normal expression, always on, or always off? • i+ oc z+ • i+ oc z- • is oc z+ • i+ os z+ / i+ oc z- • i- o+ z+ / i- oc z- • i- os z- / is oc z-
Negative and Positive Regulation • As described above, the lac operon is negatively regulated: the regulatory protein (repressor) causes transcription to stop. • Positive regulation, where the regulatory protein causes transcription to start, is more common. • The lac operon also contains an example of positive regulation, called “catabolite repression”. E. coli would prefer to use glucose as its food source. In the presence of glucose, the lac operon and other similar genes are turned off, even if lactose is present in the medium.
Catabolite Repression • Catabolite repression uses a regulatory protein called CAP (catabolite activator protein). It also uses the small molecule cyclic AMP (cAMP). • cAMP is made from ATP. When the glucose level in the cell is high, the cAMP level is low, because glucose inhibits synthesis of cAMP. When the glucose level is low, the cAMP level is high. • cAMP combines with the CAP protein to form a complex that binds to part of the lac operon promoter. This complex bends the DNA in a way that makes it much easier for RNA polymerase to bind to the promoter. This allows transcription to occur, but only if the lac repressor isn’t present. • Thus, low glucose levels cause high cAMP levels. When cAMP is high, it combines with CAP. The CAP-cAMP complex then binds to the promoter to allow transcription to occur. • This is positive regulation because the binding of CAP to the DNA causes transcription to occur.
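Combining the repressor (negative control) with CAP-cAMP (positive control) gives a two-input truth table. The sketch below treats both signals and the output as all-or-none, which is a simplification of this slide's description; `lac_operon_on` is an illustrative name, not part of the lecture.

```python
def lac_operon_on(glucose: bool, lactose: bool) -> bool:
    """Wild-type lac operon under both negative and positive control."""
    camp_high = not glucose            # glucose inhibits cAMP synthesis
    cap_bound = camp_high              # CAP binds the promoter only with cAMP
    repressor_bound = not lactose      # lactose pulls the repressor off the operator
    return cap_bound and not repressor_bound


print("glucose  lactose  lac operon")
for glucose in (True, False):
    for lactose in (True, False):
        state = "ON" if lac_operon_on(glucose, lactose) else "OFF"
        print(f"{glucose!s:<8} {lactose!s:<8} {state}")
```

In this model the operon is on in only one of the four conditions: lactose present and glucose absent.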
Eukaryotic Gene Regulation • Things are a bit more complex in higher eukaryotes (such as humans). • Major factors affecting gene expression: • Proteins and DNA sequences that affect binding of RNA polymerase to the promoter • Changes in chromatin structure • Alternate RNA splicing patterns • Regulation by small RNA molecules after transcription • Other forms of regulation also exist.
Eukaryotic Transcription Initiation • In eukaryotes, there are 3 RNA polymerases. However, all protein-coding genes are transcribed by RNA polymerase 2 (pol2). • Pol1 transcribes the ribosomal RNA genes • Pol3 transcribes transfer RNA genes • Pol2 binds to the “TATA” box, a region of about 8 bases which are mostly A and T. This is the equivalent of the promoter in prokaryotes. • Pol2 binding can only occur if several other proteins, the general transcription factors, have already bound to the DNA.
Activator Proteins, Response Elements and Enhancers • Just upstream from the TATA box there is a set of short DNA sequences (response elements) that regulate the time, tissue, and amount of transcription that occurs. • Response elements work in cis only • The order of response elements near a gene isn’t important • The proteins that bind to response elements, the activator proteins, influence the ability of RNA polymerase to start transcription. They are directly responsible for controlling the pattern of gene expression in higher organisms. • Activator proteins are trans-acting factors • There are also repressor proteins that reduce transcription. • Most response elements are directly upstream (say, within 200 bp) of the transcription start. • However, enhancers are groups of response elements found further away, either upstream or downstream from the gene. • Enhancers work the same way as regular response elements: they bind activator proteins that affect RNA polymerase binding. • Silencers are enhancers that repress transcription instead of increasing it.
Chromatin Structure • In eukaryotes, the DNA is organized into nucleosomes: about 200 bp of DNA wrapped around a protein core. • The protein core consists of 8 histone proteins • Histones are basic (i.e. alkaline): they contain positively charged amino acids that bind to the negative charges on the DNA (backbone phosphate groups). • DNA tightly wrapped around histones is inaccessible to RNA polymerase. • Thus, one important event in preparing a gene for transcription is “chromatin remodelling”: sliding the nucleosomes along the DNA to expose the promoter region.
Histone Acetylation • A second event needed for transcription affects large regions of the chromosome instead of individual genes. • DNA is normally tightly wrapped around the histones and is inaccessible to transcription factors. The structure can be loosened by acetylating the histones. • Acetyl groups are added to lysines, which removes their positive charge. The binding of the DNA to the histones is lessened, and the DNA structure opens up, allowing access to transcription factors. • Conversely, deacetylation tightens the chromatin structure, preventing transcription throughout that region of the chromosome.
Alternative RNA Transcription and Splicing • Many genes in eukaryotes contain introns. • This is especially true of large multicellular organisms like humans. • Introns are spliced out of the primary RNA transcript of the gene before it gets translated into protein. • Variant mRNAs (and resulting proteins) can be generated by skipping some introns, or by using a sequence as an intron in one cell type and as an exon in another cell type. • Alternative promoters or polyadenylation sites are also used to generate variants at the beginning and end of the mRNA (and protein). • Variant proteins are called isoforms.
Micro RNAs • Micro RNAs (miRNA) are the products of a new type of RNA-only (i.e. not translated into protein) genes. Their existence and significance have only been known since about 2000. • An example of “junk” DNA that was later found to have a function. • miRNAs regulate gene activity in the cytoplasm, by binding to messenger RNA molecules. • This causes the mRNA to be untranslatable, or to be degraded. • Different sets of miRNAs are expressed in different tissues. • miRNA genes are transcribed into an RNA molecule that spontaneously forms a hairpin. • After some RNA processing, the miRNA is joined to a protein complex called RISC. • When the miRNA in RISC binds to a cellular messenger RNA, RISC acts as an RNase to destroy the mRNA so it can’t be translated into protein.
Epigenetics • Epigenetics is the study of inherited changes caused by mechanisms other than changes in the DNA sequence. • This can occur between parent and offspring, or between cells within a single organism. • Within an organism, epigenetic changes are the main reason why it isn’t easy to take the nucleus from any random cell and use it to grow a whole new organism (i.e. reproductive cloning). • The molecular basis for (most) epigenetic mechanisms seems to be the methylation of cytosines in the DNA. The C must be followed by a G (CpG) for this to happen. • Methylation of C’s near the promoter region of a gene prevents transcription. This means a heavily methylated gene is permanently inactivated. • Each cell type and tissue has its own methylation pattern, keeping some genes functional and others permanently inactivated. This provides cells with "memory": after cell division, the daughter cells know what their type is.
More Epigenetics • How is the methylated state preserved? All of the cytosine methylations are renewed with every DNA replication: an enzyme called hemimethylase recognizes a 5-methyl C on the old strand, then methylates the corresponding C in the new strand. • Recall that the methylation sequence CpG is also CpG on the opposite strand. • The methylation pattern is (mostly) reset in the early embryo, allowing embryonic cells to develop into any cell type. • An example of epigenetic effects: a small deletion of part of the long arm of chromosome 15 results in Prader-Willi syndrome if the single remaining intact copy of chr 15 is inherited from the mother, and Angelman syndrome if the remaining intact copy is inherited from the father. • This chromosomal region has different methylation patterns depending on whether it comes from the sperm or the egg. • Prader-Willi is characterized by an uncontrollable appetite for food (among other things). • Angelman has been called the Happy Puppet syndrome: jerky movements and happy, laughing demeanor.
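The copying of methylation marks at replication can be sketched in a few lines. The example relies only on the two facts stated on this slide: CpG reads as CpG on the opposite strand, and a methylated C on the old strand directs methylation of the corresponding C on the new strand. The function names, the coordinate convention, and the example sequence are all assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def maintenance_methylation(old_strand: str, methylated_c: set) -> set:
    """Given 0-based positions of methylated C's on the old strand (read
    5' to 3'), return the old-strand positions whose partner base on the
    NEW strand gets methylated.

    In a CpG at positions (i, i+1), the new strand's C pairs with the
    old strand's G at i+1, so that is the position reported.
    """
    new_marks = set()
    for i in sorted(methylated_c):
        if old_strand[i:i + 2] != "CG":
            raise ValueError(f"position {i} is not the C of a CpG")
        new_marks.add(i + 1)
    return new_marks


old = "ATCGGACGTT"          # hypothetical sequence with CpGs at 2-3 and 6-7
print(reverse_complement("CG"))             # CG: the site is the same on both strands
print(maintenance_methylation(old, {2}))    # {3}: only the marked CpG is copied
```

The unmethylated CpG at positions 6-7 stays unmethylated on the new strand, which is how a cell's pattern of active and inactive genes could be passed to its daughter cells.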