Prof. FATCHIYAH, M.Kes.PhD http://fatchiyah.lecture.ub.ac.id Protein engineering and recombinant protein expression
Outline
• Why bother with recombinant fusion proteins or protein engineering?
• Principle of recombinant protein expression
• Things to consider for recombinant protein expression
a. How to produce?
b. How to make a recombinant DNA expression construct?
c. Where to express?
d. Difficulties (protein expression problems)
Why bother with recombinant fusion proteins or protein engineering?
For different applications (specific expression scenarios): antibody production, biochemical experiments, structural biology, industrial use (protein biotechnology or protein engineering).
1. To minimize proteolysis.
2. For efficient and selective purification.
3. To optimize translation efficiency.
4.
Protein biotechnology or engineering
Definition: deliberate design and production of proteins with novel or altered structures and properties that are not found in natural proteins.
• To study protein structure and function
• Applications in industry (enzymes) and medicine (drugs)
New and improved proteins are always wanted. Example: proteins from extremophiles, which occur naturally at extreme temperatures, salt concentrations, or pH values, could be useful.
Applications
• Functional studies
• Enzymatic assays
• Protein-protein interactions
• Protein-ligand interactions
• Structural studies: protein crystallography and NMR structure determination
• Target proteins for rational drug design
• Therapeutic proteins: preclinical studies
Principle of recombinant protein expression
Stages shown in the slide diagram: bioinformatics; target identification and cloning; protein purification and production; protein expression test; applications.
Things to consider for recombinant protein expression
1. How to produce? Choose a protein expression system (vector and host).
2. How to make a recombinant DNA expression construct? Translational or transcriptional fusion; promoter choice (inducible or constitutive).
3. Where to express? Cytosol, periplasm, secretion, inclusion body.
4. Difficulties (protein expression problems).
Which host cell expression system?
• E. coli
• Yeast
• Insect cells
• Mammalian cells
• Cell-free
Choice of protein expression system
The KEY idea is that the cloned gene must be transcribed and translated as efficiently as possible.
Expression vector: MAXIMIZE GENE EXPRESSION.
Host: MINIMIZE TURNOVER OF GENE PRODUCTS (prevent proteolysis in vivo in E. coli) by using protease-deficient mutants as hosts.
• Lon: a major ATP-dependent protease in E. coli with broad specificity for unfolded or misfolded proteins in vivo. lon mutants are pleiotropic, with two main phenotypes: mucoidy and UV sensitivity.
• OmpT: an outer-membrane protease that cleaves at paired basic residues.
• DegP: a periplasmic protease that could inactivate some secreted proteins.
BL21(DE3) strain
• Deficient in the lon and ompT proteases
• Carries a lambda DE3 lysogen, which provides the lacI gene and the T7 RNA polymerase gene under control of the lacUV5 promoter
Increasing the selectivity of protein purification (gene fusion strategies)
Most target proteins lack a suitable affinity ligand that can be used for capture on a solid matrix. One way around this obstacle is to fuse the gene encoding the target protein to a gene encoding a purification tag. When the chimeric protein is expressed, the tag allows specific capture of the fusion protein. This permits the purification of virtually any protein without prior knowledge of its biochemical properties. (Hearn and Acosta, 2001)
Advantages and disadvantages of using tags in fusion proteins
Plus factors: (1) improve protein yield; (2) prevent proteolysis; (3) facilitate protein refolding; (4) protect the antigenicity of the fusion protein; (5) increase solubility; (6) increase the sensitivity of binding assays for tagged scFv.
Minus factors: (1) a change in protein conformation (solubility and activity); (2) lower protein yields (cleavage may not be complete); (3) inhibition of enzyme activity; (4) alteration in biological activity; (5) undesired flexibility in structural studies; (6) cleaving/removing the fusion partner requires an expensive protease (factor Xa, enterokinase); (7) toxicity.
Commonly used affinity tag systems in recombinant protein expression:
1. Expression and purification of maltose-binding protein fusions (provides a factor Xa cleavage site).
2. Expression and purification of glutathione S-transferase fusion proteins (contains either a thrombin cleavage site, a factor Xa cleavage site, or an Asp-Pro acid cleavage site).
3. Expression and purification of thioredoxin fusion proteins (provides an enterokinase cleavage site).
4. Expression and purification of 6xHis-tagged proteins.
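Because each fusion system above relies on a specific protease to remove the tag, it can help to check the target protein for internal copies of that protease's recognition site before choosing a vector. The sketch below does this with simple pattern matching. The recognition motifs are the commonly cited consensus sequences and are not taken from the slides; the function name and the example sequence are hypothetical.

```python
import re

# Commonly cited recognition motifs (assumptions; real proteases tolerate variants).
CLEAVAGE_SITES = {
    "factor Xa": r"I[ED]GR",    # cleaves after the Arg
    "enterokinase": r"DDDDK",   # cleaves after the Lys
    "thrombin": r"LVPRGS",      # cleaves between Arg and Gly
}

def internal_sites(protein, protease):
    """Return 0-based start positions of the protease motif in a protein sequence."""
    pattern = CLEAVAGE_SITES[protease]
    return [m.start() for m in re.finditer(pattern, protein.upper())]

# Hypothetical target sequence containing one internal factor Xa motif (IEGR).
target = "MKTAYIAKQRIEGRQISFVKSHFSRQ"
for protease in CLEAVAGE_SITES:
    print(protease, internal_sites(target, protease))
```

A non-empty result suggests that the protease would also cut inside the target, so a fusion system with a different cleavage site may be the safer choice.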
Affinity tags can be defined as exogenous amino acid (aa) sequences with a high affinity for a specific biological or chemical ligand.
Transcription vectors:
1. The vector contains its own promoter and terminator sequences for efficient transcription initiation and termination. There is no translation initiation site ahead of the cloning site, so one must be provided by the incoming cloned DNA.
2. A transcriptional fusion is a gene construct used to investigate the transcriptional activity of a gene of interest.
Translation vectors:
1. The vector contains a segment from a specific gene whose protein product is synthesized more rapidly than any other protein during transformation or infection. Target DNA is fused to either the 2nd or the 11th codon of gene 10.
2. A translational fusion carries the promoter of your gene and the sequence surrounding it (C-terminal) as well as the N-terminal sequence of your gene. The reporter gene is inserted in-frame between these two termini, so that one long protein product is made.
Architecture of reporter gene constructs: (A) transcriptional reporter, (B) translational reporter
Transcriptional reporters consist of a promoter fragment from a gene of interest driving GFP (Figure 1A). Typically, promoter fragments of a few kilobases immediately upstream of the start codon contain a significant portion of the cis-regulatory information necessary to provide a tentative expression pattern of the endogenous gene under study.
Translational reporters are in-frame gene fusions between GFP and a gene of interest (Figure 1B). Ideally, a translational reporter includes the entire genomic locus of a gene (5' upstream region, exons, introns, 3' UTR). GFP can be inserted at any point in the open reading frame, preferably at a site that does not disrupt protein function or topology.
Translational fusion
Assume the restriction site identified in the gene you want to express is a BamHI site. Digest with BamHI to obtain:
5'-GATCCXXXXXXXXXXXX-3'
3'-    GYYYYYYYYYYYY-5'
↓
Treat with Klenow fragment to fill in the unpaired bases to obtain:
5'-GATCCXXXXXXXXXXXX-3'
3'-CTAGGYYYYYYYYYYYY-5'
↓
Determine the proper reading frame of the gene. Assume the coding sequence of the filled-in fragment should read:
GA TCC XXX XXX XXX
↓
Determine which restriction endonuclease should be used to digest the expression vector pSKF301 to allow expression of the fusion protein. For this example, StuI is required, yielding:
ccatg gat cat atg tta aca gat atc aag gGA TCC XXX XXX
(lowercase: pSKF301 carrier sequence; uppercase: your fusion gene)
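The frame bookkeeping in this example can be checked mechanically. The following is a minimal sketch, assuming the carrier sequence quoted on the slide (read from its ATG to the blunt StuI end) and using N as a stand-in for the unspecified X bases; the function names are hypothetical and not part of any library.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}

def fill_in(top_strand):
    """Bottom strand (written 3'->5') after Klenow fill-in:
    the base-by-base complement of the full top strand."""
    return "".join(COMPLEMENT[base] for base in top_strand.upper())

def split_codons(seq):
    """Split a sequence into triplets, starting at its first base."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def is_in_frame(carrier_orf, insert_phase):
    """insert_phase is the number of insert bases that precede the first
    complete insert codon (2 for the reading 'GA TCC XXX')."""
    return (len(carrier_orf) + insert_phase) % 3 == 0

# Carrier ORF from the slide, from the ATG to the blunt StuI end.
carrier_orf = "atggatcatatgttaacagatatcaagg"
# Filled-in BamHI end; N stands in for the unspecified X bases.
insert = "GATCC" + "NNN" * 3

fusion = carrier_orf + insert
print(fill_in(insert))                            # bottom strand, 3'->5'
print(split_codons(fusion))                       # codons across the junction
print(is_in_frame(carrier_orf, insert_phase=2))   # frame check for this pairing
```

The carrier contributes one base to the junction codon and the insert contributes two (gGA), which is why a blunt StuI end suits an insert whose frame reads GA TCC XXX.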
Popular promoters for heterologous protein expression in E. coli
• Plac. Negatively regulated by lacI. Requires sufficient levels of repressor (lacIq and lacIq1 alleles on vectors). PlacUV5 is very popular because its regulation does not depend on CAP.
• Ptrp. Negatively regulated by trpR. Vectors containing this promoter can be transformed into any strain; induction by tryptophan starvation is easy. Not suitable for expression of proteins with high Trp content.
• Hybrid promoters: Ptac and Ptrc. Induced by IPTG; much stronger than Plac and Ptrp.
• PBAD. Induced by arabinose (Invitrogen).
• T7 system. Uses T7 promoters, which require T7 RNA polymerase. T7 RNA polymerase (encoded by T7 gene 1) has stringent specificity for its own promoters. It initiates and elongates chains 5 times faster than E. coli RNA polymerase and, unlike the E. coli enzyme, is resistant to rifampicin.
• pET series of vectors (Rosenberg et al., 1987, Gene). pET: plasmid for expression by T7 RNA polymerase. Commercially available from Novagen.
Where to express the recombinant proteins?
1. Direct expression (cytosol): the E. coli cytoplasm is a reducing environment, so it is difficult to ensure proper disulphide bond formation.
2. Fusion expression (inclusion body?): ensures good translation initiation. Can overcome insolubility and/or instability problems with small peptides. Has purification advantages based on affinity chromatography.
3. Secretion (periplasm or medium): a fusion alternative in which proteins are fused to peptides or proteins targeted for secretion. The periplasm offers a more oxidizing environment, where proteins tend to fold better. Major drawbacks: limited capacity for secretion (0.1-0.2% of total cell protein, compared with 10% produced intracellularly) and inability to carry out posttranslational modifications of proteins.
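The capacity figures quoted above (0.1-0.2% versus 10% of total cell protein) can be turned into a fold difference with a line of arithmetic. This is only a sketch of that calculation; the variable and function names are hypothetical.

```python
INTRACELLULAR_PCT = 10.0          # % of total cell protein, from the slide
SECRETED_PCT_RANGE = (0.1, 0.2)   # % of total cell protein, from the slide

def fold_difference(intracellular_pct, secreted_pct):
    """How many times more protein is made intracellularly than is secreted."""
    return intracellular_pct / secreted_pct

low, high = SECRETED_PCT_RANGE
best_case = fold_difference(INTRACELLULAR_PCT, high)
worst_case = fold_difference(INTRACELLULAR_PCT, low)
print(f"Intracellular yield is about {best_case:.0f}- to {worst_case:.0f}-fold higher")
```

On these figures, secretion trades roughly a 50- to 100-fold loss in yield for a better folding environment.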
General problems with heterologous gene expression
(a) Not enough protein is produced:
* codon usage preference (rare codons)
* potential mRNA secondary structure (5'-end AT content, 3'-end transcriptional terminator)
* toxic gene
(b) Enough protein is produced, but it is insoluble:
* vary the growth temperature
* change the fermentation medium
* use low-copy-number plasmids
* selection of promoter
The KEY idea is to slow down the rate of protein expression.
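Two of the low-yield causes listed under (a), rare codons and 5'-end AT content, can be screened for directly from the coding sequence. The sketch below is illustrative only: the rare-codon set is a commonly cited short list for E. coli and is an assumption here, as are the 30-base window, the function names, and the example ORF.

```python
# Codons often cited as rare in E. coli (illustrative assumption, not from the slides).
RARE_CODONS = {"AGA", "AGG", "ATA", "CTA", "CCC", "GGA"}

def rare_codon_positions(orf):
    """Return (codon_index, codon) for each rare codon in an in-frame ORF."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3          # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    return [(i, codon) for i, codon in enumerate(codons) if codon in RARE_CODONS]

def at_fraction(seq, window=30):
    """Fraction of A/T bases in the first `window` bases of a non-empty sequence."""
    head = seq.upper()[:window]
    return sum(base in "AT" for base in head) / len(head)

# Hypothetical ORF: ATG AGA AAA GGA CTA CCC TAA
orf = "ATGAGAAAAGGACTACCCTAA"
print(rare_codon_positions(orf))
print(round(at_fraction(orf), 2))
```

Clusters of rare codons, especially near the start of the ORF, are the usual candidates for recoding or for expression in a host supplemented with the matching tRNAs.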
OPTIMIZING TRANSCRIPTION OF THE CLONED GENE
1. Genetic fusion to strong promoters (transcriptional fusion).
2. Increased gene dosage (use the gene's own promoter with the gene on a high-copy plasmid).
3. Potential problems with toxic genes and available methods for efficient repression.
4. Solutions to potential problems with premature termination and mRNA instability.
OPTIMIZING TRANSLATION OF THE CLONED GENE
1. Sequence determinants for translation initiation (Shine-Dalgarno sequence).
2. Translational fusion vectors.
3. Potential problems with biased codon usage.
4. Enhancing the stability of protein products.
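For the first point, a quick check is whether a construct carries a Shine-Dalgarno motif at a sensible distance upstream of the start codon. The sketch below assumes the AGGAGG consensus and a 20-base search window; both are assumptions for illustration, and the function name and example sequence are hypothetical.

```python
SD_CONSENSUS = "AGGAGG"  # consensus used for this sketch (assumption)

def find_sd(dna, start_codon_index, max_upstream=20):
    """Search for the SD consensus within max_upstream bases 5' of the start codon.
    Returns (sd_position, spacer_length) or None if no motif is found."""
    seq = dna.upper()
    window_start = max(0, start_codon_index - max_upstream)
    upstream = seq[window_start:start_codon_index]
    pos = upstream.rfind(SD_CONSENSUS)        # motif closest to the start codon
    if pos == -1:
        return None
    sd_position = window_start + pos
    spacer = start_codon_index - (sd_position + len(SD_CONSENSUS))
    return sd_position, spacer

# Hypothetical 5' region: SD motif, six-base spacer, then ATG.
construct = "GCTAGGAGGTAACATATGAAA"
start = construct.find("ATG")
print(find_sd(construct, start))
```

Real ribosome binding sites often match only part of the consensus, so an exact-match search like this one is a lower bound on what the ribosome will accept.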
Insolubility of heterologous proteins produced in E. coli
Inclusion bodies: dense particles containing precipitated protein. Their formation depends on the protein synthesis rate and the growth conditions.
Advantages: resistant to proteolysis, high yield, relatively pure, easy to separate.
Disadvantages: the product is inactive and requires in vitro refolding and renaturation.
Refolding of recombinant proteins
Solubilisation: high temperature, detergents, and high concentrations of inorganic salts or organic solvents are all used. The most commonly used agents are organic solutes such as urea or guanidine-HCl, often in the presence of reducing agents (mercaptoethanol or DTT). Solubilized proteins can be purified by ion-exchange chromatography or other conventional methods prior to refolding.
Refolding: if no S-S bonds are present, remove the denaturing agent to allow the protein to fold correctly. If S-S bonds are present, their formation can be accomplished by air oxidation, catalysed by trace metal ions, or by a mixture of reduced and oxidized thiol compounds: oxidized DTT and reduced DTT; GSSG/GSH; cystine and cysteine; cystamine and cysteamine.