Molecular Biology Problem Solver: A Laboratory Guide. Edited by Alan S. Gerstein
Copyright © 2001 by Wiley-Liss, Inc. ISBNs: 0-471-37972-7 (Paper); 0-471-22390-5 (Electronic)

15
E. coli Expression Systems
Peter A. Bell

Expression Vector Structure
  What Makes a Plasmid an Expression Vector?
  Is a Stronger Promoter Always Desirable?
  Why Do Promoters Leak and What Can You Do about It?
  What Factors Affect the Level of Translation?
  What Can Affect the Stability of the Protein in the Cell?
Which Protein Expression System Suits Your Needs?
  Track Record
  What Do You Know about the Gene to Be Expressed?
  What Do You Know about Your Protein?
  Advertisements for Commercial Expression Vectors Are Very Promising. What Levels of Expression Should You Expect?
  Which E. coli Strain Will Provide Maximal Expression for Your Clone?
  Why Should You Select a Fusion System?
  When Should You Avoid a Fusion System?
  Is It Necessary to Cleave the Tag off the Fusion Protein?
  Will Extra Amino Acid Residues Affect Your Protein of Interest after Digestion?
Working with Expression Systems
  What Are the Options for Cloning a Gene for Expression?
  Is Screening Necessary Prior to Expression?
  What Aspects of Growth and Induction Are Critical to Success?
  What Are the Options for Lysing Cells?
Troubleshooting
  No Expression of the Protein
  The Protein Is Expressed, but It Is Not the Expected Size Based on Electrophoretic Analysis
  The Protein Is Insoluble, Now What?
  Solubility Is Essential. What Are Your Options?
  The Protein Is Made, but Very Little Is Full-Length; Most of It Is Cleaved to Smaller Fragments
  Your Fusion Protein Won't Bind to Its Affinity Resin
  Your Fusion Protein Won't Digest
  Cleavage of the Fusion Protein with a Protease Produced Several Extra Bands
  Extra Protein Bands Are Observed after Affinity Purification
  Must the Protease Be Removed after Digestion of the Fusion Protein?
Bibliography

Over the past decade the variety of hosts and vector systems for recombinant protein expression has increased dramatically. Researchers now select from among mammalian, insect, yeast, and prokaryotic hosts, and the number of vectors available for use in these organisms continues to grow. With the increased availability of cDNAs and protein-coding sequence information, it is certain that these and other, yet-to-be-developed systems will be important in the future.

Despite the development of eukaryotic systems, E. coli remains the most widely used host for recombinant protein expression. E. coli is easy to transform, grows quickly in simple media, and requires inexpensive equipment for growth and storage. In most cases, E. coli can be made to produce adequate amounts of protein suitable for the intended application. The purpose of this chapter is to guide the user in selecting the appropriate host and troubleshooting the process of recombinant protein expression.

EXPRESSION VECTOR STRUCTURE

What Makes a Plasmid an Expression Vector?

Vectors for expression in E. coli contain, at a minimum, the following elements:
• A transcriptional promoter.
• A ribosome binding site.
• A translation initiation site.
• A selective marker (e.g., antibiotic resistance).
• An origin of replication.

In general, anything that affects these elements can affect the level of protein expression. At a minimum, transcription promoters in E. coli consist of two DNA hexamers located at -35 and -10 relative to the transcriptional start site. Together these elements mediate binding of the approximately 500 kDa multimeric RNA polymerase complex. Suppliers of expression vectors have selected highly active, inducible promoter sequences, and there is usually little need for concern about the promoter until a problem is encountered in expression. The commonly used promoters and their regulation are listed in Table 15.1.

Is a Stronger Promoter Always Desirable?

A strong promoter may not be best for all situations. Overproduction of RNA may saturate the translation machinery, so maximizing RNA synthesis may be neither desirable nor necessary. A weaker promoter may actually give higher steady-state levels of soluble, intact protein than one that is rapidly induced.

Table 15.1 Characteristics of Popular Prokaryotic Promoters

Promoter    Source                                   Regulation/Inducer (Concentration)       Strength
LacUV5      Lactose operon                           lacI/IPTG (0.1–1 mM)                     Strong
Trp         Tryptophan operon                        trpR/3-beta-indoleacrylic acid           Strong
Tac         Hybrid of -35 Trp and -10 lac promoter   lacI/IPTG (0.1–1 mM)                     Strong
PL          Phage lambda                             Lambda cI repressor/heat                 Strong
Phage T5    T5 phage                                 lacI (2 operators)/IPTG (0.1–1 mM)       Strong
Arabinose   Arabinose operon                         AraBAD/arabinose (1 mM–10 mM)            Variable
T7          T7 phage RNA polymerase                  lacI/IPTG (0.1–1 mM)                     Very strong
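
The two-hexamer promoter architecture described above can be illustrated with a short scan for paired -35 and -10 elements. This is only a sketch under stated assumptions: the consensus hexamers (TTGACA and TATAAT), the 16 to 18 bp spacer window, and the mismatch tolerance are not given in this chapter and are supplied here for illustration, and the function name is hypothetical. Real promoters often deviate from consensus, so a hit or a miss from this scan is a hint, not a prediction of promoter strength.

```python
# Assumed consensus hexamers for the major E. coli promoter class
# (not stated in the chapter; supplied for illustration only).
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def find_promoter_candidates(seq, max_mismatches=2, spacers=(16, 17, 18)):
    """Return (pos35, pos10, spacer, total_mismatches) tuples.

    pos35 and pos10 are 0-based start indices of the two hexamers.
    The spacer window and mismatch tolerance are illustrative defaults.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatches:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            if j + 6 > len(seq):
                continue
            m10 = mismatches(seq[j:j + 6], MINUS_10)
            if m10 <= max_mismatches:
                hits.append((i, j, spacer, m35 + m10))
    return hits


if __name__ == "__main__":
    # Synthetic test sequence: perfect hexamers separated by 17 bp.
    demo = "GG" + MINUS_35 + "A" * 17 + MINUS_10 + "GG"
    for hit in find_promoter_candidates(demo):
        print(hit)
```

A scan like this is mainly useful for spotting promoter-like sequences inside an insert, which could contribute to unexpected background transcription.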
Why Do Promoters Leak and What Can You Do about It?

Most promoters have some background activity. Promoters regulated by the lactose operator/repressor will drive a small amount of transcription in the absence of added inducer (e.g., IPTG). To minimize this leakage, 10% glucose can be added to the medium to repress the lactose induction pathway, the growth temperature can be reduced to 15 to 30°C, and a minimal medium that contains no trace amounts of lactose can be used. Promoter leakage is only a problem when the expressed protein is highly toxic to the cells.

The tightly regulated T7 promoter has very low background because little T7 RNA polymerase is made in the absence of inducer (in specifically engineered host cells such as BL21(DE3)/pLysS). It has been estimated that the fold induction of transcription in the T7-driven pET vector system is greater than 1000, while the induction obtained with lac repressor-regulated promoters is generally about 50-fold.

What Factors Affect the Level of Translation?

Translation can be affected by the nucleotides adjacent to the ATG initiator codon, by the amino acid residue immediately following the initiator, and by secondary structures in the vicinity of the start site. Most commercially available expression vectors use optimal ATG and Shine-Dalgarno sequences. Secondary structures in the mRNA contributed by the gene of interest can prevent ribosome binding (Tessier et al., 1984; Looman et al., 1986; Lee et al., 1987). In addition, the downstream box AAUCACAAAGUG, found after the initiator codon in many bacterial genes, can enhance translation initiation. Converting the amino-terminal sequence of the gene of interest to one that comes close to this consensus may improve the rate of translation of the mRNA (Etchegaray and Inouye, 1999).

What Can Affect the Stability of the Protein in the Cell?

One of the first steps in protein degradation in E. coli is the removal of the amino-terminal methionine residue. This reaction, catalyzed by methionyl aminopeptidase, occurs more slowly when the amino acid in the +2 position has a larger side chain (Hirel et al., 1989; Lathrop et al., 1992). When the methionine residue is intact, the protein will be stable to all but endopeptidase cleavage. Tobias et al. (1991) have determined the relationship between a protein's amino-terminal amino acid
and its stability in bacteria, that is, the "N-end rule." They reported protein half-lives of only 2 minutes when the following amino acids were present at the amino terminus: Arg, Lys, Phe, Leu, Trp, and Tyr. In contrast, all other amino acids conferred half-lives of >10 hours when present at the amino terminus of the protein examined. This suggests that one should examine the residue in the +2 position of the sequence to be expressed. If the residue is among those that destabilize the protein, it may be worth the effort to change it to one that confers stability.

WHICH PROTEIN EXPRESSION SYSTEM SUITS YOUR NEEDS?

Track Record

What systems are currently used in the laboratory or by others in the field? If the protein-coding sequence of interest is well characterized, and the protein or its close relatives have been expressed successfully by others in the field, it is wise to try the same expression system. Go with what has worked in the past. If nothing else, results obtained using the familiar system will serve as a starting point. As an example, most of the recombinant expression of mammalian Src homology 2 (SH2) protein interaction domains has been done using the pGEX vector series, and similar examples of preferred systems are found in other fields of research.

If little is known about the protein to be expressed, it is best to take stock of what information there is before entering the lab. Before beginning any experimentation, it is wise to answer the following questions.

What Do You Know about the Gene to Be Expressed?

Source

In general, simple globular proteins from prokaryotic and eukaryotic sources are good candidates for expression in E. coli. Monomeric proteins with few cysteines or prosthetic groups (e.g., heme and metals) and of average size (<60 kDa) will likely give good production. Secreted eukaryotic proteins and membrane-bound proteins, especially those with several transmembrane domains, are likely to be problematic in E. coli. Solubility of recombinant proteins in E. coli can also be estimated by a mathematical analysis of the amino acid sequence (Wilkinson and Harrison, 1991).
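
The sequence checks suggested above (the residue in the +2 position, protein size, and cysteine content) can be sketched as a quick pre-lab screen. This is a minimal illustration, not the solubility model of Wilkinson and Harrison, which is not reproduced here. The destabilizing residues and the 60 kDa figure come from the text; the function name and the average residue mass of roughly 110 Da used to approximate molecular weight are assumptions made for this example.

```python
# Residues reported in the text (Tobias et al., 1991) to give half-lives
# of about 2 minutes when exposed at the amino terminus.
DESTABILIZING = {"R": "Arg", "K": "Lys", "F": "Phe",
                 "L": "Leu", "W": "Trp", "Y": "Tyr"}

# Assumption: rough average residue mass, used only for a size estimate.
AVG_RESIDUE_MASS_KDA = 0.110


def screen_candidate(protein_seq):
    """Summarize simple sequence features of a one-letter protein sequence.

    Hypothetical helper: it reports the +2 residue, an approximate mass,
    and the cysteine count. It does not predict expression or solubility.
    """
    seq = protein_seq.strip().upper()
    if len(seq) < 2 or seq[0] != "M":
        raise ValueError("expected a sequence of 2+ residues starting with Met")
    plus2 = seq[1]
    approx_kda = len(seq) * AVG_RESIDUE_MASS_KDA
    return {
        "plus2_residue": plus2,
        "plus2_destabilizing": plus2 in DESTABILIZING,
        "approx_kda": round(approx_kda, 1),
        "under_60_kda": approx_kda < 60,
        "cysteine_count": seq.count("C"),
    }


if __name__ == "__main__":
    # Made-up sequence for demonstration only.
    report = screen_candidate("MKTAYIAKQRCISFVKSHFSRQ")
    for key, value in report.items():
        print(f"{key}: {value}")
```

A flagged +2 residue is a prompt to consider changing that residue, as suggested above, rather than a verdict on the construct; how many cysteines count as "few" is left to the reader's judgment because the text gives no threshold.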