Eukaryotic Expression
Selecting an Appropriate Expression Host

Expressed Protein Issues

The properties of the protein and its intended use also have a direct impact on which expression system to choose. Many eukaryotic proteins undergo post-translational modifications (phosphorylation, signal-sequence cleavage, proteolytic processing, and glycosylation) that can affect function, circulating half-life, antigenicity, and the like, so these issues must be addressed when choosing an expression host. These modifications have a direct influence on the quality of the protein produced. For instance, it has been demonstrated that glycosylation patterns clearly differ between various mammalian and insect systems. Insect cells lack the pathways necessary to produce glycoproteins containing complex N-linked glycans with terminal sialic acids (Ailor and Betenbaugh, 1999; Kornfeld and Kornfeld, 1985), and the absence of sialic acid residues can strongly influence the in vivo pharmacokinetic properties of many glycoproteins (Grossmann et al., 1997). Using tPA as a model system, it has also been shown that glycosylation patterns differ among mammalian cell types (Parekh et al., 1989).

The expression strategies for both targets and reagents are the same. We desire a purified protein, cell membranes for a binding assay, or attached cell lines for a cell-based assay. The choice of host system depends on the quantity of protein needed, the signaling components required in the host line, and the degree to which endogenously expressed host proteins generate background responses (e.g., for receptors). For example, insect cell lines often provide a null background for mammalian signaling components, which enables lower basal activation and a high signal-to-background ratio in cell-based assays. If the protein is a target and will be used in a cell-based assay, one needs to make a high-expressing cell line.
In most cases, the higher the expression, the better the result. This is not always true, however, for cytoplasmic or membrane-anchored proteins, where the expressed protein can be toxic. In these cases it might be better to aim for lower expression or to use some type of regulated promoter vector system, as discussed in the following section.

If the desired protein is to be a therapeutic used to supply clinical trials, the choices are very well documented. There are numerous examples of commercial therapeutic proteins being produced in E. coli and yeast. However, if the protein contains numerous disulfide linkages or requires extensive post-translational modifications (e.g., folding of antibody heavy and light chains), one needs to consider expression in a mammalian cell line. The gene needs to be cloned into a plasmid system that allows some type of amplification so that the protein can be expressed at very high levels. In addition, one needs to be cognizant of GMP, GLP, and FDA guidelines for the entire expression, selection, or amplification process.

The inability to obtain homogeneously pure protein for crystallization is a frequently encountered problem, owing to the heterogeneous carbohydrate content of many eukaryotic proteins (Grueninger-Leitch et al., 1996). In the past, E. coli expression systems were used exclusively to produce material for crystallization in order to avoid glycosylation altogether. Recently there have been an increasing number of examples where crystals were generated using baculovirus-expressed protein (Cannan et al., 1999; Sonderman et al., 1999). Another approach has been to use the glycosylation-deficient mutant CHO cell line Lec3.2.8.1 (Stanley, 1989; Butters et al., 1999; Casasnovas, Larvie, and Stehle, 1999; Kern et al., 1999). In these cases the incomplete glycosylation or underglycosylation has allowed the formation of high-resolution, diffractable crystals.

Transient Expression Systems

Transient systems are used for the rapid production of small quantities of heterologous gene products and are often suitable for making “reagent” category proteins. The cell lines of choice include the following:

• COS cells (COS-1, ATCC CRL 1650; COS-7, ATCC CRL 1651; see Gluzman, 1981). These are derived from the African green monkey cell line, CV-1, which was infected with an origin-defective SV40 genome. Upon transfection with a plasmid containing a functional SV40 origin of replication, the combination of SV40 replication origin (donor) and SV40 large T-antigen (host cell) results in high-copy extrachromosomal replication of the transfected plasmid (Mellon et al., 1981).
• Human embryonic kidney (HEK) 293 cells (ATCC CRL 1573). This is an immortalized cell line derived from human embryonic kidney cells transformed with human adenovirus type 5 DNA. The cell line contains the adenovirus E1A gene, which transactivates CMV promoter-based plasmids, resulting in increased expression levels. It is widely used to express 7-transmembrane G-protein-coupled receptors (GPCRs) (Ames et al., 1999; Chambers et al., 2000).

502 Trill et al.
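
The host-selection considerations described so far (crystallization, therapeutic use, glycosylation requirements, and toxicity at high expression) can be collected into a small rule table. The Python sketch below is illustrative only: the names `ProteinNeeds` and `suggest_hosts`, the field names, and the wording of each suggestion are assumptions introduced here, not a procedure from the chapter. The rules are applied independently, so conflicting requirements are not reconciled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ProteinNeeds:
    """Hypothetical summary of what the expressed protein must deliver."""
    for_crystallization: bool = False         # homogeneous, under-glycosylated material
    needs_terminal_sialic_acid: bool = False  # complex N-glycans matter in vivo
    therapeutic: bool = False                 # material for clinical supply
    many_disulfides_or_heavy_ptm: bool = False
    toxic_at_high_expression: bool = False    # cytoplasmic or membrane-anchored cases


def suggest_hosts(needs: ProteinNeeds) -> list[str]:
    """Return notes paraphrased from the text; not a validated protocol."""
    notes: list[str] = []

    if needs.for_crystallization:
        # Options the text lists for avoiding heterogeneous carbohydrate.
        notes += [
            "E. coli (avoids glycosylation)",
            "baculovirus-expressed protein",
            "CHO Lec3.2.8.1 (glycosylation-deficient)",
        ]

    if needs.therapeutic:
        if needs.many_disulfides_or_heavy_ptm:
            notes.append("mammalian cell line with an amplifiable plasmid system")
        else:
            notes.append("E. coli or yeast")

    if needs.needs_terminal_sialic_acid:
        # Insect cells lack the pathway for terminally sialylated N-glycans.
        notes.append("mammalian cell line; avoid insect cells")

    if needs.toxic_at_high_expression:
        notes.append("aim for lower expression or use a regulated promoter")

    return notes


if __name__ == "__main__":
    antibody = ProteinNeeds(therapeutic=True, many_disulfides_or_heavy_ptm=True)
    print(suggest_hosts(antibody))
```

A flat list of notes is used rather than a single "best host" answer because the text presents these factors as considerations to weigh, not as a strict decision tree.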
In our own experiments involving transient expression systems, we have consistently found that COS cells yield approximately 50% higher expression than HEK 293 cells (Trill, 2000, unpublished). To take monoclonal antibodies (mAbs) as an example, transient systems such as COS allow one to examine multiple constructs in two to three days at expression levels ranging from 100 ng/ml to 2 µg/ml. Stable cell lines can yield over 200-fold more protein, but reaching those levels is time-consuming, often taking six months to a year (Trill, Shatzman, and Ganguly, 1995).

Viral Lytic Systems

Viral lytic systems offer the advantage of rapid expression combined with high-level production. The most popular of the viral lytic systems utilizes baculovirus. The baculovirus expression system is based on manipulation of the circular Autographa californica virus genome to express a gene of interest under the control of the highly efficient viral polyhedrin promoter. Engineered viruses are used to infect cell lines derived from pupal ovarian tissue of the fall armyworm, Spodoptera frugiperda (Vaughn et al., 1977). This lytic system is most useful for the high-level expression of enzymes and other soluble intracellular proteins. Secreted proteins can also be obtained from this system but are more difficult to scale to large volumes because of the rapid onset of the lytic cycle. Cell lines include Sf9, Sf21, and T. ni (available as High Five™); the T. ni cells are derived from Trichoplusia ni egg cell homogenates. Refer to Section B for more detail on baculovirus expression.

Adenovirus expression has also increased in popularity of late. This may be due in part to its use for in vivo gene delivery in animal systems and its limited use in experimental gene therapy (Robbins, Tahara, and Ghivizzani, 1999; Ennist, 1999; Grubb et al., 1994).
The advantages of this system include a broad host specificity and the ability to use the same expression vector to infect different host cells for contemporaneous animal studies (von Seggern and Nemerow, 1999). Commercial vectors are available for generating recombinant viruses, such as the AdEasy™ system sold by Stratagene. This system simplifies the process of generating recombinant viruses, since it relies on homologous recombination in E. coli rather than in eukaryotic cells (He et al., 1998). The main limitations of this system include moderate to low expression levels and the need to maintain a dedicated tissue culture space in order to avoid cross-contamination with other host
cells. Other animal viruses of interest, including Sindbis, Semliki Forest virus, and the adeno-associated virus (AAV), share many of the same advantages as adenovirus, including broad host specificity (Schlesinger, 1993; Olkkonen et al., 1994; Bueler, 1999). None of these virus expression systems is discussed in detail in this chapter because, as is evident from the limitations discussed, they do not currently represent mainstream methods for large-scale protein production.

Stable Expression Systems

Stable expression systems are preferred when one desires a continuous source and high levels of expressed heterologous protein. The actual levels of expression largely depend on which host cells are used, what type of plasmids are used, and where the genes are integrated into the host genome (i.e., whether they are influenced by chromosomal position effects).

What are the cell line choices? For a mammalian system, the most common choices are discussed next.

Mouse

Mouse cells include L-cells (ATCC CCL 1), Ltk- cells (ATCC CCL 1.3), NIH 3T3 (ATCC CRL 1658), and the myeloma cell lines Sp2/0 (ATCC CRL 1581), NSO (Bebbington et al., 1992), and P3X63.Ag8.653 (ATCC CRL 1580). The myeloma cell lines have the advantage of suspension growth in serum-free medium, and their derivation from secretory cells makes them well-suited hosts for high-level protein production. Because of the presence of the endogenous dihydrofolate reductase (DHFR) gene, none of these cells can be amplified through the use of methotrexate (Schimke, 1988). However, as shown by Bebbington et al. (1992), NSO cells can be amplified using the glutamine synthetase system.

Rat

The rat cell line RBL (ATCC CRL 1378), derived from a basophilic leukemia, has been used to express 7TM G-protein-coupled receptors (Fitzgerald et al., 2000; Santini et al., 2000), while the myeloma cell line YB2/0 (ATCC CRL 1662) has been used in the high-level production of monoclonal antibodies (Shitara et al., 1994).

Human

Human cell lines that are frequently used include HEK 293, HeLa (ATCC CCL 2), HL-60 (ATCC CCL 240), and HT-1080 (ATCC CCL 121).

Hamster

Chinese hamster ovary (CHO) cells include CHO-K1 (ATCC CCL 61) and two different DHFR- cell lines, DG44 (Urlaub et al., 1983) and DUK-B11 (Urlaub and Chasin, 1983), in which the gene of interest can be amplified via the selection/amplification marker DHFR (Kaufman, 1990). CHO cells have been used to express a large variety of proteins, ranging from growth factors (Madisen et al., 1990; Ferrara et al., 1993), receptors (Deen et al., 1988; Newman-Tancredi, Wootton, and Strange, 1992), and 7TM G-protein-coupled receptors (Ishii et al., 1997; Juarranz et al., 1999) to monoclonal antibodies (Trill, Shatzman, and Ganguly, 1995).

Also of significance are engineered derivatives of these lines. One example is a CHO cell line containing the adenovirus E1A gene. Cockett, Bebbington, and Yarronton (1991) first established a CHO cell line stably expressing the adenovirus E1A gene, which trans-activates the CMV promoter. Transfection of a human procollagenase gene into this CHO cell line produced a 13-fold increase in stable expression compared with that of CHO-K1. This is significant because an E1A host cell line can be used to rapidly produce sufficient material for early purification and testing without the need for amplification. Stably expressing clones produced from this host can be obtained in as little as two weeks and yield 10 to 20 mg/L of expressed protein.

Baby Hamster Kidney (BHK) Cells (ATCC CCL 10)

BHK cells have also been used to express a variety of genes (Wirth et al., 1988).

Drosophila

Drosophila S2 is a continuous cell line derived from primary cultures of late-stage (20 to 24 hours old) D. melanogaster (Oregon R) embryos (Schneider, 1972). The cell line is particularly useful for the stable transfection of multiple tandem gene arrays without amplification. High-copy-number genes can be expressed in a tightly regulated fashion under the control of the copper-inducible Drosophila metallothionein promoter (Johansen et al., 1989).
This cell line is particularly useful for the inducible expression of secreted proteins. S2 cells also grow well in serum-free, conditioned medium, simplifying the purification of expressed proteins.

Yeast Expression Systems (Pichia pastoris and Pichia methanolica)

The main advantages of yeast systems over higher eukaryotic tissue culture systems such as CHO include their rapid growth rate