chapter 3
AMINO ACIDS, PEPTIDES, AND PROTEINS

3.1 Amino Acids
3.2 Peptides and Proteins
3.3 Working with Proteins
3.4 The Covalent Structure of Proteins
3.5 Protein Sequences and Evolution

The word protein that I propose to you . . . I would wish to derive from proteios, because it appears to be the primitive or principal substance of animal nutrition that plants prepare for the herbivores, and which the latter then furnish to the carnivores.
J. J. Berzelius, letter to G. J. Mulder, 1838

Proteins are the most abundant biological macromolecules, occurring in all cells and all parts of cells. Proteins also occur in great variety; thousands of different kinds, ranging in size from relatively small peptides to huge polymers with molecular weights in the millions, may be found in a single cell. Moreover, proteins exhibit enormous diversity of biological function and are the most important final products of the information pathways discussed in Part III of this book. Proteins are the molecular instruments through which genetic information is expressed.

Relatively simple monomeric subunits provide the key to the structure of the thousands of different proteins. All proteins, whether from the most ancient lines of bacteria or from the most complex forms of life, are constructed from the same ubiquitous set of 20 amino acids, covalently linked in characteristic linear sequences. Because each of these amino acids has a side chain with distinctive chemical properties, this group of 20 precursor molecules may be regarded as the alphabet in which the language of protein structure is written.

What is most remarkable is that cells can produce proteins with strikingly different properties and activities by joining the same 20 amino acids in many different combinations and sequences. From these building blocks different organisms can make such widely diverse products as enzymes, hormones, antibodies, transporters, muscle fibers, the lens protein of the eye, feathers, spider webs, rhinoceros horn, milk proteins, antibiotics, mushroom poisons, and myriad other substances having distinct biological activities (Fig. 3–1). Among these protein products, the enzymes are the most varied and specialized. Virtually all cellular reactions are catalyzed by enzymes.

Protein structure and function are the topics of this and the next three chapters. We begin with a description of the fundamental chemical properties of amino acids, peptides, and proteins.

3.1 Amino Acids

Proteins are polymers of amino acids, with each amino acid residue joined to its neighbor by a specific type of covalent bond. (The term "residue" reflects the loss of the elements of water when one amino acid is joined to another.) Proteins can be broken down (hydrolyzed) to their constituent amino acids by a variety of methods, and the earliest studies of proteins naturally focused on
the free amino acids derived from them. Twenty different amino acids are commonly found in proteins. The first to be discovered was asparagine, in 1806. The last of the 20 to be found, threonine, was not identified until 1938. All the amino acids have trivial or common names, in some cases derived from the source from which they were first isolated. Asparagine was first found in asparagus, and glutamate in wheat gluten; tyrosine was first isolated from cheese (its name is derived from the Greek tyros, "cheese"); and glycine (Greek glykos, "sweet") was so named because of its sweet taste.

Amino Acids Share Common Structural Features

All 20 of the common amino acids are α-amino acids. They have a carboxyl group and an amino group bonded to the same carbon atom (the α carbon) (Fig. 3–2). They differ from each other in their side chains, or R groups, which vary in structure, size, and electric charge, and which influence the solubility of the amino acids in water. In addition to these 20 amino acids there are many less common ones. Some are residues modified after a protein has been synthesized; others are amino acids present in living organisms but not as constituents of proteins. The common amino acids of proteins have been assigned three-letter abbreviations and one-letter symbols (Table 3–1), which are used as shorthand to indicate the composition and sequence of amino acids polymerized in proteins.

Two conventions are used to identify the carbons in an amino acid, a practice that can be confusing. The additional carbons in an R group are commonly designated β, γ, δ, ε, and so forth, proceeding out from the α carbon. For most other organic molecules, carbon atoms are simply numbered from one end, giving highest priority (C-1) to the carbon with the substituent containing the atom of highest atomic number. Within this latter convention, the carboxyl carbon of an amino acid would be C-1 and the α carbon would be C-2. In some cases, such as amino acids with heterocyclic R groups, the Greek lettering system is ambiguous and the numbering convention is therefore used.

[Structure of lysine, +H3N–CH2–CH2–CH2–CH2–CH(NH3+)–COO−, with its carbons labeled by both conventions: ε (6), δ (5), γ (4), β (3), α (2), and the carboxyl carbon (1).]

For all the common amino acids except glycine, the α carbon is bonded to four different groups: a carboxyl group, an amino group, an R group, and a hydrogen atom (Fig. 3–2; in glycine, the R group is another hydrogen atom). The α-carbon atom is thus a chiral center (p. 17). Because of the tetrahedral arrangement of the bonding orbitals around the α-carbon atom, the four different groups can occupy two unique spatial arrangements, and thus amino acids have two possible stereoisomers. Since they are nonsuperimposable mirror images of each other (Fig. 3–3), the two forms represent a class of stereoisomers called enantiomers (see Fig. 1–19). All molecules with a chiral center are also optically active; that is, they rotate plane-polarized light (see Box 1–2).

FIGURE 3–1 Some functions of proteins. (a) The light produced by fireflies is the result of a reaction involving the protein luciferin and ATP, catalyzed by the enzyme luciferase (see Box 13–2). (b) Erythrocytes contain large amounts of the oxygen-transporting protein hemoglobin. (c) The protein keratin, formed by all vertebrates, is the chief structural component of hair, scales, horn, wool, nails, and feathers. The black rhinoceros is nearing extinction in the wild because of the belief prevalent in some parts of the world that a powder derived from its horn has aphrodisiac properties. In reality, the chemical properties of powdered rhinoceros horn are no different from those of powdered bovine hooves or human fingernails.

FIGURE 3–2 General structure of an amino acid. This structure is common to all but one of the α-amino acids. (Proline, a cyclic amino acid, is the exception.) The R group or side chain (red) attached to the α carbon (blue) is different in each amino acid.
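The two carbon-naming conventions described in this section differ only by a fixed offset for an unbranched chain, and a short Python sketch can make that concrete. The function name and the spelled-out Greek labels below are illustrative choices for this example, not part of any standard library or nomenclature tool.

```python
# Greek-letter labels in order, starting at the alpha carbon and moving out
# along the R group. Spelling the letters out is an illustrative choice.
GREEK_LABELS = ["alpha", "beta", "gamma", "delta", "epsilon"]


def carbon_number(greek_label: str) -> int:
    """Return n in C-n for a Greek-lettered carbon of an amino acid.

    Under the numbering convention the carboxyl carbon is C-1, so the
    alpha carbon is C-2, the beta carbon is C-3, and so on.
    """
    return GREEK_LABELS.index(greek_label) + 2


# Lysine has an unbranched side chain running from the beta to the
# epsilon carbon, so every label maps to exactly one number.
lysine_carbons = {label: f"C-{carbon_number(label)}" for label in GREEK_LABELS}
print(lysine_carbons)
```

A linear lookup like this only works for unbranched chains such as that of lysine. For heterocyclic R groups, where the text notes that Greek lettering becomes ambiguous, the numbering convention is used directly.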
Special nomenclature has been developed to specify the absolute configuration of the four substituents of asymmetric carbon atoms. The absolute configurations of simple sugars and amino acids are specified by the D, L system (Fig. 3–4), based on the absolute configuration of the three-carbon sugar glyceraldehyde, a convention proposed by Emil Fischer in 1891. (Fischer knew what groups surrounded the asymmetric carbon of glyceraldehyde but had to guess at their absolute configuration; his guess was later confirmed by x-ray diffraction analysis.) For all chiral compounds, stereoisomers having a configuration related to that of L-glyceraldehyde are designated L, and stereoisomers related to D-glyceraldehyde are designated D. The functional groups of L-alanine are matched with those of L-glyceraldehyde by aligning those that can be interconverted by simple, one-step chemical reactions. Thus the carboxyl group of L-alanine occupies the same position about the chiral carbon as does the aldehyde group of L-glyceraldehyde, because an aldehyde is readily converted to a carboxyl group via a one-step oxidation. Historically, the similar l and d designations were used for levorotatory (rotating light to the left) and dextrorotatory (rotating light to the right). However, not all L-amino acids are levorotatory, and the convention shown in Figure 3–4 was needed to avoid potential ambiguities about absolute configuration. By Fischer's convention, L and D refer only to the absolute configuration of the four substituents around the chiral carbon, not to optical properties of the molecule.

Another system of specifying configuration around a chiral center is the RS system, which is used in the systematic nomenclature of organic chemistry and describes more precisely the configuration of molecules with more than one chiral center (see p. 18).

The Amino Acid Residues in Proteins Are L Stereoisomers

Nearly all biological compounds with a chiral center occur naturally in only one stereoisomeric form, either D or L. The amino acid residues in protein molecules are exclusively L stereoisomers. D-Amino acid residues have been found only in a few, generally small peptides, including some peptides of bacterial cell walls and certain peptide antibiotics.

It is remarkable that virtually all amino acid residues in proteins are L stereoisomers. When chiral compounds are formed by ordinary chemical reactions, the result is a racemic mixture of D and L isomers, which are difficult for a chemist to distinguish and separate. But to a living system, D and L isomers are as different as the right hand and the left. The formation of stable, repeating substructures in proteins (Chapter 4) generally requires that their constituent amino acids be of one stereochemical series. Cells are able to specifically synthesize the L isomers of amino acids because the active sites of enzymes are asymmetric, causing the reactions they catalyze to be stereospecific.

FIGURE 3–3 Stereoisomerism in α-amino acids. (a) The two stereoisomers of alanine, L- and D-alanine, are nonsuperimposable mirror images of each other (enantiomers). (b, c) Two different conventions for showing the configurations in space of stereoisomers. In perspective formulas (b) the solid wedge-shaped bonds project out of the plane of the paper, the dashed bonds behind it. In projection formulas (c) the horizontal bonds are assumed to project out of the plane of the paper, the vertical bonds behind. However, projection formulas are often used casually and are not always intended to portray a specific stereochemical configuration.

FIGURE 3–4 Steric relationship of the stereoisomers of alanine to the absolute configuration of L- and D-glyceraldehyde. In these perspective formulas, the carbons are lined up vertically, with the chiral atom in the center. The carbons in these molecules are numbered beginning with the terminal aldehyde or carboxyl carbon (red), 1 to 3 from top to bottom as shown. When presented in this way, the R group of the amino acid (in this case the methyl group of alanine) is always below the α carbon. L-Amino acids are those with the α-amino group on the left, and D-amino acids have the α-amino group on the right.
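The claim that the two stereoisomers are nonsuperimposable mirror images can be checked numerically. The Python sketch below is only an illustration under stated assumptions: the four substituents are placed at idealized tetrahedral corners (made-up coordinates, not measured bond geometry), and handedness is read from the sign of a triple product, a geometric device chosen for this example rather than anything used in the text. The sketch does not say which arrangement is L and which is D.

```python
def signed_volume(a, b, c, d):
    """Six times the signed volume of the tetrahedron with corners a, b, c, d.

    Rotating the whole arrangement leaves the sign unchanged;
    reflecting it in a mirror plane flips the sign.
    """
    u = [a[i] - d[i] for i in range(3)]
    v = [b[i] - d[i] for i in range(3)]
    w = [c[i] - d[i] for i in range(3)]
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))


def handedness(groups):
    """Return +1 or -1 for an arrangement of the four substituents."""
    vol = signed_volume(groups["COO-"], groups["NH3+"], groups["R"], groups["H"])
    return 1 if vol > 0 else -1


# Hypothetical, idealized positions of the four groups around the alpha carbon.
form_a = {"COO-": (1, 1, 1), "NH3+": (1, -1, -1), "R": (-1, 1, -1), "H": (-1, -1, 1)}

# Mirror image: reflect every position through the plane x = 0.
mirror = {name: (-x, y, z) for name, (x, y, z) in form_a.items()}

# The same molecule turned 90 degrees about the z axis.
rotated = {name: (-y, x, z) for name, (x, y, z) in form_a.items()}

print(handedness(form_a), handedness(rotated), handedness(mirror))
```

Because rotation never changes the sign while reflection always does, no rotation can turn one arrangement into its mirror image, which is what nonsuperimposable means here. The argument relies on the four groups being different; in glycine, where the R group is a second hydrogen, the two arrangements are identical.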
Amino Acids Can Be Classified by R Group

Knowledge of the chemical properties of the common amino acids is central to an understanding of biochemistry. The topic can be simplified by grouping the amino acids into five main classes based on the properties of their R groups (Table 3–1), in particular, their polarity, or tendency to interact with water at biological pH (near pH 7.0). The polarity of the R groups varies widely, from nonpolar and hydrophobic (water-insoluble) to highly polar and hydrophilic (water-soluble).

The structures of the 20 common amino acids are shown in Figure 3–5, and some of their properties are listed in Table 3–1. Within each class there are gradations of polarity, size, and shape of the R groups.

Nonpolar, Aliphatic R Groups

The R groups in this class of amino acids are nonpolar and hydrophobic. The side chains of alanine, valine, leucine, and isoleucine tend to cluster together within proteins, stabilizing protein structure by means of hydrophobic interactions. Glycine has the simplest structure. Although it is formally nonpolar, its very small side chain makes no real contribution to hydrophobic interactions. Methionine, one of the two sulfur-containing amino acids, has a nonpolar thioether group in its side chain.
Proline has an aliphatic side chain with a distinctive cyclic structure. The secondary amino (imino) group of proline residues is held in a rigid conformation that reduces the structural flexibility of polypeptide regions containing proline.

Aromatic R Groups

Phenylalanine, tyrosine, and tryptophan, with their aromatic side chains, are relatively nonpolar (hydrophobic). All can participate in hydrophobic interactions. The hydroxyl group of tyrosine can form hydrogen bonds, and it is an important functional group in some enzymes. Tyrosine and tryptophan are significantly more polar than phenylalanine, because of the tyrosine hydroxyl group and the nitrogen of the tryptophan indole ring.

Tryptophan and tyrosine, and to a much lesser extent phenylalanine, absorb ultraviolet light (Fig. 3–6; Box 3–1). This accounts for the characteristic strong absorbance of light by most proteins at a wavelength of 280 nm, a property exploited by researchers in the characterization of proteins.

TABLE 3–1 Properties and Conventions Associated with the Common Amino Acids Found in Proteins

                                      pKa values
                Abbr./        pK1      pK2      pKR                Hydropathy  Occurrence in
Amino acid      symbol  Mr    (-COOH)  (-NH3+)  (R group)  pI      index*      proteins (%)†

Nonpolar, aliphatic R groups
Glycine         Gly G   75    2.34     9.60                5.97    -0.4        7.2
Alanine         Ala A   89    2.34     9.69                6.01    1.8         7.8
Proline         Pro P   115   1.99     10.96               6.48    -1.6        5.2
Valine          Val V   117   2.32     9.62                5.97    4.2         6.6
Leucine         Leu L   131   2.36     9.60                5.98    3.8         9.1
Isoleucine      Ile I   131   2.36     9.68                6.02    4.5         5.3
Methionine      Met M   149   2.28     9.21                5.74    1.9         2.3

Aromatic R groups
Phenylalanine   Phe F   165   1.83     9.13                5.48    2.8         3.9
Tyrosine        Tyr Y   181   2.20     9.11     10.07      5.66    -1.3        3.2
Tryptophan      Trp W   204   2.38     9.39                5.89    -0.9        1.4

Polar, uncharged R groups
Serine          Ser S   105   2.21     9.15                5.68    -0.8        6.8
Threonine       Thr T   119   2.11     9.62                5.87    -0.7        5.9
Cysteine        Cys C   121   1.96     10.28    8.18       5.07    2.5         1.9
Asparagine      Asn N   132   2.02     8.80                5.41    -3.5        4.3
Glutamine       Gln Q   146   2.17     9.13                5.65    -3.5        4.2

Positively charged R groups
Lysine          Lys K   146   2.18     8.95     10.53      9.74    -3.9        5.9
Histidine       His H   155   1.82     9.17     6.00       7.59    -3.2        2.3
Arginine        Arg R   174   2.17     9.04     12.48      10.76   -4.5        5.1

Negatively charged R groups
Aspartate       Asp D   133   1.88     9.60     3.65       2.77    -3.5        5.3
Glutamate       Glu E   147   2.19     9.67     4.25       3.22    -3.5        6.3

*A scale combining hydrophobicity and hydrophilicity of R groups; it can be used to measure the tendency of an amino acid to seek an aqueous environment (- values) or a hydrophobic environment (+ values). See Chapter 11. From Kyte, J. & Doolittle, R.F. (1982) A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105–132.
†Average occurrence in more than 1,150 proteins. From Doolittle, R.F. (1989) Redundancies in protein sequences. In Prediction of Protein Structure and the Principles of Protein Conformation (Fasman, G.D., ed.), pp. 599–623, Plenum Press, New York.

FIGURE 3–5 The 20 common amino acids of proteins. The structural formulas show the state of ionization that would predominate at pH 7.0. The unshaded portions are those common to all the amino acids; the portions shaded in red are the R groups. Although the R group of histidine is shown uncharged, its pKa (see Table 3–1) is such that a small but significant fraction of these groups are positively charged at pH 7.0.
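The abbreviations and hydropathy indices of Table 3–1 lend themselves to a small lookup, sketched below in Python. The three-letter and one-letter symbols are those of the table; the hydropathy values carry the signs of the Kyte and Doolittle scale cited in the table footnote. The function names and the example peptide are hypothetical, and averaging the index over a sequence is just one simple way to use the scale, not a method prescribed by the text.

```python
# three-letter abbreviation -> (one-letter symbol, hydropathy index)
AMINO_ACIDS = {
    "Gly": ("G", -0.4), "Ala": ("A", 1.8), "Pro": ("P", -1.6), "Val": ("V", 4.2),
    "Leu": ("L", 3.8), "Ile": ("I", 4.5), "Met": ("M", 1.9), "Phe": ("F", 2.8),
    "Tyr": ("Y", -1.3), "Trp": ("W", -0.9), "Ser": ("S", -0.8), "Thr": ("T", -0.7),
    "Cys": ("C", 2.5), "Asn": ("N", -3.5), "Gln": ("Q", -3.5), "Lys": ("K", -3.9),
    "His": ("H", -3.2), "Arg": ("R", -4.5), "Asp": ("D", -3.5), "Glu": ("E", -3.5),
}


def to_one_letter(residues):
    """Convert a list of three-letter abbreviations to a one-letter string."""
    return "".join(AMINO_ACIDS[res][0] for res in residues)


def mean_hydropathy(residues):
    """Average hydropathy index over a sequence of residues.

    Positive values suggest a preference for a hydrophobic environment,
    negative values a preference for an aqueous one.
    """
    return sum(AMINO_ACIDS[res][1] for res in residues) / len(residues)


# Hypothetical tetrapeptide, chosen only to exercise the lookup.
peptide = ["Ala", "Leu", "Lys", "Asp"]
print(to_one_letter(peptide))
print(round(mean_hydropathy(peptide), 2))
```

For this made-up peptide the two hydrophobic residues (Ala, Leu) are outweighed by the two charged ones (Lys, Asp), so the average comes out slightly negative.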