Chapter 8 Nucleotides and Nucleic Acids
8.2 Nucleic Acid Structure
in which the vertically stacked bases inside the double helix would be 3.4 Å apart; the secondary repeat distance of about 34 Å was accounted for by the presence of 10 base pairs in each complete turn of the double helix. In aqueous solution the structure differs slightly from that in fibers, having 10.5 base pairs per helical turn (Fig. 8–15).

As Figure 8–16 shows, the two antiparallel polynucleotide chains of double-helical DNA are not identical in either base sequence or composition. Instead they are complementary to each other. Wherever adenine occurs in one chain, thymine is found in the other; similarly, wherever guanine occurs in one chain, cytosine is found in the other.

The DNA double helix, or duplex, is held together by two forces, as described earlier: hydrogen bonding between complementary base pairs (Fig. 8–11) and base-stacking interactions. The complementarity between the DNA strands is attributable to the hydrogen bonding between base pairs. The base-stacking interactions, which are largely nonspecific with respect to the identity of the stacked bases, make the major contribution to the stability of the double helix.

The important features of the double-helical model of DNA structure are supported by much chemical and biological evidence. Moreover, the model immediately suggested a mechanism for the transmission of genetic information. The essential feature of the model is the complementarity of the two DNA strands. As Watson and Crick were able to see, well before confirmatory data became available, this structure could logically be replicated by (1) separating the two strands and (2) synthesizing a complementary strand for each. Because nucleotides in each new strand are joined in a sequence specified by the base-pairing rules stated above, each preexisting strand functions as a template to guide the synthesis of one complementary strand (Fig. 8–17).
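The pairing rules and the two-step replication scheme described above can be illustrated with a short Python sketch. The function names and the 9-nucleotide parent sequence are hypothetical; the sequence is not taken from Figure 8–16 and was chosen only so that its composition matches the A3 T2 G1 C3 quoted in that caption. The sketch models base pairing only, not the enzymology of DNA synthesis.

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def daughter_strand(template: str) -> str:
    """Return the strand complementary to `template`, written 5'->3'.

    The strands of a duplex are antiparallel, so the complement is
    built from the template read in reverse.
    """
    return "".join(PAIRING[base] for base in reversed(template))


def composition(strand: str) -> dict:
    """Count how many times each base occurs in a strand."""
    return {base: strand.count(base) for base in "ATGC"}


# Hypothetical parent duplex (illustrative sequence, composition A3 T2 G1 C3).
strand_1 = "ACTCAGCTA"
strand_2 = daughter_strand(strand_1)

# Step 1: separate the strands. Step 2: synthesize a complement for each.
duplex_from_1 = (strand_1, daughter_strand(strand_1))
duplex_from_2 = (daughter_strand(strand_2), strand_2)

print("strand 1:", strand_1, composition(strand_1))
print("strand 2:", strand_2, composition(strand_2))
print("both daughter duplexes match the parent:",
      duplex_from_1 == duplex_from_2 == (strand_1, strand_2))
```

The two strands differ in both sequence and composition, yet each one fully determines the other, which is why either can serve as a template.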
These expectations were experimentally confirmed, inaugurating a revolution in our understanding of biological inheritance.

FIGURE 8–16 Complementarity of strands in the DNA double helix. The complementary antiparallel strands of DNA follow the pairing rules proposed by Watson and Crick. The base-paired antiparallel strands differ in base composition: the left strand has the composition A₃T₂G₁C₃; the right, A₂T₃G₃C₁. They also differ in sequence when each chain is read in the 5′→3′ direction. Note the base equivalences: A = T and G = C in the duplex.

FIGURE 8–17 Replication of DNA as suggested by Watson and Crick. The preexisting or “parent” strands become separated, and each is the template for biosynthesis of a complementary “daughter” strand (in red).

DNA Can Occur in Different Three-Dimensional Forms

DNA is a remarkably flexible molecule. Considerable rotation is possible around a number of bonds in the sugar–phosphate (phosphodeoxyribose) backbone, and thermal fluctuation can produce bending, stretching, and unpairing (melting) of the strands. Many significant deviations from the Watson-Crick DNA structure are found in cellular DNA, some or all of which may play important roles in DNA metabolism. These structural variations generally do not affect the key properties of DNA defined by Watson and Crick: strand complementarity,
antiparallel strands, and the requirement for A=T and G≡C base pairs.

Structural variation in DNA reflects three things: the different possible conformations of the deoxyribose, rotation about the contiguous bonds that make up the phosphodeoxyribose backbone (Fig. 8–18a), and free rotation about the C-1′–N-glycosyl bond (Fig. 8–18b). Because of steric constraints, purines in purine nucleotides are restricted to two stable conformations with respect to deoxyribose, called syn and anti (Fig. 8–18b). Pyrimidines are generally restricted to the anti conformation because of steric interference between the sugar and the carbonyl oxygen at C-2 of the pyrimidine.

The Watson-Crick structure is also referred to as B-form DNA, or B-DNA. The B form is the most stable structure for a random-sequence DNA molecule under physiological conditions and is therefore the standard point of reference in any study of the properties of DNA. Two structural variants that have been well characterized in crystal structures are the A and Z forms. These three DNA conformations are shown in Figure 8–19, with a summary of their properties. The A form is favored in many solutions that are relatively devoid of water. The DNA is still arranged in a right-handed double helix, but the helix is wider and the number of base pairs per helical turn is 11, rather than 10.5 as in B-DNA.

FIGURE 8–18 Structural variation in DNA. (a) The conformation of a nucleotide in DNA is affected by rotation about seven different bonds. Six of the bonds rotate freely. The limited rotation about bond 4 gives rise to ring pucker, in which one of the atoms in the five-membered furanose ring is out of the plane described by the other four. This conformation is endo or exo, depending on whether the atom is displaced to the same side of the plane as C-5′ or to the opposite side (see Fig. 8–3b). (b) For purine bases in nucleotides, only two conformations with respect to the attached ribose units are sterically permitted, anti or syn. Pyrimidines generally occur in the anti conformation.

FIGURE 8–19 Comparison of A, B, and Z forms of DNA. Each structure shown here has 36 base pairs. The bases are shown in gray, the phosphate atoms in yellow, and the riboses and phosphate oxygens in blue. Blue is the color used to represent DNA strands in later chapters. The table summarizes some properties of the three forms of DNA.

Property | A form | B form | Z form
Helical sense | Right handed | Right handed | Left handed
Diameter | ~26 Å | ~20 Å | ~18 Å
Base pairs per helical turn | 11 | 10.5 | 12
Helix rise per base pair | 2.6 Å | 3.4 Å | 3.7 Å
Base tilt normal to the helix axis | 20° | 6° | 7°
Sugar pucker conformation | C-3′ endo | C-2′ endo | C-2′ endo for pyrimidines; C-3′ endo for purines
Glycosyl bond conformation | Anti | Anti | Anti for pyrimidines; syn for purines
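The tabulated rise per base pair and base pairs per turn fix the repeat distance of each helix: pitch = rise × base pairs per turn, and the average rotation between neighboring base pairs is 360° divided by the base pairs per turn. The following Python sketch works through that arithmetic using the values from the Figure 8–19 table and the fiber model quoted earlier (3.4 Å, 10 base pairs per turn). It assumes an idealized, perfectly regular helix; the class and attribute names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HelixForm:
    name: str
    rise_per_bp: float   # helix rise per base pair, in angstroms
    bp_per_turn: float   # base pairs per complete helical turn
    handedness: int      # +1 for right-handed, -1 for left-handed

    def pitch(self) -> float:
        """Length of one full turn along the helix axis, in angstroms."""
        return self.rise_per_bp * self.bp_per_turn

    def twist_per_bp(self) -> float:
        """Average rotation between adjacent base pairs, in degrees."""
        return self.handedness * 360.0 / self.bp_per_turn


# Values taken from the Figure 8-19 table and from the fiber model in the text.
FORMS = [
    HelixForm("A-DNA", 2.6, 11, +1),
    HelixForm("B-DNA (solution)", 3.4, 10.5, +1),
    HelixForm("B-DNA (fiber model)", 3.4, 10, +1),
    HelixForm("Z-DNA", 3.7, 12, -1),
]

for form in FORMS:
    print(f"{form.name:20s} pitch {form.pitch():5.1f} Å   "
          f"twist {form.twist_per_bp():+6.1f}° per bp")
```

Under these assumptions the fiber model reproduces the 34 Å repeat, while the solution B form, with 10.5 base pairs per turn, has a slightly longer pitch; the negative twist for Z-DNA simply encodes its left-handed sense.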
The plane of the base pairs in A-DNA is tilted about 20° with respect to the helix axis. These structural changes deepen the major groove while making the minor groove shallower. The reagents used to promote crystallization of DNA tend to dehydrate it, and thus most short DNA molecules tend to crystallize in the A form.

Z-form DNA is a more radical departure from the B structure; the most obvious distinction is the left-handed helical rotation. There are 12 base pairs per helical turn, and the structure appears more slender and elongated. The DNA backbone takes on a zigzag appearance. Certain nucleotide sequences fold into left-handed Z helices much more readily than others. Prominent examples are sequences in which pyrimidines alternate with purines, especially alternating C and G or 5-methyl-C and G residues. To form the left-handed helix in Z-DNA, the purine residues flip to the syn conformation, alternating with pyrimidines in the anti conformation. The major groove is barely apparent in Z-DNA, and the minor groove is narrow and deep. Whether A-DNA occurs in cells is uncertain, but there is evidence for some short stretches (tracts) of Z-DNA in both prokaryotes and eukaryotes. These Z-DNA tracts may play a role (as yet undefined) in regulating the expression of some genes or in genetic recombination.

Certain DNA Sequences Adopt Unusual Structures

A number of other sequence-dependent structural variations have been detected within larger chromosomes that may affect the function and metabolism of the DNA segments in their immediate vicinity. For example, bends occur in the DNA helix wherever four or more adenosine residues appear sequentially in one strand. Six adenosines in a row produce a bend of about 18°. The bending observed with this and other sequences may be important in the binding of some proteins to DNA.

A rather common type of DNA sequence is a palindrome.
A palindrome is a word, phrase, or sentence that is spelled identically when read either forward or backward; two examples are ROTATOR and NURSES RUN. The term is applied to regions of DNA with inverted repeats of base sequence having twofold symmetry over two strands of DNA (Fig. 8–20). Such sequences are self-complementary within each strand and therefore have the potential to form hairpin or cruciform (cross-shaped) structures (Fig. 8–21). When the inverted repeat occurs within each individual strand of the DNA, the sequence is called a mirror repeat. Mirror repeats do not have complementary sequences within the same strand and cannot form hairpin or cruciform structures.

FIGURE 8–20 Palindromes and mirror repeats. Palindromes are sequences of double-stranded nucleic acids with twofold symmetry. In order to superimpose one repeat (shaded sequence) on the other, it must be rotated 180° about the horizontal axis then 180° about the vertical axis, as shown by the colored arrows. A mirror repeat, on the other hand, has a symmetric sequence within each strand. Superimposing one repeat on the other requires only a single 180° rotation about the vertical axis.

FIGURE 8–21 Hairpins and cruciforms. Palindromic DNA (or RNA) sequences can form alternative structures with intrastrand base pairing. (a) When only a single DNA (or RNA) strand is involved, the structure is called a hairpin. (b) When both strands of a duplex DNA are involved, it is called a cruciform. Blue shading highlights asymmetric sequences that can pair with the complementary sequence either in the same strand or in the complementary strand.

Sequences of these types are found
in virtually every large DNA molecule and can encompass a few base pairs or thousands. The extent to which palindromes occur as cruciforms in cells is not known, although some cruciform structures have been demonstrated in vivo in E. coli. Self-complementary sequences cause isolated single strands of DNA (or RNA) in solution to fold into complex structures containing multiple hairpins.

FIGURE 8–22 DNA structures containing three or four DNA strands. (a) Base-pairing patterns in one well-characterized form of triplex DNA. The Hoogsteen pair in each case is shown in red. (b) Triple-helical DNA containing two pyrimidine strands (poly(T)) and one purine strand (poly(A)) (derived from PDB ID 1BCE). The dark blue and light blue strands are antiparallel and paired by normal Watson-Crick base-pairing patterns. The third (all-pyrimidine) strand (purple) is parallel to the purine strand and paired through non-Watson-Crick hydrogen bonds. The triplex is viewed end-on, with five triplets shown. Only the triplet closest to the viewer is colored. (c) Base-pairing pattern in the guanosine tetraplex structure. (d) Two successive tetraplets from a G tetraplex structure (derived from PDB ID 1QDG), viewed end-on with the one closest to the viewer in color. (e) Possible variants (parallel and antiparallel) in the orientation of strands in a G tetraplex.

Several unusual DNA structures involve three or even four DNA strands. These structural variations merit investigation because there is a tendency for many of them to appear at sites where important events in DNA metabolism (replication, recombination, transcription) are initiated or regulated. Nucleotides participating in a Watson-Crick base pair (Fig. 8–11) can form a number of additional hydrogen bonds, particularly with functional groups arrayed in the major groove. For example, a cytidine residue (if protonated) can pair with the guanosine residue of a G≡C nucleotide pair, and a thymidine can pair with the adenosine of an A=T pair (Fig. 8–22). The N-7, O⁶, and N⁶ of purines, the atoms that participate in the hydrogen bonding of triplex DNA, are often referred to as Hoogsteen positions, and the non-Watson-Crick pairing is called Hoogsteen pairing, after Karst Hoogsteen, who in 1963 first recognized the potential for these unusual pairings. Hoogsteen pairing allows the formation of triplex DNAs. The triplexes shown in Figure 8–22 (a, b) are most stable at low pH
because the C≡G·C⁺ triplet requires a protonated cytosine. In the triplex, the pKa of this cytosine is >7.5, altered from its normal value of 4.2. The triplexes also form most readily within long sequences containing only pyrimidines or only purines in a given strand. Some triplex DNAs contain two pyrimidine strands and one purine strand; others contain two purine strands and one pyrimidine strand.

Four DNA strands can also pair to form a tetraplex (quadruplex), but this occurs readily only for DNA sequences with a very high proportion of guanosine residues (Fig. 8–22c, d). The guanosine tetraplex, or G tetraplex, is quite stable over a wide range of conditions. The orientation of strands in the tetraplex can vary as shown in Figure 8–22e.

A particularly exotic DNA structure, known as H-DNA, is found in polypyrimidine or polypurine tracts that also incorporate a mirror repeat. A simple example is a long stretch of alternating T and C residues (Fig. 8–23). The H-DNA structure features the triple-stranded form illustrated in Figure 8–22 (a, b). Two of the three strands in the H-DNA triple helix contain pyrimidines and the third contains purines.

In the DNA of living cells, sites recognized by many sequence-specific DNA-binding proteins (Chapter 28) are arranged as palindromes, and polypyrimidine or polypurine sequences that can form triple helices or even H-DNA are found within regions involved in the regulation of expression of some eukaryotic genes. In principle, synthetic DNA strands designed to pair with these sequences to form triplex DNA could disrupt gene expression. This approach to controlling cellular metabolism is of growing commercial interest for its potential application in medicine and agriculture.

Messenger RNAs Code for Polypeptide Chains

We now turn our attention briefly from DNA structure to the expression of the genetic information that it contains. RNA, the second major form of nucleic acid in cells, has many functions.
In gene expression, RNA acts as an intermediary by using the information encoded in DNA to specify the amino acid sequence of a functional protein.

Given that the DNA of eukaryotes is largely confined to the nucleus whereas protein synthesis occurs on ribosomes in the cytoplasm, some molecule other than DNA must carry the genetic message from the nucleus to the cytoplasm. As early as the 1950s, RNA was considered the logical candidate: RNA is found in both the nucleus and the cytoplasm, and an increase in protein synthesis is accompanied by an increase in the amount of cytoplasmic RNA and an increase in its rate of turnover. These and other observations led several researchers to suggest that RNA carries genetic information from DNA to the protein biosynthetic machinery of the ribosome. In 1961 François Jacob and Jacques Monod presented a unified (and essentially correct) picture of many aspects of this process. They proposed the name “messenger RNA” (mRNA) for that portion of the total cellular RNA carrying the genetic information from DNA to the ribosomes, where the messengers provide the templates that specify amino acid sequences in polypeptide chains. Although mRNAs from different genes can vary greatly in length, the mRNAs from a particular gene generally have a defined size. The process of forming mRNA on a DNA template is known as transcription.

In prokaryotes, a single mRNA molecule may code for one or several polypeptide chains. If it carries the code for only one polypeptide, the mRNA is monocistronic;

FIGURE 8–23 H-DNA. (a) A sequence of alternating T and C residues can be considered a mirror repeat centered about a central T or C. (b) These sequences form an unusual structure in which the strands in one half of the mirror repeat are separated and the pyrimidine-containing strand (alternating T and C residues) folds back on the other half of the repeat to form a triple helix. The purine strand (alternating A and G residues) is left unpaired. This structure produces a sharp bend in the DNA.
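The distinction between a palindrome (an inverted repeat, self-complementary when both strands are considered) and a mirror repeat (symmetric within a single strand, as in the alternating T and C tract of Figure 8–23) can be expressed as two simple string tests. The Python sketch below is illustrative only: the function names and example sequences are assumptions chosen for clarity rather than sequences from the figures, and it tests whole sequences instead of scanning a chromosome for embedded repeats.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindrome(seq: str) -> bool:
    """True if the strand reads the same as its partner strand (both 5'->3').

    Such a sequence is self-complementary and could, in principle,
    fold back on itself to form a hairpin or cruciform.
    """
    return seq == reverse_complement(seq)


def is_mirror_repeat(seq: str) -> bool:
    """True if the strand reads the same forward and backward.

    No complementarity within the strand is implied, so no hairpin can form.
    """
    return seq == seq[::-1]


# Hypothetical example sequences (not taken from Figures 8-20 or 8-23).
examples = ["GAATTC", "TCTCTCTCT", "GATTACA"]

for seq in examples:
    print(f"{seq:10s} palindrome: {is_palindrome(seq)!s:5s} "
          f"mirror repeat: {is_mirror_repeat(seq)}")
```

An alternating T and C tract passes the mirror-repeat test only when it has an odd number of residues, consistent with the caption's description of a repeat centered about a central T or C.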