The use of NMR methods for conformational studies of nucleic acids

Sybren S. Wijmenga*, Bernd N.M. van Buuren

Umeå University, Department of Medical Biochemistry and Biophysics, S 901 87 Umeå, Sweden

Received 10 July 1997

Progress in Nuclear Magnetic Resonance Spectroscopy 32 (1998) 287–387
0022-2860/98/$19.00 © 1998 Published by Elsevier Science B.V. All rights reserved
PII S0079-6565(97)00023-X

* Corresponding author. Tel: +46 9078 6500; fax: +46 9013 6310; e-mail: sybren@indigo.chem.umu.se

Contents

1. Introduction
2. RNA and DNA synthesis and purification
3. Nomenclature
4. Distances
4.1. Overview of short distances and their general characteristics
4.2. Overview of structurally important intra-nucleotide distances
4.3. Overview of structurally important sequential and cross-strand distances
4.4. Derivation of distances from NOESY spectra and structure characterization using distances
4.5. Conclusion
5. J-couplings
5.1. 1JHC- and 1JCC-couplings
5.2. Overview of J-couplings in the bases
5.3. Ribose sugar
5.4. Determination of the β torsion angle
5.5. Determination of the ε torsion angle
5.6. Torsion angle γ and H5′ and H5″ stereospecific assignment
5.7. χ torsion angle and 3JHC sugar to base
5.8. Measurement of homo- and heteronuclear J-coupling constants
5.8.1. Determination of J-couplings from the shape of the signal
5.8.2. Determination of J-couplings from E.COSY patterns
5.8.2.1. Homonuclear E.COSY
5.8.2.2. Heteronuclear E.COSY
5.8.2.2.1. Determination of JHP- and JCP-couplings
5.8.2.2.2. Determination of JHC-couplings
5.8.2.2.3. Determination of JHH-couplings via HCC-E.COSY spectra
5.8.3. Determination of J-couplings from signal intensities
5.8.3.1. Determination of JHH-couplings from homonuclear (H,H) TOCSY transfer
5.8.3.2. Determination of J-couplings from heteronuclear experiments
6. Chemical shifts
6.1. Chemical shifts; qualitative aspects
6.2. Theory
6.3. 1H shifts
6.4. Structurally important 1H shifts
6.5. 15N and 13C shifts in DNA and RNA
6.6. 31P shifts
7. Assignment methods
7.1. Assignment without isotope labeling
7.2. Assignment with isotope labeling
7.2.1. NOE-based correlation
7.2.2. Through-bond correlation
7.2.2.1. Coherence transfer functions
7.2.2.2. Through-bond amino/imino to non-exchangeable proton correlation
7.2.2.3. Through-bond H2–H8 correlation
7.2.2.4. Through-bond base–sugar correlation
7.2.2.5. Through-bond sugar correlation
7.2.2.6. Through-bond sequential backbone assignment
7.2.3. X-filter techniques
8. Relaxation and dynamics
9. Calculation of structures
10. Prospects for larger systems
11. Conclusions
Acknowledgments
References

Keywords: NMR; Conformational studies; Nucleic acids; RNA; DNA; Labeling; Assignment; Structure

1. Introduction

Nucleic acid molecules play a central role in cell biological processes. DNA's main role is to act as the carrier of genetic information. Furthermore, DNA is transcribed into RNA by a carefully regulated process, and it is duplicated on cell division. RNA's main role is to communicate the genetic information for protein synthesis to the ribosomes. RNA is, however, very versatile. It can also take on the role of DNA as the carrier of genetic information, and it can function as an enzyme. It has even been hypothesized that early in evolution, life was based entirely on RNA (see, for example, Ref. [1]). All these different processes require different structures. The basic structural elements of RNA and DNA are well established, i.e. DNA forms a B-helix, while RNA may be either single-stranded or may form an A-type helix. However, the alternate RNA and DNA structures, associated with many of the different processes mentioned above, are less well known. Only since the early 1990s have technological advances in sample preparation, such as isotope labeling and developments in crystallization, made such structural data available, and allowed the structural basis of the biological functions of DNA and RNA to be addressed.

In the past ten years, we have witnessed an explosion in the number of crystal and solution structures of proteins determined by X-ray crystallography and NMR, respectively. In comparison, the increase in the number of nucleic acid structures determined by either X-ray or NMR has been relatively small. This can be attributed to the difficulties encountered when trying to crystallize nucleic acids for detailed X-ray analysis and to the problem of extensive resonance
overlap in NMR spectra of these compounds. Advances in crystallization techniques have in recent years resulted in the structure determination of the RNA hammerhead enzyme [2,3], one of the two folding domains of the group I intron self-splicing RNA [4,5], and a few RNA–protein complexes [6,7]. In addition, the structures of several DNA duplexes, as well as of a DNA quadruplex, have been determined by means of X-ray crystallography [8]. However, despite the two X-ray structures of the hammerhead, the catalytic mechanism of this ribozyme has not yet been clarified. Crystal packing forces sometimes affect RNA or DNA structures. For example, there is still no crystal structure available of a DNA or RNA hairpin, since these tend to crystallize in biologically less relevant extended duplex structures [9,10]. Solution structures, which can be determined via NMR, are therefore particularly important in DNA and RNA structural biology as a complement to crystallography. In addition, nucleic acids often contain regions of higher conformational flexibility. NMR is particularly suited for identifying such regions. In the field of NMR of nucleic acids, advances were made in the 1980s with the introduction of synthetic methods for preparing well defined DNA sequences. This development also made it possible to produce well defined RNA sequences from DNA templates by enzymatic synthesis via T7-polymerase. These developments led to the determination of several solution DNA and RNA hairpin structures, from which the main folding principles of hairpin loops could be determined [11,12]. In addition, these developments led to the determination of the solution structure of a DNA quadruplex [13,14] and solution structures of triple helix molecules [15], as well as to the determination of a new DNA multi-stranded fold, the C-motif [16]. 
Still, the overlap encountered in NMR spectra limited the size of the molecules that could be studied and the detail by which the structures could be determined. In the early 1990s, methods were developed to produce 13C or 15N enriched RNAs, via enzymatic synthesis, in quantities large enough for NMR studies. This possibility enabled more detailed studies of biologically relevant RNA sequences and folds. Initial NMR studies have been performed and methods have been developed for assignment of resonances of 13C and 15N labeled RNAs. The direct result of this is more reliable resonance assignments. In addition, more extensive constraints lists could be obtained for subsequent structure determination. In the past two years a number of RNA structures with a size up to 30 to 40 nucleotides have been published [17–30], together with RNA–peptide complexes [31–36] and an RNA–protein complex of total molecular weight 22 kDa [37,38]. These studies have also made it clear that the upper size limit for RNAs which can be studied by NMR lies around 30 nucleotides when uniform labeling is employed, a size limit considerably below that for proteins. Only quite recently has it become possible to enrich DNA with 13C and 15N isotopes. Zimmer and Crothers [39] demonstrated that DNA can be enriched via an enzymatic approach, while even more recently 13C and 15N labeled DNA phosphoramidites have become available [40], so that 13C and 15N enriched DNAs can now also be obtained via chemical synthesis. It is to be expected that these possibilities will also have an effect on NMR structural studies of DNA of larger size. Larger DNA systems, such as those forming three- and four-way junctions, have already been studied [41], but these have not yet produced detailed solution structures, again due to the extensive signal overlap (see, for example, Refs. [42–44]). 
It is noteworthy that Altona and co-workers used an extremely interesting approach to achieve the assignments in their studies of four-way junctions [43,44]. They used well-determined hairpins as building blocks for the larger four-way and three-way junctions they studied. This made it possible to obtain resonance assignment in very crowded spectra. The future will reveal whether combining this approach with labeling will allow an extension to larger systems, both for RNAs and DNAs.

Naturally, as isotope enriched nucleic acid molecules are now used in NMR studies, we will pay particular attention in this review to the related NMR methods. Various other reviews [36,45–48] have recently appeared, but they have focused generally on specific aspects of the NMR of isotope enriched RNA. We try here to provide a broad overview, covering as much as possible of the various aspects that come into play when performing NMR structural studies of both DNA and RNA molecules. Furthermore, the field is developing rapidly and new aspects have been published since the appearance of
these reviews. For example, a complete overview of J-couplings in the nucleic acid bases has been published [49] and proton structural chemical shifts have been calculated and compared with experimental data [50]. We will incorporate these aspects into this review, together with a detailed description and critical evaluation of the present state-of-the-art NMR methodology for determining the structure of labeled DNA and RNA molecules.

This review is divided into eleven sections. In Section 2, 13C and 15N labeling, as well as other labeling methods, are described, albeit briefly, in view of the quite detailed descriptions that have recently appeared. The IUPAC nomenclature is introduced in Section 3. In Section 4, we present an overview of the distances found in DNA and RNA molecules and discuss their relevance for NMR structural studies. Section 5 gives an overview of all homonuclear and heteronuclear J-couplings and describes their structural dependencies. We also give an overview of the NMR methods that are or can be used to determine these J-couplings. In Section 6, we describe the chemical shifts and discuss their use both for assignment purposes and as structural parameters. Section 7 forms the heart of this review, and describes and discusses in detail the currently available methods for assignment in both unlabeled and 13C and 15N labeled compounds. Section 8 concentrates on a description of relaxation. Isotope enrichment has opened up the way for detailed relaxation studies in the field of proteins. Such relaxation studies are still scarce in the field of nucleic acids. We place relaxation studies on nucleic acids in the context of parallel studies on proteins, and give an overview of the theoretical background. In Section 9 we briefly describe the actual structure determination from NMR data. In Section 10, we discuss the prospects for extension of NMR studies to larger systems and we attempt to draw some conclusions in Section 11.

2. RNA and DNA synthesis and purification

Two strategies are available for preparing large quantities of DNA and RNA of defined sequence and high purity for NMR studies: (1) chemical synthesis by the phosphoramidite method, and (2) enzymatic synthesis of RNAs via T7-polymerase and of DNAs via DNA-polymerase. For RNA, enzymatic synthesis is the usual method of preparation; although chemical synthesis is possible, it is still prohibitively expensive when large quantities are required. Chemical synthesis is the usual approach for the preparation of DNAs of defined sequence. Zimmer and Crothers [39] have shown how large quantities of DNA can be made via enzymatic synthesis, thus demonstrating the feasibility of 13C and 15N labeling of DNA via this method. However, 13C and 15N labeled DNA phosphoramidites have also recently become commercially available, so that labeled DNAs can conveniently be prepared via chemical synthesis [40]. We refer the reader to the original papers or reviews for the detailed protocols and for discussions of the relative merits of the various approaches [36,45,47,48,51–59]. Here we will concentrate on some general and qualitative aspects.

A certain amount of confusing terminology has crept into the literature with regard to labeling. We will use the following terms: uniform labeling, when every atom of a certain type in the molecule is enriched; residue-type-specific labeling, if all residues of a certain type (e.g. all Adenines) in the molecule are enriched; site-specific labeling, if a particular residue or a number of particular residues are enriched, e.g. A10; partial labeling, if the labeling of a certain residue is on, say, C1′ only. In order to indicate that labeling is not 100%, we add the percentage after the word labeling.

For the enzymatic synthesis of RNA, a DNA template is required from which the RNA is transcribed by T7-polymerase using NTPs as building blocks. The 13C and/or 15N and/or 2H labeled NTPs are usually obtained from E.
coli cells, which are grown on 13C enriched glucose and/or 15N enriched ammonium chloride. The RNA isolated from the cells is broken down to 13C and/or 15N labeled NMPs, which are subsequently converted into NTPs. This method thus allows uniformly labeled RNAs to be made, or residue-type-specific labeled RNA when the in vitro transcription occurs on a mixture of labeled and unlabeled NTPs. The method can in principle easily be extended to achieve deuteration or partial labeling. For example, Michnicka et al. [60] have suggested partial 13C labeling using acetate as a carbon source; most recently Nikonowicz et al. [57] have demonstrated uniform 2H/15N labeling via the enzymatic approach. It is more complicated to
achieve site-specific labeling via the enzymatic method (see, for example, Ref. [36]). Site-specific labeling, on the other hand, can quite easily be achieved via chemical synthesis. This would be the method of choice for the preparation of labeled DNA oligonucleotides.

3. Nomenclature

For atom numbering and torsion angle definitions in nucleic acids we will follow the IUPAC/IUB guidelines [61]. Accordingly, the chemical structure and atom numbering of the five common bases, the pyrimidines C, T and U, and the purines G and A, are given in Fig. 1(A), and of the β-D-(deoxy)riboses in Fig. 1(B), which also indicates the torsion angles in the sugar–phosphate backbone (α, β, γ, δ, ε and ζ) and the glycosidic torsion angle χ. Their definitions are: O3'–P–O5'–C5' (α), P–O5'–C5'–C4' (β), O5'–C5'–C4'–C3' (γ), C5'–C4'–C3'–O3' (δ), C4'–C3'–O3'–P (ε), C3'–O3'–P–O5' (ζ), O4'–C1'–N1–C2 (χ (Py)), and O4'–C1'–N9–C4 (χ (Pu)). Furthermore, it gives a designation of the chain direction and the unit numbering in a polynucleotide chain. Fig. 1(C) shows the two most common conformations of the β-D-(deoxy)ribose sugar ring, the C2'-endo (²E) and the C3'-endo (³E) conformers, also referred to as S-type and N-type conformers, respectively.

To describe the distances we will use the shorthand notation introduced by Wijmenga et al. [62]. In this notation the distance between the protons l and r is given by:

di(l; r) for intranucleotide distances, e.g. di(8; 2')
ds(l; r) for internucleotide distances, e.g. ds(1'; 6)

Here, l corresponds to the proton in the 5'-nucleotide unit and r to the proton in the 3'-nucleotide unit. For methyl protons the l or r is indicated by the letter M. To indicate that the distance is between H3' in the 5'-nucleotide and H5 or the methyl protons in the 3'-nucleotide we use ds(3'; 5/M).
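As an illustration of the torsion angle definitions given earlier in this section, the following minimal Python sketch stores the atom quadruples from the text and evaluates a signed dihedral angle from Cartesian coordinates. The names `TORSION_ATOMS`, `dihedral` and `torsion` are hypothetical helpers introduced here, not part of any established package. The sketch also does not track which atoms belong to a neighbouring nucleotide (for example O3' in α, or P in ε and ζ), so the caller is assumed to supply the correct atoms.

```python
import math

# Atom quadruples copied from the torsion angle definitions in the text.
# Residue membership of the atoms is deliberately not tracked in this sketch.
TORSION_ATOMS = {
    "alpha":   ("O3'", "P", "O5'", "C5'"),
    "beta":    ("P", "O5'", "C5'", "C4'"),
    "gamma":   ("O5'", "C5'", "C4'", "C3'"),
    "delta":   ("C5'", "C4'", "C3'", "O3'"),
    "epsilon": ("C4'", "C3'", "O3'", "P"),
    "zeta":    ("C3'", "O3'", "P", "O5'"),
    "chi_py":  ("O4'", "C1'", "N1", "C2"),   # pyrimidines
    "chi_pu":  ("O4'", "C1'", "N9", "C4"),   # purines
}


def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle p0-p1-p2-p3 in degrees, in the range -180..180."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def torsion(name, coords):
    """Evaluate a named torsion angle from a dict: atom name -> (x, y, z)."""
    return dihedral(*(coords[atom] for atom in TORSION_ATOMS[name]))


# Placeholder geometry (not taken from a real structure), used only to
# exercise the function.
example = {"C5'": (1.0, 0.0, 0.0), "C4'": (0.0, 0.0, 0.0),
           "C3'": (0.0, 0.0, 1.0), "O3'": (0.0, 1.0, 1.0)}
print(round(torsion("delta", example), 1))
```

The sign follows the usual convention that a clockwise rotation of the front bond onto the rear bond, viewed along the central bond, is positive.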
Cross-strand distances are defined as:

dci(l; r) for distances within a base pair, e.g. dci(T-NH3; A-NH₂6)
dcs(l; r)p for distances between adjacent base paired nucleotides, e.g. dcs(1'; 2)3'

The symbols NH and NH₂ represent imino and amino protons, respectively. The directionality in the sequential cross-strand distances has to be indicated. Consider two adjacent base pairs, and define the 5'- and 3'-nucleotides. It can be easily seen that dcs is either between two 3'-nucleotides or between two 5'-nucleotides. This is indicated by the subscript p. Alternatively, when two protons l and r do not fall in any of the above categories the distance is indicated by:

d(l; r) for long-range internucleotide distances, e.g. d(T2-NH3; A9-NH₂6)

Here, T2-NH3 indicates the imino proton of Thymine number 2 and A9-NH₂6 indicates the amino group of Adenine number 9.

4. Distances

Proton to proton distances are essential parameters for the three-dimensional structure determination of biomolecules by NMR. Since only short distances (< 5–6 Å) can be obtained by NMR, it is difficult to determine global features, such as bending of the helix. On the other hand, local features can be determined quite well and most NMR structural studies have focused on these aspects. Consequently, it is of paramount importance to have a good overview of the short distances in the main structural elements, such as the sugar ring, the bases, the base pairs, etc. and of how these distances determine the structural features of those elements. Another aspect is that several of the short distances do not depend on conformation nor do they take on well defined values for the two major helical conformations, A- and B-helices. For this reason, it is particularly useful to have at hand an overview of these distances and their characteristics, so that one can focus on the relevant data for the more interesting structural aspects.
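To make the notion of NMR-accessible short distances concrete, the sketch below lists all proton pairs in a set of coordinates that lie closer than a chosen cutoff. It is only an illustration: the function name `short_proton_pairs`, the default cutoff of 5.0 Å (taken from the lower end of the 5–6 Å range quoted above) and the example coordinates are assumptions, and the coordinates are placeholders rather than values from a real structure.

```python
import math
from itertools import combinations


def short_proton_pairs(protons, cutoff=5.0):
    """Return (name_a, name_b, distance) for proton pairs closer than cutoff.

    protons: dict mapping a proton label to its (x, y, z) position in Angstrom.
    cutoff:  upper distance limit in Angstrom (an assumed default, see lead-in).
    """
    pairs = []
    for (name_a, xyz_a), (name_b, xyz_b) in combinations(protons.items(), 2):
        distance = math.dist(xyz_a, xyz_b)
        if distance < cutoff:
            pairs.append((name_a, name_b, distance))
    # Shortest distances first, since these give the strongest contacts.
    return sorted(pairs, key=lambda pair: pair[2])


# Placeholder coordinates, chosen only to show the filtering.
demo = {
    "T2-H1'": (0.0, 0.0, 0.0),
    "T2-H6": (3.0, 0.0, 0.0),
    "A9-H8": (10.0, 0.0, 0.0),
}
for name_a, name_b, distance in short_proton_pairs(demo):
    print(f"{name_a} / {name_b}: {distance:.2f}")
```

With coordinates from a model A- or B-helix, such a list could serve as a starting point for deciding which proton pairs are expected to be observable at all.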
In the next sections we therefore discuss the short distances and how they reflect structural characteristics, by first giving a more general overview and