Principles of Molecular Virology

[Figure 1.7 panels. Liquid-phase hybridization: (a) complex mixture of nucleic acids in solution; (b) add labelled ‘probe’ and allow hybridization with target sequence; (c) remove unbound probe and analyse amount of label bound. Solid-phase hybridization: (a) complex mixture of nucleic acids bound to nitrocellulose or nylon membrane; (b) add labelled ‘probe’ and allow hybridization with target sequence; (c) remove unbound probe and analyse amount of label bound.]

Figure 1.7 Nucleic acid hybridization relies on the specificity of base-pairing, which allows a labelled nucleic acid probe to pick out a complementary target sequence from a complex mixture of sequences in the test sample. The label used to identify the probe may be a radioisotope or a nonisotopic label such as an enzyme or chemiluminescent system. Hybridization may be performed with both the probe and test sequences in the liquid phase (top of figure) or with the test sequences bound to a solid phase, usually a nitrocellulose or nylon membrane (below). Both methods may be used to quantify the amount of the test sequence present, but solid-phase hybridization is also used to locate the position of sequences immobilized on the membrane. Plaque and colony hybridization are used to locate recombinant molecules directly from a mixture of bacterial colonies or bacteriophage plaques on an agar plate. Northern and Southern blotting are used to detect RNA and DNA, respectively, after transfer of these molecules from gels following separation by electrophoresis (cf. western blotting, Figure 1.2).
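The base-pairing specificity described in the caption of Figure 1.7 can be illustrated with a short Python sketch. It is only a toy model: it assumes DNA sequences written 5' to 3', requires perfect complementarity, and ignores temperature, salt concentration, and partial mismatches, all of which affect real hybridization. The function names and the example sequences are hypothetical and chosen purely for illustration.

```python
# Watson-Crick base pairs for DNA (assumption: no ambiguity codes, DNA only).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the sequence of the strand that would base-pair with seq."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_targets(probe, mixture):
    """Return (sequence index, position) pairs where the probe could anneal.

    A probe anneals where a sequence contains the probe's reverse
    complement; only perfect matches are reported in this sketch.
    """
    target = reverse_complement(probe)
    hits = []
    for index, seq in enumerate(mixture):
        pos = seq.find(target)
        while pos != -1:
            hits.append((index, pos))
            pos = seq.find(target, pos + 1)
    return hits


# Hypothetical 'complex mixture' and labelled probe.
mixture = ["ATGCCGTTAGC", "GGGTTTAAACC", "TTAACGGCATA"]
probe = "AACGGC"

print(find_targets(probe, mixture))  # [(0, 2)]
```

Note that the third sequence contains the probe sequence itself but is not reported: the probe pairs with its complement, not with an identical copy.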
[Figure 1.8 panels. First cycle: (1) heat DNA to melt strands; (2) cool to allow primers to anneal to target sequences; (3) incubate to allow polymerase to extend primers. Second cycle: (4) heat DNA to melt strands again; (5) cool to allow primers to anneal to target sequences and extend again. Third cycle (etc.).]

Figure 1.8 Polymerase chain reaction (PCR) relies on the specificity of base-pairing between short synthetic oligonucleotide primers and complementary sequences in a complex mixture of nucleic acids to prime DNA synthesis using a thermostable DNA polymerase. Multiple cycles of primer annealing, extension, and thermal denaturation are carried out in an automated process, resulting in a massive amplification (2^n-fold increase after n cycles of amplification) of the target sequence located between the two primers.
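The 2^n-fold amplification quoted in the caption of Figure 1.8 is easy to check numerically. The sketch below computes the theoretical copy number after n cycles. The `efficiency` parameter is an assumption added here for illustration (the caption describes only the ideal case, which corresponds to `efficiency=1.0`), and the function name is hypothetical.

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Theoretical number of target copies after a number of PCR cycles.

    Each cycle multiplies the copy number by (1 + efficiency), so perfect
    doubling (efficiency = 1.0) gives the 2^n-fold increase of the caption.
    The efficiency parameter is an illustrative assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


# Ideal amplification from a single template molecule.
for n in (10, 20, 30):
    print(n, f"{pcr_copies(1, n):,.0f}")
# 10 1,024
# 20 1,048,576
# 30 1,073,741,824
```

Thirty ideal cycles therefore turn one template molecule into roughly a billion copies, which is why PCR can detect very small amounts of a target sequence.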
function from the linear sequence and is thus central to all areas of modern biology. Due to the flood of new sequence information, computers are being used increasingly to make predictions based on nucleotide sequences (Figure 1.9). These include detecting the presence of open reading frames, the amino acid sequences of the proteins encoded by them, control regions of genes such as promoters and splice signals, and the secondary structure of proteins and nucleic acids. However (particularly in the case of RNA), the secondary structure assumed by molecules is almost as important as the primary nucleotide sequence in determining the biological reactions that the molecule may undergo. Caution is needed in interpreting such predicted rather than factual information, and the validity of such predictions should not be accepted without question unless confirmed by biochemical and/or genetic data. However, when the structure of a protein has been determined by X-ray crystallography or NMR, the shape can be accurately modelled and explored in three dimensions on computers (Figure 1.10).

While the genome is the nucleic acid comprising the entire genetic information of an organism, by extension ‘genomics’ is the study of the composition and function of the genetic material of an organism. Virus genomics began with the first complete sequence of a virus genome (bacteriophage φX174 in 1977).
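As a rough illustration of the kind of sequence-based prediction discussed above, the following Python sketch scans a DNA sequence for open reading frames. It is a deliberately simplified model, not the method used to produce Figure 1.9: it assumes the standard start codon ATG and stop codons TAA, TAG, and TGA, examines only the three forward reading frames (real tools also scan the complementary strand), and uses a hypothetical `min_codons` threshold and example sequence.

```python
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Find open reading frames in the three forward frames of seq.

    Returns a sorted list of (start, end, frame) tuples, where seq[start:end]
    runs from the start codon through the stop codon. min_codons counts the
    codons before the stop and is an illustrative threshold.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through the frame one codon (three bases) at a time.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return sorted(orfs)


example = "CCATGAAATTTTAGGG"  # hypothetical sequence
for start, end, frame in find_orfs(example):
    print(frame, start, end, example[start:end])
# 2 2 14 ATGAAATTTTAG
```

As the text cautions, an ORF found this way is only a prediction; whether it actually encodes a protein has to be confirmed by biochemical or genetic data.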
Vast international databases of nucleotide and protein sequence information have now been compiled, and these can be rapidly accessed by computers to compare newly determined sequences with those whose function may have been studied in great detail. At the time of publication, the complete genome sequences of almost 1500 different viruses had been published, with more appearing almost weekly (Table 1.1). Thus we have, in a sense, come full circle in our investigations of viruses, from particles via genomes back to proteins again, and have emerged with a far more profound understanding of these organisms; however, the current pace of research in virology tells us that there is still far more that we need to know.

[Figure 1.9 image: feature map of human immunodeficiency virus type 1 (HXB2), total bases sequenced 9719 bp, showing the LTRs, exons, introns, and polyA signal, and the gag polyprotein, sor 23K protein, 27K protein, pol polyprotein, R protein, tat protein, trs protein, and env polyprotein.]

Figure 1.9 An example of the use of a computer to store and process digitized information from a nucleic acid sequence. This figure shows an analysis of all of the open reading frames (ORFs) present in an HIV-1 provirus. The ORFs present in the three main retrovirus genes, gag, pol, and env, can be seen. This complex analysis took only a few seconds to perform using an ordinary personal computer. Manually, the same task may have taken several days.

Figure 1.10 Three-dimensional structure of the DNA binding domain of SV40 T-antigen reconstructed from NMR data using a computer.

Table 1.1 Genomic comparison of different organisms

Organism                  Number of genes   Percent (%) of genes with known or inferred function
Hepatitis B virus         4                 75
SV40                      6                 100
Herpes simplex virus      80                95
Mimivirus                 900               10
Escherichia coli          4,288             60
Yeast                     6,600             40
Caenorhabditis elegans    19,000            40
Drosophila                14,000            25
Arabidopsis               25,000            40
Mouse                     100,000           10
Human                     100,000           10
FURTHER READING

Alberts, B., Bray, D., Hopkin, K., Johnson, A., Lewis, J., Raff, M., Roberts, K., and Walter, P. (2003). Essential Cell Biology. Garland Science, New York. ISBN 0815334818.
Cann, A.J. (1999). Virus Culture: A Practical Approach. Oxford University Press, Oxford. ISBN 0199637148.
Hendrix, R.W. (2003). Bacteriophage genomics. Current Opinion in Microbiology, 6: 506–511.
Kuby, J., Goldsby, R., Kindt, T.J., and Osborne, B. (2003). Immunology. W.H. Freeman, New York. ISBN 0716749475.
Lesk, A.M. (2002). Introduction to Bioinformatics. Oxford University Press, Oxford. ISBN 0199251967.
Lio, P. and Goldman, N. (2004). Phylogenomics and bioinformatics of SARS-CoV. Trends in Microbiology, 12: 106–111.
Primrose, S.B., Twyman, R.M., and Old, R.W. (2001). Principles of Gene Manipulation. Blackwell Scientific, London. ISBN 0632059540.
Rohwer, F. and Edwards, R. (2002). The Phage Proteomic Tree: a genome-based taxonomy for phage. Journal of Bacteriology, 184: 4529–4535.