grown. Specialists estimated that research in this area would bring about a revolution in biology. The development of molecular biology showed that the essence of life has a high degree of order and consistency, which was a very important leap in human understanding. Because of this consistency in life activity, molecular biology could develop into a truly general biology, with all subjects in biology unified at the molecular level. Under the influence of molecular biology, many scientists from biochemistry and biophysics moved into the study of biology, which promoted the development of life science and also affected the development of physics and chemistry. Molecular biology is the fastest-developing and most vigorous field in natural science, and a leading discipline of the new century.

Chapter 2 Genome Structure

2.1 DNA and RNA

2.1.1 Nucleic acid primary components

A nucleic acid is a macromolecule composed of chains of monomeric nucleotides. In biochemistry these molecules carry genetic information or form structures within cells. The most common nucleic acids are deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Nucleic acids are universal in living things, as they are found in all cells and viruses.

2.1.1.1 Chemical structure. The term "nucleic acid" is the generic name for a family of biopolymers, named for their role in the cell nucleus. The monomers from which nucleic acids are constructed are called nucleotides. Each nucleotide consists of three components: a nitrogenous heterocyclic base, which is either a purine or a pyrimidine; a pentose sugar; and a phosphate group. Nucleic acid types differ in the structure of the sugar in their nucleotides: DNA contains 2-deoxyribose while RNA contains ribose (the only difference being the presence of a hydroxyl group on the 2' carbon of ribose).
Also, the nitrogenous bases found in the two nucleic acid types differ: adenine, cytosine, and guanine are found in both RNA and DNA, while thymine occurs only in DNA and uracil occurs only in RNA. Other, rarer bases can occur, for example inosine in strands of mature transfer RNA.

Nucleic acids are usually either single-stranded or double-stranded, though structures with three or more strands can form. A double-stranded nucleic acid consists of two single-stranded nucleic acids held together by hydrogen bonds, as in the DNA double helix. In contrast, RNA is usually single-stranded, but any given strand may fold back upon itself to form secondary structure, as in tRNA and rRNA. Within cells, DNA is usually double-stranded, though some viruses have single-stranded DNA as their genome. Retroviruses have single-stranded RNA as their genome.

The sugars and phosphates in nucleic acids are connected to each other in an alternating chain, linked by shared oxygens that form phosphodiester bonds. In conventional nomenclature, the carbons to which the phosphate groups attach are the 3' and 5' carbons of the sugar. This gives nucleic acids polarity. The
bases extend from a glycosidic linkage to the 1' carbon of the pentose sugar ring. Bases are joined through N-1 of pyrimidines and N-9 of purines to the 1' carbon of the sugar by an N-β-glycosyl bond.

2.1.1.2 Types of nucleic acids.

Ⅰ Ribonucleic acid. Ribonucleic acid (RNA) is a nucleic acid polymer consisting of nucleotide monomers, which plays several important roles in the processes of translating genetic information from deoxyribonucleic acid (DNA) into proteins. RNA acts as a messenger between DNA and the protein synthesis complexes known as ribosomes, forms vital portions of ribosomes, and serves as an essential carrier molecule for amino acids to be used in protein synthesis.

Ⅱ Deoxyribonucleic acid. Deoxyribonucleic acid (DNA) is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms. The main role of DNA molecules is the long-term storage of information. DNA is often compared to a set of blueprints, since it contains the instructions needed to construct other components of cells, such as proteins and RNA molecules. The DNA segments that carry this genetic information are called genes, but other DNA sequences have structural purposes or are involved in regulating the use of this genetic information.

2.1.2 Double-helical structure polymorphism

DNA exists in many possible conformations. However, only A-DNA, B-DNA, and Z-DNA have been observed in organisms. Which conformation DNA adopts depends on the sequence of the DNA, the amount and direction of supercoiling, chemical modifications of the bases, and solution conditions such as the concentration of metal ions and polyamines. Of these three conformations, the "B" form is the most common under the conditions found in cells. The two alternative double-helical forms of DNA differ in their geometry and dimensions.

① B-DNA. B-DNA is an antiparallel double helix. It is a right-handed helix.
The base pairs are approximately perpendicular to the axis of the helix (they are in fact very slightly tilted, at an angle of about 4 degrees). The axis of the helix passes through the centre of the base pairs. Each base pair is rotated by 36 degrees from the adjacent base pair, so one full turn contains 10 base pairs. The base pairs are stacked 0.34 nm apart from one another. The double helix therefore repeats every 3.4 nm, i.e. the pitch of the double helix is 3.4 nm. B-DNA has two distinct grooves: a major groove and a minor groove. These grooves form because the beta-glycosidic bonds of the two bases in each base pair are attached on the same edge. However, because the axis of the helix passes through the centre of the base pairs, both grooves are similar in depth.

② A-DNA. A-DNA is one of the many possible double helical structures of DNA. It is a right-handed double helix fairly similar to the more common and well-known B-DNA form, but with a shorter, more compact helical structure. A-DNA is thought to be one of three biologically active double helical structures, along with B- and Z-DNA. It appears likely that it occurs only in dehydrated samples of DNA, such as those used in crystallographic experiments, and possibly in hybrid pairings of DNA and RNA strands. A-DNA is fairly similar to B-DNA given that it is a right-handed double helix
with major and minor grooves. However, as shown in the comparison table below, there is a slight increase in the number of base pairs per turn (resulting in a smaller rotation angle per base pair) and a smaller rise per turn. This results in a deepening of the major groove and a shallowing of the minor groove.

③ Z-DNA. Z-DNA is one of the many possible double helical structures of DNA. It is a left-handed double helical structure in which the double helix winds to the left in a zig-zag pattern (instead of to the right, like the more common B-DNA form). Z-DNA is thought to be one of three biologically active double helical structures, along with A- and B-DNA. Z-DNA is quite different from the right-handed forms, and is often compared with B-DNA in order to illustrate the major differences. The Z-DNA helix is left-handed and has a structure that repeats every 2 base pairs. The major and minor grooves, unlike those of A- and B-DNA, show little difference in width. Formation of this structure is generally unfavourable, although certain conditions can promote it, such as an alternating purine-pyrimidine sequence, DNA supercoiling, or high salt and some cations. Z-DNA can form a junction with B-DNA in a structure which involves the extrusion of a base pair.

2.1.3 Physical and chemical properties

2.1.3.1 Ultraviolet absorption peak. Because DNA and RNA absorb ultraviolet light, with an absorption peak at a wavelength of 260 nm, spectrophotometers are commonly used to determine the concentration of nucleic acid in a solution. Inside a spectrophotometer, a sample is exposed to ultraviolet light at 260 nm, and a photo-detector measures the light that passes through the sample. The more light absorbed by the sample, the higher the nucleic acid concentration in the sample.

2.1.3.2 Denaturation.
Denaturation is a major change in protein or nucleic acid structure caused by the application of some external stress or compound, for example treatment of proteins with strong acids or bases, high concentrations of inorganic salts, organic solvents (e.g., alcohol or chloroform), or heat. If proteins in a living cell are denatured, this results in disruption of cell activity and possibly cell death. Denatured proteins can exhibit a wide range of characteristics, from loss of solubility to communal aggregation. Denatured alcohol is an exception to this definition, as the term refers not to any alteration of the substance's structure but to the addition of toxins and other substances to make it undrinkable.

The denaturation of nucleic acids such as DNA by high temperatures is the separation of a double strand into two single strands, which occurs when the hydrogen bonds between the strands are broken. This happens, for example, during the polymerase chain reaction. The strands realign when "normal" conditions are restored during annealing. If the conditions are restored too quickly, the strands may realign imperfectly.

2.1.3.3 Renaturation. Renaturation is the process by which proteins or complementary strands of nucleic acids re-form their native conformations.

2.1.3.4 Northern/Southern blot. A Southern blot is a method routinely used in molecular biology to check for the presence of a DNA sequence in a DNA sample. Southern blotting combines agarose gel electrophoresis for size separation of DNA with methods to transfer the size-separated DNA to a filter membrane for probe
hybridization. The method is named after its inventor, the British biologist Edwin Southern. The northern blot is a technique used in molecular biology research to study gene expression by detecting RNA. Samples may be run on either agarose or denaturing polyacrylamide gels, depending on the size of the RNA to be detected. A notable difference from the Southern blot procedure, in the case of agarose gels, is the addition of formaldehyde, which acts as a denaturant. For smaller fragments, denaturing polyacrylamide-urea gels are employed. As in the Southern blot, the hybridization probe may be made from DNA or RNA.

2.1.4 Structure information

2.1.4.1 Hydrogen bond. A hydrogen bond is a special type of dipole-dipole force that exists between an electronegative atom and a hydrogen atom bonded to another electronegative atom (nitrogen, oxygen or fluorine). This type of force always involves a hydrogen atom, and in the strongest cases the energy of the attraction approaches that of weak covalent bonds (155 kJ/mol), hence the name hydrogen bonding. These attractions can occur between molecules (intermolecularly) or within different parts of a single molecule (intramolecularly). The hydrogen bond is a very strong fixed dipole-dipole van der Waals-Keesom force, but it is weaker than covalent, ionic and metallic bonds. It lies somewhere between a covalent bond and an electrostatic intermolecular attraction.

Hydrogen bonding also plays an important role in determining the three-dimensional structures adopted by proteins and nucleic acids. In these macromolecules, bonding between parts of the same macromolecule causes it to fold into a specific shape, which helps determine the molecule's physiological or biochemical role. The double helical structure of DNA, for example, is due largely to hydrogen bonding between the base pairs, which link one complementary strand to the other and enable replication.

2.1.4.2 Major and minor grooves. The double helix is a right-handed spiral.
As the DNA strands wind around each other, they leave gaps between each set of phosphate backbones, revealing the sides of the bases inside. There are two of these grooves twisting around the surface of the double helix: the major groove is 22 Å wide and the minor groove is 12 Å wide. The narrowness of the minor groove means that the edges of the bases are more accessible in the major groove. As a result, proteins such as transcription factors that bind to specific sequences in double-stranded DNA usually make contact with the sides of the bases exposed in the major groove.

2.2 Gene and Genome

2.2.1 Gene

The first definition of the gene as a functional unit followed from the discovery that individual genes are responsible for the production of specific proteins. The difference in chemical nature between the DNA of the gene and its protein product led to the concept that a gene codes for a protein. This in turn led to the discovery of the complex apparatus that allows the DNA sequence of a gene to generate the amino acid
sequence of a protein. A gene is a sequence of DNA that produces another nucleic acid, RNA. The DNA has two strands of nucleic acid, and the RNA has only one strand. The sequence of the RNA is determined by the sequence of the DNA (in fact, it is identical to one of the DNA strands, except that uracil takes the place of thymine). In many, but not all, cases the RNA is in turn used to direct production of a protein. Thus a gene is a sequence of DNA that codes for an RNA; in protein-coding genes, the RNA in turn codes for a protein.

2.2.2 Genome

In biology, the genome of an organism is its whole hereditary information, encoded in DNA (or, for some viruses, RNA). This includes both the genes and the non-coding sequences of the DNA. More precisely, the genome of an organism is a complete genetic sequence on one set of chromosomes; for example, one of the two sets that a diploid individual carries in every somatic cell. The term genome can be applied specifically to the information stored on a complete set of nuclear DNA (i.e., the "nuclear genome") but can also be applied to that stored within organelles that contain their own DNA, as with the mitochondrial genome or the chloroplast genome. When people say that the genome of a sexually reproducing species has been "sequenced," they are typically referring to a determination of the sequences of one set of autosomes and one of each type of sex chromosome, which together represent both of the possible sexes. Even in species that exist in only one sex, what is described as "a genome sequence" may be a composite read from the chromosomes of various individuals. In general use, the phrase "genetic makeup" is sometimes used conversationally to mean the genome of a particular individual or organism. The study of the global properties of genomes of related organisms is usually referred to as genomics, which distinguishes it from genetics, which generally studies the properties of single genes or groups of genes.
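To make concrete the earlier statement that an RNA has the same sequence as one of the two DNA strands, the following short Python sketch may help. It is only an illustration under stated assumptions: the function names and the example sequence are hypothetical, and the terms "coding strand" and "template strand" are standard usage rather than terms used in this chapter. The sketch relies on the two facts given in the text, namely that the strands of the double helix are complementary and antiparallel, and that RNA contains uracil where DNA contains thymine.

```python
# Watson-Crick base pairing between the two strands of a DNA double helix.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5'->3'.

    The strands are antiparallel, so the sequence is reversed before each
    base is replaced by its pairing partner.
    """
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def transcribe(coding_strand: str) -> str:
    """Return the RNA that matches the given DNA strand.

    The RNA has the same sequence as this strand, with uracil (U)
    in place of thymine (T).
    """
    return coding_strand.upper().replace("T", "U")


# Hypothetical example sequence, written 5'->3' (an assumption for illustration).
coding = "ATGGCCATTGTA"
template = reverse_complement(coding)  # the other strand of the double helix
rna = transcribe(coding)

print("coding strand  :", coding)
print("template strand:", template)
print("RNA            :", rna)
```

Reading the output, the RNA line matches the first DNA strand base for base apart from the T to U substitution, while the second strand is its antiparallel complement.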
Both the number of base pairs and the number of genes vary widely from one species to another, and there is little connection between the two. At present, the highest known number of genes is around 60,000, for the protozoan causing trichomoniasis, almost three times as many as in the human genome.

2.2.2.1 Types. Most biological entities that are more complex than a virus sometimes or always carry additional genetic material besides that which resides in their chromosomes. In some contexts, such as sequencing the genome of a pathogenic microbe, "genome" is meant to include information stored on this auxiliary material, which is carried in plasmids. In such circumstances, "genome" describes all of the genes and information on non-coding DNA that have the potential to be present. In eukaryotes such as plants, protozoa and animals, however, "genome" typically refers only to information on chromosomal DNA. So although these organisms contain mitochondria that have their own DNA, the genes in this mitochondrial DNA are not considered part of the genome. In fact, mitochondria are sometimes said to have their own genome, often referred to as the "mitochondrial genome".