© 2007 Nature Publishing Group
INSIGHT REVIEW NATURE|Vol 447|24 May 2007|doi:10.1038/nature05917

Epigenetic inheritance in plants

Ian R. Henderson1 & Steven E. Jacobsen1

The function of plant genomes depends on chromatin marks such as the methylation of DNA and the post-translational modification of histones. Techniques for studying model plants such as Arabidopsis thaliana have enabled researchers to begin to uncover the pathways that establish and maintain chromatin modifications, and genomic studies are allowing the mapping of modifications such as DNA methylation on a genome-wide scale. Small RNAs seem to be important in determining the distribution of chromatin modifications, and RNA might also underlie the complex epigenetic interactions that occur between homologous sequences. Plants use these epigenetic silencing mechanisms extensively to control development and parent-of-origin imprinted gene expression.

1 Department of Molecular, Cell and Developmental Biology, Howard Hughes Medical Institute, University of California, Los Angeles, California 90095, USA.

Eukaryotic genomes are covalently modified with a diverse set of chromatin marks, which are present on both the DNA and the associated histones (see page 407). Although these changes do not alter the primary DNA sequence, they are frequently heritable through cell division, sometimes for multiple generations, and can thus often be classified as epigenetic marks. These conserved epigenetic marks have been found to influence many aspects of gene expression and chromosome biology, and they have characteristic genomic distributions.

The size of eukaryotic genomes varies extensively and does not correlate with gene number1. This is often because of the presence of large amounts of non-gene sequences, which can include pseudogenes, transposable elements, integrated viruses and simple repeats1. At the chromosomal level, genomes are organized into euchromatin, which is gene-rich, and heterochromatin, which is repeat-rich2. Heterochromatin is defined by three main properties: greater compaction than other genomic regions during interphase, lower accessibility than other regions to transcription and recombination machinery, and the formation of structured nucleosome arrays2 (see page 399). The defining characteristics of heterochromatin depend on epigenetic information, including post-translational modification of histones and methylation of cytosine bases in DNA2,3. The silencing of transposable-element sequences within heterochromatin is probably a genome-defence strategy. However, heterochromatin can also have important roles during chromosomal segregation4, and transposons and epigenetic silencing have been shown to both modulate gene expression and contribute to cis-regulatory sequences5,6. Plant systems have been a rich source for the study of epigenetic inheritance, and examples of important discoveries include transposable elements7, paramutation8, small interfering RNAs (siRNAs)9 and RNA-directed DNA methylation10.

Genomic resources for studying the model plant Arabidopsis thaliana have begun to provide insight into the epigenetic ‘landscape’ of this organism11,12. A. thaliana has a compact ~130-megabase (Mb) genome, although it contains considerable amounts of heterochromatin, which is repeat-rich and largely located in the centromeric and pericentromeric regions13,14 (Fig. 1). High-resolution mapping of cytosine methylation by using whole-genome microarrays has confirmed previous reports, showing that this modification co-localizes with repeat sequences and with the centromeric regions11,12,15. Fewer than 5% of expressed genes were shown to have methylated promoters, although about one-third of genes were methylated in their open reading frame11,12. The significance of methylation in the body of a gene is not fully understood, but such methylation was found to correlate with genes that are both highly transcribed and constitutively expressed11,12. By contrast, genes with methylated promoters had lower expression levels and frequently had tissue-specific expression patterns11,12. This distribution of cytosine methylation is in contrast to that observed in mammalian genomes, which are often densely methylated but have hypomethylated CG islands in gene promoters3. It will be important to describe the ‘methylome’ of other repeat-rich plant genomes, such as those of the grasses, to test the generality of the patterns observed in A. thaliana.

Here, we review the emerging and prominent role of RNA in epigenetic inheritance in plants and how such mechanisms are used to control development.

Mediating silencing with RNA

A central question in understanding the epigenetic regulation of genomes is how sequences are recognized or avoided as targets for silencing. There is an increasing appreciation that siRNAs, which are generated by the RNA interference (RNAi) pathway, can provide sequence specificity to guide epigenetic modifications in a diverse range of eukaryotes. Well-studied examples include transcriptional silencing in yeast16 (see page 399), cytosine methylation in plants10,17 and genome rearrangements in ciliates18. RNA-directed DNA methylation was discovered in tobacco, in which genomic sequences homologous to infectious RNA viroids were found to become cytosine methylated10. Subsequently, the expression of double-stranded RNA (dsRNA) in plants was shown to generate siRNAs and cause dense cytosine methylation of homologous DNA in all sequence contexts19. This is reflected by the high coincidence of endogenous siRNA clusters with methylated sequences and repeats in A. thaliana11,12,15,20.

All known de novo DNA methylation in A. thaliana is carried out by DOMAINS REARRANGED METHYLTRANSFERASE 2 (DRM2), which is a homologue of the mammalian DNA methyltransferase 3 (DNMT3) enzymes21–24 (Fig. 2b). DRM2 can be targeted to a sequence by siRNAs generated from the expression of either direct or inverted repeats23,24. Plants encode multiple homologues of the RNAi-machinery components, some of which are specialized for function in RNA-directed DNA methylation25,26. The endoribonuclease DICER-LIKE 3 (DCL3) generates 24-nucleotide siRNAs, which are loaded into the PAZ- and PIWI-domain-containing protein ARGONAUTE 4 (AGO4)26–31 (Fig. 2a). These AGO4-associated siRNAs are proposed to guide the cytosine-methyltransferase activity of DRM2 (refs 26–31). The mechanism by which siRNAs target epigenetic modifications is poorly understood and could involve either DNA–RNA or RNA–RNA hybridization events. Interestingly, epigenetic modifications guided by AGO4 in
A. thaliana have been shown to depend partly on the RNaseH (‘slicer’) catalytic activity of AGO4 (ref. 30). This could be taken as support for RNA–RNA hybridization having an important role in the targeting of epigenetic modifications. The accumulation of siRNAs associated with RNA-directed DNA methylation in A. thaliana often depends on RNA-DEPENDENT RNA POLYMERASE 2 (RDR2) and the plant-specific protein NUCLEAR RNA POLYMERASE IV A (also known as NUCLEAR RNA POLYMERASE D 1A; NRPD1A), which are involved in a putative amplification pathway26,32–35 (Fig. 2a). Together, RDR2 and NRPD1A might generate dsRNA substrates for DCL3 to process into siRNAs, although how these proteins are recruited to target loci is unknown. Several loci also show dependence on AGO4 and DRM2 for siRNA accumulation, suggesting that there might be a feedback loop between transcriptional silencing and siRNA generation24,26. NRPD1A functions in a complex with NRPD2. A variant of this NRPD complex, which contains NRPD1B instead of NRPD1A, is also required for RNA-directed DNA methylation but participates less frequently in siRNA accumulation33,35 (Fig. 2a). One possible function for the NRPD1B-containing complex is to generate a target transcript that can hybridize with siRNA-loaded AGO4-containing complexes. Indeed, AGO4 has been observed to bind directly to NRPD1B28. The SWI–SNF-family chromatin-remodelling protein DEFECTIVE IN RNA-DIRECTED DNA METHYLATION 1 (DRD1) is also required for RNA-directed DNA methylation and could function to facilitate access of DRM2 to target DNA27,36. Recently, several proteins in the RNA-directed DNA-methylation pathway have been found to localize to distinct nuclear bodies, including the Cajal body, which is a centre for the processing and modification of many non-coding RNAs28,29. 
Localization to these bodies might be required for the efficient loading of AGO4-containing complexes with siRNA before these complexes travel to the nucleoplasm and, together with DRM2, direct RNA-directed DNA methylation.

Plants show extensive methylation of cytosine bases in the CG, CNG (where N denotes any nucleotide) and CHH (where H denotes A, C or T) sequence contexts37. By contrast, most cytosine methylation in mammals is found in the CG sequence context3,38. CG methylation is maintained by the homologous proteins METHYLTRANSFERASE 1 (MET1) and DNMT1 in plants and mammals, respectively39,40 (Fig. 2b). DNMT1 has a catalytic preference for hemimethylated substrates, providing an attractive model for the efficient maintenance of CG methylation after DNA replication and during cell division38. Most non-CG methylation in plants is maintained redundantly by DRM2 and the plant-specific protein CHROMOMETHYLASE 3 (CMT3)23,37 (Fig. 2b); however, some loci show residual non-CG methylation in drm1 drm2 cmt3 triple mutants, which might be maintained by MET1 (ref. 25). Non-CG methylation differs from CG methylation, because it seems to require an active maintenance signal after DNA replication. At some loci, siRNAs seem to provide this signal, acting through DRM2 activity: for example, at the MEA-ISR locus (MEDEA INTERSTITIAL SUBTELOMERIC REPEATS locus, an array of seven tandem repeats located downstream of the MEDEA gene), the repeats lose all non-CG methylation in drm2 mutants and in several RNAi-pathway mutants such as ago4 and rdr2 (refs 23, 37). By contrast, other loci (for example, the SINE-class retrotransposon AtSN1) completely lose non-CG methylation only in drm1 drm2 cmt3 triple mutants. At AtSN1, CMT3 contributes to the maintenance of both CNG methylation and asymmetrical (CHH) methylation.
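The three sequence contexts defined above can be made concrete with a short Python sketch. It is illustrative only and not part of the reviewed work: the function names `cytosine_context` and `context_counts` are hypothetical, only the strand supplied (read 5' to 3') is scanned, and a cytosine followed directly by G is assumed to be reported as CG rather than CNG, so that each cytosine receives exactly one label.

```python
from collections import Counter
from typing import Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Label the cytosine at index i of a 5'->3' sequence as CG, CNG or CHH.

    Returns None when too few downstream bases are available or when an
    ambiguous base (for example N) prevents classification.
    """
    seq = seq.upper()
    if seq[i] != "C":
        raise ValueError(f"position {i} is {seq[i]!r}, not a cytosine")
    downstream = seq[i + 1:i + 3]
    if downstream[:1] == "G":
        return "CG"    # assumption: CG takes precedence over CNG
    if len(downstream) < 2:
        return None    # sequence ends before the context is resolved
    if downstream[1] == "G":
        return "CNG"   # N is any nucleotide (here necessarily not G)
    if set(downstream) <= set("ACT"):
        return "CHH"   # H is A, C or T at both positions
    return None


def context_counts(seq: str) -> Counter:
    """Count the cytosines of one strand by sequence context."""
    counts = Counter()
    for i, base in enumerate(seq.upper()):
        if base == "C":
            context = cytosine_context(seq, i)
            if context is not None:
                counts[context] += 1
    return counts


print(context_counts("ACGTCAGCTT"))
```

In the example sequence, the cytosines at positions 1, 4 and 7 fall in the CG, CNG and CHH contexts respectively, so each context is counted once. A full analysis would also scan the reverse complement, because CG and CNG sites are symmetrical across the two strands whereas CHH sites are not.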
The activity of CMT3 largely depends on the main methyltransferase for H3K9 (the lysine residue at position 9 of histone H3), SU(VAR)3-9 HOMOLOGUE 4 (SUVH4; also known as KRYPTONITE), showing that histone methylation is also an important signal for the maintenance of non-CG methylation41,42. At present, the factors that determine the relative importance of the RNAi pathway and histone methylation for the maintenance of non-CG methylation at different loci remain unclear.

[Figure 1: four panels plotted against distance along A. thaliana chromosome 1 (in 100-kb windows), with the centromere marked: Genes (no. of genes per 100 kb), Repeats (no. of repeat bases per 100 kb), Cytosine methylation (no. of methylated bases per 100 kb) and siRNA (no. of siRNAs per 100 kb).]

Figure 1 | The epigenetic ‘landscape’ of A. thaliana. The relative abundance of genes (number of annotated genes11), repeats (repeat bases per 100 kb; ref. 11), cytosine methylation (methylated bases per 100 kb; ref. 11) and siRNAs (cloned siRNAs per 100 kb; ref. 20) is shown for the length of A. thaliana chromosome 1, which is ~30 Mb. Numbers on the x axis represent 100-kb windows along the chromosome. A diagram of chromosome 1 is also shown, with white bars indicating euchromatic arms, grey bars indicating pericentromeric heterochromatin and the black bar indicating the centromeric core. (Figure courtesy of X. Zhang, University of California, Los Angeles.)

Communication of silent information

Epigenetically silent expression states can show remarkable stability throughout mitosis and meiosis, although they can retain the ability to revert to an active state2. This gives rise to the concept of the epigenetic allele (epiallele), which is defined as an allele that shows a heritable difference in expression as a consequence of epigenetic modifications and not changes in DNA sequence. For example, hypermethylated (silent) epialleles of SUPERMAN (which is involved in floral development) known as clark kent are stable during many generations of inbreeding, but they can revert to an unmethylated (active) state at a frequency of ~3% per generation43. Another notable characteristic of certain epialleles is their ability to influence other homologous sequences both in cis and
in trans2. One example is paramutation, which was discovered in plants and is defined as allelic interactions that cause a meiotically heritable change in the expression of one of the alleles8. Trans-phenomena similar to paramutation have also been described in mammals, including at a chimaeric version of the mouse Rasgrf1 (Ras protein-specific guanine nucleotide-releasing factor 1) locus that contained the imprinting control region from the insulin-like growth factor 2 receptor gene44. One of the best-studied paramutation systems is the maize (Zea mays) locus b1, which encodes a transcription factor that is required for accumulation of the pigment anthocyanin8. The paramutagenic epiallele Bʹ, which causes light pigmentation, arises spontaneously at a low frequency from its paramutable parent allele B-I, which causes dark pigmentation45. Bʹ epialleles convert B-I alleles to Bʹ epialleles when heterozygous with 100% penetrance, and the newly created paramutated Bʹ epialleles can pass on their silent state in subsequent crosses45 (Fig. 3). Bʹ epialleles are transcribed at one-twentieth to one-tenth the rate of B-I alleles but have identical gene sequences45,46. Fine-structure recombination mapping of alleles resulting from a cross between individuals with paramutagenic alleles and those with neutral alleles (which cannot participate in paramutation) enabled the sequences required for paramutation to be defined; these sequences are present as an array of 7 tandem 853-base repeats, which is located ~100 kilobases (kb) upstream of b1 (refs 45, 46). The sequences are present as a single copy in neutral alleles. Recombinant alleles with three repeats show partial paramutational ability, whereas alleles with seven repeats are fully active in paramutation45,46. These repeats were also shown to have a closed chromatin structure and more cytosine methylation in Bʹ epialleles than in B-I alleles46.
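The inheritance pattern described for b1 can be illustrated with a toy Python model. It is only a sketch under stated assumptions, not a model taken from the studies cited: the names `paramutate` and `cross` and the allele label `b-neutral` are hypothetical, conversion of B-I to Bʹ in heterozygotes is taken to be 100% penetrant as stated above, spontaneous changes of state are ignored, and neutral alleles are treated as neither inducing nor undergoing conversion.

```python
from itertools import product

PARAMUTAGENIC = "B'"   # silenced epiallele (light pigmentation)
PARAMUTABLE = "B-I"    # active allele (dark pigmentation)


def paramutate(genotype):
    """Apply paramutation within one plant carrying two alleles.

    If a B' epiallele is present, every B-I allele in the same plant is
    converted to B'. Any other allele is left unchanged.
    """
    if PARAMUTAGENIC in genotype:
        return tuple(
            PARAMUTAGENIC if allele == PARAMUTABLE else allele
            for allele in genotype
        )
    return tuple(genotype)


def cross(parent1, parent2):
    """Return every offspring genotype from one allele of each parent.

    Gametes carry the alleles as they stand after paramutation in the
    parent, and paramutation is applied again in each offspring.
    """
    gametes1 = paramutate(parent1)
    gametes2 = paramutate(parent2)
    return [paramutate((a, b)) for a, b in product(gametes1, gametes2)]


f1 = cross(("B'", "B'"), ("B-I", "B-I"))
backcross = cross(f1[0], ("B-I", "B-I"))
print(set(f1), set(backcross))
```

Under these assumptions every F1 plant ends up carrying only Bʹ, and so does every plant from the backcross to B-I/B-I. Without conversion in the heterozygote, half of the backcross progeny would be expected to inherit two active B-I alleles, which is what makes the silent state look as though it is transmitted between alleles.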
However, for Bʹ, cytosine methylation was found to be established after the silent state, so it is unlikely to be its cause46.

There are several models of trans-communication between alleles, including physical pairing of alleles and transmission of an RNA signal. A model for paramutagenic interactions being mediated by siRNA is supported by the finding that a genetic suppressor of paramutation, mediator of paramutation1 (mop1), encodes the maize orthologue of the RNA-dependent RNA polymerase RDR2 (refs 47, 48). So far, siRNAs homologous to the tandem repeats upstream of Bʹ have not been reported, although such repeats are commonly associated with small RNAs20,49. The mop1 gene is also required for silencing transgenes and Mutator-like transposons, indicating that RNA-dependent RNA polymerases and siRNAs have a role in heterochromatic silencing in monocotyledonous plants50. The detailed relationships between siRNAs, chromatin structure at the repeats upstream of Bʹ, and the ability to transfer epigenetic states will be intriguing to determine.

The A. thaliana gene FWA has similarities to maize b1 in that it has tandem repeats upstream that, when methylated, cause heritable silencing of expression51. Stably hypomethylated fwa-1 epialleles have been found to be generated spontaneously and in met1 mutant backgrounds39,40,51, causing overexpression of the transcription factor FWA and a dominant late-flowering phenotype51. In contrast to Bʹ epialleles, methylated and unmethylated fwa epialleles are not influenced by the presence of one another in heterozygotes23,49,51. However, introduction of unmethylated transgenic copies of FWA by Agrobacterium tumefaciens-mediated transformation leads to efficient de novo silencing of the incoming transgene, in a process that depends on both DRM2 and the RNA-directed DNA-methylation RNAi pathway22,23 (Fig. 3).
Intriguingly, an unmethylated FWA transgene obtained after transformation into a drm2 mutant does not become remethylated after outcrossing to wild-type A. thaliana22,23. This finding suggests that, during the transformation process, there is a ‘surveillance’ window when the incoming FWA transgene is competent to be silenced. A. tumefaciens targets the female gametophyte (which is haploid) during transformation, but introduction of FWA into DRM2/drm2 heterozygotes revealed that the silencing window must be present after fertilization49. Structure–function analysis of an FWA transgene showed that the upstream tandem repeats are necessary and sufficient for transformation-dependent silencing and were also found to produce homologous siRNA49.

[Figure 2 panel labels: a, NRPD1A, NRPD1B, NRPD2, RDR2, DCL3, AGO4, DRD1, DRM2; ssRNA, dsRNA, siRNAs, DNA. b, A. thaliana DRM2, MET1 and CMT3; H. sapiens DNMT1, DNMT3A and DNMT3B; domains: BAH, cytosine methyltransferase, UBA, PWWP, zinc finger, chromodomain.]

Figure 2 | RNA-directed DNA methylation. a, Putative pathway for RNA-directed DNA methylation in A. thaliana. Target loci (in this case tandemly repeated sequences; coloured arrows) recruit an RNA polymerase IV complex consisting of NRPD1A and NRPD2 through an unknown mechanism, and this results in the generation of a single-stranded RNA (ssRNA) species. This ssRNA is converted to double-stranded RNA (dsRNA) by the RNA-dependent RNA polymerase RDR2. The dsRNA is then processed into 24-nucleotide siRNAs by DCL3. The siRNAs are subsequently loaded into the PAZ- and PIWI-domain-containing protein AGO4, which associates with another form of the RNA polymerase IV complex, NRPD1B–NRPD2. AGO4 that is ‘programmed’ with siRNAs can then locate homologous genomic sequences and guide the protein DRM2, which has de novo cytosine methyltransferase activity. Targeting of DRM2 to DNA sequences also involves the SWI–SNF-family chromatin-remodelling protein DRD1. The NRPD1B–NRPD2 complex might generate a target transcript (ssRNA) to which the AGO4-associated siRNAs can hybridize. Given that siRNAs homologous to some loci are absent in drm2 mutants and ago4 mutants, it is possible that DNA methylation (blue circles) also stimulates siRNA generation and reinforces silencing. b, DNA methyltransferase structure and function. Plant and mammalian genomes encode homologous cytosine methyltransferases, of which there are three classes in plants and two in mammals. A. thaliana MET1 and Homo sapiens (human) DNMT1 both function to maintain CG methylation after DNA replication, through a preference for hemimethylated substrates, and both have amino-terminal bromo-adjacent homology (BAH) domains of unknown function. De novo DNA methylation is carried out by the homologous proteins DRM2 (in A. thaliana) and DNMT3A and DNMT3B (both in H. sapiens). Despite their homology, these proteins have distinct N-terminal domains, and the catalytic motifs present in the cytosine methyltransferase domain are ordered differently in DRM2 and the DNMT3 proteins. Plants also have another class of methyltransferase, which is not found in mammals. CMT3 functions together with DRM2 to maintain non-CG methylation. PWWP, Pro-Trp-Trp-Pro motif; UBA, ubiquitin associated.

Interestingly, the efficiency by which an incoming
FWA transgene is silenced can be influenced by the methylation state of endogenous FWA49. Whereas introduction of an FWA transgene into a background in which the endogenous FWA gene is methylated leads to extremely efficient silencing of the transgene, transformation into the fwa-1 background, which contains an unmethylated endogenous gene, leads to inefficient methylation and silencing of the FWA transgene49 (Fig. 3). Furthermore, an introduced transgene can occasionally cause silencing of the unmethylated fwa-1 endogenous gene49. These results reveal extensive communication between the transgenic and endogenous FWA gene copies during transformation, and this communication depends on the DNA methylation state of the endogenous gene. Surprisingly, these differences between fwa-1 epialleles are not accounted for by siRNA production, because the repeat-derived siRNAs accumulate equally in plants with wild-type FWA and those with fwa-1 (ref. 49). Hence, recruitment of siRNA machinery to a locus is not always sufficient for RNA-directed DNA methylation and probably also requires modifications of chromatin. Maintenance of silencing at FWA depends mainly on CG methylation, because met1 alleles generate hypomethylated fwa-1 epialleles at a high frequency39,40. Although the tandem repeats upstream of FWA are also methylated at non-CG sequences, loss of this methylation in drm1 drm2 cmt3 triple mutants does not cause reactivation and late flowering37. Genome-wide analysis of cytosine methylation and transcription in drm1 drm2 cmt3 triple mutants has identified genes with methylated promoters, the expression of which depends strongly on DRM- and CMT3-mediated non-CG methylation11. These methylated genes might be responsible for the developmental phenotypes of drm1 drm2 cmt3 triple mutants, which include misshapen leaves and reduced stature27,37. 
[Figure 3 panel labels: a, B-I, Bʹ, heterozygote, ~10% spontaneous conversion, paramutation, crossing Bʹ with B-I, MOP1. b, wild-type and fwa-1 endogenous FWA genes, FWA transgene, A. tumefaciens-mediated transformation, DRM2, AGO4, DCL3, RDR2, NRPD1A, NRPD1B, DRD1.]

Figure 3 | Trans-epiallele interactions at b1 and FWA. a, Paramutation at the b1 locus in maize. The B-I allele (pink) of the b1 gene in maize has an upstream tandem-repeat region (coloured arrows) and spontaneously gives rise to silenced Bʹ epialleles (blue) at a low frequency. Bʹ epialleles are more heavily methylated at cytosine bases in the repeat region and are less frequently transcribed. When the Bʹ epiallele is brought together with a new copy of B-I by crossing of maize plants, the B-I allele is paramutated to a silenced Bʹ state with 100% penetrance. Trans-communication between epialleles requires MOP1, the maize homologue of A. thaliana RDR2, suggesting that siRNA-mediated silencing might be involved in the conversion of B-I to Bʹ. b, De novo silencing of FWA transgenes in wild-type and fwa-1 A. thaliana. The FWA gene in wild-type A. thaliana (pink) is methylated at cytosine bases in a pair of tandem repeats in its promoter, silencing its expression. Mutations that decrease DNA methylation give rise to hypomethylated fwa-1 epialleles (blue), which overexpress the transcription factor FWA, thereby causing late flowering. Introduction of an unmethylated FWA transgene (green) by A. tumefaciens-mediated transformation of wild-type plants results in efficient methylation and silencing of the incoming transgene. This process depends on DRM2, AGO4, DCL3, RDR2, NRPD1A, NRPD1B and DRD1. By contrast, transformation of an fwa-1 background results in inefficient silencing of the transgene, indicating that the methylation state of endogenous FWA is important for transgene silencing.
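The transmission pattern summarized in Figure 3a can be made concrete with a small simulation. This is only an illustrative sketch, not a model taken from the review: the function names `paramutate` and `transmitted` are hypothetical, the epiallele Bʹ is written as the ASCII string `"B'"`, spontaneous conversion of B-I is ignored, and conversion in heterozygotes is assumed to be complete (the 100% penetrance stated in the text). It contrasts what a Bʹ/B-I heterozygote passes on with and without paramutation.

```python
import random
from collections import Counter

PARAMUTAGENIC = "B'"   # silenced epiallele (written Bʹ in the text)
PARAMUTABLE = "B-I"    # active, paramutable allele

def paramutate(genotype):
    """Epialleles present after the two alleles have shared one nucleus.

    Assumption: any B-I allele exposed to B' is converted (100% penetrance);
    other alleles, such as neutral ones, are left unchanged.
    """
    if PARAMUTAGENIC in genotype:
        return tuple(PARAMUTAGENIC if a == PARAMUTABLE else a for a in genotype)
    return tuple(genotype)

def transmitted(genotype, n, rng, paramutation=True):
    """Count the epialleles passed on in n gametes of one plant."""
    alleles = paramutate(genotype) if paramutation else tuple(genotype)
    return Counter(rng.choice(alleles) for _ in range(n))

heterozygote = ("B'", "B-I")
with_paramutation = transmitted(heterozygote, 1000, random.Random(0))
mendelian_only = transmitted(heterozygote, 1000, random.Random(0), paramutation=False)

print(with_paramutation)  # every gamete carries B'
print(mendelian_only)     # both alleles transmitted, roughly half each
```

Under these assumptions the heterozygote transmits only Bʹ, whereas ordinary Mendelian segregation would pass on both alleles. This departure from Mendelian expectation is what makes the change in expression state meiotically heritable.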
In contrast to the independently segregating epialleles that arise in met1 mutants (as a result of the stable loss of CG methylation)39,40,51, backcrossing drm1 drm2 cmt3 triple mutants to wild-type plants or reintroducing either DRM2 or CMT3 by transformation immediately rescues these morphological phenotypes27. This finding suggests that non-CG methylation can be more easily re-established, possibly allowing flexible regulation of genes. However, it is unclear how commonly this type of regulation is used, because few examples of DNA-methylation-regulated plant genes have been described.

Silencing through time and development

The life cycles of plants differ from those of animals in that the products of meiosis undergo mitotic proliferation to form multicellular gametophytes (that is, the embryo sac and the pollen in flowering plants). The embryo sac (female) contains an egg cell, which is haploid, and this is fertilized by a sperm nucleus, which is also haploid, to form a diploid embryo. A second sperm nucleus fertilizes the central cell, which is diploid, to form triploid endosperm, an extra-embryonic tissue that has a supportive role during embryogenesis. The central cell and the endosperm show parent-of-origin-dependent monoallelic expression, or imprinting, which is important for proper seed development52. For example, in A. thaliana, the tandem repeats of maternal FWA alleles are specifically demethylated in the central cell and the endosperm, leading to expression of FWA in these tissues53. Demethylation and activation of FWA depend on maternal expression of the gene encoding the DNA glycosylase–lyase DEMETER (DME), which can directly excise the base 5-methylcytosine54–56. Because the endosperm is a terminally differentiating extra-embryonic tissue, this mechanism does not necessitate remethylation of FWA53.
This is in contrast to mammals, in which demethylation of imprinted genes occurs in primordial germ cells (the cells that ultimately generate the germ line) and is followed by germline-specific remethylation and silencing (see page 425). Other imprinted genes such as MEA and FERTILIZATION-INDEPENDENT SEED 2 also have cytosine-methylated regions in their promoters that are associated with maternally restricted expression55,57. However, only for FWA has it been shown that differential methylation of particular sequences is required for the regulation of imprinting53,58.

Cytosine demethylation is also likely to have an important role in the control of silencing in situations other than gametophytic generation and imprinting. DME belongs to a small A. thaliana gene family that includes the somatically expressed gene REPRESSOR OF SILENCING 1 (ROS1)54,59. Mutations in ROS1 have been shown to increase RNA-directed DNA methylation, and ROS1 has been shown to function as a cytosine demethylase56,59,60. Together, these exciting discoveries have defined a long-sought cytosine demethylation pathway, and they raise many interesting questions. For example, to what extent are genomic methylation patterns balanced by the targeting of de novo DNA methyltransferases and DNA glycosylases? Furthermore, there are indications of a similar mechanism for cytosine demethylation in vertebrates61,62.

[Figure 4 panel labels: a–g; seed, seedling, germination, vegetative development, vernalization, adult plant, flowering, flower, anther, ovary, meiosis, microspore, megaspore, mitosis, pollen, embryo sac, resetting, fertilization; FLC, VIN3, VRN1, VRN2, LHP1.]

Figure 4 | PcG-protein-mediated silencing throughout the A. thaliana life cycle. The activation state of the PcG protein target FLC is illustrated throughout the plant life cycle. a, FLC is transcriptionally active in seeds and seedlings, preventing the plant from flowering and prolonging vegetative development. b, Exposure to a long period of cold (that is, vernalization) results in the expression of VIN3 (red), which initiates repression of FLC transcription, and the binding of the PcG protein VRN2, as well as VRN1 and LHP1 (blue). In this process, chromatin at FLC is epigenetically modified by the trimethylation of H3K27. c, After warmer temperatures return, FLC repression is maintained, allowing flowering to be induced by other cues. d, During flower development, the anthers and ovaries are sites of meiotic differentiation, giving rise to haploid cells known as microspores and megaspores, respectively. e, These meiotic products undergo mitotic proliferation to form the multicellular embryo sac and pollen gametophytes. f, PcG-protein-mediated repression at FLC is removed during an undefined resetting process. g, Then, the pollen contributes sperm nuclei to the embryo sac, and these fertilize the haploid egg cell and diploid central cell (not shown), forming the embryo and endosperm (respectively) in a new seed, in which FLC is re-expressed.
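The sequence of panels a to g in Figure 4 behaves like a two-state memory: one event sets the repressed state, one event clears it, and everything in between leaves it untouched. The sketch below is a deliberately minimal illustration of that logic only. The function name `flc_repressed` and the event strings are hypothetical labels, and the molecular detail (VIN3, VRN1, VRN2, LHP1, H3K27 trimethylation) is collapsed into a single boolean.

```python
def flc_repressed(events):
    """Return True if FLC is under PcG-mediated repression after the events.

    Assumptions for illustration: repression is absent in the seed,
    'vernalization' establishes it, 'resetting' removes it, and every
    other event (warming, flowering, meiosis, mitosis, ...) leaves the
    state unchanged, which is what makes the mark a form of memory.
    """
    repressed = False
    for event in events:
        if event == "vernalization":
            repressed = True
        elif event == "resetting":
            repressed = False
    return repressed

life_cycle = [
    "germination", "vernalization", "warming", "flowering",
    "meiosis", "mitosis", "resetting", "fertilization",
]

# Print the repression state after each successive stage of the cycle.
for i in range(1, len(life_cycle) + 1):
    print(life_cycle[i - 1], flc_repressed(life_cycle[:i]))
```

In this toy model the state set by cold persists through the return of warmth and through flowering, and is cleared only at the resetting step, so the next generation starts with FLC unrepressed again.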