Nucleic Acids Research, 1993, Vol. 21, No. 17 3981-3987

cDNA structure, expression and nucleic acid-binding properties of three RNA-binding proteins in tobacco: occurrence of tissue-specific alternative splicing

Tetsuro Hirose, Mamoru Sugita and Masahiro Sugiura*

Center for Gene Research, Nagoya University, Nagoya 464-01, Japan

Received June 7, 1993; Accepted July 14, 1993

ABSTRACT

Three cDNAs encoding RNA-binding proteins were isolated from a tobacco (Nicotiana sylvestris) cDNA library. The predicted proteins (RGP-1) are homologous to each other and consist of a consensus-sequence type RNA-binding domain of 80 amino acids in the N-terminal half and a glycine-rich domain of 61-78 amino acids in the C-terminal half. A nucleic acid-binding assay using the in vitro synthesized RGP-1 protein confirmed that it is an RNA-binding protein. Based on their strong affinity for poly(G) and poly(U), the RGP-1 proteins are suggested to bind specifically to G and/or U rich sequences. All three genes are expressed in leaves, roots, flowers and cultured cells; however, substantial amounts of pre-mRNAs accumulate, especially in roots. Sequence analysis and ribonuclease protection assay indicated that significant amounts of alternatively spliced mRNAs, which are produced by differential selection of 5' splice sites, are also present in various tissues. Tissue-specific alternative splicing was found in two of the three genes. The alternatively spliced mRNAs are also detected in polysomal fractions and are suggested to produce truncated polypeptides. A possible role of this alternative splicing is discussed.

INTRODUCTION

Gene expression in eukaryotic cells is known to be regulated at various levels. Regulation at the post-transcriptional level (e.g. at the steps involving RNA capping, RNA splicing, RNA polyadenylation, RNA transport and/or RNA stability) has recently been recognized as a critical factor for the expression of several genes in animal and fungal cells (1, 2).
These events are complex and require many proteins and/or RNA factors. The heterogeneous nuclear ribonucleoprotein (hnRNP) complexes, which are composed of pre-mRNAs, 20-25 different proteins and small nuclear ribonucleoprotein (snRNP) complexes (consisting of U-snRNAs and multiple proteins), are involved in nuclear pre-mRNA processing and splicing (3). Many RNA-binding proteins related to these steps contain one or more copies of a conserved domain of about 80 amino acids, named the consensus-sequence type RNA-binding domain (CS-RBD) (4). The CS-RBD includes two highly conserved motifs: an octamer, termed the ribonucleoprotein consensus sequence (RNP-CS), and a hexamer, RNP-2 (5). The CS-RBD has been shown to be the minimum structure for RNA-binding activity (6).

DDBJ accession nos D16204-D16206 (incl.)

We have isolated five RNA-binding proteins (or ribonucleoproteins, RNPs) from tobacco chloroplasts (7, 8). Analysis of their cDNAs and genes revealed that these proteins are nuclear encoded and contain two typical CS-RBDs and an acidic amino-terminal domain (7-10). Nucleic acid-binding assays of these proteins revealed their preferential binding to poly(G) and poly(U) (11, 12). Recently, similar proteins and their cDNAs were isolated from spinach and maize chloroplasts (13, 14), and the spinach 28RNP has been shown to be required for pre-mRNA processing from the chloroplast petD gene (14). Two cDNAs encoding chloroplast RNA-binding proteins were also isolated from Nicotiana plumbaginifolia using two sets of oligonucleotides designed on the basis of RNP-CS and RNP-2 (15).

In plants, except for these chloroplast RNPs, little is known about protein factors related to the various nuclear RNA processing events. A maize gene whose expression is induced by abscisic acid and water stress was isolated and reported to encode a glycine-rich protein (16). The deduced protein was later pointed out to contain a sequence which conforms to RNP-CS (17).
Recently, homologous cDNAs have been reported from different plant species, e.g. a maize cDNA induced by heavy metal shock (18), two cDNAs in sorghum (19) and one carrot cDNA induced by wounding (20). All of them contain CS-RBD-like sequences, although their nucleic acid-binding properties have not been characterized.

We have attempted to isolate more cDNAs encoding consensus-sequence type RNA-binding proteins from tobacco using oligonucleotide probes designed from the RNP-CS. Here we present three cDNAs which encode consensus-sequence type RNA-binding proteins with high affinities for poly(U) and poly(G) but do not encode transit peptides at their N-termini. All three genes are expressed in leaves as well as in roots, while substantial levels of alternatively spliced mRNAs are present in various tissues. Moreover, two of the three genes also display alternative splicing in a tissue-specific manner.

* To whom correspondence should be addressed

© 1993 Oxford University Press
MATERIALS AND METHODS

cDNA isolation and sequencing

An oligonucleotide probe (RNP-1) was prepared based on the amino acid sequences of the RNP-CSs of chloroplast ribonucleoproteins as shown below.

Amino acid sequence of RNP-CS and flanking residues of two each:
T G R S R G F G F V T M S D F

Deduced nucleotide sequence:
5' ACX GGX ACX AGX CGX GGX TTT GGX TTT GTX ACX ATX TC 3'
A C TC T CC C A T

RNP-1 probe:
3' TGI CCI TCI TCI GCI CCI AAA CCI AAA CAI TGI AAI AG 5'
T G AG A G G G T T

Screening of a tobacco (Nicotiana sylvestris) λgt10 cDNA library was performed essentially according to the instruction manual for Hybond-N membrane (Amersham). The oligonucleotide probe was 5' end-labeled with polynucleotide kinase and hybridizations were carried out at 50°C. Positive clones were isolated and the sizes of the inserts were estimated by the PCR method described by Li and Sugiura (7). Recombinant λDNAs were prepared by the plate lysate method (21). The inserts were then cut out with EcoRI, subcloned into pBluescript SK+ and sequenced by the dideoxy chain termination method (22).

In vitro transcription and translation

The DNA fragment encoding RGP-1b was prepared by PCR using its cloned cDNA as a template and the primers below:

exp1b = 5' GCTCGAGCTGAAGTAGAATACAGTTGC 3'
reverse = 5' AACAGCTATGACCATG 3'

The amplified fragment was cut with XhoI/EcoRI, purified by 1% agarose gel electrophoresis and ligated downstream of the TMV leader of a TMV expression vector at the XhoI site (11). The sequence encoding RGP-1b was verified by DNA sequencing. A capped transcript was produced using an mRNA capping kit (mCAP™ kit, Stratagene) and translation was performed in a rabbit reticulocyte in vitro translation system (Promega) as described previously (12). The N-terminal sequence of the in vitro synthesized protein is MARAEVE-.

Nucleic acid-binding assay

The binding assay was carried out essentially according to Ye and Sugiura (12).
The in vitro synthesized protein labeled with 35S was mixed with 20 µl each of ssDNA-cellulose (0.75-1.5 mg DNA/ml), dsDNA-cellulose (0.75-2 mg DNA/ml), tobacco total RNA-Sepharose (0.6 mg RNA/ml), and four kinds of ribonucleotide homopolymer-Sepharose (poly(G): 0.21 mg/ml, poly(A): 0.8 mg/ml, poly(U): 0.22 mg/ml, poly(C): 0.24 mg/ml) in 1 ml of buffer B (10 mM Tris-HCl, pH 7.6, 2.5 mM MgCl2, 0.5% Triton X-100, 1 µg/ml pepstatin, 1 mM PMSF and 0.1 to 2.0 M NaCl). The mixture was incubated at 4°C for 10 min with gentle shaking. The beads were washed successively once with buffer B containing 2 mg/ml heparin, twice with binding buffer B and twice with distilled water. Bound proteins were eluted with the loading buffer for SDS-polyacrylamide gel electrophoresis (PAGE) and 15 µl of the released proteins were applied to 0.1% SDS/12.5% polyacrylamide gels. After electrophoresis at 100 V for 4-5 h, the gels were dried and exposed to an imaging plate (Fuji Photo Film Co.) overnight. The relative amount of bound protein was calculated using a Fuji Bioimage analyzer BAS2000 (Fuji Photo Film Co.).

Total RNA isolation and Northern analysis

Total RNA was isolated from N. sylvestris leaves, roots and flowers by the hot phenol method (23). Total RNA from tobacco (N. tabacum BY-2) cultured cells was obtained by the small scale preparation method using a mini-BeadBeater (24). RNA electrophoresis and Northern blotting to Hybond-N membrane were carried out according to the instruction manual (Amersham). Oligonucleotide probes (UTR-1a, -1b, -1c; sequences and positions are shown below) corresponding to the 3' untranslated regions of the three cDNAs were labeled at their 5' ends using polynucleotide kinase (specific activity of 2.8-5.1 × 10⁶ cpm/pmol).

UTR-1a (positions 896-937 in the RGP-1a cDNA sequence):
5' CCACAGTAAACCATAACGGAACTTCAACCAAACTTAGAACCA 3'

UTR-1b (positions 777-820 in the RGP-1b cDNA sequence):
5' ACCAACCACACTAAAACAGTAATGGATACTAATCTAAAGGCCAA 3'

UTR-1c (positions 875-914 in the RGP-1c cDNA sequence):
5' GCGATCAAAATAACTAAAATCCACATCTTCTCAATTATCT 3'

Hybridization and washing were done according to the oligonucleotide protocol of the ZETA-PROBE instruction manual (Bio-Rad). Hybridized and washed membranes were exposed to an imaging plate for 48 h and analyzed with a Fuji Bioimage analyzer BAS2000.

Polysomal RNA preparation

Polysomal RNA was prepared according to de Vries et al. (25). About 60 g of N. sylvestris young leaves were homogenized in liquid nitrogen with a mortar and pestle. The resultant leaf powder was gently mixed with 180 ml of polysome buffer (50 mM Tris-HCl, pH 9.0, 50 mM MgCl2, 25 mM EGTA, 250 mM NaCl, 1% Nonidet P40) and gently ground until the ice pieces had completely disappeared. This suspension was centrifuged at 12000 rpm for 15 min and the supernatant was filtered through a single layer of Whatman 3MM paper. The filtrate was overlaid onto a 60% sucrose cushion and centrifuged at 40000 rpm for 3 h at 2°C. The polysome/ribosome pellets were suspended in 3 ml of gradient buffer (10 mM Tris-HCl, pH 8.5, 10 mM MgCl2, 5 mM EGTA, 50 mM NaCl), overlaid on 10-40% sucrose gradients and centrifuged at 25000 rpm for 70 min at 2°C. After fractionation, polysomal fractions were collected and precipitated with ethanol. RNA was extracted as described above.

Ribonuclease protection assay

This was carried out according to the instruction manual of the RPAII kit (Ambion). Plasmid DNAs were linearized with EcoRI. Radioactive antisense RNA probes were synthesized using an in vitro transcription kit (Stratagene) and purified by 6% PAGE in the presence of 7 M urea. RNAs from various tissues were treated with RNase-free DNase I to remove trace amounts of contaminating DNA. RNA probes (about 1 × 10⁵ cpm) were hybridized overnight to 10 µg of tissue RNA at 42°C. The hybridized RNA was digested with a mixture of 0.1 units of RNase A and 100 units of RNase T1 at 37°C for 30 min. The protected fragments were separated on 6% polyacrylamide gels containing 7 M urea and analyzed with a Fuji Bioimage analyzer BAS2000.
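The sizes of the protected fragments expected in this assay follow from simple arithmetic on the probe layout given in the legend to Fig. 5. The Python sketch below is illustrative only and is not part of the original methods. The `Probe` class and the `protected_fragments` function are hypothetical names introduced here, and the split of the RGP-1c-2 probe into 107 nt of exon 1 and 129 nt of exon 2 is an assumption inferred from the fragment sizes marked in Fig. 6 (605, 276, 129 and 107 nt), not a value stated in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Probe:
    """Antisense probe layout in nt (hypothetical helper, for illustration)."""
    exon1: int   # 3' portion of exon 1 covered by the probe
    intron: int  # full intron
    exon2: int   # 5' portion of exon 2 covered by the probe
    vector: int  # pBluescript-derived tail, never protected by plant RNA

    @property
    def length(self) -> int:
        return self.exon1 + self.intron + self.exon2 + self.vector


def protected_fragments(probe: Probe, retained: Optional[int] = None) -> List[int]:
    """Return the expected protected fragment sizes (nt), largest first.

    retained=None          -> fully spliced mRNA (two exon fragments)
    retained=probe.intron  -> unspliced pre-mRNA (one continuous fragment)
    0 < retained < intron  -> mRNA spliced at a downstream 5' splice site,
                              so the first `retained` nt of the intron stay
                              attached to exon 1
    """
    if retained is None:
        sizes = [probe.exon1, probe.exon2]
    elif retained == probe.intron:
        sizes = [probe.exon1 + probe.intron + probe.exon2]
    elif 0 < retained < probe.intron:
        sizes = [probe.exon1 + retained, probe.exon2]
    else:
        raise ValueError("retained must be between 1 and the intron length")
    return sorted(sizes, reverse=True)


# RGP-1c-2 probe: 668 nt = 605 nt of cDNA (positions 11-615) + 63 nt of vector.
# The 107/129 exon split is an assumption inferred from Fig. 6 (see lead-in).
rgp1c = Probe(exon1=107, intron=369, exon2=129, vector=63)

print(rgp1c.length)                              # full probe length
print(protected_fragments(rgp1c, retained=369))  # pre-mRNA
print(protected_fragments(rgp1c, retained=169))  # alternatively spliced mRNA
print(protected_fragments(rgp1c))                # fully spliced mRNA
```

Under these assumptions the sketch gives a 668 nt probe, a single 605 nt fragment for the pre-mRNA, 276 and 129 nt for the mRNA retaining the 5' 169 nt of the intron, and 129 and 107 nt for the fully spliced mRNA.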
RESULTS

Structure of the cDNAs encoding proteins containing CS-RBDs

An oligonucleotide probe (RNP-1), complementary to the nucleotide sequence deduced from the RNP-CS amino acid sequence (7, 8), was used to screen a Nicotiana sylvestris leaf cDNA library. Out of eleven positive clones isolated, the cDNA inserts of six clones were completely sequenced, whereas those of the remainder were partially sequenced. These clones could be classified into four groups according to their predicted protein structures. The cDNAs of three groups (about 0.7-1.0 kb in length) encode proteins consisting of a single CS-RBD in the N-terminal half and a glycine-rich domain in the C-terminal half (Fig. 1A); these were named RNA-binding Glycine-rich Protein-1 (RGP-1) (RGP-1a, 1b, 1c). They do not seem to be imported into organelles, as no apparent transit peptides could be detected. The fourth group is distinct from the RGP-1 group, having only 50% homology; it was named RGP-2 and not analyzed further. The three predicted RGP-1 proteins are highly homologous to each other (1a:1b = 81%, 1a:1c = 80%, 1b:1c = 82%) (Fig. 1B).

The RNP-CS (RGFGFVTF) and RNP-2 (CFVGGL) sequences in their CS-RBDs are identical among the RGP-1 group. About 60% of the glycine residues in the glycine-rich domain are contiguous, and single tyrosines and charged amino acid stretches (RRE or RDR) are inserted between the glycine stretches. The protein-coding regions of the cDNAs are highly conserved; however, the homology of the respective 5' and 3' untranslated regions is relatively low. Each of the RGP-1 proteins showed about 70% homology with the deduced amino acid sequence of an abscisic acid induced cDNA encoding a glycine-rich protein in maize (16).

Figure 1. Structure of the RGP-1 proteins deduced from their cDNA sequences. A. Domain structures of the three RGP-1 proteins. Numbers indicate amino acid positions from the first methionine. B. Alignment of the three RGP-1 protein sequences and the abscisic acid induced glycine-rich protein (AAIP) sequence in maize (16). Asterisks represent amino acids identical to those of RGP-1a. An arrowhead indicates the insertion site of additional sequences (see Fig. 3). The sequences are in DDBJ/EMBL/GenBank under accession nos. D16204 (RGP-1a), D16205 (RGP-1b) and D16206 (RGP-1c).

Figure 2. Nucleic acid-binding properties of the RGP-1b protein. The in vitro synthesized [35S] protein was mixed with various nucleic acids, ssDNA and dsDNA (calf thymus), total RNA (N. sylvestris), and four kinds of ribonucleotide homopolymers, at the indicated salt concentrations (M). Bound proteins were analyzed by SDS-PAGE. Numbers under the respective lanes indicate the amount (%) of bound protein relative to that of the input protein (lanes I). Panels: ssDNA, dsDNA and tobacco RNA; poly(G), poly(A), poly(U) and poly(C); lanes at 0.1, 0.5, 1 and 2 M NaCl.

Figure 3. Additional sequences in the cDNAs. A. Sequenced cDNAs. RGP-1a-1 and RGP-1c-1 were derived from PCR amplification using the respective specific primers corresponding to the 3' untranslated regions and the N-terminal protein coding regions. Open boxes represent exons from start codons to stop codons and lines with bp show additional sequences. B. Alignment of the nucleotide sequences of the additional sequences in RGP-1c-2 and RGP-1c-3. Asterisks indicate identical nucleotides. Bordering sequences between exons and introns, GT and AG, are underlined. The first stop codon in the exon reading frame is boxed. C. Schematic model of the formation of the three RGP-1c cDNAs. Boxes are as described in Fig. 1A.

Nucleic acid-binding property of the RGP-1b protein

To examine its affinity for nucleic acids, the RGP-1b protein was synthesized in vitro and its binding to DNAs and RNAs was assayed. Labeled RGP-1b protein was incubated with nucleic
acid-cellulose or -Sepharose beads at 0.1-2.0 M NaCl. After washing with buffer containing heparin, the bound proteins were separated by SDS-PAGE. The RGP-1b protein remains bound to both ssDNA- and dsDNA-cellulose up to 0.1 M NaCl and to total tobacco RNA up to 1 M NaCl (Fig. 2). We therefore concluded that the RGP-1b protein (and probably the 1a and 1c proteins) is an RNA-binding protein. However, the relative amount of protein bound to ssDNA at 0.1 M NaCl is higher than that bound to total tobacco RNA (Fig. 2). This suggests that the RGP-1 protein binds nonspecifically to nucleic acids at a low salt concentration, whereas its binding to a subpopulation of tobacco total RNA is more specific. We next examined the binding of the protein to poly(G), poly(A), poly(U) and poly(C). Fig. 2 shows that the RGP-1b protein binds specifically to poly(G) and poly(U) up to 2 M NaCl, suggesting that it binds specifically to G/U rich RNA species and/or G/U rich regions of RNA molecules.

Additional sequences found in some of the cDNAs

Four of the analyzed cDNAs were found to contain an additional sequence of 169-369 bp with respect to the corresponding RGP-1 cDNA (Fig. 3A). These sequences are inserted at the same position, between the regions encoding RNP-2 and RNP-CS in the CS-RBD. This site also corresponds to the insertion site of an intron in the genes encoding chloroplast RNA-binding proteins (8-10). Furthermore, these additional sequences contain the GT-AG consensus bordering sequence and the A/T rich nucleotide composition typical of intron sequences in dicot plants (26). Therefore the additional sequence is likely to be an unprocessed intron sequence. Direct sequencing of PCR fragments (amplified from tobacco genomic DNA using RGP-1c specific primers) also confirmed that the additional sequence matches the RGP-1c genomic sequence perfectly. These results indicate that the additional sequence is derived from an intron and that these cDNAs were synthesized from pre-mRNAs. Furthermore, they suggest that these genes have only one intron, at least in the coding region.

Interestingly, additional sequences of different sizes (169 bp and 369 bp) were found among the RGP-1c cDNAs (Fig. 3A). The additional sequence (169 bp) of the RGP-1c-3 cDNA matches perfectly the 5' 169 bp of the additional sequence (369 bp) of the RGP-1c-2 cDNA (Fig. 3B).

Figure 4. Northern blot analysis of transcripts from the three RGP-1 protein (1a, 1b and 1c) genes. N. sylvestris leaf (L) and root (R) RNA (10 µg each), and oligonucleotide probes (UTR-1a, -1b, -1c) that are complementary to parts of the 3' untranslated regions, were used. Size markers are tobacco 25S and 18S rRNAs.

Figure 5. Detection of mRNA species from the RGP-1a (1a), RGP-1b (1b) and RGP-1c (1c) genes by ribonuclease protection assay. Bands indicated by arrowheads with length in nt represent protected fragments from 10 µg of N. sylvestris RNA hybridized with an antisense RNA probe. M, size markers (φX174 RF-DNA digested with HincII); L, leaf RNA; R, root RNA; F, flower RNA; C, cultured BY-2 cell RNA; Y, yeast RNA as control; Pr, RNA probe. The experimental design is shown below each panel. Arrows represent antisense RNA probes corresponding to a 3' portion of exon 1 (E1) + an intron (bold line) + a 5' portion of exon 2 (E2) + a 63 nt portion of pBluescript sequence (double line): 1a, RGP-1a-2 (660 nt, positions 74-670 + 63 nt); 1b, RGP-1b-2 (526 nt, positions 1-471 + 55 nt); and 1c, RGP-1c-2 (668 nt, positions 11-615 + 63 nt). Solid lines designate protected fragments with length in nt. Asterisks indicate the first stop codons in introns.

Transcript levels of the RGP-1 genes

Northern blot analysis of tobacco leaf and root RNAs was performed using oligonucleotide probes corresponding to the 3' untranslated regions of the three cDNAs (the 3' untranslated region is less conserved than the coding region). All three genes were found to be transcribed in both tissues, but their transcript levels were higher in roots than in leaves (Fig. 4). Interestingly, two different transcripts (0.7 kb and 1.1 kb) from the three genes were detected in both leaves and roots. Based on the size of these
two transcripts, it is suggested that the 1.1 kb RNAs are the pre-mRNAs and the 0.7 kb RNAs are the mature mRNAs. Northern blot analysis using an oligonucleotide probe corresponding to a part of the intron also confirmed that only the 1.1 kb RNAs contain the intron (data not shown).

Next we analyzed the effect of drought stress and abscisic acid on the expression of the RGP-1 genes. Young tobacco plants were either sprayed with 0.1 mM abscisic acid or desiccated by air drying. After 12 and 24 h, total RNA was prepared from these plants and analyzed by Northern hybridization using the above oligonucleotide probes. However, no obvious changes were detected in the transcript levels of any RGP-1 gene (data not shown).

Pre-mRNA splicing of the RGP-1 genes

As described above, we isolated two different RGP-1c cDNAs which contain additional sequences of different sizes, 169 bp (the 5' half of the intron) in RGP-1c-3 and 369 bp (the full intron) in RGP-1c-2 (see Fig. 3). In the 369 bp intron, the 169 bp sequence is followed by GT, the 5' intron boundary sequence (see Fig. 3B). This suggests that the RGP-1c-3 cDNA is derived from an alternatively spliced mRNA, produced by alternative selection of the 5' splice site (Fig. 3C). To confirm this, a ribonuclease protection assay was performed using an antisense RNA probe synthesized from a part of the RGP-1c-2 cDNA (derived from the pre-mRNA) and total RNAs from tobacco tissues. Pre-mRNA, alternatively spliced RNA and fully spliced RNA were present in all four tissues examined, viz. leaves, roots, flowers and cultured cells (Fig. 5, 1c).

Ribonuclease protection assays were also performed for the RGP-1a and 1b transcripts as in the case of RGP-1c. The results showed that alternatively spliced RGP-1a and 1b mRNAs are present in roots and flowers but not in leaves and cultured cells, indicating that alternative splicing occurs in a tissue-specific manner (Fig. 5, 1a and 1b). Based on the sizes of the protected fragments and the positions of potential 5' splice sites (GT) in the introns, two alternatively spliced RNAs are likely to be present in roots and flowers for RGP-1a, whereas for RGP-1b probably only one alternatively spliced RNA species is present. Substantial amounts of pre-mRNA accumulated for RGP-1b (and 1c) but not for RGP-1a. These results indicate that alternative splicing by differential selection of 5' splice sites occurs generally in the pre-mRNAs from all three RGP-1 genes. These alternatively spliced mRNAs from all three genes might result in truncated polypeptides upon translation (see Fig. 3B, 3C).

Detection of the alternatively spliced mRNA in polysomal fractions

To find out whether two kinds of polypeptides are in fact produced as a result of alternative splicing of the RGP-1c pre-mRNA, we prepared polysomal fractions from young tobacco leaves and analyzed the RGP-1c mRNA species in the polysomal RNA pool by ribonuclease protection assay (Fig. 6A). Both fully and alternatively spliced mRNAs were found in the polysomal fraction, while the pre-mRNA could not be detected (Fig. 6B). This result clearly indicates that the alternatively spliced mRNA is transported from the nucleus to the cytoplasm and incorporated into polysomes. When the alternatively spliced mRNA is translated, a termination codon appears next to the fourteenth codon from the distal 5' splice site and a short polypeptide of 50 amino acids is expected to be produced (see Fig. 3C).

Figure 6. Detection of the RGP-1c mRNAs in polysomes. A. Fractionation of polysomes/ribosomes from N. sylvestris leaves by sucrose density gradient in the presence of 10 mM MgCl2 (solid line) and 200 mM EDTA (dashed line). Fractions 8 to 17 were collected and polysomal RNA was extracted. B. Ribonuclease protection assay. Bands indicated by arrowheads with length in nt represent protected fragments from 10 µg of N. sylvestris polysomal RNA (P) hybridized with the RGP-1c-2 antisense RNA probe. Other details are as in the legend to Fig. 5.

DISCUSSION

We have isolated three related cDNAs encoding consensus-sequence type RNA-binding proteins from N. sylvestris. The predicted proteins include a single CS-RBD in the N-terminal half and a glycine-rich domain in the C-terminal half, but no transit peptide (Fig. 1A). They are homologous to each other and also to a maize protein induced by abscisic acid (16) (Fig. 1B). The CS-RBDs are highly conserved; however, homology among the glycine-rich domains is lower than that among the CS-RBDs. Repetitive units like GGGGYGGG and GGGRREGGG are commonly present in the glycine-rich domains of these proteins. Glycine stretches with tyrosines are also found in glycine-rich proteins lacking a CS-RBD in tobacco and Arabidopsis (27, 28). A glycine-rich domain is found in several RNA-binding proteins: in the C-terminal region of animal hnRNPs A1 (29) and A2/B1 (30), in the inner region of animal nucleolin (31), and in the spacer region of the cp29 protein from tobacco chloroplasts (8). In the hnRNP A1 protein, this domain has been shown to enhance its affinity for RNA (32). Therefore, the glycine-rich domain in the RGP-1 proteins might function in the same way as in the hnRNP A1 protein.

Our nucleic acid-binding assay using the in vitro synthesized RGP-1b protein has confirmed that it is indeed an RNA-binding protein; however, the strength of binding differs for each nucleic acid. At a low salt concentration more protein bound to ssDNA than to tobacco total RNA, whereas at a higher salt concentration it bound specifically to RNA (Fig. 2). This suggests that this protein binds to specific sequences