First published online May 11, 2007; 10.1104/pp.107.100396
Plant Physiology 144:1247-1255 (2007)
© 2007 American Society of Plant Biologists
Distinct Expression Patterns of Natural Antisense Transcripts in Arabidopsis1,[C],[W]
Max Planck Institute for Developmental Biology, Department of Molecular Biology, D72076 Tuebingen, Germany (S.R.H., J.U.L., D.W., M.S.); and Center for Genome Research and Biocomputing and Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331 (J.S.C., K.D.K., J.C.C.)
It has been shown that overlapping cis-natural antisense transcripts (cis-NATs) can form a regulatory circuit in which small RNAs derived from one transcript regulate stability of the other transcript, which manifests itself as anticorrelated expression. However, little is known about how widespread antagonistic expression of cis-NATs is. We have determined how frequently cis-NAT pairs, which make up 7.4% of annotated transcription units in the Arabidopsis (Arabidopsis thaliana) genome, show anticorrelated expression patterns. Indeed, global expression profiles of pairs of cis-NATs on average have significantly lower pairwise Pearson correlation coefficients than other pairs of neighboring genes whose transcripts do not overlap. However, anticorrelated expression that is greater than expected by chance is found in only a small number of cis-NAT pairs. The degree of anticorrelation does not depend on the length of the overlap or on the distance of the 5' ends of the transcripts. Consistent with earlier findings, cis-NATs do not exhibit an increased likelihood to give rise to small RNAs, as determined from available small RNA sequences and massively parallel signature sequencing tags. However, the overlapping regions of cis-NATs appeared to be enriched for small RNA loci compared to nonoverlapping regions. Furthermore, expression of cis-NATs was not disproportionately affected in various RNA-silencing mutants. Our results demonstrate that there is a trend toward anticorrelated expression of cis-NAT pairs in Arabidopsis, but currently available data do not produce a strong signature of small RNA-mediated silencing for this process.
Much of gene expression is primarily regulated at the level of transcription. Over the last few years, however, it has become increasingly apparent that posttranscriptional regulation at the RNA level is more widespread and important than previously assumed (Behm-Ansmant and Izaurralde, 2006
Large-scale genome projects have revealed the common occurrence of overlapping gene pairs in most species analyzed (Lehner et al., 2002
NATs have been implicated in such diverse processes as transcription occlusion, RNA interference, alternative splicing, RNA editing, DNA methylation, and genomic imprinting (Farrell and Lukens, 1995
Making use of large collections of microarray data, we have analyzed the extent to which cis-NATs in Arabidopsis show anticorrelated expression, as reported under salt stress for the SRO5 and P5CDH paradigm. We find that cis-NATs on average are significantly more anticorrelated than nonoverlapping neighboring genes, but clear global anticorrelated expression is restricted to a small subset of cis-NAT pairs, resolving conflicting results that had previously been published (Jen et al., 2005
Antisense Transcript Pairs in Arabidopsis
As a first step toward analyzing the transcriptional regulation of NATs derived from the same or adjacent loci, we categorized the transcription units of the Arabidopsis genome, as annotated by The Arabidopsis Information Resource (TAIR), release 6 (Haas et al., 2005
To investigate the expression profiles of the cis-NATs, we mapped the TAIR6 transcription units onto the Affymetrix ATH1 microarray. We found that 21,021 (out of 30,359) transcripts were represented by the array. Of these, 16,014 were arranged in adjacent pairs, which correspond to about one-half of all transcript pairs encoded by the Arabidopsis genome. There was no substantial difference between adjacent nonoverlapping transcripts transcribed from the same strand (8,258; 51.8%) or from opposite strands (7,022; 53.0%). In contrast, overlapping transcripts derived from the same strand were slightly underrepresented (20; 37.7%), while cis-NATs were slightly overrepresented (714; 63.4%). The latter make up 4.4% of all transcript pairs mapped onto the ATH1 array. Because of the low number of transcript pairs in category 3, these were dropped from further analysis. Mapping information of the four different transcript pair categories onto the Arabidopsis genome and the ATH1 array can be found in Supplemental Tables S1 and S2, respectively.
One concern with cis-NAT predictions is that the transcript ends reported in the TAIR6 annotation might not necessarily be correct (Haas et al., 2005
The number of cis-NATs identified is slightly higher than what had previously been reported (Jen et al., 2005
To examine whether there is a general difference between the expression profiles of cis-NATs and nonoverlapping transcript pairs, we calculated the pairwise Pearson correlation coefficient (PCC) for these transcript pairs from four publicly available data sets generated by the AtGenExpress initiative. The first set comprised data from 234 arrays that capture expression of 78 different tissue samples assayed in triplicate throughout development (Schmid et al., 2005
In all four data sets, the pairwise PCCs of cis-NATs are skewed toward negative values (Fig. 1) when compared to nonoverlapping transcript pairs located on either the same or opposite strands. This shift in distribution was statistically significant in all four data sets using a two-sided, two-sample Welch t test (Table II), regardless of whether all cis-NATs supported by the TAIR6 annotation (714; Table I, category 4) or only the manually curated set (515; Table I, category 4*) were used. Similar results were obtained using pairwise Spearman's rank correlation coefficients (SCCs), which are less sensitive to outliers (Supplemental Table S4). Comparisons of the PCC and SCC values by scatter plot analysis revealed a high degree of similarity, with R2 values ranging between 0.71 and 0.83, indicating the robustness of the anticorrelation we observed (Supplemental Fig. S1).
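The three statistics named above can be illustrated with a minimal sketch in Python. This is not the authors' implementation (the Methods state that the PCCs and SCCs were calculated with programs written in Java); the function names are hypothetical, and only the Welch t statistic and its Welch-Satterthwaite degrees of freedom are computed, not the P value.

```python
import math
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation coefficient (SCC): the PCC of the ranks."""
    return pearson(ranks(x), ranks(y))


def welch_t(a, b):
    """Welch t statistic and degrees of freedom for two samples of PCCs."""
    va = variance(a) / len(a)  # squared standard error of sample a
    vb = variance(b) / len(b)  # squared standard error of sample b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

In this sketch, x and y would be the expression values of the two members of a transcript pair across all arrays of one data set, and a and b would be the lists of pairwise PCCs for cis-NATs and for nonoverlapping pairs, respectively. A negative t then corresponds to the shift toward lower PCCs described for cis-NATs.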
In contrast, distributions between nonoverlapping transcript pairs located on either the same or the opposite strand were not significantly different in any of the data sets. Figure 2 shows the expression profiles of the cis-NATs with the lowest PCCs for the individual microarray experiments.
One limitation of the AtGenExpress data sets is that they lack cellular resolution. We therefore analyzed microarray data from Birnbaum et al. (2003)
The fact that we found, on average, significantly lower PCCs for cis-NATs suggests that expression of one of the transcripts in these pairs can influence expression of the other. However, the PCCs for the majority of cis-NATs fell in the same range as nonoverlapping transcript pairs, suggesting strong mutual regulation for only a subset of cis-NATs. Thus, anticorrelated expression is much less widespread than previously suggested based on MPSS expression data from 14 cDNA libraries in which, for the majority of cis-NATs, coexpression in the same tissue was rarely found (Wang et al., 2005
It has experimentally been demonstrated that SRO5 and P5CDH, a pair of cis-NATs, have antagonistic functions in the regulation of salt tolerance in Arabidopsis (Borsani et al., 2005
We next analyzed whether the same cis-NAT pairs always displayed strong anticorrelation in the various data sets. We found that across the different data sets, the most strongly anticorrelated cis-NATs varied (Fig. 3) and that there was only weak overall correlation between the individual experiments. The highest correlation was found between the development and hormone data sets with R2 = 0.25. For the remaining comparisons, the R2 value ranged from 0.05 to 0.14. Of the 515 manually curated cis-NAT pairs analyzed, only six showed an average PCC of less than -0.5 in all four microarray experiments. Of these, only two had PCC values lower than -0.5 in every individual experiment (Supplemental Table S3). These findings are consistent with the idea that gene expression is primarily regulated at the transcriptional level by factors such as tissue identity, hormone status, or stress, and that clear anticorrelation is seen only under specific conditions. This finding also implies that the simple presence of an antisense transcript is not sufficient for negative cross regulation, suggesting that the effectiveness of posttranscriptional RNA regulation by RNA interference varies greatly.
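The two-step selection of consistently anticorrelated pairs can be sketched as follows. This is an illustration only: the function name and pair identifiers are hypothetical, and the cutoff is assumed to be a negative PCC of -0.5, as anticorrelation implies.

```python
from statistics import mean


def consistently_anticorrelated(pcc_by_pair, threshold=-0.5):
    """Return (pairs whose mean PCC is below threshold,
    pairs whose PCC is below threshold in every data set)."""
    mean_below = [pair for pair, pccs in pcc_by_pair.items()
                  if mean(pccs) < threshold]
    all_below = [pair for pair in mean_below
                 if all(p < threshold for p in pcc_by_pair[pair])]
    return mean_below, all_below


# Hypothetical PCCs for three pairs in four data sets (not real data).
example = {
    "pairA": [-0.6, -0.7, -0.55, -0.8],
    "pairB": [-0.9, -0.9, -0.9, -0.1],
    "pairC": [0.2, 0.1, -0.3, 0.0],
}
mean_below, all_below = consistently_anticorrelated(example)
```

With these made-up values, pairA and pairB pass the average criterion, but only pairA stays below the cutoff in every individual data set, which mirrors the distinction drawn between the six and the two pairs.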
Anticorrelation of Antisense Transcripts Is Not Predicted by Extent of Overlap or Promoter Distance
One obvious parameter that might influence the degree of mutual regulation could be the length of the overlapping region. We therefore analyzed whether the PCC for a given cis-NAT pair was correlated with the length of the overlap but found no evidence for such a relationship (Fig. 4). We next determined whether the distance between the 5' ends of the transcripts of cis-NATs was indicative of the degree of negative correlation found, with the idea that proximity of promoters could cause positive correlation in expression. However, as with the length of the overlap, the distance between the 5' ends of cis-NAT pairs had no effect on their PCCs (data not shown), indicating that varying promoter distance is unlikely to confound the conclusions about transcript overlap and anticorrelated expression.
cis-NAT Transcripts and RNA Silencing
One possible mechanism that might cause negative correlation of cis-NAT RNA accumulation could be the formation of double-stranded RNAs from the overlapping mRNA regions and subsequent processing to siRNAs by DCL proteins. The resulting siRNAs could in turn lead to the destruction of one of the transcripts by an RNA interference-like mechanism, as demonstrated for P5CDH and SRO5 (Borsani et al., 2005
To analyze the contribution of siRNAs to anticorrelated expression of cis-NATs, we examined the distribution of small RNA loci across the genome (Gustafson et al., 2005
We found that over all 1,126 cis-NAT pairs predicted by the TAIR6 annotation, small RNAs were not enriched in cis-NATs when compared to nonoverlapping neighboring gene pairs (Table III, top half). For example, we observed 1.467 small RNA loci/kb genomic sequence in nonoverlapping gene pairs, but we found only 0.388 loci/kb in the cis-NATs. However, if small RNAs were present in cis-NATs at all, they appeared to be enriched in the overlapping region of cis-NAT pairs (1.126 loci/kb) when compared to the nonoverlapping region (0.315 loci/kb). Similar results were obtained when we restricted the analysis to those cis-NATs that are present on the ATH1 arrays (714) and were confirmed by manual curation (515). In all instances, no enrichment of small RNAs in cis-NATs was observed. If one takes into account that not all gene pairs in a given category contain small RNA loci, the outcome differs in that small RNAs were found to be enriched in the overlapping region of cis-NATs (4.949 loci/kb) compared to nonoverlapping gene pairs (2.523 loci/kb) by a factor of approximately 2 (Table III, lower half). Together, these findings suggest that siRNA-mediated silencing does not play a major role in the global regulation of cis-NAT expression, at least not under those conditions examined in published small RNA-sequencing projects (Gustafson et al., 2005
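The density comparisons above reduce to simple ratios, which the following sketch makes explicit. The helper names are hypothetical, and the locus counts and region lengths underlying Table III are not given in this text, so only the reported densities are used.

```python
def loci_per_kb(n_loci, length_bp):
    """Small RNA locus density: loci per kilobase of genomic sequence."""
    return n_loci / (length_bp / 1000.0)


def fold_difference(density_a, density_b):
    """Ratio of two densities; values above 1 mean enrichment in a."""
    return density_a / density_b


# Densities (loci/kb) as reported in the text.
cis_nat_vs_neighbors = fold_difference(0.388, 1.467)    # all pairs
overlap_vs_nonoverlap = fold_difference(1.126, 0.315)   # within cis-NATs
overlap_vs_neighbors = fold_difference(4.949, 2.523)    # pairs with loci only
```

The last ratio is about 1.96, in line with the stated factor of approximately 2, whereas the first is well below 1, reflecting the lack of overall enrichment in cis-NATs.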
Further support for this notion came from analyzing microarray data of mutants affected in the biogenesis of small RNAs (Allen et al., 2005
Our results paint the most detailed picture of the global regulation of cis-NATs in plants so far. While we could show that cis-NAT pairs tend to have more anticorrelated expression patterns than nonoverlapping neighboring transcripts, we found that pronounced anticorrelation across many samples can only be found in a small subset of cis-NATs. Along these lines, we found that discrete cis-NAT pairs show anticorrelated expression in different experiments, suggesting that independent transcriptional regulation of both members of a pair has a strong influence on cis-NAT expression. The negative correlation of cis-NATs was also observed in a cell type-specific data set, indicating that cis-NATs affect each other's expression in individual cells. The observation that small RNA loci, representing mainly siRNAs, were underrepresented in cis-NATs, along with the fact that mutations in the RNA silencing machinery did not have a significant effect on cis-NAT expression, confirms this notion and complements previous suggestions that small RNAs and RNA interference are important for only a subset of cis-NATs (Lu et al., 2005
However, there is at least one known example in which small RNAs derived from cis-NATs have been shown to be important in mutually antagonistic expression, namely, the SRO5 and P5CDH pair of cis-NATs involved in Arabidopsis salt tolerance (Borsani et al., 2005
Mapping of Transcript Pairs
The XML file containing the latest annotation (version 6) of Arabidopsis (Arabidopsis thaliana) pseudochromosomes was downloaded from the TAIR FTP server (ftp://ftp.arabidopsis.org/home/tair/). Start and stop positions of the transcription units, along with information on the strand that encodes an mRNA and the gene description, were extracted. We used Perl scripts to categorize pairs of adjacent transcripts, depending on overlap and whether they were transcribed from the same strand. In a first step, we defined all antisense transcripts that overlapped for at least one base as predicted by the TAIR6 annotation as potential cis-NATs. In a second step, all predicted cis-NATs were manually inspected, and only those that were supported by spliced cDNA and/or EST clones were analyzed further. Single exon genes and gene models not supported by any mRNA were required to be clearly coding (
Mapping information of transcripts onto the Affymetrix ATH1 array was obtained from TAIR as well. We only used those probe sets that mapped to a single transcription unit. In those few cases where a transcription unit was represented by more than one specific probe set, we retained for further analysis only one of the probe sets at random. Pairwise PCCs and pairwise SCCs were calculated using programs written in Java. Histograms (bin size 0.1), ranking, and comparisons of PCCs between individual microarray data sets were created in Microsoft Excel.
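The categorization of adjacent transcript pairs by overlap and strand, described at the start of this section, can be sketched as follows. The original analysis used Perl scripts that are not reproduced here; this Python version is a simplified illustration in which the class, field, and function names are hypothetical, coordinates are assumed to be 1-based and inclusive, and only neighbors in start-sorted order are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transcript:
    name: str
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def overlap_bp(a, b):
    """Number of bases shared by two transcripts on the same chromosome."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def categorize(a, b):
    """Assign a pair to one of the four transcript pair categories."""
    overlapping = overlap_bp(a, b) >= 1  # at least one shared base
    same_strand = a.strand == b.strand
    if overlapping:
        return "overlapping, same strand" if same_strand else "cis-NAT"
    if same_strand:
        return "nonoverlapping, same strand"
    return "nonoverlapping, opposite strands"


def adjacent_pairs(transcripts):
    """Yield (a, b, category) for neighbors on the same chromosome."""
    ordered = sorted(transcripts, key=lambda t: (t.chrom, t.start, t.end))
    for a, b in zip(ordered, ordered[1:]):
        if a.chrom == b.chrom:
            yield a, b, categorize(a, b)


# Hypothetical transcripts for illustration (not real gene models).
demo = [
    Transcript("g1", "1", 100, 500, "+"),
    Transcript("g2", "1", 450, 900, "-"),
    Transcript("g3", "1", 1000, 1500, "-"),
]
demo_categories = [category for _, _, category in adjacent_pairs(demo)]
```

Comparing only sorted neighbors keeps the sketch short but would miss overlaps involving nested or multiply overlapping transcripts. The second step of the procedure, manual inspection of cDNA and EST support, is not modeled.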
All microarray data used are publicly available. Data for correlation analysis were from the AtGenExpress initiative (available from TAIR). Microarray data of small RNA biogenesis mutants (Allen et al., 2005
All MPSS tags and small RNA sequences used are publicly available. MPSS tags were downloaded from the Arabidopsis MPSS database (http://mpss.udel.edu/at/; Meyers et al., 2004a
The following materials are available in the online version of this article.
We are indebted to Blake Meyers for making the MPSS data of small RNAs available as a database dump. The initial generation of AtGenExpress microarray data was supported by the Deutsche Forschungsgemeinschaft through a grant to L. Nover, T. Altmann, and D.W., and by the Max Planck Society. J.U.L. is an EMBO Young Investigator, and D.W. is a director of the Max Planck Institute.
Received March 29, 2007; accepted May 2, 2007; published May 11, 2007.
1 This work was supported by the Max Planck Society, by the National Science Foundation (grant no. MCB0618433 to J.C.C.), and by the U.S. Department of Agriculture (grant no. 20053531915280 to J.C.C.). The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantphysiol.org) is: Markus Schmid (markus.schmid{at}tuebingen.mpg.de).
[C] Some figures in this article are displayed in color online but in black and white in the print edition.
[W] Online version contains Web-only data. www.plantphysiol.org/cgi/doi/10.1104/pp.107.100396 * Corresponding author; e-mail markus.schmid{at}tuebingen.mpg.de; fax 4970716011412.
Allen E, Xie Z, Gustafson AM, Carrington JC (2005) microRNA-directed phasing during trans-acting siRNA biogenesis in plants. Cell 121: 207-221
Bartel DP (2004) MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116: 281-297
Behm-Ansmant I, Izaurralde E (2006) Quality control of gene expression: a stepwise assembly pathway for the surveillance complex that triggers nonsense-mediated mRNA decay. Genes Dev 20: 391-398
Billy E, Brondani V, Zhang H, Muller U, Filipowicz W (2001) Specific interference with gene expression induced by long, double-stranded RNA in mouse embryonal teratocarcinoma cell lines. Proc Natl Acad Sci USA 98: 14428-14433
Birnbaum K, Shasha DE, Wang JY, Jung JW, Lambert GM, Galbraith DW, Benfey PN (2003) A gene expression map of the Arabidopsis root. Science 302: 1956-1960
Borsani O, Zhu J, Verslues PE, Sunkar R, Zhu JK (2005) Endogenous siRNAs derived from a pair of natural cis-antisense transcripts regulate salt tolerance in Arabidopsis. Cell 123: 1279-1291
Breitling R, Armengaud P, Amtmann A, Herzyk P (2004) Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Lett 573: 83-92
Brodersen P, Voinnet O (2006) The diversity of RNA silencing pathways in plants. Trends Genet 22: 268-280
Farrell CM, Lukens LN (1995) Naturally occurring antisense transcripts are present in chick embryo chondrocytes simultaneously with the down-regulation of the alpha 1 (I) collagen gene. J Biol Chem 270: 3400-3408
Gustafson AM, Allen E, Givan S, Smith D, Carrington JC, Kasschau KD (2005) ASRP: the Arabidopsis Small RNA Project Database. Nucleic Acids Res 33: D637-D640
Haas BJ, Wortman JR, Ronning CM, Hannick LI, Smith RK Jr, Maiti R, Chan AP, Yu C, Farzad M, Wu D, et al (2005) Complete reannotation of the Arabidopsis genome: methods, tools, protocols and the final release. BMC Biol 3: 7
Jen CH, Michalopoulos I, Westhead DR, Meyer P (2005) Natural antisense transcripts with coding capacity in Arabidopsis may have a regulatory role that is not linked to double-stranded RNA degradation. Genome Biol 6: R51
Jones-Rhoades MW, Bartel DP, Bartel B (2006) MicroRNAs and their regulatory roles in plants. Annu Rev Plant Biol 57: 19-53
Kasschau KD, Fahlgren N, Chapman EJ, Sullivan CM, Cumbie JS, Givan SA, Carrington JC (2007) Genome-wide profiling and analysis of Arabidopsis siRNAs. PLoS Biol 5: e57
Kiba T, Naitou T, Koizumi N, Yamashino T, Sakakibara H, Mizuno T (2005) Combinatorial microarray analysis revealing Arabidopsis genes implicated in cytokinin responses through the His->Asp phosphorelay circuitry. Plant Cell Physiol 46: 339-355
Kilian J, Whitehead D, Horak J, Wanke D, Weinl S, Batistic O, D'Angelo C, Bornberg-Bauer E, Kudla J, Harter K (2007) The AtGenExpress global stress data set: protocols, evaluation and model data analysis of UV-B light, drought and cold stress responses. Plant J 50: 347-363
Kim DD, Kim TT, Walsh T, Kobayashi Y, Matise TC, Buyske S, Gabriel A (2004) Widespread RNA editing of embedded alu elements in the human transcriptome. Genome Res 14: 1719-1725
Kumar M, Carmichael GG (1998) Antisense RNA: function and fate of duplex RNA in cells of higher eukaryotes. Microbiol Mol Biol Rev 62: 1415-1434
Lehner B, Williams G, Campbell RD, Sanderson CM (2002) Antisense transcripts in the human genome. Trends Genet 18: 63-65
Lu C, Tej SS, Luo S, Haudenschild CD, Meyers BC, Green PJ (2005) Elucidation of the small RNA component of the transcriptome. Science 309: 1567-1569
Meyers BC, Tej SS, Vu TH, Haudenschild CD, Agrawal V, Edberg SB, Ghazal H, Decola S (2004a) The use of MPSS for whole-genome transcriptional analysis in Arabidopsis. Genome Res 14: 1641-1653
Meyers BC, Vu TH, Tej SS, Ghazal H, Matvienko M, Agrawal V, Ning J, Haudenschild CD (2004b) Analysis of the transcriptional complexity of Arabidopsis thaliana by massively parallel signature sequencing. Nat Biotechnol 22: 1006-1011
Nakabayashi K, Okamoto M, Koshiba T, Kamiya Y, Nambara E (2005) Genome-wide profiling of stored mRNA in Arabidopsis thaliana seed germination: epigenetic and genetic regulation of transcription in seed. Plant J 41: 697-709
Nemhauser JL, Hong F, Chory J (2006) Different plant hormones regulate similar processes through largely nonoverlapping transcriptional responses. Cell 126: 467-475
Newbury SF (2006) Control of mRNA stability in eukaryotes. Biochem Soc Trans 34: 30-34
Osato N, Yamada H, Satoh K, Ooka H, Yamamoto M, Suzuki K, Kawai J, Carninci P, Ohtomo Y, Murakami K, et al (2003) Antisense transcripts with rice full-length cDNAs. Genome Biol 5: R5
Rajagopalan R, Vaucheret H, Trejo J, Bartel DP (2006) A diverse and evolutionarily fluid set of microRNAs in Arabidopsis thaliana. Genes Dev 20: 3407-3425
Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Scholkopf B, Weigel D, Lohmann JU (2005) A gene expression map of Arabidopsis thaliana development. Nat Genet 37: 501-506
Shendure J, Church GM (2002) Computational discovery of sense-antisense transcription in the human and mouse genomes. Genome Biol 3: research0044.1-research0044.14
Sureau A, Soret J, Guyon C, Gaillard C, Dumon S, Keller M, Crisanti P, Perbal B (1997) Characterization of multiple alternative RNAs resulting from antisense transcription of the PR264/SC35 splicing factor gene. Nucleic Acids Res 25: 4513-4522
Tufarelli C, Stanley JA, Garrick D, Sharpe JA, Ayyub H, Wood WG, Higgs DR (2003) Transcription of antisense RNA leading to gene silencing and methylation as a novel cause of human genetic disease. Nat Genet 34: 157-165
Vanhee-Brossollet C, Vaquero C (1998) Do natural antisense transcripts make sense in eukaryotes? Gene 211: 1-9
Vazquez F (2006) Arabidopsis endogenous small RNAs: highways and byways. Trends Plant Sci 11: 460-468
Wang H, Chua NH, Wang XJ (2006) Prediction of trans-antisense transcripts in Arabidopsis thaliana. Genome Biol 7: R92
Wang XJ, Gaasterland T, Chua NH (2005) Genome-wide prediction and identification of cis-natural antisense transcripts in Arabidopsis thaliana. Genome Biol 6: R30
Wu Z, Irizarry RA, Gentleman R, Murillo FM, Spencer FA (2004) A model-based background adjustment for oligonucleotide expression arrays. J Am Stat Assoc 99: 909-917
Yamada K, Lim J, Dale JM, Chen H, Shinn P, Palm CJ, Southwick AM, Wu HC, Kim C, Nguyen M, et al (2003) Empirical analysis of transcriptional activity in the Arabidopsis genome. Science 302: 842-846
Yelin R, Dahary D, Sorek R, Levanon EY, Goldstein O, Shoshan A, Diber A, Biton S, Tamir Y, Khosravi R, et al (2003) Widespread occurrence of antisense transcription in the human genome. Nat Biotechnol 21: 379-386