First published online January 9, 2003; 10.1104/pp.012732
Plant Physiol, February 2003, Vol. 131, pp. 430-442
Gene Expression in Autumn Leaves1
Umea Plant Science Center, Department of Plant Physiology, Umea University, 901 87 Umea, Sweden (R.B., Jo.K., H.B., S.J.B., Ja.K., Per. G., Pet. G., S.J.); and Department of Biotechnology, Kungliga Tekniska Högskolan, Royal Institute of Technology, Stockholm Center for Physics, Astronomy, and Biotechnology, 106 91 Stockholm, Sweden (F.S., R.E., J.L.)
Two cDNA libraries were prepared, one from leaves of a field-grown aspen (Populus tremula) tree, harvested just before any visible sign of leaf senescence in the autumn, and one from young but fully expanded leaves of greenhouse-grown aspen (Populus tremula × tremuloides). Expressed sequence tags (ESTs; 5,128 and 4,841, respectively) were obtained from the two libraries. A semiautomatic method of annotation and functional classification of the ESTs, according to a modified Munich Institute of Protein Sequences classification scheme, was developed, utilizing information from three different databases. The patterns of gene expression in the two libraries were strikingly different. In the autumn leaf library, ESTs encoding metallothionein, early light-inducible proteins, and cysteine proteases were most abundant. Clones encoding other proteases and proteins involved in respiration and breakdown of lipids and pigments, as well as stress-related genes, were also well represented. We identified homologs to many known senescence-associated genes, as well as seven different genes encoding cysteine proteases, two encoding aspartic proteases, five encoding metallothioneins, and 35 additional genes that were up-regulated in autumn leaves. We also indirectly estimated the rate of plastid protein synthesis in the autumn leaves to be less than 10% of that in young leaves.
Leaf senescence is the final stage
in leaf development, and understanding senescence is important not only
for purely scientific reasons, but also for practical purposes.
Premature senescence leads, for example, to decreased photosynthetic
capacity, and consequently lower yield. Senescence is not simply the
passive death of a leaf because of aging, but is a tightly controlled process during which cell components are degraded in a coordinated fashion and, when nutrients have been relocated to other parts of the
plant body, the cell finally dies (Gan and Amasino,
1997).
During the last decade, studies of leaf senescence, focusing especially
on Arabidopsis, and other annual species to a lesser extent, have
identified a number of senescence-associated genes (SAGs) and cellular
mechanisms of senescence have begun to be elucidated, as reviewed by
various authors (Buchanan-Wollaston, 1997).
We have initiated a project to understand the genetic basis of autumn
senescence and describe here the first steps in this initiative:
large-scale sequencing and analysis of aspen (Populus tremula) expressed sequence tags (ESTs) to identify
candidate genes for regulating and mediating the process. In addition
to gene identification, EST sequencing can also be used to obtain estimates of relative expression levels. Provided that no subtractive methods have been applied during library construction, relative EST
abundance provides an approximate indication of the level of each
transcript in the mRNA pool. If genes are grouped into broad categories
(for example, according to function), the mean numbers of ESTs give a
fairly good estimate of gene transcription in each category, and the
EST frequencies of several sets of Arabidopsis and rice
(Oryza sativa) genes have been shown to correspond roughly
to relative protein stoichiometries (Mekhedov et al.,
2000).
Here, we present an analysis of two different sets of aspen leaf ESTs and use the data to draw conclusions regarding gene expression during autumnal leaf senescence in aspen.
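The use of EST counts as a proxy for expression, as outlined in the introduction, can be illustrated with a minimal Python sketch. It assumes a non-subtracted library in which each EST is one draw from the mRNA pool. The gene names, counts, and category labels below are hypothetical and serve only to show the calculation; they are not data from this study.

```python
from collections import defaultdict


def relative_abundance(est_counts):
    """Return each gene's share of the library as a fraction of all ESTs."""
    total = sum(est_counts.values())
    return {gene: n / total for gene, n in est_counts.items()}


def abundance_by_category(est_counts, gene_to_category):
    """Sum the per-gene fractions within each broad functional category."""
    fractions = relative_abundance(est_counts)
    by_category = defaultdict(float)
    for gene, fraction in fractions.items():
        # Genes without an assigned category are pooled (hypothetical label).
        by_category[gene_to_category.get(gene, "unclassified")] += fraction
    return dict(by_category)


# Hypothetical toy library: names and counts are invented for illustration.
counts = {"geneA": 50, "geneB": 30, "geneC": 20}
categories = {"geneA": "energy", "geneB": "energy", "geneC": "metabolism"}

for category, share in abundance_by_category(counts, categories).items():
    print(f"{category}: {share:.1%} of ESTs")
```

Grouping by category smooths out the sampling noise that affects genes represented by only a few ESTs, which is why category-level estimates are expected to be more robust than single-gene estimates.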
RNA Sampling and Preparation
From a free-growing aspen on the Umea University campus, leaf
samples were harvested twice a week from August 17 until October 1, 1999, flash frozen in liquid nitrogen, and stored at
We measured the amount of extractable RNA from three independent RNA extractions of leaves from each sampling date to get a crude estimate of the kinetics of RNA disappearance as the leaf senesced. We noted an increase in the amount of extractable RNA in the first half of September, but during the latter half of the month the RNA gradually decreased in abundance and finally disappeared (Fig. 2). The amount of extractable RNA does not necessarily correspond precisely to the amount of RNA present in the sample or to the level of protein synthesis. Nevertheless, these changes indicate that chloroplast degradation may be preceded by an increase in protein synthesis, and that protein synthesis activity probably continued for about 2 weeks after the initiation of massive chlorophyll degradation.
EST Sequencing, Bioinformatics, and Database Construction
Two cDNA libraries were constructed and analyzed using a
high-throughput DNA sequencing setup. From the autumn leaf library (harvested on September 14 as described above), 5,258 EST sequences were obtained. To get a reference data set for comparison, we sequenced
4,923 clones from a cDNA library prepared from young, but fully
expanded, leaves of an aspen hybrid grown in a greenhouse (Larsson et al., 1997).
The number of genes represented in a cDNA library and the redundancy of
specific genes can be estimated by clustering the EST sequences
according to sequence similarity. There are several pitfalls in this
procedure. ESTs obtained from the 5' end of the clone might not overlap
although they derive from the same transcript. This is because
many clones (especially those originating from large
transcripts) are not full length. In addition, for highly expressed
genes represented by many copies among the ESTs, clustering programs
like Phrap and the TIGR assembler have a strong tendency to
split them into several contigs (Liang et al., 2000).
We also identified, for each sequence, the protein in the Munich Institute of Protein Sequences (MIPS) Arabidopsis database (MATDB) that gave the highest BLASTX score. Many annotations in MATDB are performed automatically and are, in our experience, less reliable than those in the other databases. However, because every gene in MATDB has been assigned to a functional class in the MIPS classification system, identification of the closest Arabidopsis homolog provided a rapid way to obtain a preliminary functional classification of our sequences, which was later subjected to extensive manual curation. All annotations were entered into a FileMaker Pro database using appropriate scripts.
A total of 4,512 ESTs (44%) could be assigned to a Mendel gene family. In the autumn leaf library, 380 different Mendel GFNs (see "Materials and Methods") were represented, and the young leaf library contained 460 different GFNs. Of these, 155 were shared between the two libraries. The remaining 5,669 ESTs fell into 3,717 homology groups and were given PGFNs. Only 207 PGFNs were shared between the libraries: The autumn leaf library had 2,027 unique PGFNs and the young leaf library had 1,483. Thus, the genes expressed in young leaves corresponded more often to previously characterized proteins or genes than those expressed in autumn leaves, and the pattern of gene expression in the two types of leaf differed markedly.
Different Genes Were Most Abundant in the Two Libraries
To analyze more carefully the differences in gene expression, we compared, from the curated lists of annotated clusters, the most abundant ESTs in the two libraries. Genes encoding "standard" photosynthetic proteins were scarce in the autumn leaf library (Table I).
Rubisco ESTs, for example, were found at a frequency corresponding to 4% of that in the young leaf library (see below), and similar frequencies were found for other genes encoding proteins involved in the photosynthetic light and dark reactions (see below). However, ESTs encoding early light-inducible proteins (ELIPs), which accumulate in the thylakoid membrane during stress, were 13 times more abundant in the autumn leaf library. Several other stress-related proteins, for example metallothionein, blight-associated protein P12 (which has homology to expansin), pollen coat protein (which is related to dehydrins), and proteases, were also frequently found among the autumn leaf ESTs.
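Statements such as "13 times more abundant" compare EST frequencies, not raw counts, because the two libraries differ in size. The following Python sketch shows one way such a ratio could be computed. The library sizes are those given above (5,258 and 4,923 ESTs); the per-gene counts are hypothetical, as the underlying counts are not listed in this section.

```python
AUTUMN_LIBRARY_SIZE = 5258  # ESTs in the autumn leaf library (from the text)
YOUNG_LIBRARY_SIZE = 4923   # ESTs in the young leaf library (from the text)


def frequency(count, library_size):
    """Fraction of the library's ESTs that derive from one gene."""
    return count / library_size


def fold_change(count_a, size_a, count_b, size_b):
    """Ratio of the EST frequency in library A to that in library B."""
    if count_b == 0:
        # With no ESTs in library B the ratio is undefined; a statistical
        # test is more informative than a fold change in that case.
        raise ValueError("fold change is undefined when count_b is zero")
    return frequency(count_a, size_a) / frequency(count_b, size_b)


# Hypothetical counts for a single gene, chosen only for illustration.
ratio = fold_change(26, AUTUMN_LIBRARY_SIZE, 2, YOUNG_LIBRARY_SIZE)
print(f"autumn/young frequency ratio: {ratio:.1f}")
```

Ratios based on very few ESTs in either library are unstable, so they are best read together with a significance estimate.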
As expected, in the young leaf library most genes with a high abundance of ESTs encoded proteins of the photosynthetic apparatus. In fact, of the 20 most abundant genes (Table II), 14 were related to photosynthesis: 679 clones (14%) represented RbcS, encoding the small subunit of Rubisco, and 219 (4.5%) represented Lhcb1, encoding the major LHC II protein. No other protein was represented by more than 2% of the clones. Other proteins related to photosynthesis included seven light-harvesting chlorophyll a/b-binding proteins, two other PS I proteins, two other PS II proteins, and three soluble proteins. One additional gene encoded a chloroplast-located protein (a thiazole biosynthetic enzyme). Among the most frequent sequences that were not related to photosynthesis were two that encoded cytosolic proteins (ubiquitin and metallothionein) and two encoding proteins that appeared to be sorted through the secretory pathway: a cell wall protein (Pro-rich protein) and a germin-like protein. It was apparent that the genes represented in the autumn leaf library were generally less well characterized than those in the young leaf library: Almost all of the genes in Table II have a very well-defined function, whereas this is only true for a minority of the genes in Table I.
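The counts of shared and library-specific gene families reported earlier in this section amount to simple set operations on the family identifiers found in each library. The Python sketch below is an illustration under that assumption; the function name and the toy identifiers are hypothetical. It also checks that the totals reported in the text are internally consistent.

```python
def overlap_summary(ids_a, ids_b):
    """Count identifiers shared between, and unique to, two libraries."""
    a, b = set(ids_a), set(ids_b)
    return {
        "shared": len(a & b),
        "only_a": len(a - b),
        "only_b": len(b - a),
        "total": len(a | b),
    }


# Toy example with invented family identifiers.
print(overlap_summary(["fam1", "fam2", "fam3"], ["fam3", "fam4"]))

# Consistency checks on the numbers reported in the text.
shared_pgfn, autumn_only_pgfn, young_only_pgfn = 207, 2027, 1483
assert shared_pgfn + autumn_only_pgfn + young_only_pgfn == 3717  # homology groups

mendel_ests, remaining_ests = 4512, 5669
assert mendel_ests + remaining_ests == 5258 + 4923  # all ESTs in both libraries
print(f"ESTs assigned to a Mendel gene family: {mendel_ests / (5258 + 4923):.0%}")
```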
Five Metallothionein Genes Were Highly Expressed in Autumn Leaves
Genes encoding metallothionein, the most abundant type of EST in
the autumn leaf library, have previously been shown to be senescence
induced (Kagi, 1991).
Several Cys and Aspartic Protease ESTs Were Abundant in the Autumn Leaf Library
Proteases have important roles in the senescence process
(Ye and Varner, 1996).
ESTs from 35 Additional Genes Were Significantly Enriched in Autumn Leaves
We have used the annotation procedure to analyze about 27,000 ESTs
from five other aspen cDNA libraries (Sterky et al.,
1998).
Although none of the Paul genes, to our knowledge, have homologs
previously claimed to be directly involved in leaf senescence, several
have functions or expression patterns that relate to stress or
senescence. Paul20 encodes a protein homologous to At3g54040 (Nt-sube80
protein), which is known to be induced by elicitors and photoassimilate
accumulation (Herbers et al., 1995).
Representation of Known SAGs
Previous studies of other types of senescing leaves have
identified many genes induced during senescence. To see whether some of
these were expressed at higher levels in autumn leaves than in young
leaves, we compared EST frequencies for all genes listed as senescence
associated by Buchanan-Wollaston (1997).
Known SAGs that were significantly (95% confidence level) enriched in the autumn leaf library were ferritin and the pathogenesis-related protein PR1. Phospholipase D, Asn synthetase, ATP sulfurylase, chitinases class III, and NADH-ubiquinone oxidoreductase subunit K also had higher clone frequencies in the autumn leaf library, but these enrichments were not statistically significant. A number of genes reported to be senescence associated were not enriched in the autumn leaf library, including beta-galactosidase, glutathione S-transferase, catalase, chitinase class I and the glyoxysomal forms of NAD+ malate dehydrogenase, Fru-bisphosphate aldolase, cytosolic Gln synthetase, and glyceraldehyde-3-phosphate dehydrogenase. However, we cannot exclude the possibility that, in some of these cases, the true aspen ortholog of a senescence-specific form of the protein was not correctly identified. Several known SAGs were not found in either library (although some were found in other libraries, not derived from leaf tissue). These included ribonuclease RNS2, malate synthase, isocitrate lyase, phosphoenolpyruvate carboxykinase, and pyruvate orthophosphate dikinase. Of the 26 analyzed genes previously found to be senescence induced, we found expression patterns consistent with such induction for 13 (eight of which were statistically significant); eight appeared not to be up-regulated in this stage of senescence as compared with young leaves; and for five genes, we obtained no data on their expression patterns.
Photosynthesis Was Down-Regulated But Lipid Metabolism and Respiration Were Up-Regulated
The MIPS functional classification scheme is not always appropriate for plant genes.
For example, there is no single class for "photosynthesis": Rubisco and the Calvin cycle enzymes are found in class 01.05.01.05.01 (metabolism, C compound and carbohydrate metabolism, C compound and carbohydrate utilization, autotrophic CO2-fixation, and Calvin cycle), the proteins of the PSs are found in class 02.30 and the chloroplastic ATPase is found, together with mitochondrial ATPase, in class 02.11. Therefore, we constructed a slightly modified MIPS classification scheme, differing from the original in some subclasses within class 1 (metabolism) and 2 (energy) and classified all genes according to it. The modified scheme (named UPSC-MIPS) is presented in the supplementary material. Based on the functional classification of the clones, we compared the classes of genes that were expressed in the two libraries. We did not attempt to classify genes with BLASTX scores under 100 ("not classified" in Fig. 3), and those that were most similar to a plant gene without a known function, typically an Arabidopsis open reading frame, were put in the category "unclassified." The functional classification for each of these genes is included in the list of clones in the supplementary material.
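The two residual categories described above follow a simple decision rule, which the Python sketch below restates. The threshold of 100 and the two category labels come from the text; the function name, its arguments, and the example class label are hypothetical, and the sketch is not the script used to build the database.

```python
BLASTX_SCORE_THRESHOLD = 100  # scores under 100 were not classified (from the text)


def assign_category(blastx_score, best_hit_class):
    """Return a functional category for one EST cluster.

    blastx_score: score of the best BLASTX hit, or None if there was no hit.
    best_hit_class: functional class of the best hit, or None when the hit
        is a gene without a known function (e.g. an anonymous open reading
        frame).
    """
    if blastx_score is None or blastx_score < BLASTX_SCORE_THRESHOLD:
        return "not classified"
    if best_hit_class is None:
        return "unclassified"
    return best_hit_class


# Hypothetical inputs covering the three possible outcomes.
print(assign_category(45, "02.30"))    # weak hit
print(assign_category(250, None))      # strong hit, function unknown
print(assign_category(250, "02.30"))   # strong hit, known class
```

Keeping "not classified" (no reliable homolog) separate from "unclassified" (a homolog of unknown function) matters for the comparison between libraries, because the two fractions behave differently.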
In Figure 3, the percentage of clones found in the different main UPSC-MIPS classes is shown, and the distribution of clones in the subclasses of class 01 (metabolism) and 02 (energy) is shown in Table VIII. The full list of clones in the different classes is found in the supplementary material. The fraction of clones in class 01 (metabolism) was the same in the two libraries, but the subclasses C compound and carbohydrate metabolism, lipid, fatty acid and isoprenoid metabolism, nucleotide metabolism, and nitrogen and sulfur metabolism were more strongly represented among the clones in the autumn leaf library than in the young leaf library.
The differences between the libraries were much more pronounced in the class 02 (energy). For instance, the subclass photosynthesis contained 5.2% of the clones in the autumn leaf library, compared with 33% in the young leaf library. Major differences were also found within the photosynthesis subclass. Subclass 02.30.01.01 (photosynthetic light reaction), for example, accounted for 0.9% of the ESTs in the autumn leaf library, compared with 16% in the young leaf library, an almost 20-fold reduction. On the other hand, subclass 02.30.02.05 (photorespiration) was much less depleted (0.6% versus 1.0%). Several authors have suggested that lipid metabolism provides
energy for the senescing leaf (Gut and Matile, 1988).
The classes from the list of most abundant genes that seemed to be most enriched in the autumn leaf library, 06 (protein destination, including 06.13, proteolysis) and 11 (cell rescue, defense, death, and aging), were much better represented in the autumn than in the young leaf library (7% versus 3% and 11% versus 4%, respectively). The fraction of ESTs without significant homology to any gene in public databases was almost twice as large in the autumn leaf library (28% versus 15%), whereas the fraction of "unclassified" clones (homologous to a gene without assigned function) was about the same in the two libraries.
Transcript Profiling
We analyzed transcript abundance during the autumn, in leaves of a free-growing aspen tree, for five of the genes that we identified as putative SAGs in aspen: ubiquitin, PR1, and three Cys proteases (Pcyprot1, 4, and 6). As a comparison, one apparently down-regulated gene (Lhcb2) was also analyzed. The expression patterns of the two types of gene were, as expected, strikingly different: Although the Lhcb2 mRNA level decreased steadily during the autumn, all five putative SAGs showed an increase in transcript abundance (Fig. 4). However, the patterns differed among the putative SAGs. Pcyprot6 and PR1 mRNAs accumulated to high levels only in the very late stages of senescence, ubiquitin and Pcyprot1 showed a more gradual increase, and the Pcyprot4 transcript showed biphasic behavior, with one peak on August 24 and another on September 14. This supports the hypothesis that many of the genes we identified as potential SAGs in the EST material are indeed SAGs.
The coloration of the leaves of deciduous trees in the fall in
temperate regions is perhaps the most striking example of leaf senescence. Therefore, it is rather surprising that there are no data
on gene expression during autumn leaf senescence. This type of leaf
senescence is probably very similar to senescence in, for example,
detached leaves, but there must also be differences. The autumnal leaf
senescence program is induced in all the leaves by decreasing day
length, regardless of whether they are stressed by other environmental
factors. There is also a lot of natural variation in the regulation of
the process. For instance, in an adaptation to the earlier autumns in
the north, trees from higher latitudes start senescing earlier than
trees from lower latitudes (Pauley and Perry, 1954).
Estimating expression levels from EST frequencies is an indirect method
and there are both technical and biological limitations to such an
analysis. For example, uneven efficiency in the reverse transcription
of mRNAs of different sizes, size fractionation, and the possible
recalcitrance of some genes toward cloning in Escherichia
coli are all problems that may affect the results. Despite these
limitations, the "digital northern" approach has a major advantage
over traditional northern or most DNA chip array experiments because it
gives data on mRNA levels for individual genes relative to the total
mRNA pool. mRNA levels do not necessarily correspond well to protein
synthesis, and there are many well-documented examples of translational
regulation of gene expression in plants. However, for most major
enzymatic components, EST abundance seems to be a fair approximation of
relative protein abundance (Jansson, 1999).
The pattern of gene expression in the two libraries was strikingly
different. Our data indicate that most of the metabolic characteristics
previously reported for senescing leaves (down-regulation of
photosynthesis and up-regulation of genes involved in protein, lipid,
pigment degradation, and respiration, as well as stress-related genes;
for review, see Smart, 1994 In the young, greenhouse-grown leaves a very large proportion of the mRNA pool (and, thus, protein synthesis) was devoted to synthesis of the photosynthetic apparatus: 33% of the ESTs encoded proteins known to be components of the various protein complexes involved in photosynthesis. In contrast, only 5% of the clones in the library from senescing leaves encoded photosynthetic proteins and one-half of those were stress-related photosynthetic proteins such as ELIPs. The average gene encoding a "standard" photosynthetic protein, directly involved in light reaction or CO2 fixation, was down-regulated about 20-fold in the autumn leaves. We expected gene expression in young leaves grown under non-stressed conditions to be heavily concentrated on photosynthesis, but we were surprised to find how little of the gene expression in autumn leaves, which still showed no visible sign of chlorophyll degradation, was dedicated to photosynthesis. Senescence is a strictly controlled developmental process and by the middle of September, photosynthetic gene expression had apparently been turned off and the leaves had prepared to break down their chloroplasts. In addition to the confirmation that autumn senescence shares many
features with senescence in leaves of annual plants, we also identified
a number of genes whose orthologs in Arabidopsis are either unknown or
have not been connected with senescence. By choosing leaves in the
process of chlorophyll degradation as sources for the autumn leaf cDNA
library, we believed that we could get a snapshot of the protein
synthesis activity related to degradation of the chloroplasts, and
possibly other cell constituents as well. Of the 35 Paul genes we
identified, nine encoded proteins whose Arabidopsis orthologs seem to be
chloroplast located, and four of these have no assigned function. These
are all good candidates for proteins involved in degradation of the
chloroplast components, and we also found two known chloroplast
proteases, DegP1 and FtsH2 (Adam et al., 2001).
Another striking difference was the higher fraction of ESTs in the autumn leaf library that showed no significant homology to any known protein in public databases. This could simply be a consequence of the fact that young, green leaves have been very extensively studied and the proteins of such leaves are better characterized. Because gene prediction also relies on EST data, genes expressed in tissues that have not been subjected to EST sequencing are overrepresented among genes for which no orthologs have been found, and/or whose putative function remains unknown. This means that we may overestimate the fraction of "truly novel" genes in our data but, even so, there are many potentially interesting genes to be found in autumn leaves of aspen.
A prominent feature of the nuclear genome of plants is the large
fraction of genes that appear to have originated from the cyanobacterial genome. It is believed that in the evolution of the
green plant, a cyanobacterial progenitor of the chloroplast was
engulfed by the eukaryotic host, becoming enclosed by a double membrane, and then permanently integrated into the plant cell as an
organelle, the chloroplast. The ancestral chloroplast genome has been
estimated to have consisted of around 3,200 genes, roughly 1,700 of
which have been lost because of redundancy between the nuclear and
plastid gene products, and about 1,400 genes appear to have been
transferred to the nuclear genome, leaving only 87 plastid-encoded
genes (Sato et al., 1999).
We found no evidence for a conversion of peroxisomes to glyoxysomes,
as occurs in senescing rape (Brassica napus) leaves
(Vicentini and Matile, 1993).
Our data indicate that Cys and aspartic proteases may play an important
role during chloroplast degradation, whereas at least the ubiquitin
system (as evident from the RNA blot data) is not up-regulated until a
later stage of senescence. It has been shown in other systems that the
proteasome components do not accumulate during senescence
(Bahrami and Gray, 1999).
We believe that this work, in addition to discovering genes, provides insights into gene expression in aspen leaves at a rather early stage of autumn senescence. Moreover, it illustrates the usefulness of leaves of deciduous trees as a model system to study leaf senescence, and we believe that our ongoing transcript profiling using DNA microarrays will make it possible to pinpoint a number of candidate genes for regulators of the process. The ability to control senescence will have important biotechnological implications because trees that shed their leaves too early have lower than optimal productivity, whereas if the senescence process is initiated too late, the tree does not have sufficient time to recapture nutrients and complete the hardening procedure before the winter, and, thus, is likely to suffer from growth limitations and/or frost injuries. Therefore, this study (the first, to the best of our knowledge, in which gene expression during autumn leaf senescence has been studied) may be the first step to a deeper understanding of this biologically important process.
Plant Material
Leaves were sampled from a free-growing aspen (Populus
tremula) at the University of Umea campus. About 30 leaves,
from the outer part of the crown, were sampled twice a week, at 11 AM on each occasion. Leaves were flash frozen in liquid
nitrogen and stored at
RNA Preparation and Blotting
Aspen RNA was prepared according to Chang et al.
(1993)
cDNA Library Construction
The senescence cDNA library was constructed from RNA prepared
from leaves harvested on September 14, using the SMART cDNA library
construction kit system (CLONTECH Laboratories, Palo Alto, CA).
The young leaf cDNA library, described by Larsson et al. (1997)
EST Sequencing
The cDNA inserts were sequenced from the 5' end using PCR products as templates by a Biomek robot (Beckman Instruments, Fullerton, CA) in a 96-well microtiter format. PCR amplifications were performed using general vector primers and standard PCR protocols. The size and quality of the PCR products were checked by gel electrophoresis. The samples were analyzed using a DYEnamic ET Dye Terminator Kit (Amersham-Pharmacia Biotech) and a biotinylated sequencing primer (reverse sequencing primer). The sequencing reaction products were purified on a magnetic workstation (Magnatrix 1200, Magnetic Biosolutions, Stockholm) using paramagnetic beads (Dynapure, Dynal, Oslo) before the samples were loaded onto an ABI 377 (Perkin-Elmer Applied Biosystems, Foster City, CA) or MegaBACE 1000 (Amersham-Pharmacia Biotech) DNA sequencer.
Bioinformatics
Raw sequence chromatograms were processed by the Phred program
(http://www.phrap.com/phred/). Vector sequences and low-quality regions
were deleted using Vectorstrip
(http://www.hgmp.mrc.ac.uk/Software/EMBOSS/). The cleaned inserts were
then stored in FASTA format. The whole of the above process was
performed semiautomatically with the help of Perl scripts. ESTs
containing rRNA, chloroplast DNA, or mitochondrial DNA were identified
by the BLASTN algorithm of NCBI-BLAST (Altschul et al.,
1990).
We constructed a FileMaker Pro database with a Web interface offering options such as direct BLAST searches against our own database (on a local Unix server) or against MIPS or SWISS-PROT (over the Internet). There are also direct links to the Kyoto Encyclopedia of Genes and Genomes (http://www.genome.ad.jp/kegg/) and Enzyme (http://www.expasy.ch/enzyme/) databases for easy access to more detailed information. For contigs, there is also information about the number of clones present in them, and the libraries from which they originate. Significance levels for differential expression were calculated by the
equation of Audic and Claverie (1997).
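For orientation, the Audic and Claverie (1997) statistic can be sketched in Python as follows. The probability of observing y ESTs for a gene in a library of N2 clones, given x ESTs in a library of N1 clones, is taken here as p(y|x) = (N2/N1)^y (x+y)! / [x! y! (1 + N2/N1)^(x+y+1)]. The function names are hypothetical, and using the smaller cumulative tail as the significance level is our assumption about how the distribution is applied; it is not a description of the scripts used in this work. The example counts are invented; the library sizes are those reported in the Results.

```python
from math import exp, lgamma, log


def audic_claverie_pmf(y, x, n1, n2):
    """p(y | x): chance of y ESTs in library 2 given x ESTs in library 1."""
    r = n2 / n1
    # Work in log space so that the factorials do not overflow.
    log_p = (
        y * log(r)
        + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
        - (x + y + 1) * log(1 + r)
    )
    return exp(log_p)


def tail_probability(x, y, n1, n2):
    """Smaller of the two cumulative tails at the observed count y."""
    lower = sum(audic_claverie_pmf(k, x, n1, n2) for k in range(y + 1))
    upper = 1.0 - (lower - audic_claverie_pmf(y, x, n1, n2))
    return max(0.0, min(lower, upper))


# Hypothetical gene: 3 ESTs among 4,923 young leaf clones and 20 ESTs
# among 5,258 autumn leaf clones.
p = tail_probability(x=3, y=20, n1=4923, n2=5258)
print(f"tail probability: {p:.2e}")
```

Under this reading, a gene would be called differentially represented at the 95% confidence level when the tail probability falls below the chosen cutoff; whether a one-sided or two-sided cutoff was applied is not stated here.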
We wish to thank Baram Amini, Thomas Hiltonen, Susanne Larsson, Björn Sjöblom, and Carl Zingmark for their participation during various stages of this work.
Received August 14, 2002; returned for revision October 9, 2002; accepted November 7, 2002.
1 This work was supported by the Knut and Alice Wallenberg Foundation, by the Foundation for Strategic Research, by the Swedish Research Council (grant to S.J.), and by the Swedish Research Council for the Environment, Agricultural Sciences, and Spatial Planning (Formas; grants to S.J., J.L., and P.G.).
2 Present address: Lipid Metabolism Unit, Massachusetts General Hospital, 32 Fruit Street, GRJ 1328, Boston, MA 02114.
* Corresponding author; e-mail stefan.jansson{at}plantphys.umu.se; fax 46-786-66-76.
Article, publication date, and citation information can be found at www.plantphysiol.org/cgi/doi/10.1104/pp.012732.