|
Plant Physiol, July 2001, Vol. 126, pp. 1331-1340
Sequence and Analysis of the Tomato JOINTLESS
Locus1
Long
Mao,
Dilara
Begum,
Stephen A.
Goff, and
Rod A.
Wing*
Clemson University Genomics Institute, 100 Jordan Hall, Clemson,
South Carolina 29634 (L.M., D.B., R.A.W.); and SYNGENTA, 3050 Science
Park Road, San Diego, California 92121 (S.A.G.)
 |
ABSTRACT |
A 119-kb bacterial artificial chromosome from the
JOINTLESS locus on the tomato (Lycopersicon
esculentum) chromosome 11 contained 15 putative genes.
Repetitive sequences in this region include one
copia-like LTR retrotransposon, 13 simple sequence
repeats, three copies of a novel type III foldback transposon, and four putative short DNA repeats. Database searches showed that the foldback
transposon and the short DNA repeats seemed to be associated preferably
with genes. The predicted tomato genes were compared with the complete
Arabidopsis genome. Eleven out of 15 tomato open reading frames
were found to be colinear with segments on five Arabidopsis bacterial
artificial chromosome/P1-derived artificial chromosome clones.
The synteny patterns, however, did not reveal duplicated segments in
Arabidopsis, where over half of the genome is duplicated. Our analysis
indicated that the microsynteny between the tomato and Arabidopsis
genomes was still conserved at a very small scale but was complicated
by the large number of gene families in the Arabidopsis genome.
 |
INTRODUCTION |
Tomato (Lycopersicon
esculentum) is one of the
most important and intensely studied model dicot plants. Compared with
the pioneering research in disease resistance and fruit maturation, little is know about the microsyntenic relationship of tomato and other
plant species as revealed by sequencing long contiguous stretches of
genomic DNA. To date, the longest tomato sequence reported in GenBank
is a 105-kb bacterial artificial chromosome (BAC) sequence from
the chromosome 2 ovate region (Ku et al., 2000 ). In
contrast, the Arabidopsis genome has been sequenced (The
Arabidopsis Genome Initiative [AGI], 2000 ;
http://www.Arabidopsis.org/; http://www.kazusa.or.jp/) and provides a good
opportunity for comparative genomics. It is time to answer the
questions such as: To what extent can the knowledge gained from the
Arabidopsis genome sequence be applied to understand the tomato genome,
a species that separated from Arabidopsis more than 100 million years
ago (MYA; Gandolfo et al., 1998 ; Yang et al., 1999 )? The sequence of
the 105-kb BAC from tomato chromosome 2, containing the
ovate locus, was compared with the Arabidopsis genome,
shedding interesting light on the synteny of the genomes of these two
model species (Ku et al., 2000 ). Considering the genome size of tomato (approximately 970Mb), more comparative sequence analysis will be
required before we can develop a comprehensive and confident comparison
of the genomes of these two dicot plants.
In our effort to map-base clone the tomato JOINTLESS gene, a
gene that controls the development of pedicel abscission zones (Mao et
al., 2000a ), we sequenced a 118,813-bp BAC clone encompassing the
tomato JOINTLESS region on chromosome 11. In addition to the 15 putative open reading frames (ORFs) detected, repetitive
sequences, especially small DNA elements such as foldback transposons,
were also described in detail here. A comparison of the predicted
tomato genes with the Arabidopsis genome showed that segmented gene
colinearity between the two species was preserved, but located on
different Arabidopsis chromosomes or on the same chromosome but
physically separated. Unlike the chromosome 2 ovate locus,
the JOINTLESS region seemed to be synteneous to the
non-duplicated portion of the Arabidopsis genome.
 |
RESULTS AND DISCUSSION |
Sequence Characteristics of the Tomato JOINTLESS
Locus
Tomato BAC clone 240K04 (GenBank accession no. AF292003.),
containing the JOINTLESS locus, was isolated from an
L. esculentum cv Heinz1706 BAC library (Budiman et al.,
2000 ) by screening with jointless-linked RFLP markers TG523
and TY159L. Previous work showed that the genomic region containing the
jointless locus has a low genetic/physical ratio of less
than 100 kb per centiMorgan (cM) as demonstrated by yeast artificial
chromosome physical mapping (Zhang et al., 1994 ), in contrast to
the whole genome average of 750 kb cM 1
(Tanksley et al., 1992 ). The tomato genomic sequence on BAC 240K04 was
found to be 118,813 bp long and contained TG523 and TY159L, genetically
1 cM apart, but separated by only approximately 50 kb (Fig.
1). In total, 15 putative genes
(including two that belong to one retrotransposon) were predicted on
240K04, giving a gene density in this region as one every 8 kb. This
ratio was close to a recent estimation of one gene per 6.2 kb at the
ovate region of tomato chromosome 2 (Ku et al., 2000 ) and
nearly twice that of Arabidopsis chromosome 2 (one gene per 4.4 kb; Lin
et al., 1999 ). BAC 240K04 had an average GC content of approximately
32% with the 15 putative gene coding regions having a higher GC
content of 42%. These data were comparable to a previous study where
the whole genome GC content was estimated as 37% and coding regions as
46% (Messeguer et al., 1991 ). Concerning the repetitive sequences on
BAC 240K04, the only known transposable element was a
copia-like retrotransposon. Small DNA elements, however,
were found frequently when the sequence of 240K04 was searched against
GenBank, reminding us of small elements like miniature inverted-repeat
elements (MITEs) in rice and maize (Bureau and Wessler, 1994 ; Bureau et
al., 1996 ). These DNA elements (short DNA repeats [SDRs]) were of
great interest because of their frequent association with gene
sequences. In particular, some SDRs were directly involved in the
mutation of tomato genes such as jointless (Mao et al.,
2000a ) and yellow flesh (Fray and Grierson, 1993 ).

View larger version (23K):
[in this window]
[in a new window]
|
Figure 1.
A graphical display of the predicted ORFs and the
DNA elements on BAC 240K04. Arrows for each ORF indicate the coding
orientation. The pointed arrows on 240K04.1 and 240K04.15 indicate that
these two ORFs are 3' incomplete and the blunt arrows flanking the
retrotransposon represent the two LTRs.
|
|
Genes
Fifteen putative genes were predicted by gene prediction
programs such as GenScan and their functions were determined by
searching the nonredundant protein database in GenBank (Table
I, Fig. 1). ORFs 240K04.1 and 240K04.15
were incomplete because they were located at either end of the BAC. Six
predicted genes (240K04.2, 3, 5, 6, 8, and 11) had no significant match
in GenBank. However, five had significant expressed sequence
tag (EST) matches with The Institute for Genomic Research tomato
gene index (Lycopersicon Gene Index,
http://www.tigr.org/tdb/lgi/), and thus were
annotated as unknown proteins. In total, 11 out of 15 putative genes
had significant EST matches in the LGI, including ORFs 240K04.12 and 240K04.13, that apparently belong to the same retrotransposon as
delimited by the two LTRs as described below. In the tomato ovate region, however, only six out of the 17 ORFs were
found to have EST matches (Ku et al., 2000 ). ORFs 240K04.6 and 240K04.7 may be from the same gene family because a portion of their sequences were significantly similar to each other. Based on the GenBank search
results, we were able to assign putative functions to nine ORFs
including JOINTLESS, a member of the MADS-box transcription factor gene family (Table I).
Repetitive Sequences
Previous work showed that the tomato genome contains mostly low
copy number sequences (Zamir and Tanksley, 1988 ). Ganal et al.
(1988) estimated by hybridization with the major tomato repeated sequences that the total amount of highly repeated sequences might lie
between 10% and 15%. An analysis of 1,205 random BAC end sequences also showed that the repetitive sequences in the tomato genome is
around 12% (Budiman et al., 2000 ). Although a DNA database search of
the 240K04 sequence did not detect large blocks of repetitive sequences, regions of small DNA segments were found with high similarity to more than 20 gene sequences from tomato or potato Solanum spp., indicating the repetitive nature of
these DNA sequences.
Simple Sequence Repeats (SSRs)
There were 14 SSRs along the 118.8-kb sequence. Four of them were
simple repeats, three (TA)n, and one (TTA)8. The
remaining were mixtures of different repeats derived from nucleotide
conversions or substitutions such as
(TG)8(TA)8 at 2,243 bp,
(TTA)5 TA(TAA)2(TTA)2 at 69,378 bp, and
(TA)17TC(TG)7 at 75,565 bp.
These SSRs gave a density of one SSR per 8.5 kb. In addition,
(AT)n-type SSRs were the main class of SSRs, consistent with a previous
report that (AT)n SSRs were more frequently found in coding regions in tomato (Broun and Tanksley, 1996 ).
Retrotransposons
One copia-like LTR retrotransposon, named Lere1, was
present on BAC 240K04. Lere1 was 5,532 bp long with two 276-bp LTRs
that were 97% identical. A six-frame translation of the Lere1 sequence revealed numerous stop codons within the polyprotein region indicating that Lere1 was highly degraded, though the major components of a
retrotransposon such as the endonuclease (or integrase) domain, the
reverse transcriptase domain, and the RNase domain were still recognizable. Two predicted ORFs (240K04.12 and 240K04.13) represented most of the contiguous amino acid regions from all three forward frames. 240K04.13 had strong homology to known copia-like
retrotransposons. However, 240K04.12 did not contain any major domains
but was simply homologous to annotated Arabidopsis sequences. A DNA-DNA
comparison revealed that Lere1 was not identical to any
retrotransposons previously cloned in tomato. Nevertheless, there were
two tomato genomic sequences, the vacuolar invertase genes from
Lycopersicon pimpinelliforlium (Z12028.1) and
L. esculentum (Z12027.1; Elliott et al., 1993 ), which
contain sequences corresponding to the polyprotein region of Lere1. It
is interesting that a TBlastN search of the LGI revealed that one EST
match (TC14149) had 94% protein sequence identity along 251 amino
acids, indicating that either a similar retrotransposon was still
active in tomato or a retrotransposon (or part of it) was co-expressed
in TC14149.
Foldback Transposon Tomato Anionic Peroxidase Inverted Repeat
(TAPIR)
Three regions of BAC 240K04 were found to be similar to TAPIR
elements (Hong and Tucker, 1998 ). Four copies of TAPIR have been
reported as inverted repeats in the non-coding regions of tomato
polygalacturonase (TAPG) 1, 2, and 4, as well as the first intron of
tomato anionic peroxidase (X15853). An updated GenBank search using
TAPIR1 as a query showed additional tomato sequences bearing TAPIR
elements (Table II).
View this table:
[in this window]
[in a new window]
|
Table II.
Analysis of TAPIR nad SDR repeat elements in tomato
BAC 240K04
L.e., L. esculentum; L. pim., L. pimpinellifolium; L. pen., Lycopersicon pennellii;
S.t., Solanum tuberosum; L. per., Lycopersicon
peruvianum; ACC, 1-aminocyclopropane-1-carboxylate; TAP: tomato
anionic peroxidase; N.A., not applicable.
|
|
The reason that TAPIRs were of interest was that one TAPIR element was
found to be associated with a 939-bp deletion in the promoter and first
exon of the jointless mutant allele (Mao et al., 2000a ),
indicating that TAPIR could be a transposable element. With its strong
secondary structure and a middle loop, TAPIR was very similar to a type
III foldback transposon (Fig. 2a;
Rebatchouk and Narita, 1997 ). To prove that TAPIR did transpose in the
tomato genome, we designed primers flanking the three TAPIR loci on BAC 240K04 (TAPIR1, TAPIR 2, and TAPIR3) and PCR amplified these regions from both cultivated and wild tomatoes (Fig. 2b). At the TAPIR1 locus,
the wild tomato species Lycopersicon hirsutum had a
smaller band compared with the other tomato species or cultivars. This DNA fragment was cloned and sequenced. The results showed that in
L. hirsutum, the TAPIR element was absent at the site
corresponding to the TAPIR1 locus of the control cultivated tomato,
indicating that the TAPIR may have transposed. Like other transposition
events that often cause the deletion of neighboring sequences, the
deletion in the jointless mutant may well have
been caused by the transposition of TAPIR1 at the 5' region of the
JOINTLESS gene (Mao et al., 2000a ).

View larger version (47K):
[in this window]
[in a new window]
|
Figure 2.
Characterization of foldback transposon TAPIR. a,
Strong secondary structure of the TAPIR element. b, A PCR survey of the
three TAPIR loci in various tomato lines/species. Primers used are:
TAPIR1f, 5'GAT AGT TAA AGA TGC GCC TAA C3'; TAPIR1r, 5'CGT GTG GGT GTA
TAT CTA TTC3'; TAPIR2f, 5'GGT AGA TAG GCA AAA GTT TC3'; TAPIR2r, 5'GAA
TAG ATA TAC ACC CAC ACG3'; TAPIR3f, 5'CAC TTG TTA TAC ACT TGT GAT GG3';
and TAPIR3r, 5'CCT TGT TGG GTA TTT GCA TGT G3'. c, The sequence of the
direct repeats (DRs) flanking the tomato TAPIRs. d, A phylogenetic tree
developed using the middle loop sequences of 12 TAPIRs. 1BF09F, 3DC06F,
and 3AB02F were three BAC ends. The lowercase r indicates the reverse
strand.
|
|
The transposition events of TAPIRs can be indicators of the
evolutionary relationship among the tomato species. In Figure 2b, the
investigation among various tomato species of the three loci
corresponding to TAPIR1, 2, and 3 on 240K04 showed that polymorphism was more likely to be found in the wild species L. hirsutum,
Lycopersicon parviflorum, and L. peruvianum. The other two wild species, Lycopersicon cheesemanii and L. pimpinefollium, displayed no
polymorphism at all. This demonstrated that the genomes of the latter
two wild species were more similar to the modern cultivars than the
former three, concordant with the results of a previous phylogenetic study using RFLPs (Miller and Tanksley, 1990 ). Insertions of SDRs at
certain loci of the cultivated tomato genomes could be very recent
events, after the cultivated tomatoes were further separated from the
wild species.
TAPIR elements have another feature common to known transposons, i.e.
they are flanked by short DRs. Comparison of the flanking sequences of
intact TAPIRs showed that TAPIR DRs were usually 7 bp long and mainly
composed of A/T bases (Fig. 2c), indicating that A/T-rich regions could
be the preferable insertion targets for TAPIRs. In addition to the
three copies present in BAC 240K04 (Fig. 1), another nine tomato
genomic sequences, all of which were sequences of identified genes,
were found to bare TAPIR elements in their 5'/3' or intron regions
(Table II). A phylogenetic tree was generated using the
sequences corresponding to the loop region of each TAPIR because this
part of the sequence appeared to be the most heterogeneous. Eight out
of 12 TAPIRs tested clustered into a major group including TAPIR1 and
TAPIR3 (Fig. 2d). Therefore, TAPIRs may be derived from the same
ancestor. The remaining four seemed to be more differentiated. It is
unclear whether the sequence differences occurred after their
insertion, or TAPIRs had different origins in their origination by
"trapping" a different piece of genomic DNA in between the two
inverted repeats.
Type III foldback transposons such as TAPIR are similar to
MITEs such as maize Stowaway that also have strong secondary
structure and are thought to be associated with genes in both monocot
and dicot plants (Bureau and Wessler, 1994 ). Though multiple cis
elements such as ethylene and auxin response elements have been found
in TAPIR sequences near tomato TAPG genes (Hong and Tucker, 1998 ), no
essential function was found for TAPIRs in regulating adjacent down
stream genes (Hong et al., 2000 ). This may imply that the association of TAPIRs with tomato genes does not mean that they have
particular functions in gene regulation but is simply due to the
preferable sequence composition for TAPIR insertion sites at these
positions. A recent study of Arabidopsis transposons also demonstrated
that A/T-rich sequences, which in many cases were intergenic and intron
sequences, were the preferred locations for transposons (Le et al.,
2000 ).
SDRs
Four SDRs were identified in the sequence of BAC 240K04 that had
no similarity in sequence and secondary structure to any known
transposable elements (Table II). The lengths of SDRs ranged from 150 to 400 bp. A BlastN search of GenBank showed that four kinds of SDRs
were present in 18 tomato genomic DNAs. Like TAPIR elements, SDRs were
found mainly located at a gene's 5', 3', or introns. The prevalence of
SDRs indicated that these elements could be unidentified transposons.
We tested the SDR1 locus in various tomato species using a similar PCR
experiment as described in Figure 2b and found that wild tomato species
L. hirsutum, L. parviflorum, and L. peruvianum had shorter products (data not shown), indicating that
SDR1, like TAPIR, could be a transposable element too. This hypothesis
was further supported by the discovery that half of the mRNA sequence
of the phytoene synthase from the yellow flesh mutant
(X67143.1) was a nearly full-length SDR1. Fray and Grierson (1993)
reported that the yellow flesh mutant allele contains a
highly repeated genomic DNA sequence. Aberrant transcripts containing
part of the TAPIR1 sequence were also observed from the
jointless allele (Mao et al., 2000a ).
SDR3 was A/T rich and only 150 bp long. There were eight tomato genomic
sequences in GenBank containing this element, including the 5' region
of the tomato HMG2 (3-hydroxy-3-methylglutaryl CoA reductase 2) gene
that also contains a retrotransposon (ToLTR) at the 5' end. SDR3 had
four significant matches in the LGI (AI776920, AI487358, AW036511, and
AW220080), indicating that this small element was often positioned
close enough to be included in the transcript of a gene. Among the wild
relatives of cultivated tomato species such as L. esculentum, L. pimpinefollium is considered the most
closely related (Miller and Tanksley, 1990 ). The coding sequences of
the vacuolar invertase in these two species have been shown to be
identical (Elliott et al., 1993 ). The conservation of repetitive
sequences that were associated with the gene in the two species,
including the segments of retrotransposon Lere1 sequence at their 5'
region and the small element SDR2 at their 3' region, may further
indicate that they are very similar at the genomic sequence level.
It appeared that SDRs were specific to only members of the family
Solanaceae. SDR3 and SDR4 each had a copy in a potato gene (Table II).
Other than that, SDRs were absent from the non-tomato genomic
sequences. The constraint of SDRs in the Solanaceae family may indicate
that the occurrence of these elements was a recent event, after the
separation of the Solanaceae from the remaining plant families.
Gene Colinearity of Tomato BAC 240K04 and the Arabidopsis
Genome
Four criteria were used for synteny analysis: homology, physical
distance, colinearity, and gene orientation. This means that all the
genes considered had significant homology with each other, were
physically close, and were in the same order and transcriptional orientation. First, the protein sequences from the fifteen putative tomato genes from BAC 240K4 were searched against the Arabidopsis sequences in GenBank using TBlastN. All but one
(240K4.03) of the putative genes had
significant similarity with the Arabidopsis genome with an
expectation value of E < 10 10 (Table
III). Besides 240K04.12 and 240K04.13
that belong to the retrotransposon polyprotein, six more ORFs had more
than five Arabidopsis matches with an E value less than
10 10, indicating that they were members of
various gene families. An example was 240K04.14, the MADS-box gene
whose MADS-box domain is present in numerous MADS-box transcription
factors in Arabidopsis. ORF 240K04.9, a putative auxin-independent
growth promoter, had more than 25 Arabidopsis matches with
E<10 10. The multiple Arabidopsis matches of
tomato ORFs were apparently due to the high number of gene families in
the Arabidopsis genome more than 37% of Arabidopsis genes have >five
members (AGI, 2000 ). Gene families may cause problems when to determine
the Arabidopsis orthologue for the correspondent tomato gene based on
sequence homology. Therefore, in our microsynteny study, homology is
only one of the four criteria evaluated.
Figure 3 shows the network of
microsynteny of the tomato ORFs on BAC 240K04 with the Arabidopsis
genome. Regions in Arabidopsis were not labeled with the names of the
ORFs due to the fact that in both tomato and Arabidopsis most of the
ORFs were derived from the gene prediction programs and may not be
correspondent ORF by ORF. Because the borders of ORFs on tomato BAC
240K04 were distinguishable we believe that the correspondent regions
in Arabidopsis genome should represent separate ORFs. Five Arabidopsis
DNA segments were found to contain clusters of genes that were
physically close to each other and were in the same order as their
homologs in tomato (G1-G5). The coding orientations of the
corresponding genes were exactly the same in both genomes, meeting the
criteria set above. The number of genes in each segment were from two
to five. G3 and G5 contained two genes in less than 10 kb that were
significantly similar to the tomato genes. The largest segment G4
contained five genes that were colinear with those on the tomato BAC
240K04. The ORF 240K04.6 had nine matches in the Arabidopsis genome.
Three of them were located in the three syntenic segments from the
Arabidopsis chromosomes II, IV, and V, respectively. Another ORF
240K04.14, the MADS-box gene, was associated with two Arabidopsis
syntenic segments, G1 and G3. The presence of the MADS-box genes in the vicinity of other tomato homologs in Arabidopsis may be coincidental due to a large number of such genes and their highly conserved DNA
binding domain. However, the identical coding orientation in both
Arabidopsis and tomato when compared with the other gene(s) in the same
syntenic segment supported that these MADS-box genes did fall within
syntenic clusters in Arabidopsis.

View larger version (31K):
[in this window]
[in a new window]
|
Figure 3.
A schematic display of the Arabidopsis syntenic
segments to tomato BAC 240K04. The number of centiMorgans (cM) indicate
the approximate positions of the corresponding Arabidopsis
BAC/P1-derived artificial chromosome clones on the Arabidopsis
genome. Black arrows indicate ORFs that do not fall in any colinear
segments. Arrows on the tomato BAC indicate approximate positions of
the predicted ORFs, whereas arrows on Arabidopsis BAC/P1-derived
artificial chromosome clones indicate major homologous
regions.
|
|
Overall, the five Arabidopsis syntenic segments were from three
chromosomes. G2 and G3 were from chromosome IV and more than 37 cM
apart. G4 and G5 were from chromosome V and more than 120 cM apart.
However, no duplicated Arabidopsis segments were observed for these
syntenic clusters, compared with the tomato chromosome 2 ovate region where the Arabidopsis syntenic segments were
duplicated (Ku et al., 2000 ). Thus, the Arabidopsis segments syntenic
to the tomato JOINTLESS region may fall in the portion of
the genome that was not duplicated because the duplicated part in the
Arabidopsis genome comprises only 60% (AGI, 2000 ). It is interesting
that none of the genes in the Arabidopsis syntenic segments contained a
gene that had a different coding orientation to its tomato counterpart. In the tomato chromosome 2 ovate region, however, three out
of 12 syntenic tomato genes were located on the opposite strand in Arabidopsis (Ku et al., 2000 ). This may indicate that the duplicated portion of the Arabidopsis genome could be subject to less
selection pressure because of gene copy redundancy and therefore may
accommodate more genetic rearrangements.
Although the duplicated Arabidopsis genome may fit the model of
polyploidization and subsequent gene loss (AGI, 2000 ; Ku et al., 2000 ),
we observed that ORF 240K04.6 was associated with three syntenic
segments from three Arabidopsis chromosomes. In each segment, 240K04.6
was in the middle of the syntenic cluster and therefore the possibility
of a false correlation was low, although ORF 240K04.6 had nine
significant matches with the Arabidopsis sequences. The presence of one
ORF in more than two synteneous segments supported the hypothesis that
other alternatives such as independent segmental duplication may also
be possible in shaping the Arabidopsis genome (AGI, 2000 ). We also
noticed that three Arabidopsis segments (G1, G3, and G4) were spanning
the tomato region that contained the retrotransposon (represented by
240K04.12 and 240K04.13). However, none of these segments contained a
retrotransposon in their syntenic context, indicating that the
transposition event in tomato occurred after the divergence of these
two species about 100 MYA (Gandolfo et al., 1998 ; Yang et al.,
1999 ).
 |
CONCLUSIONS |
The analysis of the 118,813-bp region from the tomato
JOINTLESS region revealed that the tomato genome may contain
an abundance of SDRs or small transposons. The presence of some of
these elements may be very recent events as demonstrated by the PCR
experiment to investigate the related loci in the wild tomatoes.
Although SDRs were frequently associated with tomato genes, they did
not appear to have functions in regulating adjacent genes. Unlike in
rice and maize where small DNA elements, such as MITEs, have been found
in large numbers and associated with genes too (Bureau and Wessler,
1994 ; Wessler et al., 1995 ; Mao et al., 2000b ), the Arabidopsis genome
has fewer numbers of such elements (Casacuberta et al., 1998 ; Ade and
Belzile, 1999 ; Le et al., 2000 ). Therefore, the large number of such
elements may contribute significantly in the expansion of the genomes
of rice, maize, and tomato.
Comparison of the tomato JOINTELSS locus with the
Arabidopsis genome demonstrated that small syntenic segments could be
easily found in the genomes of these two dicot plants that have been diverged more than 100 MYA. In particular, the strict conservation of
the gene-encoding orientations in these colinear segments indicates that the region may represent an ancestral block of genes. It is
interesting that the comparison of this region did not reveal duplicated segments in Arabidopsis in which more than 60% of its genome is duplicated. It is tempting to put forward a hypothesis about
the synteny between the tomato and Arabidopsis genomes, but a clearer
picture will be visible only when we have additional long stretches of
tomato sequence available for investigation.
 |
MATERIALS AND METHODS |
Tomato (Lycopersicon esculentum) BAC Library
Screening
Hybridization of high-density tomato BAC filters using TG523 and
TY159L insert is as described elsewhere (Mao et al., 2000a ). DNAs of
positive clones for insert check and Southern analysis were isolated
using the standard alkali lysis method (Sambrook et al., 1989 ).
Construction of a Shotgun Library of Clone 240K4 and
Sequencing
A maxiprep of tomato BAC clones was performed using alkali lysis
and further purified by CsCl gradient centrifugation using a standard
protocol (Sambrook et al., 1989 ). BAC DNA was nebulized under 6.5 pounds per square inch for 4 min and end repaired. Fragments of
2 to 5 kb were cut out from an agarose gel, purified, and ligated to
pBluescipt. The ligation products were electroporated into DH10B
competent cells (Gibco BRL, Rockville, MD). Clones were picked
randomly and stored in 96-well microtiter plates.
DNA sequencing was performed on ABI 377XL automatic sequencers.
Templates of shotgun clones were prepared by the AutoGen (Integrated Separation Systems, Portsmouth, NH) from 3-mL overnight
cultures. DNA from each preparation was finally dissolved in 80 µL of
water or Tris EDTA and 4 µL was used for sequencing reactions
using the BigDye Cycle Sequencing Kit following manufacturer's
instructions (ABI, Columbia, MD).
Sequence Analysis
Production sequencing of approximately 2,700 shotgun clones was
performed with an additional 200 reactions for finishing. Phred (Ewing
and Green, 1998 ; Ewing et al., 1998 ) was used for base calling from the
trace files, and Consed (Gordon et al., 1998 ) was used to assemble the
sequences into contigs. The final consensus sequence was
searched against GenBank using BlastN and BlastX algorithms
(Altschul et al., 1997 ) through the Web site of the National
Center for Biotechnology Information
(www.ncbi.nlm.gov). Gene structures were predicted
with Genscan, Genscan+ (Chris Burge, Massachusetts Institute of
Technology, Cambridge,
http://CCR-81.mit.edu/GENSCAN.html), and GeneMarkHMM
(Mark Borodovsky, Georgia Tech, Atlanta,
http://genemark.biology.gatech. edu/GeneMark/). Arabidopsis databases
was searched through The Arabidopsis Information Resource Web site
(http://www.Arabidopsis.org/) and The Kazusa Arabidopsis data
opening site (http://www.kazusa.or.jp/kaos/).
The GCG package (University of Wisconsin, Madison, Version 10) command
FOLDRNA was used for analyzing secondary structures. The phylogenetic
tree was made using PUAPSEARCH and PAUPDISPLAY from multiple sequence
alignment generated by PILEUP.
 |
ACKNOWLEDGMENTS |
We are grateful to Mareen Milnamow and Ann Hu of the former
Novartis Plant Protection, Inc. (Raleigh, NC) for technical
help in shotgun library construction. Dr. S.D. Tanksley of Cornell University (Ithaca, NY) kindly provided the genomic DNA of some tomato species. We also thank Drs. Yeisoo Yu and Chunhuan Wan for their
help during the course of sequencing and Rob Kingsberry, III (Clemson
University, SC) for bioinformatics assistance. The sequence of
240K04 has been deposited in GenBank under the accession number
AF292003. Supplemental data are viewable at
http://www.genome.clemson.edu/~lmao/pp_supplement.html.
 |
FOOTNOTES |
Received December 14, 2000; returned for revision February 8, 2001; accepted April 17, 2001.
1
This work was supported by the National Science
Foundation (grant to R.A.W.) and by the Coker Chair in Plant Molecular
Genetics (to R.A.W.). Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the authors and
do not necessarily reflect the views of the National Science Foundation.
*
Corresponding author; email rwing{at}clemson.edu; fax
864-656-4293.
 |
LITERATURE CITED |
-
Ade J, Belzile FJ
(1999)
Hairpin elements, the first family of foldback transposons (FTs) in Arabidopsis thaliana.
Plant J
19: 591-597[CrossRef][Medline]
-
AGI
(2000)
Analysis of the genome sequence of the flowering plant Arabidopsis thaliana.
Nature
408: 796-815[CrossRef][Medline]
-
Altschul SF, Madden TL, Schafer AA, Zhang J, Zhang Z, Miller W, Lipman DJ
(1997)
Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
Nucleic Acids Res.
25: 3389-3402[Abstract/Free Full Text]
-
Broun P, Tanksley SD
(1996)
Characterization and genetic mapping of simple repeat sequences in the tomato genome.
Mol Gen Genet
250: 39-49[Medline]
-
Budiman MA, Mao L, Wood TC, Wing RA
(2000)
A deep-coverage tomato BAC library and prospects toward development of an STC framework for genome sequencing.
Genome Res
10: 129-136[Abstract/Free Full Text]
-
Bureau TE, Ronald PC, Wessler SR
(1996)
A computer-based systematic survey reveals the predominance of small inverted-repeat elements in wild-type rice genes.
Proc Natl Acad Sci USA
93: 8524-8529[Abstract/Free Full Text]
-
Bureau TE, Wessler SR
(1994)
Stowaway: a new family of inverted repeat elements associated with the genes of both monocotyledonous and dicotyledonous plants.
Plant Cell
6: 907-916[Abstract]
-
Casacuberta E, Casacuberta JM, Puigdomenech P, Monfort A
(1998)
Presence of miniature inverted-repeat transposable elements (MITEs) in the genome of Arabidopsis thaliana: characterization of the Emigrant family of elements.
Plant J
16: 79-85[CrossRef][ISI][Medline]
-
Elliott KJ, Butler WO, Dickinson CD, Konno Y, Vedvick TS, Fitzmaurice L, Mirkov TE
(1993)
Isolation and characterization of fruit vacuolar invertase genes from two tomato species and temporal differences in mRNA levels during fruit ripening.
Plant Mol Biol
21: 515-524[CrossRef][ISI][Medline]
-
Ewing B, Green P
(1998)
Base-calling of automated sequencer traces using phred: II. Error probabilities.
Genome Res
8: 186-194[Abstract/Free Full Text]
-
Ewing B, Hillier L, Wendl MC, Green P
(1998)
Base-calling of automated sequencer traces using phred: I. Accuracy assessment.
Genome Res
8: 175-185[Abstract/Free Full Text]
-
Fray RG, Grierson D
(1993)
Identification and genetic analysis of normal and mutant phytoene synthase genes of tomato by sequencing, complementation and co-suppression.
Plant Mol Biol
22: 589-602[CrossRef][ISI][Medline]
-
Ganal MW, Lapitan NLV, Tanksley SD
(1988)
A molecular and cytogenetic survey of major repeated DNA sequences in tomato (Lycopersicon esculentum).
Mol Gen Genet
213: 262-268[CrossRef]
-
Gandolfo MA, Nixon KC, Crepet WL
(1998)
A new fossil flower from the Turonian of New Jersey: Dressiantha bicarpellata gen. et sp. nov. (Capparales).
Am J Bot
85: 964-974[Abstract]
-
Gordon D, Abajian C, Green P
(1998)
Consed: a graphical tool for sequence finishing.
Genome Res
8: 195-202[Abstract/Free Full Text]
-
Hong SB, Sexton R, Tucker ML
(2000)
Analysis of gene promoters for two tomato poly-galacturonases expressed in abscission zones and the stigma.
Plant Physiol
123: 869-881[Abstract/Free Full Text]
-
Hong SB, Tucker ML
(1998)
Genomic organization of six tomato polygalacturonases and 5' upstream sequence identity with tap1 and win2 genes.
Mol Gen Genet
258: 479-487[Medline]
-
Ku HM, Vision T, Liu J, Tanksley SD
(2000)
Comparing sequenced segments of the tomato and Arabidopsis genomes: large-scale duplication followed by selective gene loss creates a network of synteny.
Proc Natl Acad Sci USA
97: 9121-9126[Abstract/Free Full Text]
-
Le QH, Wright S, Yu Z, Bureau T
(2000)
Transposon diversity in Arabidopsis thaliana.
Proc Natl Acad Sci USA
97: 7376-7381[Abstract/Free Full Text]
-
Lin X, Kaul S, Rounsley S, Shea TP, Benito MI, Town CD, Fujii CY, Mason T, Bowman CL, Barnstead M
(1999)
Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana.
Nature
402: 761-768[CrossRef][Medline]
-
Mao L, Begum D, Chuang HW, Budiman MA, Szymkowiak EJ, Irish EE, Wing RA
(2000a)
JOINTLESS is a MADS-box gene controlling tomato flower abscission zone development.
Nature
406: 910-913[CrossRef][Medline]
-
Mao L, Wood TC, Yu Y, Budiman MA, Tomkins J, Woo SS, Sasinowski M, Presting G, Frisch D, Goff S
(2000b)
Rice transposable elements: a survey of 73,000 sequence-tagged-connectors (STCs).
Genome Res
10: 982-990[Abstract/Free Full Text]
-
Messeguer R, Ganal MW, Steffens JC, Tanksley SD
(1991)
Characterization of the level, target sites and inheritance of cytosine methylation in tomato nuclear DNA.
Plant Mol Biol
16: 753-770[CrossRef][ISI][Medline]
-
Miller JC, Tanksley SD
(1990)
RFLP analysis of phylogenetic relationships and genetic variation in the genus Lycopersicon.
Theor Appl Genet
80: 437-448[ISI]
-
Rebatchouk D, Narita JO
(1997)
Foldback transposable elements in plants.
Plant Mol Biol
34: 831-835[CrossRef][ISI][Medline]
-
Sambrook J, Fritsch EF, Maniatis T
(1989)
Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Plainview, NY
-
Tanksley SD, Ganal MW, Prince JP, de Vicente MC, Bonierbale MW, Broun P, Fulton TM, Giovannoni JJ, Grandillo S, Martin GB
(1992)
High density molecular linkage maps of the tomato and potato genomes.
Genetics
132: 1141-1160[Abstract]
-
Wessler SR, Bureau TE, White SE
(1995)
LTR-retrotransposons and MITEs: important players in the evolution of plant genomes.
Curr Opin Genet Dev
5: 814-821[CrossRef][ISI][Medline]
-
Yang YW, Lai KN, Tai PY, Li WH
(1999)
Rates of nucleotide substitution in angiosperm mitochondrialDNA sequences and dates of divergence between Brassica and other angiosperm lineages.
J Mol Evol
48: 597-604[CrossRef][ISI][Medline]
-
Zamir D, Tanksley SD
(1988)
Tomato genome is comprised largely of fast-evolving, low copy-number sequences.
Mol Gen Genet
213: 254-261[CrossRef]
-
Zhang H, Martin GB, Tanksley SD, Wing RA
(1994)
Map-based cloning in crop plants: tomato as a model system: II. Isolation and characterization of a set of overlapping yeast artificial chromosomes encompassing the jointless locus.
Mol Gen Genet
244: 613-621[ISI][Medline]
© 2001 American Society of Plant Physiologists
This article has been cited by other articles:

|
 |

|
 |
 
H. Kuang, C. Padmanabhan, F. Li, A. Kamei, P. B. Bhaskar, S. Ouyang, J. Jiang, C. R. Buell, and B. Baker
Identification of miniature inverted-repeat transposable elements (MITEs) and biogenesis of their siRNAs in the Solanaceae: New functional implications for MITEs
Genome Res.,
January 1, 2009;
19(1):
42 - 56.
[Abstract]
[Full Text]
[PDF]
|
 |
|

|
 |

|
 |
 
S. Moxon, R. Jing, G. Szittya, F. Schwach, R. L. Rusholme Pilcher, V. Moulton, and T. Dalmay
Deep sequencing of tomato short RNAs identifies microRNAs targeting genes involved in fruit ripening
Genome Res.,
October 1, 2008;
18(10):
1602 - 1609.
[Abstract]
[Full Text]
[PDF]
|
 |
|

|
 |

|
 |
 
C. M. Jones, C. M. Rick, D. Adams, J. Jernstedt, and R. T. Chetelat
Genealogy and fine mapping of obscuravenosa, a gene affecting the distribution of chloroplasts in leaf veins, and evidence of selection during breeding of tomatoes (Lycopersicon esculentum; Solanaceae)
Am. J. Botany,
June 1, 2007;
94(6):
935 - 947.
[Abstract]
[Full Text]
[PDF]
|
 |
|

|
 |

|
 |
 
Y. Wang, X. Tang, Z. Cheng, L. Mueller, J. Giovannoni, and S. D. Tanksley
Euchromatin and Pericentromeric Heterochromatin: Comparative Composition in the Tomato Genome
Genetics,
April 1, 2006;
172(4):
2529 - 2540.
[Abstract]
[Full Text]
[PDF]
|
 |
|

|
 |

|
 |
 
R. Guyot, X. Cheng, Y. Su, Z. Cheng, E. Schlagenhauf, B. Keller, and H.-Q. Ling
Complex Organization and Evolution of the Tomato Pericentromeric Region at the FER Gene Locus
Plant Physiology,
July 1, 2005;
138(3):
1205 - 1215.
[Abstract]
[Full Text]
[PDF]
|
 |
|

|
 |

|
 |
 
E. van der Knaap, A. Sanyal, S. A. Jackson, and S. D. Tanksley
High-Resolution Fine Mapping and Fluorescence in Situ Hybridization Analysis of sun, a Locus Controlling Tomato Fruit Shape, Reveals a Region of the Tomato Genome Prone to DNA Rearrangements
Genetics,
December 1, 2004;
168(4):
2127 - 2140.
[Abstract]
[Full Text]
[PDF]
|
 |
|

|
 |

|
 |
 
P. Bauer, T. Thiel, M. Klatte, Z. Bereczky, T. Brumbarova, R. Hell, and I. Grosse
Analysis of Sequence, Map Position, and Gene Expression Reveals Conserved Essential Genes for Iron Uptake in Arabidopsis and Tomato
Plant Physiology,
December 1, 2004;
136(4):
4169 - 4183.
[Abstract]
[Full Text]
[PDF]
|
 |
|

|
 |

|
 |
 
Z.-N. Yang, X.-R. Ye, J. Molina, M. L. Roose, and T. E. Mirkov
Sequence Analysis of a 282-Kilobase Region Surrounding the Citrus Tristeza Virus Resistance Gene (Ctv) Locus in Poncirus trifoliata L. Raf.
Plant Physiology,
February 1, 2003;
131(2):
482 - 492.
[Abstract]
[Full Text]
[PDF]
|
 |
|
|
|