Plant Physiology 138:1-3 (2005)
© 2005 American Society of Plant Biologists

Biological Databases for Plant Research

Biology has undergone several rounds of transformation in its research paradigms, ranging from theoretical to experimental, in the pursuit of new molecular mechanisms that regulate biological form and function. In the decades to come, it will undergo another transformation, toward understanding the modes of action of biological processes at the organismal level, where computational models of systems-wide properties could serve as the basis for predicting biological behavior, leading to new experimentation and discovery. For this transformation to occur, it is essential to facilitate and enhance the processing, integration, and interpretation of the massive amounts of biological data by the life science research community.

Databases have been a standard way of managing and processing large amounts of information in diverse arenas, including academic disciplines, industry, and government sectors, for many years. Database technologies have drawn the attention of the biological community, but their use has been limited to a small sector of that community, mainly those involved in the organization and distribution of data resources. While these resources are used by a large part of the research community, most users are relatively unaware of the work undertaken to acquire, curate, and enhance the content of these databases in service to the wider research community. This lack of awareness can both keep the data from being used to their full capacity and lead to their misuse.
In addition, more and more experimental biologists are generating data on a large scale and need to develop and manage databases of their own.

The motivation for organizing this focus issue is severalfold: (1) to demystify how the major database resources relevant for plant research today acquire, process, and make available their data, what the current limitations and caveats of these resources are, and what their future directions are; (2) to bring the general issues of databases and data management today to the attention of the larger plant research community; (3) to engage the general readership of Plant Physiology in thinking about large datasets and how to apply them to their research problems; and (4) to encourage researchers to use and develop databases to further their research goals. This focus issue will be published over three issues: the current May issue, the July issue, and the August issue. The articles featured in the focus issue represent some of the major database resources available today and serve to introduce this topic to the general readership of this journal. However, by no means do they complete the picture of the diversity of database resources that are currently available. To accommodate the ongoing research in the generation of large datasets and the development of biological databases, Plant Physiology's Bioinformatics section will have a subsection called Plant Databases starting September 2005.

There are three main types of biological databases that have been established and are being developed: large-scale public repositories, community-specific database resources, and project-specific databases, although the lines between these categories are becoming less clear. Large-scale public repositories are usually developed and maintained by government agencies or international consortia. Examples include GenBank (July Issue, 2005), which is an international nucleotide sequence repository developed and maintained as a collaboration between the National Center for Biotechnology Information in the United States, EMBL in Europe, and DDBJ in Japan. Other examples include UniProt (Schneider et al., 2005).

These databases have been and are being developed fairly independently, and there is a general lack of good documentation on the rationale behind their design and implementation and of community-wide standards for annotation and data exchange. Part of this problem comes from the lack of recognition of this work as a legitimate scientific endeavor. Most of the databases described above are public efforts carried out by biologists and software developers in academic settings, and more effort to share their development experiences via conferences and publications would help alleviate the problem. Most papers on databases describe the content and user functionality available from the databases and their attendant query interfaces, and offer little information on the design and implementation of the software. Also, there is no standard for making database software and schemas available. This is a particularly acute problem for emerging data types, such as those resulting from metabolite profiling experiments. Recently, standards in data description and exchange for plant metabolomics have been proposed (Bino et al., 2004).

Another key problem in this field is the limited ability to access and usefully integrate data from this myriad of databases in a seamless manner. The increasing number and types of databases and software applications make it more and more difficult for researchers to find out where to go for what information. In addition, the different ways in which data are presented and made accessible by many of these databases create an additional burden on researchers who seek to apply the available resources to their research. Emerging technologies to solve these problems have been proposed, such as the BioMOBY initiative (Wilkinson et al., 2005).

It is clear that biological research is in an ongoing state of transition, where novel methods, technologies, and implementations will increasingly be deployed in the pursuit of an enhanced mechanistic understanding of biological systems. One of the most difficult hurdles to overcome in deploying new technologies and reaching new goals is the training and retraining of biologists to adapt to changing needs and environments. Social engineering and technology application will always occur more slowly than technological engineering. It is our hope that this focus issue and the ensuing Plant Databases section in this journal will contribute to the promotion of the social and technological engineering needed to help transition the plant research community to an enhanced awareness and application of database resources in support of its scientific endeavors.

rhee{at}acoma.stanford.edu
bcrosby{at}uwindsor.ca

FOOTNOTES

www.plantphysiol.org/cgi/doi/10.1104/pp.104.900158.

LITERATURE CITED

Bino RJ, Hall RD, Fiehn O, Kopka J, Saito K, Draper J, Nikolau BJ, Mendes P, Roessner-Tunali U, Beale MH, et al (2004) Potential of metabolomics as a functional genomics tool. Trends Plant Sci 9: 418-425

Cannon SB, Crow JA, Heuer ML, Wang X, Cannon EKS, Dwan C, Lamblin A-F, Vasdewani J, Mudge J, Cook AJ, et al (2005) Databases and information integration for the Medicago truncatula genome and transcriptome. Plant Physiol 138: 38-46

Horan K, Lauricha J, Bailey-Serres J, Raikhel N, Girke T (2005) Genome Cluster Database: a sequence family analysis platform for Arabidopsis and Oryza sativa. Plant Physiol 138: 47-54

Jenkins H, Hardy N, Beckmann M, Draper J, Smith AR, Taylor J, Fiehn O, Goodacre R, Bino RJ, Hall R, et al (2004) A proposed framework for the description of plant metabolomics experiments and their results. Nat Biotechnol 22: 1601-1606

Jenkins H, Johnson H, Kular B, Wang TL, Hardy N (2005) Toward supportive data collection tools for plant metabolomics. Plant Physiol 138: 67-77

Lawrence CJ, Seigfried TE, Brendel V (2005) The Maize Genetics and Genomics Database. The community resource for access to diverse maize data. Plant Physiol 138: 55-58

Schneider M, Bairoch A, Wu CH, Apweiler R (2005) Plant protein annotation in the UniProt Knowledgebase. Plant Physiol 138: 59-66

Wilkinson MD, Schoof H, Ernst R, Haase D (2005) BioMOBY successfully integrates distributed heterogeneous bioinformatics web services: the PlaNet exemplar case. Plant Physiol 138: 5-17

Yuan Q, Ouyang S, Wang A, Zhu W, Maiti R, Lin H, Hamilton J, Haas B, Sultana R, Cheung F, Wortman J, Buell CR (2005) The Institute for Genomic Research Osa1 rice genome annotation database. Plant Physiol 138: 18-26

Zhang P, Foerster H, Tissier CP, Mueller LA, Paley S, Karp PD, Rhee SY (2005) MetaCyc and AraCyc: metabolic pathway databases for plant research. Plant Physiol 138: 27-37