Plant Physiol, December 2001, Vol. 127, pp. 1590-1594
SCIENTIFIC CORRESPONDENCE
Using Combinatorial Design to Study Regulation by Multiple Input Signals. A Tool for Parsimony in the Post-Genomics Era¹
Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 251 Mercer Street, New York, New York 10012 (D.E.S.); Department of Biology, New York University, 100 Washington Square East, New York, New York 10003 (A.Y.K., L.V.L., M.F.C., G.M.C.)
For many systems in biology, genes,
pathways, or metabolites are regulated or synthesized in
response to multiple "input" signals. Herein, we describe how
combinatorial design can be used to define a small set of experiments
that will effectively explore the effects of all possible combinations
of multiple inputs on such regulation.
Historically, understanding how a single signal (e.g. a hormone) might
regulate a specific gene or the synthesis of a metabolite
has been a daunting task that has required molecular biology, genetics,
and cell biology. The new challenge in the post-genomics era is to
understand how whole genomes or metabolomes respond not only to single
signals, but also to collections of potentially interacting inputs.
Hormone interaction is a classic example of multiple input regulation.
In plant biology, this includes the antagonistic effects of auxin and
cytokinin, or abscisic acid and gibberellin, in regulating plant
growth and development (Milborrow, 1970).
The intersections of "input" signaling pathways have been termed
"cross talk" (Knight and Knight, 2001).
In our ongoing studies to explore effects of nitrogen on gene
expression, a complex picture has been emerging wherein gene responses
to nitrogen sources appear to be dependent on multiple variables
including starvation, light, and carbon status, to name a few (Lam et
al., 1998).
The "matrix effect" referred to above tells us that many
inputs could potentially influence the regulation of a target gene, pathway, or metabolite. But suppose that a researcher thought that only
a subset of these inputs in fact had this influence. This implies that
the other inputs don't matter. How would one test such a hypothesis?
Combinatorial design is one approach. We briefly describe the
inspiration for this approach from the field of software testing and
then explain its application to case studies in plant biology.
In software testing, the hypothesis (sometimes shown to be wrong) is
that the software is correct under all input combinations. That is, if
one could show that varying the inputs doesn't matter as far as
correctness of the output is concerned, then the software is deemed to
be "correct." For example, suppose there are 10 possible inputs,
each having four possible values. This leads to
4^10 combinations of inputs, or 1,048,576 tests of
the software. Such a large number is infeasible because software
testing is still a manual art. Cohen et al. (1997)
Now, computing a subset of the possibilities is not the same as
computing all of them, so even if the software passes these tests, it
may not be correct. Combinatorial design is a tradeoff between effort
and thoroughness. However, the pragmatic fact is that this form of
software testing picks up virtually all errors (Cohen et al.,
1997).
The insight gained from software testing is that if there is a set of
inputs that probably don't matter, then one can determine whether they
in fact do matter using a small number of tests. Applying this idea to
biology works as follows: If you are given a set S of inputs and you
believe that a subset C are the only ones that matter, then you can use
combinatorial design to show (or strongly suggest) that the remaining inputs, those in S but not in C, do not matter.
So, to put this principle into practice, we use as an example a case study in which there are six inputs: light, starvation, carbon, inorganic N, and two forms of organic N (Glu and Gln), and each of these inputs has three possible values (0, low [L], high [H]). As in the example above, to test how these six inputs interact to regulate gene expression, we would have to test 3^6, or 729, possible combinations. Allowing for the replication required in northerns, microarrays, or metabolic profiles, this would be an unreasonable (and expensive) number of treatments to test. However, by using combinatorial design, we can propose a subset of these experiments that would give a good approximation of the experimental space.
To start, we might first hypothesize that one input (e.g. "light") is the only input that really matters. Then we would test all three values of the input "light" (0, L, or H) and determine whether the output (e.g. gene expression) changes depending on varying values of the other inputs. This entails applying combinatorial design to the other five inputs and combining the resulting treatments with each possible value of "light" (see Table II). So, for each pair of the remaining five inputs (each of which can take one of three values, 0, L, or H), nine pair-wise combinations (three × three) are tested (for example, see the bold entries in Table II). If all possible pairs of the five inputs are tested, this results in a combinatorial design set of 17 treatments. These 17 treatments are then tested in the context of every value for light (0, L, or H), resulting in 3 × 17 or 51 total experiments (see Table II).
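As a rough illustration of the pair-wise idea described above, the following Python sketch builds a set of treatments for the five non-light inputs such that every pair of inputs takes all nine value combinations at least once, and then crosses that set with each value of light. The greedy strategy, the function names, and the input labels (e.g. `inorganic_N`) are assumptions made for illustration; this is not the procedure used to produce Table II, and a greedy search may return a different number of treatments than the 17 reported there.

```python
from itertools import combinations, product

LEVELS = ("0", "L", "H")
# Hypothetical labels for the five inputs other than light.
OTHER_INPUTS = ("starvation", "carbon", "inorganic_N", "Glu", "Gln")


def required_pairs(inputs, levels):
    """Every (input_a, value_a, input_b, value_b) pair a design must contain."""
    return {
        (a, va, b, vb)
        for a, b in combinations(inputs, 2)
        for va, vb in product(levels, repeat=2)
    }


def pairs_in(treatment):
    """All pair-wise combinations present in one treatment (dict: input -> value)."""
    return {(a, treatment[a], b, treatment[b]) for a, b in combinations(treatment, 2)}


def pairwise_design(inputs, levels):
    """Greedy sketch: keep adding the treatment that covers the most missing pairs."""
    remaining = required_pairs(inputs, levels)
    candidates = [
        dict(zip(inputs, values)) for values in product(levels, repeat=len(inputs))
    ]
    design = []
    while remaining:
        best = max(candidates, key=lambda t: len(pairs_in(t) & remaining))
        design.append(best)
        remaining -= pairs_in(best)
    return design


def cross_with(design, name, levels):
    """Repeat every treatment of the design under each value of one extra input."""
    return [{name: value, **treatment} for value in levels for treatment in design]


design = pairwise_design(OTHER_INPUTS, LEVELS)
experiments = cross_with(design, "light", LEVELS)
print(len(design), "pair-wise treatments;", len(experiments), "experiments in total")
print("exhaustive alternative:", len(LEVELS) ** (len(OTHER_INPUTS) + 1), "experiments")
```

Any set of treatments that covers all pairs serves the same purpose, so the exact count depends on how the design is constructed.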
We would next check the output (e.g. gene expression) to determine
whether any of the other five inputs (e.g. starvation, carbon,
NH4---NO3, Glu, or Gln)
caused the output to vary. If the output did not vary in
response to these other inputs, then "light" is the only input that
matters, or is the dominant regulating input. However, if the output
did vary when various combinations of the other inputs were tested,
then we would next test the hypothesis that two inputs in fact matter
(e.g. light and carbon). In our working example of six inputs (each
with three values, 0, L, or H), if two inputs are hypothesized to
matter then, using combinatorial design, 135 tests are needed. This is
still a big savings over 729.
Combinatorial design helps refine hypotheses as well. Suppose that in the above example, "light" determined the output (e.g. gene expression) only when its value was L or H. Then the output would vary depending on other inputs only when the "light" value was 0. By observing the differences in other inputs that caused the output to vary, one may arrive at some other hypothesis (e.g. when "light" is 0, then carbon determines the output). Forty-six experiments would be required to test this hypothesis, some of which are repeats of the original 51, and this is far fewer than the 135 tests needed to test two inputs.
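The check and refinement steps described above can be sketched in a few lines of Python. The sketch groups treatments by the inputs hypothesized to matter and reports any group in which the output still varies. Everything in it is an assumption for illustration: the function `conflicting_groups`, the discrete output labels, and in particular `toy_response`, which is an invented rule and not experimental data. Real expression data would need a statistical criterion for deciding when two outputs differ.

```python
from collections import defaultdict


def conflicting_groups(results, hypothesized):
    """Group (treatment, output) pairs by the hypothesized inputs.

    Returns the groups whose outputs differ, i.e. the settings of the
    hypothesized inputs under which some other input still matters.
    """
    groups = defaultdict(set)
    for treatment, output in results:
        key = tuple(treatment[name] for name in hypothesized)
        groups[key].add(output)
    return {key: outputs for key, outputs in groups.items() if len(outputs) > 1}


def toy_response(treatment):
    """Invented rule: light (L or H) induces; in darkness, carbon decides."""
    if treatment["light"] in ("L", "H"):
        return "induced"
    return "induced" if treatment["carbon"] != "0" else "basal"


# A small hypothetical set of treatments over three inputs.
treatments = [
    {"light": light, "carbon": carbon, "Glu": glu}
    for light in "0LH"
    for carbon in "0LH"
    for glu in "0LH"
]
results = [(t, toy_response(t)) for t in treatments]

# Hypothesis 1: only light matters. The light = 0 group still varies.
print(conflicting_groups(results, ["light"]))
# Refined hypothesis: light and carbon together determine the output.
print(conflicting_groups(results, ["light", "carbon"]))
```

An empty result is consistent with the hypothesis for the treatments tested; a non-empty result points to the input settings where the hypothesis should be refined.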
In addition to our proposed application of combinatorial design to
biology described above, other applications to biology including
genomic ones have also begun to emerge. Kerr and Churchill (2001)
Our proposed application of combinatorial design to biology can potentially be used in the analysis of complex systems to allow researchers to determine which inputs do "matter" in a biological response and, more importantly, to rapidly define which inputs don't matter, by using this streamlined and parsimonious experimental approach. The "read out" of the response can be changes in gene expression (DNA chips and microarrays), metabolite profiles (metabolomics), or even developmental responses. Using combinatorial design to reduce the experimental data set is important, not only because it is cost effective, but also because it presents a small number of datasets whose analysis will give an appropriate answer.
Finally, is it really possible to extrapolate what might be true for
software testing to much more complex biological systems? We think that
with care, the answer is yes. The reason is that the biological
pathways so far elucidated can be modeled using fairly simple boolean
circuits (Genoud et al., 2001).
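To make the boolean-circuit view concrete, here is a minimal Python sketch of what it means for an input to "matter" in such a model: an input matters if flipping it changes the output for at least one setting of the other inputs. The circuit `toy_circuit` and the input names are hypothetical and chosen only for illustration; they do not represent any pathway described in the literature cited here.

```python
from itertools import product


def relevant_inputs(circuit, names):
    """Inputs whose flip changes the circuit output for at least one assignment."""
    relevant = set()
    for values in product((False, True), repeat=len(names)):
        state = dict(zip(names, values))
        for name in names:
            flipped = {**state, name: not state[name]}
            if circuit(state) != circuit(flipped):
                relevant.add(name)
    return relevant


def toy_circuit(state):
    """Invented circuit: on with light, or with carbon in the absence of Gln."""
    return state["light"] or (state["carbon"] and not state["Gln"])


NAMES = ("light", "starvation", "carbon", "Gln")
print(sorted(relevant_inputs(toy_circuit, NAMES)))
```

In this toy circuit, starvation never changes the output, which is the kind of input a small combinatorial design is meant to rule out quickly.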
Received August 28, 2001; accepted September 24, 2001.
1 This work was supported by the National Institutes of Health (grant no. GM32877 to G.M.C.) and by the National Science Foundation (grant no. IIS-9988636 to D.E.S.).
* Corresponding author; email gloria.coruzzi{at}nyu.edu; fax 212-995-4204.
www.plantphysiol.org/cgi/doi/10.1104/pp.010683.