## Abstract

Plant nuclear genome size (GS) varies over three orders of magnitude and is correlated with cell size and growth rate. We explore whether these relationships can arise from geometrical scaling constraints. Such constraints would produce an isometric GS–cell volume relationship and a GS–cell diameter relationship with an exponent of 1/3. In the GS–cell division relationship, the duration of processes limited by membrane transport would scale with an exponent of 1/3, whereas the duration of those limited by metabolism would show no relationship. We tested these predictions by estimating scaling exponents from 11 published datasets on differentiated and meristematic cells in diploid herbaceous plants. We found scaling of GS–cell size to almost perfectly match the prediction. The scaling exponent of the relationship between GS and cell cycle duration did not match the prediction. However, this relationship consists of two components: (i) S phase duration, which depends on GS, and has the predicted 1/3 exponent, and (ii) a GS-independent threshold reflecting the duration of the G1 and G2 phases. The matches we found for the relationships between GS and both cell size and S phase duration are signatures of geometrical scaling. We propose that a similar approach can be used to examine GS effects at tissue and whole plant levels.

## 1. Introduction

Sizes of plant nuclear genomes vary over three orders of magnitude from about 0.065 (*Genlisea margaretae*) to 152.23 pg/1C (*Paris japonica*) [1,2]. However, possible ecological implications have begun to be explored only recently [3,4], for a review see [5]. Although the full functional meaning of this variation is yet unknown, it is clear that genome size (GS) must represent certain phenotypic constraints. Cells must be larger to accommodate large genomes, and larger genomes typically take more time to multiply [6,7] (but see [8–10]), thus seemingly constraining the cell growth rate and other plant functional traits. These assumptions led to the formulation of the large genome constraint hypothesis (LGCH) [11]. According to the LGCH, species with large genomes are disadvantaged owing to the costs associated with accumulation and replication of non-coding repetitive DNA. The strongest support for this hypothesis is the correlation between GS and cell size [3,5,12,13] as well as between GS and cell cycle duration [6,14].

In addition to these basic relationships, there are numerous reports of correlations between GS and various plant traits, as well as environmental or evolutionary characteristics [5,11,12,14–20]. Angiosperms with larger genomes often have lower specific leaf areas and, consequently, lower maximum photosynthetic rate [5,11] and lower growth rate [14,15], possibly owing to the competition for phosphorus between DNA and RNA [16]. At the same time, GS in herbs is often associated with larger seed mass [12,17,18] and leaf lifespan [19]. Furthermore, herbaceous species with large genomes are not adapted to extreme environments, as they cannot have a large number of small stomata [12]. The same constraint also holds for woody species, which consequently have smaller genomes [12]. From an evolutionary viewpoint, plant lineages with large genomes also show reduced capacity for fast evolutionary differentiation [11], and are rare in species groups that show adaptive radiation on oceanic islands [20]. Genome size has thus been suggested to be one of the important attributes that constrain plant fitness [21].

In spite of extensive research on correlations of GS with various plant traits, we are still far from understanding specific processes that underlie these correlations. One of the possible yet little explored quantitative approaches is to characterize scaling relationships between GS and proximal variables such as cell size, tissue density and cell division rate. Scaling relationships arise owing to universal (often geometrical) processes or constraints that act over a large range of scales and take the form of scale-invariant power laws [22–26]. The signature of such a scaling relationship (namely, the value of the scaling exponent, typically measured as the slope in the log–log plot) can shed light on the type of the underlying process or constraint [27–31].

Although scaling relationships are common in the biological world and have been extensively studied [32–38], only scant attention has been paid to the scaling of GS with other plant traits. In an early study, Baetcke *et al*. [39] found that the slope of the interspecific relationship between GS and nuclear volume is approximately 1 (i.e. a linear relationship). Similarly, Gregory [40] explored different studies focusing on cell sizes of both plants and vertebrates. He showed that, in plants, nuclear volume scales linearly with cell volume; however, he did not use GS (dataset from Price *et al*. [41]). The only attempt to capture the scaling relationship between GS and cell cycle duration comes from Cavalier-Smith [42] who predicted that cell cycle duration should be proportional to the amount of material that a cell needs to grow and develop, which in turn is constrained by the rate of transport. Assuming that the rate of transport is proportional to the cell surface (which increases with the cell volume with the exponent 2/3), and the GS is an isometric function of cell volume, he inferred that the scaling exponent between GS and cell cycle duration should be 1/3. However, he did not test this prediction using statistical techniques.

All of these pioneering studies used ordinary least-squares (OLS) regression to estimate the scaling exponents. While OLS regression is a standard tool for analysing data where both predictors and dependent variables are clear, its use should be restricted to cases when errors on the *x*-axis are negligible relative to the errors on the *y*-axis; this constraint is particularly important when the relationship is not very strong (see [43]). Thus, this technique is not suited to analysing the relationship between GS and other cell properties, where the independent variable is unclear [44] and relationships are not always very strong [5,11]. In such cases, it is necessary to estimate the scaling exponent by more appropriate techniques, such as standard major axis regression (SMA, sometimes also called reduced major axis, RMA [45,46]).

In the present paper, we attempt to identify some possible relationships between GS and variables at the cell and tissue level based on simple geometrical assumptions and test them using empirical data on diploid herbaceous plant species. We first formulate several predictions on scaling relationships involving GS based on geometrical considerations. Next, we examine these predictions using datasets from a large number of plant species. We take advantage of the fact that the recent advent of fast cytometric techniques has made possible collection of extensive data on many plant species [1,21]. This enabled us to assemble empirical datasets on some of these relationships for quite large numbers of species in order to examine to what degree they conform to theoretical predictions. We have limited ourselves to considering diploid plants only. Although polyploidy does increase total DNA content, it introduces a number of different phenomena that do not occur when the base (monoploid) DNA amount increases [1,21]. This is supported by the evidence that the relationship between intraspecific cell size and GS differs between diploids and polyploids [47–50]. Moreover, it has been shown that increase in ploidy level does not increase cell cycle duration [8–10]. Analysing diploid and recent polyploid species together would thus confound the effects of polyploidy and monoploid DNA amount [21].

## 2. Predictions

We predict that GS, cell size and cell cycle length are constrained by geometrical relationships and thus should scale together. These geometrical relationships act both directly (as in the GS–nuclear size relationship), and indirectly, through limits for the transport of material to the cell and within the cell or through the cell metabolism. As all these constraints are possibly bidirectional, we explicitly do not assume any cause-and-effect chain and limit ourselves to symmetrical scaling relationships.

The simplest geometrical constraint concerns the relationship between nuclear volume and GS. As these are measured in volume units and mass units, they should correspond linearly to each other and their scaling exponent should consequently be 1 (prediction 1).

Further, cell size should scale with nuclear volume and hence with GS. The simplest possible relationship is a geometrical constraint as larger cells are needed to accommodate larger nuclei either physically, owing to cytoskeletal structure or owing to the necessary ratio between RNA synthesis and the rate of protein synthesis [51]. This scaling is between volume–volume units (or volume–mass units), meaning that the relationship should also be linear, i.e. with the scaling exponent of 1 (prediction 2a). Consequently, the scaling exponent for the cell area–GS relationship should correspond to 2/3 owing to the area : volume ratio (prediction 2b). Also, cell length should scale with GS according to the length : volume ratio, i.e. with scaling exponent 1/3 (prediction 2c).
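Written out, and using *V*, *A* and *L* as our shorthand for cell volume, cell area and cell length, predictions 2a–2c amount to the chain below. It assumes that cell shape does not change systematically with size, an assumption implicit in the area : volume and length : volume ratios.

$$
V \propto \mathrm{GS}^{1}, \qquad A \propto V^{2/3} \propto \mathrm{GS}^{2/3}, \qquad L \propto V^{1/3} \propto \mathrm{GS}^{1/3}
$$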

When analysing data on cell lengths, it is necessary to take into account the fact that most empirical cell length data come from measurements of stomatal guard cells. These cells are known to have rather constant sizes and typically do not endoreduplicate [48] (which makes them a favourite subject for comparative studies), but have a strongly constrained (curved) shape. The scaling relationship with other cell variables may be affected by this bend or curvature. We assume that the shape of this bend can be expressed by a power function *y = ax*^{b}, where *x* is the measured length, and *a* and *b* are positive numbers. The length of the bent shape (*S*) is always greater than *x* and scales nonlinearly with the measured length, with the scaling exponent depending on the shape of the cell (i.e. parameters *a* and *b* of the curve), but necessarily greater than 1 (derivation for the simplest case of a quadratic parabola is in the electronic supplementary material). If we denote this scaling exponent *c*, the measured length (*x*) will thus scale with GS with the exponent (1/3) × (1/*c*), which is always lower than 1/3 if *c* > 1 (prediction 2d).
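In symbols, and treating *c* as an unknown shape-dependent constant, prediction 2d follows in two steps. The numerical value in the second line is purely illustrative and is not an estimate from data.

$$
S \propto x^{c} \ (c > 1), \qquad S \propto \mathrm{GS}^{1/3} \;\Longrightarrow\; x \propto S^{1/c} \propto \mathrm{GS}^{1/(3c)}
$$

$$
\text{e.g. } c = 1.5: \quad \frac{1}{3c} = \frac{2}{9} \approx 0.22 < \frac{1}{3}
$$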

Constraints on possible scaling relationships between GS and cell growth and division rates should be due either to matter transport to the cell [42] or to cell metabolism [52] as a function of GS and cell size. During cell growth, DNA synthesis and cell division, the cell would need amounts of material proportional to its volume, with the rate of transport limited by cell surface area. This area increases with a scaling of 2/3 in relationship with cell volume. If transport across the cell surface is the limiting factor, the cell cycle duration should thus be proportional to the amount of material that needs to pass into the cytoplasm, divided by the rate of transport [42]:
$$
T \propto \frac{V}{V^{2/3}} = V^{1/3}, \qquad (2.1)
$$

where *T* is the duration of the cell cycle and *V* is the cell volume. Alternatively, cell cycle duration can also be driven by cell metabolism. Both respiration and photosynthesis take place on organelle membranes (owing to the location of ATP-synthesis complexes). Assuming that an increase in cell size leads to a proportional increase in the number of organelles [53], the rate of whole-cell metabolism should scale with GS with the scaling exponent 1. Therefore, mass-specific metabolic rate should be independent of GS (GS^{1}/GS^{1}). Assuming the duration of the cell cycle is inverse to the cell cycle rate, cell cycle duration should not depend on GS (scaling exponent 0). When considering both cell metabolism and the rate of transport as factors responsible for cell cycle duration, the scaling exponent between GS and cell cycle duration should vary between 1/3 (limitation by the surface : volume ratio only) and 0 (limitation by cell metabolism only; prediction 3).
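The metabolic limit and the resulting range for prediction 3 can be summarized as follows. Here *R* (whole-cell metabolic rate) and *z* (the exponent of the GS–cell cycle duration relationship) are symbols introduced for convenience, and *V* ∝ GS is taken from prediction 2a.

$$
R \propto \mathrm{GS}^{1}, \qquad \frac{R}{V} \propto \frac{\mathrm{GS}^{1}}{\mathrm{GS}^{1}} = \mathrm{GS}^{0} \;\Longrightarrow\; T \propto \mathrm{GS}^{0}
$$

$$
T \propto \mathrm{GS}^{z}, \qquad 0 \le z \le \tfrac{1}{3}
$$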

This prediction is, however, problematic, as the cell cycle consists of four phases and the duration of each phase may depend upon GS differently owing to differences in the processes underlying or limiting each phase. We therefore make specific predictions for the relationship between GS and duration of each phase of the cycle. Because of the demands of the growth phases (G1 and G2), we expect that during these phases cell metabolic rate should be especially important. Thus, the duration of these two phases should not depend on GS; i.e. scaling exponents of relationships between G phase duration and GS should be 0. However, as metabolic rate might be constrained by the rate of transport through the cell membrane, we expect that the scaling exponent between the duration of G phases and GS should vary between 1/3 and 0, as shown earlier (prediction 4a).

Limitation on the S phase is in principle owing to the number of replication origins and replication rate per replicon. However, the dataset of Hof & Bjerknes [7] shows that there is no significant relationship between GS and the fork rate (replication rate per replicon). We thus suggest that the main factor limiting duration of the S phase is the amount of DNA polymerase and other factors (e.g. CDC6 and CDT1, see [54]) important for the activation of replication origins. Given that the amount of the DNA polymerase as well as CDC6 and CDT1 needed for the replication scales linearly with GS, it should be limited by the rate of transport through the nuclear surface. S phase duration will then be constrained by the ratio between the total amount of material needed for replication (which scales at 1 with the GS) divided by the rate of transport (which scales at 2/3 with the nuclear volume). The predicted exponent for the relationship between S phase duration and GS should thus be 1/3 (prediction 4b). We hypothesize that during cell division (M phase), GS (and cell size) limits the development of nuclear membranes and cell walls. Therefore, duration of the M phase should scale linearly with the total surface area of nuclear membranes and cell walls, which in turn scales at 2/3 with the cell volume and thus at 2/3 with GS. We thus predict the scaling exponent between the duration of the M phase and GS to be 2/3 (prediction 4c).
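Predictions 4b and 4c can be written in the same way. The symbols *T*_S and *T*_M (S and M phase durations), *V*_n (nuclear volume) and *A*_tot (total surface area of nuclear membranes and cell walls) are our shorthand, and both *V*_n and *V* are taken to scale isometrically with GS, as in predictions 1 and 2a.

$$
T_{S} \propto \frac{\mathrm{GS}^{1}}{V_{n}^{2/3}} \propto \frac{\mathrm{GS}^{1}}{\mathrm{GS}^{2/3}} = \mathrm{GS}^{1/3}
$$

$$
T_{M} \propto A_{\mathrm{tot}} \propto V^{2/3} \propto \mathrm{GS}^{2/3}
$$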

## 3. Methods

### (a) Datasets

We re-analysed 11 published datasets reporting nuclear volume, cell size, cell cycle duration and cell cycle phase duration (detailed description is given in table 1). Both differentiated and undifferentiated cells were used to obtain data on cell and nucleus sizes; for cell cycle duration, we used only root meristem data where potential constraints owing to GS would show up most clearly owing to their high cell division rates. To reduce noise in the data, we included data only on diploid herbaceous plants. We excluded woody plants, which usually have much smaller genomes because of strong selection on cambial and stomatal cell size [12].

The data came from a total of 269 plant species from 46 families. Genome sizes (2C-value in pg) were taken from original publications and, if missing there, from the Angiosperm DNA C-value database [1]. Ploidy levels were also taken from original publications or from the C-value database [1]; corrections of ploidy levels using other existing cytogenetic data were performed in cases of clear errors (J. Suda 2011, unpublished data). We were able to compile cell cycle phase duration data from different sources (table 1, rows 8 and 9) as they were all measured within the same temperature range (20–23°C) [7,56–59].

### (b) Data analysis

We re-analysed each dataset using SMA regression (R language, package lmodel2; [60]). The scaling exponent for each relationship between GS and different cell parameters was calculated as the slope of the SMA regression fitted to the power-law relationship (i.e. after logarithmic transformation of all variables). We calculated confidence intervals for all scaling exponents. The match of the estimated slopes with the theoretical predictions was assessed by (i) simple numerical closeness of the theoretical and empirical values and (ii) statistical fit of the theoretical and empirical values using 95% confidence intervals of the empirical values.
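As a rough illustration of the estimation step, the sketch below computes an SMA slope on log-transformed data in plain Python. It is not the lmodel2 routine used for the analyses; the function name `sma_fit` and the numbers in `gs`, `noise` and `cell_length` are made up for the example and are not data from table 1.

```python
import math


def sma_fit(x, y):
    """Fit log10(y) = intercept + slope * log10(x) by standard major axis.

    Returns (sma_slope, intercept, r, ols_slope).
    """
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    r = sxy / math.sqrt(sxx * syy)
    # SMA slope: ratio of standard deviations, carrying the sign of r.
    sma_slope = math.copysign(math.sqrt(syy / sxx), r)
    intercept = mean_y - sma_slope * mean_x
    # OLS slope of y on x, shown for comparison; it equals sma_slope * r.
    ols_slope = sxy / sxx
    return sma_slope, intercept, r, ols_slope


# Hypothetical values: genome sizes (pg) and cell lengths following an exact
# 1/3 power law, then perturbed by fixed multiplicative "measurement" factors.
gs = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
noise = [1.10, 0.92, 1.05, 0.95, 1.08, 0.90, 1.04, 0.97]
cell_length = [10.0 * g ** (1.0 / 3.0) * e for g, e in zip(gs, noise)]

slope, intercept, r, ols = sma_fit(gs, cell_length)
print(f"SMA slope = {slope:.3f}, OLS slope = {ols:.3f}, r = {r:.3f}")
```

Because the SMA slope equals the OLS slope divided by *r*, the two estimates diverge as the correlation weakens, which is why the choice of method matters most for the weaker GS relationships.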

As numerically high values of intercepts in untransformed data can seriously distort estimated values of scaling exponents in log–log transformed data [61], we checked all relationships for the intercept effect. We tested for nonlinearity of the relationship in the log–log transformed data by testing the significance of the quadratic term in a quadratic polynomial regression fitted by OLS. If the quadratic term was significant, we estimated the scaling exponent from nonlinear regression (SPSS, v. 10; SPSS Inc., Chicago, IL, USA). We used nonlinear regression to fit the equation *y* = *Cx*^{z} + *a*, using the loss function proposed by Ebert & Russell [61] for model II regression (analogous to SMA) estimation of the *y*-axis intercept.
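To illustrate why the intercept check matters, the following sketch generates data from *y* = *Cx*^{z} + *a* with a known exponent and compares the log–log slope with and without the additive term. All values (`C`, `z_true`, the intercept 6.0 and the `x` grid) are invented for illustration, and the slope is a plain SMA slope rather than the Ebert & Russell loss function, which is not reproduced here.

```python
import math


def loglog_sma_slope(x, y):
    """SMA slope of log10(y) against log10(x) (positive relationships only)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    sxx = sum((a - mean_x) ** 2 for a in lx)
    syy = sum((b - mean_y) ** 2 for b in ly)
    return math.sqrt(syy / sxx)


# Hypothetical parameters, not estimates from the datasets analysed here.
C, z_true = 2.0, 1.0 / 3.0
x = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]

slopes = {}
for a in (0.0, 6.0):
    y = [C * v ** z_true + a for v in x]
    slopes[a] = loglog_sma_slope(x, y)
    print(f"intercept a = {a}: apparent log-log slope = {slopes[a]:.3f}")

# Subtracting a known intercept before transformation recovers the exponent.
y_with_intercept = [C * v ** z_true + 6.0 for v in x]
recovered = loglog_sma_slope(x, [v - 6.0 for v in y_with_intercept])
print(f"after removing the intercept: slope = {recovered:.3f}")
```

In this toy case the additive term flattens the apparent log–log slope well below the true exponent, which is the kind of distortion the intercept check is meant to catch.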

## 4. Results

### (a) Nuclear volume and cell size

There was a strong linear relationship between GS and nuclear volume. The SMA scaling exponent between nuclear volume and GS was 0.96 for shoot meristem cells, 0.85 for root meristem cells and 1.01 for epidermal cells of young leaves (table 2 and figure 1*a*). Confidence intervals of all these relationships covered the predicted value of 1. Also, in accord with our prediction, GS (and nuclear volume) scaled linearly with the cell volume (table 2 and figure 1*b–d*). The strongest relationship occurred between shoot meristem cell volume and nuclear volume (scaling exponent 1.01, *r*^{2} = 0.98; table 2 and figure 1*b*). A similar, but weaker relationship was found between epidermal cell area and GS (scaling exponent 0.64, close to the predicted value of 2/3; table 2 and figure 1*c*). Nevertheless, these results with epidermal cells may be partly affected by endoreduplication [48]. There was a very weak indication of a relationship between GS and pollen width, with a scaling exponent of 0.34 (table 2). Although this was in agreement with our prediction (1/3), the relationship was not significant (*p* = 0.18; 0.08 for a one-tailed test). For guard cell length (table 2 and figure 1*d*), the observed scaling exponent was 0.24, i.e. less than 1/3 and thus in accord with our prediction. The largest deviation from the predicted value was found in the relationship between pollen volume and GS (table 2), with the scaling exponent being 1.16 instead of the predicted 1.

### (b) Cell cycle duration

There was strong nonlinearity in the power-law relationship between cell cycle duration and GS, as shown by the high significance of the quadratic GS term (*p* = 0.004). This was the only relationship with a significant quadratic term. This relationship also had a high value of the intercept in non-transformed data (figure 2*a,b*). The value of the intercept (6.14) corresponded to the minimum value of cell cycle duration. The scaling exponent estimated by nonlinear regression (0.69) did not match the value predicted by the surface : volume or metabolic limitation (table 2).

The relationships between GS and individual cell cycle phases varied widely. Although there was no relationship between G phases and GS (table 2 and figure 3*a,c*), the relationship between duration of the S phase and GS was significantly positive (table 2 and figure 3*b*) and the scaling exponent (0.36) corresponded to the predicted 1/3 (when the same dataset as for other phases was used). When we analysed the larger set of all available S phase data, the scaling exponent remained the same and the relationship was even tighter (table 2 and figure 4). Duration of the M phase increased with GS marginally significantly (table 2 and figure 3*d*). The estimated scaling exponent (0.40) did not numerically match the predicted value (0.67), although the difference was not significant owing to high variation and the low number of observations.

## 5. Discussion

We found strong support for the role of geometry in the relationship between GS, nuclear size and cell size. Scaling exponents of these relationships almost exactly matched expectations based on geometrical considerations. In the case of pollen volume, however, the scaling exponent (1.16) was above the predicted value of 1, although the 95% confidence interval (0.96–1.41) did include this value, so the difference was not significant. The deviation was probably owing to variation in the shape of pollen grains and the fact that an individual pollen grain could contain several nuclei. In addition, pollen grain sizes can be affected by other factors, namely potential selection for the small sizes necessary for successful transport to female flowers. All these factors are likely to confound the relationship by adding unexplained variation. However, the fact that the size of pollen grains still scales with GS despite such a large amount of unexplained variability indicates that GS acts as an important constraint on pollen grain size.

Most importantly, the relationships between GS and both cell size and nuclear size are of simple isometry with no nonlinear component. This indicates that size relationships within cells are not constrained by transport, as this process typically involves a mass or volume over plane relationship, which is not isometric. However, our data show that the nucleus : genome relationship in a cell is constrained by aspects of its cellular background, such as cytoskeletal structure or the ratio between rates of RNA synthesis and of protein synthesis, that are simply proportional to its size. While it has been pointed out earlier by many authors that small cells cannot contain large genomes [40,55], the fact of simple isometry shows that the reverse is also true, i.e. large cells do not contain small genomes, and this relationship is likely owing to a geometrical constraint.

Although there was a strong relationship between GS and cell cycle duration, it did not match our prediction, which was based on a single geometrical constraint. This is clearly owing to differences in correlations between GS and different cell cycle phases. Whereas there was almost no relationship between GS and the duration of growth phases (G1 and G2), M phase and particularly S phase scaled with GS rather strongly. Still, none of the relationships involved simple isometry, indicating the strong role of transport or similar processes in determining cell cycle duration.

The fact that cell growth phases (G1 and G2) were independent of GS, and consequently, of cell size, indicates that the duration of phases of cell growth is not limited by any transport process involving the nucleus. These phases are probably driven by the rate of cell metabolism, which is possibly maximized by number of organelles increasing with GS [53]. In contrast with the G phases, duration of the S phase increased with GS relatively strongly, with the scaling exponent corresponding to the 1/3 predicted by the limitation of the DNA-polymerase transport rate into the nucleus. Duration of this part of the cell cycle thus seems to be limited by a geometrical constraint on DNA replication. The role of geometry in the M phase is unclear owing to the weak relationship, with the estimated value not only differing from the predicted value of 2/3 (limitation by membrane or cell wall construction), but also weakly differing from zero.

Taking into account the different results that we found for individual cell cycle phases, the relationship between GS and total cell cycle duration seems to consist of two components. The first is the duration of the G phases (which are virtually independent of the GS) in determining the cell cycle duration threshold. The second is the role of the S (and possibly M) phases in determining the slope of this relationship. GS thus plays a significant role in the whole plant growth rate by increasing the time needed for DNA polymerization. However, owing to the different proportions of the cell cycle constituted by particular cell cycle phases among different plants [59,62], the slope of the scaling relationship between GS and cell cycle duration is not universal. Its numerical value may, however, be indicative of the relative time that cells spend in the individual cell division phases.
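One way to make the two-component view concrete is the following sketch, which assumes (more strictly than the data require) that only one phase depends on GS. The symbols *T*_G (the combined GS-independent duration), *C* and *z* are introduced here for illustration.

$$
T = T_{G} + C\,\mathrm{GS}^{z}, \qquad \frac{d \ln T}{d \ln \mathrm{GS}} = z\,\frac{C\,\mathrm{GS}^{z}}{T_{G} + C\,\mathrm{GS}^{z}}
$$

Under this assumption, the local log–log slope equals *z* multiplied by the fraction of the cycle spent in the GS-dependent phase, so it approaches 0 when the GS-independent phases dominate and approaches *z* when the GS-dependent phase dominates.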

It is known that a decrease in cell cycle duration is associated with an increase in relative growth rate [47,63], and thus indirectly with photosynthetic rate [64] despite the fact that the relationship need not be straightforward in a population of proliferating cells. These relationships scale up to other plant traits [5] and are fundamental for other biological rates. Cell size scaling has been much less studied in spite of its key role in controlling organ or tissue complexity and thus whole-organism metabolic rate [52]. The separate scaling relationships for cell cycle duration and cell size indicate that their actions are likely to be independent.

While the geometrical scaling strongly indicates intimate functional ties between GS, nuclear volume, cell size and cell cycle duration, it does not say anything about the causal direction of the relationships (see also [44]). In particular, it is not clear whether larger genomes result only from passive accumulation of junk DNA, as is commonly believed. In this scenario, the passive accumulation of junk DNA then constrains plant traits and life histories [11]. However, this is not the only possible basis for evolution of large genomes. For example, small cells may be economically disadvantaged because of their higher metabolic rates and consequently higher energetic demands [65,66]. Increase in the GS may thus be a possible option to diminish the energetic costs of the cell [67] as large cells cannot have small genomes. Moreover, organisms have been demonstrated to both increase and decrease the GS in their cells by accumulation and deletion mechanisms (see [68] for review) and to change the proportion of heterochromatin in response to GS (data from Nagl [69]), suggesting possible mechanisms to cope with excess DNA amounts.

In conclusion, the numerical fit between the predicted and empirical scaling exponents provides strong support for the concept of GS as a phenotypic trait (nucleotypic theory of Bennett [3], etc.). It has also made it possible to identify specific components of this limitation. In particular, size relationships and rate relationships are likely owing to different limitations. At a finer scale, absence of any relationship between GS and both G phases is a strong indication that the surface area : volume ratio is not limiting during cell growth, but does limit the S phase and mitosis. These findings, although tentative and concerning only undifferentiated meristematic cells, show the power of the scaling relationships at the cellular level in spite of the low number of data points available. We are convinced that even finer analysis of geometrical and mechanical constraints in the GS relationships will be possible as additional comparative data are collected. A similar approach might also be possible for examining GS effects at tissue and whole plant levels.

## Acknowledgements

We thank Matyáš Fendrych, Lukáš Kratochvíl, Jonathan Rosenthal, Zuzana Starostová, David Storch, Jan Suda, Arnošt Šizling, Martin Weiser, Viktor Žárský, Hirokazu Tsukaya and three anonymous reviewers for discussions and/or comments on an earlier draft of this paper. The research reported here was supported by research grants MSM0021620845, MSM 0021620828, LC06073 from the Czech Ministry of Education and AV0Z60050516 from the Academy of Sciences of the Czech Republic.

- Received June 20, 2011.
- Accepted August 8, 2011.

- This journal is © 2011 The Royal Society