There are many different ways to measure phenotypic plasticity [1]. In the endeavour to measure the costs of plasticity, researchers often employ large quantitative-genetic experimental designs that rear related groups of organisms in two environments. Under such two-environment designs, plasticity is most often calculated as the difference in trait values across the two environments. Because this metric is simply the difference between two values, it can be correlated, by definition, with one or both of those values (i.e. a part-whole correlation). In Auld *et al*. [2], we discussed how the existence of this correlation can be problematic for subsequent statistical analyses (e.g. assessing the costs of plasticity), not why the correlation exists to begin with. While we appreciate that Roff [3] has mathematically demonstrated why this correlation can exist, we remain concerned with its implications for measuring the costs of plasticity. Below we: (i) revisit why the existence of this correlation is important for statistical analyses and their interpretation, (ii) discuss important discrepancies between empirical observations and the ‘theoretical’ maximum discussed by Roff [3], and (iii) discuss ways to move forward.

Current methodology for measuring the costs of plasticity involves the classic regression approach used to measure selection on any phenotypic trait (e.g. [4]), i.e. a multiple regression of fitness on trait value and plasticity. A significant regression coefficient for plasticity indicates direct selection on plasticity (i.e. a cost or benefit of plasticity). While the complications of multi-collinearity are widely acknowledged in typical selection analysis [5], the correlation between plasticity and trait value that is implicit in the definition of plasticity described above seems to have been largely neglected by studies focused on measuring the costs of plasticity [2].
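
The regression approach described above can be sketched in a few lines. This is a minimal illustration, not the procedure of any particular study: the genotype means, the fitness values and the function names `selection_regression` and `variance_inflation` are all hypothetical, and the trait value entered into the regression is assumed to be the one expressed in environment 1. The variance inflation factor, 1 / (1 − r²) for two predictors, is one common way to quantify how strongly the two predictors are confounded.

```python
import numpy as np

def selection_regression(fitness, trait, plasticity):
    """Ordinary least-squares fit of fitness on trait value and plasticity.

    Returns an array: [intercept, trait coefficient, plasticity coefficient].
    """
    fitness = np.asarray(fitness, dtype=float)
    design = np.column_stack([
        np.ones(len(fitness)),                  # intercept column
        np.asarray(trait, dtype=float),         # trait value
        np.asarray(plasticity, dtype=float),    # plasticity
    ])
    coefs, *_ = np.linalg.lstsq(design, fitness, rcond=None)
    return coefs

def variance_inflation(trait, plasticity):
    """Variance inflation factor for a two-predictor model: 1 / (1 - r^2)."""
    r = np.corrcoef(trait, plasticity)[0, 1]
    return 1.0 / (1.0 - r ** 2)

# Hypothetical genotype means in two environments (illustrative numbers only).
env1 = np.array([10.0, 12.0, 11.0, 15.0, 9.0, 14.0])
env2 = np.array([10.5, 11.0, 11.5, 12.0, 10.0, 11.5])
plasticity = env1 - env2  # two-environment definition of plasticity

# Hypothetical, noise-free fitness built from known coefficients, so that
# the fit can be checked against the values used to construct it.
fitness = 2.0 + 0.5 * env1 - 0.3 * plasticity

coefs = selection_regression(fitness, env1, plasticity)
vif = variance_inflation(env1, plasticity)
```

With noise-free fitness the known coefficients are recovered despite the strong correlation between `env1` and `plasticity`. With real, noisy data, a large `vif` signals that the standard error of the plasticity coefficient is inflated, which is the situation in which a non-significant cost estimate is hard to interpret.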

Interestingly, while costs of plasticity are central to a variety of models for the evolution of plasticity, they are rarely estimated to be statistically significant; studies frequently find them to be non-significant or negligible [6]. One of the points that we made in our review [2] is that, given this correlation between trait value and plasticity, many of the current estimates of the costs of plasticity may be inaccurate. Roff [3] agrees with this point and shows how this correlation can arise by definition. The point therefore remains that multi-collinearity between trait values and plasticity is not acknowledged in the majority of studies on this topic. We urge researchers to consider the strength of this correlation and interpret their results accordingly. As previously noted, the cost of plasticity can only be accurately estimated using the multiple regression method when trait value and plasticity are not highly correlated. As some amount of correlation is inevitable owing to the two-environment definition of plasticity, researchers may be able to arrive at a relatively unbiased estimate of the cost of plasticity when the correlation is weak. Otherwise, alternative metrics of plasticity are required.

In his algebraic demonstration of the basis of this correlation, Roff [3] imagined an ideal situation in which the trait variance in environment 1, denoted as var(*X*), and the trait variance in environment 2, denoted as var(*Y*), are assumed to be equal. Based on our own research and the examples of genotype-by-environment interaction that pervade the literature, we know that this assumption is frequently not met. This is probably why Roff's [3] theoretical maximum correlation between trait value and plasticity of 0.71 is greatly exceeded by empirical data, which exhibit correlation values up to 0.97 (see fig. 2 in [2]). Moreover, just because a correlation between trait values and trait plasticities can exist does not mean that it will exist. For example, in many cases one environment can induce phenotypes that do not vary among genotypes, producing a non-significant correlation, whereas another environment can induce phenotypes that are highly variable among genotypes. We view this as an interesting problem in and of itself.
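
The dependence of this correlation on the two variances can be made explicit with the standard identity cor(*X*, *X* − *Y*) = [var(*X*) − cov(*X*, *Y*)] / √{var(*X*) [var(*X*) + var(*Y*) − 2 cov(*X*, *Y*)]}. The sketch below simply evaluates that identity. It is not a reproduction of the derivation in [3]; the function name `part_whole_correlation` and the input values are hypothetical, and the zero-covariance default is an assumption made for illustration.

```python
import math

def part_whole_correlation(var_x, var_y, cov_xy=0.0):
    """Expected correlation between X and the difference X - Y.

    var_x, var_y: among-genotype trait variances in environments 1 and 2.
    cov_xy: among-genotype covariance of the trait across environments
            (assumed to be zero unless supplied).
    """
    var_diff = var_x + var_y - 2.0 * cov_xy  # variance of X - Y
    if var_x <= 0 or var_diff <= 0:
        raise ValueError("variances of X and of X - Y must be positive")
    return (var_x - cov_xy) / math.sqrt(var_x * var_diff)

# Equal variances, zero covariance: 1 / sqrt(2), about 0.71.
equal = part_whole_correlation(1.0, 1.0)

# Hypothetical case with a 16-fold larger variance in environment 1:
# 16 / sqrt(16 * 17), about 0.97.
unequal = part_whole_correlation(16.0, 1.0)
```

Under these assumptions, equal variances give a value of about 0.71, whereas strongly unequal variances push the correlation towards 1. The 16:1 ratio is chosen only to show the direction of the effect and is not an estimate from any dataset.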

Our general point is that, depending on the definition of plasticity, a correlation between trait value and plasticity may arise, and if so, it presents a problem for current methodology aimed at evaluating the costs of plasticity. When reaction norms are measured across more than two environments, numerous additional ways of measuring plasticity can be employed, most of which will avoid the problem of correlation by definition. In conclusion, we have three suggestions: (i) previous studies of the costs of plasticity using the two-environment definition of plasticity should be re-evaluated to determine the extent of the statistical problem, i.e. how many non-significant estimates are in fact non-significant because of multi-collinearity in the multiple regression test; (ii) in the future, studies should make an effort to measure reaction norms across more than two environments and perhaps consider multiple metrics of plasticity (e.g. height and curvature of the reaction norm); and (iii) future studies that employ a two-environment design should report the correlation between trait value and plasticity (for each environment) so that estimates of the costs can be interpreted accordingly.
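
As one possible illustration of suggestion (ii), reaction-norm metrics can be obtained by fitting a polynomial to each genotype's trait values across environments. The sketch below is only one of many ways to do this and rests on several assumptions that are not part of the text above: three equally spaced environments coded as −1, 0 and 1, height taken as the mean trait value across environments, and slope and curvature taken as the linear and quadratic coefficients. The function name `reaction_norm_metrics` and the data are hypothetical.

```python
import numpy as np

def reaction_norm_metrics(env_values, traits):
    """Per-genotype height, slope and curvature of a reaction norm.

    env_values: 1-D array of environment scores (length E, E >= 3).
    traits: 2-D array of shape (genotypes, E) holding genotype means.
    Returns (height, slope, curvature), each of length `genotypes`.
    """
    env_values = np.asarray(env_values, dtype=float)
    traits = np.asarray(traits, dtype=float)
    # np.polyfit fits every column of a 2-D y at once, so the trait matrix
    # is transposed; coefficients are returned highest power first.
    quad, lin, _ = np.polyfit(env_values, traits.T, deg=2)
    height = traits.mean(axis=1)  # mean trait value across environments
    return height, lin, quad

# Hypothetical genotype means in three environments (illustrative only).
env_values = [-1.0, 0.0, 1.0]
traits = [
    [1.0, 2.0, 3.0],  # linear increase
    [2.0, 2.0, 2.0],  # flat, non-plastic norm
    [1.0, 3.0, 1.0],  # peaked norm
]
height, slope, curvature = reaction_norm_metrics(env_values, traits)
```

Whether any such metric is free of a correlation with trait value in a given dataset is an empirical question, so the correlations among height, slope and curvature would still need to be checked and reported.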

## Footnotes

The accompanying comment can be viewed at http://dx.doi.org/10.1098/rspb.2011.0595.

- Received June 1, 2011.
- Accepted June 2, 2011.

- This journal is © 2011 The Royal Society