In a recent paper published in *Proceedings of the Royal Society B*, Robson & Smith (RS) presented an analysis of a database comprising the life histories of more than 50 000 women born in nineteenth-century UT, USA. They addressed the question of whether a woman's likelihood of producing twins is related to overall phenotypic ‘quality’ [1]. Controlling for a number of confounding factors, they showed that, compared with women who do not produce twins in their lifetime, twinners have lower post-reproductive mortality, shorter inter-birth intervals, later ages at last reproduction, longer reproductive lifespans and higher lifetime fertility. They conclude: ‘our results strongly support the hypothesis that twinning is an index of phenotypic quality associated with other dimensions of maternal heterogeneity’.

We focus here on the question of why women in RS's sample have higher lifetime fertility if they have also produced twins. In this respect, the results of RS's analysis are consistent with those of analyses carried out on smaller databases from natural fertility populations [2–5]. Thus, at first glance, there appears to be a strong basis for inferring that some specific aspect of the propensity to produce twins is biologically linked to lifetime fertility. Indeed, rates of dizygotic twinning (the kind that arises from polyovulation and accounts for most of the inter-societal variation in twinning rates [6]) have been found to be higher in women who conceive more readily [7,8], suggesting that there may be some heterogeneity in fecundability that covaries with twinning propensity. However, while we agree that this hypothesis is both plausible and interesting, comparison of the life-history traits of women who have and have not *ever* produced twins does not allow one to draw any conclusions about the covariation between individual variation in twinning *propensity* and those traits. This is because the fact that a woman has produced twins during her lifetime is a product not only of her propensity to produce twins, but also of the number of times she has given birth.

Even if one assumes that a twin birth is a random event that can happen to all women with equal probability per delivery, one should nonetheless expect that twinning women would have higher lifetime fertility than non-twinning women. Focusing on the number of deliveries rather than children, a simple way to see this is to reverse the implied direction of causation by altering the statement ‘Women who give birth to twins, give birth more’, thus: ‘Women who give birth more, are more likely to eventually give birth to twins’. Put formally, the event of twinning can be seen as a binomial outcome, and the binomial distribution of the number of twin deliveries (*L*) tells us that the lifetime probability of delivering twins at least once, pr(*L* ≥ 1), is

$$\mathrm{pr}(L \geq 1) = 1 - (1 - p)^{d},$$

where *p* is the probability of producing twins during one delivery and *d* is the number of deliveries over a lifetime. As *d* increases, so does pr(*L* ≥ 1), without any other source of covariation between *d* and *L* being necessary. Regardless of how other factors affect the probability of twinning, *p* (e.g. as age does [9]), the positive influence of the number of deliveries on the lifetime probability of delivering twins at least once is inevitable. Women who give birth more accumulate a greater opportunity to become mothers of twins than women who give birth less. Since women who give birth more in their lifetime do so either by having a longer reproductive lifespan or by giving birth at a faster rate, we should also expect that mothers of twins will tend to have a longer reproductive lifespan (achieved through an earlier age at first reproduction, a later age at last reproduction, or both) as well as shorter inter-birth intervals, as RS also describe.
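
As a numerical illustration of this relationship, the minimal sketch below evaluates 1 − (1 − *p*)^*d*, the complement of the probability that none of *d* independent deliveries is a twin delivery. The function name and the example delivery counts are hypothetical and chosen only for illustration; the per-delivery probability of 1.26 per cent is the value assumed for the pre-1870 cohort in the simulations.

```python
def lifetime_twinning_probability(p: float, d: int) -> float:
    """Probability of at least one twin delivery in d independent deliveries,
    each of which is a twin delivery with probability p."""
    return 1.0 - (1.0 - p) ** d


# Illustrative delivery counts (hypothetical), with p = 1.26 per cent.
for d in (1, 4, 8, 12):
    print(d, round(lifetime_twinning_probability(0.0126, d), 4))
```

With *p* held fixed, the printed probability rises with every additional delivery, which is the only point the sketch is meant to make.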

In terms of understanding the role of biological variation in twinning propensity in the scheme of life-history variation, the question must thus be whether lifetime fertility and its associated traits differ once the expected higher fertility of twinning mothers is accounted for. We describe here an analysis based on simulations of the data presented by RS of two cohorts of women from UT, USA [1] in order to demonstrate the potential significance to RS's question of the problem we outline. Please note that: (i) this analysis is not intended as a replacement of RS's detailed statistical models, but as a complementary illustration to the argument made above; and (ii) regardless of the results of this analysis, the argument stands *a priori*.

We produced a distribution of deliveries per mother matching as closely as possible the characteristics of RS's sample, and then simulated twinning assuming a *constant rate* per delivery for all women of a given cohort (pre-1870 and 1870–1900). This rate was adjusted in each cohort in accordance with the parameters of a truncated negative binomial distribution, so that the overall number of mothers producing twins also closely matched the sample used by RS. We then compared the difference between the lifetime fertility of twinning and non-twinning mothers in our simulated data with that reported in the original analysis.

One thousand simulations were carried out for each cohort. For the pre-1870 cohort, the 21 150 mothers in our simulated datasets produced on average 8.28 ± 3.04 (s.d.) deliveries. Assuming a constant twinning probability of 1.26 per cent per delivery, this translated to 8.39 ± 3.10 children per mother and a lifetime probability of becoming a twin mother of 0.099, compared with 8.39 ± 3.08 children per mother and a lifetime twinning probability of 0.098 in the original sample. For the 1870–1900 cohort, the characteristics of our simulated datasets were also very close to those of the original sample: the 37 636 mothers produced on average 5.71 ± 2.99 deliveries. With a constant twinning probability of 1.31 per cent per delivery, this translated to 5.79 ± 3.04 children per mother and a lifetime twinning probability of 0.072, compared with 5.72 ± 3.11 children per mother and a lifetime twinning probability of 0.070 in the original sample.
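
The logic of such a simulation can be sketched in a few lines. The example below is a minimal illustration and not the code used for the analysis: the function name `simulate_cohort` is hypothetical, and the number of deliveries per mother is drawn from a rounded normal distribution floored at one, which is a simplifying assumption and not the distribution used above. Only the cohort size, the mean and s.d. of deliveries and the per-delivery twinning probability are taken from the pre-1870 figures.

```python
import random
from statistics import mean


def simulate_cohort(n_mothers, mean_deliveries, sd_deliveries, p_twin, seed=0):
    """Simulate one cohort in which every delivery has the same twinning
    probability, then compare children per mother between twinners and
    non-twinners."""
    rng = random.Random(seed)
    twin_children, non_twin_children = [], []
    for _ in range(n_mothers):
        # ASSUMPTION: rounded normal floored at 1 delivery; a stand-in for
        # the delivery distribution, not the one used in the analysis.
        d = max(1, round(rng.gauss(mean_deliveries, sd_deliveries)))
        # Each delivery is independently a twin delivery with probability p_twin.
        twin_deliveries = sum(rng.random() < p_twin for _ in range(d))
        children = d + twin_deliveries  # one extra child per twin delivery
        if twin_deliveries:
            twin_children.append(children)
        else:
            non_twin_children.append(children)
    # Both groups are non-empty for cohorts of realistic size.
    return {
        "share_twin_mothers": len(twin_children) / n_mothers,
        "fertility_gap": mean(twin_children) - mean(non_twin_children),
    }


result = simulate_cohort(21150, 8.28, 3.04, 0.0126)
print(result)
```

Because the delivery distribution here is only a stand-in, the printed values should not be expected to reproduce the figures in the text exactly; the point is that a gap in lifetime fertility appears even though twinning probability per delivery is identical for all simulated mothers.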

In RS's original paper, lifetime fertility was 1.9 children higher in twinning women than in non-twinning women in the pre-1870 cohort, and 2.3 higher in the 1870–1900 cohort. Our simulation of the population, in which all women have *precisely the same chance* of giving birth to twins with each delivery, produced corresponding values of 2.23 ± 0.072 and 2.66 ± 0.062. These results clearly demonstrate that the parities of twinning mothers can easily exceed the automatic advantage of one child even when the per-delivery probability of twinning is not related to women's lifetime fertility. Importantly, although we calibrated our demonstration on the distribution of fertility of the sample used in RS's study, this result holds for a wide range of conditions.
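
The way in which the gap exceeds the automatic advantage of one child can also be examined without simulation. The sketch below computes the expected difference in children per mother exactly for any given distribution of deliveries, under the same assumption of a constant per-delivery twinning probability. The function name and the two example distributions (all mothers having five deliveries, and deliveries spread uniformly between 1 and 12) are hypothetical and serve only to contrast a population with no variation in deliveries against one with substantial variation.

```python
def expected_fertility_gap(delivery_pmf: dict[int, float], p: float) -> float:
    """Expected children per mother among twinners minus that among
    non-twinners, when each delivery is a twin delivery with probability p.

    delivery_pmf maps a number of deliveries d to its probability.
    """
    w_twin = w_non = c_twin = c_non = 0.0
    for d, prob in delivery_pmf.items():
        q_none = (1.0 - p) ** d      # no twin delivery among d deliveries
        w_non += prob * q_none
        c_non += prob * q_none * d   # non-twinners: one child per delivery
        w_twin += prob * (1.0 - q_none)
        # Twinners contribute d children each plus all twin deliveries;
        # the expected number of twin deliveries is d * p.
        c_twin += prob * (d * (1.0 - q_none) + d * p)
    return c_twin / w_twin - c_non / w_non


# Hypothetical example distributions of deliveries per mother.
same_d = expected_fertility_gap({5: 1.0}, 0.0126)
uniform_pmf = {d: 1 / 12 for d in range(1, 13)}
varied_d = expected_fertility_gap(uniform_pmf, 0.0126)
print(f"identical deliveries: {same_d:.2f}, varied deliveries: {varied_d:.2f}")
```

When every mother has the same number of deliveries, the gap stays close to one child, which is the automatic advantage of a twin delivery. Once the number of deliveries varies among mothers, the gap grows, because mothers with more deliveries are over-represented among twinners.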

Our simulation suggests that twinning women in RS's sample might not have higher lifetime fertility than can be predicted once their higher number of opportunities to produce twins (births) has been taken into account. While we did not analyse the corresponding predictions for age at last reproduction, reproductive lifespan and inter-birth interval length, it seems likely that differences in these traits between twinning and non-twinning women could also be driven by the expected higher lifetime fertility of the former group. The results of our simulation show that a formal reanalysis that takes into account, or otherwise circumvents, the expected higher fertility of twinning versus non-twinning women would be necessary before any conclusions could be drawn about covariation between twinning propensity and other life-history traits. As mentioned above, reports of shorter conception times in dizygotic twinning women [7,8] suggest that the existence of such differences may be biologically plausible, and so we agree with RS that this question is worthy of study, despite disagreeing with their methodology.

The specific finding of RS that post-reproductive lifespan is longer in twinning mothers is an interesting one that is not confounded by the artefact we describe. However, the confounding of lifetime fertility and twinning status does have implications for the biological interpretation of this relationship. It seems likely that post-reproductive lifespan is related less to twinning propensity than to fertility. In support of this, age at last reproduction has been shown to be associated with post-reproductive survival in another natural fertility population [10].

Studies of trait covariation provide useful contributions to our understanding of the mechanisms of life-history traits, which in turn play a vital role in understanding the reasons for variation in such traits. However, it is important to give careful consideration to potential statistical artefacts that can arise in non-experimental designs, in order to avoid misinterpretations of the true nature of variation.

## Acknowledgments

We thank the European Research Council and Wissenschaftskolleg zu Berlin for their support and three anonymous referees and the authors of the original article for their comments.

## Footnotes

The accompanying reply can be viewed at http://dx.doi.org/10.1098/rspb.2012.0436.

- Received January 26, 2012.
- Accepted February 1, 2012.

- This journal is © 2012 The Royal Society