## Abstract

The island rule states that after island colonization, larger animals tend to evolve reduced body sizes and smaller animals increased sizes. Recently, there has been disagreement about how often, if ever, this rule applies in nature, and much of this disagreement stems from differences in the statistical tests employed. This study shows how different tests of the island rule assume different null hypotheses, and that these rely on quite different biological assumptions. Analysis and simulation are then used to quantify the biases in the tests. Many widely used tests are shown to yield false support for the island rule when island and mainland evolution are indistinguishable, and so a Monte Carlo permutation test is introduced that avoids this problem. It is further shown that tests based on independent contrasts lack power to detect the island rule under certain conditions. Finally, a complete reanalysis is presented of recent data from primates. When head–body length is used as the measure of body size, reports of the island rule are shown to stem from methodological artefacts. But when skull length or body mass is used, all tests agree that the island rule does hold in primates.

## 1. Introduction

The island rule (also known as Foster's Rule) describes an apparent centralizing tendency in the body size evolution of island endemic animals, with large-bodied taxa tending to evolve smaller body sizes after colonization, and small-bodied taxa tending to increase in size (Foster 1964; Van Valen 1973; Lomolino 1985, 2005; Meiri *et al*. 2008).

While many spectacular cases of island dwarfism or gigantism are known, rigorous study of the rule requires statistical testing of large comparative datasets. Using this approach, recent studies have claimed that the island rule holds in a range of vertebrate groups (Lomolino 1985, 2005; Clegg & Owens 2002; Boback & Guyer 2003; Bromham & Cardillo 2007), and even in distantly related animals in analogous ecological situations (such as the Oligocene colonization of the deep-sea benthos by gastropod molluscs; McClain *et al*. 2006). But other studies, sometimes with partially overlapping data, have found no evidence that the rule applies (Meiri *et al*. 2004, 2006, 2008; Meiri 2007). In one respect, these disagreements are encouraging; comparing situations where the island rule does and does not hold might allow us to test the various causal hypotheses associated with the rule (MacArthur & Wilson 1963; Foster 1964; Van Valen 1973; Heaney 1978; Smith 1992; Clegg & Owens 2002; Palkovacs 2003; Lomolino 2005). Unfortunately, it is likely that much of the disagreement reflects the statistics used, rather than informative ecological differences between the groups studied (Meiri 2007; Price & Phillimore 2007; Meiri *et al*. 2008). Distinguishing real biological differences from methodological artefacts is therefore crucial if we are to understand the importance of different ecological factors in morphological evolution.

Here, the various statistical tests of the island rule are examined in detail, clarifying the situations in which they can be misleading. As a case study, we refer throughout to the data from primates compiled by Bromham & Cardillo (2007). This group has attracted particular attention due to the controversy over *Homo floresiensis*, a subfossil hominid, and putative island dwarf (Brown *et al*. 2004; Martin *et al*. 2006; Bromham & Cardillo 2007; Niven 2007; Köhler *et al*. 2008).

## 2. Tests of the island rule and their null hypotheses

All tests discussed here use phylogenetically independent pairs of taxa, comprising one island endemic and one mainland relative, each with body size measurements denoted *S*_{i} and *S*_{m}. The most common test is the regression of the island–mainland size ratio, *R*=*S*_{i}/*S*_{m}, onto the mainland body size, *S*_{m}. A negative slope is consistent with the island rule (Lomolino 1985). The assumptions of this parametric test, such as equal variances for all data points, are unlikely to hold for comparative data (Martin & Barbour 1989; Martin *et al*. 2005), particularly if the island colonizations were widely spaced in time, and pairs represent different amounts of evolutionary change (Felsenstein 1988; Garland *et al*. 1992). This probably is the case for the primate dataset (figure 1), which contains pairs placed commonly in separate genera (the simakobu and proboscis monkeys: *Simias concolor*–*Nasalis larvatus*), and pairs not commonly assigned subspecies status (the Cameroon and Bioko populations of *Galago alleni*). To deal with this heterogeneity, non-parametric versions of the test, using the signs or rank orders of ln *R*, have been proposed (Meiri *et al*. 2004; Bromham & Cardillo 2007).
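The ratio regression described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in any of the cited studies: the function names are hypothetical, both variables are log-transformed (an assumption made here for convenience), and only the slope is computed, without the accompanying significance test.

```python
import math


def ols_slope(x, y):
    """Ordinary-least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def ratio_regression_slope(island_sizes, mainland_sizes):
    """Slope of ln R on ln S_m, where R = S_i / S_m for each pair.

    A negative slope is the pattern predicted by the island rule.
    Log-transforming both axes is an assumption of this sketch.
    """
    log_m = [math.log(m) for m in mainland_sizes]
    log_r = [math.log(i / m) for i, m in zip(island_sizes, mainland_sizes)]
    return ols_slope(log_m, log_r)


if __name__ == "__main__":
    # Hypothetical sizes: every island form converges on the same size.
    mainland = [1.0, 2.0, 4.0, 8.0]
    island = [2.0, 2.0, 2.0, 2.0]
    print(ratio_regression_slope(island, mainland))
```

In the hypothetical data, small mainland forms have larger island relatives and large mainland forms have smaller ones, so the slope is negative.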

All tests of this kind treat the ratio *R* as an estimator of the evolutionary change undergone by the island species, and so perform best when this estimator is most reliable, i.e. when mainland populations remain close to the common ancestral state of the island–mainland pair. For this reason, such methods can be understood as tests of ‘null hypothesis A’ as described in table 1. This is equivalent to noting that such tests neglect error in the mainland body size measurements (Atchley *et al*. 1976; Meiri 2007; Price & Phillimore 2007), because the ‘error’, in this context, includes any variation in evolutionary outcome that might result from a given ancestral state (Felsenstein 1985, 1988).

A distinct approach to testing the island rule is to regress *S*_{i} directly on to *S*_{m} using standardized major axis (SMA) regression (also known as Model II or reduced major axis regression; Sokal & Rohlf 1995). A slope less than one is consistent with the island rule (Lomolino 1985; Meiri 2007; Price & Phillimore 2007). The SMA regression can be understood as a test of ‘null hypothesis B’: that island and mainland evolution are indistinguishable (table 1). (Ordinary-least-squares (OLS) regression neglects error in the mainland body sizes, and is therefore a test of null A, essentially identical to those mentioned above.) SMA regression is a parametric test, but mainland and island taxa are interchangeable under null B (table 1), which suggests that a Monte Carlo permutation approach could also be used (Edgington 1995). Specifically, we generate a large number of permutations of the data, in which the member of each pair designated as the ‘island’ taxon is chosen at random (permutations take place only within each pair, and so each species is always paired with its nearest relative). Carrying out the same regression on these randomly permuted sets yields a null distribution of the test statistic. This is a non-parametric test of null B, and so avoids the problematic assumptions of normality and equal variances (Martin & Barbour 1989; Meiri *et al*. 2004; Martin *et al*. 2005; Bromham & Cardillo 2007).
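The within-pair permutation procedure can be sketched as follows. Several details are assumptions of this sketch rather than statements of the method as published: the test statistic is taken to be the SMA slope of ln *S*_{i} on ln *S*_{m}, the test is one-sided (small slopes count as support for the island rule), and the *p*-value adds one to numerator and denominator so that it can never be exactly zero. Function names and default values are hypothetical.

```python
import math
import random


def sma_slope(x, y):
    """Standardized major axis slope of y on x: sign(cov) * sd(y) / sd(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * math.sqrt(syy / sxx)


def permutation_test(island_sizes, mainland_sizes, n_perm=2000, seed=0):
    """One-sided Monte Carlo permutation test of null B.

    Within each pair, the 'island' label is reassigned at random, so each
    taxon always stays paired with its nearest relative.
    Returns (observed SMA slope, p-value).
    """
    rng = random.Random(seed)
    log_i = [math.log(s) for s in island_sizes]
    log_m = [math.log(s) for s in mainland_sizes]
    observed = sma_slope(log_m, log_i)
    hits = 0
    for _ in range(n_perm):
        perm_i, perm_m = [], []
        for a, b in zip(log_i, log_m):
            if rng.random() < 0.5:  # swap island/mainland labels in this pair
                a, b = b, a
            perm_i.append(a)
            perm_m.append(b)
        if sma_slope(perm_m, perm_i) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical data: island log sizes compressed towards the centre.
    mainland = [math.exp(k) for k in range(10)]
    island = [math.exp(2.25 + 0.5 * k) for k in range(10)]
    print(permutation_test(island, mainland))
```

Because labels are exchanged only within pairs, the null distribution respects the pairing structure and makes no assumption of normality or equal variances.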

Table 1 makes clear that both null A and B can be correctly rejected if the other holds. So which is a more suitable null model for testing the island rule? Arguably, null A is more realistic if island colonization greatly accelerates morphological evolution (Reznick & Ghalambor 2001; Millien 2006; but see Pérez-Claros & Aledo 2007). But this may not be a safe assumption. For example, the primate dataset contains many inhabitants of the Mentawai islands off Sumatra (online appendix 1 in the electronic supplementary material), and morphological, behavioural and genetic data suggest that most of these Mentawai primates are ancestral to their nearest mainland relatives (Brandon-Jones 1998; Ziegler *et al*. 2007; see also Bellemain & Ricklefs 2008). In this case, any acceleration of evolutionary rates in novel environments would be as likely to affect the mainland taxa.

More important is the direction of the biases to which the tests are subject. If null B holds, then all tests of null A are biased towards false detection of the island rule. This is because mainland populations that have increased in size are more likely to be classified as ‘large’, and also more likely to be larger than their island relatives—regardless of whether the latter have shown any centralizing tendency. The reverse applies if the mainland populations have decreased in size. By contrast, when null A holds, tests of null B will tend to be conservative with respect to the island rule. This is because accelerated evolution in the island endemics, in the absence of any centralizing tendency, would tend to make island body sizes more diverse than mainland body sizes—the exact reverse of the pattern predicted by the island rule. These considerations make tests of null B (such as SMA regression or the permutation test) preferable to tests of null A.
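The direction of the first bias can be seen in a short calculation. This is a sketch under simplifying assumptions that are introduced here and are not taken from the source: sizes are log-transformed, *a* denotes the log ancestral size of a pair (with variance *V*_{a} across pairs), and under null B each lineage adds an independent deviation (*e*_{i} or *e*_{m}) with mean zero and common variance *v*.

$$
\ln S_m = a + e_m, \qquad \ln S_i = a + e_i, \qquad \ln R = \ln S_i - \ln S_m = e_i - e_m
$$

$$
\operatorname{Cov}(\ln R,\ \ln S_m) = -v, \qquad \operatorname{Var}(\ln S_m) = V_a + v, \qquad \beta_{\mathrm{OLS}} = \frac{-v}{V_a + v} < 0
$$

$$
\operatorname{Var}(\ln S_i) = \operatorname{Var}(\ln S_m) = V_a + v \quad\Longrightarrow\quad \beta_{\mathrm{SMA}} = \sqrt{\frac{\operatorname{Var}(\ln S_i)}{\operatorname{Var}(\ln S_m)}} = 1
$$

Under these assumptions, the regression of ln *R* on ln *S*_{m} has a negative expected slope whenever *v* > 0, even though neither lineage shows any centralizing tendency, whereas the expected SMA slope of ln *S*_{i} on ln *S*_{m} equals one.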

Recently, a new approach to testing the island rule has been introduced by Meiri *et al*. (2008). These authors compared the variables *R* and *S*_{m} using the method of independent contrasts (Felsenstein 1985), arguing that all previous tests involved pseudo-replication due to a failure to correct for phylogeny (Meiri *et al*. 2008). In one respect, all methods discussed above did account for phylogeny: no instance of body size evolution was counted more than once, because all island–mainland pairs were phylogenetically independent. But Meiri *et al*.'s (2008) argument concerns not body size but a different trait: ‘the tendency to change body size in a particular way after island colonization’. Their method is a test for correlated evolution between body size and this tendency (whose value is estimated from the ratio *R*); it can therefore be understood as a test of a quite distinct ‘null hypothesis C’ listed in table 1. If null C holds, then tests of null A or B can yield misleading results. For example, if the primate family Galagonidae share an inherited tendency to small body size, and, by coincidence, an inherited tendency to increase in size after island colonization, then multiple colonizations by multiple Galagonidae would be treated as independent confirmations of the island rule by standard tests, but as non-independent expressions of a shared tendency by independent contrasts. Standard tests are not directionally biased (Price 1997), and so if null C held generally in nature, then tests of null B would be expected to yield false support equally often for the island rule, and for its opposite (i.e. an apparent tendency on islands for large animals to increase in size and small animals to decrease); but only independent contrasts would yield correct *p*-values.

The only substantial criticism of independent contrasts is that, unlike tests of null B, it gives systematically biased results if evolution differs fundamentally from its Brownian motion model (Felsenstein 1985; Price 1997). This is a particular problem if the violations cannot be detected by the standard diagnostic tests (Garland *et al*. 1992; Freckleton & Harvey 2006). Niche-partitioning evolution, which does violate the method's assumptions, may be relevant to the primate dataset, because it contains multiple colonizations of the same small insular habitats (Price 1997; Freckleton & Harvey 2006; Bromham & Cardillo 2007; online appendix 1 in the electronic supplementary material). More generally, the island rule predicts that the ratio *R* will show phylogenetic signal, regardless of whether a tendency for directional change truly has evolved over the phylogeny (this is because *R*=*S*_{i}/*S*_{m}, and if the island rule holds, *S*_{i} will show less dependence than *S*_{m} on their common ancestral state). This is an undetectable deviation from the method's assumptions.

## 3. Quantifying biases in tests of the island rule

How serious are the biases discussed above? To answer this question, simulated data were generated under each of the null models described in table 1. Parameter values were chosen so that simulated *S*_{m} values resembled closely the real set of mainland primate head–body length measurements (Bromham & Cardillo 2007), and used the topology and relative branch lengths depicted in figure 1.

Analysis of a Brownian motion model of evolution indicated that an important parameter would be the amount of evolutionary change undergone by each pair, compared with the spread of body sizes across the sampled pairs. This quantity is proportional to the typical divergence dates of the island–mainland pairs, *t*_{pair}, divided by the age of the root of phylogeny, *t*_{root} (see online appendix 2 in the electronic supplementary material). If the common ancestor of primates lived close to the Cretaceous/Tertiary boundary (Wible *et al*. 2007), what value of *t*_{pair}/*t*_{root} might plausibly characterize the primate data? Many of the island–mainland pairs probably split after the most recent isolation of the relevant island, usually the end of the last glacial maximum 12 000 years ago (Brandon-Jones 1998; Bromham & Cardillo 2007; Ziegler *et al*. 2007). But molecular date estimates, where available, are sometimes much older. Recent estimates include 2.4–2.6 Myr ago for *Macaca nemestrina*–*Macaca pagensis* (Ziegler *et al*. 2007), 0.8–1.4 Myr ago for *Presbytis femoralis*–*Presbytis natunae* (Meijaard & Groves 2004), and 1 Myr ago for *Macaca radiata*–*Macaca sinica* (Chakraborty *et al*. 2007; see also Tosi *et al*. 2003; Chu *et al*. 2007; Ting 2008). Even given possible acceleration of the molecular clock on islands (Woolfit & Bromham 2005), these all suggest splits substantially older than 12 000 years—and this has implications for other splits involving the same islands. Accordingly, simulations were run with *t*_{pair}/*t*_{root}=0.01, and varied by an order of magnitude in either direction.
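The role of *t*_{pair}/*t*_{root} can be illustrated with a small simulation. This is a deliberately simplified sketch, not a reproduction of the simulations reported here: it assumes a star-like phylogeny (so that ancestral log sizes are independent draws), scales the Brownian variance so that the total variance of mainland log size equals one, and uses a hypothetical sample of 30 pairs. Under those assumptions and null B, the expected OLS slope of ln *R* on ln *S*_{m} is −*t*_{pair}/*t*_{root}.

```python
import math
import random


def ols_slope(x, y):
    """Ordinary-least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def mean_slope_under_null_b(t_ratio, n_pairs=30, n_reps=500, seed=1):
    """Average OLS slope of ln R on ln S_m when island and mainland
    lineages evolve identically (null B).

    t_ratio stands for t_pair / t_root. Variances are scaled so that
    Var(ancestor) = 1 - t_ratio and each lineage adds variance t_ratio
    (a star-phylogeny simplification assumed for this sketch).
    """
    rng = random.Random(seed)
    sd_ancestor = math.sqrt(1.0 - t_ratio)
    sd_lineage = math.sqrt(t_ratio)
    total = 0.0
    for _rep in range(n_reps):
        log_m, log_r = [], []
        for _pair in range(n_pairs):
            ancestor = rng.gauss(0.0, sd_ancestor)
            island = ancestor + rng.gauss(0.0, sd_lineage)
            mainland = ancestor + rng.gauss(0.0, sd_lineage)
            log_m.append(mainland)
            log_r.append(island - mainland)
        total += ols_slope(log_m, log_r)
    return total / n_reps


if __name__ == "__main__":
    for ratio in (0.001, 0.01, 0.1):
        print(ratio, mean_slope_under_null_b(ratio))
```

The spurious negative slope grows roughly in proportion to the ratio, which is why the quantity matters for the false-positive rates of tests that assume null A.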

All standard tests of the island rule were applied to each simulated dataset. These tests include the OLS regression of *R* on ln *S*_{m} (Lomolino 1985), a non-parametric sign test (Bromham & Cardillo 2007) and the SMA and OLS regressions of ln *S*_{i} on ln *S*_{m} (Lomolino 1985; Price & Phillimore 2007). This test was repeated with the sizes not logarithmically transformed, to quantify the effects of violating the parametric assumptions. In addition to the standard *t*-test, *p*-values for each regression were calculated using the Monte Carlo permutation approach introduced above. Finally, the method of independent contrasts was implemented (Meiri *et al*. 2008), using the true phylogeny under which the data were simulated (figure 1), and transforming branch lengths to maximize the model fit (see appendix A).

Table 2 shows the proportion of simulation trials for which each method yielded false support for the island rule. When data were simulated under null A (table 1), the standard tests generated the correct 5 per cent error level, and so, importantly, did independent contrasts. The permutation tests and SMA regression, which are tests of null B, were generally conservative if null A held, supporting the island rule in less than 5 per cent of cases. The sole exception was the SMA regression when body sizes were not log transformed, which generated high false-positive error rates (this will be relevant for interpreting the primate results below). When data were simulated under null B, i.e. when island and mainland evolution were identical to each other, only the permutation tests and SMA regression with log-transformed data gave acceptable results. All other tests yielded high levels of false support for the island rule. The error rate depended strongly on the value of *t*_{pair}/*t*_{root}, being very close to correct levels when *t*_{pair}/*t*_{root}=0.001, twice or three times higher when *t*_{pair}/*t*_{root}=0.01, and very high indeed when *t*_{pair}/*t*_{root}=0.1. Finally, simulating data under null C (table 1) confirmed that all standard tests yield very high false-positive rates (20–30 per cent), with only independent contrasts performing well.

We now examine the performance of independent contrasts when the island rule does hold, but when the tendency to change size on islands does not evolve by Brownian motion. Data sets were simulated as follows: after colonization, island species evolved via a random walk, but with a centralizing tendency controlled by a parameter *λ*. Larger values of *λ* imply a stronger tendency for small colonists to increase in size, and large colonists to decrease. (Formally, the model is the Ornstein–Uhlenbeck process, widely used in the literature and described fully in online appendix 2 in the electronic supplementary material.) No centralizing tendency was present elsewhere on the tree, and mainland populations remained in their ancestral state after colonization, such that null A would hold when *λ*=0. (In this case, the method yields correct false-positive error rates; table 2.) When *λ*>0, the island rule holds, and both *S*_{m} and *R* might show phylogenetic signal, but no trait estimable by *R* has evolved by Brownian motion. Table 3 quantifies our power to detect the island rule when modelled in this way. Independent contrasts are compared with the standard regression of *R* on *S*_{m}, and the conservative sign test (it is not meaningful to include tests with incorrect false-positive error rates). Table 3 shows that independent contrasts have less power than the standard regression in every case. The loss of power is least when the true effect is weak (i.e. when *λt*_{pair}/*t*_{root} is small) because all methods have correct Type I error rates, but can be substantial when the true effect size is strong (when *λt*_{pair}/*t*_{root} is large).
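The island-only centralizing model can be sketched with the standard transition distribution of an Ornstein–Uhlenbeck process. The parameterization below is an assumption of this sketch (the exact model is given in the supplementary material, which is not reproduced here): the optimum is set to the mean ancestral log size, the ancestors are drawn independently, and all numerical values and function names are hypothetical. Mainland taxa keep their ancestral state, as described above.

```python
import math
import random


def ols_slope(x, y):
    """Ordinary-least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate_island_log_size(ancestor, lam, t, sigma, optimum, rng):
    """Draw an island log size after time t under an OU process.

    lam is the strength of the centralizing tendency; lam = 0 reduces
    to Brownian motion with variance sigma^2 * t.
    """
    if lam == 0.0:
        mean = ancestor
        var = sigma ** 2 * t
    else:
        decay = math.exp(-lam * t)
        mean = optimum + (ancestor - optimum) * decay
        var = sigma ** 2 * (1.0 - math.exp(-2.0 * lam * t)) / (2.0 * lam)
    return rng.gauss(mean, math.sqrt(var))


def mean_ratio_slope(lam, t=0.5, sigma=0.3, n_pairs=30, n_reps=200, seed=2):
    """Average OLS slope of ln R on ln S_m over simulated datasets."""
    rng = random.Random(seed)
    optimum = 5.0  # hypothetical optimum = mean ancestral log size
    total = 0.0
    for _rep in range(n_reps):
        log_m, log_r = [], []
        for _pair in range(n_pairs):
            ancestor = rng.gauss(optimum, 1.0)
            island = simulate_island_log_size(ancestor, lam, t, sigma, optimum, rng)
            log_m.append(ancestor)  # mainland stays at the ancestral state
            log_r.append(island - ancestor)
        total += ols_slope(log_m, log_r)
    return total / n_reps


if __name__ == "__main__":
    for lam in (0.0, 0.5, 2.0):
        print(lam, mean_ratio_slope(lam))
```

In this parameterization the expected slope is exp(−*λt*) − 1, so it is zero when *λ*=0 (null A) and becomes more negative as the centralizing tendency strengthens.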

## 4. Reanalysis of primate data

With the results above in mind, the primate data are now reanalysed using the best performing tests: the permutation test of null B, and the independent contrasts test of null C (table 1). These tests are compared to the standard test of null A, the regression of *R* on *S*_{m}. Results are shown in figure 2.

When head–body length is used as the measure of body size (figure 2*a*), results are inconsistent. The raw regression suggests significant support for the island rule, but neither the permutation approach nor independent contrasts reaches significance. We cannot therefore reject the possibility that head–body evolution in primates is unchanged by island colonization (null B), or that a tendency for directional change does exist, but evolves independently of pre-colonization body length (null C). By contrast, Bromham & Cardillo (2007) found significant support for the island rule using head–body length and the SMA regression of *S*_{i} on *S*_{m} (a valid test of null B). Online appendix 3 in the electronic supplementary material shows that this result is attributable to heterogeneity of variance, which was shown in table 2 to generate high false-positive error rates.

In contrast to head–body length, when skull length is the measure of body size, all three tests provide consistent support for the island rule in primates, with all three null hypotheses rejected (figure 2*b*). Equally consistent and highly significant support is provided when body mass is used (figure 2*c*). The major conclusions of Bromham & Cardillo (2007) are therefore confirmed by the more conservative tests.

## 5. Discussion

Despite its unhelpful reputation as a supposed ‘law of nature’, the island rule remains an active area of research, and a valuable case study in the interactions of ecological and evolutionary processes (e.g. Clegg & Owens 2002; Boback & Guyer 2003; Brown *et al*. 2004; Meiri *et al*. 2004, 2006; Lomolino 2005; Martin *et al*. 2006; McClain *et al*. 2006; Bromham & Cardillo 2007; Meiri 2007; Niven 2007; Price & Phillimore 2007; Köhler *et al*. 2008).

In an important recent paper, Meiri *et al*. (2008) argued that the mammalian island rule is an artefact, caused by a small number of clade-specific tendencies to grow or shrink on island colonization which happen to be associated with small or large mainland body size (specifically a tendency for rodents to increase and artiodactyls to decrease on islands). It was further argued that many previous reports of the island rule were attributable to statistical artefacts, notably the effects of regressing a ratio on its denominator (Atchley *et al*. 1976; Lomolino 1985), and the inappropriate treatment of island colonizations as statistically independent events, thereby ignoring the shared evolutionary history of the colonizing species (Meiri *et al*. 2008).

To clarify these methodological artefacts, it was shown above that different tests of the island rule can also be considered as tests of quite different null hypotheses (table 1) that can yield false support for the rule when the ‘wrong’ null hypothesis holds. Simulating data under the different nulls has shown that the biases recognized by Meiri *et al*. (2008) can be severe (table 2), but also that their method of independent contrasts can give high rates of false-negative error under certain conditions (table 3), specifically when a pronounced centralizing trend in insular evolution exists, but has not evolved over the phylogeny by a Brownian motion-type process. More importantly, we have confirmed the findings of Bromham & Cardillo (2007) that the island rule does hold in at least one mammalian order. Patterns in the skull length and body mass of primates are consistent with the island rule, and cannot be attributed to any of the biases noted above, although results with head–body length may be artefactual (figure 2).

These primate results do not contradict Meiri *et al*.'s (2008) central conclusion that the island rule does not apply consistently across all mammals (see also Lomolino 2005; Bromham & Cardillo 2007), and no comparative method performs well if the effect of interest varies across the group (even independent contrasts can mislead if the correlation between the traits under study has changed over the tree). For this reason, taxonomically restricted studies are likely to be the most useful for addressing the causal questions surrounding the island rule, concerning predation pressure, resource usage and within- and between-species competition (MacArthur & Wilson 1963; Foster 1964; Van Valen 1973; Heaney 1978; Smith 1992; Bromham & Cardillo 2007; Meiri *et al*. 2008). But while taxonomically limited studies are most likely to show a consistent effect—or consistent absence of an effect—it follows from the results above that they are the most susceptible to another bias: the artefactual support for the island rule that can arise when insular evolution shows no special features, but mainland taxa have evolved away from the common ancestral state of the island–mainland pair (table 2; online appendix 2 in the electronic supplementary material; see also Price & Phillimore 2007). This can be seen in the effects of varying *t*_{pair}/*t*_{root} (table 2), a quantity likely to be negligible for large taxonomic groups, such as the class Mammalia, but large enough to cause serious biases for smaller taxa. A Monte Carlo permutation test has been introduced that avoids this bias, and other problems afflicting parametric tests (table 2; online appendix 3 in the electronic supplementary material; Martin *et al*. 2005). It should therefore prove a useful test of null hypothesis B (table 1).

A final interesting aspect of the primate results is the performance of the different measures of body size (figure 2). While head–body length yields no evidence of the island rule, skull length yields significant support and body mass highly significant support (despite having the smallest sample size, and presumably the greatest lability within an animal's adult lifetime). This aspect of the results strengthens the suggestion of Lomolino (2005) that proxies for body mass might not reliably manifest the island rule, even when it does operate in the group under study (see also Boback & Guyer 2003; Meiri & Dayan 2003; Meiri *et al*. 2004, 2006). However, it is also notable that the primate body mass measurements were based on more specimens than were the head–body or skull lengths (Bromham & Cardillo 2007).

## Acknowledgments

Andrew Rambaut, Shai Meiri, Marcel Cardillo, Craig McClain and Michel Laurin provided very helpful comments on an early version of this paper. Many thanks also to Lindell Bromham, Jess Thomas, David Waxman, Lucy Weinert and reviewers. This work was supported by BBSRC grant DO17750 awarded to Andrew Rambaut.

## Footnotes

- Received August 21, 2008.
- Accepted October 6, 2008.

- © 2008 The Royal Society