## 1. Introduction

In a previous study, Weir & Schluter [1] (hereafter WS2007) argue that smaller average genetic distances between species in cooler latitudes represent faster rates of speciation relative to tropical species. Such an interpretation assumes clades evolve at constant rates across latitudes so that a hypothetical molecular clock ticks at the same rate, regardless of climate. By contrast, our research shows that their assumed relationship between genetic distance and time is confounded by climate [2]. A number of other studies have also found microevolutionary rate variation to be mediated by differences in population size, elevation, latitude, water availability and ocean depth [3–7]. In a later investigation, Weir and Schluter themselves found a greater than fourfold variation in the avian molecular clock that was not readily explained by stochastic variation in substitution rates [8]. While Weir & Schluter [9] (hereafter WS2010) agree that warm latitude species are evolving faster, they claim that the size of this difference is insufficient to confound their earlier results. However, several techniques employed in their reanalysis are unsuitable to the questions being considered and their analyses lack statistical validity. Moreover, their phylogenetic approach is found to be less suitable than our original approach in 95 per cent of cases, as evaluated by the Akaike Information Criterion (AIC).

## 2. Parametrization

WS2010 claim that we have used an overparametrized model of molecular evolution (GTR + I + *Γ*, with parameters estimated for each four-sequence alignment) and that parameters are better estimated jointly from a single alignment of the 521 cytochrome b sequences, spanning several orders of mammals. They refer to these approaches as separate-parameter (ours) and joint-parameter (theirs) models, respectively. WS2010 argue that parameters are unlikely to be stable under our separate-parameters approach, which they claim may stretch long branches. However, they do not consider what effects their alternative joint-parameters model might have on parameter accuracy or on the resulting branch-length ratios. Here, we examine the evidence for overparametrization within our data, the effect such putative overparametrization could have on branch-length ratios, and the broader suitability of the separate- and joint-parameters models by comparing their AIC scores.

WS2010 make the principal claim that our separate-parameters model inflates branch-length ratios by disproportionately stretching long branches when genetic distances are large. While, theoretically, this seems plausible, there is no evidence that this has occurred. If this claim was correct, we would expect absolute values of our branch-length ratios to increase with genetic distance. By contrast, there is a negative relationship between genetic distances and branch-length ratios produced using our model (rank correlation: *n* = 129, *Z* = −2.5, *r* = −0.22, *p* = 0.01). Indeed, since overparametrized models introduce error stochastically, branch lengths of both low- and high-latitude species should be expected on average to increase to a similar degree [10]. Thus, any general increase in branch lengths would tend to reduce warm/cool species branch-length ratios—the metric we reported—not inflate them.

Under their joint-parameters model, phylogenetic model parameters were derived from a 500+ species megaphylogeny. However, this approach does not accommodate any clade-to-clade variation that may exist. Conversely, if our separate-parameters approach were overparametrized, the additional parametrization should increase likelihood tree scores relative to their joint-parameters model, so that AIC scores—that penalize additional parametrization—would favour the joint parameters model. We, therefore, compared AIC scores calculated from PAUP* v. 4.0 b likelihood scores for the separate- and joint-parameters trees. The number of free parameters, *k*, was set as 15 under the separate-parameters model (model parameters and branch-length parameters), and as five under the joint-parameters model (branch-length parameters only; see online electronic supplementary material). In fact, AIC scores were better (lower) under our separate-parameters model in 123 of the 129 cases. Thus, the sum of AIC for our separate-parameters models (mean = 5191) was lower than the sum of AIC for their joint-parameters models (mean = 5412). We, therefore, conclude that our separate-parameters model is more appropriate than the joint-parameters model in this instance and we find no evidence of its overparametrization.

Having established the importance of parameter heterogeneity in these data, we further sought to investigate whether other potential overparametrization arising from our model selection may have affected our analysis. We, therefore, used ModelTest (implemented in Geneious v. 5.0.4, [11]), to determine optimally parametrized models for each of the 129 alignments, under AIC. The GTR substitution model that we applied in our original dataset was selected in 68.2 per cent of cases, with most models having either a gamma-distributed rates across sites (52.3%) or a proportion of invariable sites estimated (35.9%), or both (7.8%). We note that branch-length ratios produced by our separate-parameters model were more concordant with branch lengths derived from these AIC-selected models than were those produced by the joint-parameters model of WS2010 (figure 1). However, as AIC favoured a simpler model than GTR for 31.8 per cent of alignments, we also sought to determine the magnitude of any bias that might occur in such cases. We, therefore, generated simulated sequence data using Seq-Gen [12] under the relatively simple HKY85 model, with a range of tree lengths and transition/transversion ratios. Trees were then estimated in PAUP* [13] using both HKY85 and GTR + G models of nucleotide substitution. GTR + G overparametrization was found to have had minimal effect on branch-length ratios, overestimating the ratio in 60.5 per cent of cases (i.e. 10.5% more frequently than expected by chance), but only by an average of 0.9 per cent relative to the estimates from the HKY85 model.

We, therefore, conclude that both parameter heterogeneity (to accommodate genuine, clade-to-clade variation) and model complexity are required for adequate phylogenetic reconstruction in these data.

## 3. Testing for evolution rate heterogeneity

WS2010 have undertaken major axis regression analysis to estimate the relationship between warm and cool species branch lengths as an alternative to our measure of central tendency of branch-length ratios. That regression analysis is not appropriate for several reasons. First, the data are non-normal, violating the requirement for bivariate normality under major axis regression. WS2010 applied an arbitrary correction for normality by dividing branch lengths by the square root of half the sum of the cool and warm branches. Although they state that this correction accommodates the increased variance in longer branches, it only does so by inflating the variance in smaller branches, as numbers smaller than one are increased by such a correction (figure 2). Thus, while their correction nominally resolves the issue of data normality for the joint-parameters method, it does so by compromising data quality. Furthermore, this correction does not resolve data normality for the separate-parameters model.

As branch-length non-normality cannot be consistently corrected for in this case, a non-parametric measure of central tendency, such as the median, better represents the relationship between branch lengths of warm and cool species. Analysing median branch-length ratios gives us warm : cool species ratios that range from 1.24 (their joint-parameters model) to 1.29 (AIC-selected models) with our original analysis in between at 1.28 (separate-parameters model), indicating the relative robustness of this measure. Interestingly, major axis regression using our separate-parameters branch lengths also produces a similar slope of 1.27. All of these approaches are at odds with the smaller ratio of 1.17 derived from the major axis regression slope produced under the joint-parameters model of WS2010. We propose that, given the superior AIC fit for our separate-parameters model, it is reasonable to accept the median ratio of 1.28 produced by this approach as representative of the true relationship.

## 4. Distributional overlap of species

Our experimental design included a maximum latitudinal overlap of 25 per cent, which WS2010 claim was not consistently applied. However, we took the conservative approach of including pairs where there was uncertainty or conflicting information on latitudinal overlap. Inevitably, different distribution maps will vary in their data. Nonetheless, this is of little relevance as WS2010 show that with or without the pairs they question, the results remain qualitatively unchanged.

## 5. Latitudinal regression analysis

WS2010 use ordinary least squares (OLS) regression to test for a relationship between genetic branch-lengths and latitudinal mid-point separations for the sister species from our dataset. However, our data do not meet the assumptions of regression analysis as residuals are non-normal. More generally, such an analysis assumes that all latitudinal differences have applied consistently over evolutionary time. By contrast, many of the larger latitudinal differences in the dataset are likely to have arisen during the Holocene and will, therefore, not be reflected in cytochrome b evolution because its rate of change is on a very different timescale. While this point raised by WS2010 is intriguing and certainly warrants further investigation, there is little power in their analysis to extract from our data the magnitude of any climatic influence on microevolution via OLS regression.

## 6. Wider context

WS2007 report faster rates of per-lineage origination at higher latitudes, reaching a maximum near the limits of life at the poles. However, that study did not measure origination rates; it instead measured genetic divergence between species pairs that were centred at a range of different latitudes. Under the assumption of a global uniform molecular clock, WS2007 use genetic distance between species to indicate time since divergence. We [2] found that this central assumption was not supported since our data demonstrate that low-latitude species have faster rates of genetic divergence. WS2007 then inferred speciation and extinction rates from birth–death models based on genetic distances between species pairs across a range of latitudes. However, as recently demonstrated by Rabosky [14,15], estimates of extinction rates from phylogenies of extant taxa are sensitive to model assumptions and prone to error. As Rabosky argues, meaningful estimates from such techniques may be impossible.

The interpretation by WS2007 is also at odds with what is known from the fossil record. Fossil studies that have been able to distinguish origination from extinction and emigration demonstrate greater origination rates and younger taxa, where species richness is highest (e.g. [3,16,17]). In light of the fossil record and the uncertainty surrounding phylogenetic estimates of origination rates, if one assumes rates of origination do not favour cold climates, then the greater genetic divergence between tropical species reported by WS2007 is most parsimoniously interpreted as arising from faster rates of tropical evolution. This reasoning is both consistent with our findings and with the fossil record. Therefore, our findings have cautionary implications when applying a homogenizing molecular clock across clades that are in fact differentiated by variation in both ecological and climatic circumstances.

## Footnotes

The accompanying comment can be viewed at http://dx.doi.org/10.1098/rspb.2010.0388.

- Received December 6, 2010.
- Accepted January 14, 2011.

- This Journal is © 2011 The Royal Society