Genetic markers are often used to examine population history. There is considerable debate about the behaviour of molecular clock rates around the population-species transition. Nevertheless, appropriate calibration is critical to any inference regarding the absolute timing and scale of demographic changes. Here, we use a mitochondrial cytochrome b gene genealogy, based entirely on modern sequences and calibrated from recent geophysical events, to date the post-glacial expansion of the Eurasian field vole (Microtus agrestis), a widespread temperate mammal species. The phylogeographic structure reflects the subsequent expansion of populations that went through bottlenecks at the time of the Younger Dryas (ca 12 000 years BP) rather than the Last Glacial Maximum (LGM, ca 24 000 years BP), which is usually seen as the time when present-day patterns were determined. The nucleotide substitution rate that was estimated here, ca 4 × 10−7 substitutions/site/year, remains extremely high throughout the relevant time frame. Calibration with similarly high population-based substitution rates, rather than long-term rates derived from species divergence times, will show that post-LGM climatic events generated current phylogeographic structure in many other organisms from temperate latitudes.
Phylogeography uses the spatial distribution of DNA sequence variation to investigate the recent evolutionary history of organisms , commonly by means of coalescent genealogy sampling . Using this approach, a decay in nucleotide substitution rates with time has been identified across the population-species transition [3,4]. However, both the magnitude and the timing of any decay are hotly contested [5–7], with particular focus on errors that may arise from the use of ancient DNA sequences to infer molecular rates [8–10]. Nevertheless, resolution of the debate becomes critical when applying a phylogeographic approach to reconstruct the post-glacial reoccupation of temperate and other high latitude areas [11,12], because the species-population transition coincides with the relevant time period. Indeed, it has been argued that inappropriate calibration may have misled substantial numbers of phylogeographic studies .
In this paper, we believe that we have been able to estimate an appropriate substitution rate directly from such a post-glacial population expansion, using accurately dated geophysical events to calibrate a mitochondrial gene genealogy from the Eurasian field vole Microtus agrestis. Molecular variation in the cytochrome b gene has previously been sampled across the whole range of this temperate mammal [14–16], but we have added critical new sequence data that permit us to reconstruct population history by means of Bayesian genealogy sampling. The demographic change and the nucleotide substitution rate that we obtain are entirely conditioned on modern sequence data. While there is inevitably a degree of uncertainty about the adequacy of sampling, the sequence data are not subject to any errors that may be attributed to the use of ancient DNA [8,9] and they do not rely on the potentially inappropriate extrapolation of rates from other, non-contemporaneous, divergence events . Our expectation was that the current phylogeographic structure in the field vole would reflect population expansion from glacial refugia when the climate warmed following the Last Glacial Maximum (LGM), ca 24 000 years BP , as is usually assumed [14,15]. However, our data indicate that the demographic expansion of the field vole was more recent than that, while the nucleotide substitution rate is much higher than rates that are typically used to calibrate such genealogies . If our new calibrations for the field vole are valid, this supports the need to reconsider phylogeographic data on post-glacial population expansion of other species as well.
2. Material and methods
(a) DNA amplification
Genomic DNA was extracted from frozen or ethanol-preserved tissue samples of 198 field voles from the British Isles and adjacent parts of northwest Europe. The entire 1143 base-pair cytochrome b gene was amplified in a single polymerase chain reaction and sequenced using standard protocols and published primers . The cytochrome b sequences were first used to separate the field vole material from a large set of Microtus samples collected across western Europe from southern Scandinavia to Iberia. These new field vole sequences were analysed together with 107 published sequences [14,16] from the Palaearctic range of the species. Sampling was heavily concentrated in the British Isles, which introduced a potential bias in the demographic analyses but nevertheless offered an important opportunity for their calibration (see below). All sequences reported in this paper have been deposited in the GenBank database (GU563195–GU563299) and voucher specimens for new material in the National Museums Scotland or the J.B. Searle tissue collection, Cornell University (see the electronic supplementary material, table S1).
(b) Phylogenetic analyses
Phylogenetic trees were inferred from the 204 distinct haplotypes and the phylogeographic structure was compared with that which was identified previously [14,16]. The trees were obtained using the neighbour-joining method implemented in MEGA v. 4.0.1 , the maximum-likelihood approach in PHYML v. 3.0  and the Bayesian phylogenetic inference of MrBayes v. 3.2 . A range of generalized time-reversible (GTR)-nested nucleotide substitution models were used in these initial phylogenetic analyses, to ensure that the results were robust to the choice of model. In addition, trees were generated from all 305 sequences with Bayesian genealogy sampling in BEAST v. 1.5.2 , as described below.
(c) Population genetic statistics
Nucleotide diversity (π), the average number of nucleotide differences between sequence pairs within a population, was calculated using MEGA v. 4.0.1  and its standard deviation was estimated from 1000 bootstrap pseudo-replicates. The calculation used the K2P nucleotide substitution model  and gamma distribution of rates across sites, with an alpha shape parameter of 0.123 estimated using ML in PHYML v. 3.0 .
Two neutrality test statistics, Tajima's D  and Fu's FS , were used to detect recent population expansion, which results in an excess of singleton mutations in external branches of the phylogeny. The significance of any departure from neutrality was determined by comparing the value of the test statistic with an empirical distribution obtained by randomly placing the observed number of mutations onto 10 000 coalescent simulations of the genealogy. These procedures were carried out using DnaSP v. 4.50 .
(d) Coalescent simulations
Times to most recent common ancestor (tMRCA) were estimated using Bayesian genealogy sampling in BEAST v. 1.5.2 . Posterior distributions were obtained from four independent Monte Carlo Markov chain (MCMC) simulations, each run for 100 million generations, by which time all chains converged and effective sample size for each parameter was more than 450. The first 10 million generations from each chain were discarded as burnin, to give a total of 360 million sampled genealogies. A skyline demographic model was used, in order to investigate the effective female population size over time, given that the field vole must have colonized most of its present range since the last glaciation and in light of the population genetic statistics calculated here. The simulations were repeated with simpler logistic and expansion growth models and Bayes factors, a measure of preference for one model over another, were calculated as the ratios of the marginal likelihoods obtained with each model. The harmonic means of the natural log likelihoods were taken as estimators of the marginal likelihoods and the resulting Bayes factors for the model comparisons, expressed as natural logarithms, were interpreted according to widely accepted criteria . These demographic analyses were carried out with a 2-partition (first and second codon position linked; third separate) HKY + Γ substitution model, as recommended for protein-coding nucleotide data . A strict molecular clock was calibrated using prior distributions on the tMRCA of the clade from northern Britain and the root height. The tMRCA of the north Britain clade was given a normal distribution truncated at lower and upper limits of 8400 and 14 685 years BP, respectively. The root height was given a gamma distribution that peaked around 145 000 years BP, which we considered the most plausible date within the 95% confidence limits of a previously estimated time of divergence for the whole species , given that it coincides with the end of the penultimate (Saalian) glacial period. The minimum of the root height distribution was 14 685 years BP, the end of the most recent (Weichselian) glaciation, and there was no maximum constraint on its long upper tail.
Additional separate simulations were carried out with only the continental Eurasian sequences (n = 78) and those from the British Isles (n = 227), to examine the contribution of each to the demographic changes, given the potential bias from the sampling regime. Bayesian skyline plots were again obtained from four independent MCMC simulations with the model and other conditions described above. In these two analyses, effective sample sizes for each parameter were at least 7288 and 902, respectively.
The MCMC simulations were repeated without sequence data, to test their influence on the posterior distributions. All dates are in years before present (years BP), with radiocarbon dates converted using the IntCal04 terrestrial radiocarbon age calibration .
3. Results and discussion
(a) Phylogeographic structure
All of the phylogenetic analyses recovered six well-defined clades, but without any convincing higher level structure (figure 1). A range of GTR-nested substitution models were used in these initial phylogenetic analyses, with no effect on the recovery of the six clades. The choice of the substitution model did however influence the inferred relationships between the clades, and there was no consistent support for any deeper branch. The six clades were recovered in the earlier field vole cytochrome b phylogeny  but only one of them (eastern) was identified at the time. The lineages have coherent geographical distributions with limited overlap (figure 2) and, given that they were specifically identified from a much larger sample of Microtus collected from Denmark to Iberia (§2), these data serve to define the known range of M. agrestis in western Europe. Closely related but distinctive cytochrome b sequences obtained to the south and west of the area occupied by the M. agrestis clades all belonged to the cryptic sibling species that has previously been proposed, but not formally associated with any of the available names [15,30].
The nucleotide diversities for the six M. agrestis clades were remarkably similar, although the eastern clade appeared to be more variable, as might be expected given its vast range across much of eastern Europe and north-western Asia. However, there was no significant difference between any of the values after taking account of their associated standard errors (table 1). In addition to the limited effect of geographical range on the nucleotide diversities, it is notable that the sample size has not affected the measured variability.
The whole field vole population showed strong evidence of recent expansion, based on the values of both Tajima's D statistic and the more sensitive Fu's FS statistic (table 1), as expected given that the species must have occupied most of its range in northern Eurasia following the last glaciation. There is clear evidence of expansion in the eastern, western and north Britain clades, based on both test statistics, but only the more sensitive Fu's FS indicates expansion of the central clade. The weaker signal of recent expansion may reflect the survival of the original population in the southern part of the range of this clade. Only the clades from western France and Scandinavia do not show any significant sign of expansion; however, this may be attributed to the very small sample sizes.
(b) Population history
One lineage is entirely confined to the northern part of the British mainland and the adjacent smaller islands. The restricted distribution of this clade provides an opportunity to calibrate the genealogy, based on its possible time of origin, which is constrained by two geophysical and climatic events. Firstly, the clade must have originated after 14 685 years BP, the time of rapid warming at the end of the LGM , as a temperate species would not survive the periglacial conditions that prevailed in ice-free parts of north-western Europe before that time . Secondly, the clade must have originated before the separation of Britain from mainland Europe owing to the Holocene rise in sea level, around 8400 years BP , otherwise, it would form a subclade of a larger British clade. To refine the calibration, we constrained the root of the tree, representing the common ancestor of the whole extant field vole population, to pre-date 14 685 years BP, on the grounds that the species would not be subject to such an extreme population bottleneck after the LGM.
The calibration is entirely dependent on the assumptions that we have outlined above, and it is particularly important that we have accurately placed the genealogy in the correct geographical context. Four of the six clades have distributions that range northwards from potential refugia in lower latitudes, while two clades are confined to higher latitudes in northern Britain and Scandinavia. As with any phylogeographic analysis, we cannot be certain that we have identified a clade throughout the entirety of its current range, so it is possible that these two clades have disjunct distributions in mainland Europe. This would allow an earlier origin for the clade from north Britain, antedating the constraint of 14 685 years BP that we have applied here. Nevertheless, we have sampled Microtus across the whole western European range of the field vole and its cryptic sibling species and have found no evidence for the presence of the clade from northern Britain in more southerly locations where the species might conceivably have survived the LGM. In addition, there is always an element of uncertainty in relating current sampled distributions to historical distributions and this cannot be addressed without recourse to heterochronous material. It is therefore important to bear in mind that, as the calibration relies on the phylogeographic reconstruction here, the demographic inference will be subject to any uncertainty in this.
We used Bayesian genealogy sampling to estimate the posterior distribution of population parameters, with the prior constraints outlined above. Expressed as natural logarithms, the estimated marginal likelihoods were −4957.864 (s.e. 0.101) for the Bayesian skyline model, −4958.414 (s.e. 0.088) for the logistic growth model and −4959.764 (s.e. 0.093) for the expansion growth model. According to the accepted criteria , the skyline model was strongly preferred over the expansion growth model (ln Bayes factor 1.900), but there was little, if any, preference for the skyline model when compared with the logistic growth model (ln Bayes factor 0.550). The skyline demographic model equalled or bettered the simpler models and is much more informative, so we have used the resulting posterior distributions here.
We found that the whole of the current population of the species had a single mitochondrial ancestor around 23 856 years BP (table 2). This coincides with the coldest period of the LGM . The six regional lineages originated around 11 443–12 600 years BP (figure 1 and table 2), contemporaneous with the Younger Dryas glacial re-advance . The Bayesian skyline plot, effective female population size through time, shows no sign of growth between the origin of the whole population and that of the six regional clades at the Younger Dryas, despite the increased temperature of the intervening Bølling–Allerød interstadial (figure 3). Following the Younger Dryas, there was a rapid rise in population size, coinciding with the sudden rise in temperature at the onset of the Holocene .
The simple explanation for this pattern is that the field vole population became bottlenecked at the time of the Younger Dryas glacial re-advance, so that the signal of earlier population expansion was lost, while the subsequent demographic expansion originated from six ancestral populations, in separate geographical locations at the time of the bottlenecking. It is difficult to see how the six clade populations could have arisen in different parts of Europe, without this earlier expansion out of the single LGM refugium. In any case, it is unlikely that the field vole population failed to respond to the rising temperature of the Bølling–Allerød interstadial (figure 3).
As indicated earlier, the relationship of population size with time is entirely dependent on the assumptions of the calibration, particularly regarding the distribution of the clade that has only been found in northern Britain. The 95 per cent highest posterior density margins for the parameter are also wide, as is usual for this type of analysis, which must sacrifice precision for accuracy of inference . Nevertheless, it is difficult to rationalize the relationship between temperature and population size using alternative values that are far from the median curve (figure 3). The same applies to more specific indicators of the changing climate, with permafrost conditions present in much of Europe during the LGM and returning to unglaciated northern parts of the continent at the time of the Younger Dryas [32,35]. In particular, there is no sign of the contraction in population size that would surely be expected during the Younger Dryas, if the six clades originated from bottlenecks at the time of the LGM. It is even less likely that the clades are of earlier origin and the population size did not respond to more than one cycle of climatic change.
There is one apparent conflict between the result here and the fossil record, with fossils from the southern Ural Mountains attributed to this species and dated to the LGM . However, only recent dated records of this species are reliable, owing to changes in understanding of the limits to morphological variation in the genus Microtus and the chaotic state of Late Glacial radiocarbon chronology . Furthermore, the presence of fossil material does not indicate that the animals contributed genetically to later populations. Indeed, it has been suggested that the differential extinction of such local populations could amount to a significant evolutionary phenomenon . It is perfectly feasible that field voles were present in the southern Urals at the time of the LGM, but made no contribution to the subsequent population of the species. Similarly, biogeographic modelling has been used to infer that the field vole might survive the LGM in multiple locations in Eurasia . Such modelling is at an early stage, but even if correct, the presence of suitable climate in these locations does not demonstrate that animals actually survived there, nor that they contributed genetically to future generations.
In addition to the demographic expansion at the beginning of the Holocene, the population increased again from around 3000 years BP, this time without any associated change in temperature (figure 3). As well as demonstrating that the expansion following the Younger Dryas was initially more rapid and more pronounced in continental Eurasia than in Britain, the separate simulations with the continental and British samples indicate that this recent expansion was entirely confined to the British Isles (see the electronic supplementary material, figure S1). This is reflected in the neutrality test statistics (table 1), which show a stronger signal of recent expansion in the two clades that include sequences from Britain. Although the recent rise may be entirely attributable to populations from Britain, this is evidently not the case for the earlier expansion following the Younger Dryas. While the timing of the earlier expansion coincides with rising temperatures across the Northern Hemisphere, the recent expansion is more readily associated with the loss of woodland that took place from the Neolithic period onwards. This would appear to favour a grassland species like the field vole and, in the absence of any competition from other Microtus, it is now the most common wild mammal to be found in mainland Britain .
(c) Molecular clock rates and phylogeography
The nucleotide substitution rate that we estimated here is very high, 3.887 × 10−7 substitutions/site/year (95% highest posterior density interval 2.954–4.885 × 10−7 substitutions/site/year), similar to short-term mitochondrial mutation rates derived from human pedigrees . It is almost four times an intrageneric substitution rate derived from two species of Microtus  but this rate was calibrated from the divergence of precursor lineages 1.5 Ma. It is around 20 times faster than a frequently quoted rodent mtDNA rate ; however, this is based on an even deeper divergence between species from different genera. While the overall pattern and timing of this decay from short-term mutation rates to long-term substitution rates may be uncertain, our analysis indicates that the field vole mtDNA rate has not yet declined substantially within 25 000 years.
The intraspecific substitution rate here was inferred from the data themselves, using an opportunity to calibrate with external events and avoiding the need to impose an arbitrary choice of rate from a different genetic divergence. As with other parameters that were estimated here, the substitution rate is dependent on the accuracy of the calibration and subject to the uncertainty associated with the phylogeographic reconstruction that underpins the work. It leads to the association of genetic changes with two recent climatic events, the Younger Dryas and the LGM, which seems more plausible than alternative scenarios that imply the retention of genetic structure through more than one glacial cycle. It is also in accordance with the finding that there is no phylogeographic signal in fossil mammals that pre-date the LGM .
It is surprising that the range-wide genetic structure of the field vole, a widespread and cold-tolerant species, would be shaped by such a recent climatic event as the Younger Dryas. The response of individual species to climatic change will depend largely on their own adaptations and environmental tolerance . Nevertheless, there is a strong tendency to assume that any subdivision into multiple geographically widespread lineages, similar to that seen in the field vole (figure 2), reflects the origin of those lineages in multiple refugia at the LGM or earlier climatic minima. This perception has typically been supported by substitution rates estimated from divergences between species and genera; however, calibration with inappropriate external substitution rates may constitute an important source of error in phylogeographic studies . Based on our studies of the field vole, such ‘support’ may well be erroneous and the Younger Dryas may be more important than the LGM, or earlier events, in shaping the current phylogeographic structure of temperate and other high-latitude species. Our findings suggest that there may be a need for revision of the post-glacial colonization history of organisms in Europe and elsewhere, using phylogeographic analysis based on more appropriate substitution rates.
Many thanks to the individuals and institutions who kindly provided material for this study. J.S.H. thanks National Museums Scotland for financial support and İslam Gündüz for training in molecular laboratory techniques. We would also like to thank Russell Gray and two anonymous referees for helpful contributions that significantly improved the paper.
- Received February 11, 2011.
- Accepted March 31, 2011.
- This Journal is © 2011 The Royal Society