Ice-age survival of Atlantic cod: agreement between palaeoecology models and genetics

Grant R Bigg, Clifford W Cunningham, Geir Ottersen, Grant H Pogson, Martin R Wadley, Phillip Williamson


Scant scientific attention has been given to the abundance and distribution of marine biota in the face of the lower sea level, and steeper latitudinal gradient in climate, during the ice-age conditions that have dominated the past million years. Here we examine the glacial persistence of Atlantic cod (Gadus morhua) populations using two ecological-niche-models (ENM) and the first broad synthesis of multi-locus gene sequence data for this species. One ENM uses a maximum entropy approach (Maxent); the other is a new ENM for Atlantic cod, using ecophysiological parameters based on observed reproductive events rather than adult distribution. Both the ENMs were tested for present-day conditions, then used to hindcast ranges at the last glacial maximum (LGM) ca 21 kyr ago, employing climate model data. Although the LGM range of Atlantic cod was much smaller, and fragmented, both the ENMs agreed that populations should have been able to persist in suitable habitat on both sides of the Atlantic. The genetic results showed a degree of trans-Atlantic divergence consistent with genealogically continuous populations on both sides of the North Atlantic since long before the LGM, confirming the ENM results. In contrast, both the ENMs and the genetic data suggest that the Greenland G. morhua population post-dates the LGM.


1. Introduction

To predict how species' ranges may change under global warming scenarios, we must understand the factors that limit their distributions today and in the past. Terrestrial ecologists have correlated information on species occurrence with annual rainfall and temperature: remarkably accurate models of extant ranges can then be developed (Hugall et al. 2002; Iguchi et al. 2004). In addition, palaeoclimate records can be used to estimate how terrestrial species' ranges might have contracted or expanded during past climatic cycles (Watson & Peterson 1999; Hugall et al. 2002). Such ecological-niche-modelling has grown in importance as predictions are made concerning plant and animal ranges in relation to future climate change and extinction risks (Thomas et al. 2004; Thuiller et al. 2005).

This study of Atlantic cod addresses major challenges that complicate extending ecological-niche-modelling to the three-dimensional environment of marine fishes. Both Atlantic cod Gadus morhua and Pacific cod Gadus macrocephalus are descended from an Atlantic lineage that invaded the Pacific at least 3.5 Myr ago, when the Bering Strait was open and the Arctic was ice free (Raymo et al. 1990). The Atlantic cod lineage invaded the Pacific for a second time ca 2 Myr ago to found the morphologically distinct Alaska pollock Theragra chalcogramma (Carr et al. 1999; Coulson et al. 2006). During the last inter-glacial period ca 100 kyr ago, the Pacific cod lineage re-invaded the Atlantic to found the Greenland cod Gadus ogac (Coulson et al. 2006), whose ecophysiological requirements allow it to extend farther into the Arctic than either the Atlantic or the Pacific Cod.

The last glacial maximum (LGM) ca 21 kyr ago was one of the coldest events of the 45–50 major climatic oscillations in the Late Quaternary (Wolff 2005). Distributional changes for planktonic organisms, particularly foraminifera, have provided the basis for most reconstructions of marine conditions during the LGM (e.g. CLIMAP 1976). These reconstructions have since been refined, using additional proxies and palaeoclimate models (e.g. Mix et al. 2001; Meland et al. 2005). With few exceptions (Bucklin & Wiebe 1998), such information has not been used to assess the past distributions of larger marine fauna and flora; nevertheless, for species occupying polar or sub-polar shallow water habitats for all or part of their life cycle, glacial periods were radically different from the present day, and relatively unfavourable. Thus, the LGM saw the loss of most shelf sea habitats as global sea level fell by 120–135 m (Mitrovica 2003), and ice sheets advanced onto the continental shelf in polar waters (Peltier 1994). This habitat loss was particularly acute along the western and northern shores of the North Atlantic, where the continental shelf is narrow. In addition, the North Atlantic, and particularly the eastern coastal region of North America, was subject to repeated invasion by large increases in iceberg numbers (Bond et al. 1992), causing abrupt climate change and extensive shelf floor scouring (Piper 2005).

The study of climatically driven range changes for marine species is greatly complicated by the dynamic, three-dimensional nature of their environment. Conditions can vary greatly over a few metres in water depth, and different life-cycle stages may occupy different habitats. Here we use two ecological-niche-models (ENMs) for hindcasting LGM ranges of G. morhua: one based on adult range and the other focusing on the optimum conditions for reproduction (primarily spawning that occurs in a subset of the adult range). Both the ranges are likely to have been significantly compromised by the harsh and variable environmental conditions described above in the northern Atlantic during the LGM, particularly along western and northern margins, with possible consequences for the genetic diversity of the species. A limitation of ENMs is the inevitable circularity introduced by using extant populations to model extant ranges. Extending the approach of Hugall et al. (2002), we use mitochondrial and nuclear DNA sequences to test the predicted patterns of genetic subdivision when our models are applied to temperature and depth conditions prevailing during the LGM. Thus, through the complementary analyses of ENMs and genetics we address the question of whether, and where, LGM environmental conditions were sufficiently unfavourable to force pre-glacial races of cod to extinction.

2. Material and methods

(a) Maximum entropy ENM

Most ENMs for fishes are based on observed locations of adults, with that information used to simulate adult occurrence by various algorithms. For example, the interactive aquamaps system in FishBase ( uses a c-squares distribution modelling approach (Rees 2003) while Iguchi et al. (2004) uses the genetic algorithm for rule-set prediction (GARP). Several parameters used in the aquamaps system are not easily available from proxies for palaeo-analyses, such as salinity and accurate measures of productivity. Here we use a relatively new ENM called Maxent (Phillips et al. 2006) to produce a map of the probability of cod's occurrence, using data on known occurrence and a set of environmental variables considered important for its habitat. The Maxent uses a maximum entropy principle (i.e. the distribution is as uniform as possible, given a set of constraints conveying our uncertain knowledge of the species); it has been tested against a wide range of other ENMs and has been found to be among the most successful (Elith et al. 2006; Hernandez et al. 2006). Our training dataset of occurrences combines FishBase records with the collection locations of Pogson et al. (1995) and Árnason et al. (2000). The test dataset used to check the performance of the ENM for the present day is from Jónsdóttir et al. (2003). The environmental variables used to constrain the habitat probability were ocean bathymetry, sea surface temperature and sea surface salinity. For present-day validation of Maxent, the relevant bathymetry was at 5′ resolution (ETOPO 1986), while the temperature and salinity were interpolated to this resolution from the 1° Levitus et al. (1994) and Levitus & Boyer (1994) fields, respectively, or taken from the ocean model (§2c) which does provide us with palaeosalinity as well as palaeotemperature data. We acknowledge that temperature effects on adult distributions may be indirect, e.g. operating through trophic interactions affecting juvenile survival (Hjermann et al. 2007).

(b) Ecophysiological constraints for spawning

Spawning in cod requires a more restrictive set of environmental conditions than adult existence (figure 1). We therefore constructed a simple set of ecophysiological constraints, based on conditions of observed spawning to enable climate models to predict potentially suitable breeding habitats. These constraints included: depth; month of spawning; and ambient temperature for egg development. Our focus on spawning identifies populations that should be self-sustaining, excluding ‘sink’ populations where adults are present but do not spawn. Suitable conditions for cod spawning were determined from observational syntheses published by ICES (International Council for the Exploration of the Sea; Brander 1994, 2005).

Figure 1

Present-day distribution of Atlantic cod stocks in relation to annual mean temperature. Spawning areas shown in black. Adapted with permission from Sundby (2000).

Figure 2 describes the envelope of criteria for depth of spawning (that occurs near the sea floor), and field temperatures in near-surface waters during egg development (egg buoyancy is increased post-fertilization). Frequency distributions in figure 2 were determined from ICES data (table 2.3 of Brander (1994)), allocating each stock equally to the x-axis categories covered by their range values. These ranges yielded an envelope of spawning depth from 0 to 400 m, and ambient temperature from 0 to 9°C (figure 2). Because late spring and early summer are key times for egg development, egg hatch and early juvenile survival, an additional constraint was that waters should be ice free by June, defined for modelling purposes as having June temperatures greater than 0°C in the top 20 m. A subsequent ICES synthesis (Brander 2005) included lower minimum (−1.5°C; East Greenland) and higher maximum spawning temperatures (up to 12.7°C, Georges Bank). To be conservative, and to match laboratory data for egg survival (Pepin et al. 1997), we used the narrower ranges from the 1994 synthesis.

Figure 2

Environmental factors defining suitable habitat for cod, based on present-day distributions (Brander 1994, 2005). (a) Water depth in which spawning occurs (data for 18 stocks). (b) Seasonality of spawning (n=23). (c) Ambient temperature for spawning and eggs (n=18). For each histogram, arrowed range shows boundaries used for modelling. Multi-stock laboratory egg survival data (Pepin et al. 1997) also given in upper part of (c), as a function of temperature. The curve is fitted with a third-order polynomial.

(c) Modelling ocean conditions in the present and at the last glacial maximum

To produce oceanic environmental parameters at the LGM, both present-day and LGM ocean simulations were carried out using an ocean general circulation model (Wadley et al. 2002) based on the Bryan–Semtner–Cox formulation. The present-day simulation was used to validate the use of the model hindcasts in the ENMs. The model has an orthogonal curvilinear grid with its north pole in Greenland. This gives relatively high resolution in the northern North Atlantic and Arctic regions, compared with the Southern Hemisphere and North Pacific. The model can be efficiently integrated by the use of a variable time-stepping scheme (Wadley & Bigg 1999), which allows the time-step length to be spatially varied with model resolution. Monthly mean surface forcing was used for momentum, heat and fresh water fluxes. For the LGM integration, present-day fluxes were adjusted by the addition of past (LGM minus present-day) fluxes from atmospheric model simulations (Dong & Valdes 1998), and ocean bathymetry took into account changes in absolute sea-level and ice-loading effects (Peltier 1994). Both present-day and LGM simulations were run for 4500 years and were stable for the last 2500 years of integration. Given the ocean model's vertical resolution constraints, the criteria for suitability for method (§2b) were that the seafloor depth is less than 500 m, water temperatures at 20–100 m depth are in the range 0–9°C sometime between February and June and the sea surface is ice free during June.

(d) Calibration point for rate of genetic divergence

A temporal component is necessary to test the hypothesis of population persistence through the last glaciation, requiring calibration of the rate of molecular evolution. Because cod have speciated twice since the opening of the Bering Strait ca 3.5 Myr ago, they are thought to have participated in the first large wave of invasion (Grant & Ståhl 1988; Carr et al. 1999; Pogson & Mesa 2004). Although one mollusc invaded the Pacific earlier (Marincovich & Gladenkov 1999), the majority of species with fossil records crossed the Bering Strait after 3.5 Myr ago (Vermeij 1991). For this study we used the 3.5 Myr ago date to calibrate rates of molecular divergence between Atlantic cod (G. morhua) and Pacific cod (G. macrocephalus). Using an earlier date for the calibration would have yielded a slower rate of molecular divergence, making our estimates of trans-Atlantic population divergence even older than the ages we present.

Maximum-likelihood trees for the three loci with DNA sequences were inferred using PAUP* v. 4.0b10 (Swofford 2002) under the HKY model of evolution and among site rate variation (allowing direct comparison to the age of divergence as simulated by models MDIV and IM, which only implement the HKY model; see §2g below). Parameters of the model (estimated nucleotide frequencies, proportion of invariant sites and the shape of the gamma distribution) were estimated using maximum likelihood.

(e) Genetic data from the literature and additional sequencing

The Bering Strait calibration point could only be applied to data that are directly comparable between Atlantic and Pacific cod. Such data are available for 10 allozyme loci (Grant & Ståhl 1988), mtDNA (compilation of many authors' cytochrome b (CytB), samples recently reviewed by Árnason (2004)) and two nuclear genes (pantophysin and S2 ribosomal protein (Pogson & Mesa 2004)). There were no comparable datasets including both Atlantic and Pacific cod for either nuclear RFLPs or microsatellites.

Sufficient genetic data were available in the literature and GenBank from populations across the North Atlantic with the exception of the nuclear S2 ribosomal protein gene, a section of which was PCR amplified using Primers F1 CAAGGAGTCAGAGATCATCGAC and R1 TCAAAAGCAATTAGGAGGTGGC (Pogson & Mesa 2004) resulting in 898 aligned base pairs. These new data represented 118 alleles for G. morhua that included 546 bp of coding sequence and two introns of 267 and 85 bp, respectively. Additional S2 sequences were obtained from GenBank for one G. morhua sequence (from Atlantic Canada) and one G. macrocephalus sequence (from Pacific Canada) from an earlier study of Gadid phylogeny (Pogson & Mesa 2004). These comparisons strongly reinforced RFLP studies showing that the S2 gene is not under positive selection in G. morhua (Pogson et al. 1995): not a single first or second codon position has fixed in the S2 gene since the separation of Atlantic and Pacific cod (Pogson & Mesa 2004). Since selection is not expected to directly affect rates of neutral substitution (Birky & Walsh 1988), we removed first and second positions from both nuclear genes (leaving 547 bp for S2, 1473 for Pan-A and 1467 bp for Pan-B) prior to performing the Bayesian analyses described below.

Because the S2 sequences were produced by direct sequencing, heterozygotes were detected as clear double peaks. Of the 118 newly sequenced alleles, only 16 contained sites that could not be unambiguously assigned to one or the other parental allele with more than 95% confidence using the Bayesian approach implemented in Phase v. 2.1.1 (Stephens & Donnelly 2000). The results shown here are derived from a data matrix including the set of the highest scoring haplotypes from Phase. An alternate matrix including only the second highest scoring haplotypes gave virtually identical results for the analyses presented below.

(f) Estimating percentage of private alleles, effective population size and growth

The percentage of private alleles was calculated by weighting each allele equally, regardless of the number of individuals sharing that allele. Because the total number of private alleles tends to increase with sampling effort, the percentage of private alleles was adjusted by bootstrapping 100 times for the CytB (which varied in sampling effort from 78 to 519 (table 1)). Gene diversity (same as haplotype diversity) was estimated using Arlequin v. 2.0 (Schneider et al. 1999). For each population, theta (a parameter representing the effective population size and substitution rate) and the exponential population growth parameter G were jointly estimated by maximum likelihood with the Fluctuate v. 1.4 program (Kuhner et al. 1998), as described by Wares & Cunningham (2001). Fluctuate was not applied to the pantophysin data because positive selection will mimic the demographic consequences of high population growth.

View this table:
Table 1

The percentage of private alleles and gene diversity in each cod population. (n.a.: growth estimates are not valid for loci under selection, such as Pan-A and Pan-B.)

(g) Estimates of migration and timing of population divergence

It is well established that high levels of migration can confound estimates of population ages by reducing the degree of genetic differentiation between populations (Slatkin 1993). DNA sequence data were analysed by two Bayesian methods that allow simultaneous estimation of migration (M) and age of population divergence, MDIV (Nielsen & Wakeley 2001) and IM (Hey & Nielsen 2004).

The estimates of population divergence between populations in MDIV depend strongly on age priors (IM results are much less dependent on priors). Hence all MDIV runs were standardized so that they had the same maximum prior of 200 kyr ago (covering about two glacial cycles). To test our palaeo-ENM hindcasts of cod distributions during the last glacial period, minimum ages of populations are of interest. As mentioned above, our calibration of the rate of substitution is a maximum estimate, so earlier opening of the Bering Strait would give only older numbers.

Because minimum ages are of interest, we present only the 95% CI for the minimum age of population divergence calculated as in an earlier study (Riginos et al. 2004). In subsequent runs, prior values of M were reduced if they greatly exceeded the curve of values. For MDIV, which outputs only columns of numbers, the modes for each parameter were found by fitting a five-order polynomial to the top one-third of the a posteriori values. MDIV and IM trials ran for at least 10 million chains, with multiple runs to ensure stability of results. The options for the IM runs were set as follows: qu 1 -q1 60 -m1 30 -m2 30 -t 2 -L 6.0 -n 1 -b 100000 -p 234567 -z1000, using the November 2005 version of IM (the July 2006 version could not handle our large sample sizes). CIs for the allozyme divergences were calculated from Contml (Felsenstein 2005).

3. Results

(a) Modelling present-day ocean conditions and cod distribution

Our modelled present-day ocean circulation (Wadley et al. 2002) is in good agreement with observed circulation, with approximately 16 Sv (1 Sv=106 m3 s−1) of deep water exported from the South Atlantic. Model-derived temperatures and salinities were interpolated onto a 1°×1° longitude–latitude grid, masked for the present day by a 5′×5′ depth and elevation database (ETOPO 1986). These fields match well with the ocean climatology of Levitus et al. (1994).

Environmental conditions from this climatology, as well as the present-day oceanographic simulation, were used in the Maxent ENM (figure 3a,c) and were also used to define two versions of the expected ranges of present-day suitable habitat for spawning (figure 3b,d). In all four cases, modelled ranges of suitable habitat were congruent with the distributions of more than 90% of the Atlantic cod stocks recognized by ICES (Brander 1994, 2005; also see figure 1); however, the two Maxent ENM realizations extend probable near-coastal ranges much further south than those observed, on both sides of the Atlantic. The principal environmental predictor for Maxent is bathymetry, followed by sea surface temperature; salinity has only a minor influence on the probability of G. morhua occurrence.

Figure 3

Regions of the North Atlantic considered suitable for cod spawning and egg development under present-day conditions. (a) Maxent probability distributions using climatological environmental conditions and (c) Maxent distributions using ocean model-derived environmental conditions. In (a,c) the white dots show the training data and the violet dots the test data points; red areas show strong predictions, with more than a 50% chance of finding cod. (b) Range (blue) consistent with ecophysiological parameters using climatological environmental conditions and (d) ecophysiological-based range using ocean model-derived environmental conditions. In (c,d), areas of mismatch with present-day distribution are circled. Note that the climatological and model data have been interpolated to the higher resolution of the bathymetric data.

For both the ENMs, the use of climatological predictors (figure 3a,b) suggests a more northerly present-day range than using the ocean model environmental variables (figure 3c,d). The discrepancy is the greatest for the NW Atlantic, but less for the Barents Sea. A probable explanation is that spring climatological data are poorly observed in these regions, owing to widespread sea ice, leading to loose constraints except for the bathymetric variable. For the ecophysiological predictor using ocean model data, there are two (relatively minor) southerly false negatives, in the English Channel and Georges Bank off Massachusetts (figure 3d), primarily caused by lack of horizontal resolution of the ocean model providing basic environmental data for the ENM. There were also some false positives, again driven by ocean model resolution limitations, in regions off Baffin Island and Greenland where the species is currently absent. Note the difference between climatological and model variable predictions in the northern Barents Sea, where both Maxent and our ecophysiological parameters show blanks when using model data. Although adult cod are abundant in the Barents Sea as far north as Spitzbergen (Hjermann et al. 2007), the spawning of this stock is limited to near the Norwegian coast (figure 1; Brander 1994, 2005). This result validates the use of ocean model environmental variables to drive the ENMs (as is necessary for the LGM simulation, below); in particular, the model-driven ecophysiological ENM (figure 3d) provides a relatively robust predictor of conditions suitable for population persistence.

(b) Modelling cod range at the LGM

In the LGM simulation most of the 10 Sv of deepwater formation upwells within the Atlantic basin, with only 4 Sv leaving the South Atlantic (Wadley et al. 2002). During the LGM, we estimate sea-surface temperatures decreased by as much as 12°C, with a sharp temperature front at approximately 50° N separating polar and temperate surface waters. The LGM land and depth mask was derived by adding past (LGM minus present-day) depths and elevations (Peltier 1994) to those for the present day.

Both palaeo-ENMs for the LGM give similar results (figure 4a,b). They suggest that G. morhua had a continuous Atlantic range off coastal Europe down to at least northern Spain. This hindcast simulation also shows a limited range of potentially suitable habitats off eastern North America, mostly near Nova Scotia and around the Grand Banks. This suitable habitat is consistent with the conclusions of the GLAMAP project that parts of the Grand Banks were unglaciated at the LGM (Pflaumann et al. 2003; Meland et al. 2005). Apart from the scattered small areas off Baffin Island and Greenland, there is a much larger area in the northeast Mediterranean and the Black Sea (unlikely to have been occupied owing to geographical disjunction). Near Iceland the two models show more disagreement, with only the ecophysiological ENM suggesting a viable habitat off southern Iceland. From figure 4b, the total area of suitable LGM habitat in the North Atlantic is estimated to be approximately 0.7 Mkm2, compared with 3.5 Mkm2 for present-day conditions (data from figure 3d): a fivefold change, with a relatively narrow and environmentally vulnerable latitudinal range in the LGM Western Atlantic and off Iceland.

Figure 4

Regions of the North Atlantic at the time of the LGM that are suitable for cod population survival, (a) from Maxent using ocean model data, as probabilities and (b) from applying the ecophysiological parameters to ocean model data.

(c) Calibrating rate of genetic divergence

As described above, the 3.5 Myr ago date for the trans-Arctic interchange was used to calibrate each source of genetic data. The per-site substitution rate over this time was estimated using the length of the internal branch on the maximum-likelihood trees between monophyletic groups of alleles in G. macrocephalus and the American population of G. morhua (comparisons with other North Atlantic populations gave the same result). The estimated per-site substitution results are shown in table 2. The estimated rate for CytB is very close to the widely used rate for mitochondrial protein-coding genes of 2% sequence divergence per million years (1.00×10−8 substitutions per base per year: Brower & DeSalle 1994). There are no published rate estimates for phantophysin or S2. For allozymes the procedure was similar, except that the population tree of Pacific and Atlantic cod species was constructed using the likelihood program Contml (Felsenstein 2005), yielding a Contml distance of 0.062 for 3.5 Myr ago. The allozyme data for this comparison came from 12 loci shared in common between a study of Pacific (Grant & Ståhl 1988) and Atlantic cod (Mork et al. 1985).

View this table:
Table 2

Estimated substitution rates for DNA loci, inferred using PAUP* v. 4.0b10 (Swofford 2002). (See §§2d and 3c for more details. Rates were calibrated assuming a divergence time of 3.5 Myr (ago) between Atlantic and Pacific cod.)

(d) Estimating percentage of private alleles, effective population size and growth

For all four classes of sequenced loci, the percentage of private alleles and gene diversity (haplotype diversity for the CytB locus) in each population is reported in table 1. For the two classes of sequenced loci not experiencing positive selection—mitochondrial CytB and nuclear S2—joint estimates of theta (i.e. 4Neu) and the growth parameter G from the program Fluctuate (Kuhner et al. 1998) are reported in table 3. An illustration of the number of private alleles in the S2 locus in Europe and America is shown in figure 5.

View this table:
Table 3

Estimates of theta and the exponential growth parameter G for each cod population. (See Wares & Cunningham (2001) for details of methodology. A large G implies swift growth and recent divergence of populations.)

Figure 5

S2 allele network (47 American and 72 Iceland/European). The number of private haplotypes in each region supports long residency on both sides of the North Atlantic; entire groups of alleles—and their descendants—are unique to either America or Europe. Shared alleles reflect some recent migration. In order to compare relative frequency of shared haplotypes, sizes of circles and slices are drawn to reflect percentage in each area. This is why singleton American haplotypes are slightly larger to compensate for smaller total American sample. The black bars are additional substitutions along these branches.

(e) Estimates of migration and timing of population divergence

While the age of initial population subdivision is most crucial to testing genealogical continuity, there is strong evidence of migration between populations. For each locus, both MDIV and IM estimated levels of migration to be about an order of magnitude lower between Canada and Iceland than between Iceland and Europe (table 4). IM also estimates the direction of migration: mostly it was higher from east to west, e.g. for the Pan-B locus; however, in several cases, this directionality flipped between analyses (table 4).

View this table:
Table 4

Comparing migration estimates (M) from MDIV and IM. (Number presented is average migration across four runs in each program.)

Figure 6 summarizes the ages of population divergence using Contml for the allozyme data and MDIV (Nielsen & Wakeley 2001) for the DNA sequence data. The allozymes and mitochondrial CytB data (which are the only loci sampled from Greenland) agree that Greenland diverged from Iceland only since the LGM. In contrast, the Canadian–Greenland comparison shows divergence since long before the LGM, consistent with continuous occupation of Atlantic Canada.

Figure 6

Columns (with 95% minimum error bars) show estimates of time since population subdivision between major regions of the North Atlantic determined from maximum-likelihood analyses of 10 allozyme loci (Mork et al. 1985; Grant & Ståhl 1988) and coalescent analyses (MDIV) of three DNA sequence loci: mitochondrial CytB (250 bp), nuclear ribosomal protein S2 and nuclear protein pantophysin (Pan-A and Pan-B alleles analysed separately, data shown are from smaller of two estimates). All comparisons of estimated times of divergence between America and populations to the east are significantly older than the LGM, suggesting long independent histories. In contrast, our estimated ages for the Greenland/Iceland and Iceland/Europe separations are generally more recent than the LGM (where ‘Europe’ data are provided by pooled populations from Norway and the North Sea).

For the comparisons with multiple DNA sequenced loci (those not including Greenland), we compare the population divergence estimates for MDIV (Nielsen & Wakeley 2001) and IM (Hey & Nielsen 2004; table 5). Both MDIV and IM analyses of individual loci agree that Atlantic Canada and Icelandic populations diverged long before the LGM, most likely during a warm inter-glacial period such as occurred ca 100 kyr ago.

View this table:
Table 5

Comparing population divergence estimates (in thousands of years) from MDIV and IM.

4. Discussion

Both ENM hindcasts found areas of contiguous suitable habitat for the Atlantic cod G. morhua in Atlantic Canada and New England, and significant stretches off mainland Europe, during the extreme conditions of the LGM ca 21 kyr ago (figure 4). Very little suitable habitat was found off Greenland, with rather more around Iceland in the ecophysiological ENM hindcast but none from Maxent. Hence our palaeo-ENMs indicate that we should find genetic evidence for old populations in Atlantic Canada, mainland Europe and possibly Iceland, and probably a post-LGM population in Greenland.

The genetic data are fully consistent with the palaeo-ENMs. Thus, both the Canadian and the European populations of Atlantic cod appear to have survived at least one full glacial cycle (ca 100 kyr ago). For the two neutral loci (mitochondrial CytB and nuclear S2), cod populations in Atlantic Canada and Europe show high and comparable levels of private alleles (table 1), as well as comparably low levels of the population growth parameter G (table 3). As also indicated by the palaeo-models, the Greenland population appears to be newly founded, with low levels of private alleles (table 1), and a very high population growth parameter (table 3).

These heuristic measures of population age can be given a temporal component by estimating the age of population divergence between Atlantic Canada and the other North Atlantic populations. As predicted by a long continuous occupation of Atlantic Canada, all classes of data (including allozymes) showed estimated divergences between Atlantic Canada and Iceland significantly older than the LGM (figure 6), with divergence between Atlantic Canada and Europe being even older (data not shown). Estimating the age of initial population divergence gives a minimum age for the continuous genealogical continuity for both of the populations being compared. Note that population divergence does not preclude migration between the populations once they are established (e.g. table 4).

The estimated recent divergence between Iceland and Greenland populations is consistent with colonization from Iceland and/or Europe since the LGM (figure 6, table 5). Just as the palaeo-ENMs are equivocal about the survival of cod off Iceland, the various methods differ about the ages of divergence between Iceland and Europe (figure 6, table 5).

Although the extent of glaciation in New England and Canada's Maritime Provinces is still controversial, our conclusions are consistent with the GLAMAP project that suggests parts of the Canadian Grand Banks were unglaciated at the LGM (Pflaumann et al. 2003; Meland et al. 2005). Our conclusions also agree with phylogeographic analyses of other North Atlantic fishes by Bernatchez and colleagues who have argued that glacial refugia existed in Atlantic Canada. These studies include the anadromous lake whitefish (Bernatchez & Dodson 1991), rainbow smelt (Bernatchez 1997) and Arctic charr (Brunner et al. 2001). There is growing evidence that other marine taxa, such as the hermit crab Pagurus longicarpus (Young et al. 2002), may have also survived in a glacial refugium.

The nuclear gene pantophysin deserves special attention. This locus has been shown not only to be under strong positive selection but also to have two ancient and distinct allelic lineages, Pan-A and Pan-B (Pogson & Mesa 2004). However, the Pan-A and Pan-B lineages, pruned of selected mutations, agree with the CytB and S2 data in showing a long residency in Canadian and European populations (figure 6) and each exhibits a high degree of endemism in both regions (table 1). Further evidence for an old Canadian population comes from RFLP data, which showed that the A allelic lineage has three recombinant haplotypes endemic to North America (Pogson 2001). In the same vein, an entire clade of the B allelic lineage (separated from other B alleles by two non-replacement substitutions) is entirely unique to North America (Pogson 2001). This suggests that both the Pan-A and the Pan-B allelic lineages have histories in North America at least as long as their histories in Europe.

The ability of cod populations to maintain genealogical continuity over extreme natural climate variability, as shown by this study, suggests considerable inherent resilience. Yet the effects of future climatic conditions need to be considered in relation to demographic structures (Andrews et al. 2006), as well as the major population changes due to human exploitation (Frank et al. 2005). Further development of integrated physical and ecological-niche-modelling, combined with phylogenetic analyses, of cod and other shelf sea species would generate testable predictions about the relative severity of previous and future bottlenecks and their genetic impacts.


We thank NSF for funding DEB-0130275, which created the CORONA network that brought us together for this collaboration. We also thank E. Árnason for providing us with his compilation of mitochondrial sequences from published studies, K. Brander for helpful comments on this work and the Leverhulme Trust for funding project F204Q from which the modelling work began.


    • Received August 23, 2007.
    • Accepted October 19, 2007.


Notice of correction

    Figure 1 is now presented in the correct form.

     16 November 2007

View Abstract