Rapid adaptation to novel environments may drive changes in genomic regions through natural selection. Such changes may be population-specific or, alternatively, may involve parallel evolution of the same genomic region in multiple populations, if that region contains genes or co-adapted gene complexes affecting the selected trait(s). Both quantitative and population genetic approaches have identified associations between specific genomic regions and the anadromous (steelhead) and resident (rainbow trout) life-history strategies of Oncorhynchus mykiss. Here, we use genotype data from 95 single nucleotide polymorphisms and show that the distribution of variation in a large region of one chromosome, Omy5, is strongly associated with life-history differentiation in multiple above-barrier populations of rainbow trout and their anadromous steelhead ancestors. The associated loci are in strong linkage disequilibrium, suggesting the presence of a chromosomal inversion or other rearrangement limiting recombination. These results provide the first evidence of a common genomic basis for life-history variation in O. mykiss in a geographically diverse set of populations and extend our knowledge of the heritable basis of rapid adaptation of complex traits in novel habitats.
1. Introduction
It is now clear that evolutionary change can be very rapid, especially in response to strong selection or during invasion of novel habitats [1–6], and that this in turn can influence ecological and environmental conditions [7,8]. In addition, when the same traits respond to similar selective pressures in multiple species or populations, parallel evolution may lead to similar phenotypic changes [9–11]. In such cases, these parallel phenotypic changes may have the same underlying genetic basis [12–17] or may involve different genetic changes that cause similar phenotypic responses [18,19]. While parallel adaptation may stem from novel mutations in the same gene or gene region, it is often attributed to concordant changes in the frequencies of existing alleles [14–16,20,21] or to parallel regulatory changes in gene expression [22,23]. However, the relative importance and frequency of these different underlying genomic responses to adaptive evolution in natural populations is largely unknown.
The identification of specific genomic regions under selection in natural populations is a fundamental goal of evolutionary genetics research and is essential for understanding the genetic basis of local adaptation [16,24]. However, most traits of interest are complex and controlled by a large number of genetic and environmental influences, and dissection of the genetic architecture underlying their expression is challenging [15,25]. This is especially true of traits associated with multiple phenotypic, behavioural or physiological effects . In addition, demonstrating associations between a phenotype and a genotype, or fitness differences between genotypes, may not be sufficient to conclude that an allele confers an adaptive advantage in a given environment . Nonetheless, combining population genomics approaches, such as genome screens, with gene mapping and next-generation sequencing can provide strong inferential evidence of adaptation in a specific genomic region [27,28], ultimately providing the ability to relate selection on phenotypic variation to the underlying adaptive genomic variation in natural populations.
Animal migration has been the focus of many evolutionary genetic studies, and it is now widely understood that a significant proportion of phenotypic variation in migratory traits has a genetic basis . Migration and its associated phenotypic and physiological characteristics are considered threshold traits [30,31], with expression dependent on the interaction between the environment, individual growth and development, and the underlying genetic architecture . However, despite this complexity, single-gene effects on migration have been identified. For example, the candidate gene Clock was found to be associated with a substantial proportion of the variation in migratory timing in a Pacific salmonid . Similarly, Mueller et al.  identified a candidate locus associated with ‘migratory restlessness’ in multiple populations of blackcaps (Sylvia atricapilla), although polymorphism in this gene explained only approximately 3% of the variation in migratory activity.
The salmonid Oncorhynchus mykiss is one of the world's most widely distributed fish species, and also one of the top aquaculture species, with commercial production growing exponentially since the 1950s . In nature, O. mykiss exhibits extensive variation in migratory life history, from resident rainbow trout that never leave freshwater, to anadromous steelhead, which undergo one or more sea migrations but spawn in freshwater. Within this spectrum are populations in which individuals display a range of life-history trajectories [35–37]. It has long been known that these differences in ‘migratory habit’ are at least partially heritable , and that the resident life history has evolved repeatedly in different regions. As a result, resident and anadromous O. mykiss in the same watershed are generally each other's closest relatives [39–41]. However, their differences are profound. In addition to the complex combination of behavioural traits associated with migration , steelhead undergo a suite of physiological and morphological changes known as ‘smoltification’ that includes osmoregulatory adaptation to the salinity of the ocean as well as physical adaptations to the highly mobile life of a pelagic marine fish. Conversely, resident trout retain traits adapted for freshwater, including paedomorphic morphology and the development of reproductive maturity in as little as 1 year .
In a previous study, we sampled a pair of O. mykiss populations isolated above and below a natural waterfall barrier approximately 100 years ago , and screened 298 microsatellite loci using an FST-outlier approach . The locus with the highest pairwise FST value was in complete linkage disequilibrium (LD) with a second outlier locus, and both loci map to the same position on O. mykiss chromosome Omy5 [43,44]. A third outlier locus on Omy5 was also in LD with the first two . Subsequent analysis of genotype data for 96 single nucleotide polymorphisms (SNPs) identified three SNP loci that displayed LD and allelic frequency patterns consistent with the microsatellite outliers in the same pair of populations, as well as in additional O. mykiss populations (, D.E.P. and J.C.G. 2013, unpublished data). Importantly, these results are consistent with mapping studies that have localized QTL for early maturation, growth rate and smoltification, to this region of Omy5 [17,43,46–48].
Based on these data, we hypothesized that Omy5 contains a genomic region that is strongly associated with the expression of resident and anadromous life histories in O. mykiss, and that this region would undergo divergent natural selection in populations isolated above and below barriers. Here, we show that a subset of loci located on chromosome Omy5 follow a clear and concordant pattern of divergence between habitats favouring the resident or anadromous life histories. This pattern probably results from strong parallel natural selection acting on life-history traits influenced by one or more genes in this region. These results demonstrate repeated evolutionary change in response to similar selective environments, suggesting that a common genetic mechanism is involved in the life-history transition from anadromy to residency in O. mykiss through independent adaptation based on standing genetic variation.
2. Material and methods
(a) Population samples
Samples from 21 populations located above and below natural (waterfall) and artificial (dam) barriers to upstream anadromous fish migration were selected from locations in five watersheds throughout California and southern Oregon, as well as from three hatchery rainbow trout strains commonly used in California (figure 1 and table 1). The population samples represent various life stages and individual life histories, including juveniles and adults, as well as summer and winter steelhead ecotypes (table 1). The above-barrier populations are a mixture of coastal steelhead lineage fish isolated above waterfalls (Cutfinger, Willow and Big creeks) or dams (populations in the Salinas, Santa Ynez and Santa Clara rivers), and inland trout populations (Bauers, Buckboard and Butcherknife creeks). Although all of these above-barrier populations appear to be primarily natural-origin fish [39,40,49], introgression by stocked hatchery trout is impossible to rule out and may be significant in some populations.
(b) Marker development, genotyping and analysis
Miller et al.  identified 344 polymorphic SNPs that mapped to chromosome Omy5 with restriction-site associated DNA (RAD) sequencing in two divergent strains of hatchery trout. We developed 55 new SNPtype assays (Fluidigm Corporation, So. San Francisco, CA, USA) from these polymorphisms by aligning the flanking sequence around each SNP with up to 150 bp of flanking sequence derived from BAC clones of the Swanson (Sw) strain . In addition to these RAD-derived SNPs, we used assays for five chromosome Omy5 loci developed by Castaño-Sánchez et al. , as well as for the three putative Omy5 loci identified by Abadía-Cardoso et al. .
To complement the Omy5 loci, we used a set of assays for 32 additional SNP loci located on 17 of the 28 other O. mykiss linkage groups. All loci were genotyped with 96.96 Dynamic SNP Genotyping Arrays on an EP1 system (Fluidigm) following the manufacturer's recommended protocols. Two negative (no template) controls were included in each array, and genotypes were called using Genotyping Analysis Software v. 3.1.1 (Fluidigm). Details, including map positions, primer sequences and summary statistics, as well as genotypes for all SNP loci, are provided in the electronic supplementary material, appendix.
Tests for Hardy–Weinberg (H–W) equilibrium and LD were conducted using Genepop . Because of the large number of pairwise tests for LD, only p-values less than the Bonferroni-corrected critical value were considered to be statistically significant. We then used the allelic correlation coefficient, r2 [52,53], to quantify the pattern of LD among the Omy5 loci in three ways. First, we ranked the 55 Omy5 loci using the mean r2 value for each locus compared with all other loci. Second, we visualized the LD among Omy5 loci in each population using the statistical packages ‘genetics’  and ‘LDheatmap’  in the R programming language . Finally, we counted the number of pairwise linkage associations exceeding a critical value that each locus shared with other loci. For these analyses, we used the Scott Creek anadromous adults as a reference population because it had the largest sample size and the most polymorphic loci (54/55 loci).
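The ranking and counting steps described above can be illustrated with a minimal Python sketch. It is an assumption-laden illustration rather than the analysis actually run: r2 is approximated here as the squared Pearson correlation of per-individual allele counts (0, 1 or 2), a common stand-in when gametic phase is unknown, which may differ from the estimator implemented in the R package ‘genetics’. The locus names, the toy genotypes and the threshold of 0.5 are hypothetical.

```python
from itertools import combinations
from statistics import mean


def r2(x, y):
    """Squared Pearson correlation between two lists of allele counts (0, 1, 2).

    Individuals with a missing genotype (None) at either locus are skipped.
    Returns NaN if either locus is monomorphic in the retained individuals.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    if len(pairs) < 2:
        return float("nan")
    mx = mean(a for a, _ in pairs)
    my = mean(b for _, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return float("nan")
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    return sxy ** 2 / (sxx * syy)


def ld_summary(genotypes, threshold=0.5):
    """Return {locus: (mean r2 with all other loci, number of pairs above threshold)}."""
    values = {locus: [] for locus in genotypes}
    for a, b in combinations(genotypes, 2):
        v = r2(genotypes[a], genotypes[b])
        if v == v:  # NaN is the only value not equal to itself
            values[a].append(v)
            values[b].append(v)
    return {
        locus: (mean(vs) if vs else float("nan"), sum(v > threshold for v in vs))
        for locus, vs in values.items()
    }


# Hypothetical toy data: allele counts for six individuals at three loci.
toy = {
    "locusA": [0, 1, 2, 2, 0, 1],
    "locusB": [0, 1, 2, 2, 0, 1],
    "locusC": [1, 0, 1, 2, 1, 1],
}
summary = ld_summary(toy)
# Rank loci from highest to lowest mean r2.
ranked = sorted(summary, key=lambda locus: summary[locus][0], reverse=True)
```

In this toy example, the two perfectly correlated loci rank above the third, mirroring how loci inside a linkage block would separate from loci segregating more independently.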
To quantify the association between Omy5 alleles and life-history, we compared the patterns of allele frequency variation in all populations, considering all SNP loci independently. For each locus, the allele with the highest frequency among the Scott Creek anadromous adults was used as a reference to calculate the frequency, p, of that same allele in all populations.
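As a hedged illustration of this reference-allele convention, the following Python sketch picks the most common allele at each locus in a reference population and then reports the frequency of that same allele in another population. The function names, the data layout (each genotype as a pair of allele letters, with None for a missing call) and the toy genotypes are hypothetical and are not taken from the study's data files.

```python
from collections import Counter


def allele_counts(genotypes):
    """Count alleles in a list of diploid genotypes such as ("A", "G")."""
    counts = Counter()
    for genotype in genotypes:
        if genotype is not None:  # skip missing calls
            counts.update(genotype)
    return counts


def reference_alleles(reference_population):
    """Map each locus to its most frequent allele in the reference population."""
    refs = {}
    for locus, genotypes in reference_population.items():
        counts = allele_counts(genotypes)
        if counts:  # loci with no called genotypes are left out
            refs[locus] = counts.most_common(1)[0][0]
    return refs


def frequency(genotypes, allele):
    """Frequency p of the given allele among all called gene copies."""
    counts = allele_counts(genotypes)
    total = sum(counts.values())
    return counts[allele] / total if total else float("nan")


# Hypothetical toy data for one locus in two populations.
reference = {"locus1": [("A", "A"), ("A", "G"), ("G", "A")]}
other = {"locus1": [("G", "G"), ("A", "G")]}

refs = reference_alleles(reference)
p_other = {locus: frequency(other[locus], refs[locus]) for locus in refs}
```

Fixing the reference allele in one population keeps p comparable across populations, so that a low value consistently means divergence from the reference sample.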
3. Results and analysis
(a) Genetic data
Of the 95 SNPs evaluated, eight failed to amplify, were not polymorphic, or were not consistent with Mendelian segregation, all among the newly developed RADseq loci (failed: R13309, R17718; heterozygous in all individuals: R14387, R37798, R37851; monomorphic in all populations: R23404, R24167, R37101). After removing these loci from the dataset, the remaining 87 loci consisted of 55 markers on Omy5 and 32 loci on other linkage groups (electronic supplementary material, appendix). Four loci displayed significant deviations from H–W equilibrium when considered over all surveyed populations (R30220, p < 0.05; R37229, R37560, p < 0.001; OMS00121, p = ‘highly significant’).
(b) Population structure
Patterns of genetic divergence based on the 32 non-Omy5 loci, as measured by FST and phylogenetic network analysis (electronic supplementary material, figure S1), were consistent with previous studies showing close relationships between populations above and below barriers within a watershed and increasing divergence with distance along the California coast [39,40,57–60]. By contrast, analysis of loci located on Omy5 showed high divergence between above- and below-barrier populations within the same basin, and nearly complete reciprocal monophyly between above- and below-barrier populations (electronic supplementary material, figure S1).
(c) Linkage disequilibrium
None of the 496 pairwise tests for LD among the 32 non-Omy5 loci were significant after Bonferroni correction for multiple tests (p < 0.0001), even among pairs located on the same chromosome (23 loci on eight chromosomes). By contrast, 602 of the 1430 possible pairwise comparisons among the 55 Omy5 loci were significant (42%; p < 0.000034). This analysis included SNPs mapped to chromosome Omy5 as well as the three loci developed by Abadía-Cardoso et al. , confirming their strong linkage to loci on chromosome Omy5.
The mean allelic correlation coefficient, r2, of each locus with all other loci ranged from 0.008 to 0.53 over all 55 Omy5 loci, and was used to order all loci to visualize LD among Omy5 loci in each population (figure 2). In addition, 14 loci had r2 values greater than a critical value of 0.9 with 10 or more other loci, and 16 additional loci had r2 values greater than 0.5 with 19 or more loci (electronic supplementary material, appendix). Together these analyses clearly identify a block of loci in strong LD with each other, while other loci on Omy5 segregate more independently (figure 2). Although the exact boundaries of the linkage block are difficult to define in the absence of a physical map, the concordant patterns of LD and allele frequency variation across the populations of coastal O. mykiss surveyed here support the hypothesis that physical linkage is involved.
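The Bonferroni-corrected critical value for a family of pairwise tests can be sketched as follows. This is an illustration only: it assumes a family-wise significance level of 0.05, which the text does not state, and that every pair of loci yields a valid test. The function name is hypothetical.

```python
from math import comb


def bonferroni_threshold(n_loci, alpha=0.05):
    """Return (number of locus pairs, per-test critical p-value).

    alpha = 0.05 is an assumed family-wise level, not a value given in the text.
    """
    n_tests = comb(n_loci, 2)  # every unordered pair of loci is one test
    return n_tests, alpha / n_tests


# Pairwise tests among the 32 non-Omy5 loci.
n_tests_other, threshold_other = bonferroni_threshold(32)
```

Where some locus pairs cannot be tested in practice (for example, because a locus is monomorphic in a sample), the realised number of tests, and hence the threshold, would differ from this simple count.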
(d) Allele frequency and haplotype variation among populations
The patterns of allele frequency variation are consistent with the LD analysis, and identify a subset of loci on Omy5 whose allele frequencies are both highly correlated and strongly associated with the above- or below-barrier status of each population (table 1 and figure 3; electronic supplementary material, appendix). Thus, in the absence of relative position information from a physical map, we defined the 30 SNP loci with the highest mean pairwise r2 values (figure 2; electronic supplementary material, appendix) as a single, linked haplotype. These loci also had a large number of pairwise r2 values greater than 0.50, while the remaining 25 loci on Omy5 each had fewer than seven such relationships (electronic supplementary material, appendix). With few exceptions, these loci had nearly identical, population-specific allele frequencies that were highly correlated with each population's presumed life-history strategy (figure 3).
We designated the two major haplotypes identified in this region of Omy5 and associated with resident and anadromous life-history as RS and AD, respectively, and used their mean allele frequency to calculate the ‘haplotype frequency’ for each population. Above-barrier populations had significantly lower mean frequencies of the AD haplotype than below-barrier populations (table 1; t-test, 26.2 versus 72.7, p < 0.01). Notably, the highest mean frequency of the AD haplotype (75.5) was observed among the four below-barrier populations with samples of known anadromous-phenotype individuals (i.e. adult steelhead; table 1).
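The ‘haplotype frequency’ calculation and the group comparison can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the text specifies only a ‘t-test’, so the unequal-variance (Welch) form used here is an assumption, and all numbers are hypothetical toy values rather than the frequencies in table 1.

```python
from math import sqrt
from statistics import mean, variance


def haplotype_frequency(allele_freqs):
    """Mean reference-allele frequency across the linked block loci of one population."""
    return mean(allele_freqs)


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-locus AD allele frequencies for one population.
one_population = haplotype_frequency([0.20, 0.25, 0.15])

# Hypothetical per-population AD haplotype frequencies.
above_barrier = [0.1, 0.3, 0.2]
below_barrier = [0.7, 0.8, 0.9]
t_stat, dof = welch_t(above_barrier, below_barrier)
```

A negative t statistic here indicates a lower mean AD haplotype frequency in the first (above-barrier) group; a p-value would then be read from the t distribution with the returned degrees of freedom.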
Finally, we compared natural, presumably long-established, resident trout populations (Bauers, Buckboard, Butcherknife, Cutfinger and Willow creeks) with recent, anthropogenically established, above-barrier populations (Sespe, Santa Paula, Upper Piru, Piru, Santa Cruz and Big creeks). Four of the five natural resident populations were fixed for the RS haplotype, while all of the recently established above-barrier populations remain polymorphic, and the natural resident populations had significantly lower mean frequency of the AD haplotype (table 1; t-test, 0.42 versus 42.4, p < 0.01).
4. Discussion
It is now clear that evolutionary change and adaptive divergence in natural populations can occur very rapidly, responding to strong selection over relatively short ecological timescales [2,4]. In addition, it is clear that human-induced selection often exerts far greater pressure than is typical of natural selection [3,61]. What is less clear is how selection affects the genetic architecture underlying these traits, i.e. whether parallel trait evolution also indicates parallel genetic evolution. Here, we provide evidence of parallel evolution affecting a large genomic region of Omy5 in populations of coastal O. mykiss and propose that it is due to parallel natural selection acting against the anadromous life history in populations of trout above barriers. Allele frequencies of a subset of the SNPs located on Omy5 display strong non-random associations with population life-history patterns, providing concordant evidence of strong selection operating on one region of this chromosome and repeated parallel change in the frequency of an existing haplotype in response to similar selective environments.
The observed patterns of parallel adaptive evolution at the genomic level are similar to those documented for the loss of armor plates in multiple populations of sticklebacks transitioning from marine to freshwater environments  and for parallel evolution of pigmentation in vertebrates , and imply a significant reduction in fitness of the anadromous phenotype in the above-barrier populations. For example, in Scott Creek, given an estimated above-barrier population size of 200–800 , simulations suggest that a relative fitness of approximately 0.85 would be required to drive the observed reduction in AD haplotype frequency over 20–50 generations (results not shown ). Thus, our data are consistent with the hypothesis that individual expression of life-history pattern is directly influenced by one or more genes in this region of Omy5.
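Because the simulations referred to above are not shown, the following Python sketch is offered only as a generic illustration of how selection against a haplotype can be simulated, not as a reconstruction of the authors' method. It assumes a simple Wright–Fisher model with genic (per-copy) selection, discrete generations and constant population size; the function names, the starting frequency of 0.75 and the random seed are hypothetical.

```python
import random


def deterministic_frequency(p0, w_ad, generations):
    """Expected AD frequency without drift (requires 0 < p0 < 1).

    Under genic selection, the odds p / (1 - p) are multiplied by the
    relative fitness w_ad in every generation.
    """
    odds = p0 / (1 - p0) * w_ad ** generations
    return odds / (1 + odds)


def simulate_ad_frequency(p0, w_ad, n_individuals, generations, seed=0):
    """Wright-Fisher trajectory of AD frequency with selection and drift."""
    rng = random.Random(seed)
    copies = 2 * n_individuals  # diploid population
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Selection: AD copies are weighted by w_ad, RS copies by 1.
        p_selected = p * w_ad / (p * w_ad + (1 - p))
        # Drift: binomial sampling of gene copies for the next generation.
        k = sum(rng.random() < p_selected for _ in range(copies))
        p = k / copies
        trajectory.append(p)
    return trajectory


# Hypothetical scenario: 500 individuals, relative fitness 0.85, 30 generations.
expected = deterministic_frequency(0.75, 0.85, 30)
trajectory = simulate_ad_frequency(0.75, 0.85, 500, 30, seed=1)
```

Repeating the stochastic simulation over many seeds and over the plausible range of population sizes and generation counts would give a distribution of outcomes against which an observed frequency could be compared.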
Our results provide data from natural populations in California and Oregon that support prior studies on diverse O. mykiss lineages using candidate genes, genome wide association and QTL mapping to explore the genetic architecture of traits associated with life history in O. mykiss, including migration [17,32,42,46,48,65–76]. Many of these studies have identified QTL or significant associations between markers on Omy5 and life-history traits such as spawn timing [47,65,68], smoltification [42,46] and development rate or early maturation [17,66,67,70]. Thus, this study supports previous suggestions that the LD block on chromosome Omy5 represents a ‘master control region’ influencing residency and anadromy in O. mykiss [46,72] and is a parallel selected region  that has experienced repeated selection in multiple populations.
Population genomic surveys may provide improved mapping resolution for genomic regions underlying adaptive phenotypes by taking advantage of numerous generations of recombination in nature . For example, 310 of the 344 SNP loci identified by RAD on Omy5 mapped to a single location in the crossing families of Miller et al. , the highest number of such tightly linked loci found on any chromosome in that study, consistent with previous studies that have had little success in resolving the relative positions of markers located within this region [17,43,46,66,70]. By contrast, our analysis of 55 of these loci clearly differentiated loci within a cohesive linkage block from those segregating normally on other parts of the chromosome. Thus, despite the lack of resolution from QTL mapping studies, our population genomic results provide relative position information about a subset of the Omy5 loci.
Regions of reduced recombination may arise from a variety of mechanisms and are common throughout the genomes of many organisms . However, Phillips et al.  noted that chromosome Omy5 was a ‘major exception’ to the general patterns of recombination among O. mykiss chromosomes, suggesting that there is strong suppression of recombination along at least part of its length. Chromosomal inversions have long been suspected to contribute to local adaptation as the loss of recombination in such regions effectively joins co-adapted alleles at multiple loci into larger haplotypes and prevents them from being separated [78–81]. Although we do not know the specific genes or mutations under selection in the Omy5 region, a Clock gene is known to map to this region, and Clock genes have been shown to be associated with life-history variation in salmonids [32,65,69]. However, the complexity of the smolting phenotype suggests the presence of multiple genes involved in a co-adapted gene complex affecting a variety of traits.
(a) Conservation implications
Recognition of the extent and pace of contemporary evolution is critical to conservation efforts . The present survey includes representatives from four steelhead Distinct Population Segments in California , as well as three divergent O. mykiss lineages and three of the hatchery rainbow trout strains most commonly stocked in California. Almost all California steelhead are protected by the 1973 US Endangered Species Act (ESA), but this protection extends only to ‘naturally spawned anadromous O. mykiss (steelhead) populations below natural and man-made impassable barriers’. Thus, despite clear evidence that O. mykiss above and below barriers in the same coastal basins share recent common ancestry [39,40,57,84], that offspring of both resident and anadromous O. mykiss parents can express the alternate phenotype , and that fish in above-barrier populations may still undergo smoltification [37,86,87], all fish above natural or artificial barriers to fish passage are excluded from the management units afforded ESA protection .
Despite the consistency of the association between Omy5 and life-history pattern, several populations showed allele frequency patterns counter to our initial prediction. Interestingly, these patterns appear to indicate the life-history type that actually contributes most to successful reproduction in these populations. For example, we initially designated two populations from the Santa Clara River, Sespe and Santa Paula creeks, as ‘below barrier’, but both displayed moderately high frequencies of the RS haplotype. In fact, both of these populations are located above the Vern Freeman Diversion Dam, which, along with other water diversion activities in the lower Santa Clara River, has ‘effectively impeded or block[ed] fish passage to spawning and rearing habitat in the major tributaries of the Santa Clara River’ since 1900, despite the presence of a fish ladder . Thus, our results are consistent with selection from this dam contributing to the almost complete loss of the anadromous life history in the Santa Clara River. Similarly, habitat modifications probably select against anadromy in the headwaters of the Salinas River; the below-barrier Tassajera population is above sections of the river that are completely dewatered in many years.
The significantly lower frequency of the AD haplotype in older, natural, above-barrier populations relative to more recently established populations suggests that given enough time, strong selection against the migratory life history may remove the AD haplotype from these populations as well. The distinction between populations above waterfall barriers and those above dams may also be significant if dams differ from natural barriers in their ability to allow individual fish to express life-history decisions. For example, if the design or operation of a particular dam limits downstream passage of smolts, individuals that might have left the above-barrier habitat are instead forced to remain. Moreover, large reservoirs may provide ocean-like growth and rearing habitat, supporting an adfluvial life history [86,87] that preserves some aspects of anadromy and favours the AD haplotype even in a closed freshwater system. This phenomenon could explain the high frequency of the AD haplotype in the Nacimiento and North Fork Juncal populations, both of which are above large reservoirs, although the potential role of drift in creating such patterns in small above-barrier populations cannot be dismissed. The present data provide a basis for prediction of the life history favoured in a given population and a method for genomic data to contribute to the delineation of appropriate management units in this species [89,90], as well as providing a measurable indicator of the life history favoured by the ecological conditions in a given stream.
We thank Anthony Clemento (NMFS/UCSC), Sean Hayes, David Boughton and Heidi Fish (NMFS), Steve Jacobs and Stephanie Gunckle (ODFW) and Bernie May (UC Davis) for providing samples used in this study; and Sara Paddock and Travis Apgar for help with the GIS map. We also thank Eric Anderson, Anthony Clemento, Libby Gilbert-Horvath and Victoria Pritchard for discussions and assistance with data analysis and interpretation.
- Received January 2, 2014.
- Accepted February 25, 2014.
- © 2014 The Author(s) Published by the Royal Society. All rights reserved.