## Abstract

A fundamental goal of the biological sciences is to determine processes that facilitate the evolution of diversity. These processes can be separated into ecological, physiological, developmental and genetic. An ecological process that facilitates diversification is frequency-dependent selection caused by competition. Models of frequency-dependent adaptive diversification have generally assumed a genetic basis of phenotype that is non-epistatic. Here, we present a model that indicates diversification is accelerated by an epistatic basis of phenotype in combination with a competition model that invokes frequency-dependent selection. Our model makes use of a genealogical model of epistasis and insights into the effects of balancing selection on the genealogical structure of a population to understand how epistasis can facilitate diversification. The finding that epistasis facilitates diversification may be informative with respect to empirical results that indicate an epistatic basis of phenotype in experimental bacterial populations that experienced adaptive diversification.

## 1. Introduction

Studies from experimental evolution indicate that haploid asexual species can diversify into two or more ecotypes [1–3]. In the study of diversification involving *Escherichia coli* [2,3], there was evidence that the diversification was caused, in part, by frequency-dependent ecological interactions between ecotypes. Furthermore, in Le Gac and Doebeli's experiment, the genetics of phenotypic differences between ecotypes had an epistatic basis [4]. Frequency-dependent selection occurs when the fitness of a genotype is a function of the frequencies of genotypes in the population and can be caused by competition. Epistasis occurs when mutations interact with each other to produce a phenotype, as opposed to an additive genetic architecture in which mutations do not interact to produce a phenotype.

A substantial body of theory supports a frequency-dependent mode of adaptive diversification, cf. [5,6], in which this theory generally focuses on ecological processes that cause frequency-dependent interactions and its consequences. Less attention has been given to genetic processes that may facilitate adaptive diversification. Doebeli & Ispolatov [7] investigated epistasis in a frequency-dependent setting in which phenotypes were multivariate, and found that epistasis can lead to chaotic dynamics genetically and cause a population to explore the range of multivariate phenotypic space. Nevertheless, it is still not clear whether adaptive diversification is facilitated or inhibited by epistasis for low-dimensional, as well as high-dimensional phenotypes, such that lineages in a population undergo divergence and evolutionary branching more rapidly when the phenotype has an epistatic versus additive genetic architecture.

In other selective contexts, epistasis has been shown to both facilitate and inhibit the evolution of diversity. For instance, for a univariate phenotype under stabilizing selection, epistasis is expected to decrease phenotypic diversity [8]. By contrast, for a multivariate phenotype under stabilizing selection, epistasis can increase diversity [9]. The mechanism for an increase in multivariate diversity with epistasis comes about by considering how epistatic genetic effects map genealogically [9,10]. Griswold & Eisner [10] showed that under a variety of conditions, epistatic genetic effects map proportionally more towards the tips of a genealogy, leading to less genetic covariance between individuals. This decrease in covariance between individuals facilitates the maintenance of multivariate diversity within a population under stabilizing conditions [9]. Griswold & Henry [9] did not determine the effect of epistasis on the evolution of univariate and multivariate diversity when fitness is frequency-dependent.

Under conditions of diversifying selection multiple lineages are maintained, and the most recent common ancestor (MRCA) of a population is expected to be older than for a population under stabilizing selection [11]. Accordingly, there may be more opportunity for a population to build-up segregating mutations that act epistatically along separate lineages, which may facilitate adaptive diversification (electronic supplementary material, figure S1). In this paper, we assess the hypothesis that epistasis facilitates adaptive diversification in a haploid asexual population for a univariate trait.

## 2. Material and methods

We simulate a haploid population of size *N*. Individuals reproduce asexually and generations are discrete. In each generation, parents are chosen at random and in proportion to their relative fitness to give birth to offspring. Individuals can have multiple offspring. After *N* offspring are generated, the parental population dies. An offspring's genotype is a copy of its parent's, except for the possible addition of mutations. We assume an infinite sites model of mutation, and the number of mutations is Poisson distributed with mean *U* per birth. We assume a genetically monomorphic population at the start of each replicate simulation. Table 1 provides a list of functions, variables and parameters used in the study and their biological interpretation.

A canonical model of log absolute fitness of an individual *i* with phenotype *x _{i}* that leads to adaptive diversification and written in terms of a finite population with overlapping generations is2.1

In equation (2.1), *r* is the intrinsic growth rate. The function characterizes frequency-dependent competition, where *x _{j}* is the phenotype of an individual in competition with individual

*i*. The function

*α*(

*x*,

_{i}*x*) has a maximum value of 1.0 when

_{j}*x*=

_{i}*x*and asymptotically declines to zero as

_{j}*x*and

_{i}*x*diverge. The function gives the carrying capacity of a population monomorphic for phenotype

_{j}*x*and assumes that the largest carrying capacity occurs when

_{i}*x*=

_{i}*x*

_{opt}[6,12]. The biological justification for the carrying capacity function is the assumption that individuals with phenotype

*x*

_{opt}can optimally extract resources for survivorship and reproduction. Parameter

*a*determines the strength of competition and

*b*determines the strength of stabilizing selection. Parameters

*n*and

_{α}*n*determine the shape of the functional decline in competition and resource acquisition, respectively. An important feature of model 2.1 is that it has both disruptive and stabilizing components, such that if

_{K}*n*≥ 2 and

_{α}*a*>

*b*frequency-dependent selection is stronger than stabilizing selection, and a population is expected to diversify. If

*a*<

*b*, frequency-dependent selection is weaker than stabilizing selection and the population will not diversify. Furthermore, the type of diversification depends on the exponent

*n*; if

_{α}*n*is greater than two, the population is expected to evolve a multimodal distribution of phenotypes and if

_{α}*n*equals two, it is expected to diversify into a stable and normal distribution of phenotypes. The extent of disruptive selection is limited, such that although very extreme phenotypes experience little competition, they cannot acquire resources to reproduce and their fitness therefore goes down (figure 1

_{α}*a*).

Here, we explore a fitness model that shares the same properties of equation (2.1), but separates frequency-dependent processes from stabilizing selection processes more distinctly and is tailored to discrete generations and a finite population of constant size, such that the log absolute fitness of individual *i* with phenotype *x _{i}* is proportional to2.2

The term captures the frequency-dependent component of fitness, with the summation across all individuals in the population. The term *κ*(*x _{i}*) in equation (2.2) captures the stabilizing selection component. Here, we assume

*κ*(

*x*) = exp(−

_{i}*b*|

*x*−

_{i}*x*

_{opt}|

^{2}), such that the exponent that corresponds to

*n*in model 2.1 is equal to two. Likewise, we assume in our simulations that

_{K}*n*

_{α}is equal to two, which in model 2.2 results in a multimodal distribution of phenotypes (see §3). Empirically, the frequency-dependent term could be estimated by regressing individual fitness on genotype or morph frequencies, and the stabilizing selection term would be estimated using the standard quadratic approach. In model 2.2, if the population is monomorphic with phenotype

*x*

_{opt}, individual fitness is equal to 1.0. The log fitness of individual

*i*relative to individual

*k*is2.3Equation (2.3) indicates that the relative fitness of individual

*i*increases if it either experiences weaker competition, on average, and/or has a larger stabilizing selection fitness component. Fitness model 2.2 shares the same properties as model 2.1 in which it has both disruptive and stabilizing components and the extent of disruptive selection is limited, such that although very extreme phenotypes experience little competition, they cannot acquire resources to reproduce and their fitness therefore goes down (figure 1

*b*). Model 2.2 has sharper fitness peaks than model 2.1, which may explain why adaptation to a multimodal distribution of phenotypes is possible for model 2.2 when

*n*

_{α}= 2 (see §3).

The phenotype of individual *i* with haplotype *H _{i}* is2.4where

*g*is the

_{i}*i*th genetic effect and

*W*(

*H*) is the power set of individual

_{i}*i*'s haplotype [9,10]. We do not include a random environmental effect. The haplotype of an individual is represented by a list of numbers indicating sites that have mutated since the MRCA of the population, such that the haplotype of the MRCA is represented by the empty set {} and a haplotype with mutations at sites 1, 10 and 23 is represented as the set of numbers {1, 10, 23}. The power set of haplotype {1, 10, 23} is , such that the haplotype's phenotype is2.5where the genetic effect equals the phenotype of the MRCA, genetic effects with a single mutational site indicate the change in phenotype relative to the MRCA caused by a mutation at that site alone (an additive effect), and genetic effects with two or more mutational sites indicate the change in phenotype caused by interactions between sites (an epistatic effect) [9,10]. This model of epistasis takes a molecular genetic perspective, in which mutations are added separately and in different combinations relative to an ancestor to isolate genetic effects. In principle, a haplotype that differs from the MRCA by

*d*sites can have 2

*genetic effects in total and effects involving epistatic interactions between*

^{d}*r*mutations. In this paper, we assume that effects of order

*r*> 2 are zero, such that only pairwise epistasis can occur.

In forward-time simulations, it is possible to have several MRCAs as the population evolves, such that can change as mutations fix. When a set of mutations fix, their additive effects and epistatic effects get incorporated into . Furthermore, their epistatic effects with segregating mutations get incorporated into the additive effects of these segregating mutations. For instance, assume an ancestral haplotype fixes in the population and causes mutations at sites *i*, *j* and *k* to become fixed, then , where is the phenotypic state of the ancestor after fixation. Furthermore, if a mutation at site *ℓ* is segregating, its additive effect after fixation becomes , such that epistatic effects are converted to additive effects.

We assume that additive genetic effects generated by new mutations are independent and normally distributed, with mean zero and variance , where the subscript 1 indicates a genetic effect involving a single mutation. We assume that epistatic genetic effects follow the model2.6where *f* is normally distributed with mean and variance . Parameters and are the expected variances in additive genetic effects for sites *i* and *j*. If the mutation at site *ℓ* is new, then is equal to . If the mutation at site *ℓ* has been segregating, then is equal to the variance in the additive effects of segregating mutations. The difference in between new and older segregating mutations is owing to the fact that after the fixation of one or more mutations, additive effects of segregating mutations are readjusted to account for fixed epistatic effects, such that the variance in segregating effects is different from . Furthermore, the selection process changes the variance in the effects of segregating additive genetic effects relative to . We investigate cases when such that there is no correlation between additive and epistatic genetic effects, and cases when such that there is a negative correlation between additive and epistatic genetic effects, on average. The negative correlation induces the phenomenon of ‘negative epistasis’, which is found in bacteria [13,14]. This model of epistasis is similar to that of Carter *et al.* [15]. The variance in epistatic effects between new mutations is , which is easily derived based on the expected variance of a product and noting that the expected variance of is one. The variance in epistatic effects generated by new mutations interacting with segregating mutations is different from because segregating additive genetic effects may not have an expected value of zero and not be normally distributed.

It is important to set up meaningful comparisons that illustrate the consequence of an epistatic genetic architecture compared with an additive architecture on adaptive diversification. Our approach, following Griswold & Henry [9], is to compare the situation in which it is expected that genetic variance is the same between the additive and epistatic cases under the assumption of neutrality and when the population is at an approximate equilibrium between mutation and drift. To accomplish this, we ran simulations of populations with the epistatic genetic architecture for 10^{4} generations under neutral conditions, measured genetic variance in the trait and then calculated the expected variance in effect size assuming an additive genetic architecture that would give the same genetic variance as the epistatic case. More specifically, if *V _{I}* is the genetic variance for a trait under an epistatic genetic architecture at mutation–drift equilibrium, then the value of that satisfies the equation is the variance in effect size that would result in an additive genetic architecture having the same genetic variance as the epistatic genetic architecture at mutation–drift equilibrium.

We used a disparity statistic to quantify diversity, which is simply the average Euclidean distance between the phenotypes of individuals in the population, and has effectively captured changes in diversity across time in simulation experiments and in nature [16,17]. Disparities were equal between the additive and epistatic cases under neutral conditions when genetic variances were also equal.

## 3. Results

At 5000–10 000 generations, phenotypic disparity is typically greater in a population with an epistatic genetic architecture compared with a population with an additive genetic architecture (figure 2). The variance in the size of genetic effects of random mutations affects disparities. When the variance in random genetic effects is relatively large (figure 2*a*), disparities are equal between the epistatic cases and the additive case at 2000 generations, but at 5000 and 10 000 generations, disparities are typically greater for the epistatic cases than the additive case. By contrast, with a smaller variance in genetic effects (figure 2*b*), disparities are less for the epistatic cases than the additive case at 2000 generations, but at 5000 and 10 000 generations, disparities are greater for the epistatic cases than the additive case. Furthermore, for the smaller variance cases compared with the larger variance cases, the magnitude of the difference in disparity between the epistatic and additive genetic architectures is greater at generations 2000, 5000 and 10 000. Inspection of lineage diversification plots indicates that an epistatic genetic architecture is more prone to generate a third stable lineage earlier than the additive genetic architecture, which facilitates the evolution of greater disparity, on average, by 5000 and 10 000 generations (figure 3 and the electronic supplementary material, figure S2).

Greater disparity at a longer timescale is still achieved with negative epistasis compared with the additive case. For correlations between epistatic and additive genetic effects of between 0 to at least −0.28 , an epistatic genetic architecture can still have greater disparity at 5000 and 10 000 generations (figure 2*b*). Interestingly, when the correlation between epistatic and additive effects becomes strong and negative, such that transitions from values to (or from a correlation between epistatic and additive genetic effects of −0.10 to −0.20 and −0.14 to −0.28 ), disparities actually increase in generations 100–5000 (points connected by grey lines in figure 2) relative to cases with weaker negative epistasis. Inspection of lineage diversification plots indicates that that strong negative epistasis actually facilitates the formation of three and subsequently four or more lineages (electronic supplementary material, figure 3*a*), but that these lineages are unstable, such that lineages often merge. The tendency for lineages to be less stable with weaker negative epistasis is also apparent when (electronic supplementary material, figure 3*b*). After 10 000 generations and when dynamics stabilize, disparity is, on average, less with strong negative epistasis compared with weaker negative epistasis.

Figure 4 explores the effect of increasing the strength of diversifying selection and increasing the strength of both diversifying selection and stabilizing selection. When the strength of diversifying selection increases and there is no negative epistasis, diversity in the epistatic case is greater than the additive case by 500 generations (figure 4), compared to it taking more than 2000 generations when diversifying selection is weaker (figure 2*a*). As the strength of both diversifying and stabilizing selection become stronger, phenotypic disparities as a function of time converge between the additive and epistatic cases (figure 4). Under strong diversifying and stabilizing conditions, morphological lineages are more distinct in the additive versus epistatic case, because there is less of a tendency for lineages to merge (electronic supplementary material, figure S4).

## 4. Discussion

The results of this paper indicate that an epistatic genetic architecture can facilitate adaptive diversification relative to an additive genetic architecture. The increased rate of diversification with epistasis is not immediate when a population is initially monomorphic. The delay in diversification is because an epistatic genetic architecture requires a longer period of time to build-up the genetic variation that is necessary to initiate the first diversification event. After the first diversification event, mutations accumulate along distinct lineages maintained by selection and epistatic interactions can then accelerate divergence between lineages and the formation of a third distinct lineage. For example, in figure 3, and in figure S2 in the electronic supplementary material, the initial diversification event is sooner with an additive genetic architecture compared with the epistatic architecture, but after this initial diversification event there is more rapid formation of a third distinct lineage when the genetic architecture is epistatic. The more rapid generation of the third distinct lineage leads to greater phenotypic disparity after 2000–5000 generations when the genetic architecture is epistatic.

Empirical investigation of the genetic basis of adaptive diversification in *Escherichia coli* [2,3] found that it had an epistatic basis [4]. Our finding that epistasis facilitates diversification at a longer timescale is consistent with the inference drawn from Lac & Doebeli [4] that an epistatic interaction causing divergence in phenotype between two bacterial lineages occurred later in the diversification process as opposed to during the initiation and start of adaptive diversification. A difference between our simulation results and Lac & Doebeli [4] is that they have so far discovered one epistatic effect of significance, whereas in our simulations, multiple epistatic interactions facilitate divergence. The population size and fitness parameters explored in this paper allowed for the evolution of typically three distinct lineages, with epistasis, over 10 000 generations. Under weaker diversifying conditions, the second lineage typically emerged at between 500–1000 generations and the third lineage typically emerged after 2000 generations, whereas under stronger diversifying conditions, times to diversification were shorter. The experiments of Doebeli & co-workers [2–4] were run over a maximum of 1200 generations and observed two distinct lineages.

Prior theoretical work indicated that directional epistasis can facilitate the directional response of a population to selection [15,18]. Our model of epistasis that assumes independent effects is different from the model in references [15,18], which assumed a directional form of epistasis, such that alleles on average reinforce or suppress the effect of other alleles. In our model with independent epistatic effects, these effects are, on average, directionless and it is the disruptive selection process in combination with the maintenance of multiple lineages that allow epistasis to help accelerate phenotypic divergence. Epistasis appears to accelerate divergence, because the maintenance of lineages within a population allows for multiple epistatic interactions to accumulate through time along separate lineages.

Other studies of bacterial populations besides [4] have detected epistatic gene action. For instance, Woods *et al*. [13] detected epistatic gene action that may have subsequently affected the long-term evolvability of experimental bacterial strains. A detailed molecular genetic analysis of mutational effects indicates that the mode of epistatic gene action that facilitated the long-term evolvability of one lineage versus another lineage may have been owing to the less evolvable lineage having a greater degree of negative epistasis, where the genetic background of the lineage tends to suppress the effects of mutations, whereas the more evolvable lineage experienced less suppression of mutational effects. This type of epistasis is similar to a suppressive or equivalently negative type of directional epistasis modelled in reference [15] and in our model of negative epistasis.

Additional work on the same system as in reference [13] indicated that negative epistasis on fitness became more common as a population became more adapted to an environment [19]. Negative epistasis may become more common because, at the phenotypic level, the underlying physiological process may become saturated and there are diminishing returns in phenotype for a given incremental change in an underlying factor. Khan *et al*. [14] detected a negative relationship between the additive and epistatic effect of mutations in the same bacterial population studied by in references [13,19]. Negative epistasis or a diminishing-returns mode of epistasis has also been reported in another bacterial species [20].

This study investigated negative epistasis by invoking a negative correlation between epistatic and additive genetic effects through the parameter . Our results indicate that epistasis can still facilitate adaptive divergence at longer timescales, even if epistasis is negative. Furthermore, our results indicate a counterintuitive principle that even strong negative epistasis can facilitate diversification on an intermediate timescale, because after a diversification event, one lineage accumulates positive additive effects and the other lineage accumulates negative additive effects, and in combination with a strong negative correlation between additive and epistatic effects, one of these lineages is actually likely to generate a descendant lineage that occupies the niche space between the two lineages. Although strong negative epistasis facilitates diversification, the lineages generated are not very stable and tend to merge. Furthermore, after 10 000 generations, and once dynamics stabilize, a population with a strongly negative epistatic architecture has lower diversity than a weaker negative epistatic architecture.

An important consideration of our model of negative epistasis compared with the work of [13,19] is that, in our model, negative epistasis occurs at the same magnitude throughout the evolution of the population, as opposed to becoming of stronger magnitude as a population evolves. Furthermore, there is not a decline in the variance of additive genetic effects with time, which may occur in situations when a population exhausts its ability to generate new variation in the direction of selection.

Not all studies of gene action in bacteria indicate either independent or negative epistasis. Weinreich *et al*. [21] detected a limited set of mutational paths that result in adaptive phenotypes owing to sign epistasis. Using the modelling framework in this paper, a limited set of mutational paths could be modelled as a reduced mutation rate. Our investigation of cases in which the variance in epistatic effect size is reduced also decreases mutational variance that is potentially adaptive, but there may be the potential for more incremental gains in fitness compared with the sign epistasis case [21]. It is important to note that the model of epistasis in this paper's equation (2.4) is general and can accommodate sign-epistasis, but what is required is an underlying mechanistic or network model that would assign values to the additive and epistatic effects of equation (2.4).

The relative strengths of diversifying and stabilizing selection affect the time lag before an epistatic architecture achieves a greater level of diversity. If the relative strength of diversifying selection increases, the time lag is less, likely because selection is more effective at fixing epistatic combinations and the number of segregating alleles remains high enough for epistatic variance to occur because stabilizing selection remains at the same level. If the relative strength of stabilizing selection increases but is still weaker than the diversifying selection process, our results indicate that diversification of populations with an epistatic versus additive genetic architecture will converge. A possible reason for the convergence in diversity is that strong stabilizing selection reduces within-lineage mutational variance, which inhibits the origin of adaptive epistatic genetic effects. Nevertheless, the level of diversity is about the same for the additive and epistatic cases under strong diversifying and stabilizing conditions, provided *a* > *c*. Furthermore, strong stabilizing selection restricts the range of phenotypic space in which lineages can diversify. In this restricted phenotypic space, epistasis is more likely to generate lineages that are close in phenotype to other existing lineages, whereas the additive case maintains more distinct lineages (see the electronic supplementary material, figure S4).

Epistasis may facilitate other processes that lead to balancing selection besides frequency-dependent selection, such as spatially heterogeneous selection, or equivalently local adaptation. Belotte *et al.* [22] documented local adaptation in bacterial soil isolates in nature. It may be easier to identify bacterial isolates that are potentially locally adapted versus experiencing frequency-dependent competition, which would help move studies of the genetic basis of balancing selection processes from laboratory experiments to nature.

Although this paper focused on an asexual mode of reproduction, its results may be useful in the context of sexually reproducing and recombining populations. Adaptive diversification is inhibited by sex and recombination, because otherwise distinct lineages recombine or become joined together by independent assortment and conjugation. Diversification is possible with assortative mating [23] or some spatial structure and environmental variation [24], both of which facilitate the evolution of genetic structure. Bateson–Dobzhansky–Muller incompatibilities are also possible with epistasis. These incompatibilities can be the basis of the evolution of reinforcement [25] and facilitate speciation.

The consequence of an epistatic genetic architecture for the rate of diversification in a population that is initially polymorphic and in mutation–selection–drift equilibrium is still not known. For a univariate phenotype, a population with an epistatic genetic architecture has a smaller genetic variance than a population with an additive architecture at equilibrium [8]. This is suggestive that a population with an epistatic architecture will have initially a lower diversification rate, but a higher diversification rate once the initial diversification event occurs (as reported here). For a multivariate phenotype, there can be greater multivariate diversity maintained at mutation–selection–drift balance in a population with an epistatic genetic architecture versus a population with an additive architecture [9]. This is suggestive that in a multivariate setting, a population with an epistatic architecture may diversify faster on both short and long timescales. But, this conjecture needs to be confirmed theoretically.

In conclusion, experiments in adaptive diversification indicate that the genetic basis of divergence can be epistatic [4]. Our results provide a mechanism suggesting that epistasis can accelerate diversification on longer timescales and for univariate phenotypes. This acceleration appears to be fed by the accumulation of mutations and their interactions along lineages maintained by the balancing selection process.

## Funding statement

The Natural Sciences and Engineering Research Council, Discovery programme (Canada) and the Canadian Foundation for Innovation (CFI) have financially supported this study.

## Acknowledgements

We thank two anonymous reviewers for their comments that greatly improved the paper.

- Received October 29, 2014.
- Accepted January 6, 2015.

- © 2015 The Author(s) Published by the Royal Society. All rights reserved.