## Abstract

Many viruses, particularly RNA viruses, mutate at a very high rate per genome per replication. One possible explanation is that high mutation rates are selected to meet the challenge of fluctuating environments, including the host immune response. Alternatively, recent studies argue that viruses evolve under a trade-off between replication speed and fidelity such that fast replication is selected, and, along with it, high mutation rates. Here, in addition to these factors, we consider the role of viral life-history properties: namely, the within-host dynamics of viruses resulting from their interaction with the host. We develop mathematical models incorporating factors occurring within and between hosts, including deleterious and advantageous mutations, host death owing to virulence and clearance of viruses by the host. Beneficial mutations confer both a within-host and a transmission advantage. First, we find that advantageous mutations have only a weak effect on the optimal genomic mutation rate. Second, viral life-history properties have a large effect on the mutation rate. Third, when the speed–fidelity trade-off is included, there can be two locally optimal mutation rates. Our analysis provides a way to consider how life-history properties combine with biochemical trade-offs to shape mutation rates.

## 1. Introduction

Mutation enables adaptation to changing environments, but produces deleterious changes at the same time. Early mathematical models characterized optimal mutation rates under the balance of these opposing phenomena [1,2]. Generally, the avoidance of deleterious mutation drives mutation rates to low levels, as seen in the presence of mechanisms to correct replication errors. Indeed, when environments are stable the optimal mutation rate is zero, but the evolution of low mutation rates is limited by the cost of replicating DNA (or RNA) accurately—the cost of fidelity [3,4]. The equilibrium genomic mutation rate therefore may reflect a balance between selection against mutation and a limit set by the cost of fidelity. This view is supported by the remarkably constant per-genome mutation rate across a wide range of taxa [3,4]. Alternatively, the lower limit might be set by drift—the inability of selection to improve fidelity any further—rather than the cost of fidelity [5,6].

Contradicting the principle that mutation rates evolve to be low, hypermutable bacteria have been isolated from natural populations. For example, a large proportion of *Pseudomonas aeruginosa* isolates from the lungs of cystic fibrosis patients have been observed to have highly elevated mutation rates [7]. A similar observation has been made for *Helicobacter pylori* isolates [8]. In these examples, bacteria face fluctuating environments that may arise from starting or stopping treatment with antibiotics, or switching the drug class during antibiotic treatment. The presence of these mutator strains in such environments is consistent with theoretical models that predict the transient rise of strains with high mutation rates owing to hitchhiking with the beneficial mutations they create [9–12].

To what extent do these principles apply to viruses? Many viruses, particularly RNA viruses, have high mutation rates [4,13]. Could this constitute an adaptation to ever-changing environments requiring viruses to constantly overcome defence systems of their host? For example, viruses of vertebrates may need to perpetually escape the immune system to survive by changing sites that are recognized by host immunity. In other words, does the Red Queen arms race maintain high mutation rates? Some theoretical studies have argued that this principle applies to viruses [14,15]; indeed, high-fidelity strains of viruses have been shown to have lower fitness [16,17]. These explanations are based on the assumption that elevated mutation rates must generate advantageous escape mutations in a way that outweighs the cost of deleterious mutations. This idea has been questioned, however, because transmission can occur before the immune system discriminates between viral lineages. That is, viruses do not need high mutation rates and adaptive escape from immunity if they can survive by simply infecting other hosts [18,19]. Moreover, DNA viruses share the same parasitic lifestyle as RNA viruses but have lower mutation rates [18].

A further difficulty with an explanation based on beneficial effects of mutations is that high mutation rates also lead to deleterious effects. The role of deleterious mutation on viruses has been studied using the theory of quasispecies—a cloud of closely related individuals occupying a fitness landscape whose properties are critical to their evolution [20–23]. A prediction from this work is that the upper bound on the genomic mutation rate is one per replication, which is broadly consistent with empirical findings [20,24,25]. It is evident that the rate of deleterious mutation is high in viruses [26,27], with 20 to 40 per cent of mutations having lethal effects [26]. Models of the evolution of viral mutation rates should therefore also account for deleterious effects.

An alternative explanation for high viral mutation rates argues for a trade-off between replication rate and accuracy so that fast replication must come with low accuracy (i.e. high mutation rates) [19,28]. Under this hypothesis, viral evolution is strongly driven by selection for fast replication, and mutation rates are high because of this biochemical constraint. The reduced fitness observed in high-fidelity viral strains can be explained as a result of this trade-off rather than their failure to produce adaptive mutations [29,30]. Consistent with this idea is the positive relationship between replication rate and mutation rate observed in HIV [30]. Furthermore, in vesicular stomatitis virus there is a negative relationship between the mutation rate and the adaptation rate measured *in vitro* [29]. Sniegowski *et al.* [4] argue that deleterious mutation drives rates as low as possible under the constraint imposed by the cost of fidelity. By comparison, selection for fast replication in viruses may drive the replication rate as high as possible under the same constraint.

Thus, current thought in this area contrasts two trade-offs that may influence mutation rates—the first between deleterious and advantageous mutations, and the second between replication speed and accuracy. A further trade-off to consider comes from the evolution of virulence literature [31]. Efforts to understand the evolution of pathogen virulence have identified a trade-off between on the one hand growing fast and killing the host (virulence), and on the other growing too slowly and being eliminated by the immune response [32] or outcompeted by faster-growing strains [33–35].

Life-history properties of the virus (defined here to be viral population dynamics resulting from the interaction between the virus and the host) might have an important impact on mutation rates. For example, as mentioned above, escape from the immune system may play a role for viruses of vertebrates [15]. A recent simulation study combines within- and between-host dynamics of a pathogen to explore the evolution of mutation rates of the pathogen [36]. This study suggests that two locally optimal mutation rates are possible: first, when pathogens have a very low mutation rate, they evolve too slowly to increase their within-host replication rate and thus do not kill their host, and are able to transmit to more hosts; second, when pathogens have a high mutation rate they produce too many deleterious mutations to improve their within-host replication rate, and again they do not kill their host, and are able to transmit to more hosts. Our understanding of viral mutation rates would be further improved by integrating all three kinds of trade-offs and explaining how they interact.

Here, we construct and analyse a simple model of viral mutation rate evolution by including within- and between-host dynamics, similar to that of Antia *et al.* [32]. Mutations can have deleterious effects, but the mutation process also produces rare stochastic changes that allow viruses to infect more hosts. We study how the multiple trade-offs described above work together to influence mutation rates. Like the work of O'Fallon [36], we find the possible occurrence of two locally optimal mutation rates, which emerge in our model for different reasons.

## 2. Model

We develop a simple dynamical model of viruses growing within hosts and transmitting between hosts. We later extend this basic model to consider within-host processes in more detail. In the basic model, viruses grow exponentially within the host, starting from a single virus. Viruses can experience both deleterious and advantageous mutations. The rate of mutation is *U* per viral genome per generation and the within-host growth rate is *r* per virus per generation. There may be a relationship between these two parameters because of biochemical constraints; in that case, we will write *r*(*U*) for the growth rate. Transmission is proportional to the total number of viruses produced during the course of the infection, with proportionality constant *c*. There is no superinfection or coinfection.

Deleterious mutations result in non-viable virions; that is, they are lethal. The proportion of mutations that are lethal and effectively reduce the viral replication rate is *δ*. A mutation is advantageous with probability *α*. Advantageous mutations confer a growth advantage *σ* within hosts and a transmission advantage *s* between hosts. We will refer to viruses with advantageous mutations as *mutants*. We assume that at most a single advantageous mutation occurs within a single infection. The infection ends at time *T*. Table 1 summarizes the parameters of the model.

### (a) Dynamics and mean fitness of the virus

In each viral generation, lethal mutation removes a proportion 1−e^{−δU} of virions. The virus grows exponentially within the host according to e^{bt}, where

$$b = r\,e^{-\delta U}$$

is the growth rate of viruses not affected by lethal mutations. We can think of this as the effective growth rate. Note that the parameter *r* is constant in the model and represents a host- and environment-determined baseline growth rate. It is not affected by mutations. Since the virus grows exponentially in the host and the mutation rate is constant per virion, the time until a mutant appears is described by the Gompertz distribution with hazard function

$$h(t) = \alpha U\,e^{bt}.$$

The probability density function for this distribution is

$$f(t) = \alpha U\,e^{bt}\exp\!\left[-\frac{\alpha U}{b}\left(e^{bt}-1\right)\right].$$

Note that time *t* takes values over [0, ∞) even though the infection ends at time *T*. So whenever the time drawn from this distribution is greater than *T* (i.e. *t* > *T*), the infection has ended before the mutant has had a chance to appear.
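The waiting time to the first advantageous mutant can be explored numerically. The following is a minimal sketch, assuming the hazard is the per-virion rate of advantageous mutation (*αU*) multiplied by the wild-type population e^{bt}; the function names are illustrative and not taken from the paper.

```python
import math
import random

def effective_growth_rate(r, U, delta):
    """b = r * exp(-delta * U): growth rate after lethal mutants are removed."""
    return r * math.exp(-delta * U)

def prob_mutant_by(T, b, alpha, U):
    """P(first advantageous mutant appears before T), assuming hazard alpha*U*exp(b*t)."""
    cumulative_hazard = alpha * U * (math.exp(b * T) - 1.0) / b
    return 1.0 - math.exp(-cumulative_hazard)

def sample_mutant_time(b, alpha, U, rng=random):
    """Inverse-CDF draw of the waiting time; a draw above T means no mutant arose."""
    u = rng.random()  # uniform on [0, 1)
    return math.log(1.0 - b * math.log(1.0 - u) / (alpha * U)) / b
```

Comparing a sampled time with *T* decides whether a given infection ever contains a mutant, which mirrors the remark that times greater than *T* correspond to infections ending before the mutant appears.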

The mutant has advantage *s* with respect to transmission to new hosts. This roughly corresponds to a Red Queen situation in which there is always a mutation of potentially large effect size available because the host population keeps evolving (or acquiring immune memory against old strains). The overall reproductive fitness is an average over the contributions of the two strains. This is computed by integrating the viral load over infection time *t* and averaging over times *τ* that the mutant could have arisen (if at all) before time *T*.

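The numbers of wild-type and advantageous mutant viruses are, respectively, given by e^{bt} and e^{b(1+σ)t}. The total number of wild-type viruses by *t* = *T* is

$$\int_0^T e^{bt}\,dt$$

if there is no mutation. If there is mutation, then we need the time *τ* that the mutant appeared, and to subtract the entire mutant clade from the wild-type population. That is, we must subtract

$$\int_\tau^T e^{b(t-\tau)}\,dt,$$

which is replaced by adding back the total number of mutants, which is given by

$$\int_\tau^T e^{b(1+\sigma)(t-\tau)}\,dt.$$

The mean reproductive fitness, equation (2.1), is therefore the transmission constant *c* multiplied by the expected total viral output: the wild-type total, corrected by these two terms and averaged over the time *τ* at which the mutant appears, with mutant virions carrying the transmission advantage *s*.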
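The averaging described above can be carried out numerically. The sketch below is one possible reading, not the paper's own expression: it assumes that mutant virions are weighted by (1 + *s*) in transmission, that the mutant appearance time follows a Gompertz density with hazard *αU*e^{bt}, and that *b* = *r*e^{−δU}. All default parameter values and the function name are hypothetical.

```python
import math

def mean_fitness(r, U, T, delta=0.3, alpha=1e-3, sigma=0.1, s=1.0, c=1.0, n=2000):
    """Expected transmission from one infection (all defaults are hypothetical)."""
    b = r * math.exp(-delta * U)               # effective within-host growth rate
    k = b * (1.0 + sigma)                      # growth rate of the mutant clade
    wild_total = (math.exp(b * T) - 1.0) / b   # integral of e^{bt} over [0, T]
    rate = alpha * U                           # advantageous mutations per virion
    if rate == 0.0:
        return c * wild_total                  # no mutant can ever appear

    def integrand(tau):
        # Assumed Gompertz density of the mutant appearance time tau.
        density = rate * math.exp(b * tau - rate * (math.exp(b * tau) - 1.0) / b)
        lost = (math.exp(b * (T - tau)) - 1.0) / b      # clade removed from wild type
        mutants = (math.exp(k * (T - tau)) - 1.0) / k   # clade added back as mutants
        return density * ((1.0 + s) * mutants - lost)

    # Trapezoidal rule over tau in [0, T]; infections with tau > T add nothing.
    h = T / n
    integral = 0.5 * h * (integrand(0.0) + integrand(T))
    integral += h * sum(integrand(i * h) for i in range(1, n))
    return c * (wild_total + integral)
```

Setting `sigma=0` and `s=0` makes the mutant clade indistinguishable from the wild type, so the function then returns the wild-type total alone, which is a convenient sanity check.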

### (b) Within-host factors: virulence and immunity

Three kinds of within-host models will be considered. First, the duration of infection *T* is constant regardless of the viral growth rate *b*. Second, the infection ends when the virus kills the host by reaching a population size *D*. The duration of infection is given by the time at which this virulence threshold *D* is reached, namely *T* = log(*D*)/*b*. Finally, the duration depends on both virulence and the immune response, as follows. The infection ends either because the virus kills the host when it reaches a population size of *D* or because the immune response clears the virus because it grows slowly enough to be overpowered. To model these two phenomena, assume that the infection duration *T* is given by a piecewise function of the effective growth rate *b* (equation 2.2). Here, the immune response is more effective when viral growth rates are low. In particular, for low effective growth rates *b* there is a linear relationship between *b* and the duration of infection *T*. The parameter *b** is a turning point separating virulence (killing of the host) from recovery (clearance by the host). Figure 1 illustrates the within-host trajectory of the virus under this model. This model is a simplified version of the one proposed by Antia *et al*. [32].
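The three duration models can be compared side by side. The sketch below is illustrative only: for the virulence + immunity case it assumes that the linear piece passes through the origin and meets log(*D*)/*b* continuously at *b* = *b**, which is one simple way to satisfy the description above and may differ from the paper's equation (2.2). The function name and default values are hypothetical.

```python
import math

def infection_duration(b, model, T_const=10.0, D=1e9, b_star=1.0):
    """Duration of infection T for effective growth rate b under each model."""
    if model == "constant":
        return T_const                       # duration independent of growth
    if model == "virulence":
        return math.log(D) / b               # time for e^{bt} to reach D
    if model == "virulence+immunity":
        if b >= b_star:
            return math.log(D) / b           # host killed at threshold D
        # Assumed linear clearance branch, continuous with the branch above at b_star.
        return math.log(D) * b / b_star ** 2
    raise ValueError(f"unknown model: {model}")
```

Under these assumptions the duration rises linearly up to *b** and then falls as 1/*b*, so the longest infection occurs exactly where the two pieces join.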

### (c) Replication speed–accuracy trade-off

Under a trade-off between the speed and accuracy of replication, there would be a positive relationship between *r* and *U*. As replication fidelity increases (as *U* decreases), there should be increasing cost to the speed of replication via a decrease in *r*(*U*). This is the cost of fidelity. Let

$$r(U) = \rho\,U^{\varepsilon} \tag{2.3}$$

where *ρ* is a scaling parameter and where *ε* ≥ 0. If *ε* = 1 there is a proportional decrease in replication rate with any decrease in mutation rate. If it becomes increasingly costly to reduce mutation rate, then *ε* < 1. In this case, the constraint curve has negative concavity (but note that on a logarithmic scale for *U* the curve has positive concavity).
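The concavity remark can be checked directly. The short derivation below is a sketch that assumes the power-law form *r*(*U*) = *ρU*^{ε} for the constraint curve, with *ρ* > 0 and 0 < *ε* < 1; the variable *x* = log *U* is introduced here only for illustration.

```latex
\begin{align*}
\frac{d^{2} r}{dU^{2}} &= \rho\,\varepsilon(\varepsilon-1)\,U^{\varepsilon-2} < 0
  && \text{for } 0<\varepsilon<1,\ U>0,\\
r = \rho\,e^{\varepsilon x}
  \ \Longrightarrow\
\frac{d^{2} r}{dx^{2}} &= \rho\,\varepsilon^{2}\,e^{\varepsilon x} > 0
  && \text{with } x=\log U .
\end{align*}
```

The sign change arises only from the change of variable: the same curve bends downwards when plotted against *U* and upwards when plotted against log *U*.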

We study the effect of this relationship by considering how evolution on the fitness landscape over (*r,U*) is constrained by equation (2.3). The fitness surface specified by equation (2.1) exhibits population-level trade-offs in the system, while the constraint curve represents biochemical trade-offs. Exploring the placement of this curve over the fitness surface shows how multiple trade-offs operate in the evolution of viral mutation rates.

### (d) An extended model with resource limitation and non-lethal mutation

To relax some of the assumptions of the basic model, we introduce an extended model in which there is non-lethal deleterious mutation, and in which resources for the virus are limited by the depletion of target cells. We define a continuous-time stochastic model as follows. Let *V* be the number of wild-type viruses (or virus infected cells), *W* the number of viruses with non-lethal deleterious mutation, *X* the number of mutant viruses with advantageous mutation but not deleterious mutations, and *Y* the number of mutant viruses with the advantageous mutation and the deleterious mutation. Let *θ* be the number of target cells used by the virus; this is a resource that can be depleted by the virus. Target cells become infected with rate parameter *β*. Infected cells are killed at rate *ω* per cell per time unit.

Viruses with the beneficial mutation have a growth advantage of *σ* within hosts, while having the non-lethal deleterious mutation comes with a cost *ζ*. Let *δ* be the proportion of mutations that are deleterious but not lethal, and let *λ* be the proportion of mutations that are lethal. As before, *α* is the proportion of mutations that are advantageous. We also define the respective probabilities of non-lethal deleterious and advantageous mutation per generation per infected cell.

A deterministic version of the model can be described with a system of five differential equations, equations (2.4)–(2.8), one for each of the state variables *V*, *W*, *X*, *Y* and *θ*.

The initial conditions are: *V* = *V*_{0}, *W* = 0, *X* = 0, *Y* = 0 and *θ* = *θ*_{0} at *t* = 0, where *V*_{0} and *θ*_{0} are, respectively, the initial number of virus-infected and uninfected target cells. We study the corresponding stochastic model computationally using Gillespie's tau-leaping method. We compute the fitness of the virus for a given realization of the process using equation (2.9), in which [*A*], the Iverson bracket, equals 1 if *A* is true and 0 otherwise.
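To illustrate the simulation technique, the sketch below applies tau-leaping to a deliberately reduced system. It is not the paper's model: it tracks only wild-type infected cells *V* and target cells *θ*, assumes mass-action infection at rate *βθV* and death at rate *ωV*, and omits all mutant classes and the fitness calculation. Parameter values, the helper `poisson` and the function `tau_leap` are hypothetical.

```python
import math
import random

def poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count

def tau_leap(V0=10, theta0=10_000, beta=1e-4, omega=0.5,
             tau=0.01, t_end=40.0, seed=1):
    """Fixed-step tau-leaping for infection (theta -> V) and infected-cell death."""
    rng = random.Random(seed)
    V, theta, t = V0, theta0, 0.0
    cumulative_load = 0.0                      # time-integral of V
    while t < t_end and V > 0:
        # Event counts in [t, t + tau) are Poisson with mean rate * tau,
        # capped so that neither population can become negative.
        infections = min(poisson(beta * theta * V * tau, rng), theta)
        deaths = min(poisson(omega * V * tau, rng), V)
        cumulative_load += V * tau
        V += infections - deaths
        theta -= infections
        t += tau
    return V, theta, cumulative_load
```

A fixed step `tau` keeps the example short; adaptive step selection, as used in production tau-leaping codes, would be a natural refinement.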

## 3. Results

### (a) Basic model

We study the model by first considering the reproductive fitness of the virus, given by equation (2.1) as a function of growth rate *r* and mutation rate *U*. Recall that there are three variants of the model. The first assumes a constant duration of infection. In the second variant, the duration depends on the effective growth rate of the virus: the host is assumed to die when the exponentially growing virus population exceeds a lethal threshold size *D*. In the third variant, the host dies if the virus grows at a fast rate, or, alternatively, recovers if the virus grows slowly, and is assumed to be cleared by immunity (see figure 1). Figure 2*a* shows how the reproductive value changes as a function of growth rate *r* and mutation rate *U* under the first model, in which duration of infection *T* is constant and there is no virulence. As the growth rate increases and the mutation rate decreases, the contours increase in height. The virus maximizes its fitness by growing as fast as it can while keeping its mutation rate as low as possible. Here, there is no disadvantage to growing fast, and low mutation rates are favoured because the deleterious effects of mutation are more immediately felt than advantageous effects. In other words, the wild-type fitness component *R* has a dominating effect on the reproductive value.

Figure 2*b* shows the effect on the reproductive value for the model assuming only virulence. Here, in contrast to the model with constant infection duration, low growth rates along with high mutation rates lead to the highest reproductive values. This is because low growth rates prolong the infection, so that the total within-host viral output, and thus transmission, is larger. This is achieved by lowering growth rate *r* and/or increasing the impact of deleterious mutation through *U* on the effective growth rate *b*.

Figure 2*c* shows the effect of both virulence and growth rate-dependent immunity. In this case, there is a value of *b*, namely *b**, at which the duration of infection *T* is highest (see equation 2.2). At this effective growth rate the total viral load and the reproductive value are also maximal for the parameters used in the analysis. There is therefore a fitness ridge along *b* = *b** for the parameters used here. Figure 2*d* shows the fitness surface again for the model with both virulence and immunity, but with extremely strong positive selection (*s* set to 15). In this case, the *b* = *b** relationship is now a ‘tilted’ ridge with higher fitness values at higher combinations of *r* and *U*. Also, for low growth rates *r*, mutation rate *U* peaks at intermediate values. We note that to achieve this distortion of the surface *s* must be pushed to extremely high values (the effect appears at around *s* = 10). Mutations in viruses enabling the use of different host cell receptors have the potential to increase the available pool of susceptible hosts (e.g. norovirus [37]). However, *s* > 10 is perhaps still unrealistic as it would constantly require more than 10 times as many new susceptible hosts to become available by switching receptors.

In figure 3, we show both mean reproductive fitness and infection duration *T* as a function of *U* for a fixed growth rate *r*, under the virulence + immunity model. Here, we see that the two functions peak at the same point, namely where the two pieces of the model join. This shows that viral fitness is strongly affected by the duration of infection, which in turn is determined by the details of the life-history properties of the virus. Note that the peak in fitness here is in the vicinity of *U* = 1, the upper bound on genomic mutation rate predicted by quasispecies theory [20]. This results from the effects of deleterious mutation owing to the term *b* = *r*e^{−δU} in the model through which the within-host growth rate drops sharply around *U* = 1.
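The location of this peak can be made explicit. As a sketch, assume the fitness maximum sits exactly at the join *b* = *b** (as reported for the parameters used here) and that *r* > *b**; solving *b** = *r*e^{−δU} for *U* gives the expression below. The symbol *U** and the numerical values in the second line are hypothetical, chosen only to illustrate a peak near *U* = 1.

```latex
\begin{align*}
b^{*} = r\,e^{-\delta U^{*}}
  \quad\Longrightarrow\quad
U^{*} &= \frac{1}{\delta}\,\ln\!\left(\frac{r}{b^{*}}\right),\\
\text{e.g. } \delta = 0.3,\ r/b^{*} = 1.35:\quad
U^{*} &= \frac{\ln 1.35}{0.3} \approx 1.00 .
\end{align*}
```

Because *U** depends only logarithmically on *r*/*b**, large changes in the baseline growth rate shift the optimal mutation rate only modestly under these assumptions.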

### (b) Application of biochemical trade-off between speed and accuracy

If the speed and accuracy of replication are constrained under a trade-off, then not all phenotypes are possible; in particular, viruses cannot be both very fast and very accurate. Such a constraint can be modelled with a positive relationship between growth rate *r* and mutation rate *U*, as done using equation (2.3). Other relationships are possible; however, the qualitative results do not depend on the exact form of this relationship. This curve can be viewed as a constraint to adaptive evolution that acts as a barrier to movement on the fitness landscape [38]. In the current context, this can be interpreted in two ways: (i) the only genetic states possible are those that confer pairs of *r* and *U* along the curve; and (ii) the only genetic states possible are those with *r* and *U* on or below the curve.

This constraint curve may intersect the fitness ridge given by the highest fitness contours in figure 2*c*,*d*. If it intersects that ridge at two values, as illustrated in figure 4 (dashed line), then there are two locally optimal pairs of *r,U*. Thus, two ‘strategies’ of viral growth emerge: first, the virus can grow slowly and accurately; second, it can grow fast and inaccurately. On the other hand, the constraint curve might not intersect the fitness ridge (figure 4, dotted line). This lower curve corresponds to a scenario in which the constraint produces lower fidelity for a given growth rate. Here, evolution would drive the system to the point on the constraint curve with the highest fitness, corresponding to a strategy of intermediate replication rate and fidelity.
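Whether the constraint curve crosses the ridge zero or two times can be checked numerically. The sketch below assumes the ridge is *r*e^{−δU} = *b** and the constraint is the power law *r*(*U*) = *ρU*^{ε} with *ε* > 0, so that along the constraint the effective growth rate is *ρU*^{ε}e^{−δU}, which peaks at *U* = *ε*/*δ*. Function names and parameter values are hypothetical.

```python
import math

def effective_rate_on_constraint(U, rho, eps, delta):
    """Effective growth rate b along the assumed constraint r(U) = rho * U**eps."""
    return rho * U ** eps * math.exp(-delta * U)

def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid      # root lies in the upper half
        else:
            hi = mid                   # root lies in the lower half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

def ridge_crossings(rho, eps, delta, b_star):
    """Mutation rates U at which the constraint curve meets the ridge b = b_star."""
    def gap(U):
        return effective_rate_on_constraint(U, rho, eps, delta) - b_star
    U_peak = eps / delta               # maximiser of U**eps * exp(-delta * U)
    if gap(U_peak) <= 0:
        return []                      # curve stays below (or only touches) the ridge
    hi = max(2.0 * U_peak, 1.0)
    while gap(hi) > 0:                 # expand until past the upper crossing
        hi *= 2.0
    return [bisect(gap, 0.0, U_peak), bisect(gap, U_peak, hi)]
```

For example, `ridge_crossings(2.0, 0.5, 0.3, 1.0)` returns two mutation rates, one on each side of `eps / delta`, corresponding to the slow-and-accurate and fast-and-inaccurate strategies, whereas lowering `rho` to 0.5 returns an empty list.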

### (c) Extended model with resource limitation and non-lethal mutation

To study the extended model, we again vary the genomic mutation rate *U* widely and observe the effect on fitness (figure 5). We do so for the case in which 10 per cent of mutations lead to beneficial effects (*α* = 0.1), and an alternative (*α* = 0) in which there are no beneficial mutations.

As we saw in the earlier simple model, there is an optimal mutation rate that balances the negative effects of virulence on the one hand and deleterious mutation on the other. Virulence takes effect here for low genomic mutation rates *U*, which are associated with high growth rates. The balance of these two forces operates with or without beneficial mutations. When beneficial mutations are introduced, the peak is higher and shifted to higher values of *U*. There is therefore an advantage to lowering fidelity and producing advantageous mutations, but this advantage is not large compared with the influences of within-host growth characteristics.

## 4. Discussion

Although the evolution of mutation rates has been a subject of general interest for a long time, it is only relatively recently that this has been explicitly studied in the context of viruses [29,30]. The basic principles applying to prokaryotes and eukaryotes do not directly apply to viruses because selection acts differently in viruses. There is little or no benefit in prokaryotes and eukaryotes for DNA polymerases to replicate genomes quickly because reproduction at the organismal level requires many other factors. Even in the case of prokaryotes, which generally have smaller genomes than eukaryotes, cell division requires many steps in addition to genome replication. While it may be faster to replicate small prokaryotic genomes, there is no relationship between genome size and cell division rate [39,40]. In contrast, for viruses, fast genome replication has an immediate fitness advantage both at the virion level and the within-host level. There should therefore be strong selection on replication speed [19,28]. A trade-off between the speed of replication and the accuracy of replication has been advocated as the major determinant of the observed genomic mutation rate [19,28]. Here, we have developed a model to examine this biochemical trade-off as well as two further trade-offs: that between advantageous and deleterious mutations, and the life-history trade-off between virulence and immunity.

Incorporating different assumptions about life-history details of viruses leads to very different outcomes. When the duration of infection is constant, replicating as fast and accurately as possible is of ultimate importance. There is an immediate advantage to growing quickly, which is not helped much by advantageous mutations. When the duration of infection depends on how quickly the virus kills the host, replicating as slowly and inaccurately as possible is favoured. This is because low growth rates lead to long-lived infections, and thus high average viral loads. High mutation rates lead to the same outcome by reducing the number of viable viruses through deleterious mutations. We note, however, that if host or environmental factors impose a low underlying growth rate *r*, then the optimal mutation rate *U* must be low. When the duration of infection depends on both virulence and host immunity, there is a balance between the two phenomena and the optimal effective growth rate is intermediate. This optimal rate maximizes the total viral load without killing the host, in the same manner as in the model of Antia *et al.* [32]. Because this optimal growth rate is affected by both the replication rate and the mutation rate, there is a ridge in fitness values over the space of *r* and *U* values through the relationship *b** = *r* e^{−δU}. The trade-off that emerges here is between within-host growth and deleterious mutation in optimizing the effective growth rate. Evolution of mutation rates depends on this ridge as well as any constraint imposed by biochemical trade-offs.

In contrast to deleterious mutation, our model suggests that advantageous mutations have an indirect and therefore weak effect on reproductive fitness. High mutation rates in viruses are not likely to be an adaptation to produce advantageous mutations; that is, they are probably not maintained in a Red Queen scenario. An exception to this principle may occur in viruses causing persistent infections, such as HIV, in which immune escape mutants occur frequently over long periods [41]. Nevertheless, the model presented here generally supports the hit-and-run view of viruses, according to which they transmit to other hosts without having to escape immunity through mutation [18,19]. This is true even when the effect of advantageous mutation is strongly favoured by elevating their rate and effect size. Although this difficulty of explaining mutation rates using escape dynamics has been framed as an inability of life-history properties to explain viral mutation rates [18,19,28], we argue that, in fact, life-history properties in the sense of within-host dynamics [32] play a crucial role. The study of O'Fallon [36] presents an alternative model supporting this contention.

Our analysis offers a way of considering the replication speed–accuracy trade-off simultaneously with the other trade-offs to yield a more complete perspective on the evolution of mutation rates. The speed–accuracy trade-off is a biochemical trade-off, which can be viewed as a constraint on how evolution can proceed on the fitness landscape defined here by the mean reproductive fitness (equation 2.1). It is a ‘design barrier’ on the landscape as discussed by Stearns [38]. Under this view of the interaction between the trade-offs, evolution will push mutation rates as high as possible on the fitness landscape along the constraint curve connecting replication rate and mutation rate. In this case, imposing this biochemical trade-off implies the possibility of two distinct local optima, which occur only when the constraint curve crosses the curve corresponding to the optimal effective growth rate on the fitness landscape. In contrast to the model of O'Fallon [36], the dual optima here arise from the interaction of two trade-offs rather than the occurrence of two alternative strategies for optimizing within-host growth rates. Similarly to O'Fallon [36], the optima occur when the duration of infection is longest, as that is when viral load is highest. Although rather speculative, this may be a step towards understanding why viruses vary in their genomic mutation rate [4,13]. In particular, DNA viruses have lower mutation rates than RNA viruses, and it is possible that life-history and biochemical trade-offs lead to an alternative lower optimum for DNA viruses.

Our model also provides a way of understanding the effect of antiviral drugs (such as ribavirin) that act by increasing the mutation rate to lethal levels [42–44]. This increase in mutation rate might be viewed as a movement on the fitness landscape (figure 2*c*) to a point with higher genomic mutation rate (towards the right on the contour plot). Such a move decreases reproductive fitness, potentially driving the viral population to extinction. If resistance to the mutagen evolves, this movement could be reversed, shifting the mutation rate towards the original lower rate. The fitness ridge we identify here suggests that there could be a class of resistance mutations that restores fitness, but not necessarily by reducing the mutation rate. Specifically, there may be resistance mutations that confer a small reduction in the genomic mutation rate combined with a large increase in the replication rate (an upward move on the contour plot), as pointed out by Bull *et al.* [44]. Such mutations restore fitness without needing to overcome the high mutation rate.

The model could be extended in several directions. We did not consider issues such as robustness and evolvability [22,28,45]. High mutation rates may lead to occupation of more robust regions of the fitness landscape. To model this phenomenon, the proportions of advantageous and deleterious mutations (*α* and *δ* in our model) could become functions of the growth rate *r*. Our basic model only considered mutations of lethal effect and advantageous mutations with a single effect size *s*. Our extended model generalized deleterious effects with an additional class, but it may be possible to generalize further to include more realistic distributions of fitness effect. Another direction would be to make the immune system more detailed and realistic. Finally, we did not consider genetics explicitly, but rather assumed that all phenotypic states are possible, and only along a constraint curve under one version of the model. Another model could specify genetic states as bitstrings and specify how genotype determines phenotype (in this case growth rate and genomic mutation rate). This would allow exploration of epistasis, landscape ruggedness and robustness, for instance.

## Acknowledgements

We thank Peter White and Rowena Bull for helpful discussions and two reviewers for making suggestions that have improved the paper. This work was supported by the Australian Research Council through the Discovery scheme.

- Received August 31, 2012.
- Accepted October 17, 2012.

- © 2012 The Author(s) Published by the Royal Society. All rights reserved.