## Abstract

A modified susceptible–infected–recovered (SIR) host–pathogen model is used to determine the influence of plant mating system on the outcome of a host–pathogen interaction. Unlike previous models describing how interactions between mating system and pathogen infection affect individual fitness, this model considers the potential consequences of varying mating systems on the prevalence of resistance alleles and disease within the population. If a single allele for disease resistance is sufficient to confer complete resistance in an individual and if both homozygote and heterozygote resistant individuals have the same mean birth and death rates, then, for any parameter set, the selfing rate does not affect the proportions of resistant, susceptible or infected individuals at equilibrium. If homozygote and heterozygote individual birth rates differ, however, the mating system can make a difference in these proportions. In that case, depending on other parameters, increased selfing can either increase or decrease the rate of infection in the population. Results from this model also predict higher frequencies of resistance alleles in predominantly selfing compared to predominantly outcrossing populations for most model conditions. In populations that have higher selfing rates, the resistance alleles are concentrated in homozygotes, whereas in more outcrossing populations, there are more resistant heterozygotes.

## 1. Introduction

Broad patterns in nature indicate a relationship between plant mating system and plant–pathogen interactions. There is a positive correlation between the number of fungal pathogen species known to infect a plant host and the outcrossing rate of the host (Busch *et al*. 2004). In addition, species with higher outcrossing rates tend to occur in less disturbed, biologically complex habitats, where disease and other ‘natural enemies’ are likely to be more prevalent (see Levin (1975), although alternative explanations for this pattern exist). These ecological correlations provide evidence that pathogen pressure may influence mating system evolution and vice versa. Given that the mating system of the host controls the genetic diversity of its progeny (Holsinger 2000) and the degree of similarity between parent and progeny, the mating system of the host could affect the evolution of resistance to infection and the prevalence of disease in populations. For example, biological control of weedy plants is more effective for asexually reproducing species than for sexually reproducing species (reviewed in Burdon & Marshall 1981), indicating that recombination in host species may offer protection from natural enemies and limit the spread of enemies within populations.

Most relevant existing theory investigates the potential for disease to select for increased outcrossing by contrasting sexual and asexual reproduction in the context of the Red Queen hypothesis (e.g. Jaenike 1977; Hamilton 1980; Hamilton *et al*. 1990; Howard & Lively 1998). Two previous studies have contrasted self-fertilization (‘selfing’) and outcrossing, both of which are forms of sexual reproduction. One model found that parasites can select for a mixed mating system in a haploid host with diallelic matching alleles governing infection (Lively & Howard 1994). Another study used simulations to investigate the possibility that pathogens can select against complete selfing in a host (Agrawal & Lively 2001). The simulations showed that parasites select for outcrossing over a wide range of parameters, but that the results were sensitive to genetic assumptions mediating infection. Both of these models consider individual fitness, but do not scale up to address the potential consequences of varying mating system on the spread of resistance alleles and the transmission of pathogens within populations.

The question of the effect of inbreeding on the prevalence of disease in a population is of broad interest. Humans are causing loss and fragmentation of many species' habitats, which is likely to increase the amount of inbreeding in plants. This paper addresses the possible effects of inbreeding on disease resistance, using a modified SIR host–pathogen model (Anderson & May 1981) to determine the influence of mating system on the outcome of the host–pathogen interaction. This model links host mating system, an influential determinant of genotype frequencies, to the prevalence of infection at a population level, rather than focusing on individual fitness effects. The following specific questions are addressed.

### (a) Question 1: will the degree of host selfing affect the incidence of infection in the host population?

We examined this in the model by looking at the effect of the inbreeding coefficient (*F*) on the number of resistant, susceptible and infected individuals at equilibrium. We assumed that host resistance is determined by a single dominant allele (R). Therefore, in this model, resistant individuals may be either homozygous resistant (RR) or heterozygous resistant (Rr). The genetic effect of selfing alters the distribution of alleles in the next generation by increasing the proportion of homozygotes and decreasing the proportion of heterozygotes without altering the frequency of alleles (Hartl & Clark 1997). Thus, with increasing selfing, there should be fewer resistant heterozygotes produced and, as a result, a greater proportion of individuals susceptible to infection in the population. We therefore expected higher levels of endemic disease in selfing than in outcrossing populations.

### (b) Question 2: will the degree of host selfing affect the frequency of a dominant resistance allele (R) in the population?

Although selfing alone will not alter the frequencies of any allele, increased selfing will lead to an ever greater production of homozygous individuals, which may leave a greater proportion of the population susceptible to infection. Will more susceptibility to infection lead to a higher gene frequency of the resistance allele in later generations? We examined this in the model by looking at the effect of the inbreeding coefficient (*F*) on the frequency of a dominant resistance allele (R) at equilibrium.

## 2. The model

Using differential equations, we created a continuous time model of host population dynamics with both disease and host genetics. These epidemiological equations were based on Anderson & May's formulation (1981), but we added a genetic component of host resistance to infection and a host mating system that can be varied. Many host-specific pathogens are able to infect only particular host genotypes (Burdon 1987; Thompson & Burdon 1992). In addition, polymorphism for resistance is common within natural populations (Simms & Triplett 1994) and may be maintained by costs of resistance genes (Bergelson & Purrington 1996) or even by gene flow in structured metapopulations (Thrall & Burdon 1997). We assumed a diploid host with a single locus for resistance to infection. A dominant resistance allele (R) confers complete resistance. Although many factors can contribute to overall pathogen resistance, single dominant alleles of major phenotypic effect have been found to confer resistance in both natural and agricultural systems (Flor 1955; Thompson & Burdon 1992), and thus represent a tractable way to model resistance. For simplicity, the pathogen was assumed to be genetically uniform. Explicit numbers of individual pathogens were not quantified because the pathogen was a microparasite that could reproduce within the host. The pathogen caused similar levels of disease in the host whether the infection was caused by a single individual or many individuals of the pathogen, due to the rapid reproductive rate of the pathogen inside of a susceptible host (Anderson & May 1981). Infection affected host fitness by increasing the death rate of infected hosts. The model is suitable for pathogens that spread through horizontal, density-dependent pathogen transmission, where the host is a perennial. It was assumed that the pathogen had no alternative hosts, and that the offspring were all healthy, regardless of the infection status of the parent(s). 
We assumed a cost of resistance in the form of lower fecundity for any individual with the resistance allele.

The model considered three genotypes, one of which could be either in the infected state or in the uninfected (susceptible) state. Table 1 is a complete listing of the variables and parameters in the following equations. Following Anderson & May (1981), the set of differential equations for the system is:

$$\frac{\mathrm{d}X_{RR}}{\mathrm{d}t} = \mathrm{new}X_{RR} - bX_{RR}, \tag{2.1a}$$

$$\frac{\mathrm{d}X_{Rr}}{\mathrm{d}t} = \mathrm{new}X_{Rr} - bX_{Rr}, \tag{2.1b}$$

$$\frac{\mathrm{d}X_{rr}}{\mathrm{d}t} = \mathrm{new}X_{rr} - bX_{rr} - \beta X_{rr}Y_{rr}, \tag{2.1c}$$

$$\frac{\mathrm{d}Y_{rr}}{\mathrm{d}t} = \beta X_{rr}Y_{rr} - (b+\alpha)Y_{rr}, \tag{2.1d}$$

where *X*_{RR}, *X*_{Rr} and *X*_{rr} are the numbers of healthy individuals carrying two R alleles, one R and one r allele and two r alleles, respectively, while *Y*_{rr} is the number of infected individuals with two r alleles. We assumed that individuals with an R allele are completely resistant, or immune, to infection, so there is no need for equations for *Y*_{RR} and *Y*_{Rr}.

The model contains several additional assumptions. Once an individual host was infected, it either remained infected or died. Unlike many traditional models and vertebrate systems, there was no recovery from the disease, following the observation that plants often do not recover from pathogen infection (for analysis of a more general model in which recovery can occur, see electronic supplementary material, part A). The pathogen was transmitted directly and equally from any infected individual. An *X*_{rr} individual's chance of acquiring the pathogen depended on the number of infected individuals in the population, with an infection rate coefficient *β* (for analysis of more general assumptions on pathogen transmission, see electronic supplementary material, part B). For healthy individuals (*X*_{rr}), the death rate was *b*, whereas for infected individuals (*Y*_{rr}), the death rate was (*b*+*α*). New individuals were calculated from the number of gametes produced by resistant and susceptible genotypes, with the birth rates for the two gametes given by equations (2.2*a*) and (2.2*b*). Here, *a*_{RR}, *a*_{Rr} and *a*_{rr} are the birth rates of the three genotypes. Infected individuals generally incurred no reproductive cost of infection. In the analyses that follow, we considered three cases: (i) *a*_{RR}=*a*_{Rr}<*a*_{rr}; (ii) *a*_{RR}<*a*_{Rr}<*a*_{rr}; and (iii) a reproductive cost of infection, assigned by setting the fecundity of infected individuals to some value (*a*_{inf}) equal to or less than the fecundity of resistant individuals. The term *ρN* represents density-dependent self-limitation on reproduction, where

$$N = X_{RR} + X_{Rr} + X_{rr} + Y_{rr}. \tag{2.3}$$

In animal-pollinated plants, mating system can be considered a continuous variable from complete selfing to complete outcrossing (Vogler & Kalisz 2001). The mating system of a population can be estimated using the inbreeding coefficient (*F*), which ranges from 0 to 1 (Hartl & Clark 1997). In comparison to a population composed of randomly mating (i.e. outcrossing) individuals, complete selfing halves the frequency of heterozygotes each generation (Wright 1921). Inbreeding reduces the frequency of heterozygotes by a fraction *F* relative to random mating, where *F* is the probability that two alleles in the same individual are identical by descent (Hartl & Clark 1997). Therefore, offspring genotype frequencies are determined by the following equations:

$$f\mathrm{seed}X_{RR} = (f\mathrm{gam}X_{R})^{2}(1-F) + f\mathrm{gam}X_{R}\,F, \tag{2.4a}$$

$$f\mathrm{seed}X_{Rr} = 2\,f\mathrm{gam}X_{R}\,f\mathrm{gam}X_{r}\,(1-F), \tag{2.4b}$$

$$f\mathrm{seed}X_{rr} = (f\mathrm{gam}X_{r})^{2}(1-F) + f\mathrm{gam}X_{r}\,F. \tag{2.4c}$$

The frequencies, *f*gam*X*_{R} and *f*gam*X*_{r}, of each gamete in the population are simply the numbers of R and r gametes produced (equations (2.2*a*,*b*)), each divided by the total number of gametes. The total numbers of the three offspring genotypes (new*X*_{RR}, new*X*_{Rr} and new*X*_{rr}) used in equations (2.1*a*)–(2.1*c*) are determined, respectively, by multiplying each of the above functions, *f*seed*X*_{RR}, *f*seed*X*_{Rr} and *f*seed*X*_{rr}, by the total number of offspring. This completes the development of the model.
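The halving of heterozygosity under complete selfing (Wright 1921) and its link to *F* can be illustrated with a few lines of arithmetic. The following is a minimal sketch, not part of the model itself: the function names and the starting frequencies are chosen here for illustration, and *F* is computed from its standard definition as one minus the ratio of observed to Hardy–Weinberg heterozygosity.

```python
def self_one_generation(f_RR, f_Rr, f_rr):
    """Genotype frequencies after one generation of complete selfing.

    Homozygotes breed true; selfed heterozygotes yield 1/4 RR, 1/2 Rr, 1/4 rr.
    """
    return (f_RR + 0.25 * f_Rr, 0.5 * f_Rr, f_rr + 0.25 * f_Rr)


def inbreeding_coefficient(f_Rr, p_R):
    """F = 1 - H_observed / H_expected, with H_expected = 2 p (1 - p)."""
    return 1.0 - f_Rr / (2.0 * p_R * (1.0 - p_R))


# Illustrative starting point: Hardy-Weinberg proportions with p_R = 0.5.
freqs = (0.25, 0.5, 0.25)
for generation in range(4):
    p_R = freqs[0] + 0.5 * freqs[1]  # allele frequency is unchanged by selfing
    F = inbreeding_coefficient(freqs[1], p_R)
    print(generation, freqs, p_R, F)
    freqs = self_one_generation(*freqs)
```

In this sketch the R-allele frequency stays constant across generations while the heterozygote frequency is halved each time, so *F* approaches 1 under repeated complete selfing.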

## 3. Results

In order to address the two questions noted above, the model was studied at steady-state equilibrium. Where relevant, we examined the behaviour of the model for a range of model parameters to determine the robustness of the results.

### (a) Steady-state equilibrium, case 1: no pathogen present and *a*_{RR}≤*a*_{Rr}<*a*_{rr}

In the absence of a pathogen, because *a*_{RR}≤*a*_{Rr}<*a*_{rr}, the R allele disappears from the system at equilibrium. Therefore, *X*_{RR}=*X*_{Rr}=*Y*_{rr}=0, and it is easy to show that the population then consists entirely of healthy rr individuals at the equilibrium number given by equation (3.1). The loss of the R allele in the absence of a pathogen is predicted in other models with similar assumptions about the cost of resistance (e.g. Thrall & Antonovics 1995). It is important to note that in a spatially explicit metapopulation context, unlike the single population model presented here, gene flow between populations may increase the time for gene fixation or prevent fixation altogether.

### (b) Steady-state equilibrium, case 2: pathogen present and *a*_{RR}=*a*_{Rr}<*a*_{rr}

For this special case, where resistance is dominant and the cost of resistance is associated with the resistant phenotype, it is possible to solve analytically for the steady-state equilibrium by setting the right-hand sides of equations (2.1*a*)–(2.1*d*) to zero and using equations (2.2*a*,*b*), (2.3) and (2.4*a*)–(2.4*c*). Resistant individuals are maintained in the population by the presence of the pathogen. The solutions for *X*^{*}_{rr}, *Y*^{*}_{rr} and *X*^{*}_{RR}+*X*^{*}_{Rr} are given by equations (3.2*a*)–(3.2*c*). (Analytic formulae can also be found for the last two variables individually, *X*^{*}_{RR} and *X*^{*}_{Rr}, as quadratic functions of *F*; see electronic supplementary material, part C.) Note that the total population size, including infected individuals, *N*^{*}, differs between the case where the disease is present and the case where it is absent. An important conclusion is that the level of inbreeding does not affect the above equilibrium values, including *Y*^{*}_{rr}/*N*^{*}, the fraction of infected individuals in the population (figure 1*a*), because *F*, the inbreeding coefficient, does not appear in the solutions. However, although *F* does not affect the sum, *X*^{*}_{RR}+*X*^{*}_{Rr}, it does affect the individual values of *X*^{*}_{RR} and *X*^{*}_{Rr} (also see electronic supplementary material, part C). Heterozygous individuals are most numerous when the population is completely outcrossing (*F*=0), while there are none under complete selfing (*F*=1). Because the numbers of resistant, susceptible and infected individuals are the same for completely selfing and completely outcrossing populations, the frequency of the R allele is higher in selfing populations than in outcrossing populations, owing to the absence of heterozygous resistant individuals in completely selfing populations. (It is possible to extend these results to more general infection rates than *βX*_{rr}*Y*_{rr}; see electronic supplementary material, part B.)
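The final point, that the same number of resistant individuals can carry different numbers of R alleles, follows from simple allele counting. The sketch below uses hypothetical genotype counts (they are not output of the model) in which the resistant, susceptible and infected classes have identical sizes in both populations; the function name is likewise chosen here for illustration.

```python
def r_allele_frequency(x_RR, x_Rr, x_rr, y_rr):
    """Frequency of the R allele among all allele copies in a diploid population."""
    n = x_RR + x_Rr + x_rr + y_rr
    # Each RR individual carries two R copies and each Rr carries one;
    # a diploid population of n individuals holds 2n allele copies.
    return (2 * x_RR + x_Rr) / (2 * n)


# Hypothetical counts: 60 resistant, 30 susceptible and 10 infected in both cases.
selfing = r_allele_frequency(x_RR=60, x_Rr=0, x_rr=30, y_rr=10)       # F = 1
outcrossing = r_allele_frequency(x_RR=20, x_Rr=40, x_rr=30, y_rr=10)  # lower F
print(selfing, outcrossing)
```

With all resistant individuals homozygous, every resistant individual contributes two R copies, so the allele frequency is higher even though the resistant phenotype is equally common.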

The analytic solutions allow us to examine the effects of model parameters on the number of susceptible individuals in the population, *X*^{*}_{rr}, as well as the number of infected individuals, *Y*^{*}_{rr}. Increased values of *β* and decreased values of the ratio *a*_{rr}/*a*_{RR} (recall *a*_{RR}<*a*_{rr}) will decrease equilibrium numbers of infected individuals. Increased values of *α*, the additional death rate of infected individuals, lead to larger values of *X*^{*}_{rr}, because a larger value of *α* decreases the steady-state number of infected individuals in the population, as close examination of equation (3.2*b*) reveals. A similar effect occurs in a predator–prey population with a prey-dependent functional response; a higher death rate of a predator increases the equilibrium population size of its prey.

### (c) Steady-state equilibrium, case 3: pathogen present and *a*_{RR}<*a*_{Rr}<*a*_{rr}

In case 2 above, in which *a*_{Rr}=*a*_{RR}, heterozygous individuals bear the same cost of resistance as the resistant homozygote. A possible alternative is for the heterozygote reproductive rate to be intermediate between the two homozygote reproductive rates. For this case, the model could not be solved analytically at equilibrium. However, solutions were possible through numerical evaluation. We examined the effects of *F*, in combination with each parameter, on the prevalence of disease in the population and summarize the main relationships below.
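Numerical evaluation of this kind needs the right-hand sides of the model as a function of the current state. The following is a minimal sketch under explicitly assumed forms, not the authors' implementation: the gamete production terms, the placement of the density-dependent factor (1 − *ρN*) and the use of the summed gamete production as the total number of offspring are all assumptions made here, as are the names `Params`, `derivatives` and `euler_step` and every parameter value except *a*_{RR}, *a*_{Rr} and *a*_{rr}.

```python
from dataclasses import dataclass


@dataclass
class Params:
    a_RR: float   # birth rate of RR individuals
    a_Rr: float   # birth rate of Rr individuals
    a_rr: float   # birth rate of rr individuals (healthy or infected)
    b: float      # background death rate
    alpha: float  # extra death rate of infected individuals
    beta: float   # infection rate coefficient
    rho: float    # strength of density dependence
    F: float      # inbreeding coefficient, 0 to 1


def derivatives(state, p):
    """Rates of change of (X_RR, X_Rr, X_rr, Y_rr) under the assumed forms."""
    x_RR, x_Rr, x_rr, y_rr = state
    n = x_RR + x_Rr + x_rr + y_rr
    limit = 1.0 - p.rho * n  # ASSUMED logistic form of self-limitation
    # ASSUMED gamete production: heterozygotes split gametes equally.
    gam_R = (p.a_RR * x_RR + 0.5 * p.a_Rr * x_Rr) * limit
    gam_r = (0.5 * p.a_Rr * x_Rr + p.a_rr * (x_rr + y_rr)) * limit
    total = gam_R + gam_r  # ASSUMED total number of offspring
    f_R = gam_R / total if total != 0 else 0.0
    f_r = 1.0 - f_R
    # Offspring genotype frequencies with inbreeding coefficient F.
    new_RR = total * (f_R * f_R * (1.0 - p.F) + f_R * p.F)
    new_Rr = total * (2.0 * f_R * f_r * (1.0 - p.F))
    new_rr = total * (f_r * f_r * (1.0 - p.F) + f_r * p.F)
    infection = p.beta * x_rr * y_rr  # density-dependent transmission
    return (
        new_RR - p.b * x_RR,
        new_Rr - p.b * x_Rr,
        new_rr - p.b * x_rr - infection,
        infection - (p.b + p.alpha) * y_rr,
    )


def euler_step(state, p, dt):
    """Advance the state by one explicit Euler step of size dt."""
    return tuple(s + dt * d for s, d in zip(state, derivatives(state, p)))


# Illustrative values; b, alpha, beta, rho and F are not taken from the source.
params = Params(a_RR=0.7, a_Rr=0.75, a_rr=0.8, b=0.2,
                alpha=0.1, beta=0.05, rho=0.01, F=0.5)
state = (10.0, 10.0, 10.0, 1.0)
for _ in range(1000):
    state = euler_step(state, params, dt=0.01)
print(state)
```

Sweeping `F` from 0 to 1 and integrating until the state stops changing is one way to trace equilibrium curves; an adaptive integrator would be a more careful choice than fixed-step Euler.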

Consider again the example described in case 2, where the reproductive coefficients are *a*_{RR}=*a*_{Rr}=0.7 and *a*_{rr}=0.8. But now let *a*_{Rr}=0.75, meaning that the effect of resistance on reproduction is proportional to the number of resistance alleles in an individual. The first thing to note, in this case, is that the equilibria of all variables except for *X*^{*}_{rr} vary with *F* (figure 1*b*). The variable *X*^{*}_{rr}, the number of susceptibles, is held at a fixed level by top-down control by the disease parameters alone and is thus independent of both *F* and *a*_{Rr}. Any change in the rate of production of *X*_{rr}, that is, new*X*_{rr}, changes the steady-state value of *Y*^{*}_{rr}. Second, note that for *F*=1 the equilibrium values *X*^{*}_{RR}, *X*^{*}_{rr}, *Y*^{*}_{rr} and *N*^{*} are the same as the values in case 2 (*a*_{RR}=*a*_{Rr}). The reason is that *X*^{*}_{Rr}=0 when *F*=1, so all of the variables are completely independent of the value of *a*_{Rr}, and thus they are the same as if *a*_{Rr}=*a*_{RR}. Third, note that as *F* is decreased from 1, *N*^{*} increases (only slightly in this particular case, but, with other sets of parameters, decreases in *F* can result in large increases in *N*^{*}). The increase in *N*^{*} is due to the higher reproductive rate of the *X*_{Rr} relative to the *X*_{RR} in conjunction with the greater prevalence of the *X*_{Rr} as outcrossing increases. Fourth, as *F* is decreased from 1, the sum *X*^{*}_{RR}+*X*^{*}_{Rr} no longer remains constant, as it did in case 2; the increase in *X*^{*}_{Rr} slightly exceeds the decrease in *X*^{*}_{RR}. Fifth, *Y*^{*}_{rr} and *Y*^{*}_{rr}/*N*^{*} increase with decreasing *F* and are almost double in size when *F* reaches 0.
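The top-down control of the susceptible class can be seen from the balance of the infected class alone. As a sketch, assume (as described in §2) that the infected class gains individuals through density-dependent transmission at rate *βX*_{rr}*Y*_{rr} and loses them only through death at rate (*b*+*α*), with no recovery. Setting its rate of change to zero then gives:

$$\frac{\mathrm{d}Y_{rr}}{\mathrm{d}t} = \beta X_{rr}Y_{rr} - (b+\alpha)Y_{rr} = 0 \quad\Longrightarrow\quad X^{*}_{rr} = \frac{b+\alpha}{\beta} \qquad (Y^{*}_{rr} > 0).$$

Neither *F* nor any birth rate appears in this expression, which is consistent with the susceptible class being independent of both *F* and *a*_{Rr}: whenever the disease persists, changes in the supply of new susceptibles are absorbed by the number of infected individuals rather than by the number of susceptibles.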

The behaviours of *Y*^{*}_{rr} and *Y*^{*}_{rr}/*N*^{*} as functions of *F* displayed in figure 1*b* are typical of situations in which the infection rate, *β*, is relatively large. Under the condition of smaller *β*, the opposite dependence of *Y*^{*}_{rr} and *Y*^{*}_{rr}/*N*^{*} on *F* occurs, in which case these variables decrease as *F* decreases (figure 2). In this example, we also increased *α*, the increment of mortality due to infection, from 0.1 to 0.5, which contributes to the reversed slopes of *Y*^{*}_{rr} and *Y*^{*}_{rr}/*N*^{*} as functions of *F* relative to parameter sets with higher *β* and lower *α*. Now the equilibrium level of infection decreases by more than 40% as *F* decreases from 1 to 0.

Thus, depending on parameter values, *Y*^{*}_{rr} and *Y*^{*}_{rr}/*N*^{*} may increase or decrease as functions of the degree of selfing. For comparison, the equilibrium level of infection from the above two cases is plotted together with curves for additional parameter sets (figure 3*a*). In one case (curve 5 in figure 3*a*), it has a very slight peak at an intermediate value of *F*. In a case in which the reproductive rate associated with the resistance allele is half that of the non-resistant allele and the disease mortality increment (*α*) is small (e.g. curve 4 in figure 3), *Y*_{rr} is an extremely large fraction of the population when selfing is relatively high, as the R-allele is eliminated from the population. The plots of all of the variables related to curves 4–6 in figure 3*a*,*b* are appended in electronic versions (see figures E1–E3 in the electronic supplementary material).

The frequency of the R-allele in the population at equilibrium for *a*_{RR}<*a*_{Rr}<*a*_{rr} also follows a variety of patterns (figure 3*b*). We examined the relationship under the same sets of model parameters as above. In general, increased inbreeding leads to a greater frequency of the R-allele. However, when there is a high cost of carrying the R-allele, that is, when values of *a*_{RR} and *a*_{Rr} are low relative to *a*_{rr} (curves 4 and 6 in figure 3), the R-allele decreases in frequency as *F* increases from zero and may even disappear from the population. A low cost of resistance combined with a relatively high *β* leads to the highest frequencies of the R-allele at equilibrium (curves 1 and 2).

### (d) Steady-state equilibrium, case 4: pathogen present and a reproductive cost of infection

We considered the case in which infected individuals have a reproductive rate, *a*_{inf}, that is either equivalent to or less than that of the resistant individuals. The equilibrium values derived for this case were independent of *F* and decreased with decreasing *a*_{inf} (see electronic supplementary material, part A). The frequency of the R allele increases with increasing inbreeding, as in case 2.

## 4. Discussion

According to our model, when a single R allele can confer resistance and the cost of resistance (lower birth rate) is the same for the homozygote and heterozygote carrying that allele (i.e. *a*_{RR}=*a*_{Rr}, case 2), the host mating system does not affect the fraction of resistant individuals, or the prevalence of infection in a population (figure 1*a*). This is an apparently new and broad analytic result. Ecological and life-history factors such as the pathogen transmission parameter (*β*), the birth (*a*_{RR}, *a*_{Rr}, *a*_{inf} and *a*_{rr}) and death rates (*b* and *α*) and the density dependence factor (*ρ*) determine the number of susceptible and infected individuals at equilibrium. Only the frequency of the resistance allele varies with *F*. The population dynamics ‘compensate’ for disease dynamics such that resistant individuals are still born into the population at a rate high enough to maintain their existence despite the potential influence of selfing in creating a higher number of susceptible individuals in the population.

The selfing rate does not affect the equilibrium numbers of resistant or infected individuals when *a*_{RR}=*a*_{Rr} (case 2), because the RR-homozygote and the heterozygote are both resistant and have the same reproductive rate coefficients. However, when the cost of resistance differs between the RR-homozygote and the heterozygote (*a*_{RR}<*a*_{Rr}<*a*_{rr}, case 3) the situation is different, because the heterozygote has higher fitness than the RR-homozygote. For case 3, for the particular situation of complete selfing (*F*=1), the fraction of resistant individuals and the prevalence of infection, *Y*^{*}_{rr}/*N*^{*}, in a population are the same as in the case *a*_{RR}=*a*_{Rr} (case 2; compare figure 1*a*,*b*), because there are no heterozygotes at equilibrium for *F*=1, so the value of *a*_{Rr} has no influence. But, unlike case 2, in case 3 as *F* decreases from 1, both the absolute number, *Y*^{*}_{rr}, and the fraction, *Y*^{*}_{rr}/*N*^{*}, of infected individuals in the population can decrease, increase, or display unimodal behaviour, depending on the set of model parameters.

Case 3 had to be explored by computer simulations of the model, so it is hard to draw generalizations, but some patterns emerge. Increases in the level of infection with increasing *F* tend to be associated with small values of *β* and large values of *α* (hence large values of *X*^{*}_{rr}), while the reverse trend generally occurs for large values of *β* and small values of *α*. This difference appears to stem from two facts. First, *X*_{Rr} is superior to *X*_{RR} because it has a higher reproductive rate. This alters the balance of the *X*_{RR} and *X*_{Rr} populations relative to the case in which *a*_{Rr}=*a*_{RR} (compare figure 1*a*,*b*). In particular, when *a*_{Rr}=*a*_{RR}, any decrease in *F* results in an increase in *X*_{Rr} that is exactly matched by the decrease in *X*_{RR} (figure 1*a*), because they are equivalent ecologically. However, when *a*_{RR}<*a*_{Rr}, the increase in *X*_{Rr} for decreasing *F* is not matched in magnitude by the decrease in *X*_{RR}. In all examples under case 3 examined here, the increase in *X*_{Rr} exceeded the decrease in *X*_{RR}. Thus, *X*_{RR}+*X*_{Rr} increases with decreasing *F*, and there is a higher relative proportion of *X*_{Rr} (compared with *X*_{RR}) in the population for *F*<1 in the case when *a*_{RR}<*a*_{Rr} than when *a*_{Rr}=*a*_{RR}. Second, the increased size of *X*_{Rr} has a negative density-dependent effect on the reproduction of *X*_{rr} by causing *N*^{*} to increase, while the increased proportion of *X*_{Rr} in the population has a potentially positive effect on *X*_{rr} reproduction, because mating between the additional Rr-individuals and both other Rr-individuals and rr-individuals contributes to the *X*_{rr} reproductive rate. It is difficult to say *a priori* whether the negative or positive effect will dominate for a particular set of parameters, but the positive effect tends to exceed the negative effect for large values of *X*^{*}_{rr} (small *β* and large *α*) and vice versa. 
Hence our original prediction for question 1 that higher levels of endemic disease should be found in selfing than in outcrossing populations was found not to be true in general.

In question 2 (§1*b*), we asked if increasing inbreeding would lead to a greater frequency of the resistance allele in subsequent generations. We usually found a greater frequency of the resistance allele with increased inbreeding using numerical evaluations of the model (figure 3*b*). However, under certain circumstances, such as a high cost of resistance for homozygotes (i.e. low *a*_{RR}), the resistance allele may actually be present in higher frequencies in populations with more outcrossing, whereas it has smaller frequencies, or is even lost in some selfing populations (curves 4 and 6 in figure 3*b*).

For case 2, or *a*_{RR}=*a*_{Rr}, despite the higher R-frequency in a selfing population, selfing and outcrossing populations have the same phenotypic frequencies of resistant and susceptible individuals due to their differing levels of heterozygosity. For case 3, or *a*_{RR}<*a*_{Rr}<*a*_{rr}, as mentioned earlier, the total number of resistant individuals generally seems to increase with decreasing *F*, though we could not determine in general whether this number or the corresponding frequency always does so. The only study to date to examine whether the selfing rate affected the prevalence of resistance in a population was a study of two *Linum marginale*–*Melampsora lini* host–pathogen metapopulations with different outcrossing rates (Burdon *et al*. 1999). Results of this study showed that the predominantly selfing populations did not have significantly lower levels of resistance to a variety of fungal isolates than predominantly outcrossing populations.

One assumption of this model is that selfers do not experience inbreeding depression. This assumption makes the model a more conservative test of the effect of inbreeding on pathogen transmission and the frequency of resistance in a population. Inbreeding depression could cause disadvantages to selfers in that their offspring would have a generalized poor condition that could increase the probability of pathogen infection. Inbreeding depression would create an inherent advantage to outcrossing that may alter the conclusions of this model. The results of this model are still applicable to populations that do not experience significant inbreeding depression.

An important assumption that affects the results of this model is a cost of resistance to pathogen infection. Is there such a cost? The prevalence of polymorphism for resistance in natural plant populations implies that resistance must be costly; otherwise, all plants would be resistant (Parker 1992; Simms 1992). An extensive review of plants resistant to pathogens found a cost of resistance in 50% of the studies surveyed (Bergelson & Purrington 1996), although costs are difficult to measure and may occur only in certain circumstances. Another study created transgenic *Arabidopsis thaliana* that differed from control plants by possessing a single resistance gene (Tian *et al*. 2003). These plants suffered a 9% cost of resistance in the absence of disease.

In conclusion, this model links the genetic effects of host selfing to the prevalence of disease in a population and the frequency of the resistance allele. In this paper, we have considered only one epidemiological model, albeit an important one. We show in the electronic supplementary material, part B, that analytic solution is possible in a broad array of assumed infection rate functions, including dependence on *F*. When costs of resistance are associated with the number of resistance alleles an individual possesses (*a*_{RR}<*a*_{Rr}), the relationship between increased inbreeding and both the prevalence of disease and the frequency of the resistance allele is complex. Some sets of parameters lead to more disease, while others lead to less disease with increasing inbreeding. Likewise, the frequency of the resistance allele can increase or decrease with increasing inbreeding depending on other model parameters. On the other hand, although disease has direct negative effects on individual fitness through increased death rate, in the case that costs of resistance are associated with the resistance phenotype (*a*_{RR}=*a*_{Rr}), the selfing rate does not affect the proportion of resistant, susceptible, or infected individuals in a population. This surprising result exemplifies the difficulty in translating effects on individuals to a population level, an important step when considering density-dependent processes such as pathogen transmission.

## Acknowledgments

We thank Stewart Schultz and Jim Bever for discussions and ideas. Janis Antonovics was helpful during the original development of the model. We thank Jeremiah Busch, Doug Scofield and members of the Clay lab for feedback. J.M.K. was supported by the University of Miami Biology Department and the Indiana University Biology Department. D.L.D. was supported by the USGS's Florida Integrated Science Centers.

## Footnotes

The electronic supplementary material is available at http://dx.doi.org/10.1098/rspb.2006.3519 or via http://www.journals.royalsoc.ac.uk.

- Received January 25, 2006.
- Accepted February 13, 2006.

- © 2006 The Royal Society