## Abstract

Invasive species are increasingly becoming a policy priority. This has spurred researchers and managers to try to estimate the risk of invasion. Conceptually, invasions are dependent both on the receiving environment (invasibility) and on the ability to reach these new areas (propagule pressure). However, analyses of risk typically examine only one or the other. Here, we develop and apply a joint model of invasion risk that simultaneously incorporates invasibility and propagule pressure. We argue that these two elements of risk behave very differently (propagule pressure is a function of time, whereas invasibility is not) and therefore have different management implications. Further, we use the well-studied zebra mussel (*Dreissena polymorpha*) to contrast predictions made using the joint model with those made by separate invasibility and propagule pressure models. We show that predictions of invasion progress as well as of the long-term invasion pattern are strongly affected by using a joint model.

## 1. Introduction

Internationally, governments have prioritized invasive species as a key environmental concern (Millennium Ecosystem Assessment Board 2005). Invasive species affect trophic structure, cause large ecosystem changes and interact strongly with many other drivers of global environmental change. They are a leading cause of biodiversity loss (e.g. Mack *et al*. 2000; Dextrase & Mandrak 2005). Despite large potential damages, society has often been slow to take action, presumably owing to the high degree of uncertainty about if, where and when invasions will occur (Park 2004). Not surprisingly, researchers have invested considerable effort into conceptualizing the invasion process and developing methods to forecast invasions. These efforts reduce uncertainty and help us to determine where resources should be allocated.

At the conceptual level, researchers have identified several common components of species invasions (Kolar & Lodge 2001). In simple terms, species come from somewhere—a native range or a different invaded region—and get transported to new areas via vectors and pathways (e.g. ballast, wind, animals). Propagule pressure, or the number of invaders reaching a new area, has been determined to be an important predictor of invasion success (Lockwood *et al*. 2005). Once they reach a new area, invaders need to persist in their new habitat, which will depend on environmental conditions in relation to individual species characteristics. We use the term invasibility to describe these necessary environmental conditions and consider a site invasible if an invasive species can persist (i.e. survive and reproduce) at that site. If they persist, they may increase in abundance and spread, potentially causing detrimental environmental impacts.

Researchers have been building predictive models based on individual components of the conceptual model; for example, they have forecasted invasions using propagule pressure (Leung *et al*. 2004), environmental conditions (e.g. Ramcharan *et al*. 1992; Peterson 2003) and species characteristics (Rejmanek & Richardson 1996; Kolar & Lodge 2002). While there are a growing number of studies forecasting species invasions, there have been few attempts to integrate multiple components of the invasion process into a single model (but see Rouget & Richardson 2003; Herborg *et al*. 2007).

Logically, we would expect that the probability of establishment should be due jointly to propagule pressure and invasibility. Therefore, some sort of joint model would be beneficial. However, it is not clear whether the analysis of each component in isolation simply results in additional uncertainty or in different long-term predictions, nor whether we can sum or multiply the results from individual analyses (i.e. treating each as a filter) or whether an explicitly joint model is required. Despite these logical arguments, researchers generally have not examined this issue formally, and very little effort has been expended to define how components of invasion differ in their contribution to overall risk.

In this paper, we formalize a joint propagule pressure–invasibility model. We apply this joint model to an existing dataset for zebra mussels. We use a probabilistic rather than a dichotomous (i.e. invade/not invade) approach, as we believe probabilities provide the most appropriate way to model invasions. Probabilities integrate naturally into quantitative risk analyses (e.g. Leung *et al*. 2002) and explicitly acknowledge that there may be unknown interacting variables that can determine invasion success. If necessary, probabilities can easily be converted into a dichotomous response variable.

## 2. Joint invasibility–propagule pressure model

In its simplest form, the joint probability of establishment may be described by the product of the probability that a location is invasible (environmental conditions can support a population of invaders) and the risk due to propagule pressure (the number of individuals reaching a given location),

$$J_{l,t} = P(I \mid x_l)\left[1 - \prod_{i=1}^{t}\bigl(1 - P(E \mid N_{l,i})\bigr)\right], \tag{2.1}$$

where *J*_{l,t} is the joint probability of establishment at location *l* by time *t*; *P*(*I*|*x*_{l}) is the probability of being invasible (*I*), given known environmental conditions *x*_{l} at location *l*; and *P*(*E*|*N*_{l,i}) is the probability of establishment (*E*), given propagule pressure *N*_{l,i} at location *l* during time interval *i*. In this way, propagule pressure to location *l* can change over time as the invasion progresses. Following probability theory, each element is multiplied together to give the joint probability of establishment.

If propagule pressure is constant over time (*N*_{l,i} is the same for every interval *i*), equation (2.1) simplifies to

$$J_{l,t} = P(I \mid x_l)\left[1 - \bigl(1 - P(E \mid N_{l,i})\bigr)^{t}\right]. \tag{2.2}$$

The effect of propagule pressure is time dependent and eventually reaches unity if *P*(*E*|*N*_{l,i}) is non-zero. At each time interval, there is a probability *P*(*E*|*N*_{l,i}) of becoming established, determined by the propagule pressure, if a site is invasible. The complement is the probability of remaining uninvaded. If the probability at each time interval is independent, the probability of a given site remaining uninvaded decreases over time according to [1 − *P*(*E*|*N*_{l,i})]^{t} for the simpler case of equal propagule pressure over time. Thus, the probability of being invaded by time *t* is the complement of remaining uninvaded until time *t*, 1 − [1 − *P*(*E*|*N*_{l,i})]^{t}, for an invasible site (equation (2.2)). The more general form of equation (2.1) is appropriate where propagule pressures change over time.
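
The behaviour described above can be checked numerically. The following is a minimal Python sketch, not the code used in the study: the function names and the numerical values are hypothetical and chosen only for illustration. It evaluates the general form (varying propagule pressure) and the constant-pressure form, and shows the joint probability approaching, but never exceeding, the invasibility term.

```python
from math import prod


def joint_prob_general(p_invasible, p_establish_by_interval):
    """Equation (2.1): P(I|x_l) * [1 - prod_i (1 - P(E|N_{l,i}))].

    p_establish_by_interval holds one per-interval establishment
    probability for each interval i = 1..t (illustrative inputs).
    """
    p_still_uninvaded = prod(1.0 - p for p in p_establish_by_interval)
    return p_invasible * (1.0 - p_still_uninvaded)


def joint_prob_constant(p_invasible, p_establish, t):
    """Equation (2.2): the same quantity when P(E|N) is equal in every interval."""
    return p_invasible * (1.0 - (1.0 - p_establish) ** t)


# Hypothetical values: 60% chance the site is invasible,
# 10% chance of establishment per interval if it is.
p_inv, p_est = 0.6, 0.1
for t in (1, 5, 20, 100):
    # The joint probability rises with t and levels off at p_inv.
    print(t, round(joint_prob_constant(p_inv, p_est, t), 4))
```

With equal per-interval probabilities the two functions agree, and for large *t* the result is bounded by the invasibility term, which is the asymptote discussed in the text.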

We treat invasibility, *P*(*I*|*x*_{l}), as a probability. While the invasibility of an area might be dichotomous, our predictions are probabilistic because we have only measured a subset of important environmental variables. It is reasonable to expect that there may be additional environmental variables that may determine whether an area can be invaded but that have not been measured. Thus, generally, the probability that a site is invasible, given known environmental conditions, *P*(*I*|*x*_{l}), will depend upon whether *x*_{l} is suitable for survival and the frequency at which *x*_{l} coincides with suitable unknown environmental conditions (figure 1). In other words, only a fraction of sites are invasible such that *P*(*I*|*x*_{l}) behaves like an asymptote limiting the expected number of invasions under known conditions *x*_{l}.

Any number of complexities may also occur, but it is our objective to keep our points simple and clear. For instance, there may be system-specific factors that determine invasion success; however, we focus on propagule pressure and invasibility as they are arguably centrally important components of all invasions. Additionally, environmental variables may be correlated with one another, or may interact to determine whether establishment is possible (e.g. it may be the combination of pH and calcium that determines whether zebra mussels can establish; Hincks & Mackie 1997). Regardless of the specific system or relation, the key points are as follows: (i) we should use probability distributions to describe invasibility; because some relevant environmental variables may not have been measured but potentially interact with *x*_{l}, we should expect our predictions on invasibility to be uncertain (figure 1); (ii) invasibility acts as an asymptote, namely the fraction of sites that can be invaded, given *x*_{l}; as invasible sites become invaded, the remainder would be those sites that are actually uninvasible due to unmeasured environmental variables, and these will remain uninvaded regardless of propagule pressure; and (iii) the rate at which the ‘invasibility asymptote’ is reached is determined by propagule pressure. As propagule pressure increases, the probability of invasion per time interval increases. Given sufficient time, an invasible site with significant propagule pressure will eventually become invaded.

## 3. Application to zebra mussel dataset

While we believe that the logic for the importance of a joint model is clear, we need to demonstrate that importance for real-world systems. We applied the joint model to the zebra mussel dataset used in Leung *et al*. (2004). First, we needed to develop sub-models to estimate invasibility and the risk due to propagule pressure.

### (a) Invasibility sub-model

We used a neural network approach to fit a probability surface, linking invasibility to environmental variables (cf. Olden & Jackson 2002). We used a basic multilayer perceptron, containing three layers: an input layer, a middle (hidden) layer and an output layer. Each node in the input layer corresponds to one variable (e.g. pH). Each node in the middle layer allows an additional shape to be generated, following a functional form, in this case a logistic curve,

$$V_i = \frac{a_i}{1 + \exp\left[-\left(b_{i,0} + \sum_{j=1}^{m} b_{i,j}\, x_j\right)\right]}, \tag{3.1}$$

where *V*_{i} is the output from node *i* in the middle layer; *b*_{i,0}–*b*_{i,m} and *a*_{i} are coefficients for node *i*; and *x*_{1}–*x*_{m} are environmental variables.

The output layer integrates across all nodes in the middle layer to generate an overall probability of being invasible, given known environmental conditions (*P*(*I*|*x*_{l})). With multiple nodes in the middle layer, each one producing a curve in a different orientation, with different steepness and asymptote, there is great flexibility in the shapes of the probability surface that can be captured using a neural network. For our system, we had two nodes in the input layer, corresponding to pH and calcium, respectively (Ramcharan *et al*. 1992), four nodes in the middle layer and one output node providing the probability *P*(*I*|*x*_{l}). This allowed the generation of virtually any unimodal probability surface.
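
The architecture described above (two inputs, four middle nodes, one output) can be sketched as a forward pass in plain Python. This is an illustration only, under stated assumptions: the text does not specify how the output node combines the middle-layer outputs, so a logistic of their weighted sum is assumed here, and every parameter value below is hypothetical and untrained.

```python
import math


def logistic(z):
    """Standard logistic function, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def hidden_node(x, b, a):
    """Scaled logistic for one middle-layer node.

    b[0] is the intercept, b[1:] are the weights on the environmental
    variables x, and a is the node's asymptote.
    """
    z = b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))
    return a * logistic(z)


def invasibility(x, hidden_params, out_weights, out_bias):
    """Estimate of P(I|x) from the middle-layer outputs.

    ASSUMPTION: the output node is a logistic of a weighted sum of the
    middle-layer outputs; the source does not state this form.
    """
    v = [hidden_node(x, b, a) for b, a in hidden_params]
    return logistic(out_bias + sum(w * vi for w, vi in zip(out_weights, v)))


# Hypothetical, untrained parameters for four middle nodes: (b, a) pairs.
hidden_params = [
    ([-20.0, 2.5, 0.0], 1.0),   # rises with pH
    ([30.0, -3.0, 0.0], 1.0),   # falls with pH
    ([-2.0, 0.0, 0.1], 1.0),    # rises with calcium
    ([5.0, 0.0, -0.05], 1.0),   # falls with calcium
]
out_weights = [2.0, 2.0, 2.0, 2.0]
out_bias = -6.0

# x = [pH, calcium]; the values are illustrative.
p = invasibility([8.0, 30.0], hidden_params, out_weights, out_bias)
print(round(p, 3))
```

Pairing rising and falling curves on the same variable is one way such a network can produce a unimodal surface; in practice the parameters would be estimated from data rather than set by hand.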

### (b) Propagule pressure sub-model

Next, we needed to estimate propagule pressure and then link that estimate to the risk of establishment—*P*(*E*|*N*_{l,t}). Counting the actual number of viable propagules introduced into each of thousands of lakes would be impossible. However, as with invasibility, models are useful for providing indices of propagule pressure, which could then be fed into our model. To estimate the risk due to propagule pressure, we built upon published work using Leung *et al*. (2004) as our starting point. Specifically, we had information on zebra mussel invasions that occurred between 1992 and 2001. Further, we knew that recreational boaters were the primary vector, carrying zebra mussels from invaded to uninvaded lakes (Johnson *et al*. 2001). We used a production-constrained gravity model to estimate boater movement patterns and assumed that propagule pressure was proportional to boater traffic from invaded to uninvaded lakes (developed fully in Leung *et al*. (2004, 2006)). Thus, we obtained relative propagule pressure estimates (*N*_{l,t}) for each year from 1992 to 2001. This allowed us to incorporate changes in propagule pressure as the invasion progressed and more lakes became invaded and acted as potential sources of propagules. While we built upon approaches that we had previously developed, we note that any method that provides quantitative estimates of propagule pressure could be used in our model (through *N*_{l,t} in equation (3.2)), and that there are numerous predictors that might aid in developing those estimates (e.g. distance, boater populations, lake size, Bossenbroek *et al*. 2001; spatial heterogeneity, Kumar *et al*. 2006; ballast water discharge, Herborg *et al*. 2007).

Following Leung *et al*. (2004), we used a Weibull function to link propagule pressure to the probability of establishment for an invasible site (see also Dennis 2002),

$$P(E \mid N_{l,t}) = 1 - \exp\left[-\left(\alpha N_{l,t}\right)^{c}\right], \tag{3.2}$$

where *N*_{l,t} is propagule pressure during time *t* at location *l* and *α* and *c* are shape parameters. Proportional estimates of propagule pressure (based on boater traffic) would be sufficient because the proportionality constant would be integrated into the fit parameter *α*.
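
The remark about proportional estimates can be illustrated with a short Python sketch. It assumes a Weibull form of 1 − exp[−(*αN*)^{c}]; the function name and all numerical values are hypothetical. Rescaling propagule pressure by a constant while dividing *α* by the same constant leaves the establishment probability unchanged, which is why a relative index of propagule pressure is sufficient.

```python
import math


def p_establish(n, alpha, c):
    """Weibull link from propagule pressure n to establishment probability.

    ASSUMPTION: the form 1 - exp(-(alpha * n) ** c) is used here;
    alpha and c are the shape parameters named in the text.
    """
    return 1.0 - math.exp(-((alpha * n) ** c))


# Hypothetical parameter values.
alpha, c = 0.5, 1.5

# No propagules gives zero probability; more propagules give higher probability.
for n in (0.0, 0.5, 1.0, 2.0, 5.0):
    print(n, round(p_establish(n, alpha, c), 4))

# Measuring propagule pressure in units 10 times larger is absorbed by alpha / 10.
same = p_establish(10.0, alpha / 10.0, c)
print(abs(same - p_establish(1.0, alpha, c)) < 1e-12)
```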

### (c) Joint model

To build the joint model, we needed to simultaneously integrate our invasibility estimate with our estimate of probability of invasion due to propagule pressure. We had explicit information specifying when invasions occurred and this allowed us to build a more refined model in comparison with the basic formulations described in equations (2.1) and (2.2). Here, we define *H*_{l} as the joint probability of an observation: location *l* becoming invaded during time *t* or remaining uninvaded for the entire duration (*T*) of the study. The joint probability that location *l* becomes invaded at time *t* is given by the probability that it is invasible (*P*(*I*|*x*_{l})) and the probability that it has remained uninvaded for each time interval *i* up to time *t*−1, given propagule pressure (*N*_{l,i}), but becomes invaded during time interval *t*, given propagule pressure (*N*_{l,t}),

$$H_l = P(I \mid x_l)\, P(E \mid N_{l,t}) \prod_{i=1}^{t-1}\bigl[1 - P(E \mid N_{l,i})\bigr]. \tag{3.3}$$

If we consider only the propagule pressure model, *P*(*I*|*x*_{l}) is omitted from equation (3.3), and if we consider only the invasibility model, only *P*(*I*|*x*_{l}) is included in equation (3.3).

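Locations that do not become invaded for the entire duration of the study (*T*) can either be uninvaded because they are uninvasible or because there has not been sufficient propagule pressure to become invaded,

$$H_l = \bigl[1 - P(I \mid x_l)\bigr] + P(I \mid x_l) \prod_{i=1}^{T}\bigl[1 - P(E \mid N_{l,i})\bigr], \tag{3.4}$$

or equivalently

$$H_l = 1 - P(I \mid x_l)\left[1 - \prod_{i=1}^{T}\bigl(1 - P(E \mid N_{l,i})\bigr)\right]. \tag{3.5}$$

If we consider only the propagule pressure model, equation (3.4) would be $\prod_{i=1}^{T}[1 - P(E \mid N_{l,i})]$. If we consider only the invasibility model, equation (3.4) would be $1 - P(I \mid x_l)$.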

The log likelihood (*L*) for the entire dataset (*D*) for a given model (*M*) of invaded and uninvaded locations is

$$L(D \mid M) = \sum_{l} \ln H_l. \tag{3.6}$$

Maximum-likelihood techniques were used to find the parameter values (needed for equations (3.1) and (3.2)) that best fit the data, for each model: the invasibility model, the propagule pressure model and the joint model.
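
The likelihood construction for the joint model can be sketched in Python as follows. This is a minimal illustration rather than the fitting code used in the study: the function names, the tuple layout of an observation and the numerical values are all hypothetical, and the invasibility and establishment probabilities are passed in directly instead of being computed from the sub-models.

```python
import math


def prob_observation(p_invasible, p_establish, invaded_at=None):
    """Joint probability H_l of one lake's observed history.

    p_establish: per-interval establishment probabilities for i = 1..T.
    invaded_at:  1-based interval t in which the lake became invaded,
                 or None if it stayed uninvaded for the whole study.
    """
    if invaded_at is None:
        # Uninvaded: either uninvasible, or invasible but never established.
        never_established = math.prod(1.0 - p for p in p_establish)
        return (1.0 - p_invasible) + p_invasible * never_established
    # Invaded at t: invasible, escaped intervals 1..t-1, established in t.
    escaped_before = math.prod(1.0 - p for p in p_establish[:invaded_at - 1])
    return p_invasible * escaped_before * p_establish[invaded_at - 1]


def log_likelihood(observations):
    """Sum of ln(H_l) over all lakes in the dataset."""
    return sum(math.log(prob_observation(*obs)) for obs in observations)


# Hypothetical dataset of three lakes: (p_invasible, p_establish, invaded_at).
lakes = [
    (0.7, [0.1, 0.2, 0.3], 2),     # invaded in the second interval
    (0.7, [0.1, 0.2, 0.3], None),  # never invaded during the study
    (0.2, [0.4, 0.4, 0.4], None),  # high pressure but probably uninvasible
]
print(round(log_likelihood(lakes), 4))
```

A useful sanity check on this formulation is that, for any one lake, the probabilities of being invaded in each interval plus the probability of remaining uninvaded sum to one. In a fitting routine, `log_likelihood` would be maximized over the sub-model parameters with a numerical optimizer.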

## 4. Forecasting invasion probabilities

Using the zebra mussel dataset, we examined whether model projections of invasibility and estimates of risk due to propagule pressure differed by using the joint model. Specifically, for invasibility, we compared model projections of *P*(*I*|*x*_{l}) across all *x*_{l} observed in the dataset for the joint model and the invasibility model. For propagule pressure, we compared model projections of *P*(*E*|*N*_{l,t}) across all *N*_{l,t} for the joint model and the propagule pressure model.

Next, we compared predictions from the models to observed invasions. We used the zebra mussel data from 1992 to 1996 to parameterize the models and the data from 1997 to 2001 as our validation set to test the predictions. In the absence of any predictive model, we began with a ‘null’ model, which was essentially the fraction of lakes that became invaded multiplied by the number of lakes examined. We compared the null model, the invasibility sub-model, the propagule pressure sub-model and the joint model. For each predictive model, we ranked each lake in terms of its relative risk of becoming invaded. As our comparison metric, we used the top 100 ranked lakes for each model. We compared model predictions to the observations, i.e. how many of the 100 lakes predicted to be at high risk were actually observed to become invaded using our validation dataset from 1997 to 2001.
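
The ranking comparison described above can be sketched in Python. The lake identifiers, risk scores and function names below are hypothetical toy values, not data from the study; the null expectation is computed as the text describes it, i.e. the invaded fraction multiplied by the number of lakes examined.

```python
def hits_in_top_k(risk_by_lake, invaded_lakes, k=100):
    """Count how many of the k highest-risk lakes were observed invaded."""
    ranked = sorted(risk_by_lake, key=risk_by_lake.get, reverse=True)
    return sum(1 for lake in ranked[:k] if lake in invaded_lakes)


def null_expectation(n_invaded, n_lakes, k=100):
    """Expected hits if k lakes were picked with no predictive model."""
    return (n_invaded / n_lakes) * k


# Toy example with six lakes and a top-3 cut-off.
risk = {"A": 0.90, "B": 0.75, "C": 0.40, "D": 0.35, "E": 0.10, "F": 0.05}
invaded = {"A", "C", "E"}

print(hits_in_top_k(risk, invaded, k=3))
print(null_expectation(len(invaded), len(risk), k=3))
```

Comparing the hit count of each model with the null expectation gives the kind of relative improvement reported for the validation period.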

## 5. Results

Using the zebra mussel dataset, we examined the forecasts of the joint model and of each sub-model (invasibility and probability of establishment due to propagule pressure) individually. The projected estimates of invasibility were substantially higher using the joint model compared with the invasibility sub-model (figure 2*a*). Thus, over the long term, the fraction of sites that were predicted to become invaded by zebra mussels differed dramatically by using the joint model. Similarly, the estimated relation between probability of establishment and propagule pressure was steeper for the joint model compared with the propagule pressure sub-model (figure 2*b*). Thus, the projected rate at which lakes become invaded also differed, up to the asymptote defined by invasibility. For an invasible lake, smaller numbers of propagules were predicted to be necessary to achieve a given probability of invasion for the joint model. In short, using the joint model changed both the trajectory and the long-term expectation of invasion pattern and progress.

The models also differed in their ability to identify which lakes would become invaded by zebra mussels in the validation dataset (1997–2001). All predictive models provided improvements compared with the null model: with the invasibility sub-model, we identified twice the number of lakes that became invaded compared with the null model; with the propagule pressure model, we identified twice as many as the invasibility model; and with the joint model, we identified two-and-a-half times as many as the invasibility model (figure 2*c*).

## 6. Discussion

Recently, there has been an increasing number of papers attempting to predict species invasions (e.g. Peterson 2003; Muirhead & MacIsaac 2005). We believe that these works are highly valuable and will allow us to better understand where invasions are likely to occur and to better focus our management efforts. Here, we took the next step and formalized the construction of a joint model that integrated propagule pressure and invasibility. Such integration is important, as the results of this study made evident (figure 2). Logically, if we considered only invasibility, the potential extent of the invasion would be underestimated because we would not have incorporated the fact that some areas may be uninvaded simply because they have not had enough time for invasions to occur, rather than having unsuitable environments (figure 2*a*). Conversely, propagule pressure is only relevant for sites that are invasible. If we considered only the propagule pressure model, the effect of propagule pressure on the probability of establishment in invasible sites would be underestimated since our statistical estimate would be biased downwards by non-invasible sites (figure 2*b*). Over the long term, for models that consider only propagule pressure, we would predict that all sites would eventually be invaded, given enough time and a non-zero probability *P*(*E*|*N*_{l,t}) (equation (2.2)), because all sites would be treated as invasible. This would probably be false. However, where the data simply do not exist to build a joint model, the sub-models still offer improved predictability—we should always use the best information available. Nevertheless, where possible, a joint model is arguably most beneficial to get the most reasonable predictions of invasion progress over time and determine what management actions are justifiable.

The corollary of the above is that with a joint model it becomes clearer how the relative importance of invasibility versus propagule pressure changes with time and the stage of invasion (Karst *et al*. 2005). If invasion is in its early stages, the dynamics will be largely driven by propagule pressure, such as in this study (*ca* 10% of sites invaded). If the invasion is far progressed, propagule pressure should no longer be predictive and invasion status should primarily be driven by invasibility—all sites could have had sufficient propagule pressure for invasions to occur. Thus, using techniques that incorporate only invasibility (e.g. GARP, Peterson 2003) to predict invasions may be effective using an invader's native range, under the assumption that adequate propagule pressures have occurred such that most potentially invasible areas have been invaded. However, in the new range, treating observed absences as uninvasible may be unwarranted as there might have been little propagule pressure to those areas. An explicitly joint model does not suffer from this limitation and is consistent regardless of the stage of invasion. In fact, these could be treated as testable hypotheses in other systems: propagule pressure is more important early in an invasion; invasibility is more important later in an invasion; and the joint model is always appropriate (derived from equations (2.1) and (2.2)).

Further, we believe that the appropriate way to analyse invasions is to explicitly use probabilities rather than an invasible/not invasible dichotomy. If we accept that there are typically unmeasured environmental variables that might be needed for persistence of a species, a fraction of sites should be uninvasible even when known environmental conditions appear suitable. The probabilities will be determined by the overlap of the known and unknown environmental variables (figure 1). Probabilities also fit naturally into quantitative risk analyses, which, in our opinion, are the most coherent framework for decision making.

There is interest in probabilistic risk analyses in government as well as academia (Lodge *et al*. 2006). Thus, a joint model, expressed in probabilities, has strong ramifications for decision making. At the conceptual level, we need to explicitly acknowledge that the risks due to propagule pressure and invasibility have different behaviours—risk due to propagule pressure is time dependent whereas invasibility may not be. Given that most management actions are based on trying to reduce propagule pressure, management is implicitly concerned with slowing invasions, assuming that propagule pressure is not reduced to zero (e.g. ballast exchange, Drake *et al*. 2005; Minton *et al*. 2005). That is not to imply that management actions are not important. Indeed, explicit cost–benefit analyses suggest that slowing invasions can be very worthwhile (Leung *et al*. 2002).

In conclusion, we recommend that models integrating invasibility and propagule pressure in a probabilistic manner should be adopted where possible (if not, sub-models should still be used as they still offer benefits). Integrated models aid in the conceptualization of the invasion process, permit coherent quantitative predictions of invasion progress over time and have large management implications. While our case study was developed for aquatic systems, the general principles and logic behind joint models should be applicable to terrestrial and other systems as well. The next challenge will be to create forecasting models that incorporate system changes, for example, due to species evolution (Peterson 2003), environmental change (e.g. global warming, Peterson 2003; Neilson *et al*. 2005), introduction of other invasive species (Mack *et al*. 2000) and changing human behaviours (Leung *et al*. 2002; Herborg *et al*. 2007).

## Acknowledgments

This work has been supported by NSERC and Fisheries and Oceans, Canada. We thank two anonymous reviewers and E. Gertzen, A. Irwin, A. Hyder, P. Edwards and S. Kulhanek for their helpful comments.

## Footnotes

- Received June 21, 2007.
- Accepted July 20, 2007.

- © 2007 The Royal Society