## Abstract

The epidemic dynamics of infectious diseases vary among cities, but it is unclear how this is caused by patterns of infectious contact among individuals. Here, we ask whether systematic differences in human mobility patterns are sufficient to cause inter-city variation in epidemic dynamics for infectious diseases spread by casual contact between hosts. We analyse census data on the mobility patterns of every full-time worker in 48 Canadian cities, finding a power-law relationship between population size and the level of organization in mobility patterns, where in larger cities, a greater fraction of workers travel to work in a few focal locations. Similarly sized cities also vary in the level of organization in their mobility patterns, equivalent on average to the variation expected from a 2.64-fold change in population size. Systematic variation in mobility patterns is sufficient to cause significant differences among cities in infectious disease dynamics—even among cities of the same size—according to an individual-based model of airborne pathogen transmission parametrized with the mobility data. This suggests that differences among cities in host contact patterns are sufficient to drive differences in infectious disease dynamics and provides a framework for testing the effects of host mobility patterns in city-level disease data.

## 1. Introduction

Infectious diseases cause morbidity in most humans each year [1] and account for a significant portion of all yearly human mortality [2]. As the global urbanization rate continues to rise past 50%, cities will more often act as focal points for epidemics: they provide venues where strangers are more likely to interact, represent the most probable locations of first detection and incur a greater share of the casualties. Given cities’ pivotal role in the spread of infectious disease, it is important to understand why they exhibit systematic variation in the timing and severity of epidemics [3–5].

Human mobility patterns generate the proximity between individuals prerequisite for the transmission of many infectious diseases. This suggests that cities with different mobility patterns may also differ in the rate at which their inhabitants have infectious contact, leading to variation among cities in the risk of an epidemic [5–7]. Human movement patterns are heterogeneous at a wide range of scales—from within a building [8] to among countries [9–11], as evidenced by diverse sources of data, including the movements of mobile phone users [12,13], air travel patterns [9–11] and census data on commuting patterns [10,14,15]. At each scale, there appear collective mobility patterns maintained far from those predicted by homogeneous random movement. These dissipative structures [16] have the potential to create localized areas where infectious contact rates are systematically amplified [17]. Empirically reconstructed contact networks suggest systematic variation in infectious contact rates across countries, age groups and other sociodemographic factors [18].

Individual variation in rates of infectious contact can significantly alter patterns of disease spread [6,7,15,19–21] and theoretical models of disease dynamics within and among cities (both individual-based simulations [5,7,10,15,22–24] and metapopulation models [10,11,14,15,25–27]) have shown that heterogeneous contact patterns are potentially important in determining urban epidemic dynamics. However, few studies have examined whether empirical variation in intra-city mobility patterns is sufficient to drive detectable differences in epidemic dynamics among cities.

Here, we use 2006 census data on the mobility patterns of 7 225 810 workers in 48 cities to test whether cities differ enough in their mobility patterns to generate differences in their risk of an epidemic. In the first part of the paper, we quantify differences in mobility patterns among cities, using heterogeneity statistics and transportation models to examine whether cities vary systematically in the level of organization in their mobility patterns. In the second part of the paper, we use the commuting data to parametrize a basic model for the spread of an airborne pathogen in each respective city, to test whether the observed differences among cities in mobility patterns are sufficient to generate significant differences among cities in the risk of an epidemic.

## 2. Material and methods

We analysed data from the 2006 Canadian census [28] on the commuting patterns of every worker in 48 Canadian cities, a total of 7 225 810 individuals (see the electronic supplementary material, table S1). The data for each city are organized by census tract (CT) and record the number of workers who live in CT $i$ and work in CT $j$, denoted $T_{ij}$. CTs are contiguous geographical areas where 2500–8000 people reside and are typically the same area as a few city blocks. While the geographical area of a CT is influenced by residential population density, the boundaries of CTs are chosen without regard to the number of individuals who travel to work there. The number of workers who reside in CT $i$ is $\sum_j T_{ij}$. Let $m_j = \sum_i T_{ij}$ represent the number of workers who travel to work in CT $j$. Let $\bar{m}$ and $\sigma_m$ represent the mean and standard deviation of $m_j$ in a given city. The total number of workers in a city is $\sum_{i,j} T_{ij}$, and $N$ is the total population of the city across all CTs that have any workers. Lloyd's ‘mean crowding’ statistic [29], $m^* = \bar{m} + \sigma_m^2/\bar{m} - 1$, measures worker density from the perspective of workers in their workplaces: in a given city, $m^*$ is the average number of other individuals who work in the same CT as a worker chosen at random, whereas $\bar{m}$ is the average worker density in a CT chosen at random.
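The crowding statistic can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `mean_crowding` is hypothetical, the commuting matrix is assumed to be a dense array with rows as home CTs and columns as work CTs, and the variance is taken as the population variance (`ddof=0`), which makes the result equal to the average number of co-workers in the same CT as a randomly chosen worker.

```python
import numpy as np

def mean_crowding(T):
    """Return (m_bar, m_star) for a commuting matrix T[i, j].

    T[i, j] is the number of workers who live in CT i and work in CT j.
    """
    T = np.asarray(T, dtype=float)
    m = T.sum(axis=0)        # workers who travel to work in each CT
    m_bar = m.mean()         # average worker density in a random CT
    var = m.var()            # population variance (ddof=0); an assumption
    m_star = m_bar + var / m_bar - 1.0   # Lloyd's mean crowding
    return m_bar, m_star

# Toy city: every worker commutes to the first of two CTs.
m_bar, m_star = mean_crowding([[10, 0], [10, 0]])
print(m_bar, m_star)
```

In the toy city, the 20 workers all share one workplace CT, so each has 19 co-workers there even though the average CT holds only 10 workers. That gap between the two quantities is what the statistic is meant to capture.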

To further characterize inter-city differences in human mobility patterns, we compared how well the patterns for each city could be explained by alternative transportation models [30]. These models respectively describe processes that lead to different degrees of organization in human mobility. The models took the form $\langle T_{ij} \rangle = p_{ij} \sum_k T_{ik}$, where $p_{ij}$ is the probability that an individual who resides in CT $i$ will travel to work in CT $j$, and $\langle T_{ij} \rangle$ denotes the expected number of commuters between $i$ and $j$ under the transportation model specified by $p_{ij}$. We first used a configuration model [19] for the inter-CT commuting network in each city, which makes the neutral prediction that the probability that a worker who resides in $i$ will work in $j$ is proportional to the total number of individual workplaces in $j$. The configuration model is closely related to the gravity model from transportation analysis, whose suitability for analysing mobility data is under debate ([30]; see the electronic supplementary material). The second model we examined was the newly described radiation model [30] of transportation patterns, which modulates flow between two locations by accounting for the number of potential destinations in the area between them. Both the configuration and radiation model are parameter free.
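The two parameter-free models can be illustrated with a short Python sketch. It is not the authors' implementation. The function names are hypothetical, the radiation formula is the standard published form from [30], and the choice of which quantity serves as the "mass" of a location (for example resident population or number of workplaces) is left to the caller as an assumption. The radiation sketch also sets within-CT flows to zero, which the commuting data do not.

```python
import numpy as np

def configuration_expected(T):
    """Expected flows when p_ij is proportional to workplaces in CT j."""
    T = np.asarray(T, dtype=float)
    residents = T.sum(axis=1)            # workers living in each CT
    workplaces = T.sum(axis=0)           # workers employed in each CT
    p = workplaces / workplaces.sum()    # same destination odds for every origin
    return np.outer(residents, p)

def radiation_expected(outflow, mass, dist):
    """Expected flows under the radiation model.

    outflow[i]: commuters leaving location i
    mass[i]:    size of location i (assumed measure, chosen by the caller)
    dist[i, j]: distance between locations i and j
    """
    outflow = np.asarray(outflow, dtype=float)
    mass = np.asarray(mass, dtype=float)
    dist = np.asarray(dist, dtype=float)
    n = len(outflow)
    expected = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue                 # within-CT flows are not modelled here
            # Mass of locations at least as close to i as j is, excluding i and j.
            between = dist[i] <= dist[i, j]
            between[i] = False
            between[j] = False
            s = mass[between].sum()
            denom = (mass[i] + s) * (mass[i] + mass[j] + s)
            if denom > 0:
                expected[i, j] = outflow[i] * mass[i] * mass[j] / denom
    return expected

# Three equally sized locations on a line at positions 0, 1 and 2.
x = np.array([0.0, 1.0, 2.0])
dist = np.abs(x[:, None] - x[None, :])
print(configuration_expected([[2, 2], [0, 4]]))
print(radiation_expected([6, 0, 0], [1, 1, 1], dist))
```

The configuration model sends the same share of every origin's workers to a given destination, whereas the radiation model sends fewer commuters to a destination when other locations lie between it and the origin.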

We asked if the systematic differences in mobility patterns we discovered were sufficient to cause differences in epidemic dynamics among cities by using the commuting data to parametrize a stochastic, spatially explicit, individual-based model of airborne pathogen transmission for each city (see the electronic supplementary material). Epidemic dynamics that result from home–work movements can also be modelled using recently developed metapopulation frameworks [15,25–27,31]. These models aggregate individual behaviour to consider host mobility and disease spread patterns between subpopulations. Accordingly, in the electronic supplementary material, we explore the consequences of relaxing correlations arising from the preservation of individual identities in our model.

To implement the model, we first used the commuting data to estimate the frequency of contact between each possible pair of workers in a city. We then translated contact frequency into pairwise transmission hazard using a basic model of within-host pathogen dynamics for acute infections. We did this for a range of pathogen transmissibilities (here pathogen transmissibility, *λ*, expresses the strain-specific ratio between within-host pathogen load and transmission hazard; log_{10}(*λ*)∈{0, 0.25, 0.5, 0.75, 1}). Our model makes two assumptions: first, we assume that the spatial trajectories of humans in cities can be predicted by their home and workplace locations, which is supported by recent analyses of high-resolution data on the relocation patterns of mobile phone users [13]; second, we assume that excursions from an individual's bed or work station are governed by a stochastic process that is identically distributed across cities. This leads to transmission patterns that conform to mass action within the radius of motion of an individual but are determined by the commuting data at the scale of a city. Although in reality, cities are connected by inter-city commuting, these connections are relatively weak compared with intra-city commuting, and we are testing the prediction that cities can have differences in epidemic dynamics generated endogenously by intra-city mobility patterns. Thus, we model each city separately. If inter-city variability in commuting patterns is sufficient to generate differences among cities in epidemic risk, our model of transmission should predict different disease dynamics in different cities for the same pathogen transmissibility.
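The overall logic of the simulation experiment can be sketched with a much simpler stand-in model. The code below is not the model described in the electronic supplementary material: it is a discrete-time stochastic SIR process in which each worker is exposed to infectious individuals who share a home CT or a work CT, with mass action inside each CT. The parameter `beta` is a hypothetical per-contact hazard that stands in for pathogen transmissibility, and `recovery_prob`, `threshold` and all function names are assumptions made for illustration.

```python
import numpy as np

def workers_from_matrix(T):
    """Expand T[i, j] into one (home CT, work CT) pair per worker."""
    T = np.asarray(T, dtype=int)
    i_idx, j_idx = np.nonzero(T)
    counts = T[i_idx, j_idx]
    return np.repeat(i_idx, counts), np.repeat(j_idx, counts)

def simulate_final_size(home, work, beta, rng, recovery_prob=0.5, max_steps=1000):
    """Run one outbreak from a random index case; return the number ever infected."""
    n = len(home)
    n_cts = int(max(home.max(), work.max())) + 1
    state = np.zeros(n, dtype=int)       # 0 susceptible, 1 infectious, 2 recovered
    state[rng.integers(n)] = 1
    for _ in range(max_steps):
        infectious = state == 1
        if not infectious.any():
            break
        inf_home = np.bincount(home[infectious], minlength=n_cts)
        inf_work = np.bincount(work[infectious], minlength=n_cts)
        hazard = beta * (inf_home[home] + inf_work[work])
        new_inf = (state == 0) & (rng.random(n) < 1.0 - np.exp(-hazard))
        recover = infectious & (rng.random(n) < recovery_prob)
        state[new_inf] = 1
        state[recover] = 2
    return int((state != 0).sum())

def epidemic_probability(T, beta, n_runs=200, threshold=0.05, seed=0):
    """Fraction of runs in which more than `threshold` of workers are infected."""
    rng = np.random.default_rng(seed)
    home, work = workers_from_matrix(T)
    sizes = np.array([simulate_final_size(home, work, beta, rng)
                      for _ in range(n_runs)])
    return float((sizes > threshold * len(home)).mean())

T = np.array([[5, 5], [0, 10]])
print(epidemic_probability(T, beta=0.05))
```

Running `epidemic_probability` on the commuting matrices of different cities with the same `beta` mirrors the comparison in the text: any difference in the estimated probability then comes from commuting structure alone.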

## 3. Results

The structure of the commuting matrix, *T _{ij}*, varies markedly both within and among cities (figure 1). A striking feature in these visualizations is the appearance of star shapes in some cities. Star shapes appear where commuting flows originating from many distinct CTs are directed towards a single centralized work location. The crowding statistic *m** can be used as a measure of the prevalence of star-shaped commuting patterns in a city, because its value increases with the level of aggregation in individual workplace locations [29].

We find that the average number of workers per CT, $\bar{m}$, saturates rapidly as *N* increases. By contrast, *m**, which then measures overdispersion in individual workplace locations, exhibits a strong positive correlation with *N*. This indicates that workers in larger cities tend to organize into a few extraordinarily populated work areas, while maintaining the same average number of workstations per CT as smaller cities (figure 2*a*). In other words, the prevalence of star-shaped commuting flows in a city is only weakly correlated with the average number of people who work in a CT, but nonetheless varies strongly with total population size, leading commuting patterns in larger cities to be more highly organized around a few focal work locations.

Cities also show marked size-independent variation in *m**, evident in the ratio of *m** to the value predicted by a regression of *m** as a function of *N* (see the electronic supplementary material, table S2). In the 48 cities we analysed, this ratio (hereafter ‘excess heterogeneity in mobility patterns’) ranged from 0.43 to 3.07, a 7.14-fold difference. For two cities chosen at random, the average ratio between the larger and smaller values of excess heterogeneity is 1.55. This size-independent difference in mobility patterns is equivalent to the predicted size-dependent difference (based on the regression line in figure 2*a*) that would result from a 2.64-fold change in population size. Thus, differences unrelated to population size are an important component of the variation in worker mobility patterns between cities.
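The calculation behind excess heterogeneity can be sketched as follows. The sketch assumes the regression is an ordinary least-squares fit of log *m** on log *N* (a pure power law), which the text suggests but does not spell out, and the function names are hypothetical. Under that assumption, a ratio *r* in excess heterogeneity matches the effect of an *r*^(1/*b*)-fold change in population size, where *b* is the fitted exponent. The reported pair of 1.55 and 2.64 would then imply an exponent of roughly 0.45, which is an inference from those two numbers and not a value stated in the text.

```python
import numpy as np

def excess_heterogeneity(N, m_star):
    """Return (observed / predicted m*, fitted exponent) from a log-log fit."""
    N = np.asarray(N, dtype=float)
    m_star = np.asarray(m_star, dtype=float)
    slope, intercept = np.polyfit(np.log(N), np.log(m_star), 1)
    predicted = np.exp(intercept + slope * np.log(N))
    return m_star / predicted, slope

def equivalent_size_change(ratio, slope):
    """Fold change in N that would shift the predicted m* by `ratio`."""
    return ratio ** (1.0 / slope)

# Synthetic cities lying exactly on m* = 2 * N**0.5, so every excess is 1.
N = [1e4, 1e5, 1e6, 1e7]
m_star = [2 * n ** 0.5 for n in N]
excess, slope = excess_heterogeneity(N, m_star)
print(excess, slope, equivalent_size_change(1.55, slope))
```

With real data, cities above the regression line have excess values greater than 1, and `equivalent_size_change` expresses a typical between-city ratio in units of population size.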

The configuration model explains much of the variation in commuting flow in small cities, but its performance decreases systematically with *N*, indicating that larger cities have increasingly highly organized mobility patterns (figure 2*b*). The fit of the gravity model is poorer than that of the configuration model, but it exhibits the same trend of fitting smaller cities better than larger cities (see the electronic supplementary material). The radiation model also shows systematic variation in performance: as the fit of the configuration model declines, the performance of the radiation model increases, performing relatively poorly in small cities and better in larger ones (figure 2*b*).

Cities with more organized commuting patterns (meaning a larger value of *m**) are predicted to have a higher probability (*P*) of an epidemic following the introduction of a single randomly chosen infected individual (figure 3*a*; electronic supplementary material, table S3). This relationship persists once the effects of population size on *m** and *P* are removed: among cities of the same size, increased excess heterogeneity in mobility patterns is predicted to cause a significant increase in the risk of an epidemic, relative to the average risk for a city of that size (figure 3*b*). The change in relative risk produced by increasing excess heterogeneity in mobility patterns is greater for pathogens with lower transmissibility. Less contagious pathogens also show more variability in relative risk among cities.

The predicted number of individuals infected by the end of an epidemic, *F*, also scales with the level of organization in commuting patterns (figure 3*c*). As with the probability of an epidemic, the influence of organized host mobility patterns on *F* is still significant once the effect of population size is removed (by considering the effect of excess heterogeneity in mobility patterns on excess infected; figure 3*d*). In summary, the simulations show that extant differences among cities in the level of organization in human mobility patterns are sufficient to significantly alter the risk and severity of an epidemic among cities. The average magnitude and variability of this effect depend on pathogen transmissibility.

## 4. Discussion

Larger cities depend on higher levels of organization that increase economies of scale [32,33]. Here, we show that increasing organization in cities may also have important consequences for the spread of infectious disease. Whereas epidemic models have typically assumed that human populations are identically mixed for the purposes of infectious contact, our results add to an increasing body of empirical evidence that infectious contact rates in humans vary systematically among populations [7,18,34]. Correspondingly, heterogeneities in human mobility patterns can explain more of the variability in regional epidemic data than analyses which posit identically mixed host populations [14,35,36], and recent metapopulation models of disease spread have been developed to describe recurrent host movements and retain information on individual identity [15,37,38].

An important direction for future work lies in understanding what signals mobility patterns leave in city-level epidemic data when other important factors are integrated into the analysis, such as inter-city variation in age distribution, immune history or movements within and between cities not accounted for by the commuting data. For instance, we simulated the epidemic dynamics of each city independently, but movement of individuals among cities also affects epidemic dynamics [39]. Another area for improvement is that the commuting data we used represent only the movements of working adults, but the dynamics of many respiratory infections depend on transmission among children [40]. In addition to having different levels of susceptibility, younger age groups can have different mobility and contact patterns [18]. We hypothesize, however, that the transmission model described here approximates (albeit imprecisely) the presence of children by creating contacts among working adults who reside in the same CT. In addition, it is plausible that the mobility patterns of children are similar across cities, so while transmission among children is important, the mobility patterns of workers lead to differences among cities. Also, differences in influenza dynamics among US states have been partially explained using only the movement patterns of workers [14].

Our results provide empirical support for the potential importance of contact heterogeneity at the intra-city scale and show new evidence that successfully forecasting epidemics in cities may require us to identify differences in intra-city mobility patterns among cities of similar sizes. Conversely, increasing numbers of infected in larger populations are not necessarily caused exclusively by increases in the number of potential hosts. Instead, increases in the level of heterogeneity in human mobility patterns in larger cities are sufficient in themselves to significantly increase the risk and final size of epidemics. In the face of limited infrastructure for rapidly implementing quarantine and vaccination policies to control the spread of emerging pathogens, an empirical link between human mobility patterns and disease incidence at the scale of individual cities may allow more effective containment strategies, which exploit predictable inter-city differences in the rate of disease transmission. Novel analyses that connect detailed information on human contact patterns with city-level disease data are required in order to test the importance to real epidemics of the systematic differences in mobility patterns we have described here [41].

## Funding statement

This research was supported by Canadian Institutes of Health Research grant no. PTL-97126 to B.P. and by a Natural Sciences and Engineering Research Council of Canada PGS-D grant to B.D.D.

## Acknowledgements

Anurag Agrawal, Monica Geber, Giles Hooker, Matt Holden, Colin Parrish and the Theoretical Ecology Reading Group at Cornell provided comments on previous versions of this manuscript.

- Received March 27, 2013.
- Accepted June 24, 2013.

- © 2013 The Author(s) Published by the Royal Society. All rights reserved.