Prediction and control of the geographical spread of emerging pathogens has become a central public health issue. Because these infectious diseases are by definition novel, there are few data to characterize their dynamics. One possible solution to this problem is to apply lessons learnt from analyses of historical data on familiar and epidemiologically similar pathogens. However, whether the spatial ecology of an infectious disease observed in one epoch can be transferred to other infections in a different epoch remains unexamined. Here, we study this issue by taking advantage of the recent re-emergence of pertussis in the United States to compare its spatial transmission dynamics in the 1950s with those of the past decade. We report 4-year waves sweeping across the continent in the 1950s. These waves are shown to emanate from highly synchronous foci on the northwest and northeast coasts. In contrast, the recent resurgence of the disease is characterized by 5.5-year epidemics with no particular spatial structure. We interpret this to be the result of dramatic changes in patterns of human movement over the second half of the last century, together with the changing age distribution of pertussis. We conclude that extrapolation regarding the spatial spread of contemporary pathogens based on analyses of historical incidence may be very misleading.
1. Introduction
Over the past decade, the high-profile introduction and geographical spread of novel pathogens, such as SARS and foot-and-mouth disease, and the ever-present threat of a devastating influenza pandemic have emphasized the urgent need to better understand the spatial transmission of infectious diseases and their attendant control implications [4,5]. However, making predictions about the spatial dynamics of such diseases is hampered by the absence of data, precisely because they are emerging pathogens. Much progress in spatial epidemiology has resulted from studying long-term notification records of familiar infectious diseases, such as measles, seasonal influenza and dengue. It is important to determine, however, whether conclusions drawn from historical studies of particular infectious agents can inform the epidemiology of a modern-day and novel pathogen threat.
To explore this issue, we examine case reports of whooping cough (or pertussis) in the continental USA, where concerted immunization programmes in the 1950s led to a drastic reduction in incidence over the subsequent three decades. Since the 1980s, however, a significant and poorly understood upsurge in cases has been reported [10–12]. The fluctuating fortunes of pertussis control efforts in the US afford us a unique opportunity to study the spatial ecology of an infectious disease in the same localities but in two eras separated by many decades.
Whooping cough is a respiratory disease caused by the bacterium Bordetella pertussis. Historically, the onset of immunization programmes in developed countries was instrumental in the reduction of pertussis incidence [13,14]. It remains, however, one of the key microparasitic diseases of childhood and adolescence, with a significant annual burden of infant mortality, especially in developing nations. The well-publicized resurgence of pertussis reports in several countries that boast high vaccine uptake has been the subject of much debate [10–12,17] and has highlighted major gaps in our understanding of its epidemiology [18,19].
Here, we analysed monthly whooping cough incidence in the 49 continental US states (including the District of Columbia) from January 1951 to December 2010. In order to take into account large temporal differences in incidence, and in particular the near-absence of cases from the 1970s to the 1990s, our analyses proceeded by splitting the data into three different time periods, focusing exclusively on the early (1951–1962) and the recent (2002–2010) eras (see §2 and the electronic supplementary materials for the rationale for these specific eras). To quantify patterns in the data, we used wavelet decomposition, a technique that is particularly well suited to non-stationary time series [20,21].
2. Material and methods
(a) Pertussis incidence reports
Pertussis monthly notifications were collected at the state level for the 49 continental states (i.e. including the District of Columbia) from 1951 to 2010. These data were obtained from the National Notifiable Disease Surveillance System and are fully anonymized. In the absence of detailed state-specific vaccine uptake estimates, we are unable to infer the fraction of infections reported, though analysis of hospital records suggests a reporting probability of 11.6 per cent.
(b) Population sizes and spatial locations
Population size estimates by state and year from 1951 to 2010 are freely available from the US Census Bureau (http://www.census.gov/popest/archives). Given that our epidemiological data are resolved by state, we make the pragmatic choice to define a population as the number of inhabitants within a given state. Because of the strong aggregation of populations into high-density urban centres, our definition is defensible. For the geographical location of each state population, we considered its centroid, defined as the point on which a rigid, weightless map would balance perfectly if the population were represented as points of equal mass. In this study, we used the coordinates (in degrees) of these centroids for the year 2000. These are freely available from the US Census Bureau (http://www.census.gov/geo/www/cenpop/statecenters.txt).
(c) Wavelet transforms
Prior to wavelet decomposition, time series were square-root transformed in order to stabilize the variance. Time series were then centred and reduced (i.e. standardized to zero mean and unit variance) in order to allow comparisons of their qualitative features (i.e. periodicity) from state to state. Wavelet decompositions were performed using a Morlet wavelet with a non-dimensional frequency ω0 = 6 (see [6,20,21] for mathematical expressions). Before wavelet transforms, time series were padded with zeros up to the nearest power of 2. Analyses with no padding gave very similar results (not shown). Filtering over a given frequency range is performed by summing the local wavelet power spectrum over this frequency range. Time series were smoothed by filtering over the frequencies where most of their power (i.e. variance) is concentrated. A major advantage of complex wavelets such as the Morlet wavelet is that they enable the quantification of phase angles of the time series. These phase angles give information on the timing of the epidemics. Phase angles were calculated after filtering and expressed, for each state, by taking their residuals in a linear model regressing the phase angles of all states on time (see the electronic supplementary materials and its figure S2 for more detailed explanations). They are thus called ‘residual phase angles’ in the rest of the text.
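The preprocessing and transform steps described above can be sketched as follows. This is a minimal illustration rather than the code used for the analyses: it assumes a common FFT-based formulation of the continuous Morlet wavelet transform, reads the phase at a single period instead of filtering over a band, and the function names (`preprocess`, `morlet_cwt`, `phase_angles`) are hypothetical.

```python
import numpy as np

def preprocess(cases):
    """Square-root transform, then centre and scale to unit variance.
    Assumes the series is not constant (non-zero standard deviation)."""
    y = np.sqrt(np.asarray(cases, dtype=float))
    return (y - y.mean()) / y.std()

def morlet_cwt(x, dt, periods, omega0=6.0):
    """Continuous wavelet transform with a Morlet wavelet, computed via FFT.

    x       : preprocessed time series (1-D)
    dt      : sampling interval (e.g. 1.0 for monthly data, in months)
    periods : Fourier periods at which to evaluate the transform (units of dt)
    Returns a complex array of shape (len(periods), len(x)).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # Zero-pad up to the next power of 2, as described in the text.
    n_pad = 1 << int(np.ceil(np.log2(n)))
    x_pad = np.zeros(n_pad)
    x_pad[:n] = x
    x_hat = np.fft.fft(x_pad)
    omega = 2.0 * np.pi * np.fft.fftfreq(n_pad, d=dt)  # angular frequencies
    # Convert Fourier periods to Morlet wavelet scales.
    periods = np.atleast_1d(np.asarray(periods, dtype=float))
    scales = periods * (omega0 + np.sqrt(2.0 + omega0 ** 2)) / (4.0 * np.pi)
    coeffs = np.empty((scales.size, n), dtype=complex)
    for i, s in enumerate(scales):
        # Fourier transform of the Morlet wavelet (zero for omega <= 0).
        gauss = np.exp(-0.5 * (s * omega - omega0) ** 2)
        psi_hat = np.pi ** -0.25 * np.where(omega > 0, gauss, 0.0)
        psi_hat = psi_hat * np.sqrt(2.0 * np.pi * s / dt)  # energy normalization
        coeffs[i] = np.fft.ifft(x_hat * psi_hat)[:n]       # drop the padding
    return coeffs

def phase_angles(x, dt, period):
    """Phase angle (radians) of the series at one period, for every time step."""
    return np.angle(morlet_cwt(x, dt, [period])[0])
```

With two monthly series `a` and `b`, the phase lead of `a` over `b` at a 4-year period would be obtained as `np.angle(np.exp(1j * (phase_angles(preprocess(a), 1.0, 48.0) - phase_angles(preprocess(b), 1.0, 48.0))))`. Values near the ends of the series are affected by the zero padding and should be read with caution.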
(d) Longitudinal speed of propagation
The longitudinal speeds of disease propagation were estimated from the slopes of the piecewise linear regression between phase angles in radians and longitude in degrees. Phase angles in radians were first converted into years, based on the period around which the time series were filtered. Taking into account the Earth's curvature, one degree of longitude represents about 84 km at the latitude of 40° (the average latitude of the continental US).
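The unit conversion described above can be sketched as follows, assuming the regression slope is expressed in radians of phase per degree of longitude. The function names are hypothetical; the default of 84 km per degree is the value quoted in the text, and the spherical approximation (Earth radius 6371 km, an assumption) is included only as a cross-check.

```python
import math

KM_PER_DEG_LON = 84.0  # value quoted in the text for a latitude of 40 degrees

def km_per_degree_longitude(lat_deg, earth_radius_km=6371.0):
    """Length of one degree of longitude on a sphere at the given latitude."""
    return math.radians(1.0) * earth_radius_km * math.cos(math.radians(lat_deg))

def propagation_speed_km_per_month(slope_rad_per_deg, period_years,
                                   km_per_deg=KM_PER_DEG_LON):
    """Convert a phase-longitude slope (rad per degree) into km per month."""
    if slope_rad_per_deg == 0:
        raise ValueError("zero slope: no phase gradient, speed is undefined")
    # A full cycle of 2*pi radians corresponds to one epidemic period.
    years_per_deg = abs(slope_rad_per_deg) * period_years / (2.0 * math.pi)
    months_per_deg = 12.0 * years_per_deg
    return km_per_deg / months_per_deg
```

The spherical approximation gives roughly 85 km per degree at 40°, close to the value of about 84 km used here.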
(e) Critical community size
Critical community sizes (CCS) were estimated for a given time period by plotting, for each state, the proportion of months with zero disease notification during this time period against its mean population size over the same time period. Then, an exponential decay curve $y(x) = Ae^{-Bx}$ was fitted to these 49 data points, and the value of the CCS was defined as the value of x where y is equal to a proportion that represents one month over the considered time period.
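The threshold defined above has a closed form once the curve is fitted. As a sketch, writing $n$ for the number of months in the time period and $x_{\mathrm{CCS}}$ for the resulting threshold (both symbols are our notation, introduced here for illustration):

```latex
A e^{-B x_{\mathrm{CCS}}} = \frac{1}{n}
\quad\Longrightarrow\quad
x_{\mathrm{CCS}} = \frac{\ln(A n)}{B}
```

For the early era (1951–1962), for instance, $n = 144$ months. The expression yields a positive threshold only when $A n > 1$ and $B > 0$.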
(f) Spatial synchrony
The assessment of spatial synchrony was based on pairwise (i.e. inter-state) correlation coefficients. Unlike phase angle analyses, which account only for the qualitative features of periodicity, synchrony is strongly dependent on the quantitative features of the time series, especially the amplitude (and thus is more affected by potential changes in the reporting rate). The two approaches are complementary [6,7]. The relations between synchrony and Euclidean distance were estimated by the NCF library for R. In particular, the spatial correlation functions were estimated using the non-parametric spline covariance function and 1000 bootstraps to generate 95% confidence intervals (CIs).
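The raw ingredients of this analysis, one correlation and one distance per pair of states, can be sketched as follows. This is not the spline covariance function of the NCF library: the binned mean below is a much cruder stand-in, no bootstrap CIs are computed, and the use of great-circle (haversine) distance between centroids is an assumption made for this illustration. The function names are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed spherical Earth

def pairwise_synchrony(series, lat_deg, lon_deg):
    """Return (distance_km, correlation) for every pair of states.

    series  : array of shape (n_states, n_months)
    lat_deg : centroid latitudes in degrees, length n_states
    lon_deg : centroid longitudes in degrees, length n_states
    """
    series = np.asarray(series, dtype=float)
    corr = np.corrcoef(series)  # Pearson correlation between rows
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    # Haversine formula for great-circle distance.
    a = (np.sin(dlat / 2.0) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2.0) ** 2)
    dist = 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
    i, j = np.triu_indices(series.shape[0], k=1)  # each pair counted once
    return dist[i, j], corr[i, j]

def binned_mean(dist, corr, bin_edges):
    """Mean correlation within each distance bin (NaN for empty bins)."""
    idx = np.digitize(dist, bin_edges)
    return np.array([corr[idx == k].mean() if np.any(idx == k) else np.nan
                     for k in range(1, len(bin_edges))])
```

A decline of the binned means with distance would be the crude analogue of the decay in synchrony that the spline correlogram quantifies.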
(g) Time period definition and robustness
The US pertussis incidence time series are characterized by strong non-stationarity, with a central era (1970–1990) during which very few cases were reported across the country. The rationale was thus to focus our analyses on the earlier segment of the data, when the estimated vaccination coverage was still low and pertussis incidence relatively high. Selecting a portion of the data on which analyses are performed naturally leads to concerns over the arbitrariness of the choice and its impact on the robustness of the results. We addressed these concerns by examining the behaviour of the distribution of the pairwise correlation coefficients, the number of states above and below the CCS, the spatial correlation functions and the global wavelet spectra, all calculated on the 1951–x time periods, with x varying between 1951 and 2010 (see the electronic supplementary materials). These analyses identified a sharp transition around 1963 with constant behaviour before and after this transition (see the electronic supplementary material, figure S1b,d). The results presented for the first era with x = 1962 are robust with respect to x as long as x < 1970. Similarly, a second transition was identified around 2002, defining the second era, but the results were less robust with respect to this choice (see §3). Finally, choosing the 1951–x time period before or after the wavelet decomposition did not significantly affect the results.
3. Results
The previously mentioned decrease in pertussis incidence during the 1960s is illustrated in figure 1a,b, followed by the recent rise in case notifications across states. In figure 1c, we display the mean local wavelet power spectra of the 49 states. This figure indicates that most variance in pertussis fluctuations is centred around the period of 4 years in the first era (1951–1962) and between 5 and 6 years in the most recent era (2002–2010). The figure also shows that wavelet decomposition fails to identify significant periodicity during the 1970s because of the paucity of reported cases.
Additionally, we carried out standard Fourier spectral analyses of pertussis incidence in each era for all states. Consistent with the mean wavelet results, we found that most states in the early and recent eras exhibit a pronounced, statistically significant 4- and 5.5-year period, respectively (figure 2a,c), whereas there is no detectable periodicity in the time series from 1963 to 2001 (figure 2b). The extinction profile of pertussis, defined using the CCS concept, is also different in the three eras (figure 2d). Importantly, in the early era, in more than two-thirds of the states, the population size exceeded the CCS, with no pertussis extinctions. In the middle era, however, extinctions were frequently observed, while pertussis is endemic in 12 states in the recent era. The three epochs are also characterized by differences in their spatial synchrony [23,25], with early and recent eras showing much more pronounced decay in synchrony with distance (characteristic of dispersal-driven synchrony) than the intermediate period (figure 2e).
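The Fourier step can be illustrated with a bare periodogram. This is a minimal sketch only: it does not reproduce the significance testing referred to above, and the function name `dominant_period` is hypothetical.

```python
import numpy as np

def dominant_period(x, dt=1.0):
    """Period (in units of dt) at which the raw periodogram peaks."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the mean before transforming
    power = np.abs(np.fft.rfft(x)) ** 2    # raw periodogram
    freqs = np.fft.rfftfreq(x.size, d=dt)  # cycles per unit of dt
    k = 1 + np.argmax(power[1:])           # skip the zero-frequency term
    return 1.0 / freqs[k]
```

Note that the discrete periodogram only evaluates periods of the form N·dt/k for integer k. For a 9-year monthly series the longest candidates are therefore 9, 4.5 and 3 years, so a peak near 5.5 years cannot be localized precisely without zero padding or a finer spectral estimator.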
To examine relationships among epidemics in different states, we filtered the time series for each state in the early and recent eras around the dominant periods 4 and 5.5 years, respectively (figure 2a,c). A striking feature that emerges is the systematic gradient in the timing of epidemics in the early era, as illustrated in figure 3a. However, this organization among states appears different in the two eras as exemplified by the reconstructed incidence in New York and Colorado (highlighted in orange and yellow, respectively). In the early era, these two states show epidemics that are out of phase with a lag of almost 2 years, with outbreaks in New York leading the country, while epidemic cycles in Colorado trail. In contrast, the dynamics in these states are almost in synchrony in the recent era. Calculation of the residual phase angles of the filtered time series in the two eras allowed us to quantify the lags observed between epidemics in each state.
Mapping the early era residual phase angles (figure 3c) reveals spatially organized travelling waves, with two discernible foci, and this pattern is robust with respect to the number of years used to define the early era (see the electronic supplementary material, figure S2f together with accompanying movies of the spatial dynamics of filtered signals). Epidemics in the northeastern and northwestern states are strongly in phase, despite the substantial geographical separation, a feature that is also apparent in figure 2e (red curve), with increasing spatial correlation functions for distances in excess of 3000 km. Thus, northern states on the east and west coasts appear to act as foci, driving epidemic waves that spread southwest and southeast, respectively. When phase lags are regressed against the geographical location of states, we find longitude is a strong and significant predictor of the epidemic timing (figure 3f). This translates into a longitudinal progression speed westward from the northeast (323 km per month, 95% CI, 311–336 km per month) that is almost three times as fast as the eastward wave from the northwest (113 km per month, 95% CI, 106–120 km per month). For comparison, the radial speed of the dengue haemorrhagic fever waves emanating from Bangkok in the 1980s–1990s was estimated at 148 km per month (95% CI, 114–209 km per month), while the waves of measles in the pre-vaccine era were estimated to spread from London to nearby environs at 20 km per month. Latitude, on the other hand, has much less effect (figure 3b).
In the recent era, in contrast, we find surprisingly little spatial organization (figure 3d). As we demonstrate in the electronic supplementary materials, while the earlier era is characterized by consistent phase hierarchy among states through time (see the electronic supplementary material, figure S2f), the later era exhibits substantial variability without any systematic structure (see the electronic supplementary material, figure S2m). Additional support for this clear difference between the eras is provided by the lack of robustness with respect to the length of time series used to define the recent era (see the electronic supplementary material, figures S1b,d), again confirming that there is no clear spatial organization in this era. Finally, by presenting the unfiltered time series ordered according to longitude, we confirm the spatial structure of pertussis dynamics in the first era (see the electronic supplementary material, figure S3a), together with its unravelling in the recent era (see the electronic supplementary material, figure S3b).
Lastly, we examined whether the residual phase angles depicted in figure 3 are associated with state population sizes and found no significant relationship (figure 4, $F_{1,47} = 1.14$, p = 0.29 for the first era and $F_{1,47} = 0.85$, p = 0.36 for the second era). This is in contrast with the relationship documented for measles in England and Wales. Furthermore, no relationship was observed between phase correlations and population size products (see the electronic supplementary material, figure S4), in contrast to seasonal influenza waves in the US. In the discussion section, we propose different lines of explanation for these observations.
4. Discussion and conclusions
To date, three distinct mechanisms have been identified to explain the spatio-temporal morphology of infectious disease systems.
I. Diffusive spread from a point source. This can result from the localized introduction of a pathogen into a virgin population, as documented in the systematic expansion of raccoon rabies and West Nile virus in the US and the Ebola virus in Zaire.
II. Source-sink dynamics and gravity coupling. Measles epidemics in the pre-vaccine era in England and Wales have been shown to be the result of recurrent waves emanating from large populous centres (sources) that percolate through the rural hinterland (sinks). These waves are consistent with ‘gravity coupling’, whereby the epidemiological exchange between two centres is determined by their distance and their respective population sizes. Similar patterns have been noted in the spatial hierarchies of annual influenza epidemics across the US and waves of dengue haemorrhagic fever in Thailand, pulsing across the country from Bangkok.
III. Environmental gradients. In contrast, the waves of seasonal influenza epidemics across Brazil are not thought to be driven by population density or patterns of movement. Influenza epidemics originate in northern Amazonian regions, where population density and movement rates are low, spreading to the more densely populated southern subtropical states over a three-month period. Climatological variables (including temperature and absolute humidity) are thought to be the leading candidates for generating this wave.
In this study, we find clear spatial organization of pertussis epidemics in the 1950s and the conspicuous absence of such structure after its re-emergence half a century later. Of the two mechanisms described above that are relevant to recurrent epidemics (mechanisms II and III), we can tentatively rule out mechanism III: if environmental factors were the primary drivers of the patterns reported in figure 3c, we would expect approximately similar geographical structuring in the recent era (figure 3d).
A rigorous evaluation of the role of mechanism II in the spatio-temporal dynamics of pertussis in the early era would require the use of statistical inference methods, applied to coupled state-specific mechanistic transmission models. Unfortunately, the absence of state-level vaccine uptake information in the 1950s precludes such an approach. Therefore, we resort to proximate indicators of gravitational coupling. Given that population sizes in most states in the early era exceeded the extinction threshold (CCS, figure 2d), the observed spatial waves are unlikely to have resulted from strict source–sink dynamics, where frequent local extinctions are followed by reignition from large reservoir states. We tested the effect of population size on the hierarchy of epidemic timings by two previously published methods (see figure 4 and electronic supplementary material, figure S4) and failed to observe any significant relationship, in contrast to measles in England and Wales and influenza in the US.
A fourth hypothesis that may explain the patterns reported here for US pertussis in the 1950s is the ‘pacemaker’ mechanism, which has been readily predicted in theoretical studies [34,35] but has not so far been documented in large-scale epidemiological (or ecological) systems. Here, the foci (or pacemakers) would be a collection of geographically clustered populations with highly synchronous epidemiological dynamics acting as local rhythm generators, such as states in the northeast and northwest in the 1950s. The factors that would determine the precise location of these foci are not yet understood. In mathematical models, however, pacemakers have been shown to arise in regions exhibiting higher than average connectivity, or, in spatially homogeneous systems, to emerge with no predictable location [34,36]. In epidemiological systems, regional population demography, patterns of localized movement and vaccine uptake levels are likely to play an important role in the spatial pattern formation. Theoretical work to identify the processes that may generate pacemaker dynamics and the impact of disease-specific epidemiological and immunological traits is clearly timely.
Over the 60 years of our dataset, we have documented major shifts in pertussis epidemiological dynamics. First, the inter-epidemic period shifted from 4 (figure 2a) to 5.5 years (figure 2c), which may be attributable to a decrease in the rate of susceptible recruitment owing either to a lower per capita birth rate or to increased vaccine uptake. However, we point out that because of the brevity of the time series in the recent era (only 9 years), observed patterns need to be interpreted with caution, especially given the large CIs in figure 4b and variability in phase angles for some states in electronic supplementary material, figure S2l. Second, since the 1950s, there have undoubtedly been changes in pertussis reporting probability, likely attributable to changes in the surveillance system and improved diagnostic capabilities. However, such changes, although they can affect the quantitative aspects of the dynamics (such as mean incidence), are expected to have very limited effects on the qualitative aspects of the dynamics (such as periodicity and phase) on which our analysis is based. We thus expect our results on spatial dynamics to be insensitive to changes in the reporting fidelity.
The precise mechanisms generating different pertussis metapopulation dynamics in these eras are not known. We speculate, however, that the extent of pertussis-specific spatial coupling between states is likely to have altered substantially over the time span of these data. This would have arisen for two reasons. First, human mobility patterns have changed dramatically over the past six decades. In the 1950s, the majority of trips were made by ground transportation, with air travel still rare apart from coast-to-coast exchange. This verbal hypothesis can explain the spatial phase structure observed in figure 3f: rapid between-coast air connections would synchronize the two coasts, and slow terrestrial diffusion on the road/rail network would be responsible for the linear phase–longitude association. The westward speed of propagation being almost three times as fast as the eastward one may have resulted from the road network being denser and straighter in the flat eastern part of the USA than in its mountainous western counterpart, and also from higher population density in the east. The dramatic increase in air travel all over the continent, together with rapid population growth in the central US, may be responsible for the disruption of this spatial structure in the contemporary re-emergence of pertussis. The second potential factor in the changing spatial epidemiology of pertussis relates to the age distribution of pertussis incidence. It has been shown that, in recent years, pertussis is affecting older individuals [39,40], with an increasing number of reports observed in adolescent and adult age groups [41,42]. Hence, compared with the 1950s, the spatial exchange of pertussis across states would have changed both because of changing underlying movement patterns in the population and because of a shift in the age classes affected.
From a pragmatic perspective, our findings suggest that analyses of historical data may be much less informative about modern epidemiological systems than could a priori be expected.
We thank Aaron King and two anonymous reviewers for comments on this paper. M.C. was funded by IRD and CNRS. M.C. and P.R. were supported by the Vaccine Modeling Initiative of the Bill and Melinda Gates Foundation and the Research and Policy in Infectious Disease Dynamics program of the Science and Technology Directorate, Department of Homeland Security, and the Fogarty International Center, National Institutes of Health, and by a grant from the National Institutes of Health 1R01AI101155.
- Received July 30, 2012.
- Accepted September 3, 2012.
- This journal is © 2012 The Royal Society