Spatial variations in disease patterns of the 1918–1919 influenza pandemic remain poorly studied. We explored the association between influenza death rates, transmissibility and several geographical and demographic indicators for the autumn and winter waves of the 1918–1919 pandemic in cities, towns and rural areas of England and Wales. Average measures of transmissibility, estimated by the reproduction number, ranged between 1.3 and 1.9, depending on model assumptions and pandemic wave, and showed little spatial variation. Death rates varied markedly with urbanization, with 30–40% higher rates in cities and towns compared with rural areas. In addition, death rates varied with population size across rural settings, where low population areas fared worse. By contrast, we found no association between transmissibility, death rates and indicators of population density and residential crowding. Further studies of the geographical mortality patterns associated with the 1918–1919 influenza pandemic may be useful for pandemic planning.
1. Introduction
Greater understanding of past influenza pandemic patterns is key for designing more efficient public health interventions for future outbreaks. Despite the growing interest in historical epidemiological studies, spatial variations in the transmissibility and mortality rates of the 1918–1919 influenza pandemic have yet to be carefully quantified. In addition, little is known about the comparative dynamics of influenza in cities and rural areas in general, mostly due to the lack of appropriate data at this scale for inter-pandemic and pandemic seasons.
Transmissibility can be quantified by the reproduction number (R), the average number of secondary cases generated by an index case, which determines the intensity and types of interventions required to avert an epidemic (Anderson & May 1991; Diekmann & Heesterbeek 2000). Past research has estimated that the reproduction number of the 1918–1919 influenza pandemic ranged between 1.5 and 5.4 (Mills et al. 2004; Gani et al. 2005; Chowell et al. 2006, 2007a; Ferguson et al. 2006; Sertsou et al. 2006; Viboud et al. 2006a; Massad et al. 2007; Nishiura 2007; Andreasen et al. in press), depending on the specific location and pandemic wave considered, type of data, estimation method and level of spatial aggregation, which has ranged from small towns of a few hundred people to entire nations with several million inhabitants. The variability in published R estimates suggests that local factors, including geographical and demographic characteristics, could play a role in disease transmissibility.
In parallel, large variations in the 1918–1919 pandemic mortality rates have been reported between different nations and cities in the US, and were linked to differences in socio-demographic conditions and public health interventions (Murray et al. 2006; Bootsma & Ferguson 2007; Hatchett et al. 2007). The differences between urban and rural areas and population units of varying size, however, have not been comprehensively analysed.
Here we quantify geographical patterns during the autumn and winter waves of the 1918–1919 pandemic in English and Welsh cities, towns and rural areas. We analyse weekly influenza mortality from a diversity of geographical units to explore how transmissibility and mortality rates varied locally with socio-demographic factors, including population size, population density, residential crowding and urbanization.
2. Material and methods
(a) Data sources
National vital registration began in 1837 in England and Wales. Weekly numbers of influenza-specific deaths covering 46 weeks from 29 June 1918 to 10 May 1919 were compiled from an official publication of the British Ministry of Health (Ministry of Health 1920), and have been made partially available online (Influenza Pandemic Mortality in England and Wales 2006, http://ahds.ac.uk/catalogue/collection.htm?uri=hist-4350-11918-1919). In England and Wales, as in other European countries, the 1918–1919 pandemic occurred in three major waves (Low 1920; Andreasen et al. in press; figure 1): a summer wave, associated with low mortality (29 June to 3 August 1918); an autumn wave with high mortality (12 October to 28 December 1918); and a winter wave with intermediate mortality (8 February to 5 April 1919). Here we focus on the autumn 1918 and winter 1919 waves due to better resolution in the mortality data.
The mortality data covered all of England and Wales, and were available at the refined spatial scale of administrative units (N=305), as well as the coarser scale of counties (N=62) (Ministry of Health 1920). A total of 247 towns and cities with more than 10 000 inhabitants were categorized as urban units, and London was considered as a single unit. The remaining 58 units comprised rural areas, including 42 ‘remainder of counties’ classified as non-urban locales by the Ministry of Health and 16 peripheral administrative counties that did not contain a single city or town (of which 11 were located in Wales).
Demographic data for the 305 administrative units and 62 counties were retrieved from the 1921 decennial census (A Vision of Britain 2006), and included total population size, surface area, number of dwellings and number of rooms occupied. From these data, we derived measures of population density and residential crowding (table 1) and calculated the weekly and cumulative death rates for each geographical unit and pandemic wave. Although no census was conducted in 1918, the demographic data from the 1911 and 1921 censuses had very similar geographical patterns (correlation>0.99), so our 1921 demographic estimates were deemed appropriate for the study of the 1918–1919 pandemic geographical patterns. Given the spatial aggregation of the dataset, some of the rural units had large total population sizes, but population density (defined as the ratio of population size to surface area) was 14 times lower on average in rural areas than in cities and towns.
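The derived indicators can be illustrated with a minimal sketch. The class name, field names and example figures below are hypothetical and are not taken from the census or mortality files; the unit of surface area is left unspecified, as in the text.

```python
from dataclasses import dataclass


@dataclass
class GeoUnit:
    """One administrative unit or county (all field names are illustrative)."""
    name: str
    population: int
    surface_area: float  # unit of area not specified in the text
    dwellings: int
    rooms_occupied: int
    wave_deaths: int     # cumulative influenza deaths in one pandemic wave

    @property
    def density(self) -> float:
        # Population density: population size divided by surface area.
        return self.population / self.surface_area

    @property
    def people_per_dwelling(self) -> float:
        return self.population / self.dwellings

    @property
    def people_per_room(self) -> float:
        return self.population / self.rooms_occupied

    @property
    def death_rate_pct(self) -> float:
        # Cumulative death rate for the wave, as a percentage of population.
        return 100.0 * self.wave_deaths / self.population


# Made-up example unit, for illustration only.
example = GeoUnit("Exampletown", 10_000, 50.0, 2_500, 8_000, 27)
print(example.density, example.people_per_room, example.death_rate_pct)
```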
(b) Transmissibility of the autumn and winter pandemic waves (reproduction number, R)
(i) Estimation methods
We estimated the reproduction number in each geographical unit and during each pandemic wave, relying on the early ascending phase of local epidemic curves, where saturation effects arising from the depletion of susceptible individuals can be neglected (Anderson & May 1991). We assumed an initial exponential growth and estimated the intrinsic growth rate, ‘r’, by fitting a straight line (with slope r and intercept b) to the initial increase in weekly deaths (in logarithmic scale). The longest epidemic period consistent with exponential growth was determined by the goodness-of-fit test statistic (Chowell et al. 2007a). The reproduction number was calculated by substituting the estimate for r into an expression derived from the linearization of the classical susceptible–exposed–infectious–recovered (SEIR) transmission model (Lipsitch et al. 2003; Wallinga & Lipsitch 2007)

$$R = \left(1 + \frac{r}{b_1}\right)\left(1 + \frac{r}{b_2}\right), \qquad (2.1)$$

where $1/b_1$ and $1/b_2$ are the mean latent and infectious periods, respectively. This expression for R assumes exponentially distributed latent and infectious periods, and the mean generation interval between two successive cases is given by $T_c = 1/b_1 + 1/b_2$. We carried out a sensitivity analysis to assess the impact of this assumption on the R estimates. We derived an upper bound for the extreme case of a fixed generation interval (delta distribution) using the following equation (Wallinga & Lipsitch 2007):

$$R = e^{r T_c}. \qquad (2.2)$$
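The log-linear fit of the early growth phase can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name and the weekly counts are hypothetical, the fitting window is taken as given (the goodness-of-fit selection of the longest exponential period is not reproduced), and all counts must be positive for the logarithm to be defined.

```python
import math


def fit_growth_rate(weekly_deaths):
    """Ordinary least-squares fit of log(deaths) against week index.

    Returns (r, b): slope r is the growth rate per week, b the intercept.
    Assumes every count in the window is strictly positive.
    """
    xs = list(range(len(weekly_deaths)))
    ys = [math.log(d) for d in weekly_deaths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sxx
    b = mean_y - r * mean_x
    return r, b


# Made-up early ascending phase of one unit (illustrative counts only).
early_phase = [4, 9, 22, 51]
r_week, intercept = fit_growth_rate(early_phase)
print(f"growth rate per week: {r_week:.3f}, per day: {r_week / 7:.3f}")
```

Because the data are weekly while the latent and infectious periods are usually quoted in days, the slope has to be divided by 7 before it is combined with daily rates.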
There is considerable debate about the duration of the generation interval for influenza (Andreasen et al. in press), with the most recent estimates relying on seasonal influenza transmission in households (Cauchemez et al. 2004; Ferguson et al. 2006), while no data are available for pandemic influenza. Given the uncertainty, we considered two extreme values of the generation interval used in past research: a short interval of 3 days (where the latent and infectious periods were both set to 1.5 days; Ferguson et al. 2005, 2006; Wallinga & Lipsitch 2007) and a longer interval of 6 days (latent period=1.9 days and infectious period=4.1 days; Longini et al. 2004; Mills et al. 2004). The case fatality rate was set to 2%, as in a previous transmissibility study (Mills et al. 2004).
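To make the sensitivity to the generation interval concrete, the sketch below converts a growth rate into R under both interval scenarios. It assumes the standard SEIR linearization R = (1 + r/b1)(1 + r/b2) and the fixed-interval bound R = exp(r Tc) described by Wallinga & Lipsitch (2007); the function names and the example growth rate are hypothetical.

```python
import math


def r_seir(r_per_day, latent_days, infectious_days):
    """R for exponentially distributed latent and infectious periods.

    With mean periods 1/b1 and 1/b2, r/b1 equals r * latent_days.
    """
    return (1 + r_per_day * latent_days) * (1 + r_per_day * infectious_days)


def r_fixed_interval(r_per_day, latent_days, infectious_days):
    """Upper bound on R for a fixed (delta-distributed) generation interval."""
    tc = latent_days + infectious_days
    return math.exp(r_per_day * tc)


# Latent and infectious periods (days) for the two scenarios in the text.
SCENARIOS = {
    "short (3 days)": (1.5, 1.5),
    "long (6 days)": (1.9, 4.1),
}

r_day = 0.85 / 7  # hypothetical weekly growth rate converted to per day
for label, (latent, infectious) in SCENARIOS.items():
    low = r_seir(r_day, latent, infectious)
    high = r_fixed_interval(r_day, latent, infectious)
    print(f"{label}: R = {low:.2f} (SEIR), upper bound {high:.2f}")
```

For any positive growth rate the fixed-interval value is at least as large as the SEIR value, which is why it serves as an upper bound.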
(ii) Heterogeneity in influenza transmissibility and association with socio-demographic factors
We explored whether there was a meaningful variability in the reproduction number of the 1918–1919 influenza pandemic across England and Wales by comparing the variability in the R estimates between and within counties. We estimated the within-county variability from the finer spatial scale of administrative units, and used the analysis of variance (ANOVA) to test the county-specific differences in transmissibility (Neter & Wasserman 1974).
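The between-county versus within-county comparison can be illustrated with a one-way ANOVA F statistic. The sketch below is a plain-Python illustration with made-up R values grouped by hypothetical counties; it computes the F statistic only, and the p-value (from the F distribution with k−1 and n−k degrees of freedom) would come from a statistics package.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    n = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: observations around their group mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups for v in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Made-up R estimates for administrative units in three hypothetical counties.
r_by_county = {
    "County A": [1.38, 1.41, 1.44],
    "County B": [1.35, 1.36, 1.40],
    "County C": [1.45, 1.47, 1.52],
}
print(f"F = {anova_f(list(r_by_county.values())):.2f}")
```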
The association between reproduction number and socio-demographic variables was explored via Spearman correlations, using a Bonferroni correction for multiple comparisons.
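A minimal sketch of the rank correlation and the multiple-comparison adjustment is given below. The helper names and example p-values are hypothetical; ties receive average ranks, and the Bonferroni rule is applied in its simplest form by comparing each p-value with alpha divided by the number of tests.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def bonferroni_significant(p_values, alpha=0.05):
    """Flag tests that remain significant at level alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Made-up R estimates and population densities for five units.
print(spearman([1.32, 1.40, 1.38, 1.45, 1.36], [120, 900, 300, 2500, 450]))
print(bonferroni_significant([0.002, 0.03]))
```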
(c) Cumulative influenza death rates for the autumn and winter pandemic waves
(i) Death rates, population size and urbanization
We initially explored the association between death rates, census variables and urbanization, by correlation and multivariate regression. We identified population size and urbanization as statistical predictors of death rates. We characterized these relationships further by applying two methods derived from econometrics and previously applied in infectious disease epidemiology, the Lorenz curve and the summary Gini index (Lee 1997; Woolhouse et al. 1997; Kerani et al. 2005; Green et al. 2006). The Lorenz curve is a graphical representation of the cumulative distribution of a quantity representing in our case the proportion of deaths assumed by the bottom y% of population sizes. If all death rates are perfectly equal, the Lorenz curve is the first diagonal (no heterogeneity), while a perfectly unbalanced distribution appears as a vertical line (maximum heterogeneity, for instance, if all units except one had zero deaths). Most empirical distributions of death rates lie somewhere in between.
The Gini index is a summary statistic of the Lorenz curve, ranging between 0 and 1, calculated as the area between the Lorenz curve and the diagonal representing perfect equality. A large Gini index indicates highly heterogeneous death rates. Conversely, a Gini index of zero represents an absence of heterogeneity, so that the number of deaths is directly proportional to population size (i.e. death rates are equal across units).
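The construction can be sketched in a few lines. Two choices below are assumptions rather than details given in the text: units are ordered by increasing per capita death rate (ordering by population size is another possible reading), and the index is normalized as twice the area between the curve and the diagonal so that it lies between 0 and 1. Function names and data are hypothetical.

```python
def lorenz_points(populations, deaths):
    """Cumulative population share (x) against cumulative death share (y).

    Units are sorted by increasing per capita death rate (an assumption).
    """
    units = sorted(zip(populations, deaths), key=lambda u: u[1] / u[0])
    total_pop = sum(populations)
    total_deaths = sum(deaths)
    xs, ys = [0.0], [0.0]
    cum_pop = cum_deaths = 0
    for pop, dead in units:
        cum_pop += pop
        cum_deaths += dead
        xs.append(cum_pop / total_pop)
        ys.append(cum_deaths / total_deaths)
    return xs, ys


def gini(populations, deaths):
    """Twice the area between the diagonal and the Lorenz curve."""
    xs, ys = lorenz_points(populations, deaths)
    # Trapezoidal area under the Lorenz curve.
    area_under = sum(
        (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2
        for i in range(1, len(xs))
    )
    return 1 - 2 * area_under


# Made-up rural units: the smallest unit has the highest death rate.
pops = [2_000, 8_000, 30_000, 60_000]
dead = [12, 30, 80, 130]
print(f"Gini index: {gini(pops, dead):.3f}")
```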
(ii) Scaling laws
Next, we characterized the functional relationship between population size and cumulative number of influenza deaths. Influenza deaths follow a power-law function of population sizes if D∝N^g, where D denotes number of deaths; N indicates population size; and g is an exponent to be estimated. For g=1.0, deaths are exactly proportional to population size (i.e. no heterogeneity); g<1 indicates that low population units have higher per capita death rates, while the opposite is true for g>1.
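One common way to estimate the exponent, shown here as a sketch and not necessarily the procedure used in the study, is a least-squares fit of log D against log N, whose slope is g. The function name and the synthetic data are hypothetical.

```python
import math


def fit_power_law(populations, deaths):
    """Fit log(D) = log(c) + g * log(N) by ordinary least squares.

    Returns (g, c). Requires strictly positive populations and deaths.
    """
    xs = [math.log(n) for n in populations]
    ys = [math.log(d) for d in deaths]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    g = sxy / sxx
    c = math.exp(mean_y - g * mean_x)
    return g, c


# Synthetic units generated with exponent 0.75 (illustrative only).
sizes = [1_000, 5_000, 20_000, 80_000]
synthetic_deaths = [0.05 * n ** 0.75 for n in sizes]
g_hat, c_hat = fit_power_law(sizes, synthetic_deaths)
print(f"estimated exponent g = {g_hat:.2f}")
```

An estimate below 1, as in this synthetic example, corresponds to higher per capita death rates in the smaller units.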
(d) Timing of pandemic onset and socio-demographic factors
Finally, for the autumn and winter waves, we explored whether the timing of the local pandemic onset varied with socio-demographic characteristics. The onset week was defined for each pandemic wave and geographical unit as the first week associated with a monotonic increase in weekly deaths, up to the week of peak deaths.
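The onset definition can be read in more than one way; the sketch below adopts one reading, in which the onset is the first week of the uninterrupted run of strictly increasing weekly deaths that ends at the peak. The function name and the counts are hypothetical.

```python
def onset_week(weekly_deaths):
    """Index of the first week of the monotonic rise leading to the peak.

    The peak is the first week with the maximum count; the function then
    walks backwards while each earlier week has strictly fewer deaths.
    """
    peak = max(range(len(weekly_deaths)), key=lambda i: weekly_deaths[i])
    onset = peak
    while onset > 0 and weekly_deaths[onset - 1] < weekly_deaths[onset]:
        onset -= 1
    return onset


# Made-up weekly deaths for one unit within one wave.
wave = [2, 1, 3, 2, 4, 9, 20, 35, 18, 6]
print(onset_week(wave))
```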
3. Results
(a) Transmissibility of the autumn and winter pandemic waves
(i) Reproduction number estimates (R)
Figure S1 in the electronic supplementary material shows the distribution of the reproduction number estimates for the autumn and winter waves of the 1918–1919 pandemic in England and Wales, at the refined spatial scale of administrative units (N=305) and the coarser scale of counties (N=62). Most locations experienced an initial epidemic phase lasting at least three epidemic weeks, so that R estimates derived from equation (2.1) could be obtained in 87% of administrative units and 100% of the counties for the autumn wave, and 69% of the units and 87% of the counties for the winter wave. Normal probability plots indicated that R estimates at the county level closely followed a normal distribution, while the distributions of estimates at the administrative unit level showed a greater frequency of high values compared with a normal distribution (not shown). In table 2, the summary estimates of R are presented for a short and long duration of the generation interval (3 and 6 days). For the shorter generation interval in the autumn wave, the mean R was found to be 1.40 (95% CI: 1.38–1.42) in the administrative units, with similar values at the county level. The mean R estimate based on the aggregated national pandemic wave was not different, at 1.39 (95% CI: 1.36–1.43). For the winter wave, we estimated an overall mean R of 1.35 (95% CI: 1.33–1.37) in administrative units, with similar values at the county and national levels. Higher R estimates were found for the longer generation interval (approx. 1.9 for autumn and approx. 1.7 for winter).
Assuming the extreme case of fixed latent and infectious periods, we can give an upper bound on the estimates of R (Wallinga & Lipsitch 2007). In this sensitivity analysis, the R estimates increased only marginally, by 0.05 and 0.2 on average, when using the generation intervals of 3 and 6 days, respectively.
Overall, the autumn wave showed higher transmissibility than the winter wave, with 62% of the administrative units experiencing a reduction of transmissibility from autumn to winter (see figure S2 in the electronic supplementary material for maps of R estimates). There was no correlation between the reproduction numbers in the autumn and winter waves.
(ii) Heterogeneity in transmissibility and relationship with socio-demographic factors
Geographical heterogeneity in influenza transmissibility was statistically significant in the autumn (ANOVA, p=0.02) but not in the winter (ANOVA, p=0.11). There was no difference in transmissibility between urban and rural areas for either wave (Wilcoxon test, p>0.38).
The associations between R estimates and socio-demographic factors were weak to moderate, with the highest correlation estimated at 0.42 (p=0.002, table 3). It should be noted that there was a moderate but significant correlation between the transmissibility in the winter wave and the two measures of residential crowding at the county level (number of people per room and number of people per dwelling). However, no socio-demographic factor was consistently associated with transmissibility at both spatial scales or for both pandemic waves.
(b) Cumulative death rates in the autumn and winter waves
(i) Overall patterns, rural and urban areas
The average influenza mortality rate was 0.27% (s.d.=0.26%) in the autumn and 0.1% in the winter (s.d.=0.09%), with 94% of the administrative units experiencing decreasing rates from autumn to winter (maps of mortality rates are shown in figure S3 of the electronic supplementary material). Death rates in the autumn and winter waves were weakly correlated (ρ=0.19, p=0.007). A striking difference was found with urbanization, with 30–40% higher per capita death rates in urban areas than rural areas, for both pandemic waves (Wilcoxon test: autumn wave, p=0.007; winter wave, p=0.0001; figure S4 in the electronic supplementary material).
(ii) Relationship with socio-demographic factors
Table 4 presents the correlations between influenza death rates and demographics for urban and rural units, counties and pandemic waves. The strongest relationship was found with population size in rural settings, where the highest death rates were observed in the areas with the lowest population size—a pattern that was consistent for both pandemic waves. By contrast, there was no association seen for urban areas, and no relationship between death rates and population density. Death rates were weakly associated with an indicator of residential crowding (the average number of people per room) at both spatial scales in cities and towns, but only for the winter wave. Stepwise multivariate regression identified population size and urbanization as the only predictors of death rates in the autumn and winter waves.
(iii) Modelling death rates as a function of urbanization and population size
The relationship between the cumulative number of influenza deaths in a geographical area and the population size of the area can be visualized through the Lorenz curves (figure 2). The curve for rural areas was far from the diagonal for both pandemic waves (Gini index between 0.23 and 0.27), thus confirming a systematic relationship between death rates and population size in rural locations. By contrast, there was no association in urban areas (Gini index<0.05). Interestingly, variability in death rates entirely disappeared when the data were aggregated at the county level (Gini index∼0).
We also characterized the relationship between population size and cumulative influenza deaths using a power-law function (D∝N^g), stratified by urban and rural areas (figure 3). For rural areas, the exponent g estimates ranged between 0.71 and 0.77 (significantly below 1.0), whereas these estimates were approximately 1.0 for cities and towns. These estimates suggest that, in rural settings, smaller population units suffered a disproportionately large per capita mortality burden, whereas there was little variation in death rates across cities and towns. In line with the Lorenz curve analysis, heterogeneities disappeared at the scale of counties, and death rates became nearly independent of population size.
(c) Timing of pandemic onset and socio-demographic factors
Units with large population size experienced an early pandemic onset, both for the autumn and winter waves (figure 4). There was an approximately three-week difference on average between onset in the high and low population units in the autumn wave, while the winter wave was more synchronized, with an average difference of one week. Population size was well correlated with the timing of the onset of the autumn and winter waves (−0.55≤Spearman ρ≤−0.50, p<0.001). The association was maintained when the data were stratified by rural and urban units, which suggests that population size, rather than urbanization, was the predominant factor driving timing of pandemic onset. Population size was the only socio-demographic factor that independently predicted the timing of the onset for both pandemic waves.
4. Discussion
We explored the geographical and demographic patterns of transmissibility and mortality during the autumn and winter waves of the 1918–1919 influenza pandemic in England and Wales, using weekly epidemiological time series at various levels of spatial aggregation. To the best of our knowledge, this is the first population study of the 1918–1919 influenza pandemic at such a detailed spatial resolution. Our estimates of the reproduction number ranged between 1.3 and 1.9, with a low level of spatial variation across geographical units, which was only marginally associated with socio-demographic factors. By contrast, there was a marked spatial heterogeneity in death rates, which was linked to urbanization, with 30–40% higher rates in cities and towns than in rural areas on average. We found further variations in death rates across rural settings, where smaller population units fared worse. Urban death rates did not vary with population size.
There are several caveats and assumptions in our study. As in several previous studies (Mills et al. 2004; Viboud et al. 2004, 2006a,b), we assumed that influenza mortality was a good proxy for disease incidence and was appropriate for the estimation of transmissibility. Our analyses were conducted with the assumption that reporting and coding of death certificates were homogeneous across England and Wales and over the course of the pandemic (vital statistics had been in place since 1837 in England and Wales, and all deaths were medically certified in 1918). One may expect that more remote locations would have poorer reporting or coding of deaths; however, we found higher death rates in rural units with lower population sizes, which argues against such bias. Finally, as in previous mortality studies (Ministry of Health 1920; Smallman-Raynor et al. 2004; Johnson 2006), we relied on the administrative divisions of England and Wales in 1918, which are not necessarily the most meaningful spatial units for disease dynamics.
We identified spatial variations in the R estimates for the autumn wave at the county level; these variations were not simply the result of measurement error, but there was no obvious association with the socio-demographic factors under study. For comparison, a previous analysis of the 1918 flu pandemic in 45 US cities found that the transmissibility of the autumn wave was weakly correlated with population density (Mills et al. 2004), and a study of measles in 60 cities in England and Wales during the pre-vaccination era found that transmissibility was not associated with population size (Bjornstad et al. 2002). It is interesting that influenza and measles transmissibility appear nearly invariant across a wide range of population sizes and densities, and despite the differences in social connectivity patterns in urban and rural settings. It would be interesting to study this relationship further for seasonal flu epidemics or geographical regions with greater variations in socio-demographic factors.
We found similar transmissibility estimates in the autumn and winter pandemic waves for three different levels of spatial aggregation, including the administrative units, the counties and the entire region of England and Wales. On average, our autumn estimates using a generation interval of 6 days were slightly lower than estimates for the autumn 1918 pandemic wave in the US cities (Mills et al. 2004; Chowell et al. 2007a) and higher than estimates for recent seasonal influenza epidemics (Chowell et al. 2007b). While the estimates were not substantially changed by the spatial aggregation of the data or by relaxing assumptions on the distribution of the latent and infectious periods, they did change with different assumptions on the duration of the generation interval. Hence, the generation interval is clearly a key parameter with considerable residual uncertainty (Ferguson et al. 2006; Andreasen et al. in press).
Our results indicate that areas with larger populations experienced early pandemic onset in the autumn and winter waves, thus suggesting a hierarchical spread of influenza driven by large population centres of England and Wales in 1918–1919 (see also Smallman-Raynor et al. 2004). This pattern is reminiscent of seasonal influenza epidemics in the USA (Viboud et al. 2006b). Despite these observed variations in pandemic timing, it is noteworthy that the 1918–1919 pandemic as a whole was more synchronous in the administrative units of England and Wales than in US cities. The relative synchrony of the pandemic in England and Wales is likely to be explained by strong population mixing and the small geographical extent of this region (Ferguson et al. 2006; Bootsma & Ferguson 2007).
Past research has produced conflicting results on the association between the disease burden of the 1918–1919 pandemic and socio-demographic or geographical factors. In a study of four British cities, no association was found between influenza attack rates and residential crowding (Ministry of Health 1920). Furthermore, influenza-related mortality rates were moderately associated with measures of density and baseline pre-pandemic mortality rates in 45 US cities (Pearl 1921; Bootsma & Ferguson 2007). Similarly, in this study, we did not find any obvious association between death rates and measures of population density or residential crowding, or between death rates and pre-pandemic infant mortality rates (correlation approx. 0.30, not shown). Other studies, by contrast, have reported a strong effect of socio-demographic characteristics on the 1918–1919 pandemic mortality rates, including per capita income and indicators of wealth such as apartment size (Mamelund 2006; Murray et al. 2006). These conflicting results may stem from the differences in the spatial scale considered (ranging from household to country) and the choice of socio-demographic indicators.
The most important factor associated with death rate in our study was urbanization, with cities and towns experiencing approximately 30–40% higher death rates than rural areas during both pandemic waves. Although the reasons for this pattern remain unclear, a similar mortality pattern has been documented in New Zealand (McSweeny et al. 2007). Differential exposure to tuberculosis could have played a role, as this disease has been put forward as a predisposing factor for morbidity and mortality during the 1918 influenza pandemic (Pearl 1919; Noymer & Garenne 2000), and tuberculosis was traditionally more prevalent in urban than rural settings. Alternative hypotheses include remoteness and greater social distancing in rural areas (McSweeny et al. 2007), although in this case, we would perhaps expect lower transmissibility in rural areas.
Further differences with urbanization were uncovered in our study, in that death rates varied with population size in rural areas, but not in cities and towns. Disparities in access and organization of health care in rural areas with small populations, which fared worse during the pandemic, could explain these geographical patterns (McSweeny et al. 2007).
It is interesting that important residual variability in transmissibility estimates and death rates remained unexplained in our study. We think it unlikely that local differences in population age structure or public health interventions could have played a role. Children are believed to drive disease spread locally within cities or small communities (Monto 1999), although they probably do not disseminate infection over large distances (Viboud et al. 2006b). In our study, the proportion of the population comprising children did not explain geographical variations in transmissibility or death rates of the 1918–1919 pandemic (not shown). Although the 1918–1919 pandemic virus was characterized by an unusual age pattern of deaths concentrated among young adults (Olson et al. 2005), the local differences in the proportion of young adults did not explain variations in death rates in our study (not shown). Public health interventions are irrelevant in the context of England and Wales, as no intervention was implemented in 1918–1919, in contrast to the USA (Bootsma & Ferguson 2007).
The existence of cross-immunity between the viruses circulating in successive waves of the 1918–1919 pandemic has been hypothesized (Anon. 1919; Andreasen et al. in press), and may affect the geographical patterns described in this study. The cross-immunity hypothesis relies on anecdotal evidence supporting the idea that individuals who were infected with influenza during the summer escaped clinical illness or experienced only mild respiratory symptoms in the autumn (Anon. 1919). Accordingly, Scandinavia experienced a large influenza morbidity wave in the summer of 1918, followed by a second wave of low mortality impact in the autumn, relative to other countries (Andreasen et al. in press). We were not able to fully investigate the impact of differential exposure to influenza during the summer of 1918 in England and Wales, due to the lack of morbidity data and extremely low death counts for this period. However, we found a weak but significant correlation between mortality rates in the summer and autumn waves, which is consistent with the cross-immunity hypothesis (Spearman ρ=0.2, p<0.001). The possibility of cross-immunity also suggests that our rather low R estimates for the autumn and winter pandemic waves do not represent the maximum transmissibility of the 1918 influenza virus in an entirely naive population (the so-called R0), which may have been as high as 5.4 (Andreasen et al. in press).
Overall, this analysis of the 1918–1919 pandemic in England and Wales reveals that high-resolution spatial data at the level of cities, towns and the surrounding rural areas are key to detecting heterogeneity in influenza transmissibility and death rates. Influenza transmissibility results from a complex combination of pre-existing immunity patterns, population mixing and viral strain characteristics, while mortality is also affected by health care and socio-demographic conditions. Our results suggest that population size and urbanization played a role in the geographical patterns of the 1918–1919 pandemic in England and Wales. Whether these patterns can be generalized to contemporary seasonal influenza epidemics and future pandemics is an interesting topic for future research.
We thank Bryan Grenfell for helpful discussions at an early stage of this work and Neil Ferguson for constructive comments. We thank Maia Rabaa, FIC, for editorial assistance. This study was funded by the Fogarty International Center, NIH. This work was partially supported through the Los Alamos National Laboratory LDRD grant 20070099DR (to L.M.A.B.).