Inferring the causes of the three waves of the 1918 influenza pandemic in England and Wales

Daihai He, Jonathan Dushoff, Troy Day, Junling Ma, David J. D. Earn


Past influenza pandemics appear to be characterized by multiple waves of incidence, but the mechanisms that account for this phenomenon remain unclear. We propose a simple epidemic model, which incorporates three factors that might contribute to the generation of multiple waves: (i) schools opening and closing, (ii) temperature changes during the outbreak, and (iii) changes in human behaviour in response to the outbreak. We fit this model to the reported influenza mortality during the 1918 pandemic in 334 UK administrative units and estimate the epidemiological parameters. We then use information criteria to evaluate how well these three factors explain the observed patterns of mortality. Our results indicate that all three factors are important but that behavioural responses had the largest effect. The parameter values that produce the best fit are biologically reasonable and yield epidemiological dynamics that match the observed data well.

1. Introduction

The 1918 influenza pandemic was the deadliest pandemic in history. An estimated 50–100 million people were killed worldwide, and one-third of the world's population is estimated to have been infected [1]. The incidence of influenza and the resultant mortality exhibited multiple waves during the 1918 pandemic, with many regions experiencing up to three peaks in mortality [2–5]. For example, figure 1 shows the pattern of mortality in the UK during the 12 month period beginning in June 1918; three distinct waves are evident throughout the country during this single year. The recent influenza pandemic in 2009 also displayed multiple waves of incidence in many Northern Hemisphere countries [6–11].

Figure 1.

Three pandemic waves swept the UK during 1918–1919. The 334 administrative units that we studied within England and Wales are marked on the map (a); the largest 12 cities are identified by name and with solid (red) dots (the size of which indicates the population size rank of the city). The intensity plot (b) shows the spatio-temporal pattern of the pandemic throughout the country (which we divided into 150 latitudinal bins). Panel (c) on the top right shows the daily central England temperature (CET in °C) and weekly influenza mortality in London (in hundreds). (Online version in colour.)

Identifying the processes that give rise to multiple pandemic waves is important for public health. This problem has consequently attracted much attention, with several mechanisms being proposed, including viral evolution (which modifies transmissibility, immunological escape or both), environmental change (primarily weather conditions) and behavioural change of people in response to the pandemic [12,13]. Several statistical analyses of data from past pandemics have identified potential causes of mortality patterns. For example, Chowell et al. [14] found that death rates during the 1918 pandemic in the UK were 30–40% higher in cities and towns when compared with rural areas; and Pearce et al. [15] found that the occurrence of epidemic waves in the UK in 1918 was associated with patterns of socioeconomic status and age, potentially as a result of prior immunity within some age groups. An analysis by Andreasen et al. [3] suggests that immunological history might play a role. As many of these authors have noted, however, these results are strictly correlative and a combination of statistical analysis and mechanistic mathematical modelling is required to establish a causal relationship between such factors and the occurrence of multiple waves.

Several previous studies have taken a mechanistic approach to the 1918 pandemic. Mathews et al. [16] made a number of different hypotheses, fit associated mechanistic models to survey data from 12 localities in the UK, and explained multiple waves using waning of immunity derived from prior exposure. Bootsma & Ferguson [17] focused on two waves of influenza mortality in 1918 in 16 US cities; their results suggested that transmission rate variations owing to public health measures and individual behavioural responses to high mortality could have caused the autumn and winter waves in the USA. Chowell et al. [5] studied the two waves of influenza notifications in Geneva, Switzerland, and used a two-strain model to explain multiple waves. Influenza wave studies outside the context of the 1918 pandemic include Camacho et al.'s [18] study of two waves of influenza-like-illness cases in a period of two months on a tiny island (with 284 inhabitants) in 1971 and Matrajt & Longini's [19] investigation of the 2009 pandemic in the UK and USA.

Our group recently [20] fit mechanistic models to mortality data from the 1918 influenza pandemic in London, UK, and compared seven different dynamical mechanisms that might account for multiple waves. We found that temporal changes in the transmission rate provide the most plausible explanation. Here, we attempt to identify the most plausible biological, social or environmental processes that could account for temporal changes in transmission rate in 1918. The processes we consider are as follows.

  • Schools opening and closing. It has been found that closing schools for summer vacations had a large effect on structuring the 2009 influenza pandemic in the Northern Hemisphere. Closing schools in June contributed strongly to attenuating the first wave and re-opening schools in September appears to have triggered the second wave in the autumn [11,21–23]. In 1918, influenza incidence was highest in children and young adults ([24], figure 1), suggesting that the pattern of school terms may have been important in structuring the 1918 pandemic.

  • Weather changes. Much recent work has indicated that atmospheric conditions such as temperature and humidity play a significant role in influenza transmission [11,25,26].

  • Human behavioural responses. Restriction of human contact patterns affects infectious disease transmission. In the USA in 1918, regions that differed in the public health measures that were employed also differed in the temporal pattern of mortality [17]. In Australia in 1918, social distancing measures appear to have substantially reduced the influenza clinical attack rate [27].

We use dynamic transmission models to investigate whether transmission of influenza in the UK was affected by the above processes. We use sequential Monte Carlo methods [28,29] to construct statistical fits of our dynamic models to a 1918 mortality dataset that covers 334 administrative units in the UK (333 distinct areas in England and Wales, including the boroughs of London, and London as a whole).

2. Material and methods

We obtained information about UK school terms in 1918 from available historical documents. The beginning of the twentieth century was a time of considerable change in the UK educational system. Many schools were free, and most free schools were compulsory, up to around the age of 13. Unfortunately, we do not have detailed information on school closures for every city. In many agricultural regions, schools were closed each year for the harvest for four to eight weeks in July or August. In some non-agricultural regions, schools were closed from approximately mid-July to mid-August owing to the summer wave of pandemic influenza. The 1918 harvest was a great success and required a great deal of labour (see the electronic supplementary material).

Influenza transmission is also affected by weather; in particular, influenza virus survival is reduced when absolute humidity is high [26]. In 1918, peak mortality in London corresponded to a low-temperature event [30]. Since humidity records are not available, we use a measure of temperature (which is strongly correlated with absolute humidity) to model climatic effects on transmission rate.

Previous work [31] has indicated that, during the 2009 influenza pandemic, people changed their daily activities to reduce the risk of infection. Human behavioural responses to epidemics and pandemics have received much attention in recent years, and a variety of ways to incorporate such responses into epidemic models have been proposed [32–36]. Here, we assume a simple relationship between transmission rate and a measure of recent mortality, similar to the approach of Bootsma & Ferguson [17].

We specify the transmission process with a simple continuous-time compartmental stochastic epidemic model; we use particle filtering and likelihood-based inference [29] to test our model and estimate the parameters.

(a) Data

We used reported weekly influenza deaths during the 1918–1919 pandemic in 334 administrative units of England and Wales obtained from Johnson [37]. We also used the central England daily temperature obtained from the UK Met Office.

(b) Model

Our model is a simple susceptible-infectious-recovered framework with added variables for mortality and perception of risk. The mean field limit of our stochastic model is

\begin{align}
\frac{dS}{dt} &= -\frac{\beta S I}{N}, \tag{2.1a}\\
\frac{dI}{dt} &= \frac{\beta S I}{N} - \gamma I, \tag{2.1b}\\
\frac{dR}{dt} &= (1-\phi)\,\gamma I, \tag{2.1c}\\
\frac{dD}{dt} &= \phi\,\gamma I - g D, \tag{2.1d}\\
\frac{dM}{dt} &= g D, \tag{2.1e}\\
\frac{dP}{dt} &= g D - \lambda P. \tag{2.1f}
\end{align}

Here, S, I and R represent the numbers of susceptible, infectious and recovered individuals, respectively. We model mortality using two variables: D represents individuals who are no longer infectious and are on track to die from the effects of influenza, while M models those who have died of influenza; this approach allows us to incorporate the delay between a normal recovery time and the typical time of death. P represents the public perception of risk. It increases when people die and decays naturally, meaning that perception of risk diminishes over time in the absence of influenza deaths. P in turn affects the transmission rate β (as specified below). N = S + I + R + D + M is the total population (assumed to be constant).
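The mean-field dynamics can be explored numerically with a few lines of code. The following is a minimal sketch, assuming the equations take the standard form implied by the compartment definitions above (infection at rate βSI/N, exit from I at rate γ with a fraction ϕ entering D, exit from D at rate g, and P fed by deaths and decaying at rate λ). The transmission rate is held constant here for simplicity, and all numerical values other than the 4 day and 8 day periods are hypothetical.

```python
def mean_field_rhs(state, beta, gamma, g, lam, phi):
    """Right-hand side of the assumed mean-field equations for (S, I, R, D, M, P)."""
    S, I, R, D, M, P = state
    N = S + I + R + D + M            # total population, constant by construction
    infection = beta * S * I / N     # new infections per day
    return (
        -infection,                  # dS/dt
        infection - gamma * I,       # dI/dt
        (1.0 - phi) * gamma * I,     # dR/dt
        phi * gamma * I - g * D,     # dD/dt
        g * D,                       # dM/dt
        g * D - lam * P,             # dP/dt
    )


def integrate(state, days, dt, **params):
    """Forward-Euler integration; returns the state after `days` days."""
    for _ in range(int(round(days / dt))):
        deriv = mean_field_rhs(state, **params)
        state = tuple(x + dt * dx for x, dx in zip(state, deriv))
    return state


# Hypothetical illustration: 80% initially susceptible, constant beta.
initial_state = (80000.0, 10.0, 19990.0, 0.0, 0.0, 0.0)
final_state = integrate(
    initial_state, days=200, dt=0.05,
    beta=0.5,          # hypothetical constant transmission rate (per day)
    gamma=1.0 / 4.0,   # mean infectious period of 4 days
    g=1.0 / 8.0,       # mean time from loss of infectiousness to death of 8 days
    lam=1.0 / 10.0,    # hypothetical decay rate of risk perception
    phi=0.01,          # hypothetical case fatality proportion
)
print("cumulative deaths M after 200 days:", round(final_state[4], 1))
```

Forward Euler is used only to keep the sketch dependency-free; a stiff or adaptive solver would be a more careful choice for quantitative work.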

The case fatality proportion (CFP) is ϕ, and γ, g and λ denote the rates at which individuals leave the I, D and P compartments, respectively; thus γ⁻¹, g⁻¹ and λ⁻¹ are the mean infectious period (fixed at 4 days [38]), the mean time from loss of infectiousness to death (fixed at 8 days [17]) and the mean duration of impact of deaths on public perception (to be estimated). The transmission rate β is given by

\begin{equation}
\beta(t) = \beta_0 \, e^{-\xi T(t)} \, \bigl[1 + \alpha H(t)\bigr] \left[1 - \frac{P(t)}{N}\right]^{\kappa}. \tag{2.2}
\end{equation}

Here, β0 is the baseline transmission rate. T(t) denotes the temperature and ξ is the amplitude of the response of transmission rate to temperature changes. The quantity 1 + αH(t) represents seasonality of transmission associated with the school calendar [39–42], where α is the amplitude of school-term forcing and H(t) takes a fixed positive value on school days and the value −1 during vacations, chosen so that its integral over a year is zero. The final factor represents human behavioural responses to deaths, where P(t) is the perception of risk as specified in (2.1f) and κ is a parameter controlling the strength of the response.
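To make the three modulating factors concrete, the sketch below evaluates a transmission rate built as a product of a temperature factor, the school-term factor 1 + αH(t) and a behavioural factor. The exponential temperature dependence exp(−ξT) and the behavioural factor [1 − P/N]^κ are assumptions of this sketch, as are the function name and every numerical value.

```python
import math


def transmission_rate(beta0, xi, alpha, kappa, temperature, school, perceived, population):
    """Assumed form: beta0 * exp(-xi*T) * (1 + alpha*H) * (1 - P/N)**kappa."""
    temperature_factor = math.exp(-xi * temperature)              # colder -> higher transmission
    school_factor = 1.0 + alpha * school                          # H = -1 during vacations
    behaviour_factor = (1.0 - perceived / population) ** kappa    # equals 1 when P = 0
    return beta0 * temperature_factor * school_factor * behaviour_factor


# Hypothetical values chosen only to show how each factor acts.
baseline = transmission_rate(0.5, 0.05, 0.3, 200.0, temperature=0.0, school=0.0,
                             perceived=0.0, population=100000.0)
on_vacation = transmission_rate(0.5, 0.05, 0.3, 200.0, temperature=0.0, school=-1.0,
                                perceived=0.0, population=100000.0)
after_deaths = transmission_rate(0.5, 0.05, 0.3, 200.0, temperature=0.0, school=0.0,
                                 perceived=500.0, population=100000.0)
print(baseline, on_vacation, after_deaths)
```

Because the factors multiply, removing one of them (setting ξ, α or κ to zero) leaves the others unchanged, which is what allows nested models to be compared factor by factor.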

(i) Temperature T(t)

We obtained the daily temperature for central England from the UK Met Office. Unfortunately, humidity data are not available, but we expect absolute humidity to be closely correlated with air temperature [11].

(ii) School calendar H(t)

The UK had several types of schools, and the detailed patterns of school calendars are very complicated. Here, we assume that school closure dates between June 1918 and June 1919 consisted of three parts: a harvesting season (between 23 June 1918 and 15 October 1918) with floating start and end dates (to be estimated for each locality), two weeks over the Christmas/New Year period (23 December 1918 to 5 January 1919) and one week around Easter (15 April 1919 to 23 April 1919). We assume transmission was lower during vacations and higher when school was in session.
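The school-term function can be tabulated day by day. The sketch below is one possible construction, assuming H(t) = −1 on vacation days and a constant positive value on all other days, with that value chosen so that H sums to zero over the year. The summer dates passed in are hypothetical placeholders for the fitted start and end dates, and weekends are not treated separately.

```python
from datetime import date, timedelta


def school_calendar(summer_start, summer_end, year_start=date(1918, 6, 23), n_days=365):
    """Return {day: H} with H = -1 in vacations and a zero-mean positive value in term."""
    vacations = [
        (summer_start, summer_end),                # floating harvest/summer closure
        (date(1918, 12, 23), date(1919, 1, 5)),    # Christmas/New Year
        (date(1919, 4, 15), date(1919, 4, 23)),    # Easter
    ]
    days = [year_start + timedelta(days=k) for k in range(n_days)]
    on_vacation = [any(start <= d <= end for start, end in vacations) for d in days]
    n_vacation = sum(on_vacation)
    term_value = n_vacation / (n_days - n_vacation)   # makes the yearly sum of H zero
    return {d: (-1.0 if v else term_value) for d, v in zip(days, on_vacation)}


# Hypothetical summer closure from 15 July to 15 August 1918.
H = school_calendar(date(1918, 7, 15), date(1918, 8, 15))
print("H in term:", round(H[date(1918, 11, 1)], 4), " H in vacation:", H[date(1918, 12, 25)])
```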

(iii) Fitting

The model was simulated stochastically using the standard Euler-multinomial approach [29,43]. The simulated weekly mortality was sampled using a negative binomial distribution with the mean equal to the weekly increment in the M class. Thus, measurement error was assumed to be negative binomially distributed (with a tunable over-dispersion parameter ψ [43]).
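A stochastic step of this kind can be sketched as follows. Each individual leaves its compartment during a step of length dt with probability 1 − exp(−rate × dt), and those leaving I are split between R and D according to ϕ. This is an illustrative sketch rather than the fitting code: the perception variable P is omitted, the transmission rate is held constant, the binomial sampler is a simple stand-in suitable only for small populations, and the negative binomial is parameterised with variance mean + ψ × mean², which is an assumption about how ψ enters.

```python
import math
import random


def binomial(n, p, rng):
    """Binomial(n, p) draw by summing Bernoulli trials (adequate for small n)."""
    return sum(rng.random() < p for _ in range(n))


def euler_multinomial_step(state, beta, gamma, g, phi, dt, rng):
    """Advance (S, I, R, D, M) by dt; returns the new state and the new deaths."""
    S, I, R, D, M = state
    N = S + I + R + D + M
    new_inf = binomial(S, 1.0 - math.exp(-beta * I / N * dt), rng)
    leave_I = binomial(I, 1.0 - math.exp(-gamma * dt), rng)
    to_D = binomial(leave_I, phi, rng)              # fatal fraction of those leaving I
    deaths = binomial(D, 1.0 - math.exp(-g * dt), rng)
    new_state = (S - new_inf, I + new_inf - leave_I, R + leave_I - to_D,
                 D + to_D - deaths, M + deaths)
    return new_state, deaths


def neg_binomial_logpmf(y, mean, psi):
    """Log-probability of count y; requires mean > 0 and psi > 0."""
    r = 1.0 / psi
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mean)) + y * math.log(mean / (r + mean)))


rng = random.Random(1918)
state = (8000, 10, 1990, 0, 0)          # hypothetical small population
for _ in range(70):                     # 70 daily steps
    state, _ = euler_multinomial_step(state, beta=0.6, gamma=0.25, g=0.125,
                                      phi=0.02, dt=1.0, rng=rng)
print("final state (S, I, R, D, M):", state)
```

Drawing transitions from binomial distributions keeps every compartment a non-negative integer and conserves the total population exactly.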

We made the following assumptions about the unknown parameters: all parameters are non-negative; the CFP is between 0 and 1; the amplitude of school-term forcing is between 0 and 1; the start/end dates of school summer vacation are bounded by the beginning of the time series (23 June 1918) and 15 October 1918; the initial susceptible proportion is between 70 and 90%; the initial infected proportion is above 0.01%; the initial death list class (D) and death impact class (P) are empty. We fixed the mean infectious period and the mean time from loss of infectiousness to death to be 4 days and 8 days, respectively. In the electronic supplementary material, we show that using a lower initial susceptible proportion, or a shorter infectious period (2.5 days), does not substantially affect our results.

We fit the model by finding the maximum-likelihood values of the parameters using the iterated filtering method of Ionides and co-workers [28,44,45]. This likelihood-based inference framework has been used in several publications by both our group [11,29] and other groups [18].
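Iterated filtering builds on a particle filter, which estimates the likelihood of a stochastic model at fixed parameter values. The sketch below shows a plain bootstrap particle filter for a toy stochastic SIR model with weekly death counts; it is not iterated filtering itself (there is no parameter perturbation or maximisation step), and the model, function names, counts and parameter values are all hypothetical.

```python
import math
import random


def binomial(n, p, rng):
    """Binomial(n, p) draw by summing Bernoulli trials (adequate for small n)."""
    return sum(rng.random() < p for _ in range(n))


def simulate_week(state, beta, gamma, pop, rng):
    """Advance a toy (S, I) state by 7 daily steps; return the state and weekly removals."""
    S, I = state
    removed = 0
    for _ in range(7):
        new_inf = binomial(S, 1.0 - math.exp(-beta * I / pop), rng)
        new_rem = binomial(I, 1.0 - math.exp(-gamma), rng)
        S, I = S - new_inf, I + new_inf - new_rem
        removed += new_rem
    return (S, I), removed


def nb_logpmf(y, mean, psi):
    """Negative binomial log-probability with variance mean + psi * mean**2."""
    r = 1.0 / psi
    mean = max(mean, 1e-9)               # avoid log(0) when no removals occurred
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mean)) + y * math.log(mean / (r + mean)))


def particle_filter_loglik(counts, beta, gamma, phi, psi, pop, init, n_particles, rng):
    """Bootstrap particle filter estimate of the log-likelihood of weekly counts."""
    particles = [init] * n_particles
    loglik = 0.0
    for y in counts:
        moved = [simulate_week(p, beta, gamma, pop, rng) for p in particles]
        logw = [nb_logpmf(y, phi * rem, psi) for _, rem in moved]
        top = max(logw)                                   # stabilise the exponentials
        weights = [math.exp(lw - top) for lw in logw]
        loglik += top + math.log(sum(weights) / n_particles)
        particles = rng.choices([s for s, _ in moved], weights=weights, k=n_particles)
    return loglik


rng = random.Random(2013)
weekly_deaths = [0, 1, 2, 4, 5, 3, 2, 1]    # hypothetical counts, not data from the paper
loglik = particle_filter_loglik(weekly_deaths, beta=0.6, gamma=0.25, phi=0.05, psi=0.2,
                                pop=500, init=(400, 5), n_particles=50, rng=rng)
print("estimated log-likelihood:", round(loglik, 2))
```

In an iterated filtering scheme, a filter of this kind is run repeatedly with randomly perturbed parameters whose perturbation variance shrinks between iterations, so that the parameter swarm moves towards the maximum-likelihood estimate.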

3. Results

Table 1 presents maximum-likelihood parameter estimates for our best-fit model (which includes school closures, weather changes and behavioural responses); this table is based on the 50 largest administrative units in England and Wales (see the electronic supplementary material for a similar table based on all 334 administrative units). The iterated filtering algorithm [28,29] that we used converged well and provided biologically reasonable estimates for the vast majority of localities. In table 1, we report the basic reproductive number of the time-averaged system, R0 = ⟨β⟩/γ, where ⟨β⟩ represents the time average of the transmission rate β(t) with P = 0 in equation (2.2) [46]. The estimated value of R0 agrees with previous estimates [38]. The estimated CFP, ϕ, is smaller than published values, e.g. ϕ = 0.02 in [38]; thus, compared with previous work, our fits suggest a larger number of non-fatal infections.

Table 1.

Maximum-likelihood estimates for the parameters of our model (2.1) for the 50 largest administrative units in England and Wales. (R0 is the basic reproduction number. α is the amplitude of school-term forcing. ξ is the intensity of the effect of temperature variation. λ⁻¹ is the time scale of behavioural response. ϕ is the case fatality proportion (CFP). κ is the intensity of behavioural response. See §2 for details. A similar table for all 334 administrative units is given in the electronic supplementary material.)

While we have little information about other parameter values, our estimates seem reasonable. For example, the estimated amplitude (α) of school-term forcing suggests that the transmission rate was reduced by approximately 40% during school vacations. The estimated temperature intensity (ξ) suggests that a 10°C increase in temperature reduced the transmission rate by 43%. The estimated vacation start and end dates are in the range discussed in §2. The differences among estimated start/end dates for different cities are probably owing to real differences among cities. (The 95% CIs for estimates of all parameters for the 12 largest cities are given in the electronic supplementary material.)
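As a rough check on the temperature figure, and assuming that temperature enters the transmission rate through a multiplicative factor exp(−ξT) (an assumption of this sketch), a 43% reduction for a 10 °C rise corresponds to:

```latex
\[
e^{-10\,\xi} = 1 - 0.43 = 0.57
\quad\Longrightarrow\quad
\xi = -\frac{\ln 0.57}{10} \approx 0.056\ \text{per}\ ^{\circ}\mathrm{C}.
\]
```

Under the same assumption, the effect compounds multiplicatively, so a 20 °C rise would reduce transmission by about 1 − 0.57² ≈ 68% rather than 86%.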

We investigated the importance of our three explanatory factors (behavioural response, school calendar and climate) by calculating how much the fit of the model is degraded when each is removed. Figure 2 shows a histogram of the change in the corrected Akaike information criterion (ΔAICc) between the full model and models missing each factor in turn, for the largest 50 of the 334 administrative units (results for all 334 units are shown in the electronic supplementary material). For all three explanatory factors, most ΔAICc values are very large, implying that all were important in determining the course of the epidemic. Behavioural response was the most important factor, with uniformly large ΔAICc in the 50 largest cities. The importance of school terms and temperature changes is more variable across cities, but overall fairly high. The variability is probably owing to problems of limited data, noise and conflation among predictor variables.
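The comparison can be reproduced from maximised log-likelihoods using the standard small-sample formula AICc = 2k − 2 ln L + 2k(k + 1)/(n − k − 1), where k is the number of fitted parameters and n the number of observations. The sketch below uses hypothetical log-likelihoods and parameter counts; only the 46 weekly observations per locality come from the text.

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected Akaike information criterion for small samples."""
    aic = 2 * n_params - 2 * log_likelihood
    correction = 2 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic + correction


# Hypothetical fits for one locality with 46 weekly observations.
full_model = aicc(log_likelihood=-100.0, n_params=5, n_obs=46)
reduced_model = aicc(log_likelihood=-110.0, n_params=4, n_obs=46)

# Positive values mean that removing the factor degrades the fit.
delta_aicc = reduced_model - full_model
print("delta AICc:", round(delta_aicc, 2))
```

A reduced model has fewer parameters and is therefore penalised less, so a large positive ΔAICc indicates that the loss in likelihood outweighs the gain in parsimony.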

Figure 2.

Histograms of (a) ΔAICc (no behavioural response), (b) ΔAICc (no response to school holidays), (c) ΔAICc (no response to temperature) for the 50 largest administrative units in England and Wales. A similar figure for all 334 administrative units is given in the electronic supplementary material. (Online version in colour.)

Model simulations for the three largest cities are shown in figure 3. Four models are shown: (i) the best model, in which transmission is modulated by all three factors (human behavioural response, school terms and temperature), (ii) a model without behavioural responses, (iii) a model without school terms, and (iv) a model without temperature change. The full model provides an excellent match of simulations to the observed mortality. The other three models show relatively poorer fits in all cities, especially the model without behavioural responses.

Figure 3.

Simulations compared with observed weekly mortality in 1918. We show: the best model, forced by three factors (human behavioural response, school terms and temperature); a model with no behavioural effects; a model with no school terms and a model with no temperature effects. With estimated parameter values for each model, 1000 simulations were generated (1000 × 46 weekly data points) and displayed using box-plots (one box-plot for each week), compared with observed weekly influenza mortality (46 weeks, solid (red) curve). The three rows show results for three UK cities. The four columns show the four types of model. (Online version in colour.)

4. Discussion

Disentangling the interactions of causal factors underlying patterns of spread in historical epidemics is a difficult problem, in part due to data limitations. Here, we have used a mechanistic transmission model and likelihood-based inference to investigate factors underlying the three-wave shape of the 1918 influenza epidemic in England and Wales. Building on our earlier finding that changes in transmission rate are the most likely dynamical explanation for the multiple waves, we explored three possible mechanisms that could underlie such change: behavioural change, seasonal weather, and changes in mixing patterns owing to harvest and holidays.

Our main model does not consider the possible effects of viral evolution affecting either infectiousness or immunity (via antigenic change in the virus). Because the periods of time are short, and the proportion of the population infected in the first wave was small [47], these seem unlikely to affect our main conclusions. Nevertheless, in a model in the electronic supplementary material, we explore the potential effects of antigenic change of the pandemic virus. The fit of this model (as measured by AICc) is worse than our main model in most cities.

Since we lack detailed information on both individual-level and public health responses to the 1918 pandemic in England and Wales, we modelled behaviour change using a simple dynamic response function that assumes transmission goes down as the number of ‘recent’ deaths goes up, where ‘recent’ is measured on a timescale that is fitted by the model.

We modelled seasonal effects using temperature, since we had an available temperature time series, and we expect temperature to be strongly correlated with absolute humidity [11]. We modelled school and harvest effects using holiday dates inferred from historical documents and summer vacation dates fitted by our model.

We fitted a simple mechanistic model and obtained reasonable parameter estimates. We found that the observed three-wave pattern of pandemic influenza in the UK in 1918 is best explained by accounting for all three factors. Without behaviour change, the model is unable to explain the occurrence of all three waves. In particular, the decline of the second wave in the winter followed by a resurgence can only be explained by behavioural response, at least in our framework. The end of the first wave, after a relatively small number of deaths, on the other hand, is more likely to be due to climate or school-term effects. These two effects can substitute for each other to a certain extent, but our model shows that accounting for both does a much better job of explaining the observed patterns, suggesting that both mechanisms were in fact important.

For each of the explanatory factors that we considered (behavioural response, school terms and weather), we chose a simple function designed to capture its approximate effects. These choices are certainly not perfect, but the weekly death data are not rich enough to support more complicated models, or to allow us to compare simple models with confidence. In principle, we could obtain more information about some of our parameters from other sources (e.g. about school vacations, time distributions from infection to mortality, the response of influenza virus to temperature), reducing the number of parameters that need to be fitted.

In the electronic supplementary material, we present sensitivity analyses of our assumptions concerning the mean infectious period, the mean time from infection to death and the initial susceptible proportion. Our main results are robust to changes in the values of these parameters. Our estimate of the CFP does depend on the degree of initial susceptibility (a smaller initially susceptible proportion yields a higher CFP).

We conclude that behavioural changes, temperature trends and school closure all contributed to the observed three-wave mortality patterns in the UK during the 1918 influenza pandemic, and that behavioural changes had the largest effect.

Funding statement

We were supported by the Canadian Institutes of Health Research (CIHR), the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Public Health Agency of Canada (PHAC). D.H. was supported by a start-up grant from the Department of Applied Mathematics at Hong Kong Polytechnic University (HKPU) and an HKPU Central Bidding grant.


We thank the Boutros Laboratory at the Ontario Institutes for Cancer Research, the Rohani and King Laboratory at the University of Michigan, Chris Bauch, and Raluca Eftimie for helpful discussions. We thank two anonymous referees for helpful comments on the manuscript. Computations were performed on SHARCNet. The views expressed in this paper do not reflect the views of PHAC.

  • Received May 28, 2013.
  • Accepted June 14, 2013.

