Nicholas Reich: Biostatistics and Infectious Disease Epidemiology

2025

Forecasting COVID-19 with Temporal Hierarchies and Ensemble Methods

Shandross L, Ray EL, Rogers BW, Reich NG (2025). medRxiv.

forecasting covid-19

Abstract

Infectious disease forecasting efforts underwent rapid growth during the COVID-19 pandemic, providing guidance for pandemic response and about potential future trends. Yet despite their importance, short-term forecasting models often struggled to produce accurate real-time predictions of this complex and rapidly changing system. This gap in accuracy persisted into the pandemic and warrants the exploration and testing of new methods to glean fresh insights. In this work, we examined the application of the temporal hierarchical forecasting (THieF) methodology to probabilistic forecasts of COVID-19 incident hospital admissions in the United States. THieF is an innovative forecasting technique that aggregates time-series data into a hierarchy made up of different temporal scales, produces forecasts at each level of the hierarchy, then reconciles those forecasts using optimized weighted forecast combination. While THieF’s unique approach has shown substantial accuracy improvements in a diverse range of applications, such as operations management and emergency room admission predictions, this technique had not previously been applied to outbreak forecasting. We generated candidate models formulated using the THieF methodology, which differed by their hierarchy schemes and data transformations, and ensembles of the THieF models, computed as a mean of predictive quantiles. The models were evaluated using weighted interval score (WIS) as a measure of forecast skill, and the top-performing subset was compared to a group of benchmark models. These models included simple ARIMA and seasonal ARIMA models, an ensemble of these ARIMA models, a naive baseline model, four operational incident hospitalization models from the U.S. COVID-19 Forecast Hub, and an equally-weighted quantile median of all models that submitted incident hospitalization forecasts to the Forecast Hub.
The THieF models and THieF ensembles demonstrated improvements in WIS and MAE, as well as competitive prediction interval coverage, over many benchmark models for both the validation and testing phases. The best THieF model’s rank oscillated between second and third out of fourteen total models during the testing evaluation. These accuracy improvements suggest the THieF methodology may serve as a useful addition to the infectious disease forecasting toolkit.
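
As an aside for readers unfamiliar with the weighted interval score (WIS) mentioned in this abstract, the following is a minimal Python sketch of its commonly used definition for a forecast given as a median plus a set of central prediction intervals. The function names and example numbers are illustrative assumptions and are not taken from the paper's code.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score for a central (1 - alpha) prediction interval."""
    width = upper - lower
    # Penalties apply only when the observation falls outside the interval.
    below = (2.0 / alpha) * max(lower - y, 0.0)
    above = (2.0 / alpha) * max(y - upper, 0.0)
    return width + below + above


def weighted_interval_score(median, intervals, y):
    """WIS from a median and a dict mapping alpha -> (lower, upper).

    Uses weights 1/2 for the median term and alpha/2 for each interval,
    normalized by (K + 1/2) where K is the number of intervals.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 10, a 50% interval and a 90% interval.
example_intervals = {0.5: (8.0, 12.0), 0.1: (5.0, 15.0)}
print(weighted_interval_score(10.0, example_intervals, 10.0))  # observation at the median
print(weighted_interval_score(10.0, example_intervals, 20.0))  # observation far outside
```

Lower scores are better: the second call scores much worse than the first because the observation lies outside both intervals.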

Evaluation of FluSight influenza forecasting in the 2021–22 and 2022–23 seasons with a new target laboratory-confirmed influenza hospitalizations

Mathis SM, Webber AE, ... Cramer EY, Gerding A, Stark A, Ray EL, Reich NG, Shandross L, Wattanachit N, Wang Y, Zorn MW, ... Reed C, Biggerstaff M, Borchering RK (2025). Nature Communications, 15(1): 6289.

forecasting flusight influenza

Abstract

Accurate forecasts can enable more effective public health responses during seasonal influenza epidemics. For the 2021–22 and 2022–23 influenza seasons, 26 forecasting teams provided national and jurisdiction-specific probabilistic predictions of weekly confirmed influenza hospital admissions for one-to-four weeks ahead. Forecast skill is evaluated using the Weighted Interval Score (WIS), relative WIS, and coverage. Six out of 23 models outperform the baseline model across forecast weeks and locations in 2021–22 and 12 out of 18 models in 2022–23. Averaging across all forecast targets, the FluSight ensemble is the 2nd most accurate model measured by WIS in 2021–22 and the 5th most accurate in the 2022–23 season. Forecast skill and 95% coverage for the FluSight ensemble and most component models degrade over longer forecast horizons. In this work we demonstrate that while the FluSight ensemble was a robust predictor, even ensembles face challenges during periods of rapid change.

2024

Evaluating infectious disease forecasts with allocation scoring rules

Gerding A, Reich NG, Rogers B, Ray EL (2024). Journal of the Royal Statistical Society Series A: Statistics in Society.

forecasting covid-19

Abstract

Recent years have seen increasing efforts to forecast infectious disease burdens, with a primary goal being to help public health workers make informed policy decisions. However, there has been only limited discussion of how predominant forecast evaluation metrics might indicate the success of policies based in part on those forecasts. We explore one possible tether between forecasts and policy: the allocation of limited medical resources so as to minimize unmet need. We use probabilistic forecasts of disease burden in each of several regions to determine optimal resource allocations, and then we score forecasts according to how much unmet need their associated allocations would have allowed. We illustrate with forecasts of COVID-19 hospitalizations in the U.S., and we find that the forecast skill ranking given by this allocation scoring rule can vary substantially from the ranking given by the weighted interval score. We see this as evidence that the allocation scoring rule detects forecast value that is missed by traditional accuracy measures and that the general strategy of designing scoring rules that are directly linked to policy performance is a promising direction for epidemic forecast evaluation.

Beyond forecast leaderboards: Measuring individual model importance based on contribution to ensemble accuracy

Kim M, Ray EL, Reich NG (2024). arXiv.

forecasting covid-19

Abstract

Ensemble forecasts often outperform forecasts from individual standalone models, and have been used to support decision-making and policy planning in various fields. As collaborative forecasting efforts to create effective ensembles grow, so does interest in understanding individual models' relative importance in the ensemble. To this end, we propose two practical methods that measure the difference between ensemble performance when a given model is or is not included in the ensemble: a leave-one-model-out algorithm and a leave-all-subsets-of-models-out algorithm, which is based on the Shapley value. We explore the relationship between these metrics, forecast accuracy, and the similarity of errors, both analytically and through simulations. We illustrate this measure of the value a component model adds to an ensemble in the presence of other models using US COVID-19 death forecasts. This study offers valuable insight into individual models' unique features within an ensemble, which standard accuracy metrics alone cannot reveal.
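
The leave-one-model-out idea described in this abstract can be illustrated with a small Python sketch. It makes simplifying assumptions that are not from the paper: forecasts are single point values, the ensemble is an equally weighted mean, and accuracy is measured by absolute error. The model names and numbers are hypothetical.

```python
from statistics import mean


def ensemble_error(forecasts, truth):
    """Absolute error of the equally weighted mean ensemble."""
    return abs(mean(forecasts.values()) - truth)


def leave_one_model_out(forecasts, truth):
    """Importance of each model: error without it minus error with it.

    A positive value means the ensemble gets worse when the model is
    removed, so the model is contributing to ensemble accuracy.
    """
    full_error = ensemble_error(forecasts, truth)
    importance = {}
    for name in forecasts:
        reduced = {k: v for k, v in forecasts.items() if k != name}
        importance[name] = ensemble_error(reduced, truth) - full_error
    return importance


# Hypothetical point forecasts from three models, with an observed value of 100.
forecasts = {"A": 90.0, "B": 110.0, "C": 130.0}
for name, score in leave_one_model_out(forecasts, 100.0).items():
    print(name, score)
```

In this toy case model A is the only one that under-predicts, so removing it hurts the ensemble most, even though A and B have the same individual error. This is the kind of information a standalone accuracy ranking would not show.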

Optimizing Disease Outbreak Forecast Ensembles

Fox SJ, Kim M, Meyers LA, Reich NG, Ray EL (2024). Emerging Infectious Diseases, 30(9): 1967-1969.

forecasting covid-19 influenza

Abstract

On the basis of historical influenza and COVID-19 forecasts, we found that more than 3 forecast models are needed to ensure robust ensemble accuracy. Additional models can improve ensemble performance, but with diminishing accuracy returns. This understanding will assist with the design of current and future collaborative infectious disease forecasting efforts.

Flusion: Integrating multiple data sources for accurate influenza predictions

Ray EL, Wang Y, Wolfinger RD, Reich NG (2024). arXiv.

forecasting influenza flusight

Abstract

Over the last ten years, the US Centers for Disease Control and Prevention (CDC) has organized an annual influenza forecasting challenge with the motivation that accurate probabilistic forecasts could improve situational awareness and yield more effective public health actions. Starting with the 2021/22 influenza season, the forecasting targets for this challenge have been based on hospital admissions reported in the CDC's National Healthcare Safety Network (NHSN) surveillance system. Reporting of influenza hospital admissions through NHSN began within the last few years, and as such only a limited amount of historical data are available for this signal. To produce forecasts in the presence of limited data for the target surveillance system, we augmented these data with two signals that have a longer historical record: 1) ILI+, which estimates the proportion of outpatient doctor visits where the patient has influenza; and 2) rates of laboratory-confirmed influenza hospitalizations at a selected set of healthcare facilities. Our model, Flusion, is an ensemble that combines gradient boosting quantile regression models with a Bayesian autoregressive model. The gradient boosting models were trained on all three data signals, while the autoregressive model was trained on only the target signal; all models were trained jointly on data for multiple locations. Flusion was the top-performing model in the CDC's influenza prediction challenge for the 2023/24 season. In this article we investigate the factors contributing to Flusion's success, and we find that its strong performance was primarily driven by the use of a gradient boosting model that was trained jointly on data from multiple surveillance signals and locations. These results indicate the value of sharing information across locations and surveillance signals, especially when doing so adds to the pool of available training data.

Infectious disease surveillance needs for the United States: lessons from Covid-19

Lipsitch M, Bassett MT, Brownstein JS, ... Reich NG, ... Truelove S, Varma JK, Grad YH (2024). Frontiers in Public Health, 12: 1408193.

covid-19

Abstract

The COVID-19 pandemic has highlighted the need to upgrade systems for infectious disease surveillance and forecasting and modeling of the spread of infection, both of which inform evidence-based public health guidance and policies. Here, we discuss requirements for an effective surveillance system to support decision making during a pandemic, drawing on the lessons of COVID-19 in the U.S., while looking to jurisdictions in the U.S. and beyond to learn lessons about the value of specific data types. In this report, we define the range of decisions for which surveillance data are required, the data elements needed to inform these decisions and to calibrate inputs and outputs of transmission-dynamic models, and the types of data needed to inform decisions by state, territorial, local, and tribal health authorities. We define actions needed to ensure that such data will be available and consider the contribution of such efforts to improving health equity.

hubEnsembles: Ensembling Methods in R

Shandross L, Howerton E, Contamin L, Hochheiser H, Krystalli A, Consortium of Infectious Disease Modeling Hubs, Reich NG, Ray EL (2024). medRxiv.

forecasting software

Abstract

Combining predictions from multiple models into an ensemble is a widely used practice across many fields with demonstrated performance benefits. The R package hubEnsembles provides a flexible framework for ensembling various types of predictions, including point estimates and probabilistic predictions. A range of common methods for generating ensembles are supported, including weighted averages, quantile averages, and linear pools. The hubEnsembles package fits within a broader framework of open-source software and data tools called the “hubverse”, which facilitates the development and management of collaborative modelling exercises.
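
To make the difference between a quantile average and a linear pool concrete, here is a minimal Python sketch using only the standard library. It does not use or reproduce the hubEnsembles API (which is an R package); the function names, the two normal component distributions, and the bisection bounds are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def quantile_average(dists, p):
    """Average the component quantiles at probability level p."""
    return sum(d.inv_cdf(p) for d in dists) / len(dists)


def linear_pool_quantile(dists, p, lo=-1e6, hi=1e6, tol=1e-9):
    """Quantile of the equally weighted mixture (linear pool).

    The pooled CDF is the average of the component CDFs; it is inverted
    here by bisection on a wide bracket [lo, hi].
    """
    def pooled_cdf(x):
        return sum(d.cdf(x) for d in dists) / len(dists)

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if pooled_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Two hypothetical component forecasts that disagree strongly.
components = [NormalDist(mu=0.0, sigma=1.0), NormalDist(mu=10.0, sigma=1.0)]
print(quantile_average(components, 0.975))      # stays between the components
print(linear_pool_quantile(components, 0.975))  # reaches into the upper component's tail
```

When components disagree, the linear pool keeps the between-model spread and so yields wider intervals, while the quantile average produces a narrower distribution centered between the components.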

Challenges of COVID-19 Case Forecasting in the US, 2020-2021

Lopez V, Cramer EY, Pagano R, ... Biggerstaff M, Reich NG, Johansson MA (2024). PLOS Comp Bio.

forecasting covid-19

Abstract

During the COVID-19 pandemic, forecasting COVID-19 trends to support planning and response was a priority for scientists and decision makers alike. In the United States, COVID-19 forecasting was coordinated by a large group of universities, companies, and government entities led by the Centers for Disease Control and Prevention and the US COVID-19 Forecast Hub (https://covid19forecasthub.org). We evaluated approximately 9.7 million forecasts of weekly state-level COVID-19 cases for predictions 1-4 weeks into the future submitted by 24 teams from August 2020 to December 2021. We assessed coverage of central prediction intervals and weighted interval scores (WIS), adjusting for missing forecasts relative to a baseline forecast, and used a Gaussian generalized estimating equation (GEE) model to evaluate differences in skill across epidemic phases that were defined by the effective reproduction number. Overall, we found high variation in skill across individual models, with ensemble-based forecasts outperforming other approaches. Forecast skill relative to the baseline was generally higher for larger jurisdictions (e.g., states compared to counties). Over time, forecasts generally performed worst in periods of rapid changes in reported cases (either in increasing or decreasing epidemic phases) with 95% prediction interval coverage dropping below 50% during the growth phases of the winter 2020, Delta, and Omicron waves. Ideally, case forecasts could serve as a leading indicator of changes in transmission dynamics. However, while most COVID-19 case forecasts outperformed a naïve baseline model, even the most accurate case forecasts were unreliable in key phases. Further research could improve forecasts of leading indicators, like COVID-19 cases, by leveraging additional real-time data, addressing performance across phases, improving the characterization of forecast confidence, and ensuring that forecasts were coherent across spatial scales. 
In the meantime, it is critical for forecast users to appreciate current limitations and use a broad set of indicators to inform pandemic-related decision making.

2023

Evaluation of the US COVID-19 Scenario Modeling Hub for informing pandemic response under uncertainty

Howerton E, Contamin L, Mullany LC, Qin M, Reich NG, ... Viboud C, Lessler J (2023). Nature Communications, 14(1): 7260.

forecasting covid-19

Abstract

Our ability to forecast epidemics far into the future is constrained by the many complexities of disease systems. Realistic longer-term projections may, however, be possible under well-defined scenarios that specify the future state of critical epidemic drivers. Since December 2020, the U.S. COVID-19 Scenario Modeling Hub (SMH) has convened multiple modeling teams to make months ahead projections of SARS-CoV-2 burden, totaling nearly 1.8 million national and state-level projections. Here, we find SMH performance varied widely as a function of both scenario validity and model calibration. We show scenarios remained close to reality for 22 weeks on average before the arrival of unanticipated SARS-CoV-2 variants invalidated key assumptions. An ensemble of participating models that preserved variation between models (using the linear opinion pool method) was consistently more reliable than any single model in periods of valid scenario assumptions, while projection interval coverage was near target levels. SMH projections were used to guide pandemic response, illustrating the value of collaborative hubs for longer-term scenario projections.

Real-time mechanistic Bayesian forecasts of COVID-19 mortality

Gibson GC, Reich NG, Sheldon D (2023). Annals of Applied Statistics, 17(3): 1801-1819.

covid-19

Abstract

The COVID-19 pandemic emerged in late December 2019. In the first six months of the global outbreak, the US reported more cases and deaths than any other country in the world. Effective modeling of the course of the pandemic can help assist with public health resource planning, intervention efforts, and vaccine clinical trials. However, building applied forecasting models presents unique challenges during a pandemic. First, case data available to models in real-time represent a non-stationary fraction of the true case incidence due to changes in available diagnostic tests and test-seeking behavior. Second, interventions varied across time and geography leading to large changes in transmissibility over the course of the pandemic. We propose a mechanistic Bayesian model (MechBayes) that builds upon the classic compartmental susceptible-exposed-infected-recovered (SEIR) model to operationalize COVID-19 forecasting in real time. This framework includes non-parametric modeling of varying transmission rates, non-parametric modeling of case and death discrepancies due to testing and reporting issues, and a joint observation likelihood on new case counts and new deaths; it is implemented in a probabilistic programming language to automate the use of Bayesian reasoning for quantifying uncertainty in probabilistic forecasts. The model has been used to submit forecasts to the US Centers for Disease Control, through the COVID-19 Forecast Hub. We examine the performance relative to a baseline model as well as alternate models submitted to the Forecast Hub. Additionally, we include an ablation test of our extensions to the classic SEIR models. We demonstrate a significant gain in both point and probabilistic forecast scoring measures using MechBayes when compared to a baseline model. We show that MechBayes ranks as one of the top models out of those submitted to the COVID-19 Forecast Hub. 
Finally, we demonstrate that MechBayes performs significantly better than the classical SEIR model.

Mixture distributions for probabilistic forecasts of disease outbreaks

Wadsworth S, Niemi J, Reich NG (2023). arXiv.

forecasting

Abstract

Collaboration among multiple teams has played a major role in probabilistic forecasting events of influenza outbreaks, the COVID-19 pandemic, other disease outbreaks, and in many other fields. When collecting forecasts from individual teams, ensuring that each team's model represents forecast uncertainty according to the same format allows for direct comparison of forecasts as well as methods of constructing multi-model ensemble forecasts. This paper outlines several common probabilistic forecast representation formats including parametric distributions, sample distributions, bin distributions, and quantiles and compares their use in the context of collaborative projects. We propose the use of a discrete mixture distribution format in collaborative forecasting in place of other formats. The flexibility in distribution shape, the ease for scoring and building ensemble models, and the reasonably low level of computer storage required to store such a forecast make the discrete mixture distribution an attractive alternative to the other representation formats.

Comparison of combination methods to create calibrated ensemble forecasts for seasonal influenza in the U.S.

Wattanachit N, Ray EL, McAndrew TC, Reich NG (2023). Statistics in Medicine, 42(26): 4696-4712.

forecasting influenza

Abstract

The characteristics of influenza seasons vary substantially from year to year, posing challenges for public health preparation and response. Influenza forecasting is used to inform seasonal outbreak response, which can in turn potentially reduce the impact of an epidemic. The United States Centers for Disease Control and Prevention, in collaboration with external researchers, has run an annual prospective influenza forecasting exercise, known as the FluSight challenge. Uniting theoretical results from the forecasting literature with domain-specific forecasts from influenza outbreaks, we applied parametric forecast combination methods that simultaneously optimize model weights and calibrate the ensemble via a beta transformation and made adjustments to the methods to reduce their complexity. We used the beta-transformed linear pool, the finite beta mixture model, and their equal weight adaptations to produce ensemble forecasts retrospectively for the 2016/2017, 2017/2018, and 2018/2019 influenza seasons in the U.S. We compared their performance to methods that were used in the FluSight challenge to produce the FluSight Network ensemble, namely the equally weighted linear pool and the linear pool. Ensemble forecasts produced from methods with a beta transformation were shown to outperform those from the equally weighted linear pool and the linear pool for all week-ahead targets across the test seasons based on average log scores. We observed improvements in overall accuracy despite the beta-transformed linear pool or beta mixture methods' modest under-prediction across all targets and seasons. Combination techniques that explicitly adjust for known calibration issues in linear pooling should be considered to improve probabilistic scores in outbreak settings.

Predictive performance of multi-model ensemble forecasts of COVID-19 across European nations

Sherratt K, Gruson H, Grah R, ... Gibson GC, Ray EL, Reich NG, Sheldon D, Wang Y, Wattanachit N, ... Bracher J, Funk S (2023). eLife, 12: e81916.

forecasting covid-19

Abstract

Background: Short-term forecasts of infectious disease contribute to situational awareness and capacity planning. Based on best practice in other fields and recent insights in infectious disease epidemiology, one can maximise forecasts’ predictive performance by combining independent models into an ensemble. Here we report the performance of ensemble predictions of COVID-19 cases and deaths across Europe from March 2021 to March 2022. Methods: We created the European COVID-19 Forecast Hub, an online open-access platform where modellers upload weekly forecasts for 32 countries with results publicly visualised and evaluated. We created a weekly ensemble forecast from the equally-weighted average across individual models' predictive quantiles. We measured forecast accuracy using a baseline and relative Weighted Interval Score (rWIS). We retrospectively explored ensemble methods, including weighting by past performance. Results: We collected weekly forecasts from 48 models, of which we evaluated 29 models alongside the ensemble model. The ensemble had a consistently strong performance across countries over time, performing better on rWIS than 91% of forecasts for deaths (N=763 predictions from 20 models), and 83% of forecasts for cases (N=886 predictions from 23 models). Performance remained stable over a 4-week horizon for death forecasts but declined with longer horizons for cases. Among ensemble methods, the most influential choice came from using a median average instead of the mean, regardless of weighting component models. Conclusions: Our results support combining independent models into an ensemble forecast to improve epidemiological predictions, and suggest that median averages yield better performance than methods based on means. We highlight that forecast consumers should place more weight on incident death forecasts than case forecasts at horizons greater than two weeks.
Code and data availability: All source data were openly available before the study, originally available at: https://github.com/covid19-forecast-hub-europe/covid19-forecast-hub-europe. All data and code for this study are openly available on Github: covid19-forecast-hub-europe/euro-hub-ensemble.

Assessing the utility of COVID-19 case reports as a leading indicator for hospitalization forecasting in the United States

Reich NG, Wang Y, Burns M, Ergas R, Cramer EY, Ray EL (2023). Epidemics, 45: 100728.

forecasting covid-19

Abstract

Identifying data streams that can consistently improve the accuracy of epidemiological forecasting models is challenging. Using models designed to predict daily state-level hospital admissions due to COVID-19 in California and Massachusetts, we investigated whether incorporating COVID-19 case data systematically improved forecast accuracy. Additionally, we considered whether using case data aggregated by date of test or by date of report from a surveillance system made a difference to the forecast accuracy. Evaluating forecast accuracy in a test period, after first having selected the best-performing methods in a validation period, we found that overall the difference in accuracy between approaches was small, especially at forecast horizons of less than two weeks. However, forecasts from models using cases aggregated by test date showed lower accuracy at longer horizons and at key moments in the pandemic, such as the peak of the Omicron wave in January 2022. Overall, these results highlight the challenge of finding a modeling approach that can generate accurate forecasts of outbreak trends both during periods of relative stability and during periods that show rapid growth or decay of transmission rates. While COVID-19 case counts seem to be a natural choice to help predict COVID-19 hospitalizations, in practice any benefits we observed were small and inconsistent.

Impact of SARS-CoV-2 vaccination of children ages 5-11 years on COVID-19 disease burden and resilience to new variants in the United States, November 2021-March 2022: a multi-model study

Borchering RK, Mullany LC, Howerton E, Chinazzi M, Smith CP, Qin M, Reich NG, ... Viboud C, Lessler J (2023). The Lancet Regional Health-Americas, 17: 100398.

covid-19

Abstract

Background: The COVID-19 Scenario Modeling Hub convened nine modeling teams to project the impact of expanding SARS-CoV-2 vaccination to children aged 5–11 years on COVID-19 burden and resilience against variant strains. Methods: Teams contributed state- and national-level weekly projections of cases, hospitalizations, and deaths in the United States from September 12, 2021 to March 12, 2022. Four scenarios covered all combinations of 1) vaccination (or not) of children aged 5–11 years (starting November 1, 2021), and 2) emergence (or not) of a variant more transmissible than the Delta variant (emerging November 15, 2021). Individual team projections were linearly pooled. The effect of childhood vaccination on overall and age-specific outcomes was estimated using meta-analyses. Findings: Assuming that a new variant would not emerge, all-age COVID-19 outcomes were projected to decrease nationally through mid-March 2022. In this setting, vaccination of children 5–11 years old was associated with reductions in projections for all-age cumulative cases (7.2%, mean incidence ratio [IR] 0.928, 95% confidence interval [CI] 0.880–0.977), hospitalizations (8.7%, mean IR 0.913, 95% CI 0.834–0.992), and deaths (9.2%, mean IR 0.908, 95% CI 0.797–1.020) compared with scenarios without childhood vaccination. Vaccine benefits increased for scenarios including a hypothesized more transmissible variant, assuming similar vaccine effectiveness. Projected relative reductions in cumulative outcomes were larger for children than for the entire population. State-level variation was observed. Interpretation: Given the scenario assumptions (defined before the emergence of Omicron), expanding vaccination to children 5–11 years old would provide measurable direct benefits, as well as indirect benefits to the all-age U.S. population, including resilience to more transmissible variants.

2022

An expert judgment model to predict early stages of the COVID-19 outbreak in the United States

McAndrew TC, Reich NG (2022). PLOS Comp Bio, 18(9): e1010485.

covid-19 forecasting

Abstract

From February to May 2020, experts in the modeling of infectious disease provided quantitative predictions and estimates of trends in the emerging COVID-19 pandemic in a series of 13 surveys. Data on existing transmission patterns were sparse when the pandemic began, but experts synthesized information available to them to provide quantitative, judgment-based assessments of the current and future state of the pandemic. We aggregated expert predictions into a single “linear pool” by taking an equally weighted average of their probabilistic statements. At a time when few computational models made public estimates or predictions about the pandemic, expert judgment provided (a) falsifiable predictions of short- and long-term pandemic outcomes related to reported COVID-19 cases, hospitalizations, and deaths, (b) estimates of latent viral transmission, and (c) counterfactual assessments of pandemic trajectories under different scenarios. The linear pool approach of aggregating expert predictions provided more consistently accurate predictions than any individual expert, although the predictive accuracy of a linear pool rarely provided the most accurate prediction. This work highlights the importance that expert linear pool could play in flexibly assessing a wide array of risks early in future emerging outbreaks, especially in settings where available data cannot yet support data-driven computational modeling.

An evaluation of prospective COVID-19 modelling studies in the USA: from data to science translation

Nixon K, Jindal S, Parker F, Reich NG, Ghobadi K, Lee EC, Truelove S, Gardner L (2022). The Lancet Digital Health, 4(10): e738-e747.

forecasting covid-19

Abstract

Infectious disease modelling can serve as a powerful tool for situational awareness and decision support for policy makers. However, COVID-19 modelling efforts faced many challenges, from poor data quality to changing policy and human behaviour. To extract practical insight from the large body of COVID-19 modelling literature available, we provide a narrative review with a systematic approach that quantitatively assessed prospective, data-driven modelling studies of COVID-19 in the USA. We analysed 136 papers, and focused on the aspects of models that are essential for decision makers. We have documented the forecasting window, methodology, prediction target, datasets used, and geographical resolution for each study. We also found that a large fraction of papers did not evaluate performance (25%), express uncertainty (50%), or state limitations (36%). To remedy some of these identified gaps, we recommend the adoption of the EPIFORGE 2020 model reporting guidelines and creating an information-sharing system that is suitable for fast-paced infectious disease outbreak science.

Real-time COVID-19 forecasting: challenges and opportunities of model performance and translation

Nixon K, Jindal S, Parker F, Marshall M, Reich NG, Ghobadi K, Lee EC, Truelove S, Gardner L (2022). The Lancet Digital Health, 4(10): e699–e701.

forecasting covid-19

Abstract

The COVID-19 pandemic brought mathematical modelling into the spotlight, as scientists rushed to use data to understand transmission patterns and disease severity, and to anticipate future epidemic outcomes. However, the use of COVID-19 modelling has been criticised, in part because of a few particularly erroneous projections at the start of the pandemic [1]. More than 2 years into the pandemic, models continue to face serious obstacles as tools for informing outbreak response [1]. Population-level health outcomes are difficult to predict accurately, especially cases and hospitalisations [2], as discussed in the International Institute of Forecasters blog. This Comment, drawn from our experiences with real-time prospective COVID-19 modelling, details these obstacles. We aim to highlight areas where further research and investment can improve the use of models for informing outbreak responses in the USA, with a summary of recommendations in the Panel.

Comparing trained and untrained probabilistic ensemble forecasts of COVID-19 cases and deaths in the United States

Ray EL, Brooks LC, Bien J, Biggerstaff M, Bosse NI, Bracher J, Cramer EY, Funk S, Gerding A, Johansson MA, Rumack A, Wang Y, Zorn M, Tibshirani RJ, Reich NG (2022). International Journal of Forecasting, 39: 1366-1383.

forecasting covid-19

Abstract

The U.S. COVID-19 Forecast Hub aggregates forecasts of the short-term burden of COVID-19 in the United States from many contributing teams. We study methods for building an ensemble that combines forecasts from these teams. These experiments have informed the ensemble methods used by the Hub. To be most useful to policy makers, ensemble forecasts must have stable performance in the presence of two key characteristics of the component forecasts: (1) occasional misalignment with the reported data, and (2) instability in the relative performance of component forecasters over time. Our results indicate that in the presence of these challenges, an untrained and robust approach to ensembling using an equally weighted median of all component forecasts is a good choice to support public health decision makers. In settings where some contributing forecasters have a stable record of good performance, trained ensembles that give those forecasters higher weight can also be helpful.
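
The robustness argument for an equally weighted median can be seen in a small Python sketch. It assumes each component forecast is a list of predictive quantiles at the same probability levels and combines them level by level. The helper name and all numbers are hypothetical and are not drawn from the Forecast Hub data or code.

```python
from statistics import mean, median


def combine_quantiles(component_quantiles, combiner):
    """Combine forecasts level by level.

    component_quantiles holds one list per model, all at the same
    quantile levels; combiner is a function such as mean or median.
    """
    return [combiner(values) for values in zip(*component_quantiles)]


# Hypothetical quantiles at levels 0.25, 0.5, 0.75 from three models.
# The third model is badly misaligned with the other two.
components = [
    [90.0, 100.0, 110.0],
    [95.0, 105.0, 115.0],
    [900.0, 1000.0, 1100.0],
]

print(combine_quantiles(components, median))  # follows the two aligned models
print(combine_quantiles(components, mean))    # pulled far upward by the outlier
```

A single misaligned component shifts the mean ensemble substantially but leaves the median ensemble essentially unchanged, which is the behavior the abstract describes as desirable when component performance is unstable.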

The United States COVID-19 Forecast Hub dataset

Cramer EY, Huang Y, Wang Y, Ray EL, Cornell M, Bracher J, Brennen A, Castro Rivadeneira AJ, Gerding A, House K, Jayawardena D, Kanji AH, Khandelwal A, Le K, Niemi J, Stark A, Shah A, Wattanachit N, Zorn MW, Reich NG (2022). Scientific Data, 9(1): 1-15.

covid-19 forecasting software

Abstract

Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident hospitalizations, incident cases, incident deaths, and cumulative deaths due to COVID-19 at national, state, and county levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.

Collaborative Hubs: Making the Most of Predictive Epidemic Modeling

Reich NG, Lessler J, Funk S, Viboud C, Vespignani A, Tibshirani RJ, Shea K, Schienle M, Runge MC, Rosenfeld R, Ray EL, Niehus R, Johnson HC, Johansson MA, Hochheiser H, Gardner L, Bracher J, Borchering RK, Biggerstaff M (2022). AJPH, 112(6): 839-842.

forecasting covid-19

Abstract

The COVID-19 pandemic has made it clear that epidemic models play an important role in how governments and the public respond to infectious disease crises. Early in the pandemic, models were used to estimate the true number of infections. Later, they estimated key parameters, generated short-term forecasts of outbreak trends, and quantified possible effects of interventions on the unfolding epidemic. In contrast to the coordinating role played by major national or international agencies in weather related emergencies, pandemic modeling efforts were initially scattered across many research institutions. Differences in modeling approaches led to contrasting results, contributing to confusion in public perception of the pandemic. Efforts to coordinate modeling efforts in so-called “hubs” have provided governments, healthcare agencies, and the public with assessments and forecasts that reflect the consensus in the modeling community. This has been achieved by openly synthesizing uncertainties across different modeling approaches and facilitating comparisons between them.

Evaluation of individual and ensemble probabilistic forecasts of COVID-19 mortality in the United States

Cramer EY, Ray EL, Lopez VK, Bracher J, ... Slayton RB, Johansson M, Biggerstaff M, Reich NG (2022). PNAS, 119(15): e2113561119.

forecasting covid-19

Abstract

Short-term probabilistic forecasts of the trajectory of the COVID-19 pandemic in the United States have served as a visible and important communication channel between the scientific modeling community and both the general public and decision-makers. Forecasting models provide specific, quantitative, and evaluable predictions that inform short-term decisions such as healthcare staffing needs, school closures, and allocation of medical supplies. Starting in April 2020, the US COVID-19 Forecast Hub (https://covid19forecasthub.org/) collected, disseminated, and synthesized tens of millions of specific predictions from more than 90 different academic, industry, and independent research groups. A multimodel ensemble forecast that combined predictions from dozens of groups every week provided the most consistently accurate probabilistic forecasts of incident deaths due to COVID-19 at the state and national level from April 2020 through October 2021. The performance of 27 individual models that submitted complete forecasts of COVID-19 deaths consistently throughout this year showed high variability in forecast skill across time, geospatial units, and forecast horizons. Two-thirds of the models evaluated showed better accuracy than a naïve baseline model. Forecast accuracy degraded as models made predictions further into the future, with probabilistic error at a 20-wk horizon three to five times larger than when predicting at a 1-wk horizon. This project underscores the role that collaboration and active coordination between governmental public-health agencies, academic modeling teams, and industry partners can play in developing modern modeling capabilities to support local, state, and federal response to outbreaks.

Collaborative modeling key to improving outbreak response

Reich NG, Ray EL (2022). PNAS, 119(14): e2200703119.

forecasting covid-19

Abstract

During the COVID-19 pandemic, modeling and forecasting have informed public health response at the local, state, and national levels by improving situational awareness, providing estimates of key virus characteristics, and optimizing mitigation strategies (1). While forecasting efforts often have been the most visible modeling outputs to the general public, as predictions are often highlighted by the media, other modeling has played an important role in the pandemic as well. In PNAS, Fox et al. detail an important and influential collaborative modeling effort that has supported real-time public health decision-making in Austin, TX, during the COVID-19 pandemic (2). The effort described by Fox et al. is notable both for its careful and accurate modeling as well as the in-depth collaboration, clearly built on a relationship of trust, with Austin city officials. While this effort is exemplary and hopefully will serve as a model for future similar collaborative work, the paper also raises important questions about how this kind of effort can be scaled. Ideally many municipalities, including ones that are not fortunate enough to have a terrific academic modeling group in or near their city, could take advantage of the insights that models have to offer. Can state and national public health agencies support scalable modeling efforts so that every local and state government can take advantage of a wide range of insights from robust modeling efforts? Furthermore, in doing so, can we reduce the dependency of such an undertaking on one single modeling group, by relying on the successful use of collaborative modeling “hubs” that have sprouted up before and during the pandemic (3–10) and/or by supporting the development of modeling capacity within public health agencies?

2021

Recommended reporting items for epidemic forecasting and prediction research: The EPIFORGE 2020 guidelines

Pollett S, Johansson MA, Reich NG, ... Viboud C, Brady O, Rivers C (2021). PLOS Medicine, 18(10): e1003793.

forecasting

Abstract

Background: The importance of infectious disease epidemic forecasting and prediction research is underscored by decades of communicable disease outbreaks, including COVID-19. Unlike other fields of medical research, such as clinical trials and systematic reviews, no reporting guidelines exist for reporting epidemic forecasting and prediction research despite their utility. We therefore developed the EPIFORGE checklist, a guideline for standardized reporting of epidemic forecasting research. Methods and findings: We developed this checklist using a best-practice process for development of reporting guidelines, involving a Delphi process and broad consultation with an international panel of infectious disease modelers and model end users. The objectives of these guidelines are to improve the consistency, reproducibility, comparability, and quality of epidemic forecasting reporting. The guidelines are not designed to advise scientists on how to perform epidemic forecasting and prediction research, but rather to serve as a standard for reporting critical methodological details of such studies. Conclusions: These guidelines have been submitted to the EQUATOR network, in addition to hosting by other dedicated webpages to facilitate feedback and journal endorsement.

Modeling of Future COVID-19 Cases, Hospitalizations, and Deaths, by Vaccination Rates and Nonpharmaceutical Intervention Scenarios — United States, April–September 2021

Borchering RK, Viboud C, Howerton E, Smith CP, Truelove S, Runge MC, Reich NG, ... Shea K, Lessler J (2021). Morbidity and Mortality Weekly Report (MMWR), 70(19): 719–724.

forecasting covid-19

Abstract

What is already known about this topic? Increases in COVID-19 cases in March and early April occurred despite a large-scale vaccination program. Increases coincided with the spread of SARS-CoV-2 variants and relaxation of nonpharmaceutical interventions (NPIs). What is added by this report? Data from six models indicate that with high vaccination coverage and moderate NPI adherence, hospitalizations and deaths will likely remain low nationally, with a sharp decline in cases projected by July 2021. Lower NPI adherence could lead to substantial increases in severe COVID-19 outcomes, even with improved vaccination coverage. What are the implications for public health practice? High vaccination coverage and compliance with NPIs are essential to control COVID-19 and prevent surges in hospitalizations and deaths in the coming months.

The Zoltar forecast archive: a tool to facilitate standardization and storage of interdisciplinary prediction research

Reich NG, Cornell M, Ray EL, House K, Le K (2021). Scientific Data, 8(59).

forecasting software

Abstract

Forecasting has emerged as an important component of informed, data-driven decision-making in a wide array of fields. We introduce a new data model for probabilistic predictions that encompasses a wide range of forecasting settings. This framework clearly defines the constituent parts of a probabilistic forecast and proposes one approach for representing these data elements. The data model is implemented in Zoltar, a new software application that stores forecasts using the data model and provides standardized API access to the data. In one real-time case study, an instance of the Zoltar web application was used to store, provide access to, and evaluate real-time forecast data on the order of 10^7 rows, provided by over 20 international research teams from academia and industry making forecasts of the COVID-19 outbreak in the US. Tools and data infrastructure for probabilistic forecasts, such as those introduced here, will play an increasingly important role in ensuring that future forecasting research adheres to a strict set of rigorous and reproducible standards.

Improving Probabilistic Infectious Disease Forecasting Through Coherence

Gibson GC, Moran K, Reich NG, Osthus D (2021). PLOS Comp Bio.

forecasting influenza flusight

Abstract

With an estimated $10.4 billion in medical costs and 31.4 million outpatient visits each year, influenza poses a serious burden of disease in the United States. To provide insights and advance warning into the spread of influenza, the U.S. Centers for Disease Control and Prevention (CDC) runs a challenge for forecasting weighted influenza-like illness (wILI) at the national and regional level. Many models produce independent forecasts for each geographical unit, ignoring the constraint that the national wILI is a weighted sum of regional wILI, where the weights correspond to the population size of the region. We propose a novel algorithm that transforms a set of independent forecast distributions to obey this constraint, which we refer to as probabilistically coherent. Enforcing probabilistic coherence led to an increase in forecast skill for 90% of the models we tested over multiple flu seasons, highlighting the importance of respecting the forecasting system's geographical hierarchy.

Evaluating epidemic forecasts in an interval format

Bracher J, Ray EL, Gneiting T, Reich NG (2021). PLOS Comp Bio.

forecasting covid-19

Abstract

For practical reasons, many forecasts of case, hospitalization and death counts in the context of the current COVID-19 pandemic are issued in the form of central predictive intervals at various levels. This is also the case for the forecasts collected in the COVID-19 Forecast Hub run by the UMass-Amherst Influenza Forecasting Center of Excellence. Forecast evaluation metrics like the logarithmic score, which has been applied in several infectious disease forecasting challenges, are then not available as they require full predictive distributions. This note provides an overview of how established methods for the evaluation of quantile and interval forecasts can be applied to epidemic forecasts. Specifically, we discuss the computation and interpretation of the weighted interval score, which is a proper score that approximates the continuous ranked probability score. It can be interpreted as a generalization of the absolute error to probabilistic forecasts and allows for a simple decomposition into a measure of sharpness and penalties for over- and underprediction.
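
The weighted interval score discussed in this abstract can be sketched in a few lines. The version below is a minimal illustration, assuming the standard definition in which each central (1 - alpha) interval is weighted by alpha/2 and the absolute error of the median is weighted by 1/2. The function names, the dictionary layout, and the example numbers are hypothetical.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval [lower, upper]."""
    sharpness = upper - lower
    # Observation below the interval: the forecast was too high.
    overprediction = (2 / alpha) * max(lower - y, 0)
    # Observation above the interval: the forecast was too low.
    underprediction = (2 / alpha) * max(y - upper, 0)
    return sharpness + overprediction + underprediction


def weighted_interval_score(median, intervals, y):
    """WIS for a predictive median plus K central prediction intervals.

    intervals: dict mapping alpha -> (lower, upper), an assumed layout.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


# Hypothetical forecast: median 100, 50% interval (90, 110), 95% interval (70, 140).
score = weighted_interval_score(100, {0.5: (90, 110), 0.05: (70, 140)}, y=120)
print(round(score, 3))
```

With no intervals supplied, the function reduces to the absolute error of the median, which matches the interpretation of the score as a generalization of the absolute error to probabilistic forecasts.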

Adaptively stacking ensembles for influenza forecasting

McAndrew T, Reich NG (2021). Statistics in Medicine, 40(30): 6931-6952.

forecasting influenza flusight

Abstract

Seasonal influenza infects between 10 and 50 million people in the United States every year. Accurate forecasts of influenza and influenza-like illness (ILI) have been named by the CDC as an important tool to fight the damaging effects of these epidemics. Multi-model ensembles make accurate forecasts of seasonal influenza, but current operational ensemble forecasts are static: they require an abundance of past ILI data and assign fixed weights to component models at the beginning of a season, but do not update weights as new data on component model performance is collected. We propose an adaptive ensemble that (i) does not initially need data to combine forecasts and (ii) finds optimal weights which are updated week-by-week throughout the influenza season. We take a regularized likelihood approach and investigate this regularizer's ability to impact adaptive ensemble performance. After finding an optimal regularization value, we compare our adaptive ensemble to an equal-weighted and static ensemble. Applied to forecasts of short-term ILI incidence at the regional and national level, our adaptive model outperforms an equal-weighted ensemble and has similar performance to the static ensemble using only a fraction of the data available to the static ensemble. Needing no data at the beginning of an epidemic, an adaptive ensemble can quickly train and forecast an outbreak, providing a practical tool to public health officials looking for a forecast to conform to unique features of a specific season.

Aggregating predictions from experts: A review of statistical methods, experiments, and applications

McAndrew T, Wattanachit N, Gibson GC, Reich NG (2021). Wiley Interdisciplinary Reviews: Computational Statistics, 13(2): e1514.

forecasting

Abstract

Forecasts support decision making in a variety of applications. Statistical models can produce accurate forecasts given abundant training data, but when data is sparse or rapidly changing, statistical models may not be able to make accurate predictions. Expert judgmental forecasts (models that combine expert-generated predictions into a single forecast) can make predictions when training data is limited by relying on human intuition. Researchers have proposed a wide array of algorithms to combine expert predictions into a single forecast, but there is no consensus on an optimal aggregation model. This review surveyed recent literature on aggregating expert-elicited predictions. We gathered common terminology, aggregation methods, and forecasting performance metrics, and offer guidance to strengthen future work that is growing at an accelerated pace.

Serological surveys to estimate cumulative incidence of SARS-CoV-2 infection in adults (Sero-MAss study), Massachusetts, July–August 2020: a mail-based cross-sectional study

Snyder T, Ravenhurst J, Cramer EY, Reich NG, Balzer L, Alfandari D, Lover AA (2021). BMJ Open, 11:e051157: 1-10.

forecasting covid-19

Abstract

Objectives To estimate the seroprevalence of anti-SARS-CoV-2 IgG and IgM among Massachusetts residents and to better understand asymptomatic SARS-CoV-2 transmission during the summer of 2020. Design Mail-based cross-sectional survey. Setting Massachusetts, USA. Participants Primary sampling group: sample of undergraduate students at the University of Massachusetts, Amherst (n=548) and a member of their household (n=231). Secondary sampling group: sample of graduate students, faculty, librarians and staff (n=214) and one member of their household (n=78). All participants were residents of Massachusetts without prior COVID-19 diagnosis. Primary and secondary outcome measures Prevalence of SARS-CoV-2 seropositivity. Association of seroprevalence with variables including age, gender, race, geographic region, occupation and symptoms. Results Approximately 27 000 persons were invited via email to assess eligibility. 1001 households were mailed dried blood spot sample kits, 762 returned blood samples for analysis. In the primary sample group, 36 individuals (4.6%) had IgG antibodies detected for an estimated weighted prevalence in this population of 5.3% (95% CI: 3.5 to 8.0). In the secondary sampling group, 10 participants (3.4%) had IgG antibodies detected for an estimated adjusted prevalence of 4.0% (95% CI: 2.2 to 7.4). No samples were IgM positive. No association was found in either group between seropositivity and self-reported work duties or customer-facing hours. In the primary sampling group, self-reported febrile illness since February 2020, male sex and minority race (Black or American Indian/Alaskan Native) were associated with seropositivity. No factors except geographic regions within the state were associated with evidence of prior SARS-CoV-2 infection in the secondary sampling group. Conclusions This study fills a critical gap in estimating the levels of subclinical and asymptomatic infection. Estimates can be used to calibrate models estimating levels of population immunity over time, and these data are critical for informing public health interventions and policy.

2020

Identification and evaluation of epidemic prediction and forecasting reporting guidelines: A systematic review and a call for action

Pollett S, Johansson M, Biggerstaff M, Morton LC, Bazaco SL, Major DMB, Ibarra AMS, Pavlin JA, Mate S, Sippy R, Hartman LJ, Reich NG, Maljkovic I, Chretien BJP, Althouse BM, Myer D, Viboud C, Rivers C (2020). Epidemics, 33: 100-400.

forecasting

Abstract

Introduction High quality epidemic forecasting and prediction are critical to support response to local, regional and global infectious disease threats. Other fields of biomedical research use consensus reporting guidelines to ensure standardization and quality of research practice among researchers, and to provide a framework for end-users to interpret the validity of study results. The purpose of this study was to determine whether guidelines exist specifically for epidemic forecast and prediction publications. Methods We undertook a formal systematic review to identify and evaluate any published infectious disease epidemic forecasting and prediction reporting guidelines. This review leveraged a team of 18 investigators from US Government and academic sectors. Results A literature database search through May 26, 2019, identified 1467 publications (MEDLINE n = 584, EMBASE n = 883), and a grey-literature review identified a further 407 publications, yielding a total of 1777 unique publications. A paired-reviewer system screened in 25 potentially eligible publications, of which two were ultimately deemed eligible. A qualitative review of these two published reporting guidelines indicated that neither was specific to epidemic forecasting and prediction, although they described reporting items which may be relevant to epidemic forecasting and prediction studies. Conclusions This systematic review confirms that no specific guidelines have been published to standardize the reporting of epidemic forecasting and prediction studies. These findings underscore the need to develop such reporting guidelines in order to improve the transparency, quality and implementation of epidemic forecasting and prediction research in operational public health.

Seasonal patterns of dengue incidence in Thailand across the urban-rural gradient

Bi Q, Cummings DAT, Reich NG, Keegan LT, Kaminsky J, Salje H, Clapham H, Doung-ngern P, Iamsirithaworn S, Lessler J (2020). medRxiv.

dengue

Abstract

In Southeast Asia, endemic dengue follows strong spatio-temporal patterns with major epidemics occurring every 2-5 years. However, important spatio-temporal variation in seasonal dengue epidemics remains poorly understood. Using 13 years (2003-2015) of dengue surveillance data from 926 districts in Thailand and wavelet analysis, we show that rural epidemics lead urban epidemics within a dengue season, both nationally and within health regions. However, local dengue fade-outs are more likely in rural areas than in urban areas during the off season, suggesting rural areas are not the source of viral dispersion. Simple dynamic models show that stronger seasonal forcing in rural areas could explain the inconsistency between earlier rural epidemics and dengue “over wintering” in urban areas. These results add important nuance to earlier work showing the importance of urban areas in driving multi-annual patterns of dengue incidence in Thailand. Feedback between geographically linked locations with markedly different ecology is key to explaining full disease dynamics across the urban-rural gradient.

Ensemble Forecasts of Coronavirus Disease 2019 (COVID-19) in the U.S.

Ray EL, Wattanachit N, Niemi J, Kanji AH, House K, Cramer EY, ... Reich NG on behalf of the COVID-19 Forecast Hub Consortium (2020). medRxiv.

covid-19 forecasting

Abstract

Background: The COVID-19 pandemic has driven demand for forecasts to guide policy and planning. Previous research has suggested that combining forecasts from multiple models into a single “ensemble” forecast can increase the robustness of forecasts. Here we evaluate the real-time application of an open, collaborative ensemble to forecast deaths attributable to COVID-19 in the U.S. Methods: Beginning on April 13, 2020, we collected and combined one- to four-week ahead forecasts of cumulative deaths for U.S. jurisdictions in standardized, probabilistic formats to generate real-time, publicly available ensemble forecasts. We evaluated the point prediction accuracy and calibration of these forecasts compared to reported deaths. Results: Analysis of 2,512 ensemble forecasts made April 27 to July 20 with outcomes observed in the weeks ending May 23 through July 25, 2020 revealed precise short-term forecasts, with accuracy deteriorating at longer prediction horizons of up to four weeks. At all prediction horizons, the prediction intervals were well calibrated with 92-96% of observations falling within the rounded 95% prediction intervals. Conclusions: This analysis demonstrates that real-time, publicly available ensemble forecasts issued in April-July 2020 provided robust short-term predictions of reported COVID-19 deaths in the United States. With the ongoing need for forecasts of impacts and resource needs for the COVID-19 response, the results underscore the importance of combining multiple probabilistic models and assessing forecast skill at different prediction horizons. Careful development, assessment, and communication of ensemble forecasts can provide reliable insight to public health decision makers.

Estimation of Excess Deaths Associated With the COVID-19 Pandemic in the United States, March to May 2020

Weinberger D, Cohen T, Crawford F, Mostashari F, Olson D, Pitzer VE, Reich NG, Russi M, Simonsen L, Watkins A, Viboud C (2020). JAMA Internal Medicine.

covid-19

Abstract

Importance: Efforts to track the severity and public health impact of coronavirus disease 2019 (COVID-19) in the United States have been hampered by state-level differences in diagnostic test availability, differing strategies for prioritization of individuals for testing, and delays between testing and reporting. Evaluating unexplained increases in deaths due to all causes or attributed to nonspecific outcomes, such as pneumonia and influenza, can provide a more complete picture of the burden of COVID-19. Objective: To estimate the burden of all deaths related to COVID-19 in the United States from March to May 2020. Design, Setting, and Population: This observational study evaluated the numbers of US deaths from any cause and deaths from pneumonia, influenza, and/or COVID-19 from March 1 through May 30, 2020, using public data of the entire US population from the National Center for Health Statistics (NCHS). These numbers were compared with those from the same period of previous years. All data analyzed were accessed on June 12, 2020. Main Outcomes and Measures: Increases in weekly deaths due to any cause or deaths due to pneumonia/influenza/COVID-19 above a baseline, which was adjusted for time of year, influenza activity, and reporting delays. These estimates were compared with reported deaths attributed to COVID-19 and with testing data. Results: There were approximately 781 000 total deaths in the United States from March 1 to May 30, 2020, representing 122 300 (95% prediction interval, 116 800-127 000) more deaths than would typically be expected at that time of year. There were 95 235 reported deaths officially attributed to COVID-19 from March 1 to May 30, 2020. The number of excess all-cause deaths was 28% higher than the official tally of COVID-19–reported deaths during that period. In several states, these deaths occurred before increases in the availability of COVID-19 diagnostic tests and were not counted in official COVID-19 death records. There was substantial variability between states in the difference between official COVID-19 deaths and the estimated burden of excess deaths. Conclusions and Relevance: Excess deaths provide an estimate of the full COVID-19 burden and indicate that official tallies likely undercount deaths due to the virus. The mortality burden and the completeness of the tallies vary markedly between states.

Infectious Disease Forecasting for Public Health

Lauer SA, Brown AC, Reich NG (2020). arXiv.

forecasting

Abstract

Forecasting transmission of infectious diseases, especially for vector-borne diseases, poses unique challenges for researchers. Behaviors of and interactions between viruses, vectors, hosts, and the environment each play a part in determining the transmission of a disease. Public health surveillance systems and other sources provide valuable data that can be used to accurately forecast disease incidence. However, many aspects of common infectious disease surveillance data are imperfect: cases may be reported with a delay or in some cases not at all, data on vectors may not be available, and case data may not be available at high geographical or temporal resolution. In the face of these challenges, researchers must make assumptions to either account for these underlying processes in a mechanistic model or to justify their exclusion altogether in a statistical model. Whether a model is mechanistic or statistical, researchers should evaluate their model using accepted best practices from the emerging field of infectious disease forecasting while adopting conventions from other fields that have been developing forecasting methods for decades. Accounting for assumptions and properly evaluating models will allow researchers to generate forecasts that have the potential to provide valuable insights for public health officials. This chapter provides a background to the practice of forecasting in general, discusses the biological and statistical models used for infectious disease forecasting, presents technical details about making and evaluating forecasting models, and explores the issues in communicating forecasting results in a public health context.

The incubation period of 2019-nCoV from publicly reported confirmed cases: estimation and application

Lauer SA, Grantz KH, Bi Q, Jones FK, Zheng Q, Meredith H, Azman AS, Reich NG, Lessler J (2020). Annals of Internal Medicine.

incubation-period covid-19

Abstract

Background: A novel human coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was identified in China in December 2019. There is limited support for many of its key epidemiologic features, including the incubation period for clinical disease (coronavirus disease 2019 [COVID-19]), which has important implications for surveillance and control activities. Objective: To estimate the length of the incubation period of COVID-19 and describe its public health implications. Design: Pooled analysis of confirmed COVID-19 cases reported between 4 January 2020 and 24 February 2020. Setting: News reports and press releases from 50 provinces, regions, and countries outside Wuhan, Hubei province, China. Participants: Persons with confirmed SARS-CoV-2 infection outside Hubei province, China. Measurements: Patient demographic characteristics and dates and times of possible exposure, symptom onset, fever onset, and hospitalization. Results: There were 181 confirmed cases with identifiable exposure and symptom onset windows to estimate the incubation period of COVID-19. The median incubation period was estimated to be 5.1 days (95% CI, 4.5 to 5.8 days), and 97.5% of those who develop symptoms will do so within 11.5 days (CI, 8.2 to 15.6 days) of infection. These estimates imply that, under conservative assumptions, 101 out of every 10 000 cases (99th percentile, 482) will develop symptoms after 14 days of active monitoring or quarantine. Limitation: Publicly reported cases may overrepresent severe cases, the incubation period for which may differ from that of mild cases. Conclusion: This work provides additional evidence for a median incubation period for COVID-19 of approximately 5 days, similar to SARS. Our results support current proposals for the length of quarantine or active monitoring of persons potentially exposed to SARS-CoV-2, although longer monitoring periods might be justified in extreme cases. Primary Funding Source: U.S. Centers for Disease Control and Prevention, National Institute of Allergy and Infectious Diseases, National Institute of General Medical Sciences, and Alexander von Humboldt Foundation.

Coordinating the real‐time use of global influenza activity data for better public health planning

Biggerstaff M et al. (2020). Influenza and Other Respiratory Viruses, 14(2): 105-110.

influenza flusight

Abstract

Health planners from global to local levels must anticipate year‐to‐year and week‐to‐week variation in seasonal influenza activity when planning for and responding to epidemics to mitigate their impact. To help with this, countries routinely collect incidence of mild and severe respiratory illness and virologic data on circulating subtypes and use these data for situational awareness, burden of disease estimates and severity assessments. Advanced analytics and modelling are increasingly used to aid planning and response activities by describing key features of influenza activity for a given location and generating forecasts that can be translated to useful actions such as enhanced risk communications, and informing clinical supply chains. Here, we describe the formation of the Influenza Incidence Analytics Group (IIAG), a coordinated global effort to apply advanced analytics and modelling to public influenza data, both epidemiological and virologic, in real‐time and thus provide additional insights to countries who provide routine surveillance data to WHO. Our objectives are to systematically increase the value of data to health planners by applying advanced analytics and forecasting and for results to be immediately reproducible and deployable using an open repository of data and code. We expect the resources we develop and the associated community to provide an attractive option for the open analysis of key epidemiological data during seasonal epidemics and the early stages of an influenza pandemic.

FluSense: A Contactless Syndromic Surveillance Platform for Influenza-Like Illness in Hospital Waiting Areas

Hossain FA, Lover AA, Corey GA, Reich NG, Rahman T (2020). Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, 4(1): 1.

influenza software

Abstract

We developed a contactless syndromic surveillance platform FluSense that aims to expand the current paradigm of influenza-like illness (ILI) surveillance by capturing crowd-level bio-clinical signals directly related to physical symptoms of ILI from hospital waiting areas in an unobtrusive and privacy-sensitive manner. FluSense consists of a novel edge-computing sensor system, models and data processing pipelines to track crowd behaviors and influenza-related indicators, such as coughs, and to predict daily ILI and laboratory-confirmed influenza caseloads. FluSense uses a microphone array and a thermal camera along with a neural computing engine to passively and continuously characterize speech and cough sounds along with changes in crowd density on the edge in a real-time manner. We conducted an IRB-approved 7 month-long study from December 10, 2018 to July 12, 2019 where we deployed FluSense in four public waiting areas within the hospital of a large university. During this period, the FluSense platform collected and analyzed more than 350,000 waiting room thermal images and 21 million non-speech audio samples from the hospital waiting areas. FluSense can accurately predict daily patient counts with a Pearson correlation coefficient of 0.95. We also compared signals from FluSense with the gold standard laboratory-confirmed influenza case data obtained in the same facility and found that our sensor-based features are strongly correlated with laboratory-confirmed influenza trends.

Evaluating the ALERT algorithm for local outbreak onset detection in seasonal infectious disease surveillance data

Brown AC, Lauer SA, Robinson C, Nyquist C, Rao S, Reich NG (2020). Statistics in Medicine.

forecasting influenza

Abstract

Estimation of epidemic onset timing is an important component of controlling the spread of seasonal infectious diseases within community healthcare sites. The Above Local Elevated Respiratory Illness Threshold (ALERT) algorithm uses a threshold-based approach to suggest incidence levels that historically have indicated the transition from endemic to epidemic activity. In this paper, we present the first detailed overview of the computational approach underlying the algorithm. In the motivating example section, we evaluate the performance of ALERT in determining the onset of increased respiratory virus incidence using laboratory testing data from the Children’s Hospital of Colorado. At a threshold of 10 cases per week, ALERT-selected intervention periods performed better than the observed hospital site periods (2004/2005-2012/2013) and a CUSUM method. Additional simulation studies show how data properties may affect ALERT performance on novel data. We found that the conditions under which ALERT showed ideal performance generally included high seasonality and low off-season incidence.

2019

Accuracy of real-time multi-model ensemble forecasts for seasonal influenza in the U.S.

Reich NG, McGowan CJ, Yamana TK, Tushar A, Ray E, Osthus D, Kandula S, Brooks LC, Crawford-Crudell W, Gibson GC, Moore E, Silva R, Biggerstaff M, Johansson M, Rosenfeld R, Shaman J (2019). PLOS Comp Bio, 15(11): e1007486.

forecasting influenza flusight

Abstract

Seasonal influenza results in substantial annual morbidity and mortality in the United States and worldwide. Accurate forecasts of key features of influenza epidemics, such as the timing and severity of the peak incidence in a given season, can inform public health response to outbreaks. As part of ongoing efforts to incorporate data and advanced analytical methods into public health decision-making, the United States Centers for Disease Control and Prevention (CDC) has organized seasonal influenza forecasting challenges since the 2013/2014 season. In the 2017/2018 season, 22 teams participated. A subset of four teams created a research consortium called the FluSight Network in early 2017. During the 2017/2018 season they worked together to produce a collaborative multi-model ensemble that combined 21 separate component models into a single model using a machine learning technique called stacking. This approach creates a weighted average of predictive densities where the weight for each component is based on that component's forecast accuracy in past seasons. In the 2017/2018 influenza season, one of the largest seasonal outbreaks in the last 15 years, this multi-model ensemble performed better on average than all individual component models and placed second overall in the CDC challenge. It also outperformed the baseline multi-model ensemble created by the CDC that took a simple average of all models submitted to the forecasting challenge. This project shows that collaborative efforts between research teams to develop ensemble forecasting approaches can bring measurable improvements in forecast accuracy and important reductions in the variability of performance from year to year. 
Efforts such as this, that emphasize real-time testing and evaluation of forecasting models and facilitate the close collaboration between public health officials and modeling researchers, are essential to improving our understanding of how best to use forecasts to improve public health response to seasonal and emerging epidemic threats.

An open challenge to advance probabilistic forecasting for dengue epidemics

Johansson MA, ... Reich NG, Cummings DAT, Lauer SA, ... Brown AC, ... Chretien JP (2019). PNAS, 116(48): 24268-24274.

forecasting dengue flusight

Abstract

A wide range of research has promised new tools for forecasting infectious disease dynamics, but little of that research is currently being applied in practice, because tools do not address key public health needs, do not produce probabilistic forecasts, have not been evaluated on external data, or do not provide sufficient forecast skill to be useful. We developed an open collaborative forecasting challenge to assess probabilistic forecasts for seasonal epidemics of dengue, a major global public health problem. Sixteen teams used a variety of methods and data to generate forecasts for 3 epidemiological targets (peak incidence, the week of the peak, and total incidence) over 8 dengue seasons in Iquitos, Peru and San Juan, Puerto Rico. Forecast skill was highly variable across teams and targets. While numerous forecasts showed high skill for midseason situational awareness, early season skill was low, and skill was generally lowest for high incidence seasons, those for which forecasts would be most valuable. A comparison of modeling approaches revealed that average forecast skill was lower for models including biologically meaningful data and mechanisms and that both multimodel and multiteam ensemble forecasts consistently outperformed individual model forecasts. Leveraging these insights, data, and the forecasting framework will be critical to improve forecast skill and the application of forecasts in real time for epidemic preparedness and response. Moreover, key components of this project—integration with public health needs, a common forecasting framework, shared and standardized data, and open participation—can help advance infectious disease forecasting beyond dengue.

The covariate-adjusted residual estimator and its use in both randomized trials and observational settings

Lauer SA, Reich NG, Balzer LB (2019). arXiv.

cluster-randomization

Abstract

We often seek to estimate the causal effect of an exposure on a particular outcome in both randomized and observational settings. One such estimation method is the covariate-adjusted residuals estimator, which was designed for individually or cluster randomized trials. In this manuscript, we study the properties of this estimator and develop a new estimator that utilizes both covariate adjustment and inverse probability weighting. We support our theoretical results with a simulation study and an application in an infectious disease setting. The covariate-adjusted residuals estimator is an efficient and unbiased estimator of the average treatment effect in randomized trials; however, it is not guaranteed to be unbiased in observational studies. Our novel estimator, the covariate-adjusted residuals estimator with inverse probability weighting, is unbiased in randomized and observational settings, under a reasonable set of assumptions. Furthermore, when these assumptions hold, it provides efficiency gains over inverse probability weighting in observational studies. The covariate-adjusted residuals estimator is valid for use in randomized trials, but should not be used in observational studies. The covariate-adjusted residuals estimator with inverse probability weighting provides an efficient alternative for use in randomized and observational settings.

Opioids in the USA: Disparities in addiction and incarceration

Kazemi A, Kennedy C, Silverman G, Reich NG (2019). Significance, 16(5): 6-7.

opioid

Abstract

Arianna Kazemi, Connor Kennedy and Gabri Silverman, undergraduate winners of the ASA Public Health Data Challenge, and their advisor Nicholas G. Reich, explore differences in the death, arrest and reoffending rates for opioid users in the USA.

Reply to Bracher: Scoring probabilistic forecasts to maximize public health interpretability

Reich NG, Osthus D, Ray EL, Yamana TK, Biggerstaff M, Johansson MA, Rosenfeld R, Shaman J (2019). PNAS.

forecasting influenza flusight

Abstract

Evaluating probabilistic forecasts in the context of a real-time public health surveillance system is a complicated business. We agree with Bracher’s (1) observations that the scores established by the US Centers for Disease Control and Prevention (CDC) and used to evaluate our forecasts of seasonal influenza in the United States are not “proper” by definition (2). We thank him for raising this important issue. A key advantage of proper scoring is that it incentivizes forecasters to provide their best probabilistic estimates of the fundamental unit of prediction. In the case of the FluSight competition targets, the units are intervals or bins containing dates or values representing influenza-like illness (ILI) activity. A forecast assigns probabilities to each bin. During the evolution of the FluSight challenge, the organizers at CDC made a conscious decision to use a “moving window” or “multibin” score that rewards forecasts for assigning substantial probability to values within a window of the eventually observed value. This decision was driven by the need to find a balance between 1) strictly proper scoring and high-resolution binning (e.g., at 0.1% increments for ILI values) and 2) the need for coarser categorizations for communication and decision-making purposes. Because final observations from a surveillance system are only estimates of an underlying “ground truth” measure of disease activity, a wider window for evaluating accuracy was considered. In the end, CDC elected to allow nearby “windows” of the truth to be considered accurate (e.g., within ±0.5% of the observed ILI value), understanding that there was a downside to not using a proper score. Given the increasing visibility and public availability of infectious disease forecasts, such as those from the FluSight challenge (3), forecasts are being used and interpreted for multiple purposes by more end users than when the challenge was originally conceived. 
Using a proper logarithmic score would require that forecasts be evaluated at a fixed resolution, e.g., for prespecified bins of 0.1% or 0.5%. Even if forecasts were optimized for and formally evaluated at one specific resolution, this use would not preclude the transformation of forecast outputs to a variety of resolutions appropriate for the specific decision or communication. Therefore, Bracher’s (1) letter raises an interesting and timely question about whether to institute a proper scoring rule for evaluating these public health forecasts. Regarding the impact of the impropriety of the score on the results in our original paper, we confirm that none of the forecasts presented in our original paper were manipulated in the way that Bracher shows is possible (4). Furthermore, evaluating forecasts by the proper logarithmic score metric does not substantially change the quality of the component models relative to each other (Fig. 1). Bracher’s (1) letter contributes to an existing and robust dialogue among quantitative modelers and public health decision makers about how to meaningfully evaluate probabilistic forecasts and support effective real-time decision making. We welcome this ongoing public discussion of both scientific and public policy considerations in the evaluation of forecasts.
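The difference between a single-bin logarithmic score and the "multibin" score described in this reply can be illustrated with a small sketch. It assumes a forecast is a list of probabilities over equally spaced bins and that the window is expressed as a number of bins on each side of the observed bin; the function names, the example probabilities, and the handling of zero probability (returning negative infinity) are illustrative assumptions, not the scoring code used by CDC.

```python
import math


def _safe_log(p):
    # Return -inf instead of raising when no probability was assigned.
    return math.log(p) if p > 0 else float("-inf")


def single_bin_log_score(probs, observed_bin):
    """Log of the probability placed on the bin containing the observed value."""
    return _safe_log(probs[observed_bin])


def multibin_log_score(probs, observed_bin, window):
    """Log of the total probability within `window` bins of the observed bin."""
    lo = max(0, observed_bin - window)
    hi = min(len(probs), observed_bin + window + 1)
    return _safe_log(sum(probs[lo:hi]))


# Hypothetical forecast over seven bins; the observation falls in bin 2.
forecast = [0.05, 0.10, 0.20, 0.30, 0.20, 0.10, 0.05]
single = single_bin_log_score(forecast, observed_bin=2)
multi = multibin_log_score(forecast, observed_bin=2, window=1)
```

With 0.1% ILI bins, the ±0.5% window mentioned above would correspond to `window=5` in this sketch. The multibin score is never lower than the single-bin score because it sums probability over a superset of bins.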

N95 Respirators vs Medical Masks for Preventing Influenza Among Health Care Personnel

Radonovich LJ, Simberkoff MS, Bessesen MT, Brown AC, Cummings DAT, Gaydos CA, Los JG, Krosche AE, Gibert CL, Gorse GJ, Nyquist AC, Reich NG, Rodriguez-Barradas MC, Price CS, Perl TM (2019). JAMA, 322(9): 824-833.

influenza cluster-randomization

Abstract

Clinical studies have been inconclusive about the effectiveness of N95 respirators and medical masks in preventing health care personnel (HCP) from acquiring workplace viral respiratory infections.

Technology to advance infectious disease forecasting for outbreak management

George GB, Taylor W, Shaman J, Rivers C, Paul B, O'Toole T, Johansson MA, Hirschman L, Biggerstaff M, Asher J, Reich NG (2019). Nature Communications, 10(3932).

forecasting flusight

Abstract

Forecasting is beginning to be integrated into decision-making processes for infectious disease outbreak response. We discuss how technologies could accelerate the adoption of forecasting among public health practitioners, improve epidemic management, save lives, and reduce the economic impact of outbreaks.

Using "outbreak science" to strengthen the use of models during epidemics

Rivers C, Chretien JP, Riley S, Pavlin J, Woodward A, Brett-Major D, Berry IB, Morton L, Jarman RG, Biggerstaff M, Johansson MA, Reich NG, Meyer D, Snyder MR, Pollett S (2019). Nature Communications, 10(3932).

forecasting

Abstract

Infectious disease modeling has played a prominent role in recent outbreaks, yet integrating these analyses into public health decision-making has been challenging. We recommend establishing ‘outbreak science’ as an inter-disciplinary field to improve applied epidemic modeling.

A collaborative multiyear, multimodel assessment of seasonal influenza forecasting in the United States

Reich NG, Brooks L, Spencer F, Kandula S, McGowan C, Moore E, Osthus D, Ray E, Tushar A, Yamana T, Biggerstaff M, Johansson MA, Rosenfeld R, and Shaman J (2019). PNAS, 116(8): 3146-3154.

forecasting influenza flusight

Abstract

Influenza infects an estimated 9 to 35 million individuals each year in the United States and is a contributing cause for between 12,000 and 56,000 deaths annually. Seasonal outbreaks of influenza are common in temperate regions of the world, with highest incidence typically occurring in colder and drier months of the year. Real-time forecasts of influenza transmission can inform public health response to outbreaks. We present the results of a multi-institution collaborative effort to standardize the collection and evaluation of forecasting models for influenza in the US for the 2010/2011 through 2016/2017 influenza seasons. For these seven seasons, we assembled weekly real-time forecasts of 7 targets of public health interest from 22 different models. We compared forecast accuracy of each model relative to a historical baseline seasonal average. Across all regions of the US, over half of the models showed consistently better performance than the historical baseline when forecasting incidence of influenza-like illness 1, 2 and 3 weeks ahead of available data and when forecasting the timing and magnitude of the seasonal peak. In some regions, delays in data reporting were strongly and negatively associated with forecast accuracy. More timely reporting and an improved overall accessibility to novel and traditional data sources are needed to improve forecasting accuracy and its integration with real-time public health decision-making.

Collaborative efforts to forecast seasonal influenza in the United States, 2015–2016

McGowan C, Biggerstaff M, Johansson M, Apfeldorf K, Ben-Nun M, Brooks L, Convertino M, Erraguntla M, Farrow D, Freeze J, Ghosh S, Hyun S, Kandula S, Lega J, Liu Y, Michaud N, Morita H, Niemi J, Ramakrishnan N, Ray EL, Reich NG, Riley P, Shaman J, Tibshirani R, Vespignani A, Zhang Q, Reed C (2019). Scientific Reports, 9(683).

forecasting influenza flusight

Abstract

Since 2013, the Centers for Disease Control and Prevention (CDC) has hosted an annual influenza season forecasting challenge. The 2015–2016 challenge consisted of weekly probabilistic forecasts of multiple targets, including fourteen models submitted by eleven teams. Forecast skill was evaluated using a modified logarithmic score. We averaged submitted forecasts into a mean ensemble model and compared them against predictions based on historical trends. Forecast skill was highest for seasonal peak intensity and short-term forecasts, while forecast skill for timing of season onset and peak week was generally low. Higher forecast skill was associated with team participation in previous influenza forecasting challenges and utilization of ensemble forecasting techniques. The mean ensemble consistently performed well and outperformed historical trend predictions. CDC and contributing teams will continue to advance influenza forecasting and work to improve the accuracy and reliability of forecasts to facilitate increased incorporation into public health response efforts.

2018

Prospective forecasts of annual dengue hemorrhagic fever incidence in Thailand, 2010–2014

Lauer SA, Sakrejda K, Ray EL, Keegan LT, Bi Q, Suangtho P, Hinjoy S, Iamsirithaworn S, Suthachana S, Laosiritaworn Y, Cummings DAT, Lessler J and Reich NG (2018). PNAS, 115(10): E2175-E2182.

forecasting dengue

Abstract

Dengue hemorrhagic fever (DHF), a severe manifestation of dengue viral infection that can cause severe bleeding, organ impairment, and even death, affects between 15,000 and 105,000 people each year in Thailand. While all Thai provinces experience at least one DHF case most years, the distribution of cases shifts regionally from year to year. Accurately forecasting where DHF outbreaks occur before the dengue season could help public health officials prioritize public health activities. We develop statistical models that use biologically plausible covariates, observed by April each year, to forecast the cumulative DHF incidence for the remainder of the year. We perform cross-validation during the training phase (2000–2009) to select the covariates for these models. A parsimonious model based on preseason incidence outperforms the 10-y median for 65% of province-level annual forecasts, reduces the mean absolute error by 19%, and successfully forecasts outbreaks (area under the receiver operating characteristic curve = 0.84) over the testing period (2010–2014). We find that functions of past incidence contribute most strongly to model performance, whereas the importance of environmental covariates varies regionally. This work illustrates that accurate forecasts of dengue risk are possible in a policy-relevant timeframe.

Prediction of infectious disease epidemics via weighted density ensembles

Ray EL, Reich NG (2018). PLOS Comp Bio, 14(2): e1005910.

forecasting influenza flusight

Abstract

Accurate and reliable predictions of infectious disease dynamics can be valuable to public health organizations that plan interventions to decrease or prevent disease transmission. A great variety of models have been developed for this task, using different model structures, covariates, and targets for prediction. Experience has shown that the performance of these models varies; some tend to do better or worse in different seasons or at different points within a season. Ensemble methods combine multiple models to obtain a single prediction that leverages the strengths of each model. We considered a range of ensemble methods that each form a predictive density for a target of interest as a weighted sum of the predictive densities from component models. In the simplest case, equal weight is assigned to each component model; in the most complex case, the weights vary with the region, prediction target, week of the season when the predictions are made, a measure of component model uncertainty, and recent observations of disease incidence. We applied these methods to predict measures of influenza season timing and severity in the United States, both at the national and regional levels, using three component models. We trained the models on retrospective predictions from 14 seasons (1997/1998 - 2010/2011) and evaluated each model's prospective, out-of-sample performance in the five subsequent influenza seasons. In this test phase, the ensemble methods showed overall performance that was similar to the best of the component models, but offered more consistent performance across seasons than the component models. Ensemble methods offer the potential to deliver more reliable predictions to public health decision makers.
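The core operation described in this abstract, forming an ensemble predictive density as a weighted sum of component densities, can be sketched as follows. The sketch assumes each component forecast is a discrete distribution over the same bins and that the weights are fixed, non-negative, and sum to one; the function name and the example numbers are hypothetical, and the more complex schemes in which weights vary with region, target, or week are not shown.

```python
def weighted_density_ensemble(component_densities, weights):
    """Combine discrete predictive densities as a weighted sum over components."""
    if len(component_densities) != len(weights):
        raise ValueError("need exactly one weight per component model")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n_bins = len(component_densities[0])
    if any(len(d) != n_bins for d in component_densities):
        raise ValueError("all components must use the same bins")
    return [
        sum(w * d[b] for w, d in zip(weights, component_densities))
        for b in range(n_bins)
    ]


# Three hypothetical component models forecasting over four bins.
components = [
    [0.10, 0.20, 0.40, 0.30],
    [0.25, 0.25, 0.25, 0.25],
    [0.00, 0.10, 0.60, 0.30],
]
equal_weight = weighted_density_ensemble(components, [1 / 3, 1 / 3, 1 / 3])
unequal_weight = weighted_density_ensemble(components, [0.5, 0.3, 0.2])
```

Because each component sums to one and the weights sum to one, the ensemble is itself a valid probability distribution over the bins.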

Preprints: An underutilized mechanism to accelerate outbreak science

Johansson MA, Reich NG, Meyers LA, Lipsitch M (2018). PLOS Medicine, 15(4): e1002549.

ebola zika

Abstract

Preprints—manuscripts posted openly online prior to peer review—offer an opportunity to accelerate the dissemination of scientific findings to support responses to infectious disease outbreaks. Preprints posted during the Ebola and Zika outbreaks included novel analyses and new data, and most of those that were matched to peer-reviewed publications were available more than 100 days before publication. Despite the advantages of preprints and the endorsement of journals and funders in the context of outbreaks, less than 5% of Ebola and Zika journal articles were posted as preprints prior to publication in journals. With broader adoption by scientists, journals, and funding agencies, preprints can complement peer-reviewed publication and ensure the early, open, and transparent dissemination of science relevant to the prevention and control of disease outbreaks.

Quantifying the Risk and Cost of Active Monitoring for Infectious Diseases

Reich NG, Lessler J, Varma JK, Vora NM (2018). Scientific Reports, 8: 1093.

incubation-period ebola

Abstract

During outbreaks of deadly emerging pathogens (e.g., Ebola, MERS-CoV) and bioterror threats (e.g., smallpox), actively monitoring potentially infected individuals aims to limit disease transmission and morbidity. Guidance issued by CDC on active monitoring was a cornerstone of its response to the West Africa Ebola outbreak. There are limited data on how to balance the costs and performance of this important public health activity. We present a framework that estimates the risks and costs of specific durations of active monitoring for pathogens of significant public health concern. We analyze data from New York City's Ebola active monitoring program over a 16-month period in 2014-2016. For monitored individuals, we identified unique durations of active monitoring that minimize expected costs for those at “low (but not zero) risk” and “some or high risk”: 21 and 31 days, respectively. Extending our analysis to smallpox and MERS-CoV, we found that the optimal length of active monitoring relative to the median incubation period was reduced compared to Ebola due to less variable incubation periods. Active monitoring can save lives but is expensive. Resources can be most effectively allocated by using exposure-risk categories to modify the duration or intensity of active monitoring.

Protecting Healthcare Personnel in Outpatient Settings: The Influence of Mandatory Versus Nonmandatory Influenza Vaccination Policies on Workplace Absenteeism During Multiple Respiratory Virus Seasons

Frederick J, Brown AC, Cummings DAT, Gaydos CA, Gibert CL, Gorse GJ, Los JG, Nyquist AC, Perl TM, Price CS, Radonovich LJ, Reich NG, Rodriguez-Barradas MC, Bessesen MT, Simberkoff MS, for the ResPECT Team (2018). Infect Control Hosp Epidemiol, 39.

influenza

Abstract

Objective: To determine the effect of mandatory and nonmandatory influenza vaccination policies on vaccination rates and symptomatic absenteeism among healthcare personnel (HCP). Design: Retrospective observational cohort study. Setting: This study took place at 3 university medical centers with mandatory influenza vaccination policies and 4 Veterans Affairs (VA) healthcare systems with nonmandatory influenza vaccination policies. Participants: The study included 2,304 outpatient HCP at mandatory vaccination sites and 1,759 outpatient HCP at nonmandatory vaccination sites. Methods: To determine the incidence and duration of absenteeism in outpatient settings, HCP participating in the Respiratory Protection Effectiveness Clinical Trial at both mandatory and nonmandatory vaccination sites over 3 viral respiratory illness (VRI) seasons (2012–2015) reported their influenza vaccination status and symptomatic days absent from work weekly throughout a 12-week period during the peak VRI season each year. The adjusted effects of vaccination and other modulating factors on absenteeism rates were estimated using multivariable regression models. Results: The proportion of participants who received influenza vaccination was lower each year at nonmandatory than at mandatory vaccination sites (odds ratio [OR], 0.09; 95% confidence interval [CI], 0.07–0.11). Among HCP who reported at least 1 sick day, vaccinated HCP had lower symptomatic days absent compared to unvaccinated HCP (OR for 2012–2013 and 2013–2014, 0.82; 95% CI, 0.72–0.93; OR for 2014–2015, 0.81; 95% CI, 0.69–0.95). Conclusions: These data suggest that mandatory HCP influenza vaccination policies increase influenza vaccination rates and that HCP symptomatic absenteeism diminishes as rates of influenza vaccination increase. These findings should be considered in formulating HCP influenza vaccination policies.

2017

Infectious Disease Prediction with Kernel Conditional Density Estimation

Ray EL, Sakrejda K, Lauer SA, Johansson MA, Reich NG (2017). Statistics in Medicine, 36: 4908-4929.

forecasting dengue influenza flusight

Abstract

Creating statistical models that generate accurate predictions of infectious disease incidence over multiple time points is a challenging problem whose solution could benefit public health decision makers. We develop a new approach to this problem using kernel conditional density estimation (KCDE) and copulas. We obtain predictive distributions for incidence in individual weeks using KCDE and tie those distributions together into joint distributions using copulas. This strategy enables us to create predictions for the timing of and incidence in the peak week of the season. Our implementation of KCDE incorporates two novel kernel components: a periodic component that captures seasonality in disease incidence, and a component that allows for a full parameterization of the bandwidth matrix with discrete variables. We demonstrate via simulation that a fully parameterized bandwidth matrix can be beneficial for estimating conditional densities. We apply the method to predicting dengue fever and influenza, and compare to a seasonal autoregressive integrated moving average (SARIMA) model and a previously published generalized linear model for infectious disease incidence known as HHH4. KCDE outperforms the baseline methods for predictions of dengue incidence in individual weeks. KCDE also offers more consistent performance than the baseline models for predictions of incidence in the peak week, and is comparable to the baseline models on the other prediction targets. Using the periodic kernel function led to better predictions of incidence. Our approach and extensions of it could yield improved predictions for public health decision makers, particularly in diseases with heterogeneous seasonal dynamics such as dengue fever.

flusight: interactive visualizations for infectious disease forecasts

Tushar A, Reich NG (2017). JOSS, 2.

software flusight

Abstract

The rapid emergence of infectious disease outbreaks from both new and known pathogens remains a critical concern of health officials worldwide. Improving communication between teams of scientific researchers who assemble forecasts of outbreaks before and during epidemics, and policy makers who could integrate these data into decision-making has been identified as a critical area for innovation (Chretien et al. 2015). In an attempt to address this issue, we have developed flusight, a tool for visualizing infectious disease forecasts. It provides an interactive interface for real-time comparison, exploration, and evaluation of infectious disease forecast models over time and geographic regions. A version is live here, with forecasts of influenza in the US that are updated weekly during the US influenza season. Flusight uses D3 (Bostock 2016) for generating visualizations from a single static file that summarizes the entities to be visualized (such as predicted and actual weekly influenza incidence, predicted week with the peak incidence for the season, etc.). It is written to keep hosting overhead minimal and pre-generates the data file by parsing model predictions and live influenza data from delphi-API (undefx 2016). All content is bundled into a static web page. The data collection step can be replaced to visualize data and forecasts from custom sources instead of the ones used in the current repository. This allows future users to plug in similar time-series-based disease prediction models for visualization. This application has potential to be widely used by infectious disease forecasters who generate forecasts in real-time. In this way, we hope that flusight will facilitate dissemination, comparison, and standardized evaluation of outbreak predictions.

Enriching Students' Conceptual Understanding of Confidence Intervals: An Interactive Trivia-based Classroom Activity

Wang X, Reich NG, Horton NJ (2017). The American Statistician.

education

Abstract

Confidence intervals provide a way to determine plausible values for a population parameter. They are omnipresent in research articles involving statistical analyses. Appropriately, a key statistical literacy learning objective is the ability to interpret and understand confidence intervals in a wide range of settings. As instructors, we devote a considerable amount of time and effort to ensure that students master this topic in introductory courses and beyond. Yet, studies continue to find that confidence intervals are commonly misinterpreted and that even experts have trouble calibrating their individual confidence levels. In this article, we present a ten-minute trivia game-based activity that addresses these misconceptions by exposing students to confidence intervals from a personal perspective. We describe how the activity can be integrated into a statistics course as a one-time activity or with repetition at intervals throughout a course, discuss results of using the activity in class, and present possible extensions.

2016

Challenges in Real-Time Prediction of Infectious Disease: A Case Study of Dengue in Thailand

Reich NG, Lauer SA, Sakrejda K, Iamsirithaworn S, Hinjoy S, Suangtho P, Suthachana S, Clapham H, Salje H, Cummings DAT, Lessler J (2016). PLOS Negl Trop Dis, 10: e0004761.

forecasting dengue

Abstract

Epidemics of communicable diseases place a huge burden on public health infrastructures across the world. Producing accurate and actionable forecasts of infectious disease incidence at short and long time scales will improve public health response to outbreaks. However, scientists and public health officials face many obstacles in trying to create such real-time forecasts of infectious disease incidence. Dengue is a mosquito-borne virus that annually infects over 400 million people worldwide. We developed a real-time forecasting model for dengue hemorrhagic fever in the 77 provinces of Thailand. We created a practical computational infrastructure that generated multi-step predictions of dengue incidence in Thai provinces every two weeks throughout 2014. These predictions show mixed performance across provinces, out-performing seasonal baseline models in over half of provinces at a 1.5 month horizon. Additionally, to assess the degree to which delays in case reporting make long-range prediction a challenging task, we compared the performance of our real-time predictions with predictions made with fully reported data. This paper provides valuable lessons for the implementation of real-time predictions in the context of public health decision making.

Evaluating the performance of infectious disease forecasts: A comparison of climate-driven and seasonal dengue forecasts for Mexico

Johansson MA, Reich NG, Hota A, Brownstein JS, Santillana M (2016). Scientific Reports, 6.

forecasting dengue

Abstract

Dengue viruses, which infect millions of people per year worldwide, cause large epidemics that strain healthcare systems. Despite diverse efforts to develop forecasting tools including autoregressive time series, climate-driven statistical, and mechanistic biological models, little work has been done to understand the contribution of different components to improved prediction. We developed a framework to assess and compare dengue forecasts produced from different types of models and evaluated the performance of seasonal autoregressive models with and without climate variables for forecasting dengue incidence in Mexico. Climate data did not significantly improve the predictive power of seasonal autoregressive models. Short-term and seasonal autocorrelation were key to improving short-term and long-term forecasts, respectively. Seasonal autoregressive models captured a substantial amount of dengue variability, but better models are needed to improve dengue forecasting. This framework contributes to the sparse literature of infectious disease prediction model evaluation, using state-of-the-art validation techniques such as out-of-sample testing and comparison to an appropriate reference model.

Case Study in Evaluating Time Series Prediction Models Using the Relative Mean Absolute Error

Reich NG, Lessler J, Sakrejda K, Lauer SA, Iamsirithaworn S, Cummings DAT (2016). The American Statistician, 70: 285-292.

forecasting dengue

Abstract

Statistical prediction models inform decision-making processes in many real-world settings. Prior to using predictions in practice, one must rigorously test and validate candidate models to ensure that the proposed predictions have sufficient accuracy to be used in practice. In this article, we present a framework for evaluating time series predictions, which emphasizes computational simplicity and an intuitive interpretation using the relative mean absolute error metric. For a single time series, this metric enables comparisons of candidate model predictions against naïve reference models, a method that can provide useful and standardized performance benchmarks. Additionally, in applications with multiple time series, this framework facilitates comparisons of one or more models’ predictive performance across different sets of data. We illustrate the use of this metric with a case study comparing predictions of dengue hemorrhagic fever incidence in two provinces of Thailand. This example demonstrates the utility and interpretability of the relative mean absolute error metric in practice, and underscores the practical advantages of using relative performance metrics when evaluating predictions.
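As a rough illustration of the metric described in this abstract, the sketch below computes a relative mean absolute error as the ratio of a candidate model's MAE to a reference model's MAE. The function names and the toy numbers are assumptions made for this example; they are not taken from the article or its software.

```python
from typing import Sequence


def mean_absolute_error(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute difference between observations and predictions."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def relative_mae(observed: Sequence[float],
                 candidate: Sequence[float],
                 reference: Sequence[float]) -> float:
    """MAE of the candidate model divided by MAE of the reference model.

    Values below 1 mean the candidate had lower average error than the
    reference on this series; values above 1 mean it did worse.
    """
    reference_mae = mean_absolute_error(observed, reference)
    if reference_mae == 0:
        raise ZeroDivisionError("reference model has zero error; ratio undefined")
    return mean_absolute_error(observed, candidate) / reference_mae


# Toy numbers, invented purely for illustration.
observed = [10.0, 12.0, 15.0, 11.0]
candidate = [11.0, 12.0, 14.0, 12.0]  # hypothetical candidate predictions
reference = [8.0, 10.0, 12.0, 15.0]   # hypothetical naive reference predictions

print(relative_mae(observed, candidate, reference))
```

Because the result is a ratio against a shared reference, the same calculation can be repeated on several time series and the values compared directly, which is the multi-series use the abstract mentions.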

Time to Key Events in the Course of Zika Infection and their Implications for Surveillance: A Systematic Review and Pooled Analysis

Lessler J, Ott CT, Carcelen AC, Konikoff JM, Williamson J, Bi Q, Reich NG, Cummings DAT, Kucirka LM, Chaisson LH (2016). Bulletin of the World Health Organization, 94.

incubation-period

Abstract

OBJECTIVE: To estimate the timing of key events in the natural history of Zika virus infection. METHODS: In February 2016, we searched PubMed, Scopus and the Web of Science for publications containing the term Zika. By pooling data, we estimated the incubation period, the time to seroconversion and the duration of viral shedding. We estimated the risk of Zika virus contaminated blood donations. FINDINGS: We identified 20 articles on 25 patients with Zika virus infection. The median incubation period for the infection was estimated to be 5.9 days (95% credible interval, CrI: 4.4-7.6), with 95% of people who developed symptoms doing so within 11.2 days (95% CrI: 7.6-18.0) after infection. On average, seroconversion occurred 9.1 days (95% CrI: 7.0-11.6) after infection. The virus was detectable in blood for 9.9 days (95% CrI: 6.9-21.4) on average. Without screening, the estimated risk that a blood donation would come from an infected individual increased by approximately 1 in 10 000 for every 1 per 100 000 person-days increase in the incidence of Zika virus infection. Symptom-based screening may reduce this rate by 7% (relative risk, RR: 0.93; 95% CrI: 0.89-0.99) and antibody screening, by 29% (RR: 0.71; 95% CrI: 0.28-0.88). CONCLUSION: Neither symptom- nor antibody-based screening for Zika virus infection substantially reduced the risk that blood donations would be contaminated by the virus. Polymerase chain reaction testing should be considered for identifying blood safe for use in pregnant women in high-incidence areas.

The Respiratory Protection Effectiveness Clinical Trial (ResPECT): A Cluster-Randomized Comparison of Respirator and Medical Mask Effectiveness against Respiratory Infections in Healthcare Personnel

Radonovich LJ, Bessesen M, Cummings DAT, Eagan A, Gaydos C, Gibert C, Gorse G, Nyquist C, Reich NG, Rodriguez-Barradas M, Savor-Price C, Shaffer R, Simberkoff M, Perl TM (2016). BMC Infectious Diseases, 16: 243.

cluster-randomization

Abstract

Although N95 filtering facepiece respirators and medical masks are commonly used for protection against respiratory infections in healthcare settings, more clinical evidence is needed to understand the optimal settings and exposure circumstances for healthcare personnel to use these devices. A lack of clinically germane research has led to equivocal, and occasionally conflicting, healthcare respiratory protection recommendations from public health organizations, professional societies, and experts. The Respiratory Protection Effectiveness Clinical Trial (ResPECT) is a prospective comparison of respiratory protective equipment to be conducted at multiple U.S. study sites. Healthcare personnel who work in outpatient settings will be cluster-randomized to wear N95 respirators or medical masks for protection against infections during respiratory virus season. Outcome measures will include laboratory-confirmed viral respiratory infections, acute respiratory illness, and influenza-like illness. Participant exposures to patients, coworkers, and others with symptoms and signs of respiratory infection, both within and beyond the workplace, will be recorded in daily diaries. Adherence to study protocols will be monitored by the study team. ResPECT is designed to better understand the extent to which N95 respirators and medical masks reduce clinical illness among healthcare personnel. A fully successful study would produce clinically relevant results that help clinician-leaders make reasoned decisions about protection of healthcare personnel against occupationally acquired respiratory infections and prevention of spread within healthcare systems.

2015

Triggering Interventions for Influenza: The ALERT Algorithm

Reich NG, Cummings DAT, Lauer SA, Zorn M, Robinson C, Nyquist AC, Price CS, Simberkoff M, Radonovich LJ, Perl TM (2015). Clinical Infectious Diseases, 60: 499-504.

forecasting influenza

Abstract

Early, accurate predictions of the onset of influenza season enable targeted implementation of control efforts. Our objective was to develop a tool to assist public health practitioners, researchers, and clinicians in defining the community-level onset of seasonal influenza epidemics. Using recent surveillance data on virologically confirmed infections of influenza, we developed the Above Local Elevated Respiratory Illness Threshold (ALERT) algorithm, a method to identify the period of highest seasonal influenza activity. We used data from 2 large hospitals that serve Baltimore, Maryland and Denver, Colorado, and the surrounding geographic areas. The data used by ALERT are routinely collected surveillance data: weekly case counts of laboratory-confirmed influenza A virus. The main outcome is the percentage of prospective seasonal influenza cases identified by the ALERT algorithm. When ALERT thresholds designed to capture 90% of all cases were applied prospectively to the 2011–2012 and 2012–2013 influenza seasons in both hospitals, 71%–91% of all reported cases fell within the ALERT period. The ALERT algorithm provides a simple, robust, and accurate metric for determining the onset of elevated influenza activity at the community level. This new algorithm provides valuable information that can impact infection prevention recommendations, public health practice, and healthcare delivery.

The Effect of Cluster Size Variability on Statistical Power in Cluster-Randomized Trials

Lauer SA, Kleinman K, Reich NG (2015). PLOS ONE, 10: e0119074.

cluster-randomization

Abstract

The frequency of cluster-randomized trials (CRTs) in peer-reviewed literature has increased exponentially over the past two decades. CRTs are a valuable tool for studying interventions that cannot be effectively implemented or randomized at the individual level. However, some aspects of the design and analysis of data from CRTs are more complex than those for individually randomized controlled trials. One of the key components to designing a successful CRT is calculating the proper sample size (i.e. number of clusters) needed to attain an acceptable level of statistical power. In order to do this, a researcher must make assumptions about the value of several variables, including a fixed mean cluster size. In practice, cluster size can often vary dramatically. Few studies account for the effect of cluster size variation when assessing the statistical power for a given trial. We conducted a simulation study to investigate how the statistical power of CRTs changes with variable cluster sizes. In general, we observed that increases in cluster size variability lead to a decrease in power.
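The idea of estimating power by simulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulation design: the data-generating model (normal outcomes, normal cluster effects), the unweighted cluster-mean analysis, the normal-approximation cutoff of 1.96, and every parameter name and default are hypothetical choices made for this example.

```python
import math
import random
import statistics


def simulate_crt_power(clusters_per_arm, mean_size, size_sd, effect,
                       between_sd=0.5, within_sd=1.0, n_sims=500, seed=1):
    """Estimate power as the share of simulated trials that reject the null.

    Each simulated trial draws a size and a random shift for every cluster,
    then compares unweighted cluster means between the two arms.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        arms = []
        for arm_effect in (0.0, effect):
            cluster_means = []
            for _ in range(clusters_per_arm):
                # Cluster size varies around mean_size; keep at least 2 members.
                size = max(2, round(rng.gauss(mean_size, size_sd)))
                cluster_shift = rng.gauss(0.0, between_sd)
                # Draw the cluster mean directly from its sampling distribution.
                cluster_means.append(
                    rng.gauss(arm_effect + cluster_shift, within_sd / math.sqrt(size))
                )
            arms.append(cluster_means)
        control, treated = arms
        diff = statistics.fmean(treated) - statistics.fmean(control)
        se = math.sqrt(statistics.variance(control) / clusters_per_arm
                       + statistics.variance(treated) / clusters_per_arm)
        # Normal approximation; with few clusters a t reference would be safer.
        if abs(diff / se) > 1.96:
            rejections += 1
    return rejections / n_sims


# Same mean cluster size, with and without size variability (assumed values).
equal_sizes = simulate_crt_power(20, mean_size=20, size_sd=0.0, effect=0.4)
varied_sizes = simulate_crt_power(20, mean_size=20, size_sd=15.0, effect=0.4)
print(equal_sizes, varied_sizes)
```

Comparing the two printed estimates across a range of `size_sd` values is one way to explore how size variability relates to power in this toy setup; the estimates carry Monte Carlo error, so small differences should not be over-read.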

2013

Interactions between serotypes of dengue highlight epidemiological impact of cross-immunity

Reich NG, Shrestha S, King AA, Rohani P, Lessler J, Kalayanarooj S, Yoon IK, Gibbons RV, Burke DS, Cummings DAT (2013). JRSI, 10: 20130414.

pathogen-interactions dengue

Abstract

Dengue, a mosquito-borne virus of humans, infects over 50 million people annually. Infection with any of the four dengue serotypes induces protective immunity to that serotype, but does not confer long-term protection against infection by other serotypes. The immunological interactions between serotypes are of central importance in understanding epidemiological dynamics and anticipating the impact of dengue vaccines. We analysed a 38-year time series with 12 197 serotyped dengue infections from a hospital in Bangkok, Thailand. Using novel mechanistic models to represent different hypothesized immune interactions between serotypes, we found strong evidence that infection with dengue provides substantial short-term cross-protection against other serotypes (approx. 1–3 years). This is the first quantitative evidence that short-term cross-protection exists since human experimental infection studies performed in the 1950s. These findings will impact strategies for designing dengue vaccine studies, future multi-strain modelling efforts, and our understanding of evolutionary pressures in multi-strain disease systems.

2012

Empirical power and sample size calculations for cluster-randomized and cluster-randomized crossover studies

Reich NG, Myers JA, Obeng D, Milstone AM, Perl TM (2012). PLOS ONE, 7: e35564.

cluster-randomization software

Abstract

In recent years, the number of studies using a cluster-randomized design has grown dramatically. In addition, the cluster-randomized crossover design has been touted as a methodological advance that can increase efficiency of cluster-randomized studies in certain situations. While the cluster-randomized crossover trial has become a popular tool, standards of design, analysis, reporting and implementation have not been established for this emergent design. We address one particular aspect of cluster-randomized and cluster-randomized crossover trial design: estimating statistical power. We present a general framework for estimating power via simulation in cluster-randomized studies with or without one or more crossover periods. We have implemented this framework in the clusterPower software package for R, freely available online from the Comprehensive R Archive Network. Our simulation framework is easy to implement and users may customize the methods used for data analysis. We give four examples of using the software in practice. The clusterPower package could play an important role in the design of future cluster-randomized and cluster-randomized crossover studies. This work is the first to establish a universal method for calculating power for both cluster-randomized and cluster-randomized crossover clinical trials. More research is needed to develop standardized and recommended methodology for cluster-randomized crossover studies.

Estimating Absolute and Relative Case Fatality Ratios from Infectious Disease Surveillance Data

Reich NG, Lessler J, Cummings DAT, Brookmeyer R (2012). Biometrics, 68(2): 598-606.

influenza case-fatality

Abstract

Knowing which populations are most at risk for severe outcomes from an emerging infectious disease is crucial in deciding the optimal allocation of resources during an outbreak response. The case fatality ratio (CFR) is the fraction of cases that die after contracting a disease. The relative CFR is the factor by which the case fatality in one group is greater or less than that in a second group. Incomplete reporting of the number of infected individuals, both recovered and dead, can lead to biased estimates of the CFR. We define conditions under which the CFR and the relative CFR are identifiable. Furthermore, we propose an estimator for the relative CFR that controls for time‐varying reporting rates. We generalize our methods to account for elapsed time between infection and death. To demonstrate the new methodology, we use data from the 1918 influenza pandemic to estimate relative CFRs between counties in Maryland. A simulation study evaluates the performance of the methods in outbreak scenarios. An R software package makes the methods and data presented here freely available. Our work highlights the limitations and challenges associated with estimating absolute and relative CFRs in practice. However, in certain situations, the methods presented here can help identify vulnerable subpopulations early in an outbreak of an emerging pathogen such as pandemic influenza.
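The two definitions in this abstract can be written out directly. The sketch below gives only the naive calculations from fully observed counts; it is not the estimator proposed in the article, which additionally controls for time-varying reporting rates. Function names and the counts are assumptions made for illustration.

```python
def case_fatality_ratio(deaths: int, cases: int) -> float:
    """Fraction of cases that die: deaths divided by cases."""
    if cases <= 0:
        raise ValueError("cases must be positive")
    if not 0 <= deaths <= cases:
        raise ValueError("deaths must be between 0 and cases")
    return deaths / cases


def relative_cfr(deaths_a: int, cases_a: int, deaths_b: int, cases_b: int) -> float:
    """Factor by which the CFR in group A differs from the CFR in group B."""
    cfr_b = case_fatality_ratio(deaths_b, cases_b)
    if cfr_b == 0:
        raise ZeroDivisionError("group B has no deaths; relative CFR undefined")
    return case_fatality_ratio(deaths_a, cases_a) / cfr_b


# Hypothetical counts for two groups, invented for illustration.
print(case_fatality_ratio(30, 1000))
print(relative_cfr(30, 1000, 10, 1000))
```

With incomplete reporting, the observed counts are not the true counts, which is why these naive ratios can be biased and why the article studies when the quantities are identifiable.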

2009

Estimating incubation period distributions with coarse data

Reich NG, Lessler J, Cummings DAT, Brookmeyer R (2009). Statistics in Medicine, 28(22): 2769-84.

influenza incubation-period

Abstract

The incubation period, the time between infection and disease onset, is important in the surveillance and control of infectious diseases but is often coarsely observed. Coarse data arises because the time of infection, the time of disease onset or both are not known precisely. Accurate estimates of an incubation period distribution are useful in real-time outbreak investigations and in modeling public health interventions. We compare two methods of estimating such distributions. The first method represents the data as doubly interval-censored. The second introduces a data reduction technique that makes the computation more tractable. In a simulation study, the methods perform similarly when estimating the median, but the first method yields more reliable estimates of the distributional tails. We conduct a sensitivity analysis of the two methods to violations of model assumptions and we apply these methods to historical incubation period data on influenza A and respiratory syncytial virus. The analysis of reduced data is less computationally intensive and performs well for estimating the median under a wide range of conditions. However, for estimation of the tails of the distribution, the doubly interval-censored analysis is the recommended procedure.

Publications

2025

Shandross L, Ray EL, Rogers BW, Reich NG (2025). medRxiv.

Mathis SM, Webber AE, ... Cramer EY, Gerding A, Stark A, Ray EL, Reich NG, Shandross L, Wattanachit N, Wang Y, Zorn MW, ... Reed C, Biggerstaff M, Borchering RK (2025). Nature Communications, 15(1): 6289.

2024

Gerding A, Reich NG, Rogers B, Ray EL (2024). Journal of the Royal Statistical Society Series A: Statistics in Society.

Kim M, Ray EL, Reich NG (2024). arXiv.

Fox SJ, Kim M, Meyers LA, Reich NG, Ray EL (2024). Emerging Infectious Diseases, 30(9): 1967-1969.

Ray EL, Wang Y, Wolfinger RD, Reich NG (2024). arXiv.

Lipsitch M, Bassett MT, Brownstein JS, ... Reich NG, ... Truelove S, Varma JK, Grad YH (2024). Frontiers in Public Health, 12: 1408193.

Shandross L, Howerton E, Contamin L, Hochheiser H, Krystalli A, Consortium of Infectious Disease Modeling Hubs, Reich NG, Ray EL (2024). medRxiv.

Lopez V, Cramer EY, Pagano R, ... Biggerstaff M, Reich NG, Johansson MA (2024). PLOS Comp Bio.

2023

Howerton E, Contamin L, Mullany LC, Qin M, Reich NG, ... Viboud C, Lessler J (2023). Nature Communications, 14(1): 7260.

Gibson GC, Reich NG, Sheldon D (2023). Annals of Applied Statistics, 17(3): 1801-1819.

Wadsworth S, Niemi J, Reich NG (2023). arXiv.

Wattanachit N, Ray EL, McAndrew TC, Reich NG (2023). Statistics in Medicine, 42(26): 4696-4712.

Sherratt K, Gruson H, Grah R, ... Gibson GC, Ray EL, Reich NG, Sheldon D, Wang Y, Wattanachit N, ... Bracher J, Funk S (2023). eLife, 12: e81916.

Reich NG, Wang Y, Burns M, Ergas R, Cramer EY, Ray EL (2023). Epidemics, 45: 100728.

Borchering RK, Mullany LC, Howerton E, Chinazzi M, Smith CP, Qin M, Reich NG, ... Viboud C, Lessler J (2023). The Lancet Regional Health-Americas, 17: 100398.

2022

McAndrew TC, Reich NG (2022). PLOS Comp Bio, 18(9): e1010485.

Nixon K, Jindal S, Parker F, Reich NG, Ghobadi K, Lee EC, Truelove S, Gardner L (2022). The Lancet Digital Health, 4(10): e738-e747.

Nixon K, Jindal S, Parker F, Marshall M, Reich NG, Ghobadi K, Lee EC, Truelove S, Gardner L (2022). The Lancet Digital Health, 4(10): e699–e701.

Ray EL, Brooks LC, Bien J, Biggerstaff M, Bosse NI, Bracher J, Cramer EY, Funk S, Gerding A, Johansson MA, Rumack A, Wang Y, Zorn M, Tibshirani RJ, Reich NG (2022). International Journal of Forecasting, 39: 1366-1383.

Cramer EY, Huang Y, Wang Y, Ray EL, Cornell M, Bracher J, Brennen A, Castro Rivadeneira AJ, Gerding A, House K, Jayawardena D, Kanji AH, Khandelwal A, Le K, Niemi J, Stark A, Shah A, Wattanachit N, Zorn MW, Reich NG (2022). Scientific Data, 9(1): 1-15.

Reich NG, Lessler J, Funk S, Viboud C, Vespignani A, Tibshirani RJ, Shea K, Schienle M, Runge MC, Rosenfeld R, Ray EL, Niehus R, Johnson HC, Johansson MA, Hochheiser H, Gardner L, Bracher J, Borchering RK, Biggerstaff M (2022). AJPH, 112(6): 839-842.

Cramer EY, Ray EL, Lopez VK, Bracher J, ... Slayton RB, Johansson M, Biggerstaff M, Reich NG (2022). PNAS, 119(15): e2113561119.

Reich NG, Ray EL (2022). PNAS, 119(14): e2200703119.

2021

Pollett S, Johansson MA, Reich NG, ... Viboud C, Brady O, Rivers C (2021). PLOS Medicine, 18(10): e1003793.

Borchering RK, Viboud C, Howerton E, Smith CP, Truelove S, Runge MC, Reich NG, ... Shea K, Lessler J (2021). Morbidity and Mortality Weekly Report (MMWR), 70(19): 719–724.

Reich NG, Cornell M, Ray EL, House K, Le K (2021). Scientific Data, 8(59).

Gibson GC, Moran K, Reich NG, Osthus D (2021). PLOS Comp Bio.

Bracher J, Ray EL, Gneiting T, Reich NG (2021). PLOS Comp Bio.

McAndrew T, Reich NG (2021). Statistics in Medicine, 40(30): 6931-6952.

McAndrew T, Wattanachit N, Gibson GC, Reich NG (2021). Wiley Interdisciplinary Reviews: Computational Statistics, 13(2): e1514.

Snyder T, Ravenhurst J, Cramer EY, Reich NG, Balzer L, Alfandari D, Lover AA (2021). BMJ Open, 11:e051157: 1-10.

2020

Pollett S, Johansson M, Biggerstaff M, Morton LC, Bazaco SL, Major DMB, Ibarra AMS, Pavlin JA, Mate S, Sippy R, Hartman LJ, Reich NG, Maljkovic I, Chretien BJP, Althouse BM, Myer D, Viboud C, Rivers C (2020). Epidemics, 33: 100-400.

Bi Q, Cummings DAT, Reich NG, Keegan LT, Kaminsky J, Salje H, Clapham H, Doung-ngern P, Iamsirithaworn S, Lessler J (2020). medRxiv.

Ray EL, Wattanachit N, Niemi J, Kanji AH, House K, Cramer EY, ... Reich NG on behalf of the COVID-19 Forecast Hub Consortium (2020). medRxiv.

Weinberger D, Cohen T, Crawford F, Mostashari F, Olson D, Pitzer VE, Reich NG, Russi M, Simonsen L, Watkins A, Viboud C (2020). JAMA Internal Medicine.

Lauer SA, Brown AC, Reich NG (2020). arXiv.