Our team develops statistical methods and analytical tools to help people make sense of data. Building on close collaborative relationships with public health practitioners from across the world -- from Denver to San Juan to Bangkok -- we use modern statistical and machine learning tools to gain insights into complex disease systems.
We build software packages, maintain code repositories, develop SQL databases, create interactive data visualizations, and run computationally intensive simulation studies. We use the R programming language for most of our work, but also use some C, C++, and Python.
Check out our work on GitHub.
Our team develops models for understanding complex and dynamic systems of infectious disease. We have developed real-time forecasts of dengue fever in Thailand, estimated the duration of cross-protection between serotypes of dengue, and predicted the trajectory of the flu season in the US.
Here are the slides for my presentation today at the annual MIDAS conference in Atlanta, GA. The talk summarizes recent work led by post-doc Evan Ray on creating interpretable “feature-weighted density ensembles” for infectious disease forecasting. The paper is currently under review, but the preprint is available on arXiv. Check out the 2017-2018 real-time influenza forecasts from this model available on our flusight app. And here are some slices of the feature-dependent weighting functions for predicting peak incidence for influenza in the U.S.
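To give a rough feel for what "feature-dependent weighting" means, here is a minimal sketch in Python. It is not the method from the paper: the softmax over linear scores, the single `week` feature, and all coefficients and forecast values below are hypothetical stand-ins chosen only to show model weights changing as a feature changes.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def feature_weights(week, coefs):
    """Model weights as a function of one feature (here, week of season).

    coefs holds one hypothetical (intercept, slope) pair per component model.
    """
    return softmax([intercept + slope * week for intercept, slope in coefs])


def weighted_ensemble(forecasts, weights):
    """Weighted average of binned predictive distributions, bin by bin."""
    n_bins = len(forecasts[0])
    return [
        sum(w * f[b] for w, f in zip(weights, forecasts))
        for b in range(n_bins)
    ]


# Hypothetical coefficients: model 0 gains weight as the season progresses,
# model 1 loses weight.
coefs = [(0.0, 0.2), (0.0, -0.2)]

# Hypothetical binned forecasts (probabilities over three bins) from two models.
forecasts = [[0.1, 0.6, 0.3], [0.3, 0.4, 0.3]]

early_weights = feature_weights(0, coefs)
late_weights = feature_weights(10, coefs)
late_ensemble = weighted_ensemble(forecasts, late_weights)
```

With these made-up coefficients the two models are weighted equally at week 0, and the first model dominates by week 10.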
Posted on 24 May, 2017
I wrote a response to Siddhartha Mukherjee’s article “A.I. vs. M.D.” that appeared in the New Yorker last month. While I submitted it as a letter to the editor, they didn’t publish it. In retrospect, perhaps it was a bit long-winded for their curt and pithy letters section. Mukherjee’s article was published on the heels of Evan submitting his latest work on improving the consistency of infectious disease prediction using interpretable model averaging methods. What follows is the letter I submitted.
Posted on 14 May, 2017
We updated our U.S. influenza forecasts on Tuesday, November 29th. (We tend to update the forecasts on Mondays, but the CDC data release was delayed this week due to Thanksgiving last week.) Overall, the data and the short-term forecasts for flu are showing regional circulation of flu that is a bit below the CDC-defined baseline levels. The two exceptions are HHS Region 2 (NY and NJ), which is right at its baseline level according to the most recent data from the CDC (reported through November 19th), and HHS Region 4 (the southeastern corner of the US), which has already risen above its baseline. Region 4 has historically had somewhat earlier seasons than the rest of the US. Check out our interactive FluSight app for more details on each region.
Posted on 30 November, 2016
For the second year in a row, the Reich Lab is participating in the CDC FluSight challenge, a project where teams from around the country submit real-time predictions of influenza to the CDC. The teams use a variety of models and methods to generate these predictions: an empirical Bayes method that uses Google search data, an extended Kalman-filter method that uses humidity data, our own kernel conditional density estimation method that uses recent incidence, and many others!
This year, we – well, mostly Evan – have developed a new ensemble method that combines predictions from different models. We – mostly Abhinav – also created a visualizer for our predictions. Check it out here! It’s still early in the season, and we’re not seeing much data to suggest that this will be an unusually high or low year, but that’s largely because there just isn’t much information in the early-season data. In this post, I’m going to give you a quick tour under the hood of our ensemble forecasting methodology. At some point, we’ll have an article up on GitHub or arXiv, but for now, this explanation will have to suffice.
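As a generic illustration of what "combining predictions from different models" can look like, here is a minimal sketch of a fixed-weight average of binned predictive distributions (a linear pool). This is an assumption-laden stand-in, not our actual ensemble: the function name, the weights, and the forecast values are all hypothetical.

```python
def combine_forecasts(forecasts, weights):
    """Linear pool: weighted average of binned predictive distributions.

    forecasts: list of per-model lists of bin probabilities (same bins).
    weights:   one non-negative weight per model; normalized internally.
    """
    if len(forecasts) != len(weights):
        raise ValueError("need exactly one weight per model")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    n_bins = len(forecasts[0])
    if any(len(f) != n_bins for f in forecasts):
        raise ValueError("all forecasts must use the same bins")
    return [
        sum((w / total) * f[b] for w, f in zip(weights, forecasts))
        for b in range(n_bins)
    ]


# Hypothetical forecasts from two models over three incidence bins.
model_a = [0.1, 0.6, 0.3]
model_b = [0.3, 0.4, 0.3]

# Hypothetical weights favoring model_a.
ensemble = combine_forecasts([model_a, model_b], [0.75, 0.25])
```

Because each component is a probability distribution and the normalized weights sum to one, the combined forecast is itself a valid probability distribution over the same bins.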
Posted on 23 November, 2016
This week I attended a workshop at the CDC about last year’s FluSight challenge, a competition that scores weekly real-time predictions about the course of the influenza season. They are planning another round this year and are hoping to increase the number of teams participating. Stay tuned to this site for more info.
At the workshop, I learned about DELPHI’s real-time epidemiological data API. The API is linked to various data sources on influenza and dengue, including US CDC flu data, Google Flu Trends, and Wikipedia data. There is some documentation and a few minimal examples; this post documents a more robust and complete example of using the API via R. I’ll note that the CDC’s influenza data can also be accessed via the cdcfluview R package, which I’m not going to discuss; instead, I will focus on accessing some of the other data sources. Here’s a teaser of this data, which you can also explore interactively on the DELPHI EpiVis website:
Posted on 01 September, 2016
Another Five College ASA DataFest has long come and gone, and I’ve been meaning to write a recap for a while. Now in its third year in the Pioneer Valley in Western Massachusetts, the number of registrants doubled from last year, from 70 to 140. All Five Colleges (Amherst, Hampshire, Mt. Holyoke, Smith, and UMass-Amherst) sent multiple teams, and there were a few teams with a mix of students from different schools.
Team “Beta than U” from UMass-Amherst took home one of the Best in Group awards. From left to right: Laura Bowles, Vincent Lee, Harley Jean, Bianca Agustin, and Stephanie Crowley.
Posted on 04 April, 2016
Sheri Fink published this nice piece in the New York Times yesterday on the legal issues surrounding state-imposed quarantines on travelers returning from countries with widespread Ebola transmission. In addition to the toll these policies have had on the individuals who have been put under quarantine, I took away from this article that there is still a need for better data on and communication about the risks of travelers being infected with Ebola. As it happens, this is the topic of my talk today at the Epidemics5 conference.
Posted on 03 December, 2015
Last week, I had the honor of presenting at the 64th Annual Meeting of the American Society of Tropical Medicine & Hygiene (ASTMH) in the well-attended Dengue: Epidemiology session. This presentation covers our work with the Thai Ministry of Public Health and Johns Hopkins University in building an infrastructure for making real-time dengue hemorrhagic fever case predictions and evaluating the performance of our predictions thus far.
Posted on 17 November, 2015
FiveThirtyEight’s new CARMELO prediction algorithm, which projects the future careers of every NBA player, has similarities with prediction methods in other fields.
Posted on 12 October, 2015
In a feat of focused coding jujitsu, Krzysztof successfully put together a pull request to the base development version of Stan.
Posted on 10 September, 2015
The lab participated in the Dengue Forecasting Project, hosted by various federal government agencies.
Posted on 17 August, 2015
The Reich Lab has had a busy summer!
Posted on 16 August, 2015