COVID-19 Forecasting: Fit to a curve or model the disease in real-time?


If you are going to model a rapidly moving disease that grows exponentially, saturating hospitals and claiming thousands of lives with every new infected community, would you want to fit a curve based on extrapolations from other countries? Or do you want a flexible epidemiological model that fits to all available local data and allows modeling of future intervention and reopening strategies?

Our team thought of this exact question and the problems that we as a society are up against with COVID-19. What can we do to help prevent the healthcare system from breaking down? As a company at the intersection of medicine and technology, and with several epidemiologists on staff, we can arm policy makers with a better understanding of how COVID-19 will affect their communities at a local level through robust disease modeling and engineering.

To drive our impact directly into the heart of public policy, we recently partnered with Covid Act Now to distribute a localized, county-level SEIR model that can fit and predict local disease progressions under a variety of past and future social distancing implementations, all while incorporating and inferring daily information about disease parameters, hospital capacity, deaths and public policy.

 

The Challenges of Modeling an Infectious Disease

Modeling infectious diseases in real time is highly challenging because exponential growth is extremely sensitive to epidemiological parameters that are still fuzzy and vary locally. A heavily validated report from Imperial College London estimated that, if the disease progressed unchecked, there would be 2.2 million deaths in the U.S. alone. Other models, such as the IHME COVID-19 model from the Institute for Health Metrics and Evaluation at the University of Washington, predicted deaths in the range of 100,000 to 240,000, only to revise their numbers down to 60,400 a few days later. Now, IHME is increasing those projections as our trajectory has diverged from the geographies upon which the IHME models were initially based.
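To see why small parameter uncertainty matters so much under exponential growth, here is a minimal Python sketch. The function name `cases_after`, the 100 starting cases and the doubling times of 3, 4 and 5 days are illustrative assumptions, not estimates taken from any of the models discussed here.

```python
def cases_after(days, doubling_time_days, initial_cases=100):
    """Cases after `days` of unchecked exponential growth.

    All inputs are hypothetical; this ignores interventions and
    the depletion of susceptible people.
    """
    return initial_cases * 2 ** (days / doubling_time_days)


# Compare three plausible-looking doubling times over the same 30 days.
for doubling_time in (3.0, 4.0, 5.0):
    projected = cases_after(30, doubling_time)
    print(f"doubling every {doubling_time} days -> {projected:,.0f} cases")
```

Under these assumed numbers, a 3-day doubling time yields 102,400 cases after 30 days while a 5-day doubling time yields 6,400, a 16-fold gap from a two-day difference in a single parameter.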

No model is perfect, but we at Grand Rounds believe that models like the IHME model cannot be used in isolation. A growing number of epidemiologists agree that the approach used by the IHME is flawed in its ability to robustly model COVID-19’s infection rate through the U.S. population because it does not model the disease itself. Furthermore, the model methodology prohibits its use in navigating the tough road ahead, where health and economic policy implementations must interplay as regions experiment with different approaches to cycles of reopening and mitigation.

The IHME model is a statistical curve-fitting model rather than a model of disease dynamics. Using current numbers, mostly observed deaths, as a starting point, the model approximates the disease’s progression in the U.S. by fitting a growth curve based on how the disease progressed in other geographies that are farther along in their COVID-19 outbreaks. Importantly, the disease has not completed its course in these places, nor is its trajectory through the population identical. The IHME model cannot account for these differences, or for variations in age, comorbidities or contact patterns (spread in workplaces vs. schools vs. restaurants).
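As a rough illustration of what "curve fitting" means in general, the sketch below fits a straight line to the logarithm of a short series of counts and extrapolates it forward. This is not the IHME methodology; the function names, the synthetic counts and the choice of a plain exponential curve are all assumptions made for the example.

```python
import math


def fit_exponential(days, counts):
    """Least-squares fit of log(count) = a + b * day. Returns (a, b)."""
    logs = [math.log(c) for c in counts]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def extrapolate(a, b, day):
    """Project the fitted curve to a future day."""
    return math.exp(a + b * day)


# Synthetic (made-up) early counts growing at 20% per day.
days = [0, 1, 2, 3, 4, 5]
counts = [10 * math.exp(0.2 * d) for d in days]

a, b = fit_exponential(days, counts)
print(f"fitted daily growth rate: {b:.3f}")
print(f"projected count on day 30: {extrapolate(a, b, 30):,.0f}")
```

The fitted curve knows nothing about who is susceptible, how people mix or what policies are in place; it simply continues the shape it was given, which is the limitation the paragraph above describes.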

However, the IHME model does have some advantages over a bottom-up model. First, a curve-fitting model can be implemented quickly: one can simply assume the disease will behave in a somewhat similar fashion and make a projection based on prior progressions. Second, the individual parameters of a bottom-up model do not have to be captured or known, so in the early phases of a new infection, complex historical growth patterns can still be inferred.

 

Why model the disease and localize the results? 

As much as we would like to believe the IHME model will work, the U.S. is not the same as China. Our governments act differently, our societies behave differently and we as individuals make different choices. The evidence is clear that COVID-19 affects populations differently depending on factors such as population density, policy (and compliance) and even climate. While it is beneficial to look at what happened in Asia and Europe, infection growth rates and fatality rates vary, with vast implications for projections of total hospitalizations and deaths.

The reality is that regions across the U.S. operate differently and have varying capabilities to react to the pandemic. Differences in the timing of seed infections and in the effective reproduction rate can move the projected peak, when the regional healthcare system is overloaded, from April to September. In fact, a preprint study by statisticians in Australia and the U.S. (awaiting peer review) found that the IHME model is directionally correct at the federal level for the U.S. but does not do well modeling the disease growth at the state level, let alone the local level.

Health policy is influenced and implemented much more at the local level than the federal level. A governor or mayor who must make the hard decision to shut down a local geography needs tools and data to rely on, including an understanding of how the disease is likely to affect that geography.

Because of the factors listed above, and because there isn’t an abundance of granular local data available from other parts of the world that have already been affected by COVID-19, we chose to build a COVID-19 SEIR model. Our model extends modeling work done by Dr. Alison Hill, a research fellow at Harvard’s Program for Evolutionary Dynamics, and has been endorsed by leading epidemiologists and public health professionals.

 

How Our Model Works

SEIR models have been used by epidemiologists to model disease outbreaks for nearly a century. An SEIR model’s goal is to predict how a disease will be transmitted by moving the population through a set of connected “compartments” that flow into each other based on rates of incubation, infection and recovery. These models are capable of capturing complex variations in disease contact (for example, how different ages mix), demographics or hospitalization states.

In a simplified model, the population flows through four stages: susceptible, exposed, infected, and recovered or dead. Susceptible individuals may become exposed to the virus, then infected at varying levels of disease severity (mild, moderate or severe ICU hospitalization). Infected individuals then either recover (R) or die (D). The figure below describes the process employed by an SEIR model:

By categorizing people as existing in one of the various states and then modeling how people progress through them, we can infer the individual parameters and incorporate data from smaller and larger geographies to inform the local level. This allows our SEIR model to adapt to each level of geography that we attempt to model, thereby giving a more nuanced view for policy makers. As shown below, San Francisco County has a significantly different disease progression compared to California at large. This has major implications for those citizens, companies and governments operating across the state.
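To make the compartment flow concrete, here is a minimal Python sketch of a simplified SEIR model with a death compartment, integrated with small forward-Euler time steps. It is not the Grand Rounds or Covid Act Now implementation: the function name `simulate_seird`, every parameter value and the single undifferentiated "infected" compartment (no mild, moderate or severe split) are assumptions made for illustration.

```python
def simulate_seird(population, initial_infected, beta, sigma, gamma,
                   fatality, days, dt=0.1):
    """Simulate S -> E -> I -> (R or D) and return one row per day.

    beta:     transmission rate per day (hypothetical value)
    sigma:    1 / incubation period in days
    gamma:    1 / infectious period in days
    fatality: fraction of removed infections that end in death
    dt:       Euler step in days; assumes 1 / dt is a whole number
    """
    s = float(population - initial_infected)
    e, i, r, d = 0.0, float(initial_infected), 0.0, 0.0
    steps_per_day = int(round(1 / dt))
    history = []
    for day in range(days + 1):
        history.append((day, s, e, i, r, d))
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            removed = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - removed
            r += (1 - fatality) * removed
            d += fatality * removed
    return history


# Hypothetical county of one million people with ten seed infections.
history = simulate_seird(population=1_000_000, initial_infected=10,
                         beta=0.5, sigma=0.2, gamma=0.1,
                         fatality=0.01, days=200)
peak_day, *_ = max(history, key=lambda row: row[3])
print(f"infections peak on day {peak_day}")
print(f"cumulative deaths by day 200: {history[-1][5]:,.0f}")
```

Because every person is always in exactly one compartment, the five totals sum to the population at every step, which is a useful sanity check when extending a model like this with more compartments.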

 

 

Another benefit of this bottom-up approach is that policy makers can plan against different scenarios. Rather than assuming a model will be perfect, policy makers can run many simulations of the SEIR model with different parameter values to better understand the range of impact that COVID-19 might have. It also allows them to focus on the parameters that will have the largest impact on mortality. This is important because policies are changing all the time, and the predictive power of a model is only as good as its ability to measure and model the effectiveness of these policies.
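Scenario planning of this kind can be sketched by running the same compartment model under several assumed levels of contact reduction and comparing the outcomes. The snippet below is a self-contained illustration only; `peak_infected`, `BASELINE_BETA` and all numeric values are hypothetical and are not taken from the Grand Rounds model.

```python
def peak_infected(beta, sigma=0.2, gamma=0.1, population=1_000_000,
                  initial_infected=10, days=500, dt=0.25):
    """Peak number of simultaneously infected people in a basic SEIR run.

    Uses forward-Euler steps of `dt` days; all defaults are assumptions.
    """
    s = float(population - initial_infected)
    e, i = 0.0, float(initial_infected)
    peak = i
    for _ in range(int(days / dt)):
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        removed = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - removed
        peak = max(peak, i)
    return peak


BASELINE_BETA = 0.3  # hypothetical transmission rate with no intervention

# Each scenario scales transmission down by an assumed contact reduction.
peaks = {}
for reduction in (0.0, 0.25, 0.5):
    peaks[reduction] = peak_infected(BASELINE_BETA * (1 - reduction))
    print(f"{reduction:.0%} contact reduction -> "
          f"peak of {peaks[reduction]:,.0f} infected at once")
```

Sweeping one parameter at a time like this shows which assumptions move the peak the most, which is the kind of comparison a policy maker would want before choosing between interventions.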

Finally, another benefit of modeling the disease in this way is the ability to rapidly incorporate updated data or new data sets. Moreover, because data sparseness differs across geographies, we need a way for the model to be projected down in some situations. The Grand Rounds SEIR model is hierarchical, so we can project down into lower-level geographies in a statistically robust manner.

The Grand Rounds COVID-19 SEIR model is open source and you can view the code here.

 

How has the model been used?

The data and models available from Covid Act Now have been used by multiple governmental agencies in many of the ways we had hoped. Governmental agencies at various levels in California, Alaska, Kentucky, Texas, Arizona and elsewhere have used the model, along with others, to understand how COVID-19 might progress under different scenarios of non-pharmaceutical interventions (NPIs) like social distancing. In addition, many leaders are using the models to understand how effective the NPIs have been and how to think about the impact of potentially relaxing the interventions.

On the impact of the Covid Act Now models, Dr. Thomas Hennessy of the University of Alaska Anchorage said, “We found the Act Now model to be really helpful in communicating to decision makers and community leaders in Alaska. By having the flexibility to enter data for specific populations, we created versions of the model for Anchorage, Fairbanks and the state. These were then used in discussions about the scenarios we face with varying degrees of suppression of the epidemic.”

 

What’s next?

While our goal was to inform policy makers to take swifter action and prevent the collapse of the healthcare system, COVID-19 will be with us for some time and the effects will be staggering. Improving the robustness of the models down to the local level will help our government leaders make informed decisions. Moreover, it is a real possibility that we will face secondary waves like those Singapore and Japan are currently encountering. To that end, we believe we can improve our response to future waves by incorporating more data sources and having a software solution that can automatically return up-to-date projections; this will ensure we can quickly see the early warning signs.

There are also individuals who cannot access routine care during this time. We believe our model can help identify those who are most in need of additional help right now, filling the gaps in their care and perhaps helping them avoid the emergency room or hospital for critical care.

Finally, knowing when and how to relax social distancing is a hard problem. We believe that with extensions to the models we have built, we can inform not only local governments but also companies on how different strategies might increase their risk of having an outbreak.