USA Risk estimates by county

Wear a mask

We have stopped updating the data for this application due to reporting changes and declining test volumes. Soon you'll be able to explore Covid's ebb and flow over time, but for now the data is frozen at December 27, 2022.

(Note: This map uses a Web Mercator projection that inflates the area of states in northern latitudes. County boundaries are generalized for faster drawing.)

Risk context

Can you guess the risk levels in your community? Take the quiz to find out, and share your high score.

Imagine a coffee shop in your area with 20 people inside. What's the probability that at least one of the people is infected with COVID-19?

Imagine a grocery store in your area with 50 people inside. What's the probability that at least one of the people is infected with COVID-19?

Imagine a movie theater in your area with 100 people inside. What's the probability that at least one of the people is infected with COVID-19?

Imagine a graduation ceremony in your area with 1,000 people inside. What's the probability that at least one of the people is infected with COVID-19?

As many parts of the United States begin to lift shelter-in-place, it’s crucial for us to be able to estimate the risks involved with resuming non-essential activities, particularly those involving large crowds. The COVID-19 Risk Assessment Planning tool can be used to explore the risk that at least one person at an event of a certain size is currently infected with COVID-19, given a certain number of circulating infections in the specified region.

Our risk calculations tell you only how likely it is that at least one person at any event of a given size is infectious. This is not the same as the risk of any person being exposed to or infected with COVID-19 at the event.

We define circulating cases (people who are currently infectious) as cases reported in the past ten days. Note that real-time risk assessments prior to August 14th used a fourteen-day window. The choice of a reduced duration is consistent with CDC advice on the duration of typical infectious periods (though we recognize that individuals may shed longer). We correct for under-reporting by multiplying reported cases by an ascertainment bias factor. Based on seroprevalence data, we suspect that in many parts of the US this is around 10:1 (i.e., ten total cases for every one reported), but this rate may vary by location, and we also include a 5:1 ratio on the main page. Cases may be under-reported due to testing shortages, asymptomatic “silent spreaders,” and reporting lags.
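As a rough illustration of the definition above, the following Python sketch sums the reported cases in the most recent ten days and scales them by an ascertainment bias factor. The function name, the list-of-daily-counts input, and the sample numbers are assumptions made for illustration; they are not taken from the tool's source code.

```python
def estimate_circulating_cases(daily_reported, window_days=10, ascertainment_bias=10):
    """Estimate currently infectious cases from daily reported counts.

    daily_reported: daily reported case counts, oldest first (hypothetical input).
    window_days: days a reported case is assumed to stay infectious (10, per the text).
    ascertainment_bias: assumed total cases per reported case (5 or 10 in the text).
    """
    if window_days <= 0 or ascertainment_bias <= 0:
        raise ValueError("window_days and ascertainment_bias must be positive")
    recent = daily_reported[-window_days:]  # keep only the most recent window
    return sum(recent) * ascertainment_bias


# Made-up example: 14 days of reports at 100 cases per day.
reports = [100] * 14
print(estimate_circulating_cases(reports))                        # 10:1 ratio
print(estimate_circulating_cases(reports, ascertainment_bias=5))  # 5:1 ratio
```

Only the last ten days contribute, so older reports in the list do not change the estimate.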

Our tool generates figures that look like this.

COVID-19 Event Risk Assessment Planner - US - Exploratory

Please note that our axes are given on a logarithmic scale, so moving up by one tick means multiplying that variable by ten. The diagonal lines divide the chart into risk levels. For example, all scenarios between the orange and red lines involve a 10-50% risk that someone with COVID-19 is present. The grey region indicates scenarios with a less than 1% chance that someone with COVID-19 is present. We give you exact values for a few preset scenarios in the blue boxes. In this example, we also see a 48.7% chance (red dot ⬤ ) that someone has COVID-19 at an event with 275 attendees if 800,000 cases are circulating in the US.
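The red-dot example can be checked with a short Python sketch, which also locates where the 50% line falls for the same number of circulating cases. It assumes the formula described under "How we do our calculation" (risk = 1 - (1 - p)^n, with p the fraction of the population that is infected) and a US population of 330 million; the function names are hypothetical.

```python
import math


def event_risk(circulating_cases, population, event_size):
    """Probability that at least one of event_size attendees is infected."""
    p_infected = circulating_cases / population
    return 1 - (1 - p_infected) ** event_size


def event_size_for_risk(circulating_cases, population, target_risk):
    """Event size (possibly fractional) at which the risk reaches target_risk."""
    p_infected = circulating_cases / population
    if not 0 < p_infected < 1 or not 0 < target_risk < 1:
        raise ValueError("infected fraction and target_risk must be between 0 and 1")
    # Solve 1 - (1 - p)^n = target_risk for n.
    return math.log(1 - target_risk) / math.log(1 - p_infected)


US_POPULATION = 330_000_000  # assumed, matching the figure used in the worked example

risk = event_risk(800_000, US_POPULATION, 275)
print(f"Risk at 275 attendees: {risk:.1%}")

size_50 = event_size_for_risk(800_000, US_POPULATION, 0.5)
print(f"Event size where risk reaches 50%: about {size_50:.0f} attendees")
```

Under these assumptions the 275-person event sits just below the 50% line, which is consistent with the 48.7% shown in the figure.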

You can get exact values for your own scenario using the Explore US and State-level estimates tab. You can generate a risk assessment planner for the entire country or focus on a particular state (this will just change our estimate of the total population size and the proportion of people who are infected).

When you input the number of circulating cases, it’s important that you include only those that are currently infectious. This is different from the total number of cases reported because people infected several weeks ago are likely no longer contagious. It’s also likely that we’re only detecting a fraction of cases due to testing shortages, reporting lags, and asymptomatic “silent spreaders.” A rough calculation you can do is to take the past week or two of reported cases and potentially multiply it by some constant (for example, five or ten) to correct for the virus’ ongoing spread and the proportion of cases you think may be undetected.

Otherwise, we’ve done that calculation for you in the Real Time US and State-level estimates tab. The horizontal dotted lines with risk estimates are based on real-time COVID19 surveillance data. They represent estimates given the current reported incidence (circle ⬤ ), 5 times the current incidence (triangle ▲ ), and 10 times the current incidence (square ◼ ).

Notes on Usage and Interpretation:

Please feel free to share any plots that you generate (we’ve provided a Download button). We’d love for you to use this as a tool to educate your community and weigh the risks of holding certain events right now. You can see how this tool is already being used in the Press tab.

Here’s a sample tweet to accompany the graphic:

All of our calculations are necessarily estimates, based on imperfect data. We can’t tell you the probability that someone in the event will get infected. It’s important to remember that a certain amount of chance is involved in these outcomes. We’d encourage large event planners to exercise caution in the coming months, especially given the potential for one infected person to transmit the virus to many others in one super-spreading event (Biogen conference, Atalanta-Valencia soccer match, Washington choir practice).

As a final note, there is a moderate to high risk of being exposed to COVID-19 in many parts of the US right now. You can reduce your risk of getting infected or infecting someone else by practicing social distancing, wearing masks when out of your home, hand-washing, and staying home when you feel sick. Learn more on how to minimize your individual risk at https://www.cdc.gov/coronavirus/2019-ncov/prevent-getting-sick/prevention.html .

How we do our calculation:

What is the chance that one person at this event will already be infected with COVID-19? To answer this kind of question, we actually calculate the opposite. For example, if we were talking about an NFL game, we’d ask what is the chance that none of the 75,000 attendees is infected?

Let’s start by thinking about just one of them. If 20,000 of the 330 million people in the United States are sick, then each person has a 99.994% chance of being disease-free. In betting terms, the odds are 16,500:1 in our favor. While that sounds good from an individual perspective, the collective risk is very different.

In this scenario, the probability that all 75,000 attendees would have entered the stadium disease-free is like placing 75,000 bets each at nearly certain odds. Sure, you’ll win most of the bets. But the probability that you will win every single one of those bets is extremely low. To calculate it, we multiply the winning probability (1-1/16500) by itself 75,000 times and find that there is approximately a 1% chance that we win every time. In other words, the chance that one or more attendees would have arrived infected with SARS-CoV-2 is 99%.
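The stadium example can be reproduced in a few lines of Python. This is a minimal sketch of the arithmetic described above, using the numbers from the text (20,000 infected, 330 million people, 75,000 attendees); the function and variable names are chosen here for illustration.

```python
def risk_at_least_one_infected(infected, population, attendees):
    """Chance that one or more attendees arrive infected.

    Computes the opposite event first: every attendee is disease-free.
    """
    p_one_healthy = 1 - infected / population    # one person is disease-free
    p_all_healthy = p_one_healthy ** attendees   # win every single "bet"
    return 1 - p_all_healthy


infected, population, attendees = 20_000, 330_000_000, 75_000

print(f"Odds in favor of one attendee: {population / infected:,.0f}:1")
print(f"One attendee disease-free: {1 - infected / population:.3%}")

risk = risk_at_least_one_infected(infected, population, attendees)
print(f"All attendees disease-free: {1 - risk:.0%}")
print(f"At least one infected: {risk:.0%}")
```

The calculation treats attendees as independent draws from the general population, which is the simplifying assumption behind the betting analogy.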

  • Data source
  • Previously Released Charts

This site provides interactive context to assess the risk that one or more individuals infected with COVID-19 are present at events of various sizes. The model is simple, intentionally so. It provided some context for the rationale to halt large gatherings in early-mid March and provides newly relevant context for considering when and how to re-open. Precisely because of under-testing and the risk of exposure and infection, these risk calculations provide further support for the ongoing need for social distancing and protective measures. Such precautions are still needed even in small events, given the large number of circulating cases.

Contributors:

Conceptual development.

  • Joshua Weitz (Georgia Institute of Technology, Biological Sciences, GT-BIOS)

Website and Dashboard Development

  • Aroon T. Chande (GT-BIOS and Applied Bioinformatics Laboratory)
  • Lavanya Rishishwar (GT-BIOS and Applied Bioinformatics Laboratory)
  • Walker Gussler
  • Mallory Harris (Stanford Biology)
  • Stephen Beckett (GT-BIOS)
  • Quan Nguyen (GT-BIOS)
  • Seohla Lee (Friendly Cities Lab at GT)
  • Clio M. Andris (Friendly Cities Lab at GT)
  • I. King Jordan (GT-BIOS)

Risk Expansion Development

  • Emily Norris (Applied Bioinformatics Laboratory)
  • Freyja A. Brandel-Tanis (Friendly Cities Lab at GT)
  • Audra Davidson (GT-BIOS)
  • Allie Sinclair (Duke University)
  • Morgan Taylor (Duke University)
  • R. Alison Adcock (Duke University)
  • Gregory Samanez-Larkin (Duke University)
  • Ongoing support for the project is via the Centers for Disease Control and Prevention (75D30121P10600)
  • Initial funding for the project was made possible by support from the Simons Foundation (329108), Army Research Office (W911NF1910384), National Institutes of Health (1R01AI46592-01), and National Science Foundation (1806606, 1829636 and 2032084).
  • Additional support of the project from the Charities in Aid Foundation and The Marier Cunningham Foundation.

Institutional Review Board (IRB)

The Institutional Review Boards (IRBs) at Georgia Tech and Duke University have reviewed the protocol for collecting user data from this site via surveys and risk prediction quizzes. Approvals were granted effective August 27, 2021 and classified as ‘Minimal risk research qualified for exemption status’. For more information please contact the Office of Research Integrity Assurance, Georgia Institute of Technology, irb.gatech.edu.

Acknowledgements

The team thanks Richard Lenski, Lauren Meyers, and Jonathan Dushoff for input on concept development.

International Collaborations

Italy: http://datainterfaces.org/projects/covid19eventi/

Spain: https://eventosycovid19.es

How to cite

Chande, A., Lee, S., Harris, M. et al. Real-time, interactive website for US-county-level COVID-19 event risk assessment. Nat Hum Behav (2020). https://doi.org/10.1038/s41562-020-01000-9

Chande, A.T., Gussler, W., Harris, M., Lee, S., Rishishwar, L., Hilley, T., Jordan, I.K., Andris, C.M., and Weitz, J.S. ‘Interactive COVID-19 Event Risk Assessment Planning Tool’, URL http://covid19risk.biosci.gatech.edu/

Weitz, J.S., Harris, M., Chande, A.T., Gussler, J.W., Rishishwar, L. and Jordan, I.K. (2020) Online COVID-19 Dashboard Calculates How Risky Reopenings and Gatherings Can Be. Sci Am. https://blogs.scientificamerican.com/observations/online-covid-19-dashboard-calculates-how-risky-reopenings-and-gatherings-can-be/

Code and model:

https://github.com/jsweitz/covid-19-event-risk-planner

https://github.com/appliedbinf/covid19-event-risk-planner

https://figshare.com/articles/COVID-19_Event_Risk_Assessment_Planner/11965533

March 10, Tweet Thread:

https://twitter.com/joshuasweitz/status/1237556232304508928?s=20

Op-Ed (w/Richard Lenski, Lauren A. Meyers, and Jonathan Dushoff):

https://www.ajc.com/blog/get-schooled/scientists-the-math-show-how-large-events-like-march-madness-could-spread-coronavirus/g1pVdzQgJS5aoPnadBqyXO/

  • Nov. 11, 2020 https://www.latimes.com/science/story/2020-11-11/covid-19-risk-thanksgiving-2020-what-to-expect

WIRED Science

  • Nov. 11, 2020 https://www.wired.com/story/thinking-about-a-holiday-gathering-look-at-this-map/

Boston Globe

  • Nov. 11, 2020 https://www.bostonglobe.com/2020/11/10/nation/youre-not-doomed-zoom-how-have-safer-thanksgiving-during-pandemic/
  • Nov. 09, 2020 https://www.webmd.com/lung/news/20201109/whats-my-risk-of-covid
  • Aug. 13, 2020 https://www.healthline.com/health-news/covid19-tracker-assesses-risk-public-gatherings

Bloomberg - Map Lab

  • Aug. 12, 2020 https://www.bloomberg.com/news/newsletters/2020-08-12/maplab-an-actionable-map-of-covid-risk

San Francisco Chronicle

  • Aug. 07, 2020 https://www.sfchronicle.com/bayarea/article/If-25-people-gather-in-SF-odds-are-34-that-at-15458554.php
  • Aug. 06, 2020 https://gvwire.com/2020/08/06/if-100-people-go-to-a-party-in-fresno-county-its-99-someone-has-covid-19/

ABC7 San Francisco

  • Aug. 06, 2020 https://abc7news.com/covid-19-map-interactive-coronavirus-risk-calculator-cdc-guidelines-for/6355615/

NBC 7 San Diego

  • Aug. 06, 2020 https://www.nbcsandiego.com/news/local/new-tool-shows-risk-of-encountering-someone-with-covid-19/2380403/
  • Aug. 05, 2020 https://www.sfgate.com/news/article/Interactive-map-tells-you-how-likely-it-is-15461395.php
  • Aug. 03, 2020 https://www.boston.com/news/coronavirus/2020/08/03/interactive-tool-to-determine-your-covid-19-danger-level-is-not-for-the-faint-of-heart

The Salt Lake Tribune

  • Aug. 03, 2020 https://www.sltrib.com/news/2020/08/02/answers-utahns-frequently/
  • July 30, 2020 https://www.masslive.com/coronavirus/2020/07/what-are-the-chances-someone-at-your-gathering-in-massachusetts-will-have-coronavirus-county-by-county-map-offers-estimates.html
  • July 23, 2020 https://www.politico.com/newsletters/politico-nightly-coronavirus-special-edition/2020/07/23/the-pandemics-shapeshifting-economy-489872
  • July 21, 2020 https://www.wired.com/story/to-navigate-risk-in-a-pandemic-you-need-a-color-coded-chart/
  • July 14, 2020 https://www.kare11.com/article/news/health/coronavirus/georgia-tech-covid-19-risk-calculator-map-shows-county-by-county-data/89-ac0c2efb-727b-4e4b-bf04-958e572e92b4
  • July 13, 2020 https://kutv.com/news/local/how-likely-are-you-to-get-covid-19-this-new-map-will-tell-you

The Daily Memphian

  • July 09, 2020 https://dailymemphian.com/article/15338/shelby-county-coronavirus-cases-grow-by-306

Georgia Tech College of Sciences

  • July 7, 2020 https://cos.gatech.edu/news/georgia-tech-researchers-release-county-level-calculator-estimate-risk-covid-19-exposure-us
  • June 16, 2020 https://www.newsweek.com/trump-rally-tulsa-oklahoma-coronavirus-cases-covid-19-1511021

FOX5 Atlanta:

  • May 08, 2020 https://www.fox5atlanta.com/news/here-are-three-new-ways-to-tell-whether-were-beating-covid-19
  • May 4, 2020 https://slate.com/technology/2020/05/coronavirus-reopening-math.html
  • April 19, 2020 https://www.wired.com/story/whats-social-distancing-flattening-curve-covid-19-questions/
  • March 26, 2020 https://www.nature.com/articles/d41591-020-00005-1

National Geographic:

  • March 19, 2020 https://www.nationalgeographic.com/science/2020/03/graphic-see-why-small-groups-are-safer-during-covid-19-coronavirus-pandemic/

Atlanta Magazine:

  • March 13, 2020 https://www.atlantamagazine.com/news-culture-articles/coronavirus-cancels-atlanta/

NPR - Cross Cut:

  • March 12, 2020 https://crosscut.com/2020/03/slow-coronavirus-events-250-plus-people-are-banned-why-250

United States of America

Released: July 7, 2020.

COVID19 data

Real-time COVID19 data comes from the COVID Tracking Project: https://covidtracking.com/api/

Real-time county level COVID19 data comes from the NYTimes COVID19 data project: https://github.com/nytimes/covid-19-data

Population data

US 2019 population estimate data comes from the US Census: https://www.census.gov/data/tables/time-series/demo/popest/2010s-state-total.html

Vaccination data

County-level vaccination coverage is collated by the Bansal lab at https://www.vaccinetracking.us/ . We linearly interpolate between weekly values and merge county data to account for geographic exceptions in Alaska (Hoonah-Angoon plus Yakutat; Lake Peninsula plus Bristol Bay) and in New York City (New York, Kings, Queens, Bronx plus Richmond) following the New York Times dataset. Joint vaccination levels are computed for the City of Joplin, Jasper County, and Newton County, Missouri and for Kansas City, Jackson County, Clay County, Cass County, and Platte County, Missouri.

Merritt, Alexes; Tiu, Andrew; Bansal, Shweta, 2021, “Integrated US COVID-19 Vaccination Data”, https://doi.org/10.7910/DVN/BFRIKI , Harvard Dataverse, V1.

Andrew Tiu, Zachary Susswein, Alexes Merritt, Shweta Bansal. Characterizing the spatiotemporal heterogeneity of the COVID-19 vaccination landscape. medRxiv. https://doi.org/10.1101/2021.10.04.21263345

United Kingdom

Released: October 5, 2020.

The Coronavirus (COVID-19) in the UK API from Public Health England and NHSX: https://coronavirus.data.gov.uk

Italian Department of Civil Protection COVID-19 Data: https://github.com/pcm-dpc/COVID-19/

Italian maps: http://datainterfaces.org/projects/covid19eventi/

Switzerland and Liechtenstein

(from October 5, 2020 - December 19, 2021) Specialist Unit for Open Government Data Canton of Zurich COVID-19 data: https://github.com/openZH/covid_19

(from December 19, 2021) Federal Office of Public Health FOPH: https://www.covid19.admin.ch/en/overview

Released: October 19, 2020

Federal Ministry for Social Affairs, Health, Care and Consumer Protection (BMSGPK) data on COVID-19 for Austria: https://www.data.gv.at/covid-19/

Santé publique France COVID-19 data for France: https://www.data.gouv.fr/fr/datasets/donnees-relatives-aux-resultats-des-tests-virologiques-covid-19/ (note that this resource also contains data for overseas departments of France, and for Saint Barthélemy, Saint Martin, and Saint Pierre and Miquelon).

Czech Republic

Released: October 27, 2020.

COVID-19 data sourced from National Health Information System, Regional Hygiene Stations, Ministry of Health of the Czech Republic and prepared by the Institute of Health Information and Statistics of the Czech Republic and the Institute of Biostatistics and Analyses, Faculty of Medicine, Masaryk University: https://onemocneni-aktualne.mzcr.cz/covid-19

Komenda M., Karolyi M., Bulhart V., Žofka J., Brauner T., Hak J., Jarkovský J., Mužík J., Blaha M., Kubát J., Klimeš D., Langhammer P., Daňková Š., Májek O., Bartůňková M., Dušek L. COVID 19: Overview of the current situation in the Czech Republic. Disease currently [online]. Prague: Ministry of Health of the Czech Republic, 2020. Available from: https://onemocneni-aktualne.mzcr.cz/covid-19 . Development: joint workplace of IHIS CR and IBA LF MU. ISSN 2694-9423.

Data is provided by the Health Service Executive (HSE), Health Protection Surveillance Centre (HPSC), The Central Statistics Office (CSO) and Gov.ie and accessed via Ireland’s COVID-19 Data Hub: https://covid19ireland-geohive.hub.arcgis.com/

COVID-19 data from España Ministerio de Sanidad and Instituto de Salud Carlos III: https://cnecovid.isciii.es/covid19/

Released: November 22, 2020

COVID-19 data from the Statens Serum Institut (SSI):

  • https://covid19.ssi.dk/overvagningsdata
  • https://experience.arcgis.com/experience/aa41b29149f24e20a4007a0c4e13db1d

Swedish COVID-19 National Statistics from Folkhälsomyndigheten: https://experience.arcgis.com/experience/09f821667ce64bf7be6f9f87457ed9aa/page/page_0/

Albania, Andorra, Belarus, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Cyprus, Estonia, Finland, Germany, Greece, Hungary, Iceland, Israel, Latvia, Lithuania, Luxembourg, Malta, Moldova, Monaco, Montenegro, Netherlands, Norway, Poland, Portugal, North Macedonia, Romania, San Marino, Serbia, Slovakia, Slovenia, Turkey, Ukraine; and Gibraltar, Guernsey, Jersey, Isle of Man, Faroe Islands, Greenland

Released: December 19, 2021.

We use data aggregated from local health resources in the WHO European Region COVID19 Subnational Explorer: https://experience.arcgis.com/experience/3a056fc8839d47969ef59949e9984a71

The COVID-19 Event Risk Assessment Planning Tool is a collaborative project led by Prof. Joshua Weitz and Prof. Clio Andris at the Georgia Institute of Technology, along with researchers at the Applied Bioinformatics Laboratory, Duke University, and Stanford University, and powered by RStudio. A description of the method and analyses is available at Nature Human Behaviour.

Ongoing support for the project is via the Centers for Disease Control and Prevention (75D30121P10600), Charities in Aid Foundation, The Marier Cunningham Foundation, and the Rockefeller Foundation Pandemic Prevention Institute.

How to use WHO risk assessment and mitigation checklist for mass gatherings in the context of COVID-19

This document provides a COVID-19 risk assessment and mitigation checklist for use by host countries and mass gathering organizers. It also includes an operational tool which offers guidance for organizers holding meetings during the COVID-19 outbreak and which should be accompanied by the WHO COVID-19 Generic Risk Assessment Excel file. It should be read in conjunction with WHO’s Key planning recommendations for Mass Gatherings in the context of the current COVID-19 outbreak (Interim guidance).

It is WHO’s view that all countries with community transmission should seriously consider postponing or reducing mass gatherings that bring people together and have the potential to amplify disease and support the recommended best practice of physical distancing. Any decision will be supported through the use of WHO tools, in particular the Risk Assessment for Mass Gatherings during COVID-19.

If movement restrictions and further national measures have been established in the country, the WHO RA does not apply. However, when the process of re-opening/conducting mass gatherings is being considered post movement restrictions, it will be key to ensure any decisions are based on a risk assessment, such as the WHO Mass gatherings COVID-19 risk assessment.

-  Access the tool

-  Access the decision tree

-  Access the publication for sports federations / sports event organizers  (uploaded 15 April 2020)

-  WHO Mass Gathering COVID-19 Risk Assessment Tool – Generic Events  (10 July 2020)

-  WHO Mass Gathering COVID-19 Risk Assessment Tool – Sports Events  (10 July 2020)

-  WHO Mass Gathering COVID-19 Risk Assessment Tool – Religious Events  (10 July 2020)

April 19, 2022

How to Judge COVID Risks and When to Wear a Mask

Scientific American asks experts in medicine, risk assessment and other fields how to balance the risks of COVID with the benefits of visiting public indoor spaces

By Devabhaktuni Srikrishna

Patrons at a bar, some with face masks

Jeff J Mitchell/Getty Images

As COVID cases declined across the U.S. in recent months and mask mandates were lifted, more people returned to restaurants, concert halls and offices maskless. But the novel coronavirus’s Omicron subvariant BA.2 (which caused another wave in Europe and China) and related variants threaten to reverse that progress here. Earlier this month dozens of attendees (including high-ranking government officials) tested positive for COVID after attending a dinner in Washington, D.C. The safest option, of course, is to continue avoiding crowded indoor activities. But there remains a lot of interest in safely enjoying bars, cafes and other higher-risk venues that offer the benefits of social interaction.

Scientific American asked experts in epidemiology, medicine, risk assessment and aerosol transmission for advice on how to decide which risks we are willing to take. These decisions are based on assessments of personal risk, community risk and exposure risk, and on the steps one can take to mitigate them. Personal risk refers to the danger of contracting COVID faced by an individual and the members of their household. Community risk is the current likelihood of encountering COVID among members of one’s community. And exposure risk accounts for the increased chances of catching COVID at a particular venue based on airflow characteristics of the space itself and other people’s behavior.

Here is what experts say about managing these risks while maintaining some of the benefits of public life.

How should a person factor personal risk for severe COVID into their decisions?

The number-one predictor of having a severe case of the disease is age, followed by the presence of comorbidities and immunocompromised status, according to Katelyn Jetelina, an epidemiologist who studies COVID risks at the University of Texas Health Science Center at Houston. Using data from the U.S. Centers for Disease Control and Prevention, she estimates that even vaccine-boosted people ages 50 to 64 are more than 10 times more likely to die from a severe breakthrough case than 18- to 49-year-olds with the same vaccination status. Donald Milton, a physician and clinical researcher who studies respiratory viruses at the University of Maryland, highlights recent research showing that, in households with a person who was infected with the Omicron variant (B.1.1.529) of the COVID-causing virus SARS-CoV-2, 43 to 64 percent of people became infected as well, depending on whether the initially infected person was boosted, fully vaccinated or unvaccinated. Jetelina cautions that we also need to account for the personal risks of the people with whom we live in our own risk assessments.

In general, people should discuss personal COVID risk with their doctor; it depends, in part, on which medications they take. Ethan Craig, a rheumatologist at the University of Pennsylvania, cares for patients who are immunosuppressed because of disease or medication and studies COVID risks in that population. One such immunosuppressive drug, rituximab, “knocks out your ability to make antibodies against new viral exposures and impairs your ability to make a response to a vaccine,” he says. Craig adds that such patients usually take precautions of their own accord, such as wearing high-filtration N95 masks, and “if anything, I end up having to talk people down sometimes and be like ‘Look, it’s okay to go to the grocery store.’” For some people, however, even this amount of exposure could be considered an unacceptable risk.

How does the risk of dying from COVID compare to the risk of dying from other causes linked to common activities?

Jetelina estimates that, for people between the ages of 18 and 49 who are boosted, the risk of dying from COVID is roughly equal to the risk of dying when someone drives about 10,000 miles. COVID risk goes up substantially with age and with being unboosted or unvaccinated. Thanks to vaccines, infection-induced immunity, therapeutics, better care and other factors, the relative risk of dying from COVID if you catch it is now, broadly speaking, comparable to that of seasonal flu, Jetelina says. Importantly, though, because you are more likely to catch COVID than flu, the absolute risk remains much greater. Jetelina recommends COViD-Taser’s Relative Risk Tool, a resource funded by the National Science Foundation, that she helped to develop. It compares one’s risk of death from the disease to such risk posed by other activities, including driving. Although it is a research tool, Jetelina says she can “really trust the science and mathematics behind it.”

But Baruch Fischhoff, a professor of engineering and public policy at Carnegie Mellon University and an authority on how to communicate health risks, cautions against using risk-risk comparisons to make choices without fully considering benefits or unquantified risks. Employers may also misuse such comparisons to compel employees to accept certain risks on the job, which is not exactly a choice. Currently, risk calculators provide estimates based on retrospective data and may be unable to reliably weigh long-term complications of COVID.

How should one assess community risk?

There is no perfect way to measure community risk because it would take repeated random testing, so experts use other estimates: daily cases per 100,000 residents, test positivity rates and growth rates. Jetelina recommends using the New York Times’ tracker to look up community transmission for your county. She considers community risk high when there are more than 50 weekly cases per 100,000 residents. When the risk is lower than that, Jetelina, a healthy, young boosted person, feels comfortable taking off her mask indoors. “I will say it’s taken a lot of time for me to be comfortable with that,” she says. “Once transmission rates of those indicators start increasing a bit, I’m putting my mask back on.” Others suggest a slightly higher risk threshold of 10 daily (or 70 weekly) cases per 100,000 residents.*
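To make the thresholds above concrete, here is a small Python sketch that converts a county's weekly case count into a rate per 100,000 residents and compares it with either cutoff. The county population and case count are made up for illustration, and the function names are hypothetical.

```python
def weekly_cases_per_100k(weekly_cases, population):
    """Weekly reported cases per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return weekly_cases * 100_000 / population


def is_high_community_risk(weekly_cases, population, weekly_threshold=50):
    """True when the weekly rate exceeds the chosen threshold.

    weekly_threshold=50 is the cutoff quoted in the text; 70 per week
    (10 per day) is the slightly higher alternative.
    """
    return weekly_cases_per_100k(weekly_cases, population) > weekly_threshold


# Made-up county: 250,000 residents, 150 reported cases this week.
rate = weekly_cases_per_100k(150, 250_000)
print(f"{rate:.0f} weekly cases per 100,000")
print(is_high_community_risk(150, 250_000))                       # 50-per-week cutoff
print(is_high_community_risk(150, 250_000, weekly_threshold=70))  # 70-per-week cutoff
```

A county at 60 weekly cases per 100,000 counts as high risk under the first cutoff but not the second, which shows how much the choice of threshold matters near the boundary.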

Daily city or county case counts are often an undercount because not everyone is getting tested and home test results are not always reported. As a work-around, health authorities use the “test positivity rate,” or “percent positive”: the percentage of COVID tests reported to public health authorities that were positive. If that number exceeds 5 percent, it is widely considered high risk for community transmission (provided the amount of testing in that area is adequate). But the community sample used to measure test positivity likely includes many people who seek out testing because they are currently experiencing COVID symptoms. So test positivity is typically higher than the infection rates among the people you might encounter in a cafe or grocery store, most of whom do not have any symptoms but could still be infectious.

Still, Robert M. Wachter, a professor and chair of the department of medicine at the University of California, San Francisco, says there is no test positivity threshold that separates “safe” from “not safe” because it also depends on other factors, such as whether the benefit outweighs the risk to you, personally, the number of people you will be exposed to, and the closeness and duration of exposure.

Because of these large uncertainties in test coverage, Gerardo Chowell, a professor of mathematical epidemiology at Georgia State University, prefers to look at the general trend in daily COVID cases, hospitalizations and deaths, or percent positive. “When the trend is going up, you’re seeing the transmission chains expand,” Chowell says. “That means that the reproduction number [the expected number of secondary infections from each infected person] must be greater than one. If it is increasing, that’s probably the time when [one has the] highest risk of acquiring COVID in a social setting without a mask,” he says. Wachter points out that, where available, wastewater surveillance may also give an early indication of COVID trends.

What is known about exposure risk in different settings, such as bars or movie theaters?

Linsey Marr, a professor of civil and environmental engineering at Virginia Tech and one of the world’s leading experts on airborne transmission of viruses, says COVID risk in indoor spaces exists on a continuum. It is believed that reducing the amount of virus inhaled (i.e., the inhalation dose) makes infections less likely or illness less likely to be severe. Marr says one of the riskiest settings is an aerobic exercise studio: if somebody is infected, they are going to be exhaling more virus, and everyone else will be inhaling at a faster rate, too. Breathing heavily produces up to 10 times more aerosol particles that carry viruses than breathing normally, according to Richard Corsi, an expert on indoor air quality and dean of the College of Engineering at the University of California, Davis.

Marr says that talking in bars expels a similar number of respiratory particles as coughing, “so it’s like everyone’s in there coughing together.” Craig uses smoking as an analogy for aerosols exhaled during breathing and talking. In other words, “if a person was smoking in this place, would I be able to smell it?” he says. In movie theaters, there is risk of exposure from those seated immediately around you, but because of limited talking and, typically, a high ceiling, there is a lot more dilution of the air. So such a theater may be less risky than other crowded indoor venues. By that reasoning, museums, big-box retailers and grocery stores with high ceilings tend to be relatively safer as well.

Places with rapid rates of ventilation and filtration, such as some subways, are also much lower risk. The Bay Area Rapid Transit (BART) system in the San Francisco Bay Area, for example, filters the air more than 50 times an hour with “virus-trapping MERV-14 air filters” inside each car. An Italian study of schools found that classrooms with ventilation systems that exchanged air six times per hour reduced infections by more than 80 percent, but many classrooms in the U.S. fail to meet this standard. Corsi characterized current public health recommendations of four to six air exchanges per hour as “a little bit anemic … we can do better.” He recommends owners or managers of crowded indoor spaces, such as classrooms, offices and bars, aim to filter or ventilate with fresh air at rates approaching 12 air exchanges per hour to reduce risks down to the level of an airborne isolation room in a hospital. Not all venues have the resources to do this, but the benefits increase with greater filtration rates, so the closer to this ideal, the better. In places with inadequate ventilation, consider bringing a portable high-efficiency particulate air (HEPA) purifier, or building your own using box fans and high-quality HVAC (heating, ventilating and air-conditioning) filters, to run nearby.
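To see why higher air-exchange rates help, one can use a textbook idealization that is not taken from the article: in a perfectly mixed room with no ongoing source, the concentration of airborne particles decays exponentially at a rate equal to the air changes per hour (ACH). The sketch below rests on that assumption; the function names are hypothetical, and real rooms mix imperfectly and usually contain a continuing source.

```python
import math


def fraction_remaining(ach: float, hours: float) -> float:
    """Fraction of airborne particles left after `hours`, assuming a
    well-mixed room, no ongoing source, and `ach` air changes per hour."""
    if ach < 0 or hours < 0:
        raise ValueError("ach and hours must be non-negative")
    return math.exp(-ach * hours)


def minutes_to_remove(ach: float, removal_fraction: float = 0.99) -> float:
    """Minutes needed to clear `removal_fraction` of particles under the
    same idealized well-mixed assumption."""
    if ach <= 0 or not 0 < removal_fraction < 1:
        raise ValueError("ach must be > 0 and removal_fraction in (0, 1)")
    return 60.0 * math.log(1.0 / (1.0 - removal_fraction)) / ach


# Compare the 6 and 12 air exchanges per hour mentioned in the text.
for ach in (6, 12):
    print(f"{ach} ACH: about {minutes_to_remove(ach):.0f} min to clear 99%")
```

Under this model, doubling the exchange rate halves the clearance time, which is consistent with the point that benefits keep increasing as filtration rates rise.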

Although the virus is thought to be transmitted primarily through the air, there have been a few documented cases of surface transmission, so it remains a good idea to wash your hands frequently, Marr says.

How can one further reduce the risk of getting COVID from everyday activities?

Getting vaccinated and boosted protects against death, hospitalization and, to a lesser extent, catching and spreading the virus. To avoid infection, Wachter recommends wearing an N95 mask. He has observed that the risk of U.C.S.F. health care workers, himself included, getting infected from their patients while wearing a well-fitting N95 is extraordinarily low. These respirators get close to filtering all of the virus, but they do not filter 100 percent. And if an N95 does not form an airtight seal with your face, it may allow unfiltered air into your lungs. So it is essential to try out and select N95 models that fit and seal to your face without gaps.

What is the risk of taking your mask off in a restaurant or bar to take a sip or bite?

In the 1990s medical researcher Stanley Wiener, then at the University of Illinois College of Medicine, proposed that a person could use a respirator to survive aerosolized biological attacks, taking it off briefly to consume food and drink. During the pandemic, many places have allowed masks (or N95 respirators) to be removed while actively eating and drinking. Removing an N95 momentarily for a bite or sip carries “some risk, but I think it’s pretty tiny if you’re exposed for three seconds,” Corsi says, unless an infected person is “right in your face ... and shedding a lot [of virus].” Provided community risk is low or trending downward, Chowell, too, feels comfortable briefly removing his respirator to eat or drink at a party.

What do we know so far about the risk of “long COVID”?

Ranu Dhillon, a physician at Brigham and Women’s Hospital in Boston, who advises governments on infectious disease outbreaks, says he is seeing some patients with “a constellation of different types of symptoms after acute COVID infection,” including young, boosted and relatively healthy people. Wachter cautions that some fraction of vaccinated individuals who get infected (which one study estimates to be around 5 percent and possibly higher) may continue to feel short of breath or fatigued or think less clearly than before. COVID may increase the risks of heart attack, stroke, brain abnormalities or the onset of diabetes. While there have been preliminary studies of the rates of long COVID, including risks of developing cardiovascular complications, Wachter says many of these involved unvaccinated people or infections with variants prior to Omicron. Provisionally, he likens these risks to 20 years of untreated high blood pressure or smoking and points out that one cannot know the risk of long COVID among vaccinated and boosted individuals until long-term studies have concluded, which will take years.

How can we balance these risks with the benefits of socializing and being with others?

According to Wachter, one of the most important factors in overall COVID risk is whether “the person next to me has it.” He acknowledges that if someone is both vaccinated and boosted, it is not irrational for that person to decide that the mental energy and angst of calculating risks and taking precautions is high enough, and the risks of getting sick or dying from COVID are low enough, that they will go back to “living like it’s 2019,” as people in many parts of the country already have. He still worries about the risk of long COVID, though. Milton says that many people “don’t want to wear masks forever” and that we should work to make our built environments better at stopping aerosol transmission. He says people also have to decide whether to wear a high-quality mask when they are around those at higher risk, such as the elderly or immunocompromised, or around other people in general, such as at a party. When community transmission is low, Chowell says he may feel comfortable removing his N95 at parties in some situations, such as to have a drink. “Then you find a way to still interact with people, and they smile back once in a while,” he adds.

* Editor’s Note (4/19/22): This paragraph was edited after posting to correct the description of the threshold of COVID transmission that Katelyn Jetelina considers a high community risk.

Calculating COVID-19 risk

A new COVID-19 risk calculator developed at Harvard T.H. Chan School of Public Health can help people understand the ways that masking, ventilation, filtration, and other factors can mitigate the spread of COVID-19 in indoor environments.

The tool, developed by Joseph Allen, associate professor of exposure assessment science, and his team at the Healthy Buildings program, is based on a model of the COVID-19 outbreak on the Diamond Princess cruise ship in early 2020 as well as other superspreader events.

Allen and Parham Azimi, a research fellow in the exposure, epidemiology and risk program at Harvard Chan School, described how to use the calculator in an April 6, 2021, opinion piece in the Washington Post.

Users can plug in information such as the size of the room they’ll be in, how long they’ll be in the room, whether they’ll be seated or active, whether people are wearing masks and staying six feet apart from each other, and whether the windows are open. Once all the information is filled in, the tool produces a risk estimate. The tool also enables users to switch parameters in order to see how beefing up safety measures can reduce risk.

Read the Washington Post article: Opinion: So you’re unvaccinated and want to see a friend. Here’s how to calculate your risk.

COVID-19 Pandemic: Public Health Risk Assessment and Risk Mitigation Strategies

Dae-Young Kim

1 Department of Biological and Environmental Science, Dongguk University-Seoul, 32 Dongguk-ro, Ilsandong-gu, Goyang-si 10326, Gyeonggi-do, Korea; sbpkim@dongguk.edu (D.-Y.K.); shindesurendra9@gmail.com (S.K.S.)

Surendra Krushna Shinde

Saifullah Lone

2 Interdisciplinary Division for Renewable Energy and Advanced Materials (iDREAM), National Institute of Technology (NIT), Srinagar 190006, India

Ramasubba Reddy Palem

3 Department of Medical Biotechnology, Dongguk University-Seoul, 32 Dongguk-ro, Ilsandong-gu, Goyang-si 10326, Gyeonggi-do, Korea; palemsubbareddy@gmail.com

Gajanan Sampatrao Ghodake

Associated Data

Not applicable.

Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is a newly emerged respiratory virus that causes the pandemic coronavirus disease (COVID-19). This pandemic has resulted in an unprecedented global health crisis and a devastating impact on several sectors of human lives and economies. Fortunately, the average case fatality ratio for SARS-CoV-2 is below 2%, much lower than that estimated for MERS (34%) and SARS (11%). However, COVID-19 has a much higher transmissibility rate, as evident from the constant increase in the count of infections worldwide. This article explores how COVID-19 was able to cause a global pandemic crisis. The current outbreak scenario and causes of rapid global spread are examined using recent developments in the literature, epidemiological features relevant to public health awareness, and a critical perspective on risk assessment and mitigation strategies. Effective pandemic risk mitigation measures have been established and amended against COVID-19, but there is still much scope for improving execution and coordination among authorities in terms of organizational leadership’s commitment and a diverse range of safety measures, including administrative control measures, engineering control measures, and personal protective equipment (PPE). The significance of containment interventions against the COVID-19 pandemic is now well established; however, these interventions need to be executed effectively across the globe, and the performance of risk mitigation practices needs to improve in order to suppress future pandemic crises.

1. Introduction

Coronaviruses were generally not considered highly infectious to humans before 2002; however, the occurrence of severe acute respiratory syndrome (SARS; 2002) [ 1 , 2 ], Middle East respiratory syndrome (MERS; 2015) [ 3 , 4 ], and the COVID-19 pandemic attracted the serious attention of the scientific community and public health authorities. SARS-CoV-2 is the ninth known coronavirus to cause infections in humans; it causes severe respiratory illness and breathing discomfort. Its symptoms are similar to those of pneumonia and seasonal influenza, as well as some other coronaviruses [ 5 , 6 ]. Notably, the worldwide spread of this virus was observed within a few months of its first appearance in December 2019 in Wuhan, China [ 7 ]. The SARS-CoV-2 outbreak has created a prolonged global public health and economic crisis and created confusion over the need for travel bans and border closures, the closure of educational institutions and businesses, and the implementation of preventive measures [ 8 ].

The SARS-CoV-2 virus quickly disseminated worldwide and caused unprecedented public health emergencies. The World Health Organization (WHO) announced COVID-19 as a global pandemic on 11 March 2020. The symptoms of this virus vary widely among individuals, with some patients being completely asymptomatic and others developing severe symptoms together with fever, cough, and fatigue [ 9 ]. Many experts agree that the total number of confirmed cases worldwide so far has mainly been underestimated because of the occurrence of asymptomatic patients; further, patients with mild symptoms recover naturally and remain undetected or unaccounted for [ 10 ]. The SARS-CoV-2 disease risk is also associated with epidemiological factors, host status (immunity, heredity, age, and overall health), exposure to contaminated surfaces, host tropism, host cell receptors, pathogenesis, etc. [ 11 , 12 ]. Such factors make it even more challenging to quantify the global estimate of existing infections and recoveries. Our previous report described the significance of the biological characteristics of SARS-CoV-2 and its biomarkers to develop diagnostics, point-of-care (POC) testing and surveillance measures [ 13 ].

This article covers the significance of risk assessment, which accounts for multiple epidemiological factors associated with the SARS-CoV-2 pandemic (e.g., host range, viral dose, surface survival rate, and some others). COVID-19 is predominantly an airborne disease [ 14 ]; the risk of transmission depends greatly on the distance from an infectious source [ 15 ], ventilation [ 16 ], and the quality of the face mask. In addition, the concentration of the virus in respiratory droplets defines the viral load, and asymptomatic patients pose a hidden risk of infection [ 17 ]. Risk assessment perspectives are thus essential to ensure the safety of viral research laboratories, improve public awareness, and ensure the safe disposal of biohazard materials and the overall safety of healthcare workers and officials [ 18 , 19 ]. The risk assessment and mitigation perspectives are intended to inform the design of safety strategies, potential preventive measures, and policymaking decisions.

During the COVID-19 pandemic, personal- and community-level mitigation practices have played a pivotal role in slowing down viral transmission rates and ensuring stability for public healthcare systems [ 20 ]. Herein, we analyze prospective data accessible in the literature and use it to describe public health risk factors. We derive key strategies to convey effective mitigation measures to reduce stress on the public healthcare system. Thus, this report has several strengths in terms of public health risk assessment and implementation of mitigation strategies. Lastly, our report emphasizes the efficacy of administrative and engineering controls and enforcing both public interventions, such as social distancing and vaccination and personal interventions, such as face masks. These strategies would reduce the risk of further spread and would be relevant to almost all settings and unsettled contexts, including countries that need additional risk mitigation strategies, such as mass level vaccination, which seems unachievable in the near future.

2. Risk Assessment Perspective

2.1. Risk Group

The WHO and the National Institutes of Health (NIH) have established frameworks to organize infectious organisms and their toxins into four risk groups based on the risk in humans, adverse impacts on public health, and status of preventative measures and treatment options. In turn, these risk groups are further classified based on human mortality rates, public health risks, and pharmaceutical interventions, preventive measures, and administrative controls [ 21 , 22 , 23 , 24 ]. The coronaviruses behind the SARS (2002–2004) and MERS (2015) outbreaks were classified as Risk Group 3 pathogens by the NIH [ 25 ]. Likewise, COVID-19 has posed public health and public safety risks, and the European Commission classifies this virus as a Risk Group 3 pathogen [ 25 ]. The SARS-CoV-2 risk has been partly reduced by the rapid advancement of diagnostic kits and vaccination; however, the delay in drug development is still the basis for the virus’ status as a high-risk-group pathogen. So far, the COVID-19 pandemic has posed an ominous threat to public health as a result of rapid worldwide spread and impact on human health, in addition to massive economic and social disruption [ 26 ].

An integrated risk assessment and inclusive contact tracing must be executed for international and domestic travelers, using air passenger itinerary data, surveillance data, global positioning system (GPS) data, and individual case reports [ 27 ]. According to a public report estimate, approximately 60,000 air passengers traveled from Wuhan to more than 382 cities worldwide in the early days of emergence, of whom 850 were carriers of the SARS-CoV-2 virus, before lockdown measures were implemented in Wuhan, China. Most of those travel destinations were in Asian countries; however, some individuals traveled to central Europe, Australia, and the United States, thus supporting strong correlations between the predicted travel risks and the reported cases. Had international travel restrictions been implemented in a timely manner, particularly on residents of Wuhan, China, in the early days of COVID-19 emergence, the virus’ spread elsewhere in the world might have been avoided and the risk of a pandemic outbreak perhaps reduced [ 28 ].

2.2. Host Range

The host range of any infection can be used to determine whether it infects a particular host or is likely to be diverse. Therefore, it is essential to understand how an animal virus mutates or evolves to cause human–human infections, resulting in a global pandemic. The answers to such questions lie in the biological characteristics of betacoronaviruses, viral genome mutations, and the evolution of new viral progenies that enabled COVID-19 virus infections in animal species, including humans and different host cell types [ 29 ].

Coronaviruses and some other RNA viruses are known for their rapid mutation rate [ 30 , 31 ]. This characteristic allows betacoronaviruses to diversify rapidly, occasionally producing either more virulent progenies or nonviable variants [ 32 , 33 ]. Evolutionary epidemiology suggests that viral adaptation may occur in a new host range and is driven mainly by stochastic mutations [ 34 ], which may or may not prove advantageous in a given host; this randomness limits our capacity to anticipate evolutionary changes and the risk of new variants.

The risk of viral diseases is never-ending; thus, it is challenging us by emerging, re-emerging, or resurging unpredictably. Human beings and animals are confronting an intractable challenge from time to time due to infectious viral diseases and public health emergencies. Human infiltration, ecological concerns, habitat loss, and viral-related factors—including natural selection, mutation, and new variants—are the causal factors in the emergence or re-emergence of novel viral diseases [ 35 ]. The expansion of the viral host range results in a surplus of natural host reservoirs, which often leads to different variants, and the emergence or resurgence of viral diseases continues [ 36 ]. The S protein is a critical target region for the formation of new evolutionary variants. Specifically, this protein recognizes different host species and different host cell target proteins within a specific host and other host cell receptors; this phenomenon is referred to as tropism [ 37 ]. The antigenic variations in S proteins for both the SARS-CoV-2 and SARS viruses have been examined to determine their antigenicity [ 38 ]. Approximately six epitopes (CVADYSVLY, RISNCVADY, RSFIEDLLF, MTSCCSCLK, VLKGVKLHY, and RVDFCGKGY) match with SARS-CoV-2 and SARS S protein [ 39 ]. Variations in the S protein amino acid residues and the distinctive biological features of the SARS-CoV-2 can be a basis for an infection less severe than the MERS infection [ 40 ]. However, unlike MERS, the SARS-CoV-2 virus successfully caused a global pandemic crisis because of its higher transmissibility [ 39 ].

The spread of viruses to different species must lead to new reservoirs in animals and, after the mutations, it must have eventually enabled them to target human hosts. Given the rapid global spread of the COVID-19, it is likely that this virus must have mutated several times, with a high probability of variant recurrence [ 41 , 42 ]; it has been more challenging to develop vaccines and therapeutics [ 43 ]. A recent study reported quantitative data on tropism, cell damage, and replication kinetics in the SARS-CoV-2 virus [ 44 ]. Peridomestic animals or wildlife species must have served as stable transitional reservoirs, thus improving the likelihood of the SARS-CoV-2 being accidentally transmitted to human hosts [ 45 , 46 , 47 ]. Public safety surveillance and epidemiological reports suggested that MERS originated from dromedary camels in Qatar. Domesticated animals act as a stable reservoir for a viral disease, which eventually infects human hosts [ 48 , 49 ].

Moreover, epidemiological investigations suggest that the SARS virus also had an animal origin and is known for rapid transmission, resulting in a sizable pandemic. Furthermore, structural and serological studies have indicated that this virus was initially carried by palm civets ( Paguma larvata ), as confirmed by samples of live animals infected with SARS [ 50 , 51 ]. Therefore, further research is required to identify changes in the biological features in the future SARS-CoV-2 variants and their susceptibility to infect a broad host range and further development in tropism [ 52 , 53 , 54 ].

The current SARS-CoV-2 outbreak is linked to a wet market in Wuhan, China; therefore, it is widely thought that the wild animals traded might have been a source for the zoonotic transmission of COVID-19 [ 55 ]. Such concern also raised the question of whether SARS-CoV-2 spreads from humans to pet animals and, eventually, to wildlife; as a consequence, it will generate new reservoirs besides those that already exist [ 56 , 57 ]. It was found that outbred cats were more susceptible to the SARS-CoV-2 virus, and airborne transmission was reported in cats and ferrets [ 58 ]. This report further illustrated that dogs with low susceptibility did not support viral replication of SARS-CoV-2 and that chickens, ducks, and pigs were also not susceptible. This observation suggests that there is the possibility that a few more potential wild animal reservoirs elsewhere in the world exist [ 58 , 59 ]. The Malayan pangolin population is also more susceptible to various coronaviruses; therefore, it is also considered a potential host reservoir for SARS-CoV-2 [ 60 ]. Strict measures against the trade of non-farmed animals, high standards of hygiene practices, and a regulatory framework for the wet market would help avoid the emergence of viral diseases and perhaps prevent predictable zoonotic transmissions [ 60 , 61 , 62 ].

2.3. Possible Transmission Routes

SARS, MERS, and SARS-CoV-2 are airborne coronaviruses that mainly spread via coughing, sneezing, and talking by an individual carrying the virus. SARS-CoV-2-infected humans expel differently sized aerosolized droplets into the air with great force during coughing. Tiny aerosol droplets (<4 μm) readily travel relatively long distances and, when they reach nearby individuals, result in airborne viral transmission, which poses a high risk, particularly in indoor settings [ 63 ]. On the other hand, large droplets fall in close proximity and contaminate nearby surfaces, creating the possibility of direct contact or surface transmission; further details can be found in a report on transmissibility and transmission routes [ 64 ]. Immunocompetent individuals are also susceptible to higher viral loads [ 65 ] and may become infected simply by touching surfaces contaminated with mucous droplets that carry viable SARS-CoV-2 virus and then touching the nose, eyes, or mouth (i.e., indirect contact) [ 66 ].

Person-to-person aerosol-mediated airborne transmission occurs most frequently in large interacting groups, including family members, friends, neighbors, tourists, shoppers, healthcare/hospital workers, and other settings; proximity favors direct transmission [ 67 ]. Moreover, a recent report suggested that SARS-CoV-2 viral particles in aerosol droplets can remain viable in cold air, which may contribute to rapid airborne transmission, especially in the winter season [ 68 ]. Aerosol droplets from SARS-CoV-2-infected people may pose a severe threat even at considerably long distances and in enclosed spaces, particularly if they lack proper ventilation [ 67 ]. Breathing and loud talking also produce smaller aerosol particles, similar to those reported by Anfinrud et al. [ 69 ].

The transmission and spread risks of the SARS-CoV-2 virus can be reduced by isolation or quarantine measures if infected individuals show symptoms. However, some infected individuals remain infectious while asymptomatic, and those patients continue to pose a risk of silent dissemination [ 70 ]. Some infected individuals are more susceptible to severe coughing and produce more aerosol particles than others, thus acting as super-spreaders. The diameters of aerosol particles fall within the micron range, so the particles are hardly affected by gravity [ 71 ]; such droplets are prone to disperse or travel by airflow [ 72 ]. Since the emergence of the SARS-CoV-2 pandemic, routes of transmission have been a central topic of debate. So far, airborne transmission mediated by inhalable aerosol droplets is considered the primary basis for the manifestation of a global pandemic [ 73 ].

Another possible, though likely rare, airborne transmission route for the SARS-CoV-2 virus is via air containing solid particulate matter (PM), dust particles, and air pollutants, which have been suggested to be involved in coronavirus infection [ 74 ]. Inhalation of virus-loaded airborne dust and PM can carry the virus into the deeper tracheobronchial and alveolar regions, which may, in rare cases, pose a risk of infection [ 75 ]. The risk of long-distance travel with sustained viability is of great concern since airborne dust particles can provide a sufficient surface area to adsorb SARS-CoV-2 viruses. Therefore, the possibility of SARS-CoV-2 adsorption onto dust surfaces, its viability, and the associated transmission risks need further detailed investigation to establish their role in dissemination.

2.4. Fomite-Mediated Transmission

We should not overlook alternative transmission routes, which may have serious ramifications, particularly in indoor settings [ 76 ]. Multiple transmission routes are possible for SARS-CoV-2 infections, including fomites (objects or materials likely to transmit diseases, such as utensils, clothes, and furniture); however, such transmission depends on temperature, humidity, viral load, and some other factors [ 77 ]. The relative risk of SARS-CoV-2 transmission via fomites is negligible owing to low viral load or inactivation of viral particles by environmental factors, temperature, and humidity [ 78 , 79 , 80 ]. Fortunately, fomite-mediated transmission is a rare event, but it is challenging to decouple it from other possible transmission routes, particularly in the case of transmission by asymptomatic patients. SARS-CoV-2 can be transmitted to healthy individuals if they touch contaminated surfaces and then touch the nose, mouth, or eyes. Therefore, proper hand hygiene can be an excellent intervention to avoid the fomite route and reduce the risk of transmission [ 81 ].

Microbial risk assessment can be applied to quantify and understand the relative risk of fomite-mediated transmission and to evaluate the efficiency of preventive actions to lower the risk of COVID-19 [ 80 ]. Fomites have contributed relatively little to the rapid spread of COVID-19; however, guidelines need to be followed to avoid the risk and anxiety of infection [ 64 ]. Good hand hygiene, including washing hands with soap and using 70% alcohol-based hand sanitizers, could reduce the risk of fomite transmission, mainly in the home, healthcare facilities, and community settings [ 82 ]. This report suggests the need to develop quantitative models for identifying high-risk objects and effective sanitization practices to reduce the risk, particularly in high-priority indoor settings (e.g., public buildings, treatment centers, testing facilities). A better understanding of disinfectant efficacy on diverse surfaces and their possible side effects, such as toxicity and negative impact on the environment and human health, will allow us to choose optimal disinfection strategies [ 83 ].

2.5. Surface Survival

Numerous researchers have investigated the survival of the COVID-19 virus on various surfaces, including non-porous and porous objects [ 84 , 85 , 86 ]. These studies suggest lower viability of viruses on porous surfaces. Virus persistence is the ability of a virus to maintain its viability on solid surfaces or in airborne aerosol particles. Since the membrane of enveloped viruses is made up of lipids and proteins, they are known to be more prone to inactivation and desiccation than non-enveloped viruses [ 87 ]. Enveloped viruses lose their viability once the envelope is disrupted. The aerosol transmission route was also a central driver for the spread of SARS during 2002–2003 [ 88 ]. Both SARS-CoV-2 and SARS viruses raise similar concerns about viability; long-term survival in the air or on surfaces is the basis for the spread of the current SARS-CoV-2 pandemic [ 87 ]. Van Doremalen et al. recently reported that both SARS viruses could remain viable for a few hours on concrete surfaces [ 68 ]. Their viability on plastic, steel, cardboard, and copper surfaces is about 15, 13, 8, and 3 h, respectively [ 25 ]. However, impermeable non-porous surfaces support the extended viability of the virus, depending on the temperature and relative humidity. SARS-CoV-2 is inactivated relatively rapidly on porous surfaces compared with non-porous ones; this is attributed to the faster evaporation of aerosol droplets and prompt capillary action by porous structures [ 89 ].

2.6. Wastewater-Based Epidemiology

Recent studies highlighted the significance of wastewater-based epidemiological investigations in assessing SARS-CoV-2 prevalence and performing community surveillance, particularly after the establishment of the pandemic crisis [ 90 ]. Wastewater-based epidemiology has also raised several questions about handling the SARS-CoV-2 pandemic that need to be addressed. Virus RNA can be detected in saliva, urine, and stool samples of COVID-19-infected patients; however, it is unclear whether the fecal transmission route is possible. In particular, trace RNA residues of SARS-CoV-2 were detected in fecal samples for more than 30 days in recovered patients whose respiratory test results were negative, indicating the shedding of RNA residues via urine or fecal matter after the complete recovery of the patient. It would therefore be essential to explore the possibility of SARS-CoV-2 virus survival in wastewater settings and perform community surveillance practices [ 91 ]. Further efforts are needed to examine the survival of the COVID-19 virus in other environmental settings, in addition to the effect of wastewater treatments on the virus’ fate [ 91 ].

In addition, robust protocols are needed to concentrate and quantify enveloped viruses in water samples [ 92 ]. At present, many efforts to detect and quantify enveloped viruses in water samples rely on methods developed for non-enveloped viruses. Trace SARS-CoV-2 RNA has been detected successfully in various wastewater samples using several concentration protocols [ 93 ]. However, wastewater sampling suffers from several limitations, including delays in sampling and testing, viral inactivation during transportation or due to temperature, dilution caused by precipitation, sample variability, and a lack of sophisticated detection systems [ 94 ]. Despite these limitations, wastewater surveillance holds tremendous potential as an inexpensive form of widespread monitoring that can detect hotspots before they turn into outbreaks, inform recovery guidance, and help avoid the emergence or resurgence of SARS-CoV-2 [ 95 ]. Further efforts are required in the field, including policy reforms, ethical practices, sophisticated protocols for measuring virus concentrations in wastewater, and accurate estimation of disease prevalence and community surveillance [ 94 , 96 ].

2.7. Reproduction Number

The transmissibility of a viral disease is an essential factor in estimating the ability of the virus to spread from an infected person to a healthy individual. It is most often assessed through the basic reproduction number, “R naught” or “R0”, which is used to determine the likelihood and severity of an epidemic. R0 describes how a novel or emerging infection could spread in a susceptible population, which makes it a fundamental concept in the epidemiology of infectious diseases and instrumental in understanding any contagious disease with the potential for global spread [ 97 ]. An infection is likely to fade out quickly at R0 values below 1.0. If R0 is about 1.0, the disease will persist in an exposed or low-immunity population. If R0 exceeds 1.0, the infection may cause an epidemic or outbreak, as with SARS-CoV-2. According to the WHO and data collected from an exposed population in Wuhan during the initial spread of SARS-CoV-2, the average R0 was approximately 2.3, with even higher values also reported [ 98 ]. Other independent assessments predicted the R0 for SARS-CoV-2 to range from 1.8 to 3.6, consistent with the WHO estimate [ 99 ].

However, an examination of the early stages of the COVID-19 outbreak in China, which involved modeling travel and epidemiological data, showed a higher R0 value of approximately 5.7 [ 100 ]. This value may be even higher in crowded areas such as dense urban settings; SARS-CoV-2 therefore appears to be far more transmissible than the previously reported SARS and MERS viruses. The R0 values for MERS and SARS were 0.45 [ 101 ] and 3.0 [ 102 ], respectively, whereas the R0 of seasonal flu is about 1.3 in a population with herd immunity. Additional studies are required to estimate R0 accurately and to evaluate the public safety implications of those estimates. As discussed above, higher R0 values may result in the rapid spread of infection through an exposed population, followed by an exponential stage that led to a global pandemic crisis, as shown in Figure 1.

Figure 1. Exponential spread of recently emerged viral diseases on the basis of their R0 values. Higher R0 values indicate higher transmission rates of viral infections among human populations without acquired immunity or vaccination. The influenza virus, for which the human population has developed herd immunity, still causes seasonal flu in different parts of the world. Reprinted from reference [ 25 ].
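
The R0 thresholds and the exponential stage described in this section can be illustrated with a minimal sketch. It assumes a deliberately simplified model that is not taken from the cited studies: infections occur in discrete generations, every case infects exactly R0 others, and the susceptible population never runs out. The function names are hypothetical; the R0 values are the ones quoted in the text.

```python
def classify_r0(r0: float) -> str:
    """Qualitative outcome for a given R0, following the thresholds in the text."""
    if r0 < 1.0:
        return "fades out"
    if r0 == 1.0:
        return "persists"
    return "spreads"


def cases_by_generation(r0: float, generations: int, initial_cases: float = 1.0) -> list:
    """Expected new cases in generations 0..generations.

    Assumption (illustrative only): unlimited susceptibles and no interventions,
    so each generation is the previous one multiplied by R0.
    """
    return [initial_cases * r0 ** g for g in range(generations + 1)]


# R0 values quoted in the text; the labels are shorthand, not official names.
R0_VALUES = {
    "MERS": 0.45,
    "Seasonal flu": 1.3,
    "SARS-CoV-2 (WHO, Wuhan data)": 2.3,
    "SARS": 3.0,
    "SARS-CoV-2 (early China model)": 5.7,
}

for name, r0 in R0_VALUES.items():
    tenth = cases_by_generation(r0, 10)[-1]
    print(f"{name}: R0={r0} -> {classify_r0(r0)}; "
          f"about {tenth:,.1f} new cases in generation 10")
```

Under these assumptions, an R0 below 1 shrinks toward zero, while an R0 of 5.7 reaches millions of new cases within ten generations. Real outbreaks grow more slowly as immunity and interventions accumulate.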

Further mutation and the emergence of new variants also influence transmission competence and may lower or raise R0 [ 64 ]. Some infected individuals transmit SARS-CoV-2 before becoming symptomatic, contributing to a higher R0 [ 103 ] and acting as super-spreaders [ 104 ]. Asymptomatic spread largely goes undetected, yet its role in raising R0 and driving rapid spread and the global pandemic is well evident. The timely introduction of social distancing, hygiene measures, and mask-wearing has nevertheless proven effective in reducing R0 and associated mortality rates [ 99 ]. A high rate of transmission through asymptomatic individuals is likely responsible for large clusters of SARS-CoV-2 infections, and it may result in collective immunization of the population [ 104 ]. Worldwide vaccination is another potential means of reducing the average R0. The significance of mass vaccination campaigns in different scenarios has recently been reported for Italy [ 105 , 106 ]; the findings suggest that the speed of vaccination matters more when the R0 value is higher in specific settings.

2.8. Viral Dose

The viral dose is a significant factor in establishing an infection, and a low viral dose may not cause one. The ID50 is defined as the number of viral particles needed to infect 50% of a given population. Some viruses, such as influenza, have low viral doses [ 107 ], whereas those with high viral doses typically cause more severe diseases [ 108 ]. SARS-CoV-2 and the SARS virus exhibited near-identical half-lives in aerosol droplets and on different surfaces, including plastic, copper, cardboard, and stainless steel [ 68 ]. Moreover, viral viability depends on the surface material and on environmental factors [ 68 ]. However, epidemiological investigations indicate that SARS-CoV-2 has distinctive post-infection characteristics, chiefly high viral loads in the upper respiratory tract [ 109 ]. Importantly, these differences enable the hidden transmission of COVID-19 during the asymptomatic phase [ 110 ]. COVID-19 infection is also possible via fomite transmission if the viral load is sufficient to cause disease, since viruses remain viable in aerosol droplets [ 111 ]. Accurately estimating the amount of SARS-CoV-2 expelled during coughing remains a great challenge for the research community.

By contrast, MERS and SARS virus shedding begins with the onset of symptoms, and infectivity persists until the second week of infection [ 112 , 113 ]. MERS and SARS infections were therefore easier to trace and contain than SARS-CoV-2 infections. With SARS-CoV-2, however, viral shedding from the respiratory tract begins 2 to 3 days before symptoms appear; thus, a large percentage of transmission evidently occurs before test results confirm the infection [ 114 ]. Shedding of the COVID-19 virus during the presymptomatic period, and by asymptomatic patients for about 14 days, is likely [ 115 ]. Such cases are a significant contributor to the silent spread and the global pandemic crisis, as they go undetected because of limited testing capacity or delays in issuing reports [ 104 ].

2.9. Case Fatality Ratio (CFR)

The CFR of a viral infection is defined as the ratio of fatalities to the total number of confirmed cases and can be used to assess severity [ 116 ]. The clinical presentation of COVID-19 ranges from asymptomatic infection through mild pneumonia-like or influenza-like symptoms to severe disease associated with lung tissue damage [ 117 ], multiorgan failure, and death [ 118 ]. COVID-19 survivors also face a higher risk of dementia, which was more evident in females (≥60 years old) [ 119 ]. An analysis of Worldometer data from the first week of February 2021 revealed that the CFR for SARS-CoV-2 varies among countries, as shown in Figure 2. High death rates in some countries were likely due to inadequate healthcare systems, a lack of funds to handle the outbreak effectively, and insufficient infrastructure to treat patients with severe symptoms.

Figure 2. CFRs for SARS-CoV-2 in different countries as of 1 February 2021. The data were obtained from Worldometer. The graph illustrates the variations in CFR values depending on the country.

Furthermore, delayed or incomplete testing can result in inflated or otherwise inaccurate CFR values [ 120 ], since the number of people infected with COVID-19 appears to be substantially higher than the number of cases confirmed by testing. The WHO’s estimates of the ratio of deaths to total confirmed cases vary from 1 to 9%, while the world average CFR for COVID-19 is about 3.4%, which is lower than that of SARS (11%) and MERS (34%) (Table 1). Several RNA viruses originated from either a bat reservoir (e.g., coronaviruses, Zika virus, and Ebola virus) or a bird reservoir (e.g., influenza virus), except for HIV, which originated from a primate reservoir [ 121 ] (Table 1).

Table 1. Previous and recent pandemic viral diseases worldwide and their fatality ratios.

| Year | Contagion | Disease | Worldwide Cases | Worldwide Deaths | Fatality Ratio | References |
|---|---|---|---|---|---|---|
| 1918 | Influenza A (H1N1) | Influenza | 500 million | >17.4 million | >2.54% | [ ] |
| 1957–1959 | Influenza A (H2N2) | Influenza | unidentified | 1.1 million | <0.11% | [ ] |
| 1968 | Influenza A (H3N2) | Influenza | unidentified | 1.0 million | <0.52% | [ ] |
| 1981 | HIV | HIV/AIDS | 75 million | 32 million | 99.98% | [ ] |
| 2002 | SARS | SARS | 8422 | 916 | 11.4% | [ ] |
| 2009 | Influenza A (H1N1) | Influenza | 12,700 | 4700 | 0.1–5% | [ ] |
| 2012 | MERS | MERS | 2494 | 11,325 | 34% | [ ] |
| 2014–2016 | Ebola virus | Ebola | 28,652 | 13,562 | 40% | [ ] |
| 2016 | Zika virus | Zika | 41,300 | --- | 8.3% | [ ] |
| 2019 | SARS-CoV-2 | COVID-19 | 101,561,219 | 2,196,944 | 2.1% | [ ] |
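
The CFR definition used in this section (deaths divided by confirmed cases) can be checked against the SARS-CoV-2 row of Table 1 with a short sketch. The function name and the under-ascertainment factor are hypothetical and only illustrate why incomplete testing inflates the CFR; they are not values from the cited sources.

```python
def case_fatality_ratio(deaths: int, confirmed_cases: float) -> float:
    """Return the CFR as a percentage: 100 * deaths / confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * deaths / confirmed_cases


# SARS-CoV-2 row of Table 1 (worldwide cases and deaths).
COVID_CASES = 101_561_219
COVID_DEATHS = 2_196_944

cfr = case_fatality_ratio(COVID_DEATHS, COVID_CASES)
print(f"CFR from confirmed cases: {cfr:.2f}%")

# Hypothetical assumption: true infections are 10 times the confirmed count.
# The same number of deaths is then spread over a larger denominator.
ASSUMED_UNDERCOUNT_FACTOR = 10
adjusted = case_fatality_ratio(COVID_DEATHS, COVID_CASES * ASSUMED_UNDERCOUNT_FACTOR)
print(f"Ratio if infections were {ASSUMED_UNDERCOUNT_FACTOR}x confirmed cases: {adjusted:.2f}%")
```

The first figure comes out at roughly 2.16%, close to the value listed in Table 1. The second shows how strongly the ratio depends on how many infections are actually confirmed by testing.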

Significant differences were observed in the CFR values published by different countries (Figure 2). In the initial period of the SARS-CoV-2 pandemic, most European countries exhibited considerably higher CFR values: approximately 12.5% in the United Kingdom, 10.19% in Spain, 12.79% in Italy, 14.75% in France, 11.9% in Belgium, 10.8% in the Netherlands, and 8.78% in Sweden, according to the WHO’s report (Figure 2). Among these European nations, only Germany maintained a much lower CFR of about 2.29%. The high values were due to aged populations, compromised immunity, and chronic comorbidities such as diabetes, high blood pressure, and other metabolic diseases [ 132 ]. The CFR is also strongly correlated with cardiovascular disease and the age of the infected individuals. Older age groups were more prone to severe complications and, in some cases, death after COVID-19 infection; they have weaker immune systems and often present with other conditions, such as metabolic disorders, hypertension, diabetes, and cancer [ 133 ].

Furthermore, the CFR of a given country varies over time, which makes it challenging to draw firm conclusions about the mortality rate [ 134 , 135 ] and morbidity of COVID-19. Further research might provide accurate insights into other factors contributing to high CFR values [ 136 ]. Access to well-equipped healthcare systems, however, plays a vital role in achieving low CFR values. Notably, countries lacking experienced teams and an established healthcare infrastructure will predictably have a high CFR. Thus, as the SARS-CoV-2 pandemic continues to spread, other countries and communities must quickly establish critical life-saving healthcare systems.

3. Risk Mitigation Strategies

This report briefly discusses mitigation strategies and containment measures applicable to the COVID-19 pandemic, which remains a widespread challenge to healthcare systems worldwide. The rapid development of the pandemic has shown that biosafety policies are critical to human society and economic security. At present, a constant increase in the number of SARS-CoV-2 cases and a risk of resurgence are clearly evident in some countries. It is critical to protect susceptible populations by eliminating transmission risks and avoiding superspreading events. Given the current crisis, we all need to comply with rules and regulations at the public, community, and personal levels, since collective effort is key to mitigating global COVID-19 risk. Some hazardous waste management principles are also applicable to the design of a risk mitigation framework for the effective containment of SARS-CoV-2 [ 137 ]. Restoring a collaborative spirit is also essential across various allied efforts, including the safety of healthcare staff, public safety, food security, surveillance, rapid detection, self-isolation, contact tracing, and medical treatment. The perspective presented here aligns with the five-stage top-down hierarchy of controls designed by the Centers for Disease Control and Prevention (CDC), which comprises, in descending order of practicality: elimination (isolation/quarantine), temporary options (remote work, distance learning), engineering controls (protecting people from exposure), administrative controls (changing people’s behavior), and safeguarding with PPE [ 138 , 139 ]. Adopting the hierarchy of controls from an occupational safety standpoint can provide a better understanding of the benefits of hazard control practices in containing the further spread of COVID-19 [ 139 ].

3.1. Administrative Control Measures

This section describes the scope of administrative controls appropriate for managing disease outbreaks and public safety. First, administrative controls should be established as best practice for administering any public health emergency. The lessons from the previous SARS and MERS events provided sufficient data for designing administrative control policies with an appropriate model of epidemiologic observations [ 140 ]. Such provisions, with requirements specific to the given context, have to be adopted to counter the dynamic risk of the COVID-19 pandemic. Second, standard hazardous waste disposal procedures for face masks, personal protective equipment (PPE) kits, and dressing materials should be implemented to protect healthcare workers [ 141 ]. The safe disposal of PPE kits (including gloves) used to protect first responders, healthcare workers, and healthy patients potentially reduces the likelihood of disease dissemination in healthcare facilities (nursing rooms, reception counters, hospital departments, and other settings). The third and most crucial administrative control measure is the selection of effective disinfectants for SARS-CoV-2. Because SARS-CoV-2 is an enveloped virus with a phospholipid bilayer, it is susceptible to ordinary soaps and detergents, bleaching agents, quaternary ammonium compounds, and 70% alcohol-based hand sanitizers [ 142 , 143 ]. These disinfectants are highly effective at dissolving the lipid layer or denaturing SARS-CoV-2 proteins. As shown in the graphical abstract, disinfection methods are also vital for eliminating SARS-CoV-2. Soaps, disinfectants, and hand sanitizers therefore significantly reduce the likelihood of infection and the occurrence of infectious doses by inactivating the SARS-CoV-2 present on surfaces.

Administrative controls also involve changing specific behaviors through policy reforms or implementing a framework to reduce public health risk [ 144 ]. Such control measures are frequently revised to set social distancing levels, minimize human-to-human interaction, or limit human density in given spaces, all of which reduce exposure to COVID-19-infected individuals. Administrative measures such as policy reforms restricting indoor activities (religious gatherings, theaters, sports stadiums, and schools) and organization-level orders promoting remote work, in-person work scheduling, and distance learning could minimize density-related risk [ 145 ]. This strategy exemplifies the major weakness of administrative controls: dependence on administrative authorities, since people must rely on organizational management and wait for transmission risk to be mitigated. Lessons from several contexts, including patient safety and risk management, show that policy reforms relying on perfect adherence have limitations and are prone to fail. Compliance is therefore of the utmost importance for administrative controls to be effective; even high-principled people sometimes fail to adhere to the rules. Prompt recruitment of trained healthcare professionals has also been suggested to enhance the performance of the existing healthcare system and scale up public healthcare facilities [ 146 , 147 ].

3.2. Engineering Control Measures

Risk mitigation also depends on numerous safety measures that promote best practices for handling the SARS-CoV-2 crisis. Engineering controls refer to improving the physical barriers between risk sources and health workers [ 146 ]; they are the most preferred measure for reducing transmission risk. Engineering control measures can ensure, through mechanical and physical means, the physical separation of operations to treat infected persons [ 63 ]. The best example of a practical engineering control is ventilation combined with a physical barrier, which enables negative-pressure isolation and quarantine rooms [ 148 ]. Most hospital rooms are not equipped with filters such as high-efficiency particulate air (HEPA) filters, which retain small airborne particles (from 0.2 to 0.3 μm) with an efficiency of about 99.97% [ 149 ]. Portable HEPA filters in indoor settings can be a potential adjunct in controlling the transmission of SARS-CoV-2. Ventilation in conventional rooms (i.e., opening windows) has also been reported to be an effective means of reducing airborne viral loads [ 150 ]. A study conducted during the SARS epidemic revealed a significant relationship between higher ventilation in isolation rooms with multiple beds and lower infection rates among healthcare workers. Increased ventilation can be an essential strategy for reducing droplet transmission, airborne aerosol transmission, and the spread of influenza in academic institutions and indoor settings [ 151 , 152 ]. Improving ventilation systems could therefore be a valuable option for indoor environments; otherwise, enhancing natural ventilation by simply opening windows also helps to increase airflow, thus decreasing the risk of infection [ 153 ].
Ensuring indoor air safety by installing ventilation and filtration systems with germicidal ultraviolet (UV) light has yet to be established as an engineering control for safeguarding indoor environments against COVID-19 transmission [ 154 , 155 ]. The CDC further recommends additional engineering control measures, including physical barriers, partitions, UV radiation, and virus-proof fabrics, to avoid contact between healthy healthcare workers and infected persons. Well-designed engineering control measures have been shown to be highly effective in reducing the risk of infectious disease for healthy individuals, regardless of differences in their behavior, and to considerably heighten the safety of individuals who comply with administrative control measures or use PPE kits [ 156 ].

3.3. Personal Protective Equipment (PPE)

Personal protective equipment (PPE) is essential to safeguard healthcare workers from infectious diseases such as COVID-19. PPE is the best defensive strategy applicable in all operations; it serves as a primary physical barrier between healthcare workers and infected patients [ 138 ]. The use of PPE kits should be prioritized, particularly when visiting isolation wards and interacting with infectious patients. The CDC has issued strict guidelines on PPE kits, whose use is mandatory when in contact with suspected cases and when treating patients with severe symptoms. PPE kits include disposable gowns, fit-tested N95 masks or electronic respirators, eye protectors or transparent face shields, and disposable gloves [ 157 ].

Fit-tested N95 respirator masks do not filter out 100% of airborne virus-laden aerosol particles, as they are designed to act more as spray or splash barriers (Figure 3). Nevertheless, they effectively protect healthy individuals from infected ones, reducing transmission risk by acting as a physical barrier to cough droplets [ 158 ]. The CDC website provides detailed guidelines on the use of fit-tested N95 respirators and masks. Respiratory protective equipment (RPE) needs to be designed to improve the protection of healthcare professionals against the inhalation of aerosol particles [ 159 ].

Figure 3. Virus transmission risk depends on the use of fit-tested N95 masks by both infected and healthy individuals.

We further suggest considering advances in designing and manufacturing fabrics with antimicrobial and antiviral functionality for PPE kits. Such fabrics would protect healthcare workers against viruses and bacteria and provide extra safety. Although developed countries have ensured a steady supply of PPE kits, including personal protective clothing (PPC) [ 139 ], developing countries have yet to secure their supply chains [ 160 ]. Further developments are needed to secure the supply chain [ 161 ] and to dispose of the single-use fabrics used in protective clothing more safely, reducing negative impacts on the environment [ 162 ]. Future research should consider ways to increase the safety, efficiency, and global accessibility of PPC with reduced impact on the environment.

3.4. Herd Immunity via Vaccination

“Herd immunity”, also known as “population immunity”, is the indirect protection from an infectious disease that arises when most members of a community are immune through infection or vaccination. Most authorities have sought to achieve herd immunity via vaccination rather than by letting the disease spread through a section of the population, since the latter can result in excessive infections and deaths and risks a public health crisis. Vaccines induce short-term or long-term immune responses by eliciting antibodies against future disease events. An infection can still occur in the future, but, importantly, vaccination helps the individual to recover without severe illness [ 163 , 164 ].

Vaccines have effectively controlled historical contagious diseases such as polio, smallpox, rubella, diphtheria, and many others [ 165 ]. COVID-19 vaccine distribution initiatives have begun, and vaccination rates have increased around the world [ 166 ]. However, people have reasonably started to wonder when this pandemic will end, and the answers from experts and even authorities are often full of uncertainty. There is an intense expectation that enough people will eventually gain immunity against SARS-CoV-2 to break the transmission chain (i.e., that we will reach the “herd immunity threshold”), but so far this appears doubtful [ 167 ]. Herd immunity against COVID-19 through vaccination might be difficult or impossible for several reasons, including vaccination hesitancy, protection-related questions, and uneven vaccine rollout [ 168 ].

First, some individuals object to receiving the COVID-19 vaccine because of religious reasons, fears about health risks (such as allergic reactions), hesitancy over vaccine acceptance, and skepticism about the benefits [ 169 ]. If the proportion of vaccinated people in a community is lower than the “herd immunity threshold”, the disease will continue to spread [ 167 ]. Accordingly, a large proportion of the global population needs to be vaccinated against COVID-19 to achieve herd immunity. Second, it remains unclear how long the COVID-19 vaccines will protect against COVID-19 [ 170 ]. Recent research also suggests that current COVID-19 vaccines may have lower efficacy against some of the newly emerging variants of the virus, which can be resistant or non-responsive [ 171 ]. Third, the distribution of COVID-19 vaccines and vaccination rates vary significantly among countries, states, and local vaccination centers [ 172 ]. Even if a particular community achieves a high COVID-19 vaccination rate, outbreaks will recur as populations mix if nearby areas do not. Overall, this is an important research area, and vaccine distribution will likely differ among communities; therefore, integrated vaccination and physical distancing interventions need to be redirected [ 172 ]. The key factors that make achieving herd immunity challenging include vaccine type, effectiveness, and distribution sustainability, the populations prioritized for vaccination, and several other factors [ 173 , 174 ]. Further investigation is necessary to examine the safety and efficacy of SARS-CoV-2 vaccines and their potential to reduce the transmission and spread of the virus [ 175 , 176 ].
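
The “herd immunity threshold” discussed in this section is not quantified in the text. A common textbook approximation, assumed here and not drawn from the cited studies, is 1 − 1/R0 for a homogeneously mixing population, divided by vaccine effectiveness to obtain the required vaccination coverage. The function names and the 80% effectiveness figure are hypothetical; the R0 values are the estimates quoted in Section 2.7.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Fraction of the population that must be immune, using 1 - 1/R0.

    Assumption: homogeneous mixing and lasting immunity (textbook model only).
    """
    if r0 <= 1.0:
        return 0.0  # an infection with R0 <= 1 does not sustain spread
    return 1.0 - 1.0 / r0


def required_coverage(r0: float, effectiveness: float) -> float:
    """Vaccination coverage needed when the vaccine blocks only a fraction
    of transmission. A result above 1.0 means the threshold cannot be
    reached by vaccination alone under these assumptions."""
    if not 0.0 < effectiveness <= 1.0:
        raise ValueError("effectiveness must be in (0, 1]")
    return herd_immunity_threshold(r0) / effectiveness


ASSUMED_EFFECTIVENESS = 0.8  # hypothetical value, for illustration only

for r0 in (2.3, 5.7):  # R0 estimates quoted in Section 2.7
    threshold = herd_immunity_threshold(r0)
    coverage = required_coverage(r0, ASSUMED_EFFECTIVENESS)
    print(f"R0={r0}: threshold {threshold:.1%}, "
          f"coverage needed at {ASSUMED_EFFECTIVENESS:.0%} effectiveness: {coverage:.1%}")
```

With the higher R0 estimate, the required coverage exceeds 100% under the assumed effectiveness, which is one way to read the statement that herd immunity through vaccination might be difficult or impossible.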

The rapid progress of vaccination programs against COVID-19 is a major contemporary accomplishment and offers new hope of ending the global pandemic crisis [ 177 ]. Presently, several countries are making progress toward herd immunity by adopting the ethical route of mass vaccination. As vaccination drives proceed at remarkable speed, the number of fully vaccinated adults continues to rise, but a critical question remains unanswered: how long will immunity last after infection or vaccination [ 178 ]? Among several other challenges, it is therefore not clear if or when a particular country will achieve herd immunity. Most of the approved COVID-19 vaccines can be highly effective at protecting the population against serious health complications, thus reducing the number of patients needing hospitalization and lowering mortality rates [ 179 , 180 ]. Although it would be impossible to contain the transmission of the COVID-19 virus completely, successful vaccination programs will allow people to live more comfortably with COVID-19.

4. Experimental Section

Here, we examine observational, prospective, epidemiological studies in the scientific literature. The primary objective of this report is to raise awareness of public health risks and suggest appropriate mitigation strategies. The SARS-CoV-2 pandemic is examined from an investigator’s perspective of risk assessment and mitigation, based on recent developments in the literature and related to the public health approach. This report explores why COVID-19 progressed rapidly into a global pandemic that affected human lives and several sectors of the economy to a degree not seen in the recent past.

A formal method is used to evaluate multiple aspects of risk assessment for the SARS-CoV-2 pandemic, e.g., implications for humans and the environment. Risk assessments are commonly performed in research environments, particularly for drug development, and the safety of hazardous materials is considered in this report. A risk assessment rests on several well-defined elements, including the biological characteristics of the pathogen, the susceptibility of the human host, and the public health or environmental context. It is critical, however, to take public health aspects into account when assessing the severe public health concerns raised by a newly emerged infectious disease. This report is intended to support public awareness, policy reforms, and prompt healthcare decisions in the realm of public health and risk mitigation strategies, including administrative control measures, engineering control measures, and personal protective equipment.

5. Conclusions

Influenza and coronavirus outbreaks, including SARS and MERS, have a few features in common: (a) human interaction with wild animals, interspecies transmission, mutations, and human-to-human transmission, and (b) global pandemic reach. Unlike SARS-CoV-2, SARS and MERS were controlled and contained in time, before they could develop into a global public health and pandemic crisis. COVID-19 is a highly transmissible disease; viral shedding begins before the onset of noticeable symptoms. Several patients remain asymptomatic and act as super-spreaders, continuing to infect healthy individuals. Thus, SARS-CoV-2 spreads silently, threatening public safety globally, particularly in settings with high mobility and population density.

A risk assessment perspective on SARS-CoV-2-like infections is indispensable for designing policies on administrative controls, engineering controls, and PPE kits. Risk awareness could be the best approach to contain transmission and ease the SARS-CoV-2 pandemic crisis. This article discusses risk assessment perspectives in order to increase public health awareness of SARS-CoV-2, improve decision making, reduce the negative impact of excess communication, and provide directions for dealing with current challenges. Risk assessment perspectives relevant to SARS-CoV-2 need to be conveyed with broad reach and clarity, while introducing new behaviors, strengthening social compliance, lowering exposure risk, establishing possible interventions, and reducing misconceptions. Such perspectives are also essential for implementing interventions, managing pandemics effectively, and enabling an inclusive response to contain resurgence.

Presently, several countries are working proactively to save lives, establish medical infrastructure, deliver vaccines to citizens, and limit losses across sectors and economies, with the ultimate aim of preventing an expected regional or global economic recession. The context- and country-specific lessons learned so far from the SARS-CoV-2 pandemic need to be applied to devise preventive actions against the resurgence of SARS-CoV-2 and to safeguard against future pandemics. There is therefore a need to consider the conceptual design of the framework, the development of medical infrastructure, and the promotion of non-pharmaceutical interventions to fast-track the performance of risk mitigation strategies and ensure the containment of future pandemics.

Acknowledgments

This work was supported by the Dongguk University Research Fund of 2021 (S-2021-G0001-00074). The authors would also like to acknowledge the Ramanujan Fellowship (File SB/S2/RJN-013/2018), funded by the Science and Engineering Research Board (SERB), India.

Author Contributions

Conceptualization, D.-Y.K.; validation and visualization, S.L.; writing—original draft, G.S.G.; writing—review and editing, R.R.P. and S.K.S. All authors have read and agreed to the published version of the manuscript.

This research received no external funding.

Conflicts of Interest

The authors have no conflict of interest to declare.

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.


A Guide for Covid-19 Risk in Your County

By Eleanor Lutz, Aliza Aufrichtig, Charlie Smart, Albert Sun, Rich Harris and Gabriel Gianordoli

Jan. 26, 2021


This page is no longer being updated on a regular basis. You can track the latest Covid-19 developments in the U.S. here.

Between Jan. 20, 2021 and March 1, 2022, the Times published a Covid-19 transmission risk level for each U.S. county, based on the reported number of Covid-19 cases and test positivity in the area.

[Chart: timeline of 2021 with monthly ticks, Jan. 1 to Dec. 1, 2021]

The Times continues to report daily updates for outbreaks in each county. To see the latest information about your community, search for a county below.

Alongside the risk levels, the Times also published county-specific guidance for common activities to help readers lower their personal risk of getting Covid-19 and to help them protect their community. This advice was developed with public health experts at Johns Hopkins Bloomberg School of Public Health and Resolve to Save Lives, an initiative of Vital Strategies.

“Providing transparent, real time information about what people’s risks are is empowering,” said Dr. Tom Frieden, who is a former director of the Centers for Disease Control and Prevention and the president and C.E.O. of Resolve to Save Lives. “You want to know how hard it’s raining Covid.”

This guidance, as of Jan. 2022, is shown below as a reference. It is no longer being updated. More recent guidance is available from the Centers for Disease Control and Prevention.

How to protect yourself and others

Here’s how to lower your personal risk of getting Covid-19 and protect your community. People who have immunocompromising conditions should consult the C.D.C. or their health care provider for possible additional precautions.

If you’re fully vaccinated

Individuals are considered fully vaccinated two weeks after receiving either the second dose of a two-dose vaccine series like those by Pfizer or Moderna, or one dose of a single-dose vaccine. If you are fully vaccinated, your risk of infection is lower and your risk of severe disease is much lower than if you are unvaccinated.

The C.D.C. also recommends additional booster vaccine doses for all adults.

Avoid large indoor events with more than fifty people, especially if some participants may be unvaccinated. Consider choosing outdoor activities instead, such as outdoor dining, outdoor team sports or meeting friends outside.

It’s better to socialize inside the home with only one or two other vaccinated households. Consider wearing a mask if there is a chance some attendees may be unvaccinated or if people in your own household are in a high-risk category.

You can lower your risk during grocery shopping or other public indoor activities by choosing places where people wear masks. If others are not wearing masks, consider choosing a close-fitting mask with good filtration, or wearing two close-fitting cloth masks.

Traveling domestically is safer after you are fully vaccinated. However, you should not travel if you feel sick, test positive, were exposed to someone with Covid-19, or if you are waiting for the results of a Covid test.

You may want to check the level of transmission at departure and arrival destinations before traveling. Keep in mind that receiving medical care, even for unrelated conditions, may become difficult if hospitals at your travel destination are overwhelmed with Covid-19 patients.

You don’t need to get tested or quarantine after traveling domestically unless you develop Covid-19 symptoms.

International travel may be riskier than domestic travel. Check the C.D.C.’s country-specific Covid-19 risk and international travel requirements before traveling to a different country.

If you feel sick, you should stay home and get tested. The C.D.C. says that people who test positive should stay home for five days. If your symptoms are resolving after those five days and you no longer have a fever, you can leave your home, but you should still wear a mask around others for another five days.

If you were in close contact with someone infected with the virus, you should get tested five days after exposure if possible and wear a mask around other people for 10 days. You should also stay home for five days directly after exposure if you have not yet received a booster shot and you were fully vaccinated more than six months ago with the Pfizer or Moderna vaccine series or vaccinated more than two months ago with the Johnson & Johnson vaccine.

If you or someone in your household is older or has other risk factors for severe Covid-19, you may choose to wear a mask in public indoor spaces even when Covid transmission in your area is low. You may also consider upgrading to a close-fitting mask with good filtration, or wearing two close-fitting cloth masks. Regular handwashing also helps prevent the spread of Covid-19 and other diseases.

If you’re not fully vaccinated

Here’s how you can reduce the risk of getting Covid-19 if you haven’t yet completed your vaccination series.

Avoid indoor dining, bars, gyms, movie theaters and nonessential shopping, as well as having friends over to your home, and indoor personal care services like haircuts and manicures. Given the severity of the outbreak, spending time inside with people from other households puts you at risk for getting the coronavirus or spreading it to others.

Whenever possible, you should choose delivery or curbside pickup instead of shopping in person. If shopping in person is the only option, limit yourself to buying only essential supplies, shop during less crowded hours and keep your visits as short as possible.

Avoid all nonessential travel. If you must take a taxi, open the windows and sit far away from others in the vehicle. If you need to take public transit, try to avoid rush hours and crowds so you can keep your distance from others. If you fly, choose less crowded flights or airlines that keep middle seats empty.

Weddings, funerals, concerts, sporting events and other gatherings that bring multiple households together are places where Covid can spread easily. At this level of risk, even outdoor events are not safe, so consider postponing. Religious services are safest when conducted outdoors and without singing.

Walking, cycling, running and other outdoor individual workouts are the safest kinds of exercise. Low-contact outdoor sports like singles tennis, skateboarding and golf may be enjoyed safely. Contact sports like basketball and soccer should be avoided.

Because of the extremely high risk of exposure to Covid, even outdoor dining and outdoor bars are unsafe.

Work remotely when possible and avoid in-person meetings. In the workplace, less crowded hours are the safest to be on the job.

Children tend to have less-severe symptoms but can still spread the coronavirus, so consider the health risks of everyone in your household when making decisions about your child’s activities.

Learning environments where students stay in small groups at all times make it safer for unvaccinated younger students to go to school. Avoid play dates and extracurricular activities.

Do not skip or delay medical care, including mental health care. Talk to your doctors about postponing any nonessential appointments. If you have an appointment, call before your visit to find out if you need to take special precautions, and ask if telehealth is a good option for you.

You should stay at least six feet away from people who live in other households. Wear a mask that covers your nose and mouth when you are outside your home and whenever you are around people who do not live with you, including any visitors to your home.

If you were in close contact with someone infected with the virus, you should stay home for five days, and get tested five days after exposure if possible. You should also wear a mask around others for 10 days.

If you feel sick, you should stay home and get tested. The C.D.C. says that people who test positive should stay home for five days. If your symptoms are resolving after those five days and you no longer have a fever, you can leave your home, but you should still wear a mask around others for another five days.

Avoid crowds, and limit the number of people you meet and the amount of time you spend with them. Avoid indoor spaces with poor airflow. Wash your hands often, especially after visiting a public place or blowing your nose, coughing or sneezing.

About the Covid-19 risk levels

Each county’s Covid-19 risk is primarily based on the number of cases reported per capita during the past two weeks. Additional precautions are suggested if more than 10 percent of tests had a positive result over the past two weeks of available data. This can mean that the county is not testing enough, and that the number of cases may be undercounted.

Although risk levels are assigned based on expert guidance and careful analysis, it is possible that the risk level in a specific county may be over- or underestimated because of a lack of reliable data.

To learn more about county risk and guidance, visit a specific county’s page by using the search feature above.

A county is at an extremely high risk for unvaccinated people if it reported an average daily rate of more than 45 cases per 100,000 people over the past two weeks. Small counties with a population of less than 5,000 people are in this category if they reported more than 32 cases over the past two weeks. A county with fewer cases may also be in this category if more than 10 percent of tests had a positive result over the past two weeks. This can mean that the county is not testing enough, and that the number of cases may be significantly undercounted.

A county is at a very high risk for unvaccinated people if it reported an average daily rate of more than 11 cases per 100,000 people over the past two weeks. Small counties with a population of less than 5,000 people are in this category if they reported more than 8 cases over the past two weeks. A county with fewer cases may also be in this category if more than 10 percent of tests had a positive result over the past two weeks. This can mean that the county is not testing enough, and that the number of cases may be significantly undercounted.

A county is at a high risk for unvaccinated people if it reported an average daily rate of about 3 or more cases per 100,000 people over the past two weeks. Small counties with a population of less than 5,000 people are in this category if they reported more than 2 cases over the past two weeks. A county with fewer cases may also be in this category if more than 10 percent of tests had a positive result over the past two weeks. This can mean that the county is not testing enough, and that the number of cases may be significantly undercounted.

A county is at a moderate risk for unvaccinated people if it reported an average daily rate of about 2 cases per 100,000 people over the past two weeks. Small counties with a population of less than 5,000 people are in this category if they reported 1 or more cases over the past two weeks.

A county is at a low risk for unvaccinated people if it reported an average daily rate of less than 1 case per 100,000 people over the past two weeks. Small counties with a population of less than 5,000 people are in this category if they reported no cases over the past two weeks.

In some cases, a county might not have a risk level if not enough recent data was available, or if inconsistencies were found in the data. If a county’s recent testing data was not available, the rate of positive tests in the state was used, along with recent cases, to calculate the risk level.
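The thresholds described above can be sketched as a small classifier. This is only an illustration of the published cutoffs, not the Times's actual code: the function name, the return format and the exact boundary between "low" and "moderate" for larger counties (set here at 1 daily case per 100,000, where the text says "about 2") are assumptions. Because the text does not say exactly how test positivity moves a county into a higher category, the sketch only returns a flag when positivity exceeds 10 percent.

```python
from typing import Optional, Tuple

SMALL_COUNTY_POPULATION = 5_000  # counties below this use raw case counts


def county_risk_level(cases_14d: int, population: int,
                      positivity: Optional[float] = None) -> Tuple[str, bool]:
    """Return (risk level, positivity flag) for one county.

    cases_14d  -- cases reported over the past two weeks
    positivity -- share of tests that were positive (0.0-1.0), if known
    """
    if population < SMALL_COUNTY_POPULATION:
        # Small counties: thresholds on the total two-week case count.
        if cases_14d > 32:
            level = "extremely high"
        elif cases_14d > 8:
            level = "very high"
        elif cases_14d > 2:
            level = "high"
        elif cases_14d >= 1:
            level = "moderate"
        else:
            level = "low"
    else:
        # Larger counties: average daily cases per 100,000 people.
        daily_per_100k = cases_14d * 100_000 / (14 * population)
        if daily_per_100k > 45:
            level = "extremely high"
        elif daily_per_100k > 11:
            level = "very high"
        elif daily_per_100k >= 3:
            level = "high"
        elif daily_per_100k >= 1:  # assumed cutoff; the text says "about 2"
            level = "moderate"
        else:
            level = "low"

    # More than 10 percent positive tests can mean cases are undercounted.
    flag = positivity is not None and positivity > 0.10
    return level, flag
```

For example, a county of 100,000 people with 280 reported cases over two weeks averages 20 daily cases per 100,000 and falls in the "very high" category, while a county of 3,000 people with 9 cases does so through the small-county rule.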

Since the risk levels were first published in January 2021, The Times has made the following methodology changes:

December 29, 2021: The guidance was edited to reflect the most recent C.D.C. recommendations based on recent research about the Omicron variant.

October 20, 2021: The guidance for vaccinated people was edited to reflect the most recent C.D.C. recommendations.

August 26, 2021: The guidance for unvaccinated people was edited to reflect recent scientific advances in understanding the Delta variant.

August 19, 2021: The overall risk for vaccinated individuals was changed from “minimal risk” to “lower risk” to reflect recent findings of increased transmission of the Delta variant in vaccinated people.

July 27, 2021: The guidance was edited to reflect the updated advice for vaccinated people published by the C.D.C.

June 15, 2021: The threshold between low and moderate risk levels was increased from 10 cases per 100,000 people over the past two weeks to 20 cases per 100,000 people over the past two weeks. This change was made to better reflect the Level of Community Transmission metric published by the C.D.C.

May 18, 2021: New guidance was added for vaccinated individuals to reflect recent advice published by the C.D.C.

May 6, 2021: The description of the risk levels and guidance was changed to specify more clearly that they apply to unvaccinated individuals.

March 31, 2021: The description of the risk levels was changed to the risk of exposure to Covid-19, rather than the risk of getting Covid-19. This change was made to more accurately describe the risk situation of the growing number of vaccinated people.

March 23, 2021: The risk calculation method was adjusted to use the total number of reported cases, rather than the per capita number, in small counties with fewer than 5,000 people. This change was made in order to estimate risk more precisely in areas where a single case may account for a large percentage of the population.


  • Open access
  • Published: 09 July 2024

Assessment of COVID-19 risk factors of early and long-term mortality with prediction models of clinical and laboratory variables

  • Dawid Lipski 1 ,
  • Artur Radziemski 1 ,
  • Stanisław Wasiliew 1 ,
  • Michał Wyrwa 1 ,
  • Ludwina Szczepaniak-Chicheł 1 ,
  • Łukasz Stryczyński 1 ,
  • Anna Olasińska-Wiśniewska 2 ,
  • Tomasz Urbanowicz 2 ,
  • Bartłomiej Perek 2 ,
  • Andrzej Tykarski 1 &
  • Anna Komosa 1  

BMC Infectious Diseases volume 24, Article number: 685 (2024)


Coronavirus disease (COVID-19) may lead to serious complications and increased mortality. The outcomes of patients who survive the early disease period are burdened with persistent long-term symptoms and increased long-term morbidity and mortality. The aim of our study was to determine which baseline parameters may provide the best prediction of early and long-term outcomes.

The study group comprised 141 patients hospitalized for COVID-19. Demographic data, clinical data and laboratory parameters were collected. The main study endpoints were defined as in-hospital mortality and 1-year mortality. The associations between the baseline data and the study endpoints were evaluated. Prediction models were created.

The in-hospital mortality rate was 20.5% (n = 29). Compared with survivors, nonsurvivors were significantly older (p = 0.001) and more often presented with comorbidities, including diabetes (p = 0.027) and atrial fibrillation (p = 0.006). Assessment of baseline laboratory markers and time to early death revealed negative correlations between time to early death and higher IL-6 levels (p = 0.032; Spearman rho = −0.398) and lower lymphocyte counts (p = 0.018; Pearson r = −0.438). The one-year mortality rate was 35.5% (n = 50). The 1-year nonsurvivor subgroup was older (p < 0.001) and had more patients with arterial hypertension (p = 0.009), diabetes (p = 0.023), atrial fibrillation (p = 0.046) and active malignancy (p = 0.024) than did the survivor subgroup. The model composed of diabetes, atrial fibrillation, IL-6 and lymphocyte count revealed the highest value for 1-year mortality risk prediction.

Conclusions

Diabetes and atrial fibrillation, as clinical factors, and LDH, IL-6 and lymphocyte count, as laboratory determinants, are the best predictors of COVID-19 mortality risk.


The coronavirus disease (COVID-19) pandemic, caused by severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), is the most unpredictable and overwhelming experience of the 21st century for health-care systems worldwide.

The clinical manifestations of COVID-19 infection vary from asymptomatic to pneumonia, which may lead to acute respiratory distress syndrome (ARDS), multiorgan failure and death [ 1 ]. Algorithms for the management of patients with COVID-19 and various therapeutic treatment regimens have been developed [ 2 , 3 ]. Moreover, predictors of short- and long-term outcomes have been meticulously investigated to establish clinical factors and laboratory parameters that can differentiate patients with worse survival. Older age, diabetes, hypertension, and chronic kidney disease [ 4 , 5 , 6 , 7 ] have been reported to be strong predictors of mortality and morbidity due to respiratory deterioration. The role of laboratory parameters in predicting the severity of COVID-19 infection and its outcome is still a subject of discussion and remains unclear [ 8 , 9 , 10 , 11 , 12 , 13 ]. The available scientific data and our experience suggest that the assessment of baseline levels of laboratory markers together with further monitoring of their changes might be of great assistance for clinicians in terms of predicting disease severity, disease evolution and patient prognosis [ 14 , 15 , 16 , 17 , 18 ]. Not uncommonly, COVID-19 is characterized by acute and unpredictable deterioration and high mortality in certain groups of patients [ 19 ], and this does not depend on the virus type alone. The course of the disease may change rapidly, which can often overload the medical system. Therefore, the identification of early predictors of deterioration and mortality is beneficial [ 20 ]. The rapid and enormous extent of the pandemic has forced the engagement of large amounts of medical resources; thus, knowledge of the disease and experience in its management have continuously increased with the implementation of sufficient diagnostics and therapies. However, even greater awareness of the outcomes of patients who survive the early disease period and are discharged from intensive care units is currently needed since the burden of persistent post-COVID-19 symptoms (long COVID-19 syndrome) and increased long-term morbidity and mortality have been widely highlighted [ 21 , 22 , 23 , 24 ]. The investigation of predictors of disease progression is of utmost importance.

The aim of our study was to determine which baseline parameters may provide the best prediction of early and long-term outcomes.

Study population

The study population included 141 COVID-19 patients (median (Q1; Q3) age 66 (53; 76) years; 72 males (51%)) who were hospitalized at Poznan Temporary Hospital from the 5th of November to the 31st of December 2021.

The criteria for hospitalization included a positive COVID-19 Ag Rapid Test– Abbott Panbio result accompanied by symptoms of respiratory tract infection with at least one episode of desaturation (defined as saturation lower than 94%) or dyspnea on admission. Demographic and clinical data were collected. The baseline characteristics of the patients are shown in Table  1 .

Exclusion criteria included rapid deterioration at admission leading to early death without collection of blood samples.

Laboratory parameters

All analysed blood samples were collected from each patient at baseline (i.e. within the first 3 h from admission) at the same study point to assess simple blood morphology, biochemical parameters (including creatinine, urea, lactate dehydrogenase (LDH), aspartate aminotransferase (AST), alanine aminotransferase (ALT), C-reactive protein (CRP), procalcitonin, troponin, glucose, and electrolytes), lipid profiles, and coagulation parameters (including D-dimer (DD) and prothrombin time). Interleukin-6 (IL-6) levels were measured with the use of an enzyme-linked immunosorbent assay. Moreover, blood sampling was systematically repeated to monitor the disease course in a planned manner or additionally if deterioration occurred.

Chest computed tomography (CT) covering the apex to the lung base was performed for each patient to assess the presence and distribution of parenchymal lung tissue abnormalities. Electrocardiograms to assess heart rhythm and echocardiography in suspicion of heart failure or acute coronary syndrome were performed.

Study endpoints

The main study outcomes were in-hospital mortality, death in in-hospital survivors at the 1-year follow-up and 1-year mortality. Additional study endpoints included inflammatory involvement of at least 50% of the lung parenchyma estimated on chest CT and significant clinical deterioration (in-hospital need for noninvasive or invasive ventilation or in-hospital death). We conducted a 12-month follow-up analysis following the discharge of each patient from the hospital. Telephone calls were made (3 attempts to access each phone number provided by the patient) to gather information regarding their current health status. Data concerning mortality were collected from the national database.

Statistical analysis

Continuous variables were tested for a normal distribution using the Shapiro‒Wilk test and are reported as medians and interquartile ranges (Q1; Q3) since the data did not follow a normal distribution. Comparisons were performed by means of the Mann‒Whitney test. For the prediction analyses, the laboratory findings were divided by quartiles. Predictors of each outcome were then calculated for the patients whose values of each laboratory parameter fell below the 1st quartile (low) or above the 3rd quartile (high). The chi-squared test was used for categorical data. The Spearman or Pearson correlation coefficient (for nonparametric and parametric data, respectively) was used to assess associations between variables. Receiver operating characteristic (ROC) analysis was carried out to determine the best model for mortality prediction. Analysis was performed with Statistica (TIBCO) and JASP statistical software (JASP Team, 2023; version 0.18.1). A P value lower than 0.05 was considered to indicate statistical significance.
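The quartile split described above can be illustrated with a minimal sketch using Python's standard library. The marker values are hypothetical and are not data from the study; the function name is an assumption, and `statistics.quantiles` uses its default ("exclusive") method, which may differ slightly from the quartile definition used by the statistical packages named in the text.

```python
import statistics


def quartile_groups(values):
    """Label each value 'low' (< Q1), 'high' (> Q3) or 'middle'."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    labels = []
    for v in values:
        if v < q1:
            labels.append("low")
        elif v > q3:
            labels.append("high")
        else:
            labels.append("middle")
    return labels


# Hypothetical marker values, not data from the study.
ldh = [180, 210, 250, 300, 340, 390, 450, 520]
groups = quartile_groups(ldh)
```

With these eight values the two smallest are labelled "low" and the two largest "high"; only those two groups would enter the predictor calculations described above.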

In-hospital mortality

The in-hospital mortality rate was 20.5% (n = 29). The median (Q1; Q3) time from admission to early death was 15 (8; 19) days. Compared with survivors, nonsurvivors were significantly older (p = 0.001), were more often burdened with comorbidities, including diabetes (p = 0.027) and atrial fibrillation (p = 0.006), and presented lower saturation at admission (p = 0.015) (Table 1).

Assessment of baseline laboratory markers and time to early death revealed negative correlations between time to early death and higher IL-6 levels (p = 0.032; Spearman rho = −0.398) and lower lymphocyte counts (p = 0.018; Pearson r = −0.438). There were no similar correlations for CRP, LDH or PCT.

Additional in-hospital study outcomes

Significant in-hospital clinical deterioration was noted in 29.8% of patients (n = 42). Twelve patients required intubation and mechanical ventilation, of whom 7 died during hospitalisation and 5 survived. Eight patients were transferred to the intensive care unit after deterioration, and seven were transferred to cardiac or pulmonary departments for further treatment.

Involvement of at least 50% of the lung parenchyma on the baseline chest CT scan was observed in 29.8% of patients (n = 42). Of these, 19 (45.2%) experienced deterioration. This group accounted for 12 of the 29 in-hospital deaths (41.4%) and 7 of the 21 deaths after discharge during the 1-year follow-up (33.3%).

Patients’ treatment was based on the most current recommendations and anticoagulation was adjusted to patients’ weight and thrombosis risk (Table  1 ). Data were presented with the use of IMPROVE-VTE risk score. All patients were recommended to use low molecular weight heparin (LMWH) at treatment doses adjusted to the individual weight up to 10 to 20 days after the hospitalization (depending on the assessed severity of disease and risk related to co-morbidities).

Overall 1-year mortality and mortality in survivors at discharge

Twenty-one patients died after discharge from the hospital during the follow-up period. Overall, the 1-year mortality rate was 35.5% (n = 50). The 1-year nonsurvivor subgroup was older (p < 0.001) and had more patients with arterial hypertension (p = 0.009), diabetes (p = 0.023), atrial fibrillation (p = 0.046) and active malignancy (p = 0.024) than did the survivor subgroup.

Comparison between 1-year survivors and nonsurvivors

Baseline laboratory findings differed significantly between patients with and without any study endpoints and between survivors and nonsurvivors in terms of early and long-term follow-up. (Table  2 ).

Prediction of COVID-19-related mortality and morbidity

The baseline laboratory parameters differed significantly between the survivors and nonsurvivors at both the in-hospital and 1-year follow-ups (Table 2). Among all analysed markers, lymphocyte count, IL-6, LDH, procalcitonin and CRP differed significantly in both early and long-term observation. The details of the prediction analysis are outlined in Table 3.

Univariable and multivariable analyses were performed to reveal predictors of in-hospital and 1-year mortality.

Multivariable analysis for in-hospital mortality identified the following significant predictors: age (OR 1.075, 95% CI 1.010–1.143, p = 0.024), diabetes (OR 18.359, 95% CI 2.772–121.604, p = 0.003), atrial fibrillation (OR 7.382, 95% CI 1.157–47.097, p = 0.034), lymphocyte count (OR 0.003, 95% CI 0.000–0.067, p < 0.001), LDH level (OR 1.012, 95% CI 1.005–1.018, p < 0.001), IL-6 level (OR 1.010, 95% CI 1.001–1.019, p = 0.037), platelet count (OR 0.981, 95% CI 0.967–0.995, p = 0.0100), red blood cell count (OR 0.100, 95% CI 0.18–0.545, p = 0.008) and CRP level (OR 0.985, 95% CI 0.972–0.999, p = 0.030).

Multivariable analysis of 1-year mortality revealed that diabetes (OR 7.031, 95% CI 2.193–22.542, p = 0.001), atrial fibrillation (OR 8.217, 95% CI 1.932–34.943, p = 0.004) and, among the laboratory data, lymphocyte count (OR 0.041, 95% CI 0.007–0.252, p < 0.001), red blood cell count (OR 0.167, 95% CI 0.056–0.492, p = 0.001), IL-6 level (OR 1.012, 95% CI 1.004–1.020, p = 0.003) and LDH level (OR 1.005, 95% CI 1.002–1.009, p < 0.001) were predictive of mortality.
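Odds ratios close to 1 for continuous variables can look negligible because they describe a one-unit change in the predictor. The sketch below shows how such a value can be rescaled, assuming a standard logistic regression in which the OR is the exponential of the coefficient and the 95% CI is a Wald interval. The function names are hypothetical, and the assumption that the age OR is expressed per year is not stated in the text.

```python
import math


def log_odds_summary(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the logistic coefficient and its standard error from an OR and 95% CI."""
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def scaled_odds_ratio(odds_ratio, units):
    """Odds ratio for a change of `units` in the predictor."""
    return math.exp(math.log(odds_ratio) * units)


# Age in the in-hospital model: OR 1.075 (95% CI 1.010-1.143).
beta, se = log_odds_summary(1.075, 1.010, 1.143)
or_10_years = scaled_odds_ratio(1.075, 10)  # assumes the OR is per year of age
```

Under that assumption, an OR of 1.075 per year corresponds to roughly a doubling of the odds (about 2.06) for a 10-year age difference.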

Receiver operating characteristic (ROC) analysis

Multivariable and ROC analyses revealed the predictive value for 1-year all-cause mortality of a multifactorial model including clinical factors (diabetes mellitus and atrial fibrillation) and laboratory parameters (lymphocyte count, Rbc, LDH and IL-6), yielding a sensitivity of 65.2%, a specificity of 90.8% and an area under the curve (AUC) of 0.897, as shown in MODEL 1 in Fig. 1.

Due to the complexity of the presented model, we performed receiver operating characteristic (ROC) analysis for mortality prediction based on clinical factors (DM and AF), which revealed an area under the curve of 0.630, a sensitivity of 24.0% and a specificity of 90.1%, as shown in Fig.  1 .

Due to the insufficient accuracy of ROC analysis based solely on clinical factors, the laboratory parameters were incorporated in a stepwise manner. ROC analysis revealed the highest accuracy when clinical factors were combined with the IL-6 concentration, yielding a sensitivity of 40.0%, a specificity of 90.1% and an area under the curve (AUC) of 0.800, as shown in MODEL 3 in Fig.  1 . The combination of clinical factors and Rbc revealed an area under the curve of 0.679, yielding a sensitivity of 28.0% and a specificity of 87.9%, as shown in MODEL 4 in Fig.  1 . The ROC curve results for mortality prediction combining clinical factors with separate LDH and lymphocyte concentrations revealed areas under the curve of 0.728 (yielding a sensitivity of 40.8% and specificity of 88.6%) and 0.760 (yielding a sensitivity of 38.0% and specificity of 87.9%), respectively, as presented in MODEL 5 and 6 in Fig.  1 .

Finally, ROC analysis based on clinical factors and a combination of two laboratory parameters was performed, and a statistically significant model was reached when clinical factors (DM and AF) were combined with laboratory parameters such as lymphocyte count and IL-6, yielding a sensitivity of 64%, a specificity of 87.9% and an area under the curve (AUC) of 0.818, as shown in Model 7 in Fig.  1 .
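The three quantities reported for each model (sensitivity, specificity and AUC) can be computed from predicted risks and observed outcomes as in the following sketch. The scores and outcomes are hypothetical and not taken from the study, the function names are assumptions, and the AUC is computed through its rank interpretation rather than by whatever routine the authors' software uses.

```python
def sensitivity_specificity(scores, died, threshold):
    """Treat score >= threshold as predicted death; return (sensitivity, specificity)."""
    tp = sum(1 for s, d in zip(scores, died) if d and s >= threshold)
    fn = sum(1 for s, d in zip(scores, died) if d and s < threshold)
    tn = sum(1 for s, d in zip(scores, died) if not d and s < threshold)
    fp = sum(1 for s, d in zip(scores, died) if not d and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, died):
    """Probability that a random nonsurvivor outscores a random survivor (ties count half)."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and outcomes (1 = died within a year).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.4, 0.1]
died = [1, 1, 1, 0, 0, 0, 0, 0]

sens, spec = sensitivity_specificity(scores, died, threshold=0.5)
area = auc(scores, died)
```

Sensitivity and specificity depend on the chosen threshold, whereas the AUC summarises discrimination across all thresholds, which is why a model can pair a high AUC with a modest sensitivity at one operating point.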

Fig. 1. ROC analyses of the predictive value for 1-year all-cause mortality for different models composed of clinical and laboratory factors. Abbreviations: AF – atrial fibrillation, DM – diabetes mellitus, IL-6 – interleukin-6, LDH – lactate dehydrogenase, Rbc – red blood cell count

Our analysis showed that the combination of clinical and baseline laboratory data enables the most accurate prediction of mortality risk in patients with COVID-19 infection. The natural course of COVID-19 infection varies depending on the patient’s individual characteristics, such as age, comorbidities, and immune system status. Clinical variables, such as diabetes and hypertension, have been reported to be important factors influencing the course of COVID-19 [ 25 ]. In our analysis, we confirmed the significance of diabetes and atrial fibrillation on patient outcomes. Early on, diabetes was recognized as an important factor contributing to disease severity and mortality and a greater risk of respiratory complications. Furthermore, the newest studies [ 26 ] suggest a relationship between COVID-19 and new-onset diabetes. Chronic or de novo AF has been associated with a worse in-hospital prognosis, a greater complication rate and increased utilization of healthcare resources, both in COVID-19 and non-COVID-19 circumstances [ 27 , 28 , 29 , 30 ]. A severe course of COVID-19 observed in some patients may lead to death or long-lasting complications. The mortality rate due to COVID-19 infection therefore differs depending on several determinants [ 31 ].

Several studies have investigated the role of various biomarkers in the evaluation of mortality risk in terms of the escalation of respiratory support [ 32 ] and complication rate [ 33 ]. Careful evaluation of laboratory parameters may be helpful for estimating the severity of infection and patient prognosis [ 34 , 35 ]. Here, we presented the distribution of laboratory markers of inflammation and analysed the relationships between blood sample results and short- and long-term outcomes. The most frequent laboratory deviations in patients with COVID-19 infection are lymphopenia, neutrophilia, thrombocytopenia, and elevated levels of serum C-reactive protein (CRP). Common abnormalities in hematological tests include increased ferritin levels, prolonged prothrombin times and elevated D-dimer levels [ 36 ]. Other inflammatory biomarkers, including CRP and IL-6, have been extensively investigated and used in daily in-hospital practice for patients with COVID-19 [ 31 ]. CRP is a widely used parameter for all types of infections and has high reliability. A high CRP level is associated with severe pneumonia in COVID-19 patients and is a predictor of deterioration to acute respiratory distress syndrome (ARDS) and death [ 37 ]. Our study demonstrated that simple whole-blood morphology analysis and LDH and IL-6 levels may also have additional predictive value for the determination of COVID-19 complications. IL-6 is a prototype cytokine and shows pleiotropic activity necessary for host defence [ 38 ]. It is rapidly and extensively produced in the course of tissue damage related to infection. It induces a large amount of inflammatory acute phase proteins and mediates a variety of signalling pathways, cell proliferation and apoptosis. However, abnormal IL-6 production may lead to deleterious effects. The level of IL-6 was found to be much greater in COVID-19 patients and correlated with disease severity [ 39 ].

LDH (lactate dehydrogenase) is an enzyme that plays an essential role in the process of intracellular energy production. It is most active in the liver, heart, kidneys, muscles, lungs, brain and red blood cells (erythrocytes); thus, it is not specific to any certain tissue. LDH elevation has been reported in cardiac ischemia, malignancies and other pathologies [ 40 ], as it is indicative of cellular damage and hypoxia. It is believed to be a convenient biomarker of the systemic state of hyperinflammation [ 41 ]. Fialek et al. [ 40 ] performed a meta-analysis of studies that showed the value of LDH as a biomarker for the determination of COVID-19 severity.

The D-dimer level is often significantly elevated in patients with COVID-19. However, in our group, we did not confirm that the D-dimer level was a predictor of study endpoints.

The clinical condition of COVID-19 patients often deteriorates rapidly as a consequence of hyperinflammation due to cytokine storms that can lead to multiorgan damage [ 42 , 43 ]. It is critically important to identify factors that determine a worse prognosis and a greater mortality risk. Our analysis revealed that clinical determinants, including diabetes and atrial fibrillation, are crucial for mortality risk assessment. However, simple models based on clinical variables do not provide as much information as more extensive models composed of clinical and laboratory determinants. Therefore, we believe that examination of LDH, IL-6 and lymphocyte count is a valuable method and should be performed for each patient diagnosed with COVID-19 together with determination of the presence of comorbidities.

Study limitations: This study was performed on a population of patients treated during the COVID-19 pandemic. The World Health Organization (WHO) has since declared that COVID-19 no longer constitutes a public health emergency of international concern. However, we observe that the disease is still present and may follow an unpredictable course with various complications. Moreover, long-lasting complications are commonly reported, the most serious being increased mortality among patients who survived COVID-19. Therefore, we believe that the determination and use of predictive models are still crucial. The second limitation is that the number of analysed patients was relatively low; however, all of them were treated at one centre by one team, so bias related to differences in management and therapeutic methods was avoided. Due to dynamic changes in virus biology and the significant impact of vaccination on the disease course, no standard of care for COVID-19 patients has been established. Moreover, we are aware of the relatively low significance of laboratory data in terms of odds ratios; however, our extensive prediction models showed that adding laboratory parameters to clinical models provides a much more valuable prediction of 1-year mortality.
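For readers less familiar with the odds ratios mentioned above, the following minimal sketch shows how an unadjusted odds ratio and its Wald (Woolf) 95% confidence interval can be computed from a 2x2 table. The function name and the counts are hypothetical and are not taken from the study, whose models were multivariable; this only illustrates the quantity itself.

```python
import math


def odds_ratio_with_ci(exposed_events, exposed_non_events,
                       unexposed_events, unexposed_non_events, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Uses the Woolf (log-scale Wald) interval; z=1.96 gives about 95%.
    """
    a, b, c, d = (exposed_events, exposed_non_events,
                  unexposed_events, unexposed_non_events)
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: deaths and survivors with and without a risk factor.
or_value, ci_low, ci_high = odds_ratio_with_ci(30, 70, 40, 260)
print(f"OR = {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1 indicates an association at the chosen confidence level; in a logistic regression model the adjusted odds ratio for a variable is the exponential of its coefficient.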

Diabetes and atrial fibrillation are clinical factors, and IL-6 and lymphocyte count are laboratory determinants that provide the best predictive model for the assessment of COVID-19 mortality risk.

Data availability

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.

Poggiali E, Zaino D, Immovilli P, et al. Lactate dehydrogenase and C-reactive protein as predictors of respiratory failure in CoVID-19 patients. Clin Chim Acta. 2020;509:135–8.

World Health Organization. (2020). Clinical management of COVID-19: interim guidance. https://www.who.int/publications/i/item/clinical-management-of-covid-19 .

Hayıroğlu Mİ, Çınar T, Tekkeşin Aİ. Fibrinogen and D-dimer variances and anticoagulation recommendations in Covid-19: current literature review. Rev Assoc Med Bras (1992). 2020; 66: 842–848.

Liao LD, Hubbard AE, Gutierrez JP, et al. Who is most at risk of dying if infected with SARS-CoV-2? A mortality risk factor analysis using machine learning of patients with COVID-19 over time: a large population-based cohort study in Mexico. BMJ Open. 2023;13:e072436.

Abid A, Umar A, Qamar S. Disease outcomes of COVID-19 in Diabetic and Hypertensive patients during the Hospital Stay. Cureus. 2023;15:e46943.

Terlecki M, Wojciechowska W, Klocek M, et al. Association between cardiovascular disease, cardiovascular drug therapy, and in-hospital outcomes in patients with COVID-19: data from a large single-center registry in Poland. Kardiol Pol. 2021;79:773–80.

Jankowska-Sanetra J, Sanetra K, Konopko M, et al. Incidence and course of acute coronary syndrome cases after the first wave of the COVID-19 pandemic. Kardiol Pol. 2023;81:22–30.

Mathew BJ, Vyas AK, Khare P, et al. Laboratory diagnosis of COVID-19: current status and challenges. Iran J Microbiol. 2021;13(1):1–7.

Tang YW, Schmitz JE, Persing DH, Stratton CW. Laboratory diagnosis of COVID-19: Current issues and challenges. J Clin Microbiol. 2020;58:e00512–20.

Georgakopoulou VE, Gkoufa A, Makrodimitri S, et al. Risk factors for the in–hospital and 1–year mortality of elderly patients hospitalized due to COVID–19–related pneumonia. Exp Ther Med. 2023;27:22.

de Weber AP, Viero A, Pillat FT, de Lima Gonçalves MM. Changes in markers of inflammation and their correlation with death in patients with COVID-19 in the intensive care unit. Cytokine. 2024;175:156509.

Hayıroğlu Mİ, Çiçek V, Kılıç Ş, Çınar T. Mean serum D-dimer level to predict in-hospital mortality in COVID-19. Rev Assoc Med Bras (1992). 2021; 67: 437–442.

Cakir Guney B, Hayiroglu M, Senocak D, et al. Evaluation of N/LP ratio as a predictor of Disease Progression and Mortality in COVID-19 patients admitted to the Intensive Care Unit. Medeni Med J. 2021;36:241–8.

Fan BE, Chong VCL, Chan SSW, et al. Hematologic parameters in patients with COVID-19 infection. Am J Hematol. 2020;95:E131–4.

Marcinkiewicz J, Mazurek H, Majka G, Chain B. Are patients with lung cystic fibrosis at increased risk of severe and fatal COVID-19? Interleukin 6 as a predictor of COVID-19 outcomes. Pol Arch Intern Med. 2020;130:919–20.

Nowak A, Małyszek-Tumidajewicz J, Araszkiewicz A, et al. The course and treatment of COVID-19 in heart transplant recipients: a case series from the late phase of the pandemic. Kardiol Pol. 2023;81:281–3.

Güney BÇ, Taştan YÖ, Doğantekin B, et al. Predictive value of CAR for In-Hospital mortality in patients with COVID-19 pneumonia: a retrospective cohort study. Arch Med Res. 2021;52:554–60.

Çınar T, Hayıroğlu Mİ, Çiçek V, et al. Is prognostic nutritional index a predictive marker for estimating all-cause in-hospital mortality in COVID-19 patients with cardiovascular risk factors? Heart Lung. 2021;50:307–12.

Shi S, Qin M, Shen B, et al. Association of Cardiac Injury with mortality in hospitalized patients with COVID-19 in Wuhan, China. JAMA Cardiol. 2020;5:802–10.

Motloch LJ, Jirak P, Gareeva D, et al. Cardiovascular biomarkers for prediction of in-hospital and 1-Year post-discharge mortality in patients with COVID-19 pneumonia. Front Med (Lausanne). 2022;9:906665.

Nair SC, Gasmelseed HI, Khan AA, et al. Assessment of mortality from COVID-19 in a multicultural multi-ethnic patient population. BMC Infect Dis. 2021;21:1115.

Azzam A, Khaled H, Refaey N et al. The burden of persistent symptoms after COVID-19 (long COVID): a meta-analysis of controlled studies in children and adults. Virol J. 2024; 21.

Rola P, Włodarczak A, Włodarczak S, et al. Invasive assessment of coronary microvascular dysfunction in patients with long COVID: outcomes of a pilot study. Kardiol Pol. 2022;80:1252–5.

Czerwińska-Jelonkiewicz K, Beneria A, et al. Psychological burden of the COVID-19 pandemic 6 months after the outbreak - the voice of the young doctors’ generation: an international survey. Kardiol Pol. 2022;80:485–8.

Li C, Islam N, Gutierrez JP, et al. Associations of diabetes, hypertension and obesity with COVID-19 mortality: a systematic review and meta-analysis. BMJ Glob Health. 2023;8(12):e012581.

Dallavalasa S, Tulimilli SV, Prakash J, Ramachandra R, Madhunapantula SV, Veeranna RP. COVID-19: diabetes perspective-pathophysiology and management. Pathogens. 2023;12:184.

Niu H, Li J, Teng C, Lu X, et al. The incidence and impact of atrial fibrillation on hospitalized coronavirus disease-2019 patients. Clin Cardiol. 2024;47:e24240.

Terlecki M, Wojciechowska W, Klocek M, et al. Prevalence and clinical implications of atrial fibrillation in patients hospitalized due to COVID-19: data from a registry in Poland. Front Cardiovasc Med. 2023;10:1133373.

Çiçek V, Cinar T, Hayiroglu MI, et al. Preoperative cardiac risk factors associated with in-hospital mortality in elderly patients without heart failure undergoing hip fracture surgery: a single-centre study. Postgrad Med J. 2021;97:701–5.

Orhan AL, Çınar T, Hayıroğlu Mİ et al. Atrial fibrillation as a preoperative risk factor predicts long-term mortality in elderly patients without heart failure and undergoing hip fracture surgery. Rev Assoc Med Bras (1992). 2021; 67: 1633–1638.

Gibson PG, Qin L, Puah SH. COVID-19 acute respiratory distress syndrome (ARDS): clinical features and differences from typical pre-COVID-19 ARDS. Med J Aust. 2020;213:54–6.

Ntalouka MP, Brotis A, Mermiri M, et al. Predicting the Outcome of patients with severe COVID-19 with simple inflammatory biomarkers: the Utility of Novel Combined scores-results from a European Tertiary/Referral Centre. J Clin Med. 2024;13:967.

De Vita A, Franceschi F, Covino M. Increased thrombotic risk in COVID-19: evidence and controversy. J Clin Med. 2023;12:4441.

Zeng F, Huang Y, Guo Y, et al. Association of inflammatory markers with the severity of COVID-19: a meta-analysis. Int J Infect Dis. 2020;96:467–74.

Czupryna P, Moniuszko-Malinowska A, Rogalska M, et al. Inflammatory and thrombotic parameters associated with the COVID-19 course in Poland (SARSTer study). Adv Med Sci. 2022;67:291–7.

Mendes-Filho SPM, de Souza Pinheiro R, Martins FS, et al. Kinetics of IL-6, C-reactive protein and fibrinogen levels in COVID-19 outpatients who evolved to Hypoxemia. Clin Pathol. 2024;17:2632010X231222795.

Bouayed MZ, Laaribi I, Chatar CEM, et al. C-Reactive protein (CRP): a poor prognostic biomarker in COVID-19. Front Immunol. 2022;13:1040024.

Wang X, Tang G, Liu Y, et al. The role of IL-6 in coronavirus, especially in COVID-19. Front Pharmacol. 2022;13:1033674.

Lagunas-Rangel FA, Chávez-Valencia V. High IL-6/IFN-γ ratio could be associated with severe disease in COVID-19 patients. J Med Virol. 2020;92:1789–90.

Fialek B, Pruc M, Smereka J, et al. Diagnostic value of lactate dehydrogenase in COVID-19: a systematic review and meta-analysis. Cardiol J. 2022;29:751–8.

Medina-Hernández EO, Pérez-Navarro LM, Hernández-Ruiz J, et al. Changes in lactate dehydrogenase on admission throughout the COVID-19 pandemic and possible impacts on prognostic capability. Biomark Med. 2022;16:1019–28.

Ye Q, Wang B, Mao J. The pathogenesis and treatment of the ‘Cytokine storm’ in COVID-19. J Infect. 2020;80:607–13.

Kanecki K, Nitsch-Osuch A, Goryński P, et al. Hospitalizations for COVID-19 in Poland: a study based on data from a national hospital register. Pol Arch Intern Med. 2021;131:535–40.

Acknowledgements

Not applicable.

The authors have no funding to disclose.

Author information

Authors and affiliations

Department of Hypertensiology, Angiology and Internal Medicine, Poznan University of Medical Sciences, Poznan, Poland

Dawid Lipski, Artur Radziemski, Stanisław Wasiliew, Michał Wyrwa, Ludwina Szczepaniak-Chicheł, Łukasz Stryczyński, Andrzej Tykarski & Anna Komosa

Department of Cardiac Surgery and Transplantology, Chair of Cardio-Thoracic Surgery, Poznan University of Medical Sciences, ul. Długa 1/2, Poznan, 61-848, Poland

Anna Olasińska-Wiśniewska, Tomasz Urbanowicz & Bartłomiej Perek

Contributions

Conceptualization, DL, BP, AOW, TU, AT, AK; Data collection, DL, AR, SW, MW, LSC, ŁS, AK; Analysis and interpretation of the data, DL, AK, AOW, TU, AR, BP; Writing – Original Draft Preparation, DL, AR, AK; Writing – Review & Editing, DL, AR, SW, MW, LSC, ŁS, BP, AOW, TU, AT, AK; Supervision, AT, AK. All authors edited and approved the final version of the manuscript.

Corresponding author

Correspondence to Dawid Lipski.

Ethics declarations

Ethics approval and consent to participate

This retrospective study was approved by the Bioethics Committee of Poznan University of Medical Sciences (July 12, 2023) and adhered to the principles outlined in the Declaration of Helsinki.

Consent for publication

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Lipski, D., Radziemski, A., Wasiliew, S. et al. Assessment of COVID-19 risk factors of early and long-term mortality with prediction models of clinical and laboratory variables. BMC Infect Dis 24, 685 (2024). https://doi.org/10.1186/s12879-024-09592-7

Received: 07 April 2024

Accepted: 04 July 2024

Published: 09 July 2024

DOI: https://doi.org/10.1186/s12879-024-09592-7

  • Inflammation

BMC Infectious Diseases

ISSN: 1471-2334

Coronavirus (COVID-19) risk assessment

QCovid® is a coronavirus risk prediction model, created by the University of Oxford, which we're using to support the NHS coronavirus response.

QCovid®  is an evidence-based risk prediction model that estimates a person's combined risk of:

  • catching coronavirus and being admitted to hospital
  • catching coronavirus and dying
  • dying of coronavirus following a positive PCR test 

How QCovid® was developed

The Chief Medical Officer for England asked leading academics, clinicians and scientists to create a way of predicting who may be at high risk of serious illness if they catch coronavirus.

A team of researchers, led by the University of Oxford, studied the anonymised health records of more than 8 million people using GP records, hospital records and mortality data from late January 2020 to April 2020. This initial analysis, funded by the National Institute for Health Research, was done using data collected during the first wave of the coronavirus pandemic in the United Kingdom.

The results showed that things such as age, sex assigned at birth, height and weight (used to calculate body mass index (BMI)), ethnicity and some medical conditions increased risk relating to coronavirus – these are known as risk factors.

Further research was done using more up to date data (up to June 2021), and the model has been refined and updated based on the latest findings. Factors such as vaccinated status and current infection rates are now reflected in the model and some conditions have been removed as risk factors where sufficient data was not available.  

QCovid® was designed to:

  • risk assess the general population
  • inform people about their risk level
  • support people with decisions about behaviours in consultation with a clinician

The QCovid® model is not a vaccine effectiveness study and is therefore not designed to be used for comparing someone’s risk when not vaccinated with their risk once vaccinated. 

How QCovid® works

QCovid® works by taking information about risk factors and converting each of these into values. These values are then combined in an equation that estimates risk and generates figures for absolute risk and relative risk.

Absolute risk is the overall risk, based on what happened to other people with the same characteristics and risk factors who caught coronavirus and went to hospital or died as a result.

Relative risk is the level of risk compared to a person who is the same age and sex registered at birth, and has the same vaccination status, but without any other risk factors.

To reflect the fact that some risk factors have a bigger impact on risk, some values contribute more to the result than others (weighting).

Some risk factors also interact with others, and this means that the extent to which each risk factor contributes to a person’s overall risk level (and risk assessment result) will depend on the individual.

  • the risk from a health condition may be modified by other factors, such as a person’s age, sex or ethnicity
  • the impact of some risk factors depends on the presence or absence of other factors; for example, the risk associated with Type 2 diabetes increases with age
  • the impact of some risk factors increases with their severity; for example, a higher level of obesity means a higher risk
  • some risk factors affect men more than women, and vice versa
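The general idea described above (weighted risk factors, an interaction with age, and absolute versus relative risk) can be sketched as follows. This is a toy illustration only and is not the QCovid® equation: every weight, the interaction term, the proportional-hazards form and the function name are assumptions invented for this example, and the baseline risk is simply passed in by the caller.

```python
import math

# Invented weights for illustration only; these are NOT QCovid coefficients.
HYPOTHETICAL_WEIGHTS = {"type2_diabetes": 0.5, "severe_obesity": 0.3}
# Invented interaction: extra diabetes weight per year of age above 50.
HYPOTHETICAL_DIABETES_AGE_SLOPE = 0.02


def estimate_risk(baseline_risk, age, factors):
    """Return (absolute risk, relative risk) for a set of risk factors.

    baseline_risk is the assumed absolute risk of a person of the same age,
    sex and vaccination status who has no other risk factors.
    """
    if not 0.0 < baseline_risk < 1.0:
        raise ValueError("baseline_risk must be strictly between 0 and 1")
    # Weighting: each factor that is present adds its own value to the score.
    score = sum(HYPOTHETICAL_WEIGHTS[name] for name in factors)
    # Interaction: the diabetes contribution grows with age.
    if "type2_diabetes" in factors:
        score += HYPOTHETICAL_DIABETES_AGE_SLOPE * max(0, age - 50)
    # Assumed proportional-hazards form: survival is raised to exp(score).
    absolute = 1.0 - (1.0 - baseline_risk) ** math.exp(score)
    # Relative risk compares against the person without other risk factors.
    relative = absolute / baseline_risk
    return absolute, relative


absolute, relative = estimate_risk(0.001, 70, ["type2_diabetes"])
print(f"absolute risk {absolute:.4%}, relative risk {relative:.2f}")
```

With no risk factors the score is zero, so the absolute risk equals the baseline and the relative risk is 1, which matches the definitions of absolute and relative risk given above.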

QCovid® is a complex model which cannot be simply summarised. However, detailed information about the QCovid® risk factors and their relative weightings is available in research published in the British Medical Journal:

  • Living risk prediction algorithm (QCOVID) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study   
  • Risk prediction of covid-19 related death and hospital admission in adults after covid-19 vaccination: national prospective cohort study

View the list of health conditions and treatments considered by the COVID-19 Population Risk Assessment  and the COVID-19 Clinical Risk Assessment Tool.  

Factors not incorporated into QCovid®

As with any model like this, QCovid® can only estimate risk and cannot take all factors into account. There are several things that are important to consider that are not included in QCovid®, such as:

  • an individual’s behaviour (for example hand washing, wearing face coverings and visiting friends or family)
  • local infection rates
  • local and national lockdown measures

How QCovid® has been validated

QCovid® has been peer reviewed, which means independent academic experts have checked that the research is robust.

The research and evidence both for the original version of QCovid® and the latest version were published in the British Medical Journal (BMJ), a respected academic medical journal which only accepts quality peer-reviewed research (around 6% of research papers submitted are published).

The data used to develop the original QCovid® model was collected in early 2020. The model was later tested with new data and continued to perform well and predict outcomes accurately. It has since been reviewed and updated in line with the latest available data and research.

The  Office for National Statistics (ONS)  has independently validated the performance of QCovid®. The ONS has shown that the model performs well and accurately identifies patients at high risk from coronavirus. The NHS can therefore be confident that the model is robust and meets the highest standards of evidence.  

The QCovid® model has been embedded by the University of Oxford into the QCovid® Calculation Engine . This has been registered with the Medical and Healthcare products Regulatory Agency (MHRA) and categorised as a Class 1 medical device .

QCovid® is available online  with a license that means the website can be used in Great Britain by clinically trained professionals, for academic research and for the purpose of peer review.

Updates to QCovid®

QCovid® is a ‘living’ risk prediction model. This means that, although it is not updated automatically in real-time, it can be updated periodically by the University of Oxford using the latest data and as we learn more about coronavirus.

New versions of the QCovid® model were published in November 2021 and we incorporated these changes into our COVID-19 Clinical Risk Assessment Tool on 25 November 2021. 

How we've used QCovid®

COVID-19 Population Risk Assessment

In February 2021 we used QCovid® at a national level to identify people who may be at high risk (clinically extremely vulnerable) so that they could be added to the Shielded Patients List (SPL) and prioritised for vaccination.

Known considerations

We are aware of the following considerations relating to the code list used for COVID-19 Population Risk Assessment and the COVID-19 Clinical Risk Assessment Tool.

Last edited: 27 March 2024 2:43 pm

European Centre for Disease Prevention and Control

An agency of the European Union

Latest risk assessments on COVID-19

Risk assessment

  • SARS-CoV-2 in animals: susceptibility of animal species, risk for animal and public health, monitoring, prevention and control
  • Threat Assessment Brief: Implications for the EU/EEA of the spread of the SARS-CoV-2 Omicron XBB.1.5 sub-lineage
  • Intensified circulation of respiratory syncytial virus (RSV) and associated hospital burden in the EU/EEA
  • Assessment of the further spread and potential impact of the SARS-CoV-2 Omicron variant of concern in the EU/EEA, 19th update
  • Assessment of the further emergence of the SARS-CoV-2 Omicron VOC in the context of the ongoing Delta VOC transmission in the EU/EEA, 18th update
  • Threat Assessment Brief: Implications of the further emergence and spread of the SARS-CoV-2 B.1.1.529 variant of concern (Omicron) for the EU/EEA - first update
  • Threat Assessment Brief: Implications of the emergence and spread of the SARS-CoV-2 B.1.1.529 variant of concern (Omicron) for the EU/EEA
  • Assessment of the current SARS-CoV-2 epidemiological situation in the EU/EEA, projections for the end-of-year festive season and strategies for response, 17th update
  • Rapid Risk Assessment: Assessing SARS-CoV-2 circulation, variants of concern, non-pharmaceutical interventions and vaccine rollout in the EU/EEA, 16th update
  • Rapid Risk Assessment: COVID-19 outbreaks in long-term care facilities in the EU/EEA in the context of current vaccination coverage

Pandemic Risk Assessment

Finding the right response to the right risks. Right now.

When uncertainty is the new normal, standing still isn’t an option.

Our free assessment helps quantify your risk through data-driven analysis across multiple impact zones and risk factors. We’ll also show you tailored benchmarking by competitor or industry. Followed by recommendations for next steps. At no obligation.   

Our assessment offers actionable, data-driven analysis across four key impact zones and eight risk factors

Be ready for what’s next.

Our fast, data-driven assessment quantitatively determines your pandemic risks and offers focused resiliency plans to protect your business, customers and shareholders.

  • Open access
  • Published: 11 July 2024

Identification of risk factors of Long COVID and predictive modeling in the RECOVER EHR cohorts

  • Chengxi Zang   ORCID: orcid.org/0000-0002-8244-9551 1 ,
  • Edward J. Schenck 2 ,
  • Zhenxing Xu 1 ,
  • Yongkang Zhang 1 ,
  • Jiang Bian   ORCID: orcid.org/0000-0002-2238-5429 3 ,
  • Dmitry Morozyuk 1 ,
  • Dhruv Khullar 1 ,
  • Anna S. Nordvig 4 ,
  • Elizabeth A. Shenkman 3 ,
  • Russell L. Rothman 5 ,
  • Jason P. Block 6 ,
  • Kristin Lyman 7 ,
  • Yiye Zhang 1 ,
  • Jay Varma 1 ,
  • Mark G. Weiner   ORCID: orcid.org/0000-0001-5586-9940 1 ,
  • Thomas W. Carton 7 ,
  • Fei Wang   ORCID: orcid.org/0000-0001-9459-9461 1 &
  • Rainu Kaushal 1  

Communications Medicine volume 4, Article number: 130 (2024)

  • Epidemiology

SARS-CoV-2-infected patients may develop new conditions in the period after the acute infection. These conditions, the post-acute sequelae of SARS-CoV-2 infection (PASC, or Long COVID), involve a diverse set of organ systems. Limited studies have investigated the predictability of Long COVID development and its associated risk factors.

In this retrospective cohort study, we used electronic healthcare records from two large-scale PCORnet clinical research networks, INSIGHT (~1.4 million patients from New York) and OneFlorida+ (~0.7 million patients from Florida), to identify factors associated with having Long COVID, and to develop machine learning-based models for predicting Long COVID development. Both SARS-CoV-2-infected and non-infected adults were analysed during the period of March 2020 to November 2021. Factors associated with Long COVID risk were identified by removing background associations and correcting for multiple tests.

We observed complex association patterns between baseline factors and a variety of Long COVID conditions, and we highlight that severe acute SARS-CoV-2 infection, being underweight, and having baseline comorbidities (e.g., cancer and cirrhosis) are likely associated with increased risk of developing Long COVID. Several Long COVID conditions, e.g., dementia, malnutrition, chronic obstructive pulmonary disease, heart failure, PASC diagnosis U099, and acute kidney failure are well predicted (C-index > 0.8). Moderately predictable conditions include atelectasis, pulmonary embolism, diabetes, pulmonary fibrosis, and thromboembolic disease (C-index 0.7–0.8). Less predictable conditions include fatigue, anxiety, sleep disorders, and depression (C-index around 0.6).
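As a reading aid for the C-index values quoted above, the sketch below computes a concordance index for right-censored data in the style of Harrell's estimator. The source does not state which estimator was used, so this choice, the function name and the toy data are assumptions made for illustration.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-style C-index for right-censored survival data.

    times: follow-up time per patient
    events: 1 if the condition was observed, 0 if the patient was censored
    risk_scores: model output, where higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event strictly
            # before patient j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
print(concordance_index(times, events, [0.9, 0.7, 0.4, 0.1]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why conditions with a C-index around 0.6 are described as less predictable than those above 0.8.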

Conclusions

This observational study suggests that association patterns between investigated factors and Long COVID are complex, and the predictability of different Long COVID conditions varies. However, machine learning-based predictive models can help in identifying patients who are at risk of developing a variety of Long COVID conditions.

Plain language summary

Most people who develop COVID-19 make a full recovery, but some go on to develop post-acute sequelae of SARS-CoV-2 infection, commonly known as Long COVID. Up to now, we did not know why some people are affected by Long COVID whilst others are not. We conducted a study to identify risk factors for Long COVID and developed a mathematical modeling approach to predict those at risk. We find that Long COVID is associated with some factors such as experiencing severe acute COVID-19, being underweight, and having conditions including cancer or cirrhosis. Due to the wide variety of symptoms defined as Long COVID, it may be challenging to come up with a set of risk factors that can predict the whole spectrum of Long COVID. However, our approach could be used to predict a variety of Long COVID conditions.

Introduction

The global COVID-19 pandemic starting in late 2019 has led to more than 557 million infections and 6.4 million deaths as of July 14, 2022 1 . Growing scientific and clinical evidence has demonstrated the existence of potential post-acute and long-term effects of COVID-19, which affect multiple organ systems 2 and are referred to as post-acute sequelae of SARS-CoV-2 infection (PASC, or Long COVID). Recently there have been several retrospective cohort analyses identifying potential PASC using real-world patient data 3 , 4 , 5 , 6 , 7 . However, research on the predictability of PASC and their associated risk factors is still limited, and mixed results have been reported. Such predictive modeling research can help patients and healthcare professionals recognize the risk of PASC early and inform effective actions. Several studies found older age, higher severities in the acute phase of SARS-CoV-2 infection 8 , and pre-existing conditions (e.g., hypertension, obesity) may be associated with a higher risk of developing PASC 9 , 10 , 11 , 12 , 13 , 14 . By contrast, some studies also reported that baseline clinical characteristics or demographics were not associated with PASC 12 . Two main challenges may explain these seemingly conflicting findings: 1) Prior studies have typically been conducted using patient cohorts with small sample sizes including only a few hundred or thousand patients 10 , 15 , limiting the generalizability of conclusions derived, and 2) PASC conditions are highly heterogeneous concerning multi-organ manifestations 6 , 7 , thus their predictabilities and associated risk factors could be heterogeneous as well.

To fill in the knowledge gap and address these challenges, we conducted a data-driven study of the predictability of a broad spectrum of incident PASC conditions and identified their associated factors. We used two large electronic health records (EHR) cohorts from the PCORnet clinical research networks (CRN) 16 , namely INSIGHT 17 , covering patients in the New York City (NYC) area, and OneFlorida+ 18 , including patients from Florida. INSIGHT and OneFlorida+ were used for primary analyses and validation, respectively. A list of PASC conditions covering multiple organ systems was selected based on our previous findings using a high-throughput data-driven analysis pipeline and on existing evidence or clinical knowledge (see the Methods section for a detailed list of PASC diagnoses) 6 , 7 , 19 . Baseline covariates included basic demographics (e.g., age, gender, race, ethnicity), socioeconomic status, healthcare utilization history, body mass index, the period of infection, comorbidities, and the care settings in the acute phase including hospitalization and ICU stay. We used a regularized multivariate Cox proportional hazard model to uncover association maps between the abovementioned baseline covariates and different incident PASC conditions. Of note, the factors associated with PASC conditions were identified by removing background associations observed among non-infected patients and by applying significance levels corrected for multiple testing. We observed that severe acute SARS-CoV-2 infection, older age (≥75), female sex, extremes of weight, and having baseline comorbidities (e.g., cancer, chronic kidney disease, cirrhosis, coagulopathy, pregnancy, pulmonary circulation disorders) were associated with increased risk of a list of incident PASC patterns. Furthermore, we highlight that severe acute infections, being underweight, and having baseline conditions including cancer or cirrhosis are associated with having at least one PASC condition.
We further developed machine learning-based prediction models to identify patients who were more likely to develop particular incident PASC conditions with their baseline characteristics and acute severity. We compared the performance of machine learning models with different levels of complexity, including regularized Cox proportional hazard model, regularized logistic regression, gradient boosting machine, and deep neural network in both the survival analysis setting and binary classification setting. We observed that it might be difficult to predict patients who will have at least one PASC condition (denoted as Any PASC) because a variety of PASC conditions were less predictable and were less associated with upfront disease severity. However, a range of PASC conditions were reliably predictable (e.g., dementia, myopathies, cerebral ischemia, COPD, heart failure, hypotension, malnutrition, acute kidney failure, and non-specific PASC diagnoses U099).

In all, complex association patterns and a lack of predictability of several PASC conditions may represent a challenge for managing heterogeneous PASC conditions. However, leveraging machine learning-based predictive models and EHR databases can help in identifying patients who are at risk of developing different incident PASC conditions. Among complex association patterns, we highlight severe acute infections, being underweight, and having baseline conditions including cancer or cirrhosis are likely associated with increased risk of having incident PASC in the post-acute phase, suggesting further investigation of the association between COVID-19 treatment in adults who are at high risk for severe COVID-19 and the risk of PASC beyond the acute phase. This study is part of the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, which seeks to understand, treat, and prevent the post-acute sequelae of SARS-CoV-2 infection (PASC).

This study leveraged two large-scale de-identified electronic healthcare record warehouses from the INSIGHT Clinical Research Network (CRN) 17 and the OneFlorida+ CRN 18 . The INSIGHT CRN contained longitudinally linked data of approximately 12 million patient encounters at hospitals in the New York City metropolitan area, and the OneFlorida+ CRN contained the EHR data of nearly 15 million patients from Florida and selected cities in Georgia and Alabama. The use of the INSIGHT data was approved by the Institutional Review Board (IRB) of Weill Cornell Medicine following NIH protocol 21-10-95-380 with protocol title: Adult PCORnet-PASC Response to the Proposed Revised Milestones for the PASC EHR/ORWD Teams (RECOVER). The use of the OneFlorida+ data for this study was approved under the University of Florida IRB number IRB202001831. All EHRs used in this study were appropriately deidentified and thus no informed consent from patients was obtained.

Definition of Long COVID

The current definition of Post-acute Sequelae of SARS-CoV-2 (PASC, or Long COVID) in the RECOVER protocols is “ongoing, relapsing, new symptoms, or other health effects occurring four or more weeks after the acute phase of SARS-CoV-2 infection” 7 , 20 . We examined a broad list of likely PASC conditions as outcomes, including depressive disorders, anxiety disorder, unspecified post-COVID-19 conditions encoded by the ICD-10 code U099 (in effect since October 2021 and we used ICD-10 code B948 before the implementation of U099) 21 , fever, malaise and fatigue, dizziness, malnutrition, fluid disorders, diabetes mellitus, edema, hair loss, paresthesia, dermatitis, chronic obstructive pulmonary disease (COPD), atelectasis, pulmonary fibrosis, dyspnea, acute pharyngitis, acute bronchitis, dementia, myopathies, cerebral ischemia, encephalopathy, cognitive problems, sleep disorders, headache, muscle weakness, fibromyalgia, joint pain, acute kidney failure, cystitis, genitourinary problems, constipation, gastroparesis, abdominal pain, gastroesophageal reflux disease, heart failure, hypotension, pulmonary embolism, thromboembolism, abnormal heartbeat, chest pain, and anemia. We compiled this list based on both our previous study 6 , 7 , 19 , 22 , 23 and evidence from other literature 3 , 4 , 7 . An incident condition is defined in the SARS-CoV-2 infected patients who had the condition from 31 days to 180 days after the SARS-CoV-2 infection but did not have the condition three years to seven days before. See Supplementary Data  1 for the diagnostic code list.
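The incident-condition rule above can be sketched in a few lines. This is a minimal illustration, not the study's pipeline: the function name `is_incident` is hypothetical, and "three years" is approximated as 1095 days, which is an assumption.

```python
from datetime import date, timedelta

# Window boundaries, in days relative to the index (infection) date.
BASELINE_START_DAYS = 3 * 365   # assumption: "three years" taken as 1095 days
BASELINE_END_DAYS = 7
POST_ACUTE_START_DAYS = 31
POST_ACUTE_END_DAYS = 180


def is_incident(index_date, diagnosis_dates):
    """Return True if a condition appears in the post-acute window
    (day 31 to day 180) and was absent from the baseline window
    (three years to seven days before the index date)."""
    in_baseline = False
    in_post_acute = False
    for d in diagnosis_dates:
        offset = (d - index_date).days  # negative means before infection
        if -BASELINE_START_DAYS <= offset <= -BASELINE_END_DAYS:
            in_baseline = True
        if POST_ACUTE_START_DAYS <= offset <= POST_ACUTE_END_DAYS:
            in_post_acute = True
    return in_post_acute and not in_baseline
```

A diagnosis first recorded on day 45 counts as incident, whereas the same diagnosis also recorded 100 days before infection does not.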

Eligibility criteria and study population

Our study included adult patients aged 20 years or older with at least one SARS-CoV-2 polymerase chain reaction (PCR) test or antigen laboratory test between March 1st, 2020, and November 30th, 2021. We further required at least one diagnosis code within three years to seven days before the index date (referred to as the baseline period), and at least one diagnosis code from day 31 to day 180 after the index date (referred to as the post-acute phase or follow-up period), to ensure that patients were connected to the healthcare system and were alive beyond the acute phase. We followed each patient from day 31 after his/her index date until the day of the first target outcome, documented death, the latest date of any documented records in the database, 180 days after the baseline, or the end of our observational window (December 31, 2021), whichever came first. We leveraged two exposure groups: a) the SARS-CoV-2 infected group, for the association study and predictive modeling, and b) the non-infected group, to rule out background associations that were not specific to COVID-19 infection. The infected group included patients with any positive SARS-CoV-2 PCR test or positive antigen laboratory test. The index date was defined as the date of the first documented positive PCR or antigen test for patients in the infected group. The non-infected group included patients whose SARS-CoV-2 PCR or Antigen tests were all negative throughout the entire study period and with no documented COVID-19-related diagnoses at any time. The index date for patients in the non-infected group was defined as the date of the first negative PCR or antigen test. The patient inclusion and exclusion cascades are illustrated in Fig.  1 .

figure 1

(a) The INSIGHT cohort, and (b) the OneFlorida+ cohort.
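The follow-up rule stated in the eligibility criteria (start at day 31, stop at the first of outcome, death, last record, day 180, or December 31, 2021) can be sketched as below. The function name `follow_up` and its return convention are hypothetical, and the sketch assumes that "180 days after the baseline" means 180 days after the index date.

```python
from datetime import date, timedelta

STUDY_END = date(2021, 12, 31)  # end of the observational window


def follow_up(index_date, first_outcome=None, death=None, last_record=None):
    """Return (days_at_risk, event_observed) for one patient and one outcome.

    Follow-up starts on day 31 after the index date and ends at the earliest
    of: first outcome, death, last documented record, day 180, or STUDY_END.
    """
    start = index_date + timedelta(days=31)
    # Assumption: the 180-day horizon is counted from the index date.
    candidates = [index_date + timedelta(days=180), STUDY_END]
    if death is not None:
        candidates.append(death)
    if last_record is not None:
        candidates.append(last_record)
    censor_date = min(candidates)
    if first_outcome is not None and start <= first_outcome <= censor_date:
        return (first_outcome - start).days, True
    # No outcome before censoring: the patient is censored.
    return max((censor_date - start).days, 0), False
```

The resulting pair is the usual (duration, event indicator) input expected by survival models such as the Cox proportional hazard model.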

We built a list of 89 covariates that are potentially associated with PASC, including basic demographics (e.g., age, gender, race, ethnicity), socioeconomic status in terms of Area Deprivation Index (ADI) 21 , healthcare utilization history, body mass index, the timing of infection, comorbidities, and the care settings in acute phase including hospitalization and ICU admission. For each of the categorical covariates, we defined its reference group the same as prior studies for acute SARS-CoV-2 infection 8 . We accounted for these baseline covariates by multivariate Cox proportional hazard model in our association analyses.

We collected clinical features in the baseline period (3 years to 1 week before lab-confirmed SARS-CoV-2 infection) and the severity of acute infection (1 week before to 2 weeks after lab-confirmed SARS-CoV-2 infection). Age was categorized into 20–39 years, 40–54 years, 55–64 years, 65–74 years, and 75+ years groups. We set 55–64 as the reference group. Gender was grouped into female and male (reference). Only three patients in INSIGHT had other or missing gender, and they were excluded. Race was categorized into Asian, Black or African American, White (reference), and other or missing. Ethnicity was grouped into Hispanic, not Hispanic (reference), and other/missing. We used the national-level ADI to capture the socioeconomic disadvantage of patients' residential neighborhoods 24 . The ADI is a ranking from 1 to 100, with 1 and 100 representing the lowest and the highest level of disadvantage, respectively, so larger ADI values indicate a more socioeconomically deprived status. Missing ADI values were imputed with the median ADI per site. We grouped ADI into five categories and set the ADI category 1–20 as the reference group. Baseline healthcare utilization up to three years before the index date was measured according to care setting. For inpatient, outpatient, and emergency department encounters, we categorized the number of visits in each setting into 0 visits (reference group), 1 or 2 visits, and 3 or more visits. We also considered the infection time, which was grouped into March 2020–June 2020, July 2020–October 2020, November 2020–February 2021, March 2021–June 2021, and July 2021–November 2021. We set the first wave of the pandemic from March 2020 to June 2020 as the reference group. Of note, the third wave from July 2021 to November 2021 was dominated by the Delta variant.
Body mass index (BMI) was grouped according to the WHO classification: BMI < 18.5 as underweight, BMI 18.5–24.9 as normal weight (reference), BMI 25–29.9 as overweight, and BMI ≥ 30 as obese, with missing values set as a separate group.
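The covariate groupings described above translate directly into small lookup functions. The sketch below is illustrative only: all function names are hypothetical, and the equal-width ADI bins (1–20, 21–40, and so on) are an assumption, since the text names only the 1–20 reference category.

```python
def age_group(age):
    """Age bands used as categorical covariates; 55-64 is the reference."""
    if age < 20:
        raise ValueError("the study population is aged 20 years or older")
    if age <= 39:
        return "20-39"
    if age <= 54:
        return "40-54"
    if age <= 64:
        return "55-64"
    if age <= 74:
        return "65-74"
    return "75+"


def bmi_group(bmi):
    """WHO-style BMI classes; missing values form their own group."""
    if bmi is None:
        return "missing"
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:      # values between 24.9 and 25 are treated as normal here
        return "normal"
    if bmi < 30:
        return "overweight"
    return "obese"


def adi_group(adi):
    """Assumption: five equal-width bins over the 1-100 ADI ranking."""
    if not 1 <= adi <= 100:
        raise ValueError("ADI is a ranking from 1 to 100")
    lower = ((adi - 1) // 20) * 20 + 1
    return f"{lower}-{lower + 19}"


def visit_group(n_visits):
    """Utilization per care setting: 0 (reference), 1-2, or 3+ visits."""
    if n_visits == 0:
        return "0"
    return "1-2" if n_visits <= 2 else "3+"
```

Each returned label would then be one-hot encoded against its reference group before entering the Cox model.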

A wide range of baseline clinical comorbidities was collected, based on the Elixhauser comorbidities, conditions recommended by our clinician group, and related medications. Patients were ascertained as having a condition if they had at least two corresponding diagnoses documented during the baseline period 7 . We also counted the number of pre-existing conditions and grouped them into no comorbidity (reference), 1, 2, 3, 4, and 5 or more.
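A minimal sketch of the "at least two diagnoses" ascertainment rule and the comorbidity-count grouping follows. The function names are hypothetical, and the sketch simply counts diagnosis records; whether the two diagnoses must fall on different dates is not stated in the text and is not modeled here.

```python
from collections import Counter


def ascertained_conditions(baseline_diagnoses, min_count=2):
    """Return the set of conditions documented at least `min_count` times
    in the baseline period. `baseline_diagnoses` is a list of condition
    labels, one entry per documented diagnosis."""
    counts = Counter(baseline_diagnoses)
    return {cond for cond, n in counts.items() if n >= min_count}


def comorbidity_count_group(n_conditions):
    """Group the number of pre-existing conditions: 0 (reference), 1-4, 5+."""
    return "5+" if n_conditions >= 5 else str(n_conditions)
```

For example, a patient with two cancer diagnoses and a single chronic kidney disease diagnosis would be ascertained as having cancer only, and would fall in the "1" group.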

Association analysis

To uncover potentially complex association maps between baseline conditions and various incident PASC conditions, for each target PASC condition, we performed association analysis using the following two steps. In step I (multivariate association analysis for each PASC condition), we built a separate multivariate Cox proportional hazard model for each PASC condition in SARS-CoV-2 infected patients to assess associations between covariates and the time to the first incident PASC event of interest or censoring in the follow-up period (31–180 days after COVID-19 confirmation). The censoring event is defined as the earliest event of documented death, loss to follow-up in the database (the date of the last documented record in the EHR systems), 180 days after the baseline, or the end of our observational window (December 31, 2021). Fully adjusted hazard ratios (aHR) of each covariate and target PASC event were estimated. In step II (marginal associations due to SARS-CoV-2 infection), we built another multivariate Cox proportional hazard model for all the patients, containing both SARS-CoV-2 infected and non-infected patients. The model inputs include two parts. One is the set of covariates. The other is the set of interaction terms, defined as the product of each covariate and SARS-CoV-2 infection status (1 for SARS-CoV-2 infected patients and 0 for non-infected control patients). In this way, the coefficient of a particular covariate captured its association with the outcome condition for patients who were not infected by SARS-CoV-2, and the coefficient of its corresponding interaction term captured the “quantitative modifications” of such association for patients who were infected by SARS-CoV-2. The fully adjusted HR of each covariate and interaction term was estimated on the combined infected and non-infected patients.
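The Step II model can be written out to make the role of the interaction terms explicit. The notation below (\(x_j\) for covariate \(j\), \(I\) for infection status, \(\beta_j\) and \(\gamma_j\) for the covariate and interaction coefficients) is ours and is only a sketch of the two input parts described above; the text does not state whether a separate main effect for \(I\) was included, so none is shown.

```latex
h(t \mid \mathbf{x}, I) = h_0(t)\,
  \exp\!\Big( \sum_{j} \beta_j x_j + \sum_{j} \gamma_j\, x_j I \Big),
  \qquad I \in \{0, 1\}

\mathrm{aHR}_j^{\text{non-infected}} = e^{\beta_j}, \qquad
\mathrm{aHR}_j^{\text{infected}} = e^{\beta_j + \gamma_j}, \qquad
\frac{\mathrm{aHR}_j^{\text{infected}}}{\mathrm{aHR}_j^{\text{non-infected}}} = e^{\gamma_j}
```

Under this reading, an interaction-term hazard ratio \(e^{\gamma_j} > 1\) means the association between covariate \(j\) and the outcome is stronger among infected patients than the background association among non-infected patients.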

A covariate was identified as a likely risk factor of a particular PASC condition if it satisfied the following four criteria: C1, the adjusted hazard ratio (aHR) estimated from the infected patients in Step I was greater than 1 when compared with the reference group; C2, the p -value of the above aHR by the Wald Chi-Square test was smaller than 0.000562, which was corrected by the Bonferroni method 25 for multiple testing; C3, the aHR of the interaction term of the corresponding covariate, namely the marginal increased risk due to SARS-CoV-2 infection, was greater than 1 in Step II; and C4, the p -value of the aHR of the interaction term by the Wald Chi-Square test in the second Cox model was smaller than 0.05. Of note, Criteria 3 and 4 aim to identify risk associations with a portion attributable to the SARS-CoV-2 infection, beyond the background associations in non-infected patients. In summary, two features distinguish our method from the existing association analysis literature: it uncovers potentially complex association maps, and it rules out background associations not specific to SARS-CoV-2 infection.
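The four screening criteria amount to a simple conjunction of threshold checks. In the sketch below the function name `is_likely_risk_factor` is hypothetical. The threshold 0.000562 is consistent with dividing 0.05 by the 89 covariates, but the text does not state the divisor explicitly, so that derivation is an assumption.

```python
ALPHA = 0.05
N_COVARIATES = 89  # assumption: the Bonferroni divisor is the covariate count
BONFERRONI_THRESHOLD = ALPHA / N_COVARIATES  # approximately 0.000562


def is_likely_risk_factor(ahr_step1, p_step1, ahr_interaction, p_interaction):
    """Apply criteria C1-C4 to one covariate and one PASC condition."""
    c1 = ahr_step1 > 1                    # higher hazard among infected patients
    c2 = p_step1 < BONFERRONI_THRESHOLD   # significant after correction
    c3 = ahr_interaction > 1              # excess risk over non-infected patients
    c4 = p_interaction < ALPHA            # excess risk is itself significant
    return c1 and c2 and c3 and c4
```

A covariate with a large Step I aHR but an interaction aHR below 1 is therefore rejected, because its association is no stronger in infected patients than in the non-infected comparison group.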

Machine learning-based predictive modeling

To uncover the predictability of different PASC conditions, we built predictive models for each PASC condition by examining machine learning models with varying complexity (including regularized Cox proportional hazard model, regularized logistic regression, gradient boosting machine, and deep neural networks) in both survival analysis and binary classification settings.

For the survival analysis setting, we used a multivariate Cox proportional hazard model with L2 norm regularization to predict the time to the outcome event. For the binary classification setting, the occurrence of the target event in the follow-up period was labeled as 1 and 0 otherwise. We used logistic regression with L2 norm regularization, gradient boosting machine with random forest base learner, and deep feed-forward neural network. For each of the abovementioned models, the best model was selected by grid search (see details in the following sensitivity analysis paragraph) in a pre-defined hyperparameter space through repeated cross-validation (ten times, five folds), detailed as follows: a) for the regularized logistic regression, we adopted the L2-norm penalty and searched for the inverse of regularization strength from \({10}^{-3}\) to \({10}^{3}\) with 0.5 as the sampling step size; b) for the gradient boosting machine with a random forest as the base learner 26 , we searched over the maximum depth (3, 4, 5), the maximum number of leaves in one tree (10, 20, 30), and the minimum number of data points in one leaf (30); c) for the deep feed-forward neural network, we used the ReLU (Rectified Linear Unit) activation function for the hidden layers and searched over the hidden layers ((32,), (64,), (128,), (32, 32), (64, 64), (128, 128)) and the learning rate (0.001, 0.01, 0.1). In the repeated cross-validation process, we set one of the folds as the test set and the rest of the data as the training set.
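The search spaces listed above are small enough to enumerate explicitly. The sketch below only builds the grids and counts the resulting fits; it does not train any model. The dictionary key names are illustrative, and reading the 0.5 step as a step in the base-10 exponent is an assumption.

```python
from itertools import product

# a) Regularized logistic regression: inverse regularization strength.
# Assumption: the 0.5 sampling step applies to the exponent, giving
# 10^-3, 10^-2.5, ..., 10^3 (13 values).
logreg_C = [10 ** (-3 + 0.5 * k) for k in range(13)]

# b) Gradient boosting machine with a random forest base learner.
gbm_grid = [
    {"max_depth": depth, "num_leaves": leaves, "min_data_in_leaf": min_data}
    for depth, leaves, min_data in product((3, 4, 5), (10, 20, 30), (30,))
]

# c) Feed-forward neural network with ReLU hidden layers.
hidden_layer_options = [(32,), (64,), (128,), (32, 32), (64, 64), (128, 128)]
nn_grid = [
    {"hidden_layers": layers, "learning_rate": lr}
    for layers, lr in product(hidden_layer_options, (0.001, 0.01, 0.1))
]

# Repeated cross-validation: ten repeats of five folds.
N_REPEATS, N_FOLDS = 10, 5
fits_per_configuration = N_REPEATS * N_FOLDS
```

Under these assumptions the grids hold 13, 9, and 18 configurations respectively, and each configuration is fitted 50 times per PASC condition.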

The concordance index (C-index) and the area under the receiver operating characteristic curve (AUROC) were used to evaluate survival prediction performance and binary prediction performance, respectively. Both measures range from 0 to 1, with 0.5 indicating random guessing and 1 indicating perfect prediction. The 95% confidence interval of the final performance was estimated by 1000-times bootstrapping performance on each of the testing datasets in repeated cross-validation.
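For readers unfamiliar with the C-index, a plain-Python sketch of Harrell's concordance index and a percentile bootstrap interval is given below. It is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, higher risk scores are assumed to mean earlier events, and the percentile method is one common way to form a bootstrap interval.

```python
import random


def c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter time than patient j. The pair is concordant when
    patient i also has the higher risk score; ties in score count as 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def bootstrap_ci(times, events, risk_scores, n_boot=1000, seed=0):
    """Percentile 95% interval from resampling patients with replacement."""
    rng = random.Random(seed)
    n = len(times)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            stats.append(c_index([times[k] for k in idx],
                                 [events[k] for k in idx],
                                 [risk_scores[k] for k in idx]))
        except ValueError:
            continue  # skip resamples without any comparable pair
    if not stats:
        raise ValueError("no usable bootstrap resamples")
    stats.sort()
    m = len(stats)
    return stats[int(0.025 * m)], stats[min(m - 1, int(0.975 * m))]
```

The pairwise loop is quadratic in the number of patients, so it is suitable for illustration on small samples rather than for cohorts of this study's size.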

Stratified analysis

The stratified analysis was conducted by stratifying patients by their severity in the acute infection phase (hospitalized or non-hospitalized) and then performing statistical analysis within each stratum. The non-infected control patients were also stratified according to whether they were hospitalized during the acute period (1 day before to 30 days after the index date), to capture background associations within each subgroup population.

Sensitivity analysis

To obtain robust conclusions, we conducted the following sensitivity analyses. In addition to the fully adjusted association analysis, we also conducted univariate association analysis by using a univariate Cox model for each covariate. We also tested the impact of lifting Step II, namely not ruling out background associations not specific to SARS-CoV-2 infection, on the identified risk associations. On the other hand, we also investigated a shortened list of associations obtained by requiring the marginally increased risk to be significant. Specifically, we required the p -value of the Wald Chi-Square test of the interaction terms in the second Cox proportional hazard model to be <0.05. For the predictive modeling, we tested how different feature engineering methods affect predictive performance. Rather than clinician-selected baseline predictors, we used a higher-dimensional feature engineering approach based on the first three characters of ICD-10 codes and medications at the ingredient level. These ICD-10 diagnosis codes and medications were selected to construct the input feature vectors of the prediction model based on a significant difference (P-value less than 0.0001 by Fisher’s exact test) between patients with and without the PASC condition. After the feature selection process, the selected ICD-10 diagnosis codes, medications, and collected baseline covariates were combined to form the input features for every PASC condition.
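The high-dimensional feature selection step can be sketched with the standard library alone. The helper names below are hypothetical, the 2×2 table layout (feature present/absent by PASC condition present/absent) is our reading of the text, and the two-sided p-value uses the common "sum of tables no more probable than the observed one" definition, which is an assumption about how the test was run.

```python
from math import comb


def icd10_prefix(code):
    """Truncate an ICD-10 code to its first three characters, e.g. E11.9 -> E11."""
    return code.replace(".", "").upper()[:3]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def select_features(tables, alpha=1e-4):
    """Keep features whose 2x2 table differs significantly between groups.

    `tables` maps a feature name to (a, b, c, d): counts with the feature
    among PASC-positive and PASC-negative patients, then counts without it.
    """
    return [name for name, (a, b, c, d) in tables.items()
            if fisher_exact_two_sided(a, b, c, d) < alpha]
```

For example, `select_features({"E11": (40, 10, 10, 40), "Z00": (25, 25, 25, 25)})` keeps only `"E11"`, because the second table shows no difference between the groups.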

Validation analysis and generalizability

To get generalizable conclusions, we further replicated the abovementioned association analyses and predictive analyses in the OneFlorida+ cohort. The cohort selection and modeling strategies were the same as our primary analyses on the INSIGHT cohort.

Reporting summary

Further information on research design is available in the  Nature Portfolio Reporting Summary linked to this article.

Factors associated with different PASC conditions

Here we analyzed association maps between baseline covariates and the risk of developing a range of incident PASC conditions. We quantified the association by three metrics including a) the unadjusted hazard ratio (HR), b) the fully adjusted hazard ratio (aHR), and c) the fully aHR filtered by significance corrected by multiple tests and positive marginal risks over the control group. We developed our primary results on the INSIGHT cohort (See the validation results on the OneFlorida+ cohort in the validation section) which included 35,275 adult patients with lab-confirmed SARS-CoV-2 infection and 326,126 non-infected control patients from March 2020 to November 2021 (see the inclusion-exclusion cascade in Fig.  1 ). Overall, among 35,275 enrolled SARS-CoV-2 infected patients in the INSIGHT cohort, 17,571 (49.8%) of them had at least one incident potential PASC condition (Table  1 ). The univariate HR and multivariate aHR between the covariates and the risk of getting at least one PASC condition were summarized in Table  2 . Moreover, Fig.  2 summarizes fully adjusted aHRs that were significant under multiple tests ( p -value < 0.000562 by the Wald Chi-Square test) and showed positive marginal risks over the control group. We summarize association results as follows.

figure 2

The adjusted hazard ratios of different factors were reported by applying screening criteria C1–C3. The associations whose marginal increased risks over non-infected control patients were also significant under criterion C4 were highlighted in purple squares. See details of C1–C4 in the method-association analysis section. The color bar represents different risk levels. Any PASC represents having at least one of the conditions below. The color panels represent different organ systems, including (from top to bottom): the nervous system or mental disorders, skin, respiratory system, circulatory system, endocrine and metabolic, digestive system, genitourinary system, and other signs. ICD-10 codes B948 (sequelae of other specified infectious and parasitic diseases) and U099 (post-COVID-19 condition, unspecified) were used to capture general PASC diagnoses. Source data are provided in Supplementary Data 2.

The severity of acute infection

Increased severity of the acute SARS-CoV-2 infection (according to the care settings) was associated with a higher risk of being diagnosed with incident conditions in the post-acute period. Overall, a higher risk of getting any incident diagnosis was observed in patients who were hospitalized during the acute phase (aHR 1.29 (1.24–1.33)) or in ICU (aHR 1.40 (1.32–1.49)) compared to patients who were not hospitalized during the acute phase (as a reference group, see Table  2 and Fig.  2 ). Specifically, as summarized in Fig.  2 , severe acute infection was associated with a wide range of incident PASC conditions compared to non-hospitalized patients: the hospitalized patients showed higher risk of being diagnosed with sleep disorders (1.2-fold), chronic obstructive pulmonary disease (COPD, 1.7-fold), pulmonary fibrosis (2.1-fold), dyspneas (or shortness of breath, 1.8-fold), pulmonary embolism (1.4-fold), chest pain (1.3-fold), malnutrition (2.0-fold); while the ICU patients showed a higher risk of being diagnosed with myopathy (4.7-fold), cognitive problems (2.0-fold), anxiety disorder (1.8-fold), pulmonary fibrosis (2.1-fold), malnutrition (2.6-fold), malaise and fatigue (2.1-fold). In addition, concerning being diagnosed with general PASC codes U099/B948 (the U099 code, namely unspecified post-COVID-19 condition, was effective since 10/1/2021), hospitalized or ICU patients had a 2.2-fold and 4.3-fold higher risk respectively than non-hospitalized patients.

Patients aged 75 or older showed an increased risk of being diagnosed with a wide range of potential PASC conditions in the post-acute infection phase, including dementia (5.8-fold higher), cerebral ischemia (2.1-fold), and malnutrition (1.8-fold), compared to patients who were 55–64 years old (as reference).

Gender and race

Female patients exhibited a 4.3-fold increased risk of being diagnosed with incident hair loss in the post-acute infection period compared to male patients. Patients who self-identified as Black exhibited a 1.9-fold increased risk of being diagnosed with incident diabetes mellitus compared to patients who self-identified as White, but this association was not significant regarding Criterion 4 (method-association analysis section).

Body mass index

Patients who were underweight (BMI < 18.5 kg/m²) were at a 1.2-fold higher risk of being diagnosed with any incident PASC conditions than those with normal BMI (BMI from 18.5 to 24.9 kg/m²). Specifically, underweight patients were at increased risk of being diagnosed with malnutrition (1.4-fold) and diabetes mellitus (1.6-fold).

Period of infection

We observed that patients who got infected from July 2021 to November 2021, which was dominated by the Delta variant of SARS-CoV-2 27 , showed an increased risk of being diagnosed with incident acute pharyngitis (3.2-fold) in the post-acute infection period compared to patients who got infected during March 2020–June 2020 (the 1st wave) as the reference period.

Pre-existing conditions

As shown in Fig. 2, having five or more baseline conditions was associated with an increased risk of potential PASC diagnoses in the post-acute phase, including anemia (2.0-fold), fluid disorders (2.0-fold), and acute kidney failure (1.3-fold), compared to patients without documented baseline conditions. Specifically, cancer patients showed increased risk of several post-acute conditions including atelectasis, fever, anemia, pulmonary fibrosis (1.5-fold), hypotension (1.7-fold), malnutrition (2.2-fold), and fibromyalgia (1.3-fold) compared to those without cancer diagnoses at baseline. Those with baseline pulmonary circulation disorder showed a 3.3-fold increased risk of pulmonary embolism compared to patients without this condition at baseline. Patients with weight loss at baseline were at a higher risk of being diagnosed with unspecified PASC diagnoses U099/B948 (2.2-fold) and malaise and fatigue (1.4-fold) than patients without the weight loss diagnosis at baseline.

Associations stratified by acute care settings

We further conducted analyses to examine the associations between baseline factors and incident PASC conditions among subpopulations stratified by their care settings in the acute phase (hospitalized versus non-hospitalized). The same analytics and screening criteria were used in the subgroup analyses as in the primary analyses. Different association patterns were observed across the two settings, as shown in Fig. 3. For patients who were not hospitalized during acute infection, older age ( \(\ge \! 75\) years old) and baseline cancer were associated with an increased risk of being diagnosed with a range of conditions in the post-acute period. In contrast, for patients who were hospitalized during their acute infection, being in the ICU or having baseline conditions including dementia or pulmonary circulation disorder was associated with an increased risk of being diagnosed with different PASC conditions. Female patients had an increased risk of hair loss regardless of their acute care settings.

figure 3

(a) Non-hospitalized and (b) hospitalized during acute infection. The adjusted hazard ratios of different factors were reported by applying screening criteria C1–C3. The associations whose marginal increased risks over non-infected control patients were also significant under criterion C4 were highlighted in purple squares. See details of C1–C4 in the Method-Association analysis section. The color bar represents different risk levels. Any PASC represents having at least one of the conditions below. The color panels represent different organ systems, including (from top to bottom): the nervous system or mental disorders, skin, respiratory system, circulatory system, endocrine and metabolic, digestive system, genitourinary system, and other signs. ICD-10 codes B948 (sequelae of other specified infectious and parasitic diseases) and U099 (post-COVID-19 condition, unspecified) were used to capture general PASC diagnoses. Source data are provided in Supplementary Data 2.

Prediction performance

Figure 4 summarizes the predictability of different PASC conditions, quantified by the Concordance index (C-index) 28 with a 95% confidence interval using a regularized Cox model (Method section). Similar heterogeneous predictive patterns from other machine learning models were also observed and summarized in Supplementary Fig. 4. We observed heterogeneous predictive performance in predicting different PASC conditions: a) conditions with top predictive performance, defined as C-index \(\ge\) 0.8, are dementia, myopathies, cerebral ischemia, COPD, heart failure, hypotension, malnutrition, acute kidney failure, and non-specific PASC diagnoses U099/B948; b) conditions with moderate predictive performance, defined as C-index in [0.7, 0.8), are encephalopathy, hair loss, atelectasis, pulmonary fibrosis, pulmonary embolism, thromboembolism, anemia, fluid disorders, and diabetes mellitus; and c) other conditions, such as sleep disorders, headache, depressive disorders, anxiety disorder, paresthesia, dermatitis, joint pain, malaise and fatigue, and dizziness, were less predictable, with a C-index <0.7.
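The three predictability tiers defined above reduce to two cut-offs on the C-index. The sketch below is only a restatement of those cut-offs; the function name and tier labels are hypothetical.

```python
def predictability_tier(c_index):
    """Map a C-index to the tiers used in the text:
    top (>= 0.8), moderate ([0.7, 0.8)), or less predictable (< 0.7)."""
    if not 0.0 <= c_index <= 1.0:
        raise ValueError("C-index must lie in [0, 1]")
    if c_index >= 0.8:
        return "top"
    if c_index >= 0.7:
        return "moderate"
    return "less predictable"
```

Note that the tiers are assigned on the point estimate; a condition whose confidence interval straddles a cut-off could plausibly fall in either neighboring tier.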

figure 4

The C-index with 95% confidence intervals as error bars was reported. Any PASC represents having at least one of the conditions below. The bars in different colors were organized by their organ systems including (from left to right): the nervous system or mental disorders, skin, respiratory system, circulatory system, endocrine and metabolic, digestive system, genitourinary system, and other signs. The conditions with a C-index in [0.8, 1) were highlighted with “o” texture, and those with a C-index in [0.7, 0.8) were highlighted with “\” texture. The numbers at the top of the bars denote the rank of the predictability quantified by the C-index among all the bars. The 95% confidence interval was estimated by 1000-times bootstrapping performance on the testing dataset in repeated cross-validation. Source data are provided in the Supplementary Data  2 .

We examined the impact of criteria 3 and 4 (method-association analysis section), which require an identified association to carry a higher risk in SARS-CoV-2 infected patients than in non-infected patients. As shown in Supplementary Fig. 1, more associations are identified when the aHR of the interaction term is allowed to be smaller than 1 (without Criterion 3, highlighted in red squares), and many of these associations may not be relevant to SARS-CoV-2 infection. Taking patients with pre-existing cancer as an example, they were associated with a higher risk of being diagnosed with encephalopathy, thromboembolism, fluid disorders, edema, acute kidney failure, and malaise and fatigue in the post-acute period after SARS-CoV-2 infection. However, these associations might be identified for non-infected cancer patients as well. Therefore, criterion 3 is necessary for filtering out the associations that are not specific to SARS-CoV-2 infection. On the other hand, the identified associations including atelectasis, anemia, constipation, and fever can be further filtered out if we require criterion 4 (highlighted in green squares in Supplementary Fig. 1), namely that the marginally increased risk of those associations be significant ( p -value < 0.05, the Wald Chi-Square test of the interaction terms in the second Cox proportional hazard model when using control patients).

We also tested to what extent the predictability of incident potential PASC conditions is affected by different machine learning models. We investigated a range of machine learning models with different complexities, including regularized logistic regression models, gradient boosting machines, and feed-forward deep neural networks. As shown in Supplementary Fig.  2 , we observed similar performance of these different models, and the heterogeneous predictability patterns were still observed as in Fig.  4 .

Lastly, we studied whether different feature engineering can affect the prediction results for different PASC conditions. Instead of using pre-defined baseline comorbidities, we used a higher-dimensional approach based on the first three characters of the ICD-10 codes of all recorded diagnoses and on prescriptions in RxNorm codes at their active ingredient level in the baseline period to predict PASC. We finally used 1593 ICD-10 diagnosis codes and 2309 drugs from the INSIGHT data, and 1698 ICD-10 diagnosis codes and 4366 drugs from the OneFlorida+ data. We reported the predictive performance of different machine learning models using this large set of features in Supplementary Fig. 3, which does not show large differences compared to the performance in Supplementary Fig. 2 or main Fig. 4, and the heterogeneous predictability patterns remain the same.

Validation by the OneFlorida+ cohort

To assess the generalizability of our findings, we replicated our analyses on the OneFlorida+ cohort as an independent validation. The OneFlorida+ cohort included 22,341 adult patients with lab-confirmed SARS-CoV-2 infection and 177,010 non-infected control patients (see the inclusion cascade in Fig. 1). We summarized the prediction performance of different potential PASC conditions with the regularized Cox model in Supplementary Fig. 4 and the identified risk associations in Supplementary Fig. 5. From Supplementary Fig. 4 we again observed the heterogeneous predictability of different conditions seen in Fig. 4, and the more predictable conditions (with C-index > 0.8, such as malnutrition, COPD, dementia, and acute kidney failure) and the less predictable conditions (with C-index around or below 0.6, such as fatigue, anxiety, sleep disorders, and depression) remained the same. Similarly, the risk associations shown in Supplementary Fig. 5 are consistent with the risk associations shown in Fig. 2. Hospitalization and ICU admission in the acute infection phase were associated with a diverse set of incident diagnoses in the post-acute infection phase. We still observed the risk associations between older age and dementia (5.4-fold increased risk), female gender and hair loss (2.2-fold increased risk), and Black race and diabetes (1.5-fold increased risk). Infection confirmation from July to November 2021 was associated with a 1.7-fold increased risk of being diagnosed with general PASC symptoms and signs (the U099/B948 ICD code).

In this paper, we investigated risk factors associated with a wide range of PASC conditions, as well as the predictability of PASC, using EHR data from two large-scale PCORnet CRNs, INSIGHT and OneFlorida+. Compared with existing research on this topic, which was mostly based on patient-reported symptoms 14 , 29 , our study was based on routinely collected EHR datasets, aimed to uncover potentially complex association maps between baseline covariates and a set of heterogeneous PASC conditions, and checked their generalizability across two different populations.

We examined the associations between a broad list of baseline covariates and a list of likely PASC conditions. The baseline covariates include demographics (age, gender, race, ethnicity, etc.), socioeconomic status, healthcare utilization, BMI, time of infection, a list of comorbidities, and severity in the acute phase of SARS-CoV-2 infection according to care settings. Our analytic method differs from existing association analyses in two ways: we conducted fully adjusted association analyses for all baseline covariates and each PASC condition to uncover potentially complex association maps, and we adopted a set of stringent screening criteria to identify likely risk factors, including comparison with the non-infected group to remove background associations and the use of corrected P-values to reduce the chance of false findings under multiple testing. Specifically, following prior research on PASC 3 , 6 , 7 , 30 , we ascertained newly incident conditions in the post-acute infection period (30 days to 180 days after infection) in this study. We built a comprehensive list of diagnoses based on a prior study by Al-Aly et al. 3 with further refinements from our clinician team 6 , 7 . Unlike existing studies that treated PASC as a holistic concept 3 , 30 , we explored the potential risk factors of each condition, as there is abundant evidence suggesting that PASC is a highly heterogeneous condition affecting multiple organ systems 3 , 6 , 7 . 
Second, for a covariate to be considered a potential risk factor for a specific condition, we required its fully adjusted hazard ratio (aHR) to be larger than 1 and statistically significant after correction for multiple testing, and we further required the estimated aHR to be larger in patients infected with SARS-CoV-2 than in non-infected patients. In this way, associations that may not be attributable to COVID-19 can be filtered out (see Supplementary Fig. 1). Figure 2 and Supplementary Fig. 5 summarize the identified risk associations from the INSIGHT and OneFlorida+ cohorts, respectively. Both figures show that severe acute infection, approximated by hospitalization and ICU admission during the acute infection phase, was associated with a broad set of incident conditions in the post-acute infection phase, covering multiple organ systems. Older age (≥75 years) was also found to be a potential risk factor for many of these conditions. These findings are consistent with the conclusions of prior studies 31 , 32 . Other notable risk associations consistently identified in both cohorts include higher baseline comorbidity burden and fluid disorder, baseline obesity and sleep disorder, and baseline end-stage renal disease and malnutrition. Some associations should be interpreted more cautiously. For example, baseline pulmonary circulation disorder was consistently identified as a risk factor for pulmonary embolism, but the two conditions are highly correlated, and this association could simply reflect ICD coding and grouping. Another example is baseline pregnancy and anemia, as anemia is the most common hematologic problem in pregnancy 33 . However, some studies also suggest that SARS-CoV-2 infection during pregnancy can further exacerbate iron deficiency anemia due to hyperinflammation during the acute infection phase 34 . 
These findings are in line with our approach, which identifies associations whose risks are potentially exacerbated relative to the non-infected control group.
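The three screening criteria described above (aHR larger than 1, significance after multiple-testing correction, and a larger aHR in infected than in non-infected patients) can be sketched as a simple filter. This is a hedged illustration rather than the authors' code: the `Association` record, the `screen_risk_factors` function, and the use of a Bonferroni threshold are assumptions made for the example, and the numbers are invented placeholders, not results from the study.

```python
from dataclasses import dataclass


@dataclass
class Association:
    covariate: str
    condition: str
    ahr_infected: float      # fully adjusted hazard ratio in infected patients
    p_infected: float        # uncorrected P-value in infected patients
    ahr_noninfected: float   # same model fitted in the non-infected group


def screen_risk_factors(associations, alpha=0.05):
    """Keep associations that pass all three screening criteria.

    Assumption: a Bonferroni correction, i.e. each P-value is compared
    with alpha divided by the number of tested associations.
    """
    if not associations:
        return []
    threshold = alpha / len(associations)
    return [
        a for a in associations
        if a.ahr_infected > 1.0                  # increased risk
        and a.p_infected < threshold             # significant after correction
        and a.ahr_infected > a.ahr_noninfected   # exceeds background association
    ]


# Invented placeholder values, for illustration only.
candidates = [
    Association("covariate_A", "condition_X", 2.0, 0.001, 1.2),  # passes all three
    Association("covariate_B", "condition_X", 1.8, 0.001, 2.1),  # larger in non-infected
    Association("covariate_C", "condition_Y", 1.5, 0.040, 1.1),  # not significant after correction
]
selected = screen_risk_factors(candidates)
print([a.covariate for a in selected])  # ['covariate_A']
```

The third condition is what removes background associations: a covariate that raises the hazard equally (or more) in non-infected patients is not counted as a PASC-specific risk factor.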

We then investigated the predictability of different potential PASC diagnoses using different types of machine learning (ML) models, including linear models (regularized logistic regression, regularized Cox regression), gradient-boosting-tree-based models, and deep learning models, based on a similar set of baseline covariates (patient demographics, prior conditions, care settings in the acute phase, etc.). The results from regularized Cox regression are summarized in Fig. 4, which suggests that different conditions had different predictability in the INSIGHT cohort. Conditions such as dementia, heart failure, and kidney failure were more predictable. These conditions have clear diagnostic criteria grounded in the underlying disease etiologies and are more likely to be severe COVID complications. General PASC symptoms and signs with the U099/B948 codes were also associated with good prediction performance, which is consistent with prior studies 35 . One potential reason is that these codes were relatively new, and clinicians might have been using them cautiously, only when the symptoms and signs were typical. Conditions such as headache, dizziness, chest pain, joint pain, anxiety, and depressive disorders were more difficult to predict. These conditions are more subjective to diagnose, more similar to patient-reported symptoms, and cannot be explained by alternative disease etiologies. More complex machine learning models and different feature engineering methods did not change these differences in predictability, as evidenced by Supplementary Figs. 2 and 3, respectively. In addition, we replicated the predictive modeling analysis on the OneFlorida+ cohort, and the results summarized in Supplementary Fig. 4 were highly consistent with the conclusions obtained from the INSIGHT cohort. 
Our ML-based predictive models, together with the observed predictability, shed light on how EHR data can be used in a data-driven way to identify patients at risk of heterogeneous PASC conditions.
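Predictability in this discussion is measured by the concordance index (c-index). As a rough illustration of what that number means for censored time-to-event data, here is a minimal pure-Python sketch of Harrell's form of the c-index. It is an assumption that this form matches the variant used in the paper, the function name and toy data are hypothetical, and tied event times are simply ignored for brevity.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored data (simplified sketch).

    times:       follow-up time for each patient
    events:      1 if the condition was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk

    A pair (i, j) is comparable when patient i had an observed event
    strictly before patient j's follow-up time. The pair is concordant
    when the model gave patient i the higher risk score; tied scores
    count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # comparable pairs are anchored on an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (invented): four patients, the third one censored at day 15.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risk_scores)
print(c_index)  # 0.8
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why conditions with a c-index above 0.8 are described as more predictable and those near 0.6 as less predictable.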

Our study has several strengths. First, we studied a comprehensive set of associations between 89 factors and 43 incident PASC conditions in two large EHR cohorts. To our knowledge, this is one of the largest studies on predictive modeling and risk factor analysis for PASC using EHR data. In particular, to reduce false findings, we adopted a non-infected control group and required the adjusted hazard ratio of each identified association estimated from the infected patients to be larger than the value estimated from the control patients. We also identified likely associations using significance levels corrected for multiple testing. In addition, we conducted extensive sensitivity and validation analyses to reach robust conclusions. We derived our primary results from INSIGHT and performed a validation study on OneFlorida+, which supported the generalizability of our findings. We also checked the identified associations when stratifying patients by acute severity. Regarding prediction performance, we investigated a range of machine learning models on both a narrow and a broad list of covariates, which further supports the robustness of our conclusions.

Our study had several limitations. Our analysis was based on EHR data, which miss information from patients who did not visit hospitals within the CRNs. We considered only newly incident conditions in the post-acute period and did not explore conditions that were prolonged, worsened, or relapsed before and after COVID-19 infection, nor condition clusters or subphenotypes. The identification of incident events can be associated with healthcare utilization behavior: patients who had limited healthcare engagement before COVID-19 infection might subsequently have a greater opportunity to be diagnosed with new conditions simply because their baseline status was less completely captured. Thus, in our analysis, we compared identified associations with those in non-infected patients with similar baseline characteristics, including healthcare utilization behavior. We will also explore clinical notes to better capture incident events in future analyses. We acknowledge as a limitation that we did not use COVID-19 vaccination status: COVID-19 vaccines became publicly available only in early December 2020, and nearly half of the study population was infected before any vaccine was available. For the remaining half of the population, infected after December 2020, vaccine records collected outside the hospitals were largely missing. In addition, vaccinated patients can still develop severe infection 36 , which was identified as a risk factor for Long COVID by our analysis and others 37 . Moreover, the reported effect of COVID-19 vaccines on Long COVID is not consistent and still needs further investigation 38 , 39 , 40 . Studying how COVID-19 vaccination influences PASC is a promising future direction, as vaccination is more prevalent in later cohorts, and building linkage to more robust vaccination data (e.g., registry databases) for general patients is one of our ongoing efforts. 
Smoking status was not investigated because of missingness (90.2%). In addition, our analyses did not cover the recent Omicron wave because of data availability. We captured the acute severity of illness by hospitalization and ICU status during the acute infection phase, consistent with the existing Long COVID literature. However, this way of modeling acute severity lacks granularity in the medical-use variable and may overlook differences in the true severity of illness. For example, a patient who spent a month in the ICU on a ventilator should not be considered as having the same severity of illness as a patient who spent a night in the ICU after elective surgery. We would like to add more granularity to acute severity modeling by capturing the duration of ICU or hospital stay, or by using the WHO ordinal clinical severity scale 41 , in future analyses as more patients accumulate. Lastly, though we tried to remove background associations, several identified associations should be interpreted with caution. For example, the findings that older individuals are more likely to develop dementia, and that those with a BMI < 18.5 are more likely to be diagnosed with malnutrition, likely represent underlying patient characteristics and known disease processes. Quantifying the potential exacerbation effect (if any) of SARS-CoV-2 infection on some known risk associations remains an open question. Further studies are also warranted to investigate the basic mechanisms of developing Long COVID.

In conclusion, we used two large-scale CRNs, INSIGHT and OneFlorida+, to identify likely risk factors associated with incident PASC conditions. We observed complex association patterns and varying predictability across PASC conditions, which may represent challenges for managing heterogeneous PASC conditions. Among the complex association patterns observed, we highlighted severe acute infection, being underweight, and baseline conditions including cancer or cirrhosis as potentially associated with overall incident PASC in the post-acute phase. However, multiple less predictable PASC diagnoses represent an ongoing challenge that may not respond to other measures that decrease the severity of acute COVID-19. Our machine learning-based predictive models can help identify those who are at risk of diverse PASC conditions with heterogeneous predictability.

Data availability

The INSIGHT data can be requested through https://insightcrn.org/. The OneFlorida+ data can be requested through https://onefloridaconsortium.org. Both the INSIGHT and the OneFlorida+ data are HIPAA-limited; therefore, data use agreements must be established with the INSIGHT and OneFlorida+ networks. The relevant source data for each figure are provided in the Supplementary Data 2 (Source Data) file.

Code availability

For reproducibility, our code is available at https://github.com/calvin-zcx/pasc_phenotype/tree/master/prediction 42 . We used Python 3.9, the Python package lifelines-0.2666 for survival analysis, and the scikit-learn 1.0.2 and LightGBM 3.3.2 packages for machine learning modeling.

WHO Coronavirus (COVID-19) Dashboard. Accessed July 19, 2022. https://covid19.who.int .

Nalbandian, A. et al. Post-acute COVID-19 syndrome. Nat. Med. 27 , 601–615 (2021).

Al-Aly, Z., Xie, Y. & Bowe, B. High-dimensional characterization of post-acute sequelae of COVID-19. Nature. 594 , 259–264 (2021).

Xie, Y., Xu, E., Bowe, B. & Al-Aly, Z. Long-term cardiovascular outcomes of COVID-19. Nat. Med . 1–8. https://doi.org/10.1038/s41591-022-01689-3 (2022).

Xie, Y., Xu, E. & Al-Aly, Z. Risks of mental health outcomes in people with covid-19: cohort study. BMJ. 376 , e068993 (2022).

Zhang, H. et al. Data-driven identification of post-acute SARS-CoV-2 infection subphenotypes. Nat. Med . 1–10. https://doi.org/10.1038/s41591-022-02116-3 (2022).

Zang, C. et al. Data-driven analysis to understand long COVID using electronic health records from the RECOVER initiative. Nat. Commun. 14 , 1948 (2023).

Williamson, E. J. et al. Factors associated with COVID-19-related death using OpenSAFELY. Nature. 584 , 430–436 (2020).

Antonelli, M. et al. Risk factors and disease profile of post-vaccination SARS-CoV-2 infection in UK users of the COVID Symptom Study app: a prospective, community-based, nested, case-control study. Lancet Infect. Dis. 22 , 43–55 (2022).

Asadi-Pooya, A. A. et al. Risk factors associated with long COVID syndrome: a retrospective study. Iran J. Med. Sci. 46 , 428–436 (2021).

Crook, H., Raza, S., Nowell, J., Young, M. & Edison, P. Long Covid—mechanisms, risk factors, and management. BMJ . n1648. https://doi.org/10.1136/bmj.n1648 (2021).

Oscar, M. P. et al. Post-acute COVID-19 syndrome. Incidence and risk factors: a Mediterranean cohort study. J. Infect. 82 , 378–383 (2022). https://www.sciencedirect.com/science/article/pii/S0163445321000098 .

Aminian, A., Bena, J., Pantalone, K. M. & Burguera, B. Association of obesity with postacute sequelae of COVID-19. Diabetes Obes. Metab. 23 , 2183–2188 (2021).

Sudre, C. H. et al. Attributes and predictors of long COVID. Nat. Med. 27 , 626–631 (2021).

Moy, F. M., Hairi, N. N., Lim, E. R. J. & Bulgiba, A. Long COVID and its associated factors among COVID survivors in the community from a middle-income country—An online cross-sectional study. PLOS ONE 17 , e0273364 (2022).

Fleurence, R. L. et al. Launching PCORnet, a national patient-centered clinical research network. J. Am. Med. Inform. Assoc. 21 , 578–582 (2014).

Kaushal, R. et al. Changing the research landscape: the New York City Clinical Data Research Network. J. Am. Med. Inform. Assoc. 21 , 587–590 (2014).

Shenkman, E. et al. OneFlorida clinical research consortium: linking a clinical and translational science institute with a community-based distributive medical education model. Acad. Med. 93 , 451–455 (2018).

Khullar, D. et al. Racial/ethnic disparities in post-acute sequelae of SARS-CoV-2 infection in New York: an EHR-Based cohort study from the RECOVER program. J. Gen. Intern. Med . 38 , 1127–1136. https://doi.org/10.1007/s11606-022-07997-1 (2023).

About the Initiative | RECOVER COVID Initiative. https://recovercovid.org/ (2024).

2024 ICD-10-CM Diagnosis Code U09.9: Post COVID-19 condition, unspecified. https://www.icd10data.com/ICD10CM/Codes/U00-U85/U00-U49/U09-/U09.9 (2024).

Zhang, Y. et al. Identifying environmental risk factors for post-acute sequelae of SARS-CoV-2 infection: An EHR-based cohort study from the recover program. Environ Adv. 11 , 100352 (2023).

Varma, J. K. et al. Excess burden of respiratory and abdominal conditions following COVID-19 infections during the ancestral and Delta variant periods in the United States: An EHR-based cohort study from the RECOVER program. PLOS ONE 19 , e0282451 (2024).

Kind, A. J. H. & Buckingham, W. R. Making Neighborhood-disadvantage metrics accessible—the neighborhood atlas. N. Engl. J. Med. 378 , 2456–2458 (2018).

Armstrong, R. A. When to use the Bonferroni correction. Ophthalmic Physiol. Opt. 34 , 502–508 (2014).

Ke, G. et al. LightGBM: a highly efficient gradient boosting decision tree. in Advances in Neural Information Processing Systems . Vol. 30. (Curran Associates, Inc.; 2017). Accessed December 5, 2022. https://papers.nips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html .

Antonelli, M., Pujol, J. C., Spector, T. D., Ourselin, S. & Steves, C. J. Risk of long COVID associated with delta versus omicron variants of SARS-CoV-2. Lancet. 399 , 2263–2264 (2022).

Brentnall, A. R. & Cuzick, J. Use of the concordance index for predictors of censored survival data. Stat. Methods Med. Res. 27 , 2359–2373 (2018).

Su, Y. et al. Multiple early factors anticipate post-acute COVID-19 sequelae. Cell. 185 , 881–895.e20 (2022).

Xie, Y., Bowe, B. & Al-Aly, Z. Burdens of post-acute sequelae of COVID-19 by severity of acute infection, demographics and health status. Nat. Commun. 12 , 6571 (2021).

Cohen, K. et al. Risk of persistent and new clinical sequelae among adults aged 65 years and older during the post-acute phase of SARS-CoV-2 infection: retrospective cohort study. BMJ. 376 , e068414 (2022).

Xie, Y. & Al-Aly, Z. Risks and burdens of incident diabetes in long COVID: a cohort study. Lancet Diabetes Endocrinol. 10 , 311–321 (2022).

Sifakis, S. & Pharmakides, G. Anemia in pregnancy. Ann. N Y Acad. Sci. 900 , 125–136 (2000).

Gromova, O. A., Torshin, I. Y., Shapovalova, Y. O., Kurtser, M. A. & Chuchalin, A. G. COVID-19 and iron deficiency anemia: relationships of pathogenesis and therapy. Obstet. Gynecol. Reprod. 14 , 644–655 (2020).

Identifying who has long COVID in the USA: a machine learning approach using N3C data. Lancet Digital Health . (2022). https://www.thelancet.com/journals/landig/article/PIIS2589-7500(22)00048-6/fulltext .

Vo, A. D. et al. Factors associated with severe COVID-19 among vaccinated adults treated in US Veterans Affairs hospitals. JAMA Netw Open. 5 , e2240037 (2022).

Tsampasian, V. et al. Risk factors associated with post−COVID-19 condition: a systematic review and meta-analysis. JAMA Intern Med. 183 , 566–580 (2023).

Kuodi, P. et al. Association between BNT162b2 vaccination and reported incidence of post-COVID-19 symptoms: cross-sectional study 2020-21, Israel. npj Vaccines 7 , 1–8 (2022).

Notarte, K. I. et al. Impact of COVID-19 vaccination on the risk of developing long-COVID and on existing long-COVID symptoms: A systematic review. eClinicalMedicine. 53 , 101624 (2022).

Gao, P., Liu, J. & Liu, M. Effect of COVID-19 vaccines on reducing the risk of long COVID in the real world: a systematic review and meta-analysis. Int. J. Environ. Res. Public Health. 19 , 12422 (2022).

Rubio-Rivas, M. et al. WHO ordinal scale and inflammation risk categories in COVID-19. comparative study of the severity scales. J. Gen. Intern. Med. 37 , 1980–1987 (2022).

Zang, C. calvin-zcx/pasc_phenotype: Code for risk factors and predictive modeling for Long COVID. https://doi.org/10.5281/zenodo.10602738 (2024).

Acknowledgements

This research was funded by the National Institutes of Health (NIH) Agreement OTA OT2HL161847 (contract number EHR-01-21) as part of the Researching COVID to Enhance Recovery (RECOVER) research program. The PCORnet® Study reported in this work was conducted using PCORnet®, the National Patient-Centered Clinical Research Network. PCORnet® has been developed with funding from the Patient-Centered Outcomes Research Institute® (PCORI®). This work was conducted through the use of data from the INSIGHT Clinical Research Network and supported in part by the Patient-Centered Outcomes Research Institute (PCORI) PCORnet grant to the INSIGHT Clinical Research Network (Grant # RI-CORNELL-01-MC). The statements presented in this work are solely the responsibility of the author(s) and do not necessarily represent the views of other organizations participating in, collaborating with, or funding PCORnet® or of the Patient-Centered Outcomes Research Institute® (PCORI®).

Author information

Authors and affiliations.

Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA

Chengxi Zang, Yu Hou, Zhenxing Xu, Yongkang Zhang, Dmitry Morozyuk, Dhruv Khullar, Yiye Zhang, Jay Varma, Mark G. Weiner, Fei Wang & Rainu Kaushal

Division of Pulmonary and Critical Care Medicine, Weill Cornell Department of Medicine, New York, NY, USA

Edward J. Schenck

Department of Health Outcomes Biomedical Informatics, University of Florida, Gainesville, FL, USA

Jie Xu, Jiang Bian & Elizabeth A. Shenkman

Department of Neurology, Weill Cornell Medicine, New York, NY, USA

Anna S. Nordvig

Center for Health Services Research, Vanderbilt University Medical Center, Nashville, TN, USA

Russell L. Rothman

Department of Population Medicine, Harvard Pilgrim Health Care Institute, Harvard Medical School, Boston, MA, USA

Jason P. Block

Louisiana Public Health Institute, New Orleans, LA, USA

Kristin Lyman & Thomas W. Carton

Contributions

C.Z. and F.W. proposed the initial idea. C.Z. and Y.H. designed and implemented the framework and analyzed the results. D.M. and M.G.W. set up the data infrastructure and analytics environment for INSIGHT. J.B. and E.A.S. set up the data infrastructure and analytics environment for OneFlorida+. C.Z. and J.X. preprocessed the INSIGHT and OneFlorida+ data and helped with the analysis. K.L. helped in building computational phenotyping libraries. Z.X., Y.K.Z., and Y.Y.Z. helped with the statistical analysis. D.K., A.S.N., E.J.S., R.L.R., J.P.B., J.V., M.G.W., and R.K. provided clinical inputs on data, study design, and results interpretation. C.Z. drafted the initial manuscript. F.W., E.J.S., T.W.C., and R.K. made critical revisions. All authors have provided feedback to and proofread the final version of the paper.

Corresponding author

Correspondence to Fei Wang .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Peer review

Peer review information.

Communications Medicine thanks the anonymous reviewers for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Peer review file, supplementary information, description of additional supplementary files, supplementary data 1, supplementary data 2, reporting summary, rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Cite this article.

Zang, C., Hou, Y., Schenck, E.J. et al. Identification of risk factors of Long COVID and predictive modeling in the RECOVER EHR cohorts. Commun Med 4 , 130 (2024). https://doi.org/10.1038/s43856-024-00549-0

Received: 15 February 2023

Accepted: 10 June 2024

Published: 11 July 2024

DOI: https://doi.org/10.1038/s43856-024-00549-0

risk assessment covid

COMMENTS

  1. WHO mass gathering COVID-19 risk assessment tool: generic events, version 3

    The WHO mass gathering COVID-19 risk assessment tool version 3 includes newly published research and evidence from WHO pertaining to areas such as improved SARS-CoV-2 diagnostics, vaccines, variants of concern (VoC), ventilation considerations, and risk communication and community engagement and infodemic management (RCCE-IM) event strategies.

  2. Risk Assessment Summary for SARS CoV-2 Sublineage BA.2.86

    Current risk assessment. Based on what CDC knows now, existing tests used to detect and medications used to treat COVID-19 appear to be effective with this variant. BA.2.86 may be more capable of causing infection in people who have previously had COVID-19 or who have received COVID-19 vaccines.

  3. Interim Guidance for Managing Healthcare Personnel with SARS-CoV-2

    Assessment of risk and application of workplace restrictions for asymptomatic HCP with exposure to SARS-CoV-2. Guidance addressing recommended infection prevention and control practices including use of source control by HCP is available in Infection Control: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)

  4. Using a Risk Analysis Framework to Guide COVID-19 Decisions

    It includes the following three key steps: risk assessment, risk management, and risk communication. In fact, with COVID-19 it is clear that how we assess and manage risk is important to guiding policies that reduce disease transmission, especially since people's actions can have such a strong impact on others.

  5. COVID-19 Event Risk Assessment Planning Tool

    The COVID-19 Event Risk Assessment Planning Tool is a collaborative project led by Prof. Joshua Weitz and Prof. Clio Andris at the Georgia Institute of Technology, along with researchers at the Applied Bioinformatics Laboratory, Duke University, and Stanford University, and powered by RStudio.

  6. How to use WHO risk assessment and mitigation checklist for mass

    Overview . This document provides a COVID-19 risk assessment and mitigation checklist for use by host countries and mass gathering organizers. It also includes an operational tool which offers guidance for organizers holding meetings during the COVID-19 outbreak and which should be accompanied by the WHO COVID-19 Generic Risk Assessment Excel file.

  7. PDF SARS-CoV-2 Infection Prevention and Control in Healthcare Settings Toolkit

    (SARS-CoV-2), the virus that causes coronavirus disease 2019 (COVID-19). Healthcare settings may use this toolkit to create flexible policies specific to their facility based on their individual risk assessment following CDC's Interim Infection Prevention and Control Recommendations for Healthcare Personnel During the Coronavirus Disease 2019

  8. Key Considerations for Workplace COVID-19 Risk Assessments and Controls

    When the COVID-19 pandemic hit, many businesses were forced to adapt to continue operations. Companies with essential workforces created ad-hoc protocols to reduce the health and safety risks associated with COVID-19 without fully assessing all the potential risks or ...

  9. A novel methodology for epidemic risk assessment of COVID-19 ...

    Definition of a risk assessment framework and calibration with COVID-19 data Conventional risk assessment theory relies on "Crichton's Risk Triangle" 24 , 25 , shown in panel (l) of Fig. 2 .

  10. How to Make Smart COVID Risk-Benefit Decisions

    More by Devabhaktuni Srikrishna. This article was originally published with the title "How to Make Smart COVID Risk-Benefit Decisions" in SA Health & Medicine Vol. 4 No. 3 (June 2022) doi:10. ...

  11. COVID-19 Exposure Assessment Tool (CEAT): Exposure ...

    The coronavirus disease 2019 (COVID-19) Exposure Assessment Tool (CEAT) allows users to compare respiratory relative risk to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) for various scenarios, providing understanding of how combinations of protective measures affect risk.

  12. COVID-19 infection prevention and control risk management

    The Commission has updated guidance to support health service organisations in assessing risks related to COVID-19 and developing a comprehensive Risk Management Plan. The revised guidance reflects the most recent national advice regarding use of the hierarchy of controls and infection prevention and control systems to identify hazards and ...

  13. COVID-19 Pandemic Risk Assessment: Systematic Review

    A COVID-19 risk-based assessment (CRAM) framework was created for analyzing COVID-19 risk in various geographical areas: Three steps were identified: (1) GIS layers of various data were generated, (2) hazard and vulnerability maps were integrated, and (3) risk mapping for decision making was conducted to prioritize COVID-19 risk areas ...

  14. Calculating COVID-19 risk

    A new COVID-19 risk calculator developed at Harvard T.H. Chan School of Public Health can help people understand the ways that masking, ventilation, filtration, and other factors can mitigate the spread of COVID-19 in indoor environments.. The tool, developed by Joseph Allen, associate professor of exposure assessment science, and his team at the Healthy Buildings program, is based on a model ...

  15. COVID-19 Pandemic: Public Health Risk Assessment and Risk Mitigation

    Microbial risk assessment is applicable to quantify and understand the relative risk of fomite-mediated transmission and evaluate the efficiency of preventive actions to lower the risk of COVID-19 . Fomites have given a relatively low contribution to the rapid spread of the COVID-19; however, guidelines need to be followed to avoid the risk and ...

  16. A Guide for Covid-19 Risk in Your County

    Source: Covid-19 risk assessment by The New York Times and Resolve to Save Lives based on reported cases and test positivity data. Read more below . The Times continues to report daily updates for ...

  17. Assessment of COVID-19 risk factors of early and long-term mortality

    Background Coronavirus disease (COVID-19) may lead to serious complications and increased mortality. The outcomes of patients who survive the early disease period are burdened with persistent long-term symptoms and increased long-term morbidity and mortality. The aim of our study was to determine which baseline parameters may provide the best prediction of early and long-term outcomes. Methods ...

  18. COVID-19

    What is the risk to workers in the United States? The risk of worker exposure to SARS-CoV-2, the virus that causes Coronavirus Disease 2019 (COVID-19), depends on numerous factors, including the extent of community transmission; the severity of resulting illness; existing medical conditions workers may have; environmental conditions that may affect exposure risk (e.g., working or living in ...

  19. COVID-19: A Risk Assessment Perspective

    COVID-19 is a newly emerging viral respiratory disease first identified in Wuhan, China, in December 2019. The disease is caused by the coronavirus SARS-CoV-2, which is related to the viruses that cause SARS and MERS. While the case fatality ratio for COVID-19 (5%) is far lower than that for SARS (11%) and MERS (34%), COVID-19 is spreading relatively uncontrolled at this time across the globe ...

  20. Coronavirus (COVID-19) risk assessment

    Risk prediction of COVID-19-related death and hospital admission in adults after COVID-19 vaccination: national prospective cohort study. View the list of health conditions and treatments considered by the COVID-19 Population Risk Assessment and the COVID-19 Clinical Risk Assessment Tool.

  21. Interim Guidance for Managing Healthcare Personnel with SARS-CoV-2

    Assessment of risk and application of workplace restrictions for asymptomatic healthcare personnel (HCP) with exposure to SARS-CoV-2. Guidance addressing recommended infection prevention and control practices, including use of source control by HCP, is available in Infection Control: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2).

  22. Use of a risk assessment tool to determine the origin of severe acute

    This risk assessment cannot prove the origin of SARS-CoV-2 but shows that the possibility of a laboratory origin cannot be easily dismissed. Introduction: Since the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) first emerged from Wuhan, China, in December 2019, there has been ongoing debate around the origins of the virus.

  23. Latest risk assessments on COVID-19

    Rapid Risk Assessment: COVID-19 outbreaks in long-term care facilities in the EU/EEA in the context of current vaccination coverage. ECDC risk assessments aim at supporting the EU/EEA countries and the European Commission in their preparedness and response to a public health threat.

  24. Pandemic Risk Assessment

    Be ready for what's next. Our fast, data-driven assessment quantitatively determines your pandemic risks and offers focused resiliency plans to protect your business, customers and shareholders. Our free assessment helps quantify your risk through data-driven analysis across multiple impact zones and risk factors.

  25. COVID-19 Pandemic Risk Assessment: Systematic Review

    In our criteria, we included articles that related to one of three main scopes of study: (1) COVID-19, using the keywords "COVID" or "Coronavirus disease 2019"; (2) pandemic risk, using the keywords "pandemic risk"; and (3) risk assessment, using the keywords "risk assessment". The language of each article, the nature of the ...

  26. Identification of risk factors of Long COVID and predictive ...

    Zang et al. use RECOVER EHR data to study Long COVID risk factors and apply mathematical modeling to predict the development of long COVID conditions. They find that severe acute SARS-CoV-2 ...

  27. Covid Inquiry's Damning Report Says Government 'Failed Its ...

    It also slammed the country's "labyrinthine" emergency planning, saying the approach to risk assessment was "flawed". "The UK government's outdated pandemic strategy, developed in ...