Clinical coding of long COVID in English primary care: a federated analysis of 58 million patient records in situ using OpenSAFELY

How to cite: Clinical coding of long COVID in English primary care: a federated analysis of 58 million patient records in situ using OpenSAFELY. Alex J Walker, Brian MacKenna, Peter Inglesby, Laurie Tomlinson, Christopher T Rentsch, Helen J Curtis, Caroline E Morton, Jessica Morley, Amir Mehrkar, Seb Bacon, George Hickman, Chris Bates, Richard Croker, David Evans, Tom Ward, Jonathan Cockburn, Simon Davy, Krishnan Bhaskaran, Anna Schultze, Elizabeth J Williamson, William J Hulme, Helen I McDonald, Rohini Mathur, Rosalind M Eggo, Kevin Wing, Angel YS Wong, Harriet Forbes, John Tazare, John Parry, Frank Hester, Sam Harper, Shaun O’Hanlon, Alex Eavis, Richard Jarvis, Dima Avramov, Paul Griffiths, Aaron Fowles, Nasreen Parkes, Ian J Douglas, Stephen JW Evans, Liam Smeeth, Ben Goldacre and (The OpenSAFELY Collaborative). British Journal of General Practice 2021; 71 (712): e806-e814. DOI:


Background: Long COVID is a term used to describe new or persistent symptoms at least four weeks after the onset of acute COVID-19. Clinical codes to describe this phenomenon were released in the UK in November 2020, but it is not known how these codes have been used in practice.

Methods: Working on behalf of NHS England, we used OpenSAFELY data encompassing 96% of the English population. We measured the proportion of people with a recorded code for long COVID, overall and by demographic factors, electronic health record software system, and week. We also measured variation in recording amongst practices.

Results: Long COVID was recorded for 23,273 people. Coding was unevenly distributed amongst practices, with 26.7% of practices not having used the codes at all. Regional variation was high, ranging from 20.3 per 100,000 people in the East of England (95% confidence interval [CI] 19.3-21.4) to 55.6 in London (95% CI 54.1-57.1). The rate was higher amongst women (52.1, 95% CI 51.3-52.9) than amongst men (28.1, 95% CI 27.5-28.7), and higher amongst practices using EMIS software (53.7, 95% CI 52.9-54.4) than amongst those using TPP software (20.9, 95% CI 20.3-21.4).
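
The rates above are expressed per 100,000 people with 95% confidence intervals. As a rough illustration of how such a figure can be derived from a count and a denominator, the sketch below computes a crude rate with a Wilson score interval. The abstract does not state which interval method the authors used, so the Wilson method is an assumption here, as is the function name `rate_per_100k`. The denominator of 58 million is the approximate record count from the title, used only for illustration, so the output should not be read as a figure reported by the study.

```python
import math


def rate_per_100k(events, population, z=1.96):
    """Crude rate per 100,000 with a Wilson score interval.

    Hypothetical helper for illustration; the interval method is an
    assumption, not necessarily the one used in the paper.
    Returns (rate, lower, upper), all scaled to per 100,000 people.
    """
    if population <= 0 or not 0 <= events <= population:
        raise ValueError("need population > 0 and 0 <= events <= population")

    p = events / population                      # observed proportion
    denom = 1 + z**2 / population                # Wilson normalising term
    centre = (p + z**2 / (2 * population)) / denom
    half_width = (
        z
        * math.sqrt(p * (1 - p) / population + z**2 / (4 * population**2))
        / denom
    )

    scale = 100_000
    return p * scale, (centre - half_width) * scale, (centre + half_width) * scale


# 23,273 coded patients (from the Results); ~58 million records (approximate,
# taken from the title and used here only as an illustrative denominator).
rate, lower, upper = rate_per_100k(23_273, 58_000_000)
print(f"{rate:.1f} per 100,000 (95% CI {lower:.1f}-{upper:.1f})")
```

The same function could be applied to each stratum (region, sex, software system) given that stratum's own count and denominator, which the abstract does not report.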

Conclusions: Long COVID coding in primary care is low compared with early reports of long COVID prevalence. This may reflect under-coding, sub-optimal communication of clinical terms, under-diagnosis, a true low prevalence of long COVID diagnosed by clinicians, or a combination of factors. We recommend increased awareness of diagnostic codes, to facilitate research and planning of services; and surveys of clinicians’ experiences, to complement ongoing patient surveys.