Clinical coding of long COVID in English primary care: a federated analysis of 58 million patient records in situ using OpenSAFELY

How to cite: Clinical coding of long COVID in English primary care: a federated analysis of 58 million patient records in situ using OpenSAFELY. Alex J Walker, Brian MacKenna, Peter Inglesby, Laurie Tomlinson, Christopher T Rentsch, Helen J Curtis, Caroline E Morton, Jessica Morley, Amir Mehrkar, Seb Bacon, George Hickman, Chris Bates, Richard Croker, David Evans, Tom Ward, Jonathan Cockburn, Simon Davy, Krishnan Bhaskaran, Anna Schultze, Elizabeth J Williamson, William J Hulme, Helen I McDonald, Rohini Mathur, Rosalind M Eggo, Kevin Wing, Angel YS Wong, Harriet Forbes, John Tazare, John Parry, Frank Hester, Sam Harper, Shaun O’Hanlon, Alex Eavis, Richard Jarvis, Dima Avramov, Paul Griffiths, Aaron Fowles, Nasreen Parkes, Ian J Douglas, Stephen JW Evans, Liam Smeeth, Ben Goldacre and (The OpenSAFELY Collaborative). British Journal of General Practice 2021; 71 (712): e806-e814. DOI:



Long COVID is a term describing new or persistent symptoms at least four weeks after the onset of acute COVID-19. Clinical codes for this phenomenon were released in the UK in November 2020, but it is not known how they have been used in practice.


Working on behalf of NHS England, we used OpenSAFELY data encompassing 96% of the English population. We measured the proportion of people with a recorded code for long COVID, overall and by demographic factors, electronic health record software system, and week. We also measured variation in recording amongst practices.


Long COVID was recorded for 23,273 people. Coding was unevenly distributed amongst practices, with 26.7% of practices not having used the codes at all. Regional variation was high, with rates ranging from 20.3 per 100,000 people in the East of England (95% confidence interval [CI] 19.3-21.4) to 55.6 per 100,000 in London (95% CI 54.1-57.1). The rate per 100,000 people was higher amongst women (52.1, 95% CI 51.3-52.9) than amongst men (28.1, 95% CI 27.5-28.7), and higher amongst practices using EMIS software (53.7, 95% CI 52.9-54.4) than amongst those using TPP software (20.9, 95% CI 20.3-21.4).
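
The rates above are counts of coded patients per 100,000 people, each with a 95% confidence interval. As an illustration only, the arithmetic can be sketched in Python. The function name `rate_per_100k` is hypothetical, and the interval uses a normal approximation to a Poisson count, which is an assumption: this abstract does not state how the published intervals were calculated. The denominator of 58 million is the rounded figure from the title, so the result is approximate.

```python
import math


def rate_per_100k(cases: int, population: int, z: float = 1.96):
    """Return (rate, lower, upper), all expressed per 100,000 people.

    Assumption: the interval is a normal approximation to a Poisson count.
    The study's own interval method is not given in the abstract.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    scale = 100_000 / population
    rate = cases * scale
    # The standard error of a Poisson count is the square root of the count.
    half_width = z * math.sqrt(cases) * scale
    return rate, max(0.0, rate - half_width), rate + half_width


# 23,273 people with a recorded code; 58 million is the rounded
# total from the title, not an exact denominator.
rate, lower, upper = rate_per_100k(23_273, 58_000_000)
print(f"{rate:.1f} per 100,000 (95% CI {lower:.1f}-{upper:.1f})")
```

The same function could be applied to any subgroup, such as a region or a software system, provided the subgroup's own count and denominator are known. Those figures are not reported in this abstract.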


Long COVID coding in primary care is low compared with early reports of long COVID prevalence. This may reflect under-coding, sub-optimal communication of clinical terms, under-diagnosis, a true low prevalence of long COVID diagnosed by clinicians, or a combination of factors. We recommend increased awareness of diagnostic codes, to facilitate research and planning of services; and surveys of clinicians’ experiences, to complement ongoing patient surveys.