Policies for Researchers

This page provides important instructions that must be read before the sharing and publication of any OpenSAFELY project results released from the Level 4 results server.

If you have any questions, in the first instance contact your co-pilot; if you do not have a co-pilot, please contact [email protected].

All sections with square brackets should be amended as appropriate; please discuss with your co-pilot or contact [email protected] if you have any questions.

Permitted Study Results Policy

All outputs from the NHS England OpenSAFELY COVID-19 service must be aggregated data with small number suppression applied.
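As a rough illustration of what small number suppression means in practice, the sketch below masks low counts in an aggregated table before it is requested for release. The threshold value, the function names and the example counts are all hypothetical; this page does not state a threshold, so take the actual value and any rounding rules from the current output-checking guidance or from your co-pilot.

```python
from typing import Dict, Optional

# Hypothetical threshold: not stated on this page. Replace it with the
# value required by the current output-checking guidance.
THRESHOLD = 7


def suppress_small_count(count: int, threshold: int = THRESHOLD) -> Optional[int]:
    """Return None (suppressed) for a non-zero count at or below the threshold.

    Assumption: true zeros are left as they are; check whether that is
    acceptable for your output.
    """
    if 0 < count <= threshold:
        return None
    return count


def suppress_table(counts: Dict[str, int], threshold: int = THRESHOLD) -> Dict[str, Optional[int]]:
    """Apply small number suppression to every cell of a one-way table."""
    return {group: suppress_small_count(n, threshold) for group, n in counts.items()}


# Made-up aggregated counts by region, for illustration only.
regional_counts = {"London": 1523, "North West": 4, "South West": 0, "East of England": 7}
print(suppress_table(regional_counts))
```

This sketch only handles primary suppression. It does not deal with cases where a suppressed cell can be recovered from row or column totals, which output checkers will also look for.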

The service operates as a trusted research platform where no patient record level data is permitted to be extracted from the platform.

You MUST NOT request the release of any information (e.g. name, listsize) that identifies, or could identify, ICBs, Local Authorities (including MSOA identifiers), Primary Care Networks (PCNs) and individual GP practices from the Level 4 results server.

Larger geographic / regional outputs can be released, such as NHS England operating regions, which are listed in relevant data tables in the OpenSAFELY platform. An example use of these regions is in Table 1, p.3 of this paper.

Refer to the Datasets Used heading regarding the general rules around the sharing of results and the publication of results. Some datasets have their own additional rules for the sharing and publication of results. Make sure you read the information for each dataset carefully.

OpenSAFELY Analytic Methods Policy


This policy outlines the analytic methods that are currently supported and not supported within OpenSAFELY, given the resource available for compute and output checking. The policy is approved by NHS England and managed in collaboration with the OpenSAFELY team.

The development of this policy takes into account a variety of constraints, such as: the available and funded OpenSAFELY compute; the capacity within the output-checking service, which oversees statistical disclosure controls (SDC), to satisfy contractual, legal and ethical obligations; and the unknown privacy risks associated with some aspects of some analysis methods (e.g., generative and risk prediction models).

In addition, we must ensure we bring the public, profession and other stakeholders with us if and when analytic approaches that have, or are perceived to have, additional challenges around interpretability, explainability, safety, efficacy, cost-effectiveness1 and privacy (such as machine learning / AI models) are run against GP and NHS data; maintaining the trust of the public and profession is a key priority for teams running the OpenSAFELY service.

The OpenSAFELY output-checking service supports most established analytical methods. The following table is an indicative and non-exhaustive list:

SDC-supported methods
Descriptive Statistics (counts, proportions, means, quantiles, etc.)
Basic Statistical Tests (e.g., t-tests, Chi-squared tests)
Generalized Linear Models (GLM), including linear, logistic and Poisson regression
Survival analyses (e.g., Kaplan-Meier estimates, Cox regression, parametric survival modelling)
Traditional Time Series Analysis (e.g., ARIMA, Exponential Smoothing, STL)
Visualisation of non-patient-level summary statistics (e.g., histograms, time-series, forest plots) and data smoothing for visualisation.
Structural Equation Models

The following table lists the analytic methods that are currently not supported by the OpenSAFELY output-checking service, which oversees statistical disclosure controls (SDC).

Your OpenSAFELY study must not use any libraries, scripts, or run any code that uses these analytic methods:

Not SDC-supported methods
Neural Networks (including deep NNs, CNNs, RNNs, ANNs)
Support Vector Machines (SVM)
Random Forests
Gradient Boosting Machines
Unsupervised Clustering Algorithms (e.g., k-means, PCA)
Advanced Time Series Models (e.g., LSTM, GRU, TCN)
Natural Language Processing Models (e.g., BERT, GPT)
Reinforcement Learning Models
Adversarial Learning Models
Large Language Models
Generative AI Models

Your responsibility as OpenSAFELY users (researchers, analysts, data scientists)

  1. Study lead(s) must familiarise themselves with the not SDC-supported methods.
  2. The study lead(s) must ensure only outputs from SDC-supported methods are requested for release.

This policy will also be named in the data access agreement that all approved users sign.

Important notes

  1. Users are not permitted to run methods on the not SDC-supported list without exceptional prior permission (see below), even if they are not planning to release the model for external querying or inspection.
    • It might be acceptable for a not SDC-supported method to be used for an internal project analysis need (for example, using such models for propensity matching or inverse propensity weighting) with no requirement to release the outputs, model or performance statistics. Discuss this with your co-pilot: explain the need and purpose and we will let you know.
  2. We do not support the release of models because (not exhaustive list):
    • It is extremely labour intensive to evaluate large volumes of potentially disclosive data for release outside a TRE/SDE; and for some types of model that might be proposed for export there are no currently recognised means to evaluate disclosivity risk in any setting.
    • Development of models that cannot be used as intended (e.g., for external risk prediction) will utilise compute and resources unnecessarily.
  3. If you are planning to conduct analyses to produce a risk prediction model to implement in clinical practice, you must detail this in the “Study Information” section of your application form, so that NHS England and others can consider any regulatory or liability issues.
  4. It is the responsibility of users to ensure outputs requested for release are: clearly explained to output-checkers; limited to the minimum required; clearly in line with the project purpose; and in line with this policy.

How this policy will be updated

As OpenSAFELY’s features and resources change over time, it is possible that various methods on the not SDC-supported list will be supported. To help us prioritise how the platform develops, email us ([email protected], subject: analytic methods suggestions) and share the specific method(s) of interest you would like us to support, including:

  1. Why is the method necessary for your future project?
  2. Can the analysis not be conducted using supported methods?
  3. What is the intended benefit of using the method?
  4. What have you considered to be the risks of using the method?
  5. How do you intend to mitigate these risks? Specifically, cite how you think the statistical disclosure control process could work.

We encourage users to also talk with their co-pilots as part of this feedback.

Requests for not SDC-supported methods will be logged, reviewed on a quarterly basis by NHS England and the OpenSAFELY team, and shared with our governance board to determine how this policy will evolve over time.2

Authorship Policy

Our team is strongly committed to “team science”, and to recognising the deep technical and methodological contribution of research software engineers to research outputs. We have a strong preference, specifically during the pilot phase when all projects are delivered in close collaboration, for members of the OpenSAFELY team who materially contribute to your study and/or to the iterative development of the platform and analytic pipelines to be offered authorship on outputs. This is likely to change over time as the platform expands, and as external teams become more “customers” than “collaborators”. For clarity, this relates to platform contributions, and there is never any expectation of authorship for individual researchers involved in OpenSAFELY who are not involved in a research project. Read our authorship policy for further details.

Plan S

We ask that academic outputs comply with Wellcome’s Plan S requirements for journal publication.

Acknowledgment and Data Sharing / Publication Policy

NHS England oversees the final approval for all publication-ready papers, reports or presentations, principally to check that the outputs align with the stated application purpose; NHS England has been extremely supportive of all research and analyses to date. The usual response time for approval is 1-2 weeks.

The acknowledgment and sharing/publication of results guidelines are dependent on the datasets used for your project. The acknowledgement content must be used in all published papers, official reports and presentations given outside of your research team/collaborators.

You MUST NOT share any results that have not been released through the official output checking process. This includes:

- verbal sharing

- allowing someone to look over your shoulder

- transcribing (e.g., to paper or email)

- using screen sharing software or any recording device/software

Datasets used

All Datasets

Acknowledgement content

We are very grateful for all the support received from the [EMIS Technical Operations team] [TPP Technical Operations team] [EMIS and TPP Technical Operations teams] throughout this work, and for generous assistance from the information governance and database teams at NHS England and the NHS England Transformation Directorate.

If the High Cost Drug dataset was also used, add:

North East Commissioning Support Unit provided support on behalf of all Commissioning Support Units to aggregate the high cost drugs data for use in OpenSAFELY studies.


SHARING OF RESULTS

The results of ANY dataset can be shared IN CONFIDENCE and ONLY with key members of the wider research team / research collaborators (for the purpose of seeking feedback and contribution to inform the final paper or report), for example, by a webinar or by email, but the following guidelines must be adhered to:

  1. Acceptable sharing examples include: the senior sponsor; analysts and senior manager in the NHS E/I/X department accountable for the specific policy activities being investigated (but NOT other departments); key members of the relevant scientific advisory groups; established relevant expert collaborators.
  2. If sharing your results, paper, report, etc., with individuals external to your immediate project team (e.g. key members of the relevant scientific advisory groups; relevant external expert collaborators) you must ensure the content being shared has been reviewed and approved by the senior sponsor (for service evaluations and audits), your line manager/PI (for service evaluations, audits and research) and the overall project leads (named here under your project number); and provide your co-pilot with a copy of the content.
  3. All recipients must be reminded that the content is shared in confidence and they must not distribute it further (see publication guidelines below).

If you are unsure that your planned sharing is appropriate, please contact your co-pilot in the first instance; or use the OpenSAFELY-users slack channel (if you have joined); or email [email protected].

PUBLICATION OF RESULTS (e.g. papers, presentations, etc.)

You must seek NHS England approval for any publication or wider sharing of results, papers, presentations (e.g. submitting to a journal or a pre-print server, or uploading to any public facing website). For the avoidance of doubt, this means that if an iteration of an analysis is approved for publication, any previous or future iterations must also be approved for publication by NHS England if you want to publish them. The steps you must follow for NHS England approval are:

  1. Ensure the material you seek to publish has been reviewed and approved by the senior sponsor (for service evaluations and audits), your line manager/PI (for service evaluations, audits and research) and the overall project leads (named here under your project number).

  2. Discuss your material with your co-pilot. Your co-pilot will carry out a brief checklist on your content. There is also an author checklist for you to complete. If you do not have a co-pilot, please make a request for support via the OpenSAFELY slack channel and we will allocate you a co-pilot.

  3. Once the co-pilot and author checklists are complete, please e-mail [email protected] (and copy your co-pilot) your proposed publication documents (specifying your project ID, see the list of approved projects), alongside confirmation that the senior sponsor and line manager (for service evaluation/audit) or line manager/PI (for research) and the overall project leads have read and approved them. The document(s) you submit for publication approval must be roughly “90%” finalised versions, but the results and conclusions must be final.

  4. All submissions must include a brief lay summary of the findings and also highlight anything that could be deemed contentious (we appreciate the notion of contentious is subjective). Do not just copy your abstract - please provide a lay summary.

  5. NHS England publication review windows occur every two weeks. Please ensure you have sent your documents for review to [email protected] by 5pm on the Wednesday of the review week. Submission deadlines are:

    • 5pm 22nd May;

    • 5pm 5th June;

    • 5pm 19th June;

    • and so on every two weeks (NOTE: over summer, Easter, and Christmas breaks these review periods will be less frequent).

    • Consult the users forum for upcoming deadlines.

    • A response will usually be provided within 1-2 weeks.

  6. Upon publication of any associated papers, presentations, etc. (and in any case within 12 months of code execution against patient data) you must publish your GitHub repository.
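The fortnightly schedule in step 5 can be sketched as a simple date calculation. This is only an illustration: the year is not given on this page, so 2024 is assumed here purely because 22 May falls on a Wednesday in that year, and the function name is hypothetical. The schedule is less frequent over summer, Easter and Christmas breaks, so the users forum remains the authoritative source for upcoming deadlines.

```python
from datetime import date, datetime, time, timedelta
from typing import List


def submission_deadlines(first: date, count: int) -> List[datetime]:
    """Return `count` deadlines at 5pm, two weeks apart, starting from `first`."""
    return [
        datetime.combine(first + timedelta(weeks=2 * i), time(17, 0))
        for i in range(count)
    ]


# Assumption: the year 2024 is a guess; only the day and month come from the list above.
deadlines = submission_deadlines(date(2024, 5, 22), 4)
for deadline in deadlines:
    print(deadline.strftime("%A %d %B %Y, %H:%M"))
```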

For the Datasets listed below

The following additional acknowledgement and publication of results guidelines must be followed if your study uses data from ICNARC, ISARIC, ONS-CIS, PHOSP.


ICNARC data

Acknowledgement content

Use the All Datasets acknowledgement above and the following:

This publication is based on data derived from the Intensive Care National Audit & Research Centre (ICNARC) Case Mix Programme Database. The Case Mix Programme is the national, comparative audit of patient outcomes from adult critical care coordinated by ICNARC. We thank all the staff in the critical care units participating in the Case Mix Programme. For more information on the representativeness and quality of these data, please contact ICNARC. Disclaimer: The views and opinions expressed therein are those of the authors and do not necessarily reflect those of ICNARC.


Use the All Datasets Sharing of Results guide above.

PUBLICATION OF RESULTS (e.g. papers, presentations, etc.)

Use the All Datasets Publication of Results guide above and the following:

Phone and email ICNARC if any safety concerns are identified: 020 7831 6878, [email protected].

Email [email protected] (and copy [email protected] and your co-pilot) one draft copy of any proposed publication or presentation at the same time as submission for publication or at least 28 days before the date intended for publication/presentation, whichever is earlier.


ISARIC data

Acknowledgement content

Use the All Datasets acknowledgement above and the following:

This report is independent research which used data provided by the MRC funded ISARIC 4C Consortium and which the Consortium collected under a research contract funded by the National Institute for Health Research. The views expressed in this publication are those of the author(s) and not necessarily those of the ISARIC 4C consortium.


Use the All Datasets Sharing of Results guide above.

PUBLICATION OF RESULTS (e.g. papers, presentations, etc.)

Use the All Datasets Publication of Results guide above and the following:

Email [email protected] (and copy [email protected] and your co-pilot) a copy of any publication at least 7 days in advance of submission for publication.

Submit the results to an open access platform and in accordance with normal academic practice; publication to a bona-fide pre-print service is encouraged where possible.

ONS-CIS data

Acknowledgement content

Use the All Datasets acknowledgement above and the following:

The Coronavirus (Covid-19) infection survey is delivered by the Office for National Statistics in partnership with the University of Oxford, University of Manchester, UK Health Security Agency and Wellcome Trust. The study is funded by the Department of Health and Social Care with in-kind support from the Welsh Government, the Department of Health on behalf of the Northern Ireland Government and the Scottish Government. The collection and testing of samples is carried out by the Lighthouse laboratory. Genome sequencing is funded by the COVID-19 Genomics UK (COG-UK) consortium. COG-UK is supported by funding from the Medical Research Council (MRC) part of UK Research and Innovation (UKRI), the National Institute of Health Research (NIHR), and Genome Research Limited operating as the Wellcome Sanger Institute.

The views expressed are those of the authors and not necessarily those of the funding organisations or those involved in the delivery of the survey.


Use the All Datasets Sharing of Results guide above.

PUBLICATION OF RESULTS (e.g. papers, presentations, etc.)

Use the All Datasets Publication of Results guide above and the following:

Email [email protected] (and copy [email protected] and your co-pilot) a copy of all proposed publications and presentations arising from agreed analysis to the ONS not less than 7 days in advance of submission for publication or presentation, for approval; such approval shall not be unreasonably withheld or delayed by ONS.

OpenPROMPT data

Acknowledgement content

Use the All Datasets acknowledgement above and the following:

Awaiting additional acknowledgement content.


Use the All Datasets Sharing of Results guide above.

PUBLICATION OF RESULTS (e.g. papers, presentations, etc.)

In discussion.

PHOSP data

Acknowledgement content

Use the All Datasets acknowledgement above and the following:

Awaiting additional acknowledgement content.


Use the All Datasets Sharing of Results guide above.

PUBLICATION OF RESULTS (e.g. papers, presentations, etc.)

In discussion.

UK Renal Registry (UKRR) data

Acknowledgement content

Use the All Datasets acknowledgement above and the following:

This project includes data from the UKRR derived from patient-level information collected by the NHS as part of the care and support of kidney patients. We thank all kidney patients and kidney centres involved. The data are collated, maintained, and quality assured by the UKRR, which is part of the UK Kidney Association. The interpretation and reporting of these data are the responsibility of the authors and in no way should be seen as an official policy or interpretation of the UK Kidney Association. Access to the data was facilitated by the UKRR’s Data Release Group. UKRR data are used within OpenSAFELY to address a number of critical research, audit and service delivery questions related to the impact of COVID-19 on patients with kidney disease.


Use the All Datasets Sharing of Results guide above.

PUBLICATION OF RESULTS (e.g. papers, presentations, etc.)

Where the recipient has chosen to include a UKKA employee as an author on the recipient’s outputs, the recipient must share drafts in sufficient time for the UKKA employee to have input. The UKKA follows the International Committee of Medical Journal Editors (ICMJE) authorship guidelines.

Information Governance, Ethics, and funding content policy

For published papers, official reports and presentations you must use the following content for the relevant section headings.

Note: If a study uses both EMIS and TPP, please reference them both as data processors in sections below.


  • Must add: “With the approval of NHS England we…”

Methods - Data Sharing or Data Source headings

  • Must add (note additional comment in paragraph regarding Type-1 Opt outs): All data were linked, stored and analysed securely using the OpenSAFELY platform, as part of the NHS England OpenSAFELY COVID-19 service. Data include pseudonymised data such as coded diagnoses, medications and physiological parameters. No free text data are included. [If your project is ID 156 or above, also add: No GP data from patients who have registered a Type-1 Opt out with their GP surgery were included in this study]. All code is shared openly for review and re-use under MIT open license [LINK TO GITHUB REPO OF PAPER BEING SUBMITTED]. Detailed pseudonymised patient data is potentially re-identifiable and therefore not shared.
  • When listing data sources, suggested phrase: Primary care records managed by the GP software provider [TPP/EMIS] were linked to [ONS death data, etc.] through OpenSAFELY.

Software and Reproducibility

  • If required use: Data management was performed using Python [XX], with analysis carried out using [Stata 16.1/Python/R]. Code for data management and analysis, as well as codelists, are archived online [link your project GitHub repo]. [All iterations of the pre-specified study protocol are archived with version control].
  • For any federated analyses use: This was an analysis delivered using federated analysis through the OpenSAFELY platform. A federated analysis involves carrying out patient level analysis in multiple secure datasets, then later combining them: codelists and code for data management and data analysis were specified once using the OpenSAFELY tools; then transmitted securely from the OpenSAFELY jobs server to the OpenSAFELY-TPP platform within TPP’s secure environment, and separately to the OpenSAFELY-EMIS platform within EMIS’s secure environment, where they were each executed separately against local patient data; summary results were then reviewed for disclosiveness, released, and combined for the final outputs. All code for the OpenSAFELY platform for data management, analysis and secure code execution is shared for review and re-use under open licences on GitHub:

Patient and Public Involvement and Engagement (PPIE)

  • Where relevant: Insert any project specific PPIE.
  • Consider: OpenSAFELY has involved patients and the public in various ways: we developed a public website that provides a detailed description of the platform in language suitable for a lay audience; we have participated in two citizen juries exploring public trust in OpenSAFELY; we have co-developed an explainer video; we have patient representation who are experts by experience on our OpenSAFELY Oversight Board; we have partnered with Understanding Patient Data to produce lay explainers on the importance of large datasets for research; we have presented at various online public engagement events to key communities (e.g., Healthcare Excellence Through Technology; Faculty of Clinical Informatics annual conference; NHS Assembly; HDRUK symposium); and more. To ensure the patient voice is represented, we are working closely to decide on language choices with appropriate medical research charities (e.g., Association of Medical Research Charities). We will share information and interpretation of our findings through press releases, social media channels, and plain language summaries.


Funding

  • Must add:

    The OpenSAFELY platform is principally funded by grants from:

    • NHS England [2023-2025];
    • The Wellcome Trust (222097/Z/20/Z) [2020-2024];
    • MRC (MR/V015737/1) [2020-2021].

    Additional contributions to OpenSAFELY have been funded by grants from:

    • MRC via the National Core Study programme, Longitudinal Health and Wellbeing strand (MC_PC_20030, MC_PC_20059) [2020-2022] and the Data and Connectivity strand (MC_PC_20058) [2021-2022];
    • NIHR and MRC via the CONVALESCENCE programme (COV-LT-0009, MC_PC_20051) [2021-2024];
    • NHS England via the Primary Care Medicines Analytics Unit [2021-2024].

    The views expressed are those of the authors and not necessarily those of the NIHR, NHS England, UK Health Security Agency (UKHSA), the Department of Health and Social Care, or other funders. Funders had no role in the study design, collection, analysis, and interpretation of data; in the writing of the report; and in the decision to submit the article for publication.

  • Where relevant: Insert any additional grants specific to the work or authors.

Information governance and ethical approval

  • Must add: NHS England is the data controller of the NHS England OpenSAFELY COVID-19 Service; [TPP is the data processor] [EMIS is the data processor] [EMIS and TPP are the data processors]; all study authors using OpenSAFELY have the approval of NHS England.3 This implementation of OpenSAFELY is hosted within the [EMIS environment which is] [TPP environment which is] [EMIS and TPP environments which are] accredited to the ISO 27001 information security standard and [is][are] NHS IG Toolkit compliant;4

    Patient data has been pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the NHS England OpenSAFELY COVID-19 service is via a virtual private network (VPN) connection; the researchers hold contracts with NHS England and only access the platform to initiate database queries and statistical models; all database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts.5

    The service adheres to the obligations of the UK General Data Protection Regulation (UK GDPR) and the Data Protection Act 2018. The service previously operated under notices initially issued in February 2020 by the Secretary of State under Regulation 3(4) of the Health Service (Control of Patient Information) Regulations 2002 (COPI Regulations), which required organisations to process confidential patient information for COVID-19 purposes; this set aside the requirement for patient consent.6 As of 1 July 2023, the Secretary of State has requested that NHS England continue to operate the Service under the COVID-19 Directions 2020.7 In some cases of data sharing, the common law duty of confidence is met using, for example, patient consent or support from the Health Research Authority Confidentiality Advisory Group.8

    Taken together, these provide the legal bases to link patient datasets using the service. GP practices, which provide access to the primary care data, are required to share relevant health information to support the public health response to the pandemic, and have been informed of how the service operates.

  • For RESEARCH, you must add: This study was approved by the Health Research Authority [REC reference XXX] and by the XXX Ethics Board [reference XXX].

  • For SERVICE EVALUATION/AUDIT, you must add: This study was supported by [NAME + OFFICIAL ROLE] as senior sponsor, and approved by the XXX Ethics Board [reference XXX]. (NHS England service evaluations/audits are currently not required to have Ethics approval.)

  • NOTE: remember to add additional governance and ethical content pertaining to data not processed within OpenSAFELY.

Data access and verification

  • If requested, use the following: Access to the underlying identifiable and potentially re-identifiable pseudonymised electronic health record data is tightly governed by various legislative and regulatory frameworks, and restricted by best practice. The data in the NHS England OpenSAFELY COVID-19 service is drawn from General Practice data across England where [EMIS is the data processor][TPP is the data processor][EMIS and TPP are the data processors].

    [EMIS][TPP][EMIS and TPP] developers initiate an automated process to create pseudonymised records in the core OpenSAFELY database, which are copies of key structured data tables in the identifiable records. These pseudonymised records are linked onto key external data resources that have also been pseudonymised via SHA-512 one-way hashing of NHS numbers using a shared salt. University of Oxford, Bennett Institute for Applied Data Science developers and PIs, who hold contracts with NHS England, have access to the OpenSAFELY pseudonymised data tables to develop the OpenSAFELY tools.

    These tools in turn enable researchers with OpenSAFELY data access agreements to write and execute code for data management and data analysis without direct access to the underlying raw pseudonymised patient data, and to review the outputs of this code. All code for the full data management pipeline — from raw data to completed results for this analysis — and for the OpenSAFELY platform as a whole is available for review at

    The data management and analysis code for this paper was led by (XX) and contributed to by (XX).
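For readers unfamiliar with the pseudonymisation step described above, the following is a minimal sketch of one-way SHA-512 hashing of an identifier with a shared salt. It is not the OpenSAFELY implementation: how the salt is combined with the NHS number is not stated on this page, so prepending it is an assumption, and the function name, salt value and identifier are made up.

```python
import hashlib


def pseudonymise(nhs_number: str, salt: str) -> str:
    """Return a SHA-512 hex digest of a salted identifier.

    Assumption: the salt is simply prepended to the identifier; the real
    service may combine them differently.
    """
    normalised = nhs_number.replace(" ", "")
    return hashlib.sha512((salt + normalised).encode("utf-8")).hexdigest()


# Made-up values for illustration only; not a real NHS number or salt.
shared_salt = "example-shared-salt"
token_a = pseudonymise("999 000 0001", shared_salt)
token_b = pseudonymise("9990000001", shared_salt)

# The same identifier hashed with the same salt gives the same token,
# which is what allows separately pseudonymised datasets to be linked.
print(token_a == token_b)
```

Because every dataset is hashed with the same salt, matching tokens identify the same person across sources without the NHS number itself being present in the linked data.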

  1. ↩︎

  2. Central government departments, which include NHS England (data controller for OpenSAFELY), are required to improve transparency regarding the use of algorithmic tools. You may wish to familiarise yourself with the Algorithmic Transparency Record Standard, which will need to be completed when we support these not SDC-supported methods, if they are likely to have a significant influence on a decision-making process with direct or indirect public effect. Public bodies (e.g. patient care organisations or universities) are also encouraged to complete this record standard. ↩︎

  3. The NHS England OpenSAFELY COVID-19 service - privacy notice. NHS Digital (Now NHS England). (accessed 4 July 2023). ↩︎

  4. Data Security and Protection Toolkit - NHS Digital. NHS Digital (Now NHS England). (accessed 4 July 2023). ↩︎

  5. ISB1523: Anonymisation Standard for Publishing Health and Social Care Data. NHS Digital (Now NHS England). (accessed 4 July 2023). ↩︎

  6. Coronavirus (COVID-19): notice under regulation 3(4) of the Health Service (Control of Patient Information) Regulations 2002 – general. 2022. (accessed 5 July 2023). ↩︎

  7. Secretary of State for Health and Social Care - UK Government. COVID-19 Public Health Directions 2020: notification to NHS Digital. (accessed 4 July 2023). ↩︎

  8. Confidentiality Advisory Group. Health Research Authority. (accessed 4 July 2023). ↩︎