OpenSAFELY: Warn about invalid moderately_sensitive outputs

The opensafely tool has been updated to warn you if you have invalid outputs marked as moderately_sensitive in your project.yaml. This follows on from the previous change about stricter output file paths, and means that you’ll get more accurate feedback when running your code locally.

Specifically, it will check that moderately_sensitive outputs meet the appropriate output file restrictions.

These currently are:

The file must be of the correct type, i.e. it must have a valid file extension. If it does not, you will not be able to run jobs at all, either locally or on the server.
If it is a .csv file, it must not have a patient_id column. If it does, your code will still run, but the log file and the on-screen summary text will show a warning. If you run it on your own computer using the opensafely command line tool, you will still get an output. If you run it in the live system via jobs.opensafely.org, the job will still run, but the file will not be available in level 4.
If it is too large, it will be handled in the same way as a .csv file with a patient_id column: you will see a warning, and the file will not be available in level 4. This is unlikely to occur when running locally against dummy data, but may happen when run via jobs.opensafely.org.
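
To make the three checks concrete, here is a rough sketch in Python of the kind of logic described above. It is not the code the opensafely tool actually runs: the function name check_moderately_sensitive, the list of allowed extensions and the size limit are all placeholders chosen for illustration.

```python
import csv
from pathlib import Path

# Placeholder values for illustration only; the real allowed file types
# and size limit are defined by OpenSAFELY and may differ.
ALLOWED_EXTENSIONS = {".csv", ".txt", ".html", ".json", ".png"}
MAX_SIZE_BYTES = 16 * 1024 * 1024


def check_moderately_sensitive(path):
    """Return (errors, warnings) for one moderately_sensitive output file."""
    path = Path(path)
    errors, warnings = [], []

    # Check 1: an invalid file type is an error, so the job cannot run.
    if path.suffix.lower() not in ALLOWED_EXTENSIONS:
        errors.append(f"{path.name}: invalid file type '{path.suffix}'")
        return errors, warnings

    # Check 2: a patient_id column in a .csv file only produces a warning.
    if path.suffix.lower() == ".csv":
        with path.open(newline="") as f:
            header = next(csv.reader(f), [])
        if "patient_id" in header:
            warnings.append(f"{path.name}: contains a patient_id column")

    # Check 3: an oversized file is also a warning, not an error.
    if path.stat().st_size > MAX_SIZE_BYTES:
        warnings.append(f"{path.name}: larger than {MAX_SIZE_BYTES} bytes")

    return errors, warnings
```

The sketch keeps errors and warnings separate because the two kinds of problem behave differently: an invalid file type stops the job from running, whereas a patient_id column or an oversized file only produces a warning.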

In most cases, the fix is to mark the file as highly_sensitive instead. If you do need it to be moderately_sensitive, then you may need to process the data a bit more, e.g. remove the patient_id column or reduce the size.
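
As one possible way of doing that extra processing, the sketch below turns a patient-level .csv file into a small table of counts, which drops the patient_id column and reduces the file size at the same time. The function name aggregate_counts and the column name passed to it are hypothetical; adapt them to your own study.

```python
import csv
from collections import Counter


def aggregate_counts(in_path, out_path, group_column):
    """Write one row per group with a count, dropping patient-level rows."""
    # Count how many input rows fall into each value of group_column.
    with open(in_path, newline="") as f:
        counts = Counter(row[group_column] for row in csv.DictReader(f))

    # Write only the group values and their counts; patient_id is not copied.
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([group_column, "count"])
        for value, count in sorted(counts.items()):
            writer.writerow([value, count])
```

For example, aggregate_counts("output/input.csv", "output/counts.csv", "age_band") would read a patient-level file and write one row per age band, assuming the input has an age_band column. Both file paths here are made up for the example.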

As a reminder, the policy for moderately_sensitive outputs is that they must be aggregate data, not patient-level data. These checks are designed to catch outputs containing patient-level data that have been accidentally classified as moderately_sensitive.

Any questions or problems, please let us know.
