Warn about invalid moderately_sensitive outputs
14 November 2023
The opensafely tool has been updated to warn you if you have invalid outputs marked as moderately_sensitive in your project.yaml. This is a follow-on from the previous change about stricter output file paths and means that you’ll get more accurate feedback when running the code locally.
Specifically, it will check that moderately_sensitive outputs meet the appropriate output file restrictions. These currently are:
The file must be of the correct type, i.e. it must have a valid file extension. If it does not, you will not be able to run jobs at all, either locally or on the server.
If it is a .csv file, it must not have a patient_id column. Your code will still run, but the log file and the on-screen summary text will show a warning. If you run it on your own computer using the opensafely command line tool, you will still get an output. If you run it in the live system via jobs.opensafely.org, it will still run, but the file will not be available in level 4.
The file must not be too large. If it is, it will be handled in the same way as a .csv file with a patient_id column: the job still runs and a warning is shown, but the file will not be available in level 4. This is unlikely to occur when running locally against dummy data, but may happen when run via jobs.opensafely.org.
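To make the three restrictions concrete, here is a minimal Python sketch of the kind of check described above. It is not the opensafely tool's actual implementation: the function name check_moderately_sensitive, the ALLOWED_EXTENSIONS set and the MAX_SIZE_BYTES limit are all assumptions made up for illustration. An invalid file type is treated as an error, while a patient_id column or an oversized file only produces a warning.

```python
import csv
import os

# Assumption: illustrative values only, not the real opensafely configuration.
ALLOWED_EXTENSIONS = {".csv", ".txt", ".json", ".png"}
MAX_SIZE_BYTES = 16 * 1024 * 1024


def check_moderately_sensitive(path, max_size_bytes=MAX_SIZE_BYTES):
    """Return a list of warnings for a file; raise ValueError for a bad file type."""
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        # An invalid file type stops the job from running at all.
        raise ValueError(f"{path}: invalid file type {extension!r}")

    warnings = []
    if extension == ".csv":
        # Only the header row is needed to spot a patient_id column.
        with open(path, newline="") as f:
            header = next(csv.reader(f), [])
        if "patient_id" in header:
            warnings.append(f"{path}: contains a patient_id column")
    if os.path.getsize(path) > max_size_bytes:
        warnings.append(f"{path}: larger than {max_size_bytes} bytes")
    return warnings
```

In this sketch, a returned warning corresponds to the case where the job still runs but the file would not be available in level 4.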
Fixing these is likely a case of marking the file as highly_sensitive instead. If you do need it to be moderately_sensitive, then you may need to process the data a bit more, e.g. remove the patient_id column or reduce the size.
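As one example of processing the data a bit more, the sketch below turns patient-level rows into counts per group, which removes the patient_id column and usually shrinks the file as well. The function name aggregate_counts, the age_band column and the dummy rows are hypothetical and chosen only for illustration; your own study will have different columns.

```python
import csv
import io
from collections import Counter


def aggregate_counts(patient_rows, group_column):
    """Count patients per group, dropping patient_id from the result."""
    counts = Counter(row[group_column] for row in patient_rows)
    return [{group_column: group, "count": n} for group, n in sorted(counts.items())]


# Hypothetical patient-level dummy data, for illustration only.
patient_level = io.StringIO("patient_id,age_band\n1,40-49\n2,40-49\n3,50-59\n")
rows = list(csv.DictReader(patient_level))
summary = aggregate_counts(rows, "age_band")
```

The resulting summary has one row per age band with a count, and no patient_id column.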
As a reminder, the policy for moderately_sensitive outputs is that they must be aggregate data, not patient-level.
These checks are designed to catch accidental misclassification of outputs with patient-level data as moderately_sensitive.
Any questions or problems, please let us know.