Context added to dummy data generation for several tables in ehrQL

Posted:

ehrQL’s dummy data generator has been updated to be more context-aware.

Limited range for practice_pseudo_id

Instead of treating practice_registrations.practice_pseudo_id as a generic integer column, ehrQL now only generates dummy practice_pseudo_id values from 0 to 999 inclusive.
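As a rough illustration, the range above can be checked against values pulled out of a dummy dataset. This is only a sketch: the function name practice_ids_out_of_range is hypothetical and not part of ehrQL, and it assumes the values have already been loaded into a plain Python list, with missing values represented as None.

```python
def practice_ids_out_of_range(values, low=0, high=999):
    """Return the values that fall outside the inclusive range [low, high].

    Hypothetical helper, not part of ehrQL. Missing values (None) are
    skipped rather than reported, which is an assumption of this sketch.
    """
    return [v for v in values if v is not None and not (low <= v <= high)]


# Both boundary values are accepted because the range is inclusive.
print(practice_ids_out_of_range([0, 42, 999, 1000, -1, None]))  # [1000, -1]
```

An empty list from this check means every non-missing value respects the documented range.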

Dates generated in logical order

For several tables that record multiple dates for the same event (e.g. addresses, with a start_date and an end_date), the dates are now generated in their expected chronological order (e.g. addresses.end_date falls on or after addresses.start_date).

See the ehrQL documentation for further information on the limitations of native ehrQL dummy data, including date logic.

The affected tables and columns are listed below.

  • practice_registrations: 0 <= practice_pseudo_id <= 999
  • addresses: start_date <= end_date
  • apcs / apcs_cost: admission_date <= discharge_date
  • appointments: booked_date <= start_date <= seen_date
  • ec_cost: ec_injury_date <= ec_arrival_date <= ec_decision_to_admit_date
  • opa / opa_cost / opa_diag / opa_proc: referral_request_received_date <= appointment_date
  • sgss_covid_all_tests: specimen_taken_date <= lab_report_date
  • wl_clockstops / wl_openpathways: referral_to_treatment_period_start_date <= referral_to_treatment_period_end_date
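The orderings listed above can be expressed as a small lookup and checked row by row, for example in a test over exported dummy data. The sketch below is not ehrQL's implementation: DATE_ORDER and out_of_order are hypothetical names, only a subset of the tables is included, each row is assumed to be a plain dict of column name to datetime.date, and missing dates (None) are assumed to be skipped rather than treated as violations.

```python
from datetime import date

# Orderings copied from the list above (subset). The dict and function
# names are illustrative and not part of ehrQL.
DATE_ORDER = {
    "addresses": ("start_date", "end_date"),
    "apcs": ("admission_date", "discharge_date"),
    "appointments": ("booked_date", "start_date", "seen_date"),
    "ec_cost": ("ec_injury_date", "ec_arrival_date", "ec_decision_to_admit_date"),
    "sgss_covid_all_tests": ("specimen_taken_date", "lab_report_date"),
}


def out_of_order(table, row):
    """Return (earlier, later) column pairs whose dates are in the wrong order."""
    # Assumption: columns with no date (None or absent) are skipped, and the
    # remaining dates are compared pairwise in their listed order.
    present = [
        (col, row[col]) for col in DATE_ORDER[table] if row.get(col) is not None
    ]
    return [
        (a, b)
        for (a, a_val), (b, b_val) in zip(present, present[1:])
        if a_val > b_val
    ]


row = {
    "booked_date": date(2024, 1, 5),
    "start_date": date(2024, 1, 3),
    "seen_date": date(2024, 1, 10),
}
print(out_of_order("appointments", row))  # [('booked_date', 'start_date')]
```

Equal dates pass the check because every constraint in the list uses <= rather than <.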