Data Quality in Healthcare: The Potential Impact on Autism Studies

Large healthcare datasets can provide valuable insights that can significantly improve patient outcomes. However, it’s vital to understand the strengths and limitations of these datasets to ensure the accuracy of the insights they deliver. A recent study by Heyl et al. (2022) set out to identify data inconsistencies in the Hospital Episode Statistics (HES) dataset for autistic patients, and to examine the biases these could introduce and their potential impact on patient outcomes.

Identifying Data Inconsistencies in Autism Diagnoses

The study analysed data from the HES database for patients diagnosed with autism from 1st April 2013 to 31st March 2021. The initial hospital spells for each patient were identified and linked to any subsequent hospital spell for the same patient. Data inconsistencies were noted where autism was not recorded as a diagnosis in subsequent spells.

Importantly, the study could only identify inconsistencies in recording the autism diagnosis; it couldn’t determine whether the inclusion or exclusion of the autism diagnosis was the error.
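To make the linkage step concrete, here is a minimal sketch of how such inconsistencies could be flagged. It is an illustration only, not the authors' code: the record layout (`patient_id`, `admitted`, `diagnoses`), the sample spells, and the use of ICD-10 codes beginning with "F84" to identify autism are all assumptions, since the article does not give the study's exact code list or data schema.

```python
from collections import defaultdict
from datetime import date

# Assumption: autism is identified by ICD-10 codes starting with "F84".
# The study's actual code list is not stated in this article.
AUTISM_PREFIX = "F84"


def has_autism_code(diagnoses):
    """True if any diagnosis code on the spell starts with the autism prefix."""
    return any(code.startswith(AUTISM_PREFIX) for code in diagnoses)


def flag_inconsistent_spells(spells):
    """Return (patient_id, admitted) for every spell that follows a patient's
    first autism-coded spell but does not itself carry an autism code."""
    by_patient = defaultdict(list)
    for spell in spells:
        by_patient[spell["patient_id"]].append(spell)

    flagged = []
    for patient_id, patient_spells in by_patient.items():
        # Walk each patient's spells in chronological order.
        patient_spells.sort(key=lambda s: s["admitted"])
        seen_autism = False
        for spell in patient_spells:
            coded = has_autism_code(spell["diagnoses"])
            if seen_autism and not coded:
                flagged.append((patient_id, spell["admitted"]))
            seen_autism = seen_autism or coded
    return flagged


# Hypothetical spells, invented for illustration.
spells = [
    {"patient_id": "A", "admitted": date(2014, 5, 1), "diagnoses": ["F84.0", "J18.1"]},
    {"patient_id": "A", "admitted": date(2016, 2, 9), "diagnoses": ["S72.0"]},
    {"patient_id": "A", "admitted": date(2018, 7, 3), "diagnoses": ["F84.0"]},
    {"patient_id": "B", "admitted": date(2015, 1, 20), "diagnoses": ["I10"]},
    {"patient_id": "B", "admitted": date(2015, 3, 2), "diagnoses": ["F84.5"]},
]

print(flag_inconsistent_spells(spells))
```

In this toy data only patient A's 2016 spell is flagged. Patient B's first spell is not, because it precedes their first autism-coded spell. Like the study, the sketch cannot say whether the flagged omission or the earlier inclusion was the error.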

Results: High Levels of Inconsistencies

Data were available for 172,324 unique patients initially recorded as having an autism diagnosis. Alarmingly, 43.7% of subsequent spells showed inconsistencies in recording this diagnosis.

Factors most strongly associated with inconsistencies included greater age, higher levels of deprivation, longer time since the first hospital spell, change in the provider, shorter length of stay, being female, and a change in the main specialty description.

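The team also trained a random forest algorithm, which predicted data inconsistency with an area under the receiver operating characteristic curve (AUROC) of 0.864 (95% CI 0.862–0.866). An AUROC of 0.5 corresponds to chance and 1.0 to perfect discrimination, so this indicates good predictive performance.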
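For readers unfamiliar with the metric, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen inconsistent spell receives a higher predicted score than a randomly chosen consistent one. The sketch below computes it directly from that definition. The labels and scores are invented for illustration and have nothing to do with the study's model or data, and the pairwise loop is chosen for clarity, not for speed on large datasets.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    example has the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical data: 1 = inconsistent spell, 0 = consistent spell.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

print(round(auroc(labels, scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889
```

A model that scored spells at random would land near 0.5 on this measure.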

For patients who died in hospital, data inconsistencies in their final spell were significantly associated with being 80 years and over, being female, greater deprivation, and the use of a palliative care code in the death spell.

The Potential Impact on Autism Studies

The relatively common data inconsistencies in the HES database for autistic patients have significant implications. The issues found in this study were associated with a range of patient and hospital admission characteristics.

Such inconsistencies can potentially distort our understanding of service use in key demographic groups. In other words, without taking these inconsistencies into account, researchers, practitioners, and policymakers may arrive at flawed conclusions about the needs and experiences of autistic patients.

As we aim for more inclusive healthcare systems that cater to the unique needs of all patient groups, it is crucial to address these data quality issues, especially for communities such as the autism community, who already face considerable healthcare disparities.

Reference: Heyl, J., Hardy, F., Tucker, K., Hopper, A., Marchã, M. J., Liew, A., Reep, J., Harwood, K-A., Roberts, L., Yates, J., Day, J., Wheeler, A., Eve-Jones, S., Briggs, T. W. R., & Gray, W. K. (2022). Data quality and autism: Issues and potential impacts. International Journal of Medical Informatics, 159, 104938. DOI: 10.1016/j.ijmedinf.2022.104938
