The Tale of Endocarditis: The Hidden Risks of Relying Solely on Clinical Coding Data

Hello, coding enthusiasts! Let’s take a deep dive into an intriguing aspect of our healthcare system: the accuracy of diagnostic codes in electronic health records. Specifically, we’re exploring a case study centered on infective endocarditis, an uncommon but serious infection.

The Scenario: Electronic Health Records and Diagnostic Codes

The healthcare sector extensively uses electronic health records (EHRs) to assess disease patterns, which are often identified through diagnostic codes. For instance, EHRs have been instrumental in studying the impact of changing guidance on antibiotic prophylaxis for dental procedures on endocarditis incidence. However, there are limited data on the accuracy of these diagnostic codes, which leads us to our investigation today.

Bridging Data and Real-Life Cases

A team of researchers led by Nicola Fawcett examined the relationship between diagnostic codes for endocarditis and confirmed clinical cases, based on objective Duke criteria. They sought to understand discrepancies and improve future study designs.

The team linked EHR data from two UK tertiary care centers to a reference source of confirmed cases at each site: a clinical endocarditis service database at Leeds Teaching Hospital, and a retrospective clinical audit combined with microbiology lab results at Oxford University Hospitals Trust.
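To make the linkage idea concrete, here is a minimal sketch in Python. It assumes each admission carries an identifier shared by the EHR extract and the clinical reference source; the function name `link_records` and the identifiers are hypothetical, and the study's actual linkage procedure is not described in this post.

```python
def link_records(coded_ids, confirmed_ids):
    """Compare admissions carrying an endocarditis code with clinically
    confirmed cases, using a shared admission identifier (an assumption)."""
    coded = set(coded_ids)
    confirmed = set(confirmed_ids)
    return {
        # Coded in the EHR and confirmed clinically (true positives)
        "coded_and_confirmed": coded & confirmed,
        # Coded in the EHR but not confirmed (false positives)
        "coded_only": coded - confirmed,
        # Confirmed clinically but never coded (missed cases)
        "confirmed_only": confirmed - coded,
    }


# Hypothetical identifiers, for illustration only
ehr_coded = ["A1", "A2", "A3", "A4"]
clinic_confirmed = ["A2", "A4", "A5"]

linked = link_records(ehr_coded, clinic_confirmed)
for group, ids in linked.items():
    print(group, sorted(ids))
```

The three groups are what you need to judge a code: how often it is right when present, and how many real cases it misses.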

A Disconnect in Coding and Clinical Reality

The research revealed some eye-opening findings. In Leeds, between 2006 and 2016, only 44% of admissions with an endocarditis code represented a definite or possible case, and 24% of confirmed endocarditis cases had no endocarditis code assigned. The picture was slightly better in Oxford (2010 to 2016), where 56% of admissions coded for endocarditis represented a clinical case.

Some diagnostic codes commonly used in endocarditis studies had good positive predictive value (PPV) but low sensitivity. By contrast, one code, I38-secondary, had a PPV of under 6%. Using raw admission data to estimate endocarditis incidence exaggerated the incidence trends twofold.
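If PPV and sensitivity feel abstract, a small sketch may help. PPV is the share of coded admissions that are real cases, and sensitivity is the share of real cases that received a code. The counts below are hypothetical, chosen only so the ratios echo the Leeds percentages quoted above; they are not the study's data, and the function name is an assumption made for illustration.

```python
def ppv_and_sensitivity(true_pos, false_pos, false_neg):
    """Return (PPV, sensitivity) from linked counts.

    true_pos:  coded admissions that were confirmed cases
    false_pos: coded admissions that were not confirmed cases
    false_neg: confirmed cases that carried no endocarditis code
    """
    coded_total = true_pos + false_pos
    case_total = true_pos + false_neg
    if coded_total == 0 or case_total == 0:
        raise ValueError("need at least one coded admission and one confirmed case")
    ppv = true_pos / coded_total
    sensitivity = true_pos / case_total
    return ppv, sensitivity


# Hypothetical counts: 100 coded admissions, 44 of them real cases,
# plus 14 real cases that were never coded.
ppv, sensitivity = ppv_and_sensitivity(true_pos=44, false_pos=56, false_neg=14)
print(f"PPV: {ppv:.0%}")                  # PPV: 44%
print(f"Sensitivity: {sensitivity:.0%}")  # Sensitivity: 76%
```

A code can score well on one measure and poorly on the other, which is why the two need to be checked together before a code is used to count cases.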

Understanding the Discrepancies

Various reasons accounted for these discrepancies, such as changes in coding behavior over time, and coding guidance allowing the assignment of an ‘endocarditis’ code even when endocarditis wasn’t mentioned in the clinical notes.

The takeaway from this study is this: while diagnostic codes are valuable tools in healthcare research, their use without scrutinizing their sensitivity and predictive ability can result in inaccurate estimations of disease incidence and trends.

The Call to Validate and Curate

These findings underscore the need for careful data curation and validation of diagnostic codes to minimize the risk of serious errors in health record studies. The Latin phrase “caveat emptor” (let the buyer beware) fits well here: it reminds us to take a critical look at the diagnostic coding data we’re ‘buying’ into.

So the next time we read a study or statistic based on diagnostic codes, let’s remember the tale of endocarditis. It’s a clear reminder that a thorough understanding of health data requires going beyond the codes and delving into real clinical scenarios.

As always, we hope this exploration has left you more informed and intrigued. Stay tuned for more illuminating conversations about the fascinating world of healthcare. Stay healthy, stay curious!


Fawcett N, Young B, Peto L, Quan TP, Gillott R, Wu J, Middlemass C, Weston S, Crook DW, Peto TEA, Muller-Pebody B, Johnson AP, Walker AS, Sandoe JAT. ‘Caveat emptor’: the cautionary tale of endocarditis and the potential pitfalls of clinical coding data-an electronic health records study. BMC Med. 2019 Sep 4;17(1):169. doi: 10.1186/s12916-019-1390-x. PMID: 31481119; PMCID: PMC6724235.
