Case Study: Finding Rare Disease Patients in EHR Data

The best data for finding Rare Disease Patients is in the Electronic Health Record

“Most documented rare diseases have genetic origin. Because of their low individual frequency, an initial diagnosis based on phenotypic symptoms is not always easy, as practitioners might never have been exposed to patients suffering from the relevant disease. It is thus important to develop tools that facilitate symptom-based initial diagnosis of rare diseases by clinicians. In this work we aimed at developing a computational approach to aid in that initial diagnosis. We also aimed at implementing this approach in a user friendly web prototype. We call this tool Rare Disease Discovery. Finally, we also aimed at testing the performance of the prototype.”

“Methods. Rare Disease Discovery uses the publicly available ORPHANET data set of association between rare diseases and their symptoms to automatically predict the most likely rare diseases based on a patient’s symptoms. We apply the method to retrospectively diagnose a cohort of 187 rare disease patients with confirmed diagnosis. Subsequently we test the precision, sensitivity, and global performance of the system under different scenarios by running large scale Monte Carlo simulations. All settings account for situations where absent and/or unrelated symptoms are considered in the diagnosis.”

“Results. We find that this expert system has high diagnostic precision (≥80%) and sensitivity (≥99%), and is robust to both absent and unrelated symptoms.”

“Discussion. The Rare Disease Discovery prediction engine appears to provide a fast and robust method for initial assisted differential diagnosis of rare diseases.”
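
The abstract quoted above describes predicting the most likely rare diseases from a patient's symptoms using disease-symptom associations. The sketch below illustrates that general idea only. It assumes a simple Jaccard overlap score, which is a stand-in and not the scoring method used by Rare Disease Discovery (the abstract does not specify it). The association table, the disease and symptom names, and the function names are all hypothetical placeholders, not ORPHANET data.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical disease -> symptom associations.
# Placeholder names for illustration only; this is NOT ORPHANET data.
ASSOCIATIONS: Dict[str, Set[str]] = {
    "disease_A": {"symptom_1", "symptom_2", "symptom_3"},
    "disease_B": {"symptom_2", "symptom_4"},
    "disease_C": {"symptom_5", "symptom_6"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_diseases(
    patient_symptoms: Set[str],
    associations: Dict[str, Set[str]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank diseases by symptom overlap with the patient, highest score first.

    Assumed scoring: Jaccard similarity. Unrelated patient symptoms and
    absent disease symptoms both lower the score without excluding a disease.
    """
    scored = [
        (disease, jaccard(patient_symptoms, symptoms))
        for disease, symptoms in associations.items()
    ]
    # Drop diseases that share no symptom with the patient.
    scored = [(disease, score) for disease, score in scored if score > 0]
    # Sort by descending score, then by name for a stable order on ties.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# The patient has two symptoms of disease_A plus one unrelated symptom.
ranking = rank_diseases({"symptom_1", "symptom_2", "symptom_9"}, ASSOCIATIONS)
for disease, score in ranking:
    print(f"{disease}: {score:.2f}")
```

In this toy example the unrelated symptom_9 and the absent symptom_3 reduce disease_A's score but do not remove it from the ranking, which is the kind of tolerance to absent and unrelated symptoms the abstract refers to.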


The Phenotype Richness of EHR Data for Rare Disease Patients


Rich Electronic Health Record Data: EHR data is by far the richest healthcare data, particularly for finding rare disease patients. We help the Electronic Health Record do what it was intended to do.


The EHR is particularly rich in phenotypic clues that are strong indicators of probable disease. These clues are found only in the free-text, unstructured data of the EHR.



Rare disease patients also follow semi-predictable patterns of presentation across other data points found exclusively in EHR data.


Putting Technology to Work for Patients

Structured Data + Phenotype-Rich Unstructured Data + Intelligent Algorithms + Machine Learning + AI = Rare Disease Patient Discovery & Diagnosis

Rare Disease: It's in the Electronic Health Record Data

“The future is already invented; it just happens to be stuck in a research paper somewhere.”


"Can a decision support system accelerate rare disease diagnosis? Evaluating the potential impact of Ada DX in a retrospective study"

"RDAD: A Machine Learning System to Support Phenotype-Based Rare Disease Diagnosis"


"Computer-assisted initial diagnosis of rare diseases"