Methods for Long-Horizon Event Prediction from Electronic Health Record Data

March 4, 2024
12:00 pm to 1:00 pm

Event sponsored by:

Computational Biology and Bioinformatics (CBB)
Biostatistics and Bioinformatics
Duke Center for Genomic and Computational Biology (GCB)
Precision Genomics Collaboratory
School of Medicine (SOM)


Franklin, Monica


Matt Engelhard, PhD, Assistant Professor, Department of Biostatistics and Bioinformatics, Duke University


Early identification of autism, ADHD, and other neurodevelopmental conditions is important to ensure children receive appropriate developmental support and thereby optimize long-term outcomes. Early correlates of these conditions are documented in the electronic health record (EHR) during routine care, and our previous work has shown that EHR data can be leveraged to predict autism and ADHD likelihood with clinically meaningful accuracy before age 1. Like other EHR-based prediction tasks, this one is challenging because the relevant predictors are high-dimensional, multi-modal, and irregularly observed. Other challenges are amplified by the breadth of the target population (all children) and the length of time between prediction and outcome (several years). Specifically, there is substantial variability in the quantity and quality of data available for prediction, average follow-up length is much shorter than average time to diagnosis, and time to diagnosis is affected by systemic biases associated with the diagnosis process. This talk will explore methods we have developed in response to these challenges, with emphasis on a neural mixture cure model designed to disentangle factors affecting the age at diagnosis from factors affecting lifetime diagnosis probability.

CBB Monday Seminar Series