
Speaker: Dr. Sarah Lotspeich
Abstract: Data from electronic health records (EHRs) present a huge opportunity to operationalize a standardized whole-person health score in the learning health system and to identify at-risk patients on a large scale, but they are prone to missingness and errors. Ignoring these data quality issues could lead to biased statistical results and incorrect clinical decisions. Validating EHR data can improve their quality. Realistically, however, only a subset of patients' data can be validated, and most protocols do not recover missing data. Using a representative sample of 1000 patients from the EHR at a large learning health system (100 of whom could be validated), we bridge statistics and bioinformatics methods to design, conduct, and analyze statistically efficient and robust studies of the ALI and healthcare utilization. Employing semiparametric sieve maximum likelihood estimation, we robustly incorporate all available patient information into statistical models. Using targeted design strategies, we examine ways to select the most informative patients for validation. Incorporating clinical expertise, we devise a novel validation protocol to promote EHR data quality and completeness. Targeted validation with this enriched protocol allowed us to ensure the quality and promote the completeness of the EHR data. Findings from our validation study were incorporated into statistical models, which indicated that worse whole-person health was associated with higher odds of engaging in the healthcare system, after adjusting for age.
Bio: Dr. Sarah Lotspeich is an Assistant Professor of Statistics at Wake Forest University. Sarah earned her Ph.D. in Biostatistics from Vanderbilt University in 2021 and completed a postdoctoral fellowship in Biostatistics at the University of North Carolina at Chapel Hill in 2022. Her research tackles challenges in analyzing error-prone and incomplete real-world data, with a focus on international HIV cohorts, electronic health records, and health disparities, and in developing statistical models with censored covariates, with applications to Huntington's disease. Sarah has published in peer-reviewed statistical, clinical, and epidemiological journals, and she is the 2023 recipient of the David P. Byar Early Career Award from the American Statistical Association Biometrics Section.
Zoom: https://duke.zoom.us/j/93356559027?pwd=pKJMtCGCIz4QdKtpZ9umqyYiNcLWnc.1
Meeting ID: 933 5655 9027
Passcode: 532900