Quick Summary
This study used electronic health records to predict the onset of Alzheimer’s disease (AD) and related dementias (ADRD) in a population-based cohort. A Random Forest model achieved an AUC of 0.67 for predicting dementia, indicating that structured clinical data have a modest ability to predict the disease.
Key Details
- Dataset: 4,206 participants from the Cache County Study (1995-2008)
- Features used: 163 diagnostic features and 6 sociodemographic features
- Technology: Machine learning, specifically Random Forest models
- Performance: AUC of 0.67 for overall dementia prediction
Key Takeaways
- 12.4% of participants (522 of 4,206) developed incident dementia during the study period.
- Random Forest models with a 1-year prediction window gave the best predictive performance.
- AUC for AD/ADRD: 0.65; for ADRD alone: 0.49.
- Using ICD-based dementia diagnoses as the outcome raised the AUC to 0.77.
- Structured clinical data can modestly predict dementia, but further research is needed.
- Prior research corroborates this modest predictive ability.
- Study conducted in Cache County, Utah, focusing on aging and memory.
- PMID: 39468568.
Background
The prediction of dementia, particularly Alzheimer’s disease, is a critical area of research due to its growing prevalence in aging populations. Existing prediction models often rely on clinical notes, biomarkers, and neuroimaging, which may not always be readily available. This study explores whether commonly available structured clinical data from electronic health records can predict dementia.
Study
The research was conducted using data from the Cache County Study, which spanned from 1995 to 2008. The study aimed to link administrative healthcare data with sociodemographic information to predict dementia diagnoses. A total of 4,206 participants were analyzed, with a focus on identifying incident cases of dementia through machine learning techniques.
Results
The study found that 12.4% of participants developed dementia according to the “gold standard” assessments. The Random Forest model achieved an overall AUC of 0.67 for predicting dementia, with lower values for the subtypes (0.65 for AD/ADRD and 0.49 for ADRD alone). Using ICD-based diagnoses as the outcome raised the AUC to 0.77.
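To make the AUC values concrete, here is a minimal Python sketch of what the metric measures: the probability that a randomly chosen participant who developed dementia receives a higher risk score than a randomly chosen participant who did not. The function name `pairwise_auc` and the risk scores are hypothetical and invented for illustration. They are not study data, and this is not the authors' evaluation code.

```python
from itertools import product


def pairwise_auc(case_scores, control_scores):
    """Return the share of (case, control) pairs in which the case scores higher.

    Ties count as half a win, which matches the usual definition of the
    area under the ROC curve.
    """
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical model risk scores (illustration only, not from the study).
cases = [0.9, 0.6, 0.4]           # participants who developed dementia
controls = [0.7, 0.5, 0.3, 0.2]   # participants who did not

print(pairwise_auc(cases, controls))  # 9 of 12 pairs ranked correctly -> 0.75
```

On this scale 0.5 corresponds to random ranking, so the reported AUC of 0.49 for ADRD alone means the model ranked cases no better than chance, while 0.67 and 0.77 reflect progressively better ranking.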
Impact and Implications
The findings of this study suggest that structured clinical data can play a valuable role in predicting dementia, potentially leading to earlier interventions and better management of the disease. However, the modest predictive ability indicates that further research is necessary to refine these models and ensure they accurately reflect true dementia diagnoses. This could pave the way for improved healthcare strategies in managing Alzheimer’s disease and related dementias.
Conclusion
This study highlights the potential of using machine learning and electronic health records to predict the onset of Alzheimer’s disease and related dementias. While the results are promising, they also underscore the need for continued research to enhance the accuracy of dementia prediction models. The integration of AI in healthcare could significantly improve patient outcomes in the future.
Predicting the onset of Alzheimer’s disease and related dementia using electronic health records: findings from the cache county study on memory in aging (1995-2008).
Abstract
INTRODUCTION: Clinical notes, biomarkers, and neuroimaging have proven valuable in dementia prediction models. Whether commonly available structured clinical data can predict dementia is an emerging area of research. We aimed to predict gold-standard, research-based diagnoses of dementia including Alzheimer’s disease (AD) and/or Alzheimer’s disease related dementias (ADRD), in addition to ICD-based AD and/or ADRD diagnoses, in a well-phenotyped, population-based cohort using a machine learning approach.
METHODS: Administrative healthcare data (k = 163 diagnostic features), in addition to census/vital record sociodemographic data (k = 6 features), were linked to the Cache County Study (CCS, 1995-2008).
RESULTS: Among successfully linked UPDB-CCS participants (n = 4206), 522 (12.4%) had incident dementia (AD alone, AD comorbid with ADRD, or ADRD alone) as per the CCS “gold standard” assessments. Random Forest models, with a 1-year prediction window, achieved the best performance with an Area Under the Curve (AUC) of 0.67. Accuracy declined for dementia subtypes: AD/ADRD (AUC = 0.65); ADRD (AUC = 0.49). Accuracy improved when using ICD-based dementia diagnoses (AUC = 0.77).
DISCUSSION: Commonly available structured clinical data (without labs, notes, or prescription information) demonstrate modest ability to predict “gold-standard” research-based AD/ADRD diagnoses, corroborated by prior research. Using ICD diagnostic codes to identify dementia as done in the majority of machine learning dementia prediction models, as compared to “gold-standard” dementia diagnoses, can result in higher accuracy, but whether these models are predicting true dementia warrants further research.
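The incidence figure in the RESULTS paragraph can be checked with a few lines of Python. The sketch below uses only the two counts reported in the abstract. The "always predict no dementia" baseline is a hypothetical comparison added here for illustration, not something the authors report.

```python
n_participants = 4206  # linked UPDB-CCS participants (from the abstract)
n_incident = 522       # incident dementia cases (from the abstract)

# Proportion of participants who developed dementia.
incidence = n_incident / n_participants
print(f"Incidence: {incidence:.1%}")

# Hypothetical baseline: a classifier that labels everyone "no dementia"
# is correct for every participant who did not develop dementia.
baseline_accuracy = 1 - incidence
print(f"Always-negative accuracy: {baseline_accuracy:.1%}")
```

With roughly one case in eight participants, a classifier that never predicts dementia would still be right about 87.6% of the time, which is one reason a ranking metric such as AUC is more informative than raw accuracy for this kind of outcome.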
Authors: Schliep KC, Thornhill J, Tschanz JT, Facelli JC, Østbye T, Sorweid MK, Smith KR, Varner M, Boyce RD, Cliatt Brown CJ, Meeks H, Abdelrahman S
Journal: BMC Med Inform Decis Mak
Citation: Schliep KC, et al. Predicting the onset of Alzheimer’s disease and related dementia using electronic health records: findings from the cache county study on memory in aging (1995-2008). BMC Med Inform Decis Mak. 2024; 24:316. doi: 10.1186/s12911-024-02728-4