Quick Summary
This study used machine learning to identify key clinical variances associated with prolonged length of stay (PLOS) in patients with lung cancer who underwent video-assisted thoracoscopic surgery. Ridge regression performed best among the five algorithms tested, with an AUC of 0.84 in the derivation cohort and 0.82 in the test cohort, and counterfactual analysis highlighted factors such as abnormal respiratory sounds and postoperative fever.
Key Details
- Dataset: 480 lung cancer patients, mean age 68.3 years
- Features used: Patient demographics, comorbidities, surgical details, medications
- Technology: Machine learning models including ridge regression, random forest, and extreme gradient boosting
- Performance: Ridge regression achieved an AUC of 0.84 and a Brier score of 0.16 in the derivation cohort (0.82 and 0.17 in the test cohort)
Key Takeaways
- Prolonged hospital stays can lead to inefficiencies in healthcare delivery.
- Machine learning effectively identifies clinical variances linked to PLOS.
- Data was sourced from the ePath system, focusing on real-world patient outcomes.
- Ridge regression was the most effective model for predicting PLOS.
- Six key variables were identified as strongly linked to PLOS, including abnormal respiratory sounds and postoperative fever.
- Study conducted at a university hospital between 2019 and 2023.
- Clinical implications suggest improved patient management through automated decision-making tools.

Background
Prolonged hospital stays can significantly impact healthcare systems, leading to increased costs and resource utilization. Understanding the factors contributing to these extended stays is crucial for enhancing patient care and optimizing clinical pathways. This study aimed to leverage real-world data and advanced analytics to uncover these factors, particularly in patients undergoing video-assisted thoracoscopic surgery for lung cancer.
Study
The research involved a cohort of 480 patients who underwent surgery between 2019 and 2023. The study defined PLOS as a hospital stay exceeding 9 days post-surgery. Various clinical variables were collected and analyzed using machine learning techniques, including sparse linear regression and decision tree ensembles, to identify predictors of PLOS.
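The PLOS definition above, together with the period-based derivation/testing split described in the abstract's methods, can be sketched in a few lines. This is a minimal illustration, not the authors' code: the `Stay` record, the function names, the sample dates, and especially the 2022-01-01 cutoff are assumptions, since the source does not state where the study period was divided.

```python
from dataclasses import dataclass
from datetime import date

# From the study: PLOS is a stay exceeding 9 days after surgery.
PLOS_THRESHOLD_DAYS = 9


@dataclass
class Stay:
    """Hypothetical record layout; the ePath schema is not given in the source."""
    patient_id: str
    surgery_date: date
    discharge_date: date


def is_plos(stay: Stay, threshold_days: int = PLOS_THRESHOLD_DAYS) -> bool:
    """Return True when the postoperative stay is strictly longer than the threshold."""
    return (stay.discharge_date - stay.surgery_date).days > threshold_days


def temporal_split(stays, cutoff: date):
    """Earlier surgeries form the derivation cohort, later ones the testing cohort."""
    derivation = [s for s in stays if s.surgery_date < cutoff]
    testing = [s for s in stays if s.surgery_date >= cutoff]
    return derivation, testing


# Made-up stays for illustration only.
stays = [
    Stay("A", date(2019, 5, 1), date(2019, 5, 8)),    # 7 days after surgery
    Stay("B", date(2020, 3, 10), date(2020, 3, 22)),  # 12 days
    Stay("C", date(2022, 7, 4), date(2022, 7, 13)),   # exactly 9 days, so not PLOS
    Stay("D", date(2023, 1, 15), date(2023, 1, 30)),  # 15 days
]

# The cutoff date is an assumption; the source only says "earlier" vs "later" period.
derivation, testing = temporal_split(stays, cutoff=date(2022, 1, 1))
labels = {s.patient_id: is_plos(s) for s in stays}
print(labels)
```

Splitting by time rather than at random means the model is evaluated on patients treated after those it was trained on, which is closer to how it would be used in practice.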
Results
A 3D heatmap illustrated the temporal relationships between clinical factors and PLOS from admission to 4 days after surgery. Among the five algorithms tested, the ridge regression model performed best in both discrimination and calibration, achieving AUCs of 0.84 and 0.82 and Brier scores of 0.16 and 0.17 in the derivation and test cohorts, respectively. Counterfactual analysis with the final model identified six variables most strongly associated with PLOS: abnormal respiratory sounds, postoperative fever, arrhythmia, impaired ambulation, complications after drain removal, and pulmonary air leaks.
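For readers less familiar with the two reported metrics, the sketch below computes them from scratch. The AUC is the probability that a randomly chosen PLOS patient receives a higher predicted risk than a randomly chosen non-PLOS patient (discrimination); the Brier score is the mean squared gap between predicted risk and outcome, so lower is better. The labels and probabilities are made-up toy values, not study data, and the function names are illustrative.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = PLOS, 0 = no PLOS (illustrative values only).
y_true = [1, 0, 1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
print(f"AUC = {auc:.3f}, Brier = {brier:.3f}")
```

The pairwise loop is quadratic in the number of patients, which is fine for a cohort of a few hundred; library implementations typically use a rank-based formula instead.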
Impact and Implications
The findings from this study have significant implications for clinical practice. By utilizing machine learning to identify critical variances in clinical pathways, healthcare providers can enhance decision-making processes and improve patient management strategies. This automated approach could lead to reduced hospital stays and better resource allocation, ultimately benefiting both patients and healthcare systems.
Conclusion
This study underscores the potential of machine learning in identifying factors associated with prolonged hospital stays. By focusing on real-world data, healthcare professionals can gain valuable insights that may lead to improved patient outcomes and more efficient healthcare delivery. The integration of such technologies into clinical practice holds promise for the future of patient management.
Identifying Key Variances in Clinical Pathways Associated With Prolonged Hospital Stays Using Machine Learning and ePath Real-World Data: Model Development and Validation Study.
Abstract
BACKGROUND: Prolonged hospital stays can lead to inefficiencies in health care delivery and unnecessary consumption of medical resources.
OBJECTIVE: This study aimed to identify key clinical variances associated with prolonged length of stay (PLOS) in clinical pathways using a machine learning model trained on real-world data from the ePath system.
METHODS: We analyzed data from 480 patients with lung cancer (age: mean 68.3, SD 11.2 years; n=263, 54.8% men) who underwent video-assisted thoracoscopic surgery at a university hospital between 2019 and 2023. PLOS was defined as a hospital stay exceeding 9 days after video-assisted thoracoscopic surgery. The variables collected between admission and 4 days after surgery were examined, and those that showed a significant association with PLOS in univariate analyses (P<.01) were selected as predictors. Predictive models were developed using sparse linear regression methods (Lasso, ridge, and elastic net) and decision tree ensembles (random forest and extreme gradient boosting). The data were divided into derivation (earlier study period) and testing (later period) cohorts for temporal validation. The model performance was assessed using the area under the receiver operating characteristic curve, Brier score, and calibration plots. Counterfactual analysis was used to identify key clinical factors influencing PLOS.
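To make the ridge idea concrete, here is a minimal sketch of a logistic model with an L2 (ridge) penalty fitted by plain gradient descent. It assumes the ridge model is a penalized logistic regression for the binary PLOS outcome, which the abstract does not spell out; the feature columns, toy data, penalty strengths, and learning settings are all hypothetical and not taken from the study.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, l2=1.0, lr=0.1, epochs=2000):
    """Minimize mean log loss + (l2 / (2n)) * ||w||^2; the intercept is not penalized."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        for j in range(d):
            # The l2 * w[j] term is what shrinks the coefficients toward zero.
            w[j] -= lr * (grad_w[j] + l2 * w[j]) / n
        b -= lr * grad_b / n
    return w, b


def l2_norm(w):
    return math.sqrt(sum(wj * wj for wj in w))


# Toy binary predictors (columns are hypothetical, e.g. fever and arrhythmia flags).
X = [[1, 0], [1, 1], [0, 1], [0, 0], [1, 0], [0, 0], [1, 1], [0, 1]]
y = [1, 1, 0, 0, 1, 0, 1, 1]

w_weak, b_weak = fit_ridge_logistic(X, y, l2=0.01)
w_strong, b_strong = fit_ridge_logistic(X, y, l2=5.0)
print("weak penalty  :", [round(v, 3) for v in w_weak])
print("strong penalty:", [round(v, 3) for v in w_strong])
```

Unlike Lasso, the ridge penalty shrinks coefficients without setting them exactly to zero, so every selected predictor stays in the model with a damped weight.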
RESULTS: A 3D heatmap illustrated the temporal relationships between clinical factors and PLOS based on patient demographics, comorbidities, functional status, surgical details, care processes, medications, and variances recorded from admission to 4 days after surgery. Among the 5 algorithms evaluated, the ridge regression model demonstrated the best performance in terms of both discrimination and calibration. Specifically, it achieved area under the receiver operating characteristic curve values of 0.84 and 0.82 and Brier scores of 0.16 and 0.17 in the derivation and test cohorts, respectively. In the final model, a range of variables, including blood tests, care, patient background, procedures, and clinical variances, were associated with PLOS. Among these, particular emphasis was placed on clinical variances. Counterfactual analysis using the ridge regression model identified 6 key variables strongly linked to PLOS. In order of impact, these were abnormal respiratory sounds, postoperative fever, arrhythmia, impaired ambulation, complications after drain removal, and pulmonary air leaks.
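One simple way to picture a counterfactual analysis on a linear model is to ask, for each patient who had a given variance, how much the predicted PLOS risk would drop had that variance not occurred. The sketch below does this for a logistic model. It is an assumption about the general approach rather than the authors' exact procedure, and the coefficients, intercept, and patients are invented for illustration; only the variable names come from the abstract, and the resulting ranking simply mirrors the made-up weights.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, intercept, patient):
    """Predicted PLOS probability for a patient given as a dict of 0/1 flags."""
    z = intercept + sum(w * patient.get(name, 0) for name, w in weights.items())
    return sigmoid(z)


def counterfactual_impact(weights, intercept, patients):
    """Mean drop in predicted risk when each variance is switched off where it occurred."""
    impact = {}
    for name in weights:
        affected = [p for p in patients if p.get(name, 0) == 1]
        if not affected:
            impact[name] = 0.0
            continue
        drops = []
        for p in affected:
            counterfactual = dict(p)
            counterfactual[name] = 0  # same patient, variance removed
            drops.append(predict(weights, intercept, p)
                         - predict(weights, intercept, counterfactual))
        impact[name] = sum(drops) / len(drops)
    return impact


# Hypothetical coefficients, NOT the study's estimates.
weights = {
    "abnormal_respiratory_sounds": 1.2,
    "postoperative_fever": 0.9,
    "arrhythmia": 0.6,
}
intercept = -2.0

# Made-up patients for illustration only.
patients = [
    {"abnormal_respiratory_sounds": 1},
    {"postoperative_fever": 1},
    {"arrhythmia": 1},
    {"abnormal_respiratory_sounds": 1, "postoperative_fever": 1},
]

impact = counterfactual_impact(weights, intercept, patients)
ranking = sorted(impact, key=impact.get, reverse=True)
for name in ranking:
    print(f"{name}: {impact[name]:.3f}")
```

Because the logistic function is nonlinear, the same coefficient produces a different risk change for different patients, which is why the impact is averaged over the patients who actually had the variance.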
CONCLUSIONS: A machine learning-based model using ePath data effectively identified critical variances in the clinical pathways associated with PLOS. This automated tool may enhance clinical decision-making and improve patient management.
Authors: Tou S, Matsumoto K, Hashinokuchi A, Kinoshita F, Nohara Y, Yamashita T, Wakata Y, Takenaka T, Soejima H, Yoshizumi T, Nakashima N, Kamouchi M
Journal: JMIR Med Inform
Citation: Tou S, et al. Identifying Key Variances in Clinical Pathways Associated With Prolonged Hospital Stays Using Machine Learning and ePath Real-World Data: Model Development and Validation Study. JMIR Med Inform. 2025; 13:e71617. doi: 10.2196/71617