Quick Summary
A recent study developed a random survival forest (RSF) model utilizing machine learning to predict early recurrence after hepatectomy for adult hepatocellular carcinoma (HCC). The RSF model demonstrated superior predictive ability compared to traditional models, achieving a C-index of 0.896 in the training group.
Key Details
- Dataset: 541 patients with HCC
- Technology: Random Survival Forest (RSF) model
- Performance: C-index for RSF: 0.896 (training), 0.798 (validation)
- AUC values: RSF model at 6 months: 0.971 (training), 0.830 (validation)
Key Takeaways
- The RSF model outperformed the traditional Cox Proportional Hazard (CPH) model in predicting early recurrence of HCC.
- AUC values for the RSF model were consistently higher than those of the CPH and albumin-bilirubin (ALBI) grade models.
- Key variables identified included microvascular invasion (MVI), liver capsule invasion (LCI), and satellite nodules (SN).
- Clinical utility of the RSF model was supported by decision curve analysis (DCA).
- Patients were successfully stratified into low-, medium-, and high-risk groups.
- Predictive performance was assessed at 6, 12, and 18 months post-surgery.
- The study was a retrospective analysis of 541 patients at a single center.
- The findings have implications for postoperative care and follow-up strategies for HCC patients.
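Decision curve analysis compares models by their net benefit across threshold probabilities. The following is a minimal sketch of the standard net benefit formula, assuming binary recurrence outcomes at a fixed time point with no censoring (a simplification; the study's survival data would call for a censoring-aware version). The function name and the toy data are illustrative and do not come from the paper.

```python
def net_benefit(risks, events, threshold):
    """Net benefit of treating every patient whose predicted risk is >= threshold.

    risks:     predicted recurrence probabilities (hypothetical values)
    events:    1 if recurrence occurred, 0 otherwise
    threshold: threshold probability, strictly between 0 and 1
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(risks)
    true_pos = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 1)
    false_pos = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


# Toy data, not from the study.
toy_risks = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
toy_events = [1, 1, 0, 1, 0, 0]

model_nb = net_benefit(toy_risks, toy_events, threshold=0.5)
# "Treat all" strategy: every patient is flagged regardless of predicted risk.
treat_all_nb = net_benefit([1.0] * len(toy_events), toy_events, threshold=0.5)
print(round(model_nb, 4), round(treat_all_nb, 4))
```

A decision curve repeats this calculation over a range of thresholds for each model and for the "treat all" and "treat none" (net benefit 0) strategies; a model is considered more useful where its curve lies above the alternatives.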
Background
Hepatocellular carcinoma (HCC) is known for its high rates of early recurrence following surgical intervention, which significantly impacts patient prognosis. Traditional predictive models, such as the Cox Proportional Hazard (CPH) model, often fall short due to their linear assumptions and inability to capture the complexity of clinical data. This study explores the potential of machine learning, specifically the random survival forest (RSF) model, to enhance predictive accuracy in this challenging clinical scenario.
Study
This retrospective, single-center cohort study analyzed 541 patients who underwent hepatectomy for HCC; this is the final cohort that remained after 41 patients had been excluded. The cohort was randomly divided at a 7:3 ratio into a training group (378 patients) and a validation group (163 patients). LASSO regression was applied to the training group to identify significant risk factors, which were then used to develop both the RSF and CPH models for comparison.
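The 7:3 split can be illustrated with a short sketch. The paper does not describe how the randomization was implemented, so the shuffling approach, the seed, and the function name below are assumptions; the example only shows that flooring 70% of 541 gives the reported group sizes.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=0):
    """Shuffle patient IDs and split them into training and validation lists."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(ids)
    n_train = int(len(ids) * train_fraction)  # floor of 0.7 * 541 is 378
    return ids[:n_train], ids[n_train:]


# Hypothetical patient IDs 0..540 stand in for the 541 patients.
train_ids, valid_ids = split_cohort(range(541))
print(len(train_ids), len(valid_ids))  # 378 163
```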
Results
The RSF model, built using 500 trees, achieved a C-index of 0.896 in the training group and 0.798 in the validation group, outperforming the CPH model (C-index: 0.803 and 0.772, respectively). The area under the receiver operating characteristic curve (AUC) for the RSF model at 6 months was 0.971 in the training group and 0.830 in the validation group, indicating robust predictive capabilities. Additionally, the RSF model demonstrated superior clinical utility as evidenced by decision curve analysis.
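For readers unfamiliar with the C-index, here is a minimal sketch of Harrell's concordance index on toy data. It is a plain O(n²) illustration, not the authors' implementation; the function name, the handling of tied risk scores (counted as half), and the example values are assumptions.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs that the risk score ranks correctly.

    times:  follow-up time for each patient
    events: 1 if recurrence was observed, 0 if censored
    risks:  predicted risk score (higher means earlier recurrence expected)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, not from the study: times in months, 1 = recurrence, 0 = censored.
toy_times = [5, 8, 12, 15, 20]
toy_events = [1, 1, 0, 1, 0]
toy_risks = [0.9, 0.4, 0.5, 0.6, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risks)
print(c_index)  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the ALBI grade model's C-index of roughly 0.52 reported in the abstract indicates almost no discrimination.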
Impact and Implications
The findings from this study suggest that the RSF model could significantly improve the management of patients undergoing hepatectomy for HCC. By accurately predicting early recurrence, clinicians can better stratify patients into risk categories, allowing for tailored follow-up care and potentially improving overall patient outcomes. This advancement in predictive modeling represents a promising step forward in the fight against HCC recurrence.
Conclusion
The random survival forest model has shown remarkable efficacy in predicting early recurrence after hepatectomy for HCC, surpassing traditional models in both predictive accuracy and clinical utility. As we continue to integrate machine learning into clinical practice, tools like the RSF model may become invaluable in guiding postoperative care and improving patient prognoses. Further research is encouraged to validate these findings across diverse populations and clinical settings.
Construction of a random survival forest model based on a machine learning algorithm to predict early recurrence after hepatectomy for adult hepatocellular carcinoma.
Abstract
BACKGROUND AND AIMS: Hepatocellular carcinoma (HCC) exhibits a propensity for early recurrence following liver resection, resulting in a bleak prognosis. At present, the majority of the predictive models for the early postoperative recurrence of HCC rely on the linear assumption of the Cox Proportional Hazard (CPH) model. However, the predictive efficacy of this model is constrained by the intricate nature of clinical data. The present study aims to investigate the efficacy of the random survival forest (RSF) model, which is a machine learning algorithm, in predicting the early postoperative recurrence of HCC, and compare its performance with that of the traditional CPH model. This analysis seeks to elucidate the potential advantages of the RSF model over the CPH model in addressing this clinical challenge.
METHODS: The present retrospective cohort study was conducted at a single center. After excluding 41 patients, a total of 541 patients were included in the final model construction and subsequent analysis. The patients were randomly divided into two groups at a 7:3 ratio: training group (n = 378) and validation group (n = 163). The least absolute shrinkage and selection operator (LASSO) regression was used to identify the risk factors in the training group. Then, the identified factors were used to develop the RSF and CPH regression models. The predictive ability of the model was assessed using the concordance index (C-index). The accuracy of the model predictions was evaluated using the receiver operating characteristic curve (ROC) and area under the receiver operating characteristic curve (AUC). The clinical practicality of the model was measured by decision curve analysis (DCA), and the overall performance of the model was evaluated using the Brier score. The RSF model was visually represented using the Shapley additive explanations (SHAP) framework. Then, the RSF, CPH regression, and albumin-bilirubin (ALBI) grade models were compared.
RESULTS: The following variables were selected by LASSO regression: alpha fetoprotein (AFP), gamma-glutamyl transpeptidase to platelet ratio (GPR), blood transfusion (BT), microvascular invasion (MVI), large vessel invasion (LVI), Edmondson-Steiner (ES) grade, liver capsule invasion (LCI), satellite nodule (SN), and Barcelona clinic liver cancer (BCLC) grade. Then, an RSF model was developed using 500 trees, and the variable importance (VIMP) ranking was MVI, LCI, SN, BT, BCLC, ES grade, AFP, GPR and LVI. After these aforementioned factors were applied, the RSF and CPH regression models were developed and compared with the ALBI grade model. The C-index for the RSF model (0.896 and 0.798, respectively) outperformed that of the CPH regression model (0.803 and 0.772, respectively) and ALBI grade model (0.517 and 0.515, respectively), in both the training and validation groups. Three time points were selected to assess the predictive capabilities of these models: 6, 12 and 18 months. For the training group, the AUC value for the RSF model at 6, 12 and 18 months was 0.971 (95% CI: 0.955-0.988), 0.919 (95% CI: 0.887-0.951) and 0.899 (95% CI: 0.867-0.932), respectively. For the validation cohort, the AUC value for the RSF model at 6, 12 and 18 months was 0.830 (95% CI: 0.728-0.932), 0.856 (95% CI: 0.787-0.924) and 0.832 (95% CI: 0.764-0.901), respectively. The AUC values were higher in the RSF model, when compared to the CPH regression model and ALBI grade model, in both groups. The DCA results revealed that the net clinical benefits associated with the RSF model were superior to those associated with the CPH regression model and ALBI grade model in both groups, suggesting a higher level of clinical utility in the RSF model. The Brier score for the RSF model at 6, 12 and 18 months was 0.062, 0.125 and 0.178, respectively, in the training group, and 0.111, 0.128 and 0.149, respectively, in the validation group.
In summary, the RSF model demonstrated superior predictive ability, accuracy, clinical practicality, and overall performance, when compared to the CPH regression model and ALBI grade model. In addition, the RSF model was able to successfully stratify patients into three distinct risk groups (low-risk, medium-risk and high-risk) in both the training and validation groups (p < 0.001).
CONCLUSIONS: The RSF model demonstrates efficacy in predicting early recurrence following HCC surgery, exhibiting superior performance, when compared to the CPH regression model and ALBI grade model. For patients undergoing HCC surgery, the RSF model can serve as a valuable tool for clinicians to postoperatively stratify patients into distinct risk categories, offering guidance for subsequent follow-up care.
Authors: Zhang J, Chen Q, Zhang Y, Zhou J
Journal: BMC Cancer
Citation: Zhang J, et al. Construction of a random survival forest model based on a machine learning algorithm to predict early recurrence after hepatectomy for adult hepatocellular carcinoma. BMC Cancer. 2024; 24:1575. doi: 10.1186/s12885-024-13366-4