Machine learning-based outcome prediction models can be utilised in personalised medicine to provide individualised estimates of patient prognosis.
Prior to clinical use, it is necessary to demonstrate the model’s accuracy and generalisability in the broader population by performing external validation in heterogeneous patient populations.
This study aims to:
Perform an external validation of a larynx cancer outcome prediction model for 2-year overall survival (OS) and local recurrence (LR) in an Australian patient cohort.
Determine the impact of missing data by comparing model...
Methods and materials
The study cohort comprised patients with laryngeal squamous cell carcinoma diagnosed between 2010 and 2018 and treated with primary radiation therapy (RT) +/- systemic therapy at Liverpool and Macarthur cancer therapy centres (LCTC).
Uncurated data stored in a structured format were extracted from electronic medical records (EMR).
Manual data curation was conducted to retrieve missing variable values from non-structured locations within the EMR.
Imputation methods were used to address missing values in both the curated and uncurated datasets.
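The imputation methods used are not named in this section. As a minimal sketch of one common approach, assuming simple median imputation for numeric variables and mode imputation for categorical variables, the step could look like the following. The function name `impute_column` and the toy values are hypothetical and are not study data.

```python
from statistics import median, mode


def impute_column(values, kind):
    """Fill None entries with the median (numeric) or mode (categorical)
    of the observed values in the same column."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = median(observed) if kind == "numeric" else mode(observed)
    return [fill if v is None else v for v in values]


# Hypothetical toy columns with missing entries (illustrative only).
ages = [61, None, 70, 58, None]
t_stage = ["T2", "T1", None, "T2", "T3"]

ages_filled = impute_column(ages, "numeric")
t_stage_filled = impute_column(t_stage, "categorical")
```

Applying the same function to both the curated and uncurated datasets would keep the comparison between them like-for-like. More elaborate approaches such as multiple imputation may be preferable when many values are missing.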
Performance of the model was expressed as...
A total of 106 patients met eligibility criteria for model validation in the uncurated cohort (LCTC uncurated) and 105 patients in the curated cohort (LCTC curated); one patient was reclassified post curation as having had a non-laryngeal primary tumour. Patient characteristics are summarised in Table 1. [Fig 1]
The patient characteristics of the LCTC cohort differed from those of the original published model cohort. The original model cohort consisted of patients receiving RT only, whereas the LCTC cohort included patients who received RT alone or with systemic therapy. Therefore...
External validation of a larynx outcome prediction model demonstrated similar model discrimination performance to the original model for OS but inferior performance for LR in our validation patient cohort.
The LR model (both cohorts) and OS model (uncurated cohort) demonstrated poor calibration in the validation patient cohort, with calibration curves demonstrating differences between predicted and actual survival outcomes.
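To illustrate the distinction between discrimination and calibration drawn above, the following is a minimal sketch and not the analysis code used in this study. It assumes a binary outcome at 2 years with no censoring, uses the area under the ROC curve (AUC) as the discrimination measure, and groups patients by predicted risk for calibration. The function names and toy values are hypothetical.

```python
def auc(y_true, y_prob):
    """Discrimination: probability that a randomly chosen patient with the
    event has a higher predicted risk than one without (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def calibration_table(y_true, y_prob, n_groups=2):
    """Calibration: sort patients by predicted risk, split into n_groups of
    roughly equal size, and return (mean predicted risk, observed event
    rate, group size) for each non-empty group."""
    pairs = sorted(zip(y_prob, y_true))
    size = len(pairs) / n_groups
    rows = []
    for g in range(n_groups):
        chunk = pairs[round(g * size):round((g + 1) * size)]
        if chunk:
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            observed = sum(y for _, y in chunk) / len(chunk)
            rows.append((mean_pred, observed, len(chunk)))
    return rows


# Hypothetical toy data: 1 = event observed by 2 years, 0 = no event.
y_true = [0, 0, 1, 0, 1, 1]
y_prob = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]

discrimination = auc(y_true, y_prob)
calibration = calibration_table(y_true, y_prob, n_groups=2)
```

A model can rank patients well (high AUC) while still over- or under-estimating absolute risk, which appears as a gap between the mean predicted risk and the observed event rate within each group.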
Model performance in our validation patient cohort may have been affected by several factors:
Small sample size of the validation patient cohort...
 A. G. T. M. Egelmeer et al., “Development and validation of a nomogram for prediction of survival and local control in laryngeal carcinoma patients treated with radiotherapy alone: A cohort study based on 994 patients,” Radiother. Oncol., vol. 100, pp. 108–115, Jul. 2011, doi: 10.1016/j.radonc.2011.06.023.