This retrospective analysis included 259 clinical CT examinations of patients with abdominal diseases.
Patients with focal liver parenchyma changes (e.g. neoplasia or abscesses) or a history of liver surgery or liver interventions were excluded.
For all 259 consecutive patients, the Child-Pugh Score was determined during the period of hospitalization.
The Child-Pugh Score is built on 5 sub-scores:
3 laboratory parameters (prothrombin time,
bilirubin,
and albumin) and 2 clinical parameters (ascites and encephalopathy).
The cut-off values for all parameters were defined according to Forman et al.
[11] (Tab. 1).
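As an illustration of how the 5 sub-scores combine, the following minimal sketch maps them to a total score and a class. It assumes the conventional scheme in which each parameter contributes 1 to 3 points and the classes are A (5 to 6 points), B (7 to 9 points) and C (10 to 15 points). The per-parameter cut-off values from Tab. 1 are not reproduced here, so the function expects sub-scores that have already been graded; the function and key names are hypothetical.

```python
PARAMETERS = ("prothrombin_time", "bilirubin", "albumin", "ascites", "encephalopathy")

def child_pugh_class(sub_scores):
    """Return (total score, class) from five sub-scores of 1, 2 or 3 points each.

    Assumption: the sub-scores were already graded with the cut-offs of Tab. 1.
    """
    if set(sub_scores) != set(PARAMETERS):
        raise ValueError("expected exactly the five Child-Pugh parameters")
    if any(points not in (1, 2, 3) for points in sub_scores.values()):
        raise ValueError("each sub-score must be 1, 2 or 3 points")
    total = sum(sub_scores.values())
    if total <= 6:
        return total, "A"
    if total <= 9:
        return total, "B"
    return total, "C"

example = {"prothrombin_time": 2, "bilirubin": 1, "albumin": 2,
           "ascites": 2, "encephalopathy": 1}
print(child_pugh_class(example))
```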
CT was performed using helical CT scanners.
The contrast-enhanced images were acquired following body-weight adapted application of iodinated contrast media.
Subsequently, the non-enhanced as well as the hepatic arterial and portal venous contrast phases were acquired.
All phases were registered using an automated rigid transformation and checked visually.
For liver delineation,
the venous phase was transferred to a separate workstation and analyzed semi-automatically (Fig. 1).
All segmentations were checked visually and corrected manually in those cases where automatic delineation missed the liver or vessel surface.
Radiomic features comprised first order statistical features as well as shape features and texture features (Gray-Level Co-Occurrence Matrix,
Gray-Level Size Zone Matrix,
Gray-Level Run Length Matrix).
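To make the texture features more concrete, the following toy sketch builds a Gray-Level Co-Occurrence Matrix for a small 2D image and derives one texture feature (contrast) from it. It is an illustration only and not the feature extraction pipeline used in this study: the image is assumed to be already discretized into integer gray levels, a single 2D offset is used instead of a 3D neighborhood, and the function names are hypothetical.

```python
def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized co-occurrence matrix of a 2D image with integer gray levels."""
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]

def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))

toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy_image, levels=4)
print(glcm_contrast(p))
```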
In total,
3 machine learning approaches were used to predict the Child-Pugh Score:
1) Linear Regression (LR).
2) Random Forest (RF).
3) Convolutional Neural Network (CNN).
The performance of each machine learning model was assessed based on a 10-fold cross-validation procedure with splits into 10 % test,
27 % validation and 63 % training data,
respectively.
Splits were stratified such that a patient only ever belonged to one of the 3 sets.
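The split proportions can be sketched as follows. The sketch assumes that each of the 10 folds serves once as the 10 % test set and that the remaining 90 % is divided 70/30 into training and validation data, which yields 63 % and 27 % of all patients; the text does not state how the 27 % was obtained, so this is one plausible reading. Splitting is done on patient identifiers so that no patient appears in more than one set. Balancing of Child-Pugh classes across sets is omitted, and the function name is hypothetical.

```python
import random

def ten_fold_splits(patient_ids, n_folds=10, val_fraction=0.3, seed=0):
    """Yield patient-level train/validation/test splits, one per fold."""
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    folds = [ids[k::n_folds] for k in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        rest = [pid for j, fold in enumerate(folds) if j != k for pid in fold]
        rng.shuffle(rest)
        n_val = round(val_fraction * len(rest))  # 30 % of the remaining 90 %
        yield {"train": rest[n_val:], "val": rest[:n_val], "test": test}

splits = list(ten_fold_splits(range(259)))
for split in splits[:2]:
    print({name: len(ids) for name, ids in split.items()})
```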
For the human-reader-based rating (Radiologist’s prediction,
RP) of Child-Pugh class,
3 radiologists rated the appearance of the liver and assigned each CT to the corresponding Child-Pugh class.
Agreement with the classes predicted by the machine learning algorithms was quantified using Spearman’s rank correlation coefficient (ρ).
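For illustration, Spearman’s ρ can be computed as the Pearson correlation of the ranks, with tied values receiving their average rank. The sketch below assumes that the ordinal classes A, B and C are encoded as 1, 2 and 3; this encoding and the function names are assumptions made for the example.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one rating is constant")
    return cov / (var_x * var_y) ** 0.5

rating_a = [1, 1, 2, 3]  # hypothetical classes encoded as A=1, B=2, C=3
rating_b = [1, 2, 2, 3]
print(spearman_rho(rating_a, rating_b))
```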
Moreover,
the accuracy was determined for each model.
The measured accuracies were tested against the no-information-rate (NIR).
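The NIR is the accuracy that would be reached by always predicting the most frequent class. The text does not state which statistical test was applied; the sketch below assumes a one-sided exact binomial test, which is one common choice for this comparison. The function names and example labels are hypothetical.

```python
from collections import Counter
from math import comb

def no_information_rate(true_labels):
    """Proportion of the most frequent class among the true labels."""
    counts = Counter(true_labels)
    return max(counts.values()) / len(true_labels)

def p_value_accuracy_above_nir(n_correct, n_total, nir):
    """One-sided exact binomial test: P(X >= n_correct) for X ~ Bin(n_total, nir)."""
    return sum(
        comb(n_total, k) * nir ** k * (1 - nir) ** (n_total - k)
        for k in range(n_correct, n_total + 1)
    )

labels = ["A"] * 6 + ["B"] * 3 + ["C"] * 1  # hypothetical class distribution
nir = no_information_rate(labels)
print(nir, p_value_accuracy_above_nir(9, len(labels), nir))
```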
Additionally, the classification into low disease severity (Child-Pugh class = A) and advanced disease severity (Child-Pugh class ≥ B) was evaluated by means of accuracy,
sensitivity and specificity.
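The binary evaluation can be sketched as follows. The sketch assumes that advanced disease (class B or C) is treated as the positive class, which the text does not state explicitly; the function name and the example labels are hypothetical.

```python
def binary_metrics(true_classes, predicted_classes, negative_class="A"):
    """Accuracy, sensitivity and specificity for class A versus class B or C."""
    tp = fp = tn = fn = 0
    for truth, prediction in zip(true_classes, predicted_classes):
        truth_positive = truth != negative_class        # advanced disease
        predicted_positive = prediction != negative_class
        if truth_positive and predicted_positive:
            tp += 1
        elif truth_positive:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    # Both groups must be present, otherwise the ratios below are undefined.
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

truth = ["A", "A", "B", "C", "B", "A"]
predicted = ["A", "B", "B", "C", "A", "A"]
print(binary_metrics(truth, predicted))
```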
Receiver operating characteristic (ROC) analysis was performed, and the area under the ROC curve (AUC) was evaluated.
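As a minimal sketch, the AUC can be computed without tracing the curve, using its equivalence to the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (ties count as one half). The labels (1 for class ≥ B, 0 for class A) and the scores are hypothetical example values, as is the function name.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

example_labels = [0, 0, 1, 1]
example_scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(example_labels, example_scores))
```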