Utilizing Deep Learning to Standardize Annotation and Labeling of Large Abdominal CT Multi-Centre Datasets

Congress:

ECR 2019

Poster Number:

C-3057

Type:

Scientific Exhibit

Keywords:

Computer Applications-General, CT, Liver, Artificial Intelligence, Abdomen, Image verification

Authors:

R. Remtulla¹, S. L. Mihalcioiu², J. W. Luo², B. Gallix², J. J. R. Chong²; ¹Montreal, Quebec/CA, ²Montreal, QC/CA

DOI:

10.26044/ecr2019/C-3057

DOI-Link:

https://dx.doi.org/10.26044/ecr2019/C-3057

Aims and objectives

The development of machine learning and AI systems in healthcare critically relies on large, well-labelled datasets with adequate data volume, annotation, truth, and reusability [1]. Unfortunately, the day-to-day radiologist error rate has been cited as 3–5% on average, and in medicine more broadly the rate of missed, incorrect, or delayed diagnoses has been reported to be as high as 10–15% [2-3]. In addition, Sadigh et al. found that imaging reports were affected by mislabeled or misidentified patient events, or by wrong dictation or report events, at a rate of 4 per 100,000 examinations [4]. These errors corrupt image datasets and thus compromise the accuracy of classifiers trained on them. Although certain image classifiers can demonstrate high test performance when trained on corrupt data, higher levels of noise in datasets can impair classifier performance on more complex tasks [5-6].
Heterogeneous datasets are also often unsuitable as training sets for machine learning. Imaging facilities differ in their imaging protocols, with one study finding 481 distinct MRI and CT protocols across 13 workgroups [7]. Imaging protocols may also change over time, and differences between radiographers add further variability. Even variability among radiologist interpretations can contribute to heterogeneity, with some studies reporting a median weighted percentage of agreement between radiologists of only 78% [8]. Additionally, depending on the technician or machine used, the orientation and labeling may differ between scans.
As a result, medical imaging data collected for training classifiers is often inaccurate and heterogeneous, compromising the performance of the resulting classification model. We propose that neural networks could be used to curate and standardize image datasets prior to further machine learning development.
We explore direct visual pre-sorting of images using a pre-trained deep convolutional neural network approach to generate large standardized cohorts of training data by labeling the phases of enhancement of abdominal liver CT scans from multiple imaging centres.
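To make the pre-sorting idea concrete, the following is a minimal sketch of only the final cohort-building step, assuming a phase classifier has already produced per-scan class probabilities. The phase names, the `sort_into_cohorts` helper, the scan identifiers, and the 0.9 confidence threshold are all hypothetical and are not taken from this work; the network itself is not shown.

```python
from collections import defaultdict

# Assumed phase labels for illustration; this section does not enumerate them.
PHASES = ("non_contrast", "arterial", "portal_venous", "delayed")


def sort_into_cohorts(scan_probs, min_confidence=0.9):
    """Group scan IDs by their most probable phase.

    scan_probs maps a scan ID to a list of class probabilities, one per
    entry in PHASES. Scans whose top probability falls below
    min_confidence are placed in a 'review' cohort instead of being
    labeled automatically.
    """
    cohorts = defaultdict(list)
    for scan_id, probs in scan_probs.items():
        if len(probs) != len(PHASES):
            raise ValueError(f"{scan_id}: expected {len(PHASES)} probabilities")
        # Index of the most probable phase for this scan.
        best = max(range(len(PHASES)), key=lambda i: probs[i])
        label = PHASES[best] if probs[best] >= min_confidence else "review"
        cohorts[label].append(scan_id)
    return dict(cohorts)


# Hypothetical classifier outputs for scans from two centres.
example = {
    "centre_A/scan_001": [0.02, 0.95, 0.02, 0.01],
    "centre_B/scan_017": [0.01, 0.03, 0.94, 0.02],
    "centre_B/scan_018": [0.10, 0.45, 0.40, 0.05],
}
cohorts = sort_into_cohorts(example)
```

Routing low-confidence scans to a separate review cohort is one possible design choice: it keeps ambiguous cases out of the standardized training cohorts rather than forcing a label on them.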