Towards radiologist-level malignancy detection on chest CT scans: a comparative study of the performance of convolutional neural networks and four thoracic radiologists

Congress:

ECR 2019

Poster Number:

C-2065

Type:

Scientific Exhibit

Keywords:

Artificial Intelligence, Lung, CT, Computer Applications-Detection, diagnosis, Cancer

Authors:

V. Venugopal¹, A. Vaidya², A. Ahuja², Y. Singh², K. Vaidhya³, A. Raj³, V. Mahajan², S. Vaidya⁴, A. Rangasai Devalla³; ¹Aligarh/IN, ²New Delhi/IN, ³Bangalore/IN, ⁴Mumbai/IN

DOI:

10.26044/ecr2019/C-2065

DOI-Link:

https://dx.doi.org/10.26044/ecr2019/C-2065

Fig. 1: A Feature Pyramid Network used as nodule detector in the CADe system.

Fig. 2: Snapshot of a 3.4 cm nodule in the right upper lobe detected by the system.

Fig. 3: A suspicious lobulated nodule detected by the system in the right lower lobe.

Fig. 4: A suspicious part-solid nodule detected by the system in the right upper lobe.

Methods and materials

Data preparation:

In this retrospective study, low-dose chest CT scans were drawn from the NLST dataset: 1245 scans for training and 350 scans for validation. Pathologically proven malignancy status was used as the ground truth. Lung nodule annotations from 4 radiologists were obtained for 888 CT scans from the publicly available LIDC-IDRI ^[3] dataset. CT scans with a slice thickness > 2.5 mm were excluded to avoid partial-volume effects, as recommended by Ginneken et al. ^[4] and Setio et al. ^[5].

Nodule detection is a volumetric detection task, so all CT scans were resampled to an isotropic voxel spacing of [1.0, 1.0, 1.0] mm to leverage the computational capacity of 3D convolutions. HU windowing was applied from -1200 HU to 600 HU to visualize the lung fields effectively.
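The two preprocessing steps above can be illustrated with a minimal sketch. It is not the authors' code: the function names, the example scan geometry and the rescaling of the windowed values to [0, 1] are assumptions made for illustration. Only the 1.0 mm target spacing and the -1200 HU to 600 HU window come from the text. The voxel interpolation itself is left out; in practice it would be done by a library routine such as scipy.ndimage.zoom.

```python
def resampled_shape(shape, spacing, target=(1.0, 1.0, 1.0)):
    """Voxel grid size after resampling to the target spacing (all in mm)."""
    return tuple(
        max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, target)
    )


def window_hu(value, lo=-1200.0, hi=600.0):
    """Clip one HU value to [lo, hi] and rescale it linearly to [0, 1].

    The rescaling to [0, 1] is an assumption; the text only gives the window.
    """
    clipped = min(max(value, lo), hi)
    return (clipped - lo) / (hi - lo)


# Hypothetical scan: 120 slices of 512x512 pixels,
# 2.5 mm slice spacing and 0.7 mm in-plane pixel spacing.
print(resampled_shape((120, 512, 512), (2.5, 0.7, 0.7)))
print(window_hu(-1000.0), window_hu(3000.0))
```

Values outside the window saturate at 0 or 1, so dense bone and metal do not dominate the intensity range seen by the network.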

Training:

A deep learning system based on convolutional neural networks was trained to predict malignancy status from CT scans of the chest. The system comprises a nodule detector and a malignancy estimator. The nodule detector was trained and validated to detect pulmonary nodules >= 3 mm on 554 CT scans from NLST and 888 CT scans from LIDC-IDRI. The malignancy estimator was trained on 1245 CT scans and validated on 350 CT scans from NLST.

CADe (Nodule Detector):

The CADe system is an ensemble of 3 single-shot 3D Feature Pyramid Networks (FPN) ^[5][6] trained to detect lung nodules in CT scans. Each 3D FPN is built with a U-Net encoder-decoder architecture composed of 3D convolutions, to maximize the effective receptive field and fuse multi-scale information.

Multi-scale information is essential for differentiating pulmonary nodules from the vasculature in the lungs. The network takes a 3D patch of size 128x128x128 as input and produces an output of size 32x32x32x3x5, with 3 anchor boxes of varying size limits for each network in the ensemble. During inference, the ensemble is slid over the CT scan in overlapping 128x128x128 patches, and the predicted bounding boxes are fused with non-maximum suppression to give the final lung nodule candidates.
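The inference procedure described above (overlapping patches followed by non-maximum suppression) can be sketched roughly as follows. Several details are assumptions, not taken from the poster: the stride of 96 voxels, the IoU threshold of 0.1, the reading of the 5 outputs per anchor as (probability, z, y, x, diameter), and the helper names patch_starts, cube_iou and nms_3d.

```python
def patch_starts(length, patch=128, stride=96):
    """Start indices of overlapping patches covering one axis of the volume."""
    if length <= patch:
        return [0]
    starts = list(range(0, length - patch, stride))
    starts.append(length - patch)  # last patch is flush with the volume edge
    return starts


def cube_iou(a, b):
    """IoU of two axis-aligned cubes given as (z, y, x, diameter)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i] - a[3] / 2, b[i] - b[3] / 2)
        hi = min(a[i] + a[3] / 2, b[i] + b[3] / 2)
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo
    union = a[3] ** 3 + b[3] ** 3 - inter
    return inter / union


def nms_3d(candidates, iou_threshold=0.1):
    """Greedy NMS over (prob, z, y, x, diameter) candidates."""
    kept = []
    for cand in sorted(candidates, key=lambda c: c[0], reverse=True):
        if all(cube_iou(cand[1:], k[1:]) < iou_threshold for k in kept):
            kept.append(cand)
    return kept


# Hypothetical candidates pooled from overlapping patches (scan coordinates).
candidates = [
    (0.95, 60, 60, 60, 10),
    (0.80, 61, 60, 60, 10),  # near-duplicate of the first candidate
    (0.70, 20, 100, 40, 6),
]
print(patch_starts(300))
print(nms_3d(candidates))
```

Greedy suppression keeps the highest-probability box in each cluster of overlapping detections, which removes the duplicates that arise where neighbouring patches overlap.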

The CADe system was trained on 711 CT scans from LIDC-IDRI and 554 CT scans from NLST using the Adam optimizer with a learning rate of 0.0001, a weight decay of 0.0005 and a dropout of 0.5. Validation was done on 177 CT scans from LIDC-IDRI. The data was augmented with nodules of different sizes to ensure that training was not biased towards detecting small nodules.

CADx (Malignancy Estimator):

The CADx system is a leaky noisy-OR gate ^[6] based on deep convolutional neural networks. The noisy-OR model operates on 96x96x96 patches from each detected nodule, fuses the information from all nodules and outputs the probability that the patient has lung cancer.

During training, the top 5 nodule candidates, ranked by their nodule probabilities, are taken from the CADe system and fed to the noisy-OR model. A leakage probability is assigned to the CADx model during training to account for primary nodules or masses missed by the CADe system. The CADx model shares the same backbone as the CADe model, with the convolutional layers sharing their weights to avoid over-fitting.

The CADx system was trained on 1245 CT scans and validated on 350 CT scans from NLST. During inference, all detected nodules are considered when computing the overall malignancy risk at the scan level.
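The leaky noisy-OR combination used by the CADx system can be written as P(cancer) = 1 - (1 - p_leak) * prod_i (1 - p_i), where p_i is the malignancy probability of nodule i and p_leak is the leakage probability. The sketch below is a minimal illustration of that formula, not the authors' implementation: the function names, the candidate dictionary layout and the numeric values are hypothetical.

```python
def leaky_noisy_or(nodule_probs, leak_prob):
    """Scan-level cancer probability from per-nodule malignancy probabilities.

    The scan is cancer-free only if no nodule is malignant AND no cancer
    was missed by the detector (the "leak").
    """
    p_none = 1.0 - leak_prob
    for p in nodule_probs:
        p_none *= 1.0 - p
    return 1.0 - p_none


def top_k(candidates, k=5):
    """Keep the k candidates with the highest detection probability."""
    return sorted(candidates, key=lambda c: c["det_prob"], reverse=True)[:k]


# Hypothetical detector output: detection and malignancy probability per nodule.
candidates = [
    {"det_prob": 0.99, "malignancy": 0.60},
    {"det_prob": 0.90, "malignancy": 0.10},
    {"det_prob": 0.40, "malignancy": 0.05},
]

# Training-style input: only the top-ranked candidates (here k=2 for brevity).
train_probs = [c["malignancy"] for c in top_k(candidates, k=2)]
# Inference-style input: all detected nodules.
all_probs = [c["malignancy"] for c in candidates]

print(leaky_noisy_or(train_probs, leak_prob=0.01))
print(leaky_noisy_or(all_probs, leak_prob=0.01))
```

With no detected nodules the output equals the leakage probability, which is how the model can still assign a non-zero risk to a scan whose primary lesion the detector missed.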

Evaluation:

One hundred unseen low-dose CT scans were chosen at random from the validation set, and predictions were generated by the deep learning system. The studies were randomized and presented to 4 thoracic radiologists with 2, 5, 8 and 15 years of experience, who were asked to characterize the chest CT scans. The radiologists rated the probability of malignancy in each scan on a Likert scale from 1 (highly unlikely) to 5 (highly suspicious). ROC curves were analysed for the AI and for the radiologists. After the analysis, 4 CT scans that had no lung nodules but were marked malignant in the NLST EMR were removed from the study.
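One common way to summarize an ROC curve is the area under it (AUC), which can be computed for continuous AI scores and for 1 to 5 Likert ratings alike. The sketch below shows that calculation under stated assumptions: the poster does not say which ROC statistic was used, the function name is hypothetical, and the labels and scores are made-up illustrative values, not study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a malignant scan outscores a benign one.

    Ties count as half, which suits coarse scores such as a 1-5 Likert scale.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative toy values only (1 = malignant, 0 = benign).
labels = [1, 1, 1, 0, 0, 0]
reader_likert = [5, 4, 2, 3, 1, 2]           # hypothetical reader ratings
ai_scores = [0.9, 0.7, 0.4, 0.5, 0.1, 0.2]   # hypothetical model outputs

print(roc_auc(labels, reader_likert))
print(roc_auc(labels, ai_scores))
```

Because the AUC depends only on how cases are ranked, the 5-point reader scale and the continuous model output can be compared on the same footing.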