Aims and objectives
To stress test the performance of a deep learning algorithm on a dataset with spectrum bias against normalcy in chest X-ray normal vs.
Methods and materials
A deep learning algorithm consisting of an ensemble of 14 convolutional neural networks (CNNs) and a weighting fully connected network (Fig. 1) was trained with more than 112,000 chest X-ray studies identified with one or more labels from 14 different thoracic pathologies defined.
The 14 CNNs were based on the VGG-19 architecture (Fig. 2), and transfer learning from the ImageNet dataset was used to accelerate convergence and improve the performance of the algorithm.
The output of the algorithm was the probability of an input image of being...
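The weighting step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one CNN per pathology label and a single-layer fully connected network with a sigmoid output, neither of which is specified in the text. The function name `combine_ensemble` and all numeric values are hypothetical.

```python
import math

NUM_CNNS = 14  # size of the ensemble reported in the text


def sigmoid(x: float) -> float:
    """Map a real-valued logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def combine_ensemble(cnn_probs, weights, bias):
    """Weight the per-CNN probabilities with one fully connected layer.

    Assumption: a single linear layer plus sigmoid; the real weighting
    network may be deeper or use a different activation.
    """
    if len(cnn_probs) != NUM_CNNS or len(weights) != NUM_CNNS:
        raise ValueError("expected one probability and one weight per CNN")
    logit = bias + sum(w * p for w, p in zip(weights, cnn_probs))
    return sigmoid(logit)


# Hypothetical values for illustration only; not taken from the study.
example_probs = [0.05] * 13 + [0.90]
example_weights = [1.0] * NUM_CNNS
example_bias = -1.0
score = combine_ensemble(example_probs, example_weights, example_bias)
print(f"combined probability: {score:.3f}")
```

In practice the weights and bias would be learned during training rather than set by hand.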
Results
The algorithm correctly classified 237 (78.74%) chest X-rays (CXRs), with a sensitivity of 83.76% (95% CI: 77.85% to 88.62%) and a specificity of 69.23% (95% CI: 59.42% to 77.91%).
There were equal numbers of false positive and false negative cases: 32 (13.5%) each.
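The reported figures can be checked with a short calculation. The sketch below assumes a confusion matrix of 165 true positives, 32 false negatives, 72 true negatives and 32 false positives (301 studies); these counts are not stated in the text but are back-calculated from the reported percentages and the 32 errors of each type. The interval method (Clopper-Pearson exact) is also an assumption, since the text does not name the method used.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(g, target, iterations=60):
    """Bisection on [0, 1] for g(p) = target, where g increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for the proportion k / n."""
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Assumed counts, back-calculated from the reported percentages.
tp, fn, tn, fp = 165, 32, 72, 32

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fn + tn + fp)
sens_ci = clopper_pearson(tp, tp + fn)
spec_ci = clopper_pearson(tn, tn + fp)

print(f"accuracy    {accuracy:.2%}")
print(f"sensitivity {sensitivity:.2%} (95% CI {sens_ci[0]:.2%} to {sens_ci[1]:.2%})")
print(f"specificity {specificity:.2%} (95% CI {spec_ci[0]:.2%} to {spec_ci[1]:.2%})")
```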
Conclusion
For screening applications, sensitivity is crucial because overlooking a pathology may have severe consequences for patients;
it is therefore encouraging that, under stress testing, the system's performance prioritizes sensitivity over specificity.
Compared with the validation results,
the performance of the deep learning algorithm improved on the stress test with a biased dataset containing more abnormal scans than normal scans.
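To illustrate how the mix of abnormal and normal scans affects overall accuracy, the sketch below treats accuracy as a prevalence-weighted average of sensitivity and specificity. It uses the sensitivity and specificity reported in this abstract and holds them fixed; the prevalence values in the loop are hypothetical, and the calculation is not claimed to be the explanation for the observed difference from validation.

```python
def accuracy_at_prevalence(sensitivity, specificity, prevalence):
    """Overall accuracy when a fraction `prevalence` of scans is abnormal."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)


SENSITIVITY = 0.8376  # reported in the results
SPECIFICITY = 0.6923  # reported in the results

# Hypothetical proportions of abnormal scans, for illustration only.
for prevalence in (0.25, 0.50, 0.75):
    acc = accuracy_at_prevalence(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"abnormal fraction {prevalence:.2f}: accuracy {acc:.2%}")
```

Because sensitivity exceeds specificity here, accuracy rises as the share of abnormal scans grows, even though the classifier itself is unchanged.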
References
1- Wang X.,
ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases.
A systematic study of the class imbalance problem in convolutional neural networks.
Very Deep Convolutional Networks for Large-Scale Image Recognition.
ImageNet: A large-scale hierarchical image database.
IEEE Conference on Computer Vision and Pattern Recognition,