Purpose
The performance of image segmentation and translation algorithms is typically evaluated with image similarity metrics such as the Dice score or SSIM. In some instances, this approach may be counterproductive. In this study, we compare such an approach with a qualitative assessment method focused on clinical relevance for estimating the accuracy of virtually generated diffusion-weighted (DW) sequences produced by Generative Adversarial Networks (GAN).
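For readers unfamiliar with the metric, the Dice score between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch in plain Python, not the evaluation code used in this study; the function name `dice_score`, the toy masks, and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
from typing import Sequence


def dice_score(pred: Sequence[Sequence[int]], truth: Sequence[Sequence[int]]) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of equal shape."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of rows")
    intersection = 0  # pixels set in both masks
    total = 0         # |A| + |B|
    for row_p, row_t in zip(pred, truth):
        if len(row_p) != len(row_t):
            raise ValueError("masks must have the same number of columns")
        for p, t in zip(row_p, row_t):
            p, t = bool(p), bool(t)
            intersection += p and t
            total += p + t
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# Hypothetical toy masks (1 = lesion pixel), not data from the study.
truth = [[0, 1, 1],
         [0, 1, 0]]
pred = [[0, 1, 0],
        [0, 1, 1]]
shifted = [[1, 0, 0],
           [1, 0, 0]]

partial_overlap = dice_score(pred, truth)   # two of three pixels overlap
no_overlap = dice_score(shifted, truth)     # lesion predicted, but displaced
```

In this toy example the displaced prediction scores 0 even though a lesion is still predicted somewhere in the image, which is one way an overlap metric and a scan-level reading can disagree.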
Methods and materials
We used the previously described Virtual Imaging Using Generative Adversarial Networks for Image Translation (VIGANIT) network, which comprises a 15-layer deep convolutional neural network (CNN) used in conjunction with a GAN to improve the clarity of the output image. VIGANIT was used to predict B1000 diffusion-weighted images from input T2W images in 24 cases (12 with acute infarcts and 12 with chronic infarcts). The ground truth and predicted B1000 DW images were blinded and randomized. A radiologist with 9 years’ experience in MRI did...
Results
The Dice score for the cases with acute infarcts ranged from 0 to 0.85, with an average of 0.43; for the dark areas it ranged from 0.27 to 0.81, with an average of 0.46. The qualitative assessment revealed that eight of the 12 acute infarct cases had positive scan-level predictions of restricted diffusion. None of the 12 chronic infarct cases had false predictions of restricted diffusion. Comparable predictions were absent in 4 of the 12 cases with acute infarcts. Two of these four...
Conclusion
Despite the low Dice score coefficient for image translation, the scan-level accuracy for classifying the presence or absence of acute infarct was reasonably good. This study makes the case for additionally using the clinical significance of lesions as an indicator of model performance.
In this study, we demonstrate a significant change in the acceptability score of an image translation network when a more clinically relevant assessment method is applied instead of in silico mathematical methods.
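As a rough illustration of what scan-level accuracy means here, the counts given in the Results can be arranged as a confusion matrix. The sketch below is our own arithmetic, not figures reported by the authors. It assumes that the 8 positive predictions are true positives among the 12 acute cases, that the 4 acute cases without comparable predictions count as false negatives, and that all 12 chronic cases are true negatives. The function name `scan_level_metrics` is hypothetical.

```python
def scan_level_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, and accuracy from scan-level counts."""
    positives = tp + fn   # scans that truly show an acute infarct
    negatives = tn + fp   # scans that do not
    return {
        "sensitivity": tp / positives if positives else float("nan"),
        "specificity": tn / negatives if negatives else float("nan"),
        "accuracy": (tp + tn) / (positives + negatives),
    }


# Assumed mapping of the abstract's counts onto a confusion matrix:
# 8 of 12 acute cases detected, 4 missed, 0 of 12 chronic cases falsely flagged.
metrics = scan_level_metrics(tp=8, fn=4, tn=12, fp=0)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, 20 of the 24 scans are classified correctly, which is far higher than the mean Dice score of 0.43 would suggest on its own.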
Personal information and conflict of interest
V. K. Venugopal; New Delhi/IN - Other at Research collaboration, General Electric Company Research collaboration, Koninklijke Philips NV Research collaboration, Qure.ai Research collaboration, Predible Health
V. Mahajan; New Delhi/IN - nothing to disclose
A. Venkatraman; Bangalore/IN - Employee at TriOcula Technologies
A. Upadhyaya; Bangalore/IN - Employee at TriOcula Technologies
S. Rajan; New Delhi/IN - nothing to disclose
H. Mahajan; New Delhi/IN - Other at Director, Mahajan Imaging Pvt Ltd Research collaboration, General Electric Company Research collaboration, Koninklijke Philips NV Research collaboration, Qure.ai Research collaboration, Predible...
References
1. Bertels, J. et al. Optimizing the Dice Score and Jaccard Index for Medical Image Segmentation: Theory & Practice. arXiv:1911.01685 [cs, eess] 11765, 92–100 (2019).
2. Yu, Z. et al. Retinal image synthesis from multiple-landmarks input with generative adversarial networks. Biomed Eng Online 18, (2019).
3. Mahajan, V. et al. Towards virtual MR imaging: predicting diffusion-weighted brain MR images from T2-weighted images using convolutional neural networks. ECR 2019.
4. Hagiwara, A. et al. Improving the Quality of Synthetic FLAIR Images with Deep Learning Using a...