Keywords:
Breast, Mammography, Neural networks, Computer Applications-Detection, diagnosis, Technology assessment, Cancer
Authors:
R. Osuala, K. Kushibar, O. Diaz, K. Lekadir
DOI:
10.26044/ecr2023/C-24413
Conclusion
Our results suggest that pretraining on synthetic mammography data can improve the performance of breast cancer classification models, particularly with respect to the utility of privacy-preserving deep learning models. When clinical deep learning models are trained with a patient privacy guarantee (i.e., under differential privacy), synthetic data can help improve the model’s privacy-utility trade-off. Our experiments indicate that, rather than training on synthetic and real patient data simultaneously, it can be beneficial to first pretrain on synthetic data and then fine-tune on real data. We also observe that training exclusively on synthetic data is a viable alternative, in particular if the absence of real patient data allows less strict differential privacy parameters to be applied. Our study motivates future work to validate these findings across further application domains, modalities, and clinical scenarios.