Keywords:
Breast, Computer applications, Mammography, Neural networks, Computer Applications-Detection, diagnosis, Efficacy studies
Authors:
E. F. Conant1, S. Periaswamy2, S. Fotin2, J. Go2, J. Pike2, J. Boatsman3, J. Hoffmeister2; 1Philadelphia, PA/US, 2Nashua, NH/US, 3San Antonio, TX/US
DOI:
10.26044/ecr2019/C-1648
Methods and materials
A multi-reader, multi-case (MRMC) study was retrospectively conducted with 24 radiologists and 260 cases with digital breast tomosynthesis (DBT) exams. Each radiologist read every case both with and without the AI system, with at least 4 weeks between the two readings of the same case.
The 260 cases included 65 biopsy-proven cancer cases with 66 malignant lesions, 65 benign cases with biopsy-proven benign lesions, 21 cases with lesions shown not to warrant biopsy based on additional imaging, and 109 BI-RADS 1 or 2 cases without any suspicious lesions.
All cases without biopsy-proven lesions had at least one-year normal imaging follow-up.
The cancer-enriched dataset was randomly selected from blocks of sequential series of cases at 7 US acquisition sites so that, within the cancer cases and within the non-cancer cases, the case mix matched a screening population, except that invasive cancers larger than 2.5 cm were excluded.
All data were retrospectively collected under institutional review board approval with waiver of informed consent and in compliance with the Health Insurance Portability and Accountability Act.
The 65 cancer cases included 50 cases with soft tissue lesions (with or without calcifications) and 15 cases with only calcifications. By histology, 51 of the cancer cases were invasive carcinoma and 14 were DCIS. The 195 non-cancer cases included 62 cases with soft tissue lesions (with or without calcifications), 24 cases with only calcifications and 109 BI-RADS 1 or 2 cases.
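The case counts reported above can be cross-checked with a short tally. The sketch below uses only the numbers stated in the text; the variable names are illustrative and are not taken from the study.

```python
# Case counts as reported in the text (variable names are illustrative).
cancer_by_lesion = {"soft tissue": 50, "calcifications only": 15}
cancer_by_histology = {"invasive carcinoma": 51, "DCIS": 14}
non_cancer_by_lesion = {
    "soft tissue": 62,
    "calcifications only": 24,
    "BI-RADS 1 or 2": 109,
}
non_cancer_by_workup = {
    "biopsy-proven benign": 65,
    "no biopsy warranted": 21,
    "BI-RADS 1 or 2": 109,
}

n_cancer = sum(cancer_by_lesion.values())
n_non_cancer = sum(non_cancer_by_lesion.values())
n_total = n_cancer + n_non_cancer

# Both breakdowns of the cancer cases must describe the same 65 cases.
assert n_cancer == sum(cancer_by_histology.values()) == 65
# Both breakdowns of the non-cancer cases must describe the same 195 cases.
assert n_non_cancer == sum(non_cancer_by_workup.values()) == 195
assert n_total == 260

# Share of cancer cases in this enriched dataset.
cancer_fraction = n_cancer / n_total
print(f"{n_cancer} cancer + {n_non_cancer} non-cancer = {n_total} cases")
print(f"cancer fraction in the enriched set: {cancer_fraction:.0%}")
```

The tally shows that the lesion-type and work-up breakdowns are consistent with each other and with the 260-case total.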
Radiologist sensitivity, specificity and reading time were evaluated at the case level, with AI versus without AI, in 4 subgroups of cancer cases (soft tissue, calcifications-only, invasive carcinomas and DCIS) and 3 subgroups of non-cancer cases (soft tissue, calcifications-only and BI-RADS 1 or 2).
To protect the study’s type I error rate (alpha = 0.05) from inflation associated with multiple comparisons, these pre-planned subgroup analyses were not included in the hierarchical, pre-specified, fixed hypothesis testing sequence. Formal hypothesis testing was therefore not done for the subgroups, although 95% confidence intervals (CIs) were computed.
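As an illustration of what a case-level subgroup estimate with a 95% CI looks like, the sketch below computes sensitivity and specificity for a single hypothetical reader. The text does not state how the study's CIs were computed, and an MRMC analysis would normally account for both reader and case variability. The Wilson score interval used here is a stand-in assumption, the reader calls are invented, and the function names are hypothetical.

```python
import math


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half, centre + half


def case_level_rate(recalled, truth_is_cancer):
    """Return (correct, total) for one reader on one subgroup.

    recalled: list of bools, True if the reader recalled the case.
    For cancer cases a recall is correct (sensitivity);
    for non-cancer cases a non-recall is correct (specificity).
    """
    correct = sum(1 for r in recalled if r == truth_is_cancer)
    return correct, len(recalled)


# Invented calls for ONE hypothetical reader (not study data).
# 15 calcifications-only cancer cases, 12 recalled.
calc_cancer_calls = [True] * 12 + [False] * 3
# 24 calcifications-only non-cancer cases, 6 recalled.
calc_benign_calls = [True] * 6 + [False] * 18

tp, n_pos = case_level_rate(calc_cancer_calls, truth_is_cancer=True)
tn, n_neg = case_level_rate(calc_benign_calls, truth_is_cancer=False)

sens, sens_ci = tp / n_pos, wilson_ci(tp, n_pos)
spec, spec_ci = tn / n_neg, wilson_ci(tn, n_neg)

print(f"sensitivity {sens:.2f}, 95% CI {sens_ci[0]:.2f} to {sens_ci[1]:.2f}")
print(f"specificity {spec:.2f}, 95% CI {spec_ci[0]:.2f} to {spec_ci[1]:.2f}")
```

With subgroups as small as 15 or 24 cases, intervals of this kind are wide, which fits the decision to report CIs without formal hypothesis tests.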