Potential impact of digital breast tomosynthesis on the recall rate: observational retrospective study in assessment setting from digital mammography screening programme

Congress:

ECR 2019

Poster Number:

C-3322

Type:

Scientific Exhibit

Keywords:

Breast, Digital radiography, Screening, Cancer

Authors:

G. SOPPELSA¹, A. Saramin¹, M. TONUTTI¹, M. Costa², F. GIUDICI¹, E. Makuc¹, C. Gasparini¹, M. Assante¹, M. A. Cova¹; ¹Trieste/IT, ²Udine/IT

DOI:

10.26044/ecr2019/C-3322

DOI-Link:

https://dx.doi.org/10.26044/ecr2019/C-3322

Table 1: Parameters of the women evaluated in the study

Table 2: Concordance of radiologists in reading DBT one view images

Table 3: Concordance of radiologists in reading DBT two view images

Table 4: Comparison of performance indicators of the DBT in one and two view for...

Table 5: Comparison of performance indicators of the DBT in one and two view for...

Table 6: Performance indicators of the DBT depending on the type of lesion

Table 7: Performance indicators of DBT based on breast density

Fig. 1: Comparison of AUC of radiologist A in one (pink line) and two view (blue line)

Fig. 2: Comparison of AUC of radiologist B in one (pink line) and two view (blue line)

Fig. 3: Comparison of the AUC of the two radiologists: Radiologist A (blue line) and...

Results

The study included 150 women aged between 50 and 74 years. Mammographic density varied across the cohort, and the most frequently found lesions were irregular opacities. The final outcome, known because the study was retrospective, was negative in 77% of women (no need for further work-up with needle biopsy), while 8% received a histological or cytological diagnosis of benignity and 14.7% a diagnosis of carcinoma (Table 1).

Patients were classified according to the known final outcome as positive (carcinoma) or negative (no lesion or benign lesion). This made it possible to derive the diagnostic accuracy of DBT in one and two views (ML / MLO) in terms of the true positives, false negatives, true negatives and false positives obtained by each radiologist. The use of DBT would have avoided the recall of truly negative women in 70 cases (46.7%) for radiologist A and in 71 cases (47.3%) for radiologist B (p = 0.99). The percentage of recalls was also comparable between the two radiologists (p = 0.91): radiologist A would have recalled 79 women (52.7%) and would not have recalled one case that turned out to be a cancer (0.6%). Similarly, radiologist B would have recalled 77 women (51.3%) and would not have recalled 2 cases that were cancers (1.3%).
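
The counts reported above are enough to sketch how the standard performance indicators follow from a 2x2 table. The Python illustration below assumes 22 carcinomas (14.7% of 150 women, rounded), so that true positives and false positives can be derived from the recalled, avoided and missed counts. The helper names `indicators` and `from_reported` are hypothetical, and the resulting values are a reconstruction rather than the figures shown in Tables 4 and 5.

```python
def indicators(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumption: 14.7% of 150 women, rounded to the nearest whole case.
CANCERS = 22


def from_reported(recalled, avoided, missed, cancers=CANCERS):
    """Derive the 2x2 cells from the counts quoted in the text.

    recalled: women the reader would have recalled (TP + FP)
    avoided:  truly negative women not recalled (TN)
    missed:   cancers not recalled (FN)
    """
    tp = cancers - missed
    fp = recalled - tp
    return indicators(tp=tp, fp=fp, tn=avoided, fn=missed)


reader_a = from_reported(recalled=79, avoided=70, missed=1)
reader_b = from_reported(recalled=77, avoided=71, missed=2)

for name, result in (("A", reader_a), ("B", reader_b)):
    summary = ", ".join(f"{key} = {value:.1%}" for key, value in result.items())
    print(f"Radiologist {name}: {summary}")
```

Under these assumptions the specificity of radiologist A comes out at about 55%, in line with the one-view value quoted later in the Results.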

We compared the decisions (recall or no recall) of radiologist A and radiologist B on one-view DBT (Table 2) and on two-view DBT (Table 3) in order to evaluate the agreement between radiologists.

For the evaluation of one-view DBT, the percentage of agreement was 84% ((66 + 60)/150). The recall decisions of the two radiologists did not differ significantly (McNemar test: p = 0.68).
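
As an illustration of the McNemar test on these one-view readings, the 2x2 table can be reconstructed from the reported figures: 126 concordant readings out of 150, with 79 recalls for radiologist A and 77 for radiologist B, leave 13 women recalled only by A and 11 recalled only by B. The sketch below applies the chi-square form of the test without continuity correction. That choice is an assumption, made because it reproduces the reported p-value; the function name is hypothetical.

```python
import math


def mcnemar_uncorrected(b, c):
    """McNemar chi-square statistic (1 df, no continuity correction).

    b, c: discordant counts, i.e. women recalled by one reader only.
    Returns the statistic and its two-sided p-value.
    """
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Reconstructed discordant cells (assumption, derived from the reported margins).
chi2, p = mcnemar_uncorrected(b=13, c=11)
print(f"chi2 = {chi2:.3f}, p = {p:.2f}")
```

The test only looks at the discordant pairs, so a high p-value indicates that neither reader recalls systematically more than the other. It does not by itself measure how often they agree.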

Specifically, the coefficient quantifying this agreement was Cohen's kappa = 0.68 [0.52-0.84], which corresponds to substantial agreement according to Landis and Koch.
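
A minimal sketch of how the percentage of agreement and Cohen's kappa relate, assuming the one-view 2x2 table is 66 joint recalls, 60 joint non-recalls, 13 recalls by radiologist A only and 11 by radiologist B only. The off-diagonal cells are a reconstruction from the reported totals (79 and 77 recalls), not values read from Table 2, and the function name is hypothetical.

```python
def cohen_kappa_2x2(both_yes, a_only, b_only, both_no):
    """Observed agreement and Cohen's kappa for two readers, binary decision."""
    n = both_yes + a_only + b_only + both_no
    p_observed = (both_yes + both_no) / n

    # Agreement expected by chance, from each reader's marginal recall rate.
    a_yes = both_yes + a_only
    b_yes = both_yes + b_only
    p_expected = (a_yes * b_yes + (n - a_yes) * (n - b_yes)) / n**2

    kappa = (p_observed - p_expected) / (1 - p_expected)
    return p_observed, kappa


# Cell counts are assumptions reconstructed from the reported margins.
p_obs, kappa = cohen_kappa_2x2(both_yes=66, a_only=13, b_only=11, both_no=60)
print(f"agreement = {p_obs:.0%}, kappa = {kappa:.2f}")
```

Kappa is lower than the raw agreement because both readers recall roughly half of the women, so about 50% agreement would be expected by chance alone.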

For the evaluation of two-view DBT, 14 women were excluded because they had undergone DBT in one view only. The percentage of agreement was 79% ((54 + 53)/136). The two radiologists differed significantly in their recall decisions (McNemar test: p = 0.02): radiologist B recalled more than radiologist A.

Specifically, the coefficient quantifying this agreement was Cohen's kappa = 0.58 [0.52-0.84], which corresponds to moderate agreement according to Landis and Koch.

The performance indicators of DBT (sensitivity, specificity, positive predictive value and negative predictive value) were compared for each reader between one and two views.

In the case of radiologist A, sensitivity and negative predictive value remained unchanged, while specificity and positive predictive value increased when reading the DBT examinations in two views (Table 4).

In the case of radiologist B, a non-statistically significant reduction of all the performance indicators was observed when reviewing the two-view DBT images, except for the negative predictive value, which remained the same (Table 5).

Reproducibility between the two radiologists was better with one-view DBT: Cohen's kappa coefficient went from k = 0.68 in one view, equivalent to substantial agreement, to k = 0.58 in two views, corresponding to moderate agreement. Consequently, the performance indicators of one-view DBT were similar between radiologists A and B in terms of avoided and executed recalls. With two-view DBT, radiologist B tended to recall more than radiologist A, resulting in a higher percentage of false positives and therefore a better specificity for radiologist A (63% versus 51%); this difference is at the limit of statistical significance. For radiologist A, the diagnostic indicators improved when reading the DBT images in two projections, with specificity increasing from 55% to 63%; however, this increase was not statistically significant (p = 0.20).
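
The Landis and Koch bands used to interpret the two kappa values can be written as a small lookup. This is only an illustrative sketch; the function name `landis_koch` is hypothetical.

```python
def landis_koch(kappa):
    """Map a Cohen's kappa value to its Landis and Koch agreement label."""
    if kappa < 0:
        return "poor"
    # Upper bound of each band, in increasing order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


for view, k in (("one view", 0.68), ("two view", 0.58)):
    print(f"{view}: k = {k:.2f} -> {landis_koch(k)}")
```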

For the analysis of DBT performance based firstly on the type of lesion and secondly on breast density, the study considered the evaluation of radiologist A, as the reader with more years of experience, and two-view DBT, as the setting with better diagnostic indicators. Evaluating the area under the ROC curve (AUC) by type of lesion, DBT performed better in correctly identifying the malignancy or benignity of lesions when the mammographic pattern was a distortion or asymmetry than when it was an opacity (Table 6).
Secondly, the AUC values and the indicators were comparable across breast density categories, so density did not affect the accuracy of DBT in identifying malignant or benign lesions (Table 7).

The ROC analysis showed that, in both one-view and two-view DBT, the AUC was larger for radiologist A than for radiologist B (Figures 1, 2); in the case of one view, the comparison of the two radiologists' AUCs showed no statistically significant difference between the readers in correctly identifying lesions (Figure 3).