Inter-observer reproducibility according to three methods of evaluating mammographic density and parenchymal pattern: Impact on risk prediction

This poster is published under an open license. Please read the disclaimer for further details.

Congress:

ECR 2015

Poster Number:

C-1037

Type:

Scientific Exhibit

Keywords:

Breast, Mammography, Observer performance, Comparative studies, Cancer, Epidemiology, Image registration

Authors:

R. R. Winkel¹, M. von Euler-Chelpin¹, M. Nielsen², M. Bachmann Nielsen¹, W. Uldall¹, P. Diao¹, I. Vejborg¹; ¹Copenhagen/DK, ²Copenhagen/DK

DOI:

10.1594/ecr2015/C-1037

DOI-Link:

https://dx.doi.org/10.1594/ecr2015/C-1037

Conclusion

To our knowledge, this is the first study to report inter-observer agreement on the Tabár classification. Whereas the BI-RADS density classification and the PMD measurements are based on simple quantitative assessment of density, the Tabár classification is far more intuitive in nature. However, we found inter-observer reproducibility on the Tabár classification to be highly comparable with that of the two other methods. The inter-observer concordances we demonstrated are also comparable with those of previous inter-observer studies, which reported kappa values ranging from 0.02 to 0.87 [11]–[15] for the BI-RADS classification, and ICC values of 0.94 on CC views [1] and 0.91 on MLO views [16] for the interactive threshold technique.
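For readers less familiar with the kappa statistic quoted above, the following is a minimal sketch of unweighted Cohen's kappa for two readers. The function name `cohen_kappa` and the ratings are hypothetical and purely illustrative; they are not study data, and the sketch is not necessarily the exact variant (for example weighted kappa) used in the cited studies.

```python
from collections import Counter


def cohen_kappa(reader_a, reader_b):
    """Unweighted Cohen's kappa for two readers rating the same cases."""
    if len(reader_a) != len(reader_b) or not reader_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(reader_a)
    # Observed agreement: share of cases where both readers chose the same category.
    p_observed = sum(a == b for a, b in zip(reader_a, reader_b)) / n
    # Chance agreement: sum over categories of the product of the readers' marginal shares.
    freq_a = Counter(reader_a)
    freq_b = Counter(reader_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both readers use a single category")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical BI-RADS density categories (1-4) for ten mammograms; NOT study data.
reader_1 = [1, 2, 2, 3, 3, 3, 4, 4, 2, 1]
reader_2 = [1, 2, 3, 3, 3, 2, 4, 4, 2, 1]
kappa = cohen_kappa(reader_1, reader_2)  # observed 0.80, chance 0.26, kappa about 0.73
```

In this invented example the readers agree on 8 of 10 cases, yet kappa is lower than 0.80 because part of that agreement is expected by chance alone.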

Despite substantial to almost perfect inter-observer reproducibility for all three methods, the impact on breast cancer risk prediction on a multiple-category scale differed depending on the density scale used. This indicates that, when estimating risk, the overall concordance is less important than the specific type of “misclassification”, as previously discussed by Grove et al. [17]. However, no difference in OR estimates between readers was seen after categorising into only two risk groups.
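The point about the type of disagreement can be illustrated with a small sketch: when two readers disagree only within the same dichotomised risk group, their two-group odds ratios are identical even though exact agreement is poor. The function `odds_ratio`, the cut-off (categories 3 and 4 as "high risk") and all ratings below are assumptions made for illustration; they are not the grouping or the data used in the study.

```python
def odds_ratio(categories, is_case, high_risk):
    """Odds ratio of being a case for high-risk versus low-risk categories."""
    # Cells of the 2x2 table: exposure (high-risk category) by case status.
    exposed_cases = exposed_controls = 0
    unexposed_cases = unexposed_controls = 0
    for category, case in zip(categories, is_case):
        if category in high_risk:
            if case:
                exposed_cases += 1
            else:
                exposed_controls += 1
        elif case:
            unexposed_cases += 1
        else:
            unexposed_controls += 1
    if exposed_controls == 0 or unexposed_cases == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


# Hypothetical ratings on a four-category scale for eight women; NOT study data.
is_case = [True, True, True, False, False, False, False, True]
reader_1 = [4, 3, 2, 1, 2, 1, 3, 4]
reader_2 = [3, 4, 1, 2, 2, 1, 4, 3]  # differs from reader_1 only within {1, 2} or {3, 4}

HIGH_RISK = {3, 4}  # assumed cut-off, for illustration only
or_reader_1 = odds_ratio(reader_1, is_case, HIGH_RISK)
or_reader_2 = odds_ratio(reader_2, is_case, HIGH_RISK)
exact_agreement = sum(a == b for a, b in zip(reader_1, reader_2)) / len(reader_1)
```

Here the readers assign the same category to only 2 of 8 women, but both obtain the same two-group odds ratio, because none of their disagreements crosses the assumed cut-off.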

Only a few studies have investigated the association between the Tabár classification and the risk of breast cancer. In line with Jakes and colleagues, we found the association between parenchymal pattern and breast cancer risk to be specifically linked to PIV [18].

The study has some limitations. In this retrospective study, we were not able to control for breast cancer risk variables other than age, such as BMI and reproductive variables. Moreover, we did not differentiate between interval cancers (defined as cancers diagnosed between two screenings) and screen-detected cancers. These limitations are important to bear in mind when interpreting the risk estimates. The lack of BMI adjustment has probably led to some underestimation of risk. On the other hand, we might have included some “excess” cancers that were initially undetected (masked at the negative screening in 2007), leading to an overestimation of risk.

Additionally, readings were done on digitized analogue mammograms, which reduced the quality of the images. Finally, having more readers would have strengthened our study methodologically.

In conclusion, parenchymal pattern as well as density may play a role in a future individualized screening setting; however, automated computerized techniques are needed to fully overcome the impact of subjectivity.