Keywords:
Image verification, Trauma, Computer Applications-Detection, diagnosis, Conventional radiography, Pelvis, Artificial Intelligence
Authors:
J. Gregory1, J. W. Luo2, C. H. MO1, J. J. R. Chong2; 1Montreal/CA, 2Montreal, QC/CA
DOI:
10.26044/ecr2019/C-3471
Methods and materials
Patient Population and Labels:
This retrospective case-control study used routinely performed examinations from two academic tertiary care hospitals.
We performed an electronic keyphrase search for adult patients who had undergone emergency pelvic radiography between January 2006 and December 2017.
A total of 5560 frontal pelvic radiographs were identified.
Of these, 613 were positive cases of acute fracture or dislocation and 4947 were negative controls.
All positive cases were reviewed by a radiology fellow, who labeled the presence of each fracture and the specific image quadrant(s) involved.
To maximize dataset generalizability, all radiographs were included regardless of fracture type, image quality, or the presence of metallic artifacts.
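The case counts above imply a markedly imbalanced dataset. The short calculation below is only an illustrative check of the reported numbers; the variable names are ours and no class-balancing strategy is described in the source.

```python
# Counts reported in the text above.
n_total = 5560
n_positive = 613    # acute fracture or dislocation
n_negative = 4947   # negative controls

# Sanity check: the two groups should account for every radiograph.
assert n_positive + n_negative == n_total

prevalence = n_positive / n_total        # fraction of positive studies
neg_per_pos = n_negative / n_positive    # controls per positive case

print(f"prevalence: {prevalence:.1%}")
print(f"negatives per positive: {neg_per_pos:.2f}")
```

Roughly 11% of the radiographs are positive, or about eight controls per positive case.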
Image Pre-Processing:
Full DICOM studies were exported and converted to a baseline 512x512px image.
This baseline was then used to generate two datasets of downsampled 256x256px input images: one of the full image (‘FULL’) and one of specific quadrants (‘QUADS’).
Because the clinical application depends on detecting high-frequency features such as displaced fractures,
we reasoned that preserving more image resolution and detail in the quadrant images could improve the identification of subtle fractures.
The reduced field of view could also improve localization,
at the expense of overall location and context.
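The derivation of the two inputs can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each quadrant is a 256x256px crop of the 512x512px baseline (consistent with the stated sizes but not spelled out in the text), and it uses 2x2 block averaging for the FULL image because the resampling method is not named. The function name `make_full_and_quads` and the quadrant keys are hypothetical.

```python
import numpy as np

def make_full_and_quads(base: np.ndarray):
    """Derive one FULL input and four QUADS inputs from a 512x512 baseline."""
    if base.shape != (512, 512):
        raise ValueError("expected a 512x512 baseline image")

    # FULL: halve the resolution by averaging each 2x2 block of pixels.
    # (Assumption: the source does not name the resampling method.)
    full = base.reshape(256, 2, 256, 2).mean(axis=(1, 3))

    # QUADS: crop the four 256x256 quadrants at baseline resolution.
    # (Assumption: quadrants are simple non-overlapping crops.)
    quads = {
        "upper_left": base[:256, :256],
        "upper_right": base[:256, 256:],
        "lower_left": base[256:, :256],
        "lower_right": base[256:, 256:],
    }
    return full, quads

# Usage on a synthetic image standing in for a converted radiograph.
base = np.arange(512 * 512, dtype=np.float32).reshape(512, 512)
full, quads = make_full_and_quads(base)
print(full.shape, {name: q.shape for name, q in quads.items()})
```

Under these assumptions every network input is 256x256px, but each quadrant pixel covers a quarter of the area of a FULL pixel, which is the resolution-versus-context trade-off described above.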
Acquisition-standard window width/level settings were maintained.
Images were anonymized as per standard protocols.
Neural Network Configuration:
We employed a dual-channel network design with two parallel deep convolutional neural networks (DCNNs) derived from the Inception-ResNet-v2 architecture and pre-trained on non-medical images from ImageNet (Fig. 1).
The intention of this system is to balance the automated classification of lower spatial frequency findings (e.g. dislocations/disruptions) with that of higher spatial frequency findings (e.g. fractures).
The predictions from both networks are then synthesized into a combined classification prediction and a heat map highlighting regions of interest.
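The text does not specify how the two channels are combined, so the sketch below shows one plausible rule purely for illustration. It assumes a weighted blend of the full-view probability with the highest quadrant probability, and a heat map built by tiling the four quadrant maps and blending them with an upsampled full-view map. The function names, the quadrant keys, and the weight `w_full` are all hypothetical.

```python
import numpy as np

QUADRANTS = ("upper_left", "upper_right", "lower_left", "lower_right")

def fuse_predictions(p_full, p_quads, w_full=0.5):
    """Blend the full-view probability with the strongest quadrant probability.

    Hypothetical rule: the source does not state the actual fusion method.
    """
    p_quad = max(p_quads[name] for name in QUADRANTS)
    return w_full * p_full + (1.0 - w_full) * p_quad

def fuse_heatmaps(h_full, h_quads, w_full=0.5):
    """Blend an HxW full-view map with four HxW quadrant maps (result is 2Hx2W)."""
    # Tile the quadrant maps back into their positions on the radiograph.
    top = np.hstack([h_quads["upper_left"], h_quads["upper_right"]])
    bottom = np.hstack([h_quads["lower_left"], h_quads["lower_right"]])
    mosaic = np.vstack([top, bottom])
    # Nearest-neighbour upsampling of the full-view map to the mosaic size.
    full_up = h_full.repeat(2, axis=0).repeat(2, axis=1)
    return w_full * full_up + (1.0 - w_full) * mosaic

# Usage with made-up probabilities and tiny placeholder heat maps.
p = fuse_predictions(
    0.2,
    {"upper_left": 0.1, "upper_right": 0.9, "lower_left": 0.05, "lower_right": 0.3},
)
h = fuse_heatmaps(np.ones((2, 2)), {name: np.zeros((2, 2)) for name in QUADRANTS})
print(round(p, 2), h.shape)
```

Taking the maximum over quadrants reflects the idea that a single positive quadrant is enough to flag a study; other choices, such as a learned fusion layer, would be equally consistent with the description.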
Model Assessment:
Objective validation of the trained CNN was performed by ROC curve analysis of the inferred predictions on the withheld test group, using Area Under the Curve (AUC) measures for each network channel separately (Selvaraju, 2017).
Given the limited dataset,
further subjective evaluation of the validation set predictions was performed with localization heat-map analysis,
using a combination of saliency maps and Gradient-weighted Class Activation Mapping (Grad-CAM) (9).
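For readers unfamiliar with Grad-CAM, its core computation can be sketched in a few lines of NumPy. This is a generic illustration of the published technique, not the authors' code: it assumes the feature maps of one convolutional layer and the gradients of the class score with respect to those maps have already been extracted from the trained network, a framework-specific step that is omitted here. The function name `grad_cam` is ours.

```python
import numpy as np

def grad_cam(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Compute a Grad-CAM map from one convolutional layer.

    activations: (K, H, W) feature maps from the forward pass.
    gradients:   (K, H, W) gradients of the class score w.r.t. those maps.
    """
    # Channel weights: global-average-pool the gradients over space.
    weights = gradients.mean(axis=(1, 2))  # shape (K,)
    # Weighted sum of the feature maps, then ReLU to keep positive evidence.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    # Rescale to [0, 1] for display, guarding against an all-zero map.
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Usage with random placeholders for the activations and gradients.
rng = np.random.default_rng(0)
acts = rng.random((8, 6, 6))
grads = rng.standard_normal((8, 6, 6))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)
```

The resulting low-resolution map is typically upsampled to the input size and overlaid on the radiograph for visual review.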
Model performance was assessed using the area under the receiver operating characteristic (ROC) curve (AUC) for the full-view and quadrant-view networks independently.
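As a reminder of what the AUC measures, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The labels and scores are invented for illustration and are not study data; the function name `roc_auc` is ours.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up test-set labels and per-channel scores (not results from the study).
labels = [0, 0, 1, 1]
channel_scores = {
    "full": [0.1, 0.4, 0.35, 0.8],
    "quads": [0.2, 0.3, 0.6, 0.9],
}

# Evaluate each network channel independently, as described above.
auc_by_channel = {name: roc_auc(labels, s) for name, s in channel_scores.items()}
for name, auc in auc_by_channel.items():
    print(f"{name}: AUC = {auc:.2f}")
```

This pairwise form is quadratic in the number of cases, which is acceptable for a sketch; library implementations compute the same quantity more efficiently from sorted scores.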