Lung, Management, Thorax, CT, PACS, Computer Applications-General, Statistics, Cancer, Epidemiology
W. Hendrix1, N. Hendrix1, M. Prokop1, E. Scholten1, B. Van Ginneken1, M. Rutten2, C. Jacobs1; 1Nijmegen/NL, 2's-Hertogenbosch/NL
Methods and materials
We retrospectively collected the radiology reports of chest CT studies performed between 2008 and 2019 from the Electronic Health Record (EHR) systems of two Dutch hospitals: one academic hospital and one teaching hospital. All studies were registered in both the EHR and the Picture Archiving and Communication System (PACS). Cases were included if the study was performed between 2008 and 2017; the final two years served as follow-up. Radiology reports from CT scans that contained only portions of the lungs (e.g. abdominal CT) were excluded. This study was approved by the medical ethical review boards of both institutions.
A natural language processing (NLP) algorithm was used to identify studies with any reported pulmonary nodule measuring up to 3 cm in diameter. An overview of this algorithm is shown in Figure 2. The algorithm is a rule-based system that uses combinations of keywords (e.g. nodule, lesion) and specific search patterns (e.g. for the detection of nodule diameter). The NLP algorithm was developed and evaluated on a set of 971 radiology reports from chest CT studies from the selected period. We found an accuracy of 91.6%, sensitivity of 92.7%, and specificity of 89.2%.
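To illustrate the general idea of such a rule-based approach, a minimal Python sketch is given below. It is not the algorithm used in this study: the keyword list, the size pattern, the sentence splitting, and the function names are all hypothetical. The sketch also assumes that a report counts as positive only when a keyword and a size of at most 3 cm occur in the same sentence, and it does not handle negations (e.g. "no nodule").

```python
import re

# Hypothetical keywords; the rules actually used in the study are not reproduced here.
NODULE_KEYWORDS = ("nodule", "lesion")

# Hypothetical size pattern: matches "8 mm", "1.2 cm", or "2,5 cm" (decimal comma).
SIZE_PATTERN = re.compile(r"(\d+(?:[.,]\d+)?)\s*(mm|cm)\b", re.IGNORECASE)

MAX_DIAMETER_MM = 30.0  # upper size limit of 3 cm, expressed in millimetres


def extract_diameters_mm(sentence):
    """Return all sizes mentioned in a sentence, converted to millimetres."""
    diameters = []
    for value, unit in SIZE_PATTERN.findall(sentence):
        size = float(value.replace(",", "."))
        if unit.lower() == "cm":
            size *= 10.0
        diameters.append(size)
    return diameters


def report_mentions_nodule(report):
    """Flag a report if any sentence pairs a keyword with a size of at most 3 cm."""
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        lowered = sentence.lower()
        if not any(keyword in lowered for keyword in NODULE_KEYWORDS):
            continue
        if any(size <= MAX_DIAMETER_MM for size in extract_diameters_mm(sentence)):
            return True
    return False
```

For example, `report_mentions_nodule("There is a nodule of 8 mm in the right upper lobe.")` flags the report, whereas a sentence describing a 4.5 cm lesion does not, because the size exceeds the 3 cm limit.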
We calculated the annual numbers of chest CT studies and of patients who underwent chest CT. Based on the predictions from the NLP algorithm, we estimated the number of patients in whom nodules were reported and the number who received follow-up after nodule identification.
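The annual counts described above could be derived along the following lines. This Python sketch is illustrative only and rests on assumptions not stated in this section: each study is represented as a hypothetical `(patient_id, study_date, nodule_reported)` tuple, and follow-up is taken to mean any later chest CT of the same patient within two years of a nodule-positive study.

```python
from collections import defaultdict
from datetime import date, timedelta

FOLLOW_UP_WINDOW = timedelta(days=730)  # assumed two-year follow-up window


def annual_summary(studies):
    """Summarise studies given as (patient_id, study_date, nodule_reported) tuples."""
    by_patient = defaultdict(list)
    for patient_id, study_date, nodule_reported in studies:
        by_patient[patient_id].append((study_date, nodule_reported))

    per_year = defaultdict(
        lambda: {"studies": 0, "patients": set(), "nodule": set(), "followed": set()}
    )
    for patient_id, visits in by_patient.items():
        visits.sort()  # chronological order per patient
        for index, (study_date, nodule_reported) in enumerate(visits):
            year = per_year[study_date.year]
            year["studies"] += 1
            year["patients"].add(patient_id)
            if not nodule_reported:
                continue
            year["nodule"].add(patient_id)
            # Follow-up (assumption): a later study within the window.
            deadline = study_date + FOLLOW_UP_WINDOW
            if any(study_date < later <= deadline for later, _ in visits[index + 1:]):
                year["followed"].add(patient_id)

    return {
        year: {
            "studies": counts["studies"],
            "patients": len(counts["patients"]),
            "nodule_patients": len(counts["nodule"]),
            "followed_up_patients": len(counts["followed"]),
        }
        for year, counts in sorted(per_year.items())
    }
```

Patients are counted once per year regardless of how many studies they had in that year, and follow-up is attributed to the year of the nodule-positive study rather than the year of the follow-up scan.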