Ismail Erbas, Ferhat Demikiran, et al.
NeurIPS 2025
Rule-based Natural Language Processing (NLP) pipelines depend on robust domain knowledge. Given the long tail of important terminology in radiology reports, it is not uncommon for standard approaches to miss items critical for understanding the image. AI techniques can accelerate the concept expansion and phrasal grouping tasks, making it efficient to create a domain-specific lexicon ontology for structuring reports. Using Chest X-ray (CXR) reports as an example, we demonstrate that, with a robust vocabulary, even a simple NLP pipeline can extract 83 directly mentioned abnormalities (average recall = 93.83%, precision = 94.87%) and 47 abnormality/normality descriptions of key anatomies. Compared to baseline methods, the richer vocabulary enables identification of additional label mentions for 10 of 13 labels. Furthermore, it captures expert insight into critical differences between observed and inferred descriptions, and into image quality issues in reports. Finally, we show how the CXR ontology can be used to anatomically structure labeled output.
Bc Kwon, Natasha Mulligan, et al.
ISMB 2025
Wojciech Ozga, Do Le Quoc, et al.
IFIP DBSec 2021
John D. Gould
Journal of Experimental Psychology