Artificial intelligence model outperformed experienced clinicians in differentiating the aetiology of pneumonia on chest computed tomography: a retrospective study.
Wenting Jin, Ying Shao, Jue Pan, Meixia Wang, Tongjie Gu, Wei Shen, Xi Ouyang, Zhi Qiao, et al. (12 authors)
Quantitative imaging in medicine and surgery · 2026-01
Abstract
BACKGROUND: Rapid and precise aetiological diagnosis is crucial for managing pneumonia. We aimed to develop and validate deep learning (DL) models for differentiating ten pneumonia aetiologies on chest computed tomography images.
METHODS: In this retrospective study, we enrolled 1,091 pneumonia patients with 1 of 10 definite aetiological diagnoses between October 1, 2015 and June 30, 2022. We trained and validated two DL models: a classic 3D-DenseNet model (DenseNet) and a novel large vision model (LVM). The models were then tested on an external dataset of 183 non-overlapping patients. Model performance was assessed using the area under the curve (AUC) of the Top1 diagnosis and the accuracy of the Top1, Top2, and Top3 diagnoses. The DL models were also compared with eight experienced radiologists and pulmonologists.
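The Top1, Top2, and Top3 accuracies named above can be illustrated with a minimal sketch. It assumes the usual definition, in which a case counts as correct when the true aetiology is among the k highest-scoring classes; the abstract does not give the authors' exact computation. The function name `top_k_accuracy` and the toy scores are hypothetical and are not study data.

```python
from typing import Sequence


def top_k_accuracy(
    probs: Sequence[Sequence[float]], labels: Sequence[int], k: int
) -> float:
    """Fraction of cases whose true class is among the k highest-scoring classes.

    Assumed definition for illustration; not taken from the study's code.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not labels:
        raise ValueError("at least one case is required")
    hits = 0
    for scores, true_class in zip(probs, labels):
        # Rank class indices from highest to lowest predicted score.
        ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
        if true_class in ranked[:k]:
            hits += 1
    return hits / len(labels)


# Hypothetical toy scores for 4 patients and 3 classes (not study data).
toy_probs = [
    [0.7, 0.2, 0.1],  # true class 0 is ranked first
    [0.3, 0.5, 0.2],  # true class 0 is ranked second
    [0.2, 0.3, 0.5],  # true class 0 is ranked third
    [0.1, 0.6, 0.3],  # true class 1 is ranked first
]
toy_labels = [0, 0, 0, 1]

for k in (1, 2, 3):
    print(f"Top{k} accuracy: {top_k_accuracy(toy_probs, toy_labels, k):.2f}")
```

Under this definition the Top-k accuracy can only stay the same or increase as k grows, which matches the ordering of the Top1, Top2, and Top3 values reported for the models.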
RESULTS: The LVM combined with the non-imaging model (LVM+) achieved higher average prediction performance than DenseNet combined with the non-imaging model (DenseNet+), radiologists given non-imaging data (radiologists+) and pulmonologists given non-imaging data (pulmonologists+), with Top1 AUCs of 0.872, 0.851, 0.643 and 0.644, respectively. The Top1, Top2, and Top3 accuracies of LVM+ were 0.527, 0.701 and 0.820, respectively, likewise exceeding those of DenseNet+, radiologists+ and pulmonologists+. The two models performed similarly on the external test set, with Top1 AUCs of 0.743 for DenseNet and 0.775 for LVM. The classification confusion matrices of LVM and DenseNet, with or without the non-imaging model, showed a significant advantage in identifying pulmonary non-tuberculous mycobacterial disease (PNTM), pulmonary tuberculosis (PTB) and Pneumocystis jirovecii pneumonia (PJP).
CONCLUSIONS: This study presents a comprehensive classification closely aligned with pneumonia diagnosis in realistic clinical settings. We expect this method to be applied clinically and to foster novel approaches that improve the accuracy of pneumonia diagnosis.