Advanced Diagnostic & Interventional Radiology Research Center | Methodological insights into ChatGPT’s screening performance

Dec 14 2025

Advanced Diagnostic & Interventional Radiology Research Center

COVID-19 pandemic 

During the COVID-19 pandemic, the Radiology Research Center at Tehran University of Medical Sciences continued its research activities despite the challenges posed by the increased demand for CT scans of COVID-19 patients and the necessity of adhering to strict health protocols. This center played a crucial role in improving medical imaging techniques, optimizing diagnostic protocols, and advancing technologies related to CT scan image analysis.

Faculty members, researchers, and staff remained committed to ensuring the safety and well-being of healthcare professionals and patients while actively engaging in imaging data analysis, developing artificial intelligence algorithms for faster disease detection, publishing scientific articles, and presenting their findings at international conferences. These efforts aimed to enhance diagnostic accuracy, improve treatment processes, and alleviate pressure on healthcare systems.

 

Key achievements of the Radiology Research Center during the COVID-19 pandemic include:


✔️ Development and optimization of lung imaging protocols for faster and more accurate COVID-19 diagnosis
✔️ Implementation of artificial intelligence technologies for automated CT scan analysis and reduced diagnosis time
✔️ Publication of high-impact research articles on innovative imaging methods for COVID-19 patients
✔️ Participation in national and international projects focused on COVID-19 diagnosis and patient management

The center remains dedicated to advancing research in medical imaging and continues to contribute as a leading scientific institution in improving the quality of diagnostic and therapeutic services.

 

Some of the center's significant achievements during the pandemic include:

 

  • Release Date : Jun 18 2025 - 10:05

Methodological insights into ChatGPT’s screening performance in systematic reviews

Background: Screening records for systematic reviews and meta-analyses in medical research is labor-intensive and time-consuming. Machine learning and deep learning have been applied to facilitate this process, but these methods often require training data and user annotation. This study assesses the efficacy of ChatGPT, a large language model based on the Generative Pre-trained Transformer (GPT) architecture, in automating the screening process for systematic reviews in radiology without the need for training data.

Methods: A prospective simulation study was conducted between May 2 and May 24, 2023, comparing ChatGPT's performance in screening abstracts against that of general physicians (GPs). A total of 1198 abstracts across three subfields of radiology were evaluated. Performance metrics included sensitivity, specificity, positive and negative predictive values (PPV and NPV), and workload saving, among others. Statistical analyses included the Kappa coefficient for inter-rater agreement, ROC curve plotting, AUC calculation, and bootstrapping for p-values and confidence intervals.
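
As an illustration of the metrics named above, the following is a minimal sketch and not the study's own analysis code. It assumes screening decisions are coded as 1 (include) and 0 (exclude), uses the standard textbook definitions of sensitivity, specificity, PPV, NPV, and Cohen's Kappa, and runs on made-up toy data. The function names are hypothetical.

```python
def confusion_counts(truth, pred):
    """Count TP, FP, TN, FN for binary include (1) / exclude (0) decisions."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def screening_metrics(truth, pred):
    """Standard diagnostic metrics; assumes every denominator is non-zero."""
    tp, fp, tn, fn = confusion_counts(truth, pred)
    return {
        "sensitivity": tp / (tp + fn),  # relevant records that were kept
        "specificity": tn / (tn + fp),  # irrelevant records that were dropped
        "ppv": tp / (tp + fp),          # kept records that are relevant
        "npv": tn / (tn + fn),          # dropped records that are irrelevant
    }


def cohen_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters making binary decisions."""
    n = len(rater_a)
    observed = sum(1 for a, b in zip(rater_a, rater_b) if a == b) / n
    p_a = sum(rater_a) / n  # share of records rater A includes
    p_b = sum(rater_b) / n  # share of records rater B includes
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # chance agreement
    return (observed - expected) / (1 - expected)


# Toy data (invented for illustration, not taken from the study).
truth = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
model = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = screening_metrics(truth, model)
kappa = cohen_kappa(truth, model)
```

A screening tool tuned for high sensitivity will typically accept more false positives, which lowers specificity and PPV and can also pull Kappa down even when few relevant records are missed.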

Results: ChatGPT completed the screening process within an hour, while GPs took an average of 7-10 days. The AI model achieved a sensitivity of 95% and an NPV of 99%, slightly outperforming the GPs' sensitive consensus (i.e., a record is included if at least one rater includes it). It also exhibited remarkably low false negative counts and high workload savings, ranging from 40% to 83%. However, ChatGPT had lower specificity and PPV than the human raters. The average Kappa agreement between ChatGPT and other raters was 0.27.
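
The sensitive consensus rule and the workload-saving figure can be sketched as follows. This is a hedged illustration on invented data: the consensus rule follows the definition given in the text (include if at least one rater includes), while the workload-saving formula is an assumption, taken here as the share of records the model excludes and that humans would therefore not need to read. The study may define it differently. The function names are hypothetical.

```python
def sensitive_consensus(*rater_decisions):
    """Include a record (1) if at least one rater included it, else exclude (0)."""
    return [1 if any(votes) else 0 for votes in zip(*rater_decisions)]


def workload_saving(model_decisions):
    """Assumed definition: fraction of all records the model excludes."""
    excluded = sum(1 for d in model_decisions if d == 0)
    return excluded / len(model_decisions)


# Toy data (invented for illustration, not taken from the study).
rater_a = [1, 0, 0, 1, 0]
rater_b = [0, 0, 1, 1, 0]
consensus = sensitive_consensus(rater_a, rater_b)

model_decisions = [1, 0, 1, 1, 0, 0, 0, 0]
saving = workload_saving(model_decisions)
```

Under this assumed definition, a saving of 0.625 would mean human reviewers only need to read the remaining 37.5% of abstracts, provided the excluded set contains very few relevant records (that is, NPV is high).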

Conclusions: ChatGPT shows promise in automating the article screening phase of systematic reviews, achieving high sensitivity and workload savings. While not entirely replacing human expertise, it could serve as an efficient first-line screening tool, particularly in reducing the burden on human resources. Further studies are needed to fine-tune its capabilities and validate its utility across different medical subfields.

  • Article DOI : 10.1186/s12874-024-02203-8
  • Author(s) : Mahbod Issaiy, Kavous Firouznia
  • News Group : research, research article, AI
  • News Code : 299823
Author: نفیسه سادات قوامی
