ChatGPT-4, the large language model from OpenAI, outperformed human clinicians on a set of probabilistic reasoning tasks. The research, detailed in a research letter published on December 11 in JAMA Network Open, found that the AI made smaller errors than clinicians when estimating pretest and post-test disease probability, particularly after a negative test result, in cases that included chest radiographs and mammograms.
Led by Dr. Adam Rodman from Beth Israel Deaconess Medical Center in Boston, the investigative team found that ChatGPT-4 made smaller errors than clinicians in estimating pretest probability, and post-test probability after a negative result, in five distinct cases. These cases involved chest radiography for pneumonia, mammography for breast cancer, a stress test for coronary artery disease, urine culture for urinary tract infection, and a hypothetical testing scenario.
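For readers unfamiliar with the terms, the relationship between pretest and post-test probability can be sketched in a few lines of Python. This is a minimal illustration of the standard likelihood-ratio form of Bayes' rule, not the method used in the research letter; the function name and the sensitivity, specificity, and pretest values below are made up for the example.

```python
def post_test_probability(pretest, sensitivity, specificity, positive):
    """Update a pretest probability with one test result via likelihood ratios."""
    if positive:
        # LR+ : how much more likely a positive result is in diseased patients
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        # LR- : how much more likely a negative result is in diseased patients
        likelihood_ratio = (1.0 - sensitivity) / specificity
    pre_odds = pretest / (1.0 - pretest)          # probability -> odds
    post_odds = pre_odds * likelihood_ratio       # Bayes' rule in odds form
    return post_odds / (1.0 + post_odds)          # odds -> probability


# Hypothetical inputs (NOT from the study): 20% pretest probability,
# a test with 90% sensitivity and 80% specificity.
after_negative = post_test_probability(0.20, 0.90, 0.80, positive=False)
after_positive = post_test_probability(0.20, 0.90, 0.80, positive=True)
print(f"After a negative result: {after_negative:.1%}")
print(f"After a positive result: {after_positive:.1%}")
```

With these illustrative inputs, a negative result lowers the probability of disease from 20% to about 3%, while a positive result raises it to about 53%.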
The research compared ChatGPT-4's performance with responses from 553 human clinicians across various specialties. The AI's error was lower even in scenarios where the median estimate from the model diverged from the correct answer more than the median human estimate. For instance, in the case of asymptomatic bacteriuria, the AI's median pretest probability was 26%, compared with a human median of 20%.
Notably, ChatGPT-4's responses were more narrowly distributed than those of its human counterparts, and in certain cases the model was more accurate in estimating post-test probability after a positive test result. The study acknowledged the model's imperfections while emphasizing the potential of AI diagnostic aids like ChatGPT-4 to complement human diagnostic performance through collective intelligence.
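A narrower spread of answers can produce a lower overall error even when the median answer is further from the truth. The sketch below illustrates this with invented numbers: the "true" probability and both sets of estimates are hypothetical, chosen only so that the medians are 20 and 26, and mean absolute error is assumed here as the error measure, which the article does not specify.

```python
from statistics import mean, median


def mean_absolute_error(estimates, truth):
    """Average distance between each estimate and the true value."""
    return mean(abs(e - truth) for e in estimates)


# All values are hypothetical percentages, not data from the study.
TRUE_PROBABILITY = 5
human_estimates = [1, 5, 20, 60, 90]     # wide spread, median 20
model_estimates = [24, 25, 26, 27, 28]   # narrow spread, median 26

human_error = mean_absolute_error(human_estimates, TRUE_PROBABILITY)
model_error = mean_absolute_error(model_estimates, TRUE_PROBABILITY)

print("Human median:", median(human_estimates), "error:", human_error)
print("Model median:", median(model_estimates), "error:", model_error)
```

In this toy example the model's median sits further from the true value than the human median, yet its average error is smaller because none of its answers are extreme outliers.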
While the findings underscore the evolving role of AI in medical diagnostics, the researchers called for future studies to explore the performance of large language models in more complex clinical cases. The study marks a step toward integrating AI technologies into healthcare, with the potential to improve diagnostic capabilities and medical decision-making.