It could quickly become essential in hospitals. In the emergency room, the artificial intelligence chatbot ChatGPT made diagnoses at least as well as doctors and in some cases outperformed them, Dutch researchers have found, saying AI could “revolutionize the medical field.”
The authors of the study, published Wednesday, stressed however that the days of emergency doctors are not yet numbered: the chatbot could potentially speed up diagnosis, but it cannot replace the judgment and experience of a human.
The researchers reviewed thirty cases treated in an emergency department in the Netherlands in 2022, feeding ChatGPT the patient histories, laboratory tests and physician observations and asking the chatbot to suggest five possible diagnoses. The doctors' lists contained the correct diagnosis in 87% of cases, compared with 97% for version 3.5 of ChatGPT. The chatbot “was capable of carrying out medical diagnoses much like a human doctor would have done,” summarized Hidde ten Berg, from the emergency department of Jeroen Bosch Hospital, in the south of the Netherlands.
“Ideas the doctor hadn’t thought of”
Study co-author Steef Kurstjens stressed that the study did not conclude that computers could one day run emergency rooms, but rather that AI could play a vital role in helping doctors under pressure. The chatbot “can help make a diagnosis and maybe can offer ideas that the doctor hadn’t thought of,” he said. He noted, however, that such tools are not designed as medical devices, and he shared concerns about the privacy of sensitive medical data entered into a chatbot.
And, as in other areas, ChatGPT ran into some limitations. Its reasoning was “at times medically implausible or inconsistent, which can lead to misinformation or incorrect diagnosis, with significant implications,” the study notes.
The scientists also acknowledge some shortcomings in their research, such as the small sample size. Additionally, only relatively simple cases were reviewed, involving patients who presented with a single main complaint. The effectiveness of the chatbot in complex cases is unclear.
In some cases, none of ChatGPT's five suggestions was the correct diagnosis, explained Steef Kurstjens, notably in the case of an abdominal aortic aneurysm, a potentially fatal condition in which the aorta swells. As a consolation for ChatGPT, the doctor also got that case wrong.
The report further points to medical “blunders” made by the chatbot, such as diagnosing anemia (a low level of hemoglobin in the blood) in a patient whose hemoglobin level was normal. The results of the study, published in the specialist journal Annals of Emergency Medicine, will be presented at the 2023 European Emergency Medicine Congress (EUSEM) in Barcelona.