Evaluating ChatGPT-3.5 in allergology: performance in the Polish Specialist Examination

Michał Bielówka; Jakub Kufel; Marcin Rojek; Adam Mitręga; Dominika Kaczyńska; Łukasz Czogalik; Michał Janik; Wiktoria Bartnikowska; Sylwia Mielcarska; Dominika Kondoł

doi:10.5114/pja.2024.135380

Abstract

1/2024 vol. 11

Original paper

Michał Bielówka¹; Jakub Kufel²,³; Marcin Rojek¹,⁴; Adam Mitręga¹; Dominika Kaczyńska¹; Łukasz Czogalik¹; Michał Janik¹; Wiktoria Bartnikowska⁵; Sylwia Mielcarska⁶; Dominika Kondoł⁷

¹ Students’ Scientific Association of Computer Analysis and Artificial Intelligence at the Department of Radiology and Nuclear Medicine, Medical University of Silesia, Katowice, Poland
² Department of Radiodiagnostics, Interventional Radiology and Nuclear Medicine, Medical University of Silesia, Katowice, Poland
³ Department of Radiology and Nuclear Medicine, Medical University of Silesia, Katowice, Poland
⁴ Students’ Scientific Association at the Department of Microbiology and Immunology, Medical University of Silesia, Katowice, Poland
⁵ Faculty of Medical Sciences in Katowice, Medical University of Silesia, Katowice, Poland
⁶ Department of Medical and Molecular Biology, Faculty of Medical Sciences in Zabrze, Medical University of Silesia, Zabrze, Poland
⁷ Dr B. Hager Memorial Multi-specialty District Hospital, Tarnowskie Góry, Poland

Alergologia Polska – Polish Journal of Allergology 2024; 11, 1: 42–47

DOI: https://doi.org/10.5114/pja.2024.135380

Online publish date: 2024/02/13

Introduction:

The development of artificial intelligence (AI) and attempts to use it in medicine are increasingly the subject of scientific research.

Aim:

This article aims to assess how effectively the advanced language model ChatGPT-3.5 performs on the Polish National Specialist Examination (PES) in allergology, measured by whether it reaches a passing score. It also seeks to understand the potential applications of artificial intelligence in medicine, particularly in allergology.

Material and methods:

The study used the latest available PES exam prepared by the Medical Research Centre in Lodz. A total of 118 questions were submitted through the openai.com platform, which allows free access to the ChatGPT-3.5 model. All questions were classified according to Bloom’s taxonomy to assess their complexity and difficulty, and were additionally categorised in three further ways. Each question was asked five times.
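The abstract does not state how the five responses to each question were combined into a single answer. As a minimal sketch of one possible approach, assuming a simple most-frequent-answer rule (an assumption, not the authors' confirmed method), the repeated responses could be reduced as follows. The function names and the sample responses are hypothetical.

```python
from collections import Counter

def most_frequent_answer(responses):
    """Return the answer given most often across repeated attempts.

    Assumption: ties are broken in favour of the answer seen first,
    which is how Counter.most_common orders equal counts.
    """
    counts = Counter(responses)
    return counts.most_common(1)[0][0]

def agreement(responses):
    """Fraction of attempts that match the most frequent answer."""
    top = most_frequent_answer(responses)
    return responses.count(top) / len(responses)

# Hypothetical five attempts at one multiple-choice question.
attempts = ["A", "B", "A", "C", "A"]
final_answer = most_frequent_answer(attempts)
consistency = agreement(attempts)
```

The agreement value gives a rough indication of how stable the model's answer is across repeated prompts.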

Results:

ChatGPT-3.5 did not pass the allergology PES, achieving a score of 52.54%. The model performed better on memory questions (60%) than on questions requiring comprehension and critical thinking (45%). Within the categories ‘treatment’, ‘immune system’ and ‘symptoms’, the model exceeded the passing threshold. Questions that ChatGPT answered correctly exhibited significantly higher difficulty than those it answered incorrectly.
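The reported score can be related back to the number of questions with simple arithmetic. The sketch below is an inference from the two figures given in the abstract (118 questions, 52.54%), not a count stated by the authors; the function name is illustrative.

```python
def percent_score(correct, total):
    """Percentage of correct answers, rounded to two decimals."""
    return round(100 * correct / total, 2)

TOTAL_QUESTIONS = 118    # from the abstract
REPORTED_SCORE = 52.54   # from the abstract, in percent

# Find every whole number of correct answers that reproduces the reported score.
matching_counts = [
    c for c in range(TOTAL_QUESTIONS + 1)
    if percent_score(c, TOTAL_QUESTIONS) == REPORTED_SCORE
]
```

Under these figures only one whole number of correct answers, 62 out of 118, is consistent with the reported percentage.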

Conclusions:

The results indicate that ChatGPT’s pass rate in the allergology PES is considerably lower than that of resident doctors specialising in this field. Further research is needed before AI applications in medicine can effectively support physicians in clinical practice.

Keywords

artificial intelligence, allergology, language model, ChatGPT-3
