Assessing Quality of Scenario-Based Multiple-Choice Questions in Physiology: Faculty-Generated vs. ChatGPT-Generated Questions among Phase I Medical Students
Document Type
Article
Publication Title
International Journal of Artificial Intelligence in Education
Abstract
The integration of Artificial Intelligence (AI), particularly Chat Generative Pre-trained Transformer (ChatGPT), into medical education has introduced new possibilities for generating educational resources for assessment. However, ensuring the quality of ChatGPT-generated assessments poses challenges, and limited research in the literature addresses this issue. Recognizing this gap, our study aimed to investigate the quality of ChatGPT-based assessments. In this study of first-year medical students, a crossover design was employed to compare scenario-based multiple-choice questions (SBMCQs) written by faculty members with those generated by ChatGPT, using item analysis to determine assessment quality. The study comprised three main phases: development, implementation, and evaluation of the SBMCQs. During the development phase, faculty members and ChatGPT each generated 60 SBMCQs covering cardiovascular, respiratory, and endocrine physiology. These questions were assessed by independent reviewers, after which 80 SBMCQs were selected for the tests. In the implementation phase, 120 students, divided into two batches, were assigned to receive either faculty-generated or ChatGPT-generated questions across four test sessions. The collected data underwent rigorous item analysis and thematic analysis to evaluate the effectiveness and quality of the questions from both sources. Only 9 of ChatGPT's SBMCQs met the ideal MCQ criteria for Difficulty Index, Discrimination Index, and Distractor Effectiveness, compared with 19 of the faculty's. Moreover, ChatGPT's questions exhibited a higher rate of nonfunctional distractors (33.75% vs. 13.75% for faculty). During focus group discussions, faculty highlighted the importance of educators reviewing, refining, and validating ChatGPT-generated SBMCQs to ensure their appropriateness within the educational context.
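The item-analysis indices named in the abstract (Difficulty Index, Discrimination Index, Distractor Effectiveness) are conventionally computed from examinee responses. The sketch below shows the standard formulas on hypothetical data; the 5% distractor cutoff, the acceptable difficulty range, and the upper/lower group split are common conventions assumed here, not values confirmed by this record.

```python
# Illustrative item-analysis sketch (not the authors' code). Standard formulas
# for Difficulty Index, Discrimination Index, and distractor functionality.
# Thresholds are widely used conventions, assumed for demonstration only.

def difficulty_index(correct_responses, total_examinees):
    """Difficulty Index (p): percentage of examinees answering the item correctly."""
    return 100.0 * correct_responses / total_examinees

def discrimination_index(upper_correct, lower_correct, group_size):
    """Discrimination Index (D): difference in correct responses between the
    upper- and lower-scoring groups (commonly the top and bottom ~27%)."""
    return (upper_correct - lower_correct) / group_size

def nonfunctional_distractors(option_counts, key, total_examinees, cutoff=0.05):
    """A distractor is typically considered nonfunctional if chosen by fewer
    than 5% of examinees (cutoff is a convention, assumed here)."""
    return [opt for opt, n in option_counts.items()
            if opt != key and n / total_examinees < cutoff]

# Hypothetical item: 120 examinees, answer key 'B', upper/lower groups of 32 each.
counts = {"A": 10, "B": 78, "C": 28, "D": 4}
p = difficulty_index(counts["B"], 120)          # 65.0 -> within the ~30-70% range
d = discrimination_index(upper_correct=28,
                         lower_correct=14,
                         group_size=32)         # ~0.44 -> good discrimination
nfd = nonfunctional_distractors(counts, key="B", total_examinees=120)  # ['D']
print(p, d, nfd)
```

In this sketch an item would count toward the "ideal criteria" tally reported in the abstract only if all three checks pass; the exact cut-offs applied by the authors are described in the full article rather than in this record.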
First Page
2315
Last Page
2344
DOI
10.1007/s40593-025-00471-z
Publication Date
12-1-2025
Recommended Citation
Chauhan, Archana; Khaliq, Farah; and Nayak, Kirtana Raghurama, "Assessing Quality of Scenario-Based Multiple-Choice Questions in Physiology: Faculty-Generated vs. ChatGPT-Generated Questions among Phase I Medical Students" (2025). Open Access archive. 12051.
https://impressions.manipal.edu/open-access-archive/12051