Quick Summary
This study introduced an AI-driven platform for generating USMLE-style practice questions; 87% of the generated questions were deemed accurate and compliant with NBME item-writing guidelines. The pilot at the University of Cincinnati College of Medicine showed strong student engagement and a non-significant trend toward improved exam performance with greater question bank use.
Key Details
- Participants: 177 first-year medical students
- Content: 565 USMLE-style practice questions
- Technology: Large Language Model (LLM) with retrieval augmented generation (RAG)
- Accuracy: 490 questions (87%) deemed accurate and NBME-compliant
- Delivery: Questions deployed via a mobile app
Key Takeaways
- AI technology can effectively generate high-quality practice questions for medical exams.
- Student engagement was strong, with some students completing up to 220 questions.
- Student feedback was overwhelmingly positive, highlighting enthusiasm for AI-assisted study tools.
- Future iterations will focus on scalability and reducing faculty workload.
- Potential expansion to additional courses and health professions is planned.
- Human oversight ensured content validity and adherence to educational standards.
- Increased question bank usage trended toward improved exam results, though the trend was not statistically significant.

Background
High-stakes licensing exams such as the USMLE are pivotal in medical education, affecting both trainee progression and patient outcomes. Access to quality board preparation resources, however, is often uneven, particularly for students from underrepresented or financially disadvantaged backgrounds. This study addresses these disparities by leveraging AI technology to broaden access to high-quality practice materials.
Study
Conducted at the University of Cincinnati College of Medicine between November and December 2023, this pilot study aimed to develop an AI-driven system for generating USMLE-style practice questions. The system utilized a Large Language Model (LLM) enhanced with retrieval augmented generation (RAG) and was validated through a human-in-the-loop process led by a faculty course director.
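The paper does not publish its pipeline code, so purely as an illustration, here is a minimal sketch of what a retrieval augmented generation step for item writing could look like; the snippet list, retriever, model name, and prompt wording are all assumptions, not the authors' implementation:

```python
# Minimal RAG sketch: retrieve lecture passages, then prompt an LLM to draft
# one NBME-style item grounded in them. Every name here (snippets, model,
# prompt wording) is an illustrative assumption, not the authors' code.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Stand-in for an index built from the preclinical hematology lectures.
LECTURE_SNIPPETS = [
    "Iron deficiency anemia: microcytic, hypochromic red cells; low ferritin.",
    "Vitamin B12 deficiency: macrocytic anemia, hypersegmented neutrophils.",
    "Warm autoimmune hemolytic anemia: IgG-mediated; spherocytes on smear.",
]

def retrieve_passages(query: str, k: int = 2) -> list[str]:
    # Toy retriever: rank snippets by word overlap with the query.
    # A production system would use embedding search over full lecture text.
    q = set(query.lower().split())
    return sorted(
        LECTURE_SNIPPETS,
        key=lambda s: -len(q & set(s.lower().split())),
    )[:k]

def generate_question(topic: str) -> str:
    context = "\n".join(retrieve_passages(topic))
    prompt = (
        "Write one USMLE-style multiple-choice question that follows NBME "
        "item-writing guidelines. Use only this lecture content:\n"
        f"{context}\n"
        "Reason step by step privately, then output only the final item as "
        'JSON with keys "stem", "options", "answer_index", "explanation".'
    )
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model; the paper does not name one
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(generate_question("microcytic anemia"))
```

Grounding generation in retrieved lecture text is the step that keeps items tied to the course material; in the study's workflow, the output would then pass to the faculty reviewer.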
Results
Of the 565 questions generated, 490 (87%) were validated as accurate and compliant with NBME guidelines. The mobile app let students practice and receive performance feedback, and qualitative feedback indicated strong enthusiasm for the AI-assisted study tools. Although not statistically significant, there was a trend toward improved performance on related exam questions with increased use of the question bank.
Impact and Implications
This pilot study demonstrates the potential of AI-driven platforms to enhance medical education by providing equitable access to high-quality practice resources. The positive feedback and engagement from students suggest that such technologies could significantly improve learning outcomes and prepare future healthcare professionals more effectively. The implications extend beyond just the USMLE, with plans for expansion into other courses and health professions.
Conclusion
The findings from this study highlight the transformative potential of AI in medical education. By generating high-quality, guideline-aligned practice questions, we can enhance student learning experiences and outcomes. As the platform evolves, it promises to further reduce barriers to access and improve educational equity in the medical field. We look forward to seeing how this technology will shape the future of medical training!
Your comments
What are your thoughts on the integration of AI in medical education? Do you believe it can truly level the playing field for all students? Share your insights in the comments below or connect with us on social media.
An Artificial Intelligence-Driven Platform for Practice Question Generation.
Abstract
PROBLEM: High-stakes licensing exams such as the USMLE play a critical role in medical education, influencing both trainee progression and patient outcomes. Access to high-quality board preparation resources is uneven and often cost-prohibitive, disproportionately affecting students from underrepresented or financially disadvantaged backgrounds.
APPROACH: An AI-driven system for generating USMLE-style practice questions aligned with NBME item-writing guidelines was developed and piloted at the University of Cincinnati College of Medicine between November and December 2023. The system used a Large Language Model (LLM) enhanced with retrieval augmented generation (RAG), chain-of-thought and few-shot prompting, and JSON schema validation. Five lectures from a preclinical hematology course were selected, and 565 questions were generated for 177 first-year medical students. A human-in-the-loop process, led by a faculty course director, ensured content validity and adherence to educational standards. Validated questions were deployed via a mobile app, allowing students to practice, receive performance feedback, and access an AI tutor.
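The abstract names JSON schema validation as one of the safeguards but does not publish the schema itself. A minimal sketch with the `jsonschema` library, using assumed field names for a five-option item, might look like this:

```python
# Sketch of JSON schema validation for one generated item. The field names
# and five-option format are assumptions; the paper's schema is unpublished.
import json

from jsonschema import ValidationError, validate

ITEM_SCHEMA = {
    "type": "object",
    "properties": {
        "stem": {"type": "string", "minLength": 1},
        "options": {
            "type": "array",
            "items": {"type": "string"},
            "minItems": 5,  # USMLE items commonly offer five options (A-E)
            "maxItems": 5,
        },
        "answer_index": {"type": "integer", "minimum": 0, "maximum": 4},
        "explanation": {"type": "string"},
    },
    "required": ["stem", "options", "answer_index", "explanation"],
    "additionalProperties": False,
}

def is_valid_item(raw_llm_output: str) -> bool:
    """Reject model output that is not well-formed JSON or breaks the schema."""
    try:
        validate(instance=json.loads(raw_llm_output), schema=ITEM_SCHEMA)
        return True
    except (json.JSONDecodeError, ValidationError):
        return False
```

An item that fails the check can simply be regenerated, so only well-formed JSON reaches the human-in-the-loop review.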
OUTCOMES: Of the 565 questions, 490 (87%) were deemed accurate and NBME-compliant. Eighty students used the question bank, completing up to 220 questions each. Although not statistically significant, increased use trended toward improved performance on related exam questions. Qualitative feedback highlighted enthusiasm for AI-assisted study tools, with calls for broader content coverage.
NEXT STEPS: This pilot demonstrates that LLMs can generate high-quality, guideline-aligned practice questions. To improve scalability and reduce faculty workload, future iterations will incorporate AI-based review agents for pre-screening content. The platform is intended to be expanded to additional courses, training phases, and health professions. Ongoing refinement will focus on improving content specificity and maintaining accuracy, especially in advanced and subspecialty education.
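The planned review agents are not described beyond this sentence; one plausible shape for such a pre-screening pass, sketched here with an assumed prompt and model, is a second LLM call that flags common NBME item-writing violations before a human sees the item:

```python
# Hypothetical review agent: a second LLM call that pre-screens a draft item
# for NBME item-writing violations. Prompt wording and model are assumptions;
# the abstract does not specify how the review agents will work.
from openai import OpenAI

client = OpenAI()

REVIEW_PROMPT = (
    "Review this draft USMLE-style question for NBME item-writing problems "
    "(answer cueing, grammatical mismatch between stem and options, "
    "implausible distractors, factual errors). Reply with exactly 'PASS' "
    "or 'FLAG: <reason>'.\n\nItem:\n{item}"
)

def prescreen(item_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model
        messages=[{"role": "user", "content": REVIEW_PROMPT.format(item=item_text)}],
    )
    return response.choices[0].message.content.strip()
```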
Authors: Zahn A, Overla S, Lowrie DJ, Zhou CY, Santen SA, Zheng W, Turner L
Journal: Acad Med
Citation: Zahn A, et al. An Artificial Intelligence-Driven Platform for Practice Question Generation. Acad Med. 2026. doi: 10.1093/acamed/wvaf074