๐Ÿง‘๐Ÿผโ€๐Ÿ’ป Research - April 11, 2025

Evaluating the Effectiveness of Large Language Models in Providing Patient Education for Chinese Patients With Ocular Myasthenia Gravis: Mixed Methods Study.


⚡ Quick Summary

This study evaluated the effectiveness of large language models (LLMs) in providing patient education for Chinese patients with ocular myasthenia gravis (OMG). The findings revealed that ChatGPT o1-preview outperformed other models in accuracy and patient satisfaction, highlighting the potential of AI in enhancing patient education.

🔍 Key Details

  • 📊 Study materials and participants: 130 ophthalmology examination questions, 23 OMG-related patient questions, and 20 patients with OMG.
  • ⚙️ Technology: 5 different LLMs, including ChatGPT o1-preview, GEMINI, and Ernie 3.5.
  • 🏆 Performance Metrics: Accuracy, completeness, helpfulness, safety, and readability.

🔑 Key Takeaways

  • 📈 ChatGPT o1-preview achieved the highest accuracy rate of 73% on ophthalmology questions.
  • 💡 For OMG-related questions, ChatGPT o1-preview scored highest in correctness (4.44), completeness (4.44), helpfulness (4.47), and safety (4.6).
  • 📚 GEMINI provided the most readable responses, while GPT-4o gave more complex answers.
  • 🤝 Patient satisfaction was higher for ChatGPT o1-preview than for Ernie 3.5 (4.40 vs 3.89).
  • 📝 Readability scores slightly favored Ernie 3.5 (4.31 vs 4.03).
  • ⚠️ Challenges include misinformation risks and ethical considerations in AI integration.
  • 🌍 The study was conducted in China, addressing a significant gap in patient education resources.

📚 Background

Ocular myasthenia gravis (OMG) is a neuromuscular disorder that primarily affects the extraocular muscles, resulting in symptoms such as ptosis and diplopia. Effective patient education is essential for managing this condition, yet many patients in China face barriers due to limited healthcare resources. The emergence of large language models (LLMs) offers a promising solution to provide instant, AI-driven health information, but their effectiveness in this context requires thorough evaluation.

🗒️ Study

This mixed methods study was conducted in two phases. In the first phase, 130 ophthalmology examination questions were input into five different LLMs, and their performance was compared with that of undergraduates, master’s students, and ophthalmology residents. Additionally, 23 common OMG-related questions were posed to four LLMs, with responses evaluated by ophthalmologists across five domains. The second phase involved 20 patients with OMG interacting with two LLMs, assessing the responses for satisfaction and readability.
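The expert-scoring step above can be illustrated with a small sketch. All ratings below are hypothetical placeholders, not the study’s data; it simply shows how per-domain Likert scores from several ophthalmologist raters might be averaged for each model:

```python
from statistics import mean

# Hypothetical 1-5 Likert ratings from three raters, across the study's
# five evaluation domains. Values are illustrative only.
ratings = {
    "ChatGPT o1-preview": {
        "correctness": [5, 4, 4], "completeness": [4, 5, 4],
        "helpfulness": [5, 4, 5], "safety": [5, 5, 4], "readability": [4, 4, 4],
    },
    "Ernie 3.5": {
        "correctness": [4, 4, 3], "completeness": [4, 3, 4],
        "helpfulness": [4, 4, 3], "safety": [4, 4, 4], "readability": [5, 4, 4],
    },
}

def domain_means(model_ratings):
    """Average the raters' scores within each domain, rounded to 2 decimals."""
    return {domain: round(mean(scores), 2) for domain, scores in model_ratings.items()}

for model, scores in ratings.items():
    print(model, domain_means(scores))
```

In practice, inter-rater agreement would also be checked before averaging, but the per-domain mean is the figure reported in summaries like the one above.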

📈 Results

The results indicated that ChatGPT o1-preview achieved the highest accuracy rate of 73% on the ophthalmology examination questions. For the OMG-related questions, it also scored highest in correctness, completeness, helpfulness, and safety. In terms of readability, GEMINI provided the easiest-to-understand responses, while GPT-4o was more complex. In the second phase, ChatGPT o1-preview received higher satisfaction scores compared to Ernie 3.5, although the latter had slightly better readability.
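The summary above does not specify which readability instrument the study used, and the study’s responses were in Chinese, which would require a Chinese-specific metric. For English text, though, a common automated measure is the Flesch Reading Ease formula; here is a minimal sketch with a rough heuristic syllable counter (real tools use pronunciation dictionaries):

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count runs of consecutive vowels, dropping a silent
    # trailing "e". Approximate, but adequate for an illustration.
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def flesch_reading_ease(text: str) -> float:
    # Flesch Reading Ease: 206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    # Higher scores mean easier text.
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / sentences) - 84.6 * (syllables / len(words))

# Illustrative patient-education snippets at two reading levels.
simple = "Your eyelid may droop. This is common. It can get better."
complex_ = "Ptosis frequently fluctuates secondary to neuromuscular junction dysfunction."
```

Short sentences with few syllables per word score high (easy); dense clinical phrasing scores low, mirroring the GEMINI-versus-GPT-4o contrast reported above.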

🌍 Impact and Implications

The findings from this study suggest that LLMs like ChatGPT o1-preview have the potential to significantly enhance patient education for those with OMG. By providing accurate and accessible information, these AI tools can help bridge the gap in healthcare resources, particularly in regions with limited access to personalized medical guidance. However, it is crucial to address challenges such as misinformation and ethical considerations to ensure safe integration into clinical practice.

🔮 Conclusion

This study highlights the remarkable potential of large language models in improving patient education for ocular myasthenia gravis. With the ability to deliver accurate and user-friendly information, LLMs could transform how patients manage their conditions. Continued research and development in this area are essential to maximize the benefits of AI in healthcare, paving the way for more informed and empowered patients.

💬 Your comments

What are your thoughts on the use of AI in patient education? Do you believe it can effectively bridge the gap in healthcare resources? Let’s discuss! 💬 Leave your comments below.


Abstract

BACKGROUND: Ocular myasthenia gravis (OMG) is a neuromuscular disorder primarily affecting the extraocular muscles, leading to ptosis and diplopia. Effective patient education is crucial for disease management; however, in China, limited health care resources often restrict patients’ access to personalized medical guidance. Large language models (LLMs) have emerged as potential tools to bridge this gap by providing instant, AI-driven health information. However, their accuracy and readability in educating patients with OMG remain uncertain.
OBJECTIVE: The purpose of this study was to systematically evaluate the effectiveness of multiple LLMs in educating Chinese patients with OMG. Specifically, the models’ answers to OMG-related patient questions were assessed for accuracy, completeness, readability, usefulness, and safety, and patients’ ratings of their usability and readability were analyzed.
METHODS: The study was conducted in two phases. In the first phase, 130 multiple-choice ophthalmology examination questions were input into 5 different LLMs, and their performance was compared with that of undergraduates, master’s students, and ophthalmology residents. In addition, 23 common OMG-related patient questions were posed to 4 LLMs, and their responses were evaluated by ophthalmologists across 5 domains. In the second phase, 20 patients with OMG interacted with the 2 LLMs from the first phase, each asking 3 questions. Patients assessed the responses for satisfaction and readability, while ophthalmologists evaluated the responses again across the same 5 domains.
RESULTS: ChatGPT o1-preview achieved the highest accuracy rate of 73% on the 130 ophthalmology examination questions, outperforming the other LLMs and comparison groups such as undergraduates and master’s students. For the 23 common OMG-related patient questions, ChatGPT o1-preview scored highest in correctness (4.44), completeness (4.44), helpfulness (4.47), and safety (4.6). GEMINI (Google DeepMind) provided the easiest-to-understand responses in readability assessments, while GPT-4o produced the most complex responses, suitable for readers with higher education levels. In the second phase with 20 patients with OMG, ChatGPT o1-preview received higher satisfaction scores than Ernie 3.5 (Baidu; 4.40 vs 3.89, P=.002), although Ernie 3.5’s responses were slightly more readable (4.31 vs 4.03, P=.01).
CONCLUSIONS: LLMs such as ChatGPT o1-preview may have the potential to enhance patient education. Addressing challenges such as misinformation risk, readability issues, and ethical considerations is crucial for their effective and safe integration into clinical practice.

Authors: Wei B, Yao L, Hu X, Hu Y, Rao J, Ji Y, Dong Z, Duan Y, Wu X

Journal: J Med Internet Res

Citation: Wei B, et al. Evaluating the Effectiveness of Large Language Models in Providing Patient Education for Chinese Patients With Ocular Myasthenia Gravis: Mixed Methods Study. J Med Internet Res. 2025;27:e67883. doi: 10.2196/67883

