Quick Summary
This study compared ChatGPT-4o mini, ChatGPT-4o, and Gemini Advanced on frequently asked questions about postmenopausal osteoporosis (PMOP) and on questions derived from the 2022 ACOG-PMOP guideline. ChatGPT-4o was the most accurate on the general PMOP FAQs, ChatGPT-4o and Gemini Advanced gave more concise answers than ChatGPT-4o mini, and on the guideline-based questions ChatGPT-4o mini and ChatGPT-4o were more accurate than Gemini Advanced.
Key Details
- Dataset: 48 PMOP FAQs and 24 specific questions based on the 2022 ACOG-PMOP guidelines
- Participants: Four professional orthopedic surgeons rated the responses
- Technology: AI large-scale language models (AI-LLMs): ChatGPT-4o mini, ChatGPT-4o, Gemini Advanced
- Evaluation Metrics: 5-point Likert scale for satisfaction and Flesch Reading Ease (FRE) score for readability
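For readers unfamiliar with the FRE metric, the standard Flesch formula is 206.835 - 1.015 × (words per sentence) - 84.6 × (syllables per word), with higher scores meaning easier text. The sketch below is illustrative only: the summary does not say which tool the authors used, and the syllable counter here (`count_syllables`) is a rough vowel-group heuristic assumed for the example, so its scores will differ somewhat from dictionary-based tools.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption, not the study's method):
    count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Standard Flesch Reading Ease formula applied to plain English text."""
    # Sentences: split on runs of terminal punctuation, ignore empty pieces.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words: alphabetic tokens, allowing an internal apostrophe.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        206.835
        - 1.015 * (len(words) / len(sentences))
        - 84.6 * (syllables / len(words))
    )
```

Under this formula, short sentences built from short words score high, while long sentences dense with polysyllabic medical terms score low, which is why the metric is a common first check on patient-facing text.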
Key Takeaways
- On the overall PMOP FAQs, ChatGPT-4o had a significantly higher accuracy rate than ChatGPT-4o mini and Gemini Advanced.
- ChatGPT-4o and Gemini Advanced gave more concise answers than ChatGPT-4o mini, both on the PMOP FAQs and on the guideline-based questions.
- On questions related to the 2022 ACOG-PMOP guidelines, ChatGPT-4o mini and ChatGPT-4o were significantly more accurate than Gemini Advanced.
- Self-Correction: All three models exhibited good levels of self-correction.
- Professional Feedback: Responses were rated independently by four orthopedic surgeons.
- Readability: A Flesch Reading Ease score was calculated for each response to assess readability.
- Global Relevance: The study addresses a significant public health issue, as PMOP affects millions of women worldwide.
Background
Postmenopausal osteoporosis (PMOP) is a prevalent condition that poses significant health risks for women after menopause. It is characterized by decreased bone density, leading to an increased risk of fractures. As public health research continues to focus on PMOP, the integration of artificial intelligence (AI) in providing accurate and accessible information is becoming increasingly important. This study aims to evaluate the effectiveness of AI large-scale language models in addressing common queries related to PMOP.
Study
The researchers collected 48 frequently asked questions about PMOP from offline counseling and online medical community forums. They also prepared 24 specific questions based on the Management of Postmenopausal Osteoporosis: 2022 ACOG Clinical Practice Guideline No. 2. The questions were submitted to three AI-LLMs: ChatGPT-4o mini, ChatGPT-4o, and Gemini Advanced. The responses were randomly assigned to four orthopedic surgeons, who independently rated their satisfaction with each response on a 5-point Likert scale.
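The summary does not describe how the surgeons' Likert ratings were aggregated. One simple option, assumed here purely for illustration, is a per-model mean. The function name `mean_likert_by_model`, the model labels, and the scores below are all hypothetical placeholders and are not data from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_likert_by_model(ratings):
    """Average 5-point Likert scores per model.

    `ratings` is an iterable of (model_name, score) pairs, one pair per
    rater per response. This per-model mean is an assumed aggregation,
    not necessarily the one used in the paper.
    """
    by_model = defaultdict(list)
    for model, score in ratings:
        if not 1 <= score <= 5:
            raise ValueError(f"Likert score out of range: {score}")
        by_model[model].append(score)
    return {model: mean(scores) for model, scores in by_model.items()}


# Placeholder ratings invented for this example (NOT study data).
example_ratings = [
    ("model_a", 5), ("model_a", 4), ("model_a", 4), ("model_a", 5),
    ("model_b", 3), ("model_b", 4), ("model_b", 2), ("model_b", 3),
]
summary = mean_likert_by_model(example_ratings)
```

A mean treats the ordinal Likert scale as if it were interval data. Reporting medians or the share of responses rated 4 or above are common alternatives when that assumption is a concern.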
Results
On the overall PMOP FAQs, ChatGPT-4o had a significantly higher accuracy rate than ChatGPT-4o mini and Gemini Advanced. On questions related to the 2022 ACOG-PMOP guidelines, ChatGPT-4o mini and ChatGPT-4o were significantly more accurate than Gemini Advanced. Across both question sets, ChatGPT-4o and Gemini Advanced gave more concise answers than ChatGPT-4o mini. All three models showed good levels of self-correction.
Impact and Implications
The findings from this study highlight the potential of AI-LLMs in improving the dissemination of information regarding postmenopausal osteoporosis. By providing accurate, concise, and readable responses, these technologies can empower patients and healthcare providers alike. As AI continues to evolve, its role in public health education and patient engagement will likely expand, leading to better health outcomes for women affected by PMOP.
Conclusion
This study underscores the significant advancements in AI technology, particularly in the context of healthcare. The superior performance of ChatGPT-4o in addressing PMOP-related queries suggests a promising future for AI in enhancing patient education and support. Continued research and development in this area are essential to fully harness the potential of AI in improving health literacy and outcomes.
Comparative analysis of ChatGPT-4o mini, ChatGPT-4o and Gemini Advanced in the treatment of postmenopausal osteoporosis.
Abstract
BACKGROUND: Osteoporosis is a sex-specific disease. Postmenopausal osteoporosis (PMOP) has been the focus of public health research worldwide. The purpose of this study is to evaluate the quality and readability of artificial intelligence large-scale language models (AI-LLMs): ChatGPT-4o mini, ChatGPT-4o and Gemini Advanced for responses generated in response to questions related to PMOP.
METHODS: We collected 48 PMOP frequently asked questions (FAQs) through offline counseling and online medical community forums. We also prepared 24 specific questions about PMOP based on the Management of Postmenopausal Osteoporosis: 2022 ACOG Clinical Practice Guideline No. 2 (2022 ACOG-PMOP Guideline). In this project, the FAQs were imported into the AI-LLMs (ChatGPT-4o mini, ChatGPT-4o, Gemini Advanced) and randomly assigned to four professional orthopedic surgeons, who independently rated the satisfaction of each response via a 5-point Likert scale. Furthermore, a Flesch Reading Ease (FRE) score was calculated for each of the LLMs’ responses to assess the readability of the text generated by each LLM.
RESULTS: When addressing questions related to PMOP and the 2022 ACOG-PMOP guidelines, ChatGPT-4o and Gemini Advanced provide more concise answers than ChatGPT-4o mini. On the overall PMOP FAQs, ChatGPT-4o has a significantly higher accuracy rate than ChatGPT-4o mini and Gemini Advanced. When answering questions related to the 2022 ACOG-PMOP guidelines, ChatGPT-4o mini and ChatGPT-4o have significantly higher response accuracy than Gemini Advanced. ChatGPT-4o mini, ChatGPT-4o, and Gemini Advanced all have good levels of self-correction.
CONCLUSIONS: Our research shows that Gemini Advanced and ChatGPT-4o provide more concise and intuitive answers. ChatGPT-4o responds better in answering frequently asked questions related to PMOP. When answering questions related to the 2022 ACOG-PMOP guidelines, ChatGPT-4o mini and ChatGPT-4o responded significantly better than Gemini Advanced. ChatGPT-4o mini, ChatGPT-4o, and Gemini Advanced have demonstrated a strong ability to self-correct.
CLINICAL TRIAL NUMBER: Not applicable.
Authors: Liu R, Liu J, Yang J, Sun Z, Yan H
Journal: BMC Musculoskelet Disord
Citation: Liu R, et al. Comparative analysis of ChatGPT-4o mini, ChatGPT-4o and Gemini Advanced in the treatment of postmenopausal osteoporosis. BMC Musculoskelet Disord. 2025; 26:369. doi: 10.1186/s12891-025-08601-3