🧑🏼‍💻 Research - January 16, 2025

Performance of Artificial Intelligence Chatbots on Ultrasound Examinations: Cross-Sectional Comparative Analysis.


⚡ Quick Summary

This study evaluated the performance of AI chatbots, specifically ChatGPT and ERNIE Bot, in answering ultrasound-related medical questions. The findings revealed that ERNIE Bot outperformed ChatGPT, with accuracy rates influenced by language and question type.

🔍 Key Details

  • 📊 Dataset: 554 ultrasound-related questions
  • 🧩 Languages: Questions posed in both English and Chinese
  • ⚙️ Technology: ChatGPT and ERNIE Bot
  • 🏆 Performance: Accuracy rates for objective questions ranged from 8.33% to 80%

🔑 Key Takeaways

  • 📊 Question types: 64% single-choice, 12% short answers, 11% noun explanations.
  • 💡 Accuracy: True or false questions scored highest.
  • 👩‍⚕️ Subjective ratings: Acceptability rates ranged from 47.62% to 75.36%.
  • 🏆 ERNIE Bot: Showed superior performance compared to ChatGPT (P<.05).
  • 🌍 Language impact: Both models performed worse in English, but ERNIE Bot’s decline was smaller.
  • 🧠 Knowledge areas: Better performance in basic knowledge and ultrasound methods than in signs and diagnosis.
  • 🔍 Study context: Insights for users and developers in selecting appropriate models.

📚 Background

The integration of artificial intelligence in healthcare is rapidly evolving, particularly in the realm of medical inquiries. AI chatbots are increasingly utilized for providing information and answering questions, especially in specialized fields like ultrasound medicine. However, their effectiveness can vary significantly based on various factors, including language and the nature of the questions posed.

🗒️ Study

This cross-sectional comparative analysis aimed to assess the performance of two prominent AI chatbots, ChatGPT and ERNIE Bot, in responding to ultrasound-related medical examination questions. A total of 554 questions were curated, encompassing a range of topics and question types, and were presented in both English and Chinese to evaluate the models’ performance across different languages.

📈 Results

The study revealed that the accuracy rates for objective questions varied widely, ranging from 8.33% to 80%. Notably, true or false questions achieved the highest accuracy. Subjective questions were rated by experienced doctors, yielding acceptability rates between 47.62% and 75.36%. Overall, ERNIE Bot demonstrated superior performance compared to ChatGPT, particularly in areas of basic knowledge and ultrasound methods.
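The two headline metrics can be reproduced with simple arithmetic. As an illustrative sketch only (the authors analyzed their data in Excel, and their exact acceptability criterion and statistical test are not stated here, so the threshold and the two-proportion z-test below are assumptions, and the counts in the usage example are hypothetical):

```python
import math

def accuracy(correct: int, total: int) -> float:
    """Objective questions: fraction answered correctly."""
    return correct / total

def acceptability(ratings_per_answer, threshold=3.0):
    """Subjective questions: each answer rated by 5 doctors on a Likert
    scale; an answer counts as 'acceptable' if its mean rating meets the
    threshold (the paper's actual criterion is an assumption here)."""
    acceptable = sum(1 for ratings in ratings_per_answer
                     if sum(ratings) / len(ratings) >= threshold)
    return acceptable / len(ratings_per_answer)

def two_proportion_z(c1, n1, c2, n2):
    """Two-sided two-proportion z-test, e.g. comparing two chatbots'
    accuracy on the same question set (a stand-in for whatever test
    the authors used to report P<.05)."""
    p1, p2 = c1 / n1, c2 / n2
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return z, p_value

# Hypothetical usage: 80/100 vs 60/100 correct on the same questions.
z, p = two_proportion_z(80, 100, 60, 100)
print(f"z = {z:.2f}, p = {p:.4f}")  # a difference this large is significant at P<.05
```

With the hypothetical counts above, the pooled-proportion z-statistic comes out near 3.1, comfortably below the P<.05 threshold; smaller gaps or smaller question counts quickly lose significance, which is why per-category accuracy differences need their own tests.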

🌍 Impact and Implications

The findings of this study underscore the potential of AI chatbots in enhancing medical inquiries related to ultrasound examinations. By understanding the performance characteristics of different models, users and developers can make informed decisions about which chatbot to utilize for specific questions and languages. This could lead to improved patient education and support in clinical settings, ultimately enhancing the quality of care.

🔮 Conclusion

This research highlights the significant role that AI chatbots can play in the field of ultrasound medicine. With ERNIE Bot outperforming ChatGPT, it is clear that model selection is crucial for optimizing chatbot use. As AI technology continues to advance, further studies are encouraged to explore its applications in various medical domains, paving the way for more effective healthcare solutions.

💬 Your comments

What are your thoughts on the performance of AI chatbots in medical inquiries? Do you believe they can significantly enhance patient care? 💬 Share your insights in the comments below.

Performance of Artificial Intelligence Chatbots on Ultrasound Examinations: Cross-Sectional Comparative Analysis.

Abstract

BACKGROUND: Artificial intelligence chatbots are being increasingly used for medical inquiries, particularly in the field of ultrasound medicine. However, their performance varies and is influenced by factors such as language, question type, and topic.
OBJECTIVE: This study aimed to evaluate the performance of ChatGPT and ERNIE Bot in answering ultrasound-related medical examination questions, providing insights for users and developers.
METHODS: We curated 554 questions from ultrasound medicine examinations, covering various question types and topics. The questions were posed in both English and Chinese. Objective questions were scored based on accuracy rates, whereas subjective questions were rated by 5 experienced doctors using a Likert scale. The data were analyzed in Excel.
RESULTS: Of the 554 questions included in this study, single-choice questions comprised the largest share (354/554, 64%), followed by short answers (69/554, 12%) and noun explanations (63/554, 11%). The accuracy rates for objective questions ranged from 8.33% to 80%, with true or false questions scoring highest. Subjective questions received acceptability rates ranging from 47.62% to 75.36%. ERNIE Bot was superior to ChatGPT in many aspects (P<.05). Both models showed a performance decline in English, but ERNIE Bot's decline was less significant. The models performed better in terms of basic knowledge, ultrasound methods, and diseases than in terms of ultrasound signs and diagnosis.
CONCLUSIONS: Chatbots can provide valuable ultrasound-related answers, but performance differs by model and is influenced by language, question type, and topic. In general, ERNIE Bot outperforms ChatGPT. Users and developers should understand model performance characteristics and select appropriate models for different questions and languages to optimize chatbot use.

Authors: Zhang Y, Lu X, Luo Y, Zhu Y, Ling W

Journal: JMIR Med Inform

Citation: Zhang Y, Lu X, Luo Y, Zhu Y, Ling W. Performance of Artificial Intelligence Chatbots on Ultrasound Examinations: Cross-Sectional Comparative Analysis. JMIR Med Inform. 2025;13:e63924. doi: 10.2196/63924

