Research - December 30, 2024

The potential of Generative Pre-trained Transformer 4 (GPT-4) to analyse medical notes in three different languages: a retrospective model-evaluation study.

Quick Summary

A recent study evaluated how well Generative Pre-trained Transformer 4 (GPT-4) answers predefined questions about medical notes written in three languages. In 79% of GPT-4’s 784 responses, both validating physicians agreed with the model. The study highlights the potential of large language models for processing unstructured medical data.

Key Details

  • Dataset: 56 de-identified medical notes from 8 university hospitals
  • Languages: English, Spanish, and Italian
  • Technology: Generative Pre-trained Transformer 4 (GPT-4)
  • Performance: 79% agreement rate with physician responses

Key Takeaways

  • Medical notes are often unstructured and challenging for computers to analyze.
  • GPT-4 answered 14 predefined questions about each note, and both physicians agreed with its answer in 79% of responses.
  • Multilingual capability was assessed with notes in English, Spanish, and Italian; the questions were always posed in English.
  • Agreement was higher for notes in Spanish (88%) and Italian (84%) than for notes in English (77%).
  • Future research should focus on integrating LLMs into clinical workflows.
  • Study period: notes were written between February 2020 and June 2023 and submitted to the coordinating centre between June 2023 and February 2024.
  • Two physicians per site independently validated GPT-4’s responses; they could see GPT-4’s answers but not each other’s.
  • Potential applications in healthcare could improve patient care and data analysis.

Background

The analysis of patient notes is crucial for effective healthcare delivery, yet their unstructured nature poses significant challenges for computational analysis. With the advent of large language models (LLMs) like GPT-4, there is a growing interest in leveraging these technologies to enhance the understanding and processing of medical documentation.

Study

This retrospective model-evaluation study involved eight university hospitals across four countries: the USA, Colombia, Singapore, and Italy. Each site contributed seven de-identified medical notes (admission, progress, or consultation notes; no discharge summaries). Sites were advised to choose notes for patients aged 18-65 with diagnoses of obesity and COVID-19, but adherence to these criteria was optional. The study aimed to assess GPT-4’s ability to answer 14 predefined questions based on each note.
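According to the abstract's methods, each note was prepended with an instruction prompt and a list of 14 questions, and each note was submitted in its own session in its original language with the questions in English. The following is a minimal sketch of how such a prompt could be assembled. The function name `build_prompt`, the instruction wording, the placeholder questions, and the layout of the prompt are all hypothetical; the study's actual prompt and questions are not given in this summary.

```python
def build_prompt(instruction: str, questions: list[str], note: str) -> str:
    """Prepend an instruction and a numbered question list to one medical note."""
    numbered = "\n".join(f"{i}. {q}" for i, q in enumerate(questions, start=1))
    return f"{instruction}\n\nQuestions:\n{numbered}\n\nMedical note:\n{note}"


# Hypothetical placeholders: the real instruction text and the 14 questions
# chosen a priori by the authors are not reproduced in this summary.
INSTRUCTION = "Read the medical note below and answer each question."
QUESTIONS = [f"Placeholder question {i}" for i in range(1, 15)]

# One prompt per note. The note stays in its original language,
# while the instruction and questions remain in English.
notes = [
    "<de-identified note in English>",
    "<nota clínica anonimizada en español>",
]
prompts = [build_prompt(INSTRUCTION, QUESTIONS, note) for note in notes]

print(prompts[0])
```

Building one prompt per note mirrors the separate-session design described in the abstract, so no note shares context with another.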

Results

Of the 784 responses generated by GPT-4, both physicians agreed with 622 (79%), only one physician agreed with 82, and neither agreed with 80. Agreement was higher for notes in Spanish (88%) and Italian (84%) than for notes in English (77%), although Spanish and Italian were each represented by only seven notes from a single site. The authors conclude that GPT-4 is accurate when analyzing medical notes in the three languages.
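The agreement percentages can be recomputed from the counts reported in the study's abstract. The sketch below does that and attaches a 95% Wilson score interval to each proportion. The Wilson method is an assumption made here for illustration: the abstract does not say which interval method the authors used, so the bounds may not match the published ones exactly.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (95% by default)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Counts from the abstract: responses where both physicians agreed with GPT-4,
# out of all responses for notes in that language.
both_agreed = {"English": (454, 588), "Spanish": (86, 98), "Italian": (82, 98)}

# Overall counts are the sums across the three languages.
overall = (
    sum(k for k, _ in both_agreed.values()),
    sum(n for _, n in both_agreed.values()),
)

for label, (k, n) in {"Overall": overall, **both_agreed}.items():
    lo, hi = wilson_interval(k, n)
    print(f"{label}: {k}/{n} = {k / n:.1%} (Wilson 95% CI {lo:.1%} to {hi:.1%})")
```

The overall proportion rounds to 79% with an interval of roughly 76% to 82%. The per-language intervals are much wider because each rests on only 98 responses, and they can differ from the published bounds by about a percentage point depending on the interval method.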

Impact and Implications

The implications of this study are significant for the future of healthcare. By integrating LLMs like GPT-4 into clinical workflows, healthcare providers could enhance the efficiency and accuracy of medical note analysis. This could lead to improved patient outcomes and more streamlined data management processes, ultimately transforming how healthcare professionals interact with patient information.

Conclusion

This study underscores the potential of AI technologies in revolutionizing the analysis of medical notes. With a high agreement rate between GPT-4 and physician responses, there is a promising avenue for further research into the integration of LLMs in healthcare settings. As we continue to explore these technologies, the future of medical data analysis looks increasingly bright.

Abstract

BACKGROUND: Patient notes contain substantial information but are difficult for computers to analyse due to their unstructured format. Large-language models (LLMs), such as Generative Pre-trained Transformer 4 (GPT-4), have changed our ability to process text, but we do not know how effectively they handle medical notes. We aimed to assess the ability of GPT-4 to answer predefined questions after reading medical notes in three different languages.
METHODS: For this retrospective model-evaluation study, we included eight university hospitals from four countries (ie, the USA, Colombia, Singapore, and Italy). Each site submitted seven de-identified medical notes related to seven separate patients to the coordinating centre between June 1, 2023, and Feb 28, 2024. Medical notes were written between Feb 1, 2020, and June 1, 2023. One site provided medical notes in Spanish, one site provided notes in Italian, and the remaining six sites provided notes in English. We included admission notes, progress notes, and consultation notes. No discharge summaries were included in this study. We advised participating sites to choose medical notes that, at time of hospital admission, were for patients who were male or female, aged 18-65 years, had a diagnosis of obesity, had a diagnosis of COVID-19, and had submitted an admission note. Adherence to these criteria was optional and participating sites randomly chose which medical notes to submit. When entering information into GPT-4, we prepended each medical note with an instruction prompt and a list of 14 questions that had been chosen a priori. Each medical note was individually given to GPT-4 in its original language and in separate sessions; the questions were always given in English. At each site, two physicians independently validated responses by GPT-4 and responded to all 14 questions. Each pair of physicians evaluated responses from GPT-4 to the seven medical notes from their own site only. Physicians were not masked to responses from GPT-4 before providing their own answers, but were masked to responses from the other physician.
FINDINGS: We collected 56 medical notes, of which 42 (75%) were in English, seven (13%) were in Italian, and seven (13%) were in Spanish. For each medical note, GPT-4 responded to 14 questions, resulting in 784 responses. In 622 (79%, 95% CI 76-82) of 784 responses, both physicians agreed with GPT-4. In 82 (11%, 8-13) responses, only one physician agreed with GPT-4. In the remaining 80 (10%, 8-13) responses, neither physician agreed with GPT-4. Both physicians agreed with GPT-4 more often for medical notes written in Spanish (86 [88%, 95% CI 79-93] of 98 responses) and Italian (82 [84%, 75-90] of 98 responses) than in English (454 [77%, 74-80] of 588 responses).
INTERPRETATION: The results of our model-evaluation study suggest that GPT-4 is accurate when analysing medical notes in three different languages. In the future, research should explore how LLMs can be integrated into clinical workflows to maximise their use in health care.
FUNDING: None.

Authors: Menezes MCS, Hoffmann AF, Tan ALM, Nalbandyan M, Omenn GS, Mazzotti DR, Hernández-Arango A, Visweswaran S, Venkatesh S, Mandl KD, Bourgeois FT, Lee JWK, Makmur A, Hanauer DA, Semanik MG, Kerivan LT, Hill T, Forero J, Restrepo C, Vigna M, Ceriana P, Abu-El-Rub N, Avillach P, Bellazzi R, Callaci T, Gutiérrez-Sacristán A, Malovini A, Mathew JP, Morris M, Murthy VL, Buonocore TM, Parimbelli E, Patel LP, Sáez C, Samayamuthu MJ, Thompson JA, Tibollo V, Xia Z, Kohane IS, Consortium for Clinical Characterization of COVID-19 by Electronic Health Records

Journal: Lancet Digit Health

Citation: Menezes MCS, et al. The potential of Generative Pre-trained Transformer 4 (GPT-4) to analyse medical notes in three different languages: a retrospective model-evaluation study. Lancet Digit Health. 2025; 7:e35-e43. doi: 10.1016/S2589-7500(24)00246-2
