A new clinical pilot shows that letting an AI interview patients before they see a surgeon cuts history-taking time by nearly a third while improving the completeness of the data collected.
Can we trust an AI to talk directly to patients before a major surgery? Most healthcare AI stays hidden in back-office administration, safely away from raw patient interaction. Doctors worry about missed symptoms, while patients fear cold, robotic care.
A new study in the European Spine Journal challenges this caution. Researchers tested Spine-GPT, an AI assistant designed to interview patients before their face-to-face appointments. The results suggest we have been looking at clinical AI the wrong way. It is not a replacement for human judgment, but a tool to clear the cognitive clutter before a doctor even walks into the room.
The clinical trial
The researchers built the system using prompt engineering, the practice of carefully structuring the instructions given to a language model. Recent research on Prompt Engineering in Healthcare argues that how those instructions are structured strongly influences how safely a model behaves in clinical settings. The trial ran in two phases. Phase 0 tested the AI on 24 fictional clinical scenarios. Three board-certified spine surgeons graded its output across 7 domains on a 5-point Likert scale, and the system scored above 4.0 in every domain.
Phase 1 moved to real patients. The researchers split 60 patients into two equal groups of 30. The control group received standard consultations. The intervention group used Spine-GPT before meeting their surgeon.
Fewer minutes, better data
The AI did not just save time. It gathered better clinical information than the human doctors did alone. The key metrics tell a clear story:
- Active history-taking time dropped by 31.3%, falling from 11.47 minutes to 7.88 minutes.
- Information completeness increased by 11.7 percentage points.
- The AI flagged all 3 of the predefined high-risk red-flag cases.
- Surgeons needed to make zero edits on 70% of the AI-generated summaries.
Both patients and surgeons reported significantly higher satisfaction. This matters because spine surgery depends on precise diagnostic data. When a surgeon saves about three and a half minutes of history-taking per patient while getting more complete data, they can focus on physical exams and treatment decisions rather than basic data entry.
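For readers who want to check the headline figure, the short Python sketch below recomputes the time saving from the two averages reported above. The function and variable names are illustrative assumptions, not anything taken from the study.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Average active history-taking times reported in the article, in minutes.
control_minutes = 11.47       # standard consultation
intervention_minutes = 7.88   # consultation after the AI interview

minutes_saved = control_minutes - intervention_minutes
percent_saved = relative_reduction(control_minutes, intervention_minutes)

print(f"Minutes saved per patient: {minutes_saved:.2f}")
print(f"Relative reduction: {percent_saved:.1f}%")
```

The 31.3% figure is a relative change (minutes saved divided by the original time). The completeness gain is quoted in percentage points, which is an absolute difference between two percentages, so the two numbers should not be compared directly.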
The reality check
We must view these results with healthy skepticism. A cohort of 60 patients at a single clinic is a small sample, too small to generalize from with confidence. The system worked well here, but real-world patients often describe pain in confusing, non-standard ways. This limitation aligns with broader findings in clinical informatics. For instance, a study on using ChatGPT for extracting structured data warns that LLMs can struggle with the messy reality of unstructured clinical notes.
Ultimately, this trial offers early evidence that patient-facing AI is no longer just a futuristic risk. Acting as a transparent partner rather than a black box, such a system lets surgeons do what they do best: make the final, critical decisions.
Read the full study in the European Spine Journal.
