Quick Summary
This article argues for shifting the assessment of mental health chatbot safety from discrete end points to trajectories over time. By evaluating the entire dialogue, researchers can better identify clinically meaningful deterioration in user interactions.
Key Details
- Focus: Mental health safety assessment of AI chatbots
- Key Issues: Compulsive use, sleep disruption, and withdrawal from human contact
- Proposed Method: Evaluate entire dialogues instead of just end points
- Importance: Capturing relational cues and dynamics as they evolve over time
Key Takeaways
- Trajectory Effects: Risk accumulates through extended dialogue, not at single tipping points.
- Shift in Evaluation: Focus should be on the whole conversation, not just the last message.
- Turn-by-Turn Analysis: Reporting dynamics such as delusion confirmation and harm enablement is crucial.
- Context Matters: Short scripted tests may not reflect real-world interactions.
- Human Outcomes: Incorporate post-interaction shifts in certainty and behavior for better safety assessments.
- Clinical Surveillance: Build infrastructure for consented transcript donation linked to health outcomes.
- Real-World Relevance: Current safety evaluations may miss important relational cues.

Background
The integration of AI chatbots into daily life has brought both benefits and challenges, particularly in the realm of mental health. While many users find support and companionship, a minority experience concerning shifts in their mental state, leading to potential crises. Understanding these dynamics is essential for ensuring user safety and well-being.
Study
The authors argue for a paradigm shift in how we assess the safety of mental health chatbots. By examining the entire dialogue rather than isolated end points, they aim to capture the nuances of user interactions that may indicate risk. This approach emphasizes the importance of understanding how conversations evolve over time and how they can impact mental health.
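To make the trajectory idea concrete, here is a minimal illustrative sketch (not from the paper) of what whole-dialogue scoring could look like. The turn labels (`"validates_delusion"`, `"safety_referral"`) and the `trajectory_report` function are hypothetical, standing in for whatever clinician-informed rubric an evaluator might use; the paper does not specify a schema.

```python
# Illustrative sketch only: trajectory-level scoring of a chatbot dialogue.
# All label names and metrics here are hypothetical assumptions, not the
# authors' method.

from dataclasses import dataclass, field

@dataclass
class Turn:
    speaker: str                    # "user" or "bot"
    text: str
    labels: set = field(default_factory=set)  # per-turn annotations

def trajectory_report(dialogue):
    """Summarize risk dynamics across the whole dialogue,
    rather than scoring only the final message."""
    report = {
        "first_risk_turn": None,          # when a risk cue first emerges
        "risk_turn_count": 0,             # how often risk cues recur
        "first_intervention_turn": None,  # timing of first safety response
        "intervention_persists": False,   # did the bot keep intervening?
    }
    intervention_turns = []
    for i, turn in enumerate(dialogue):
        if turn.speaker != "bot":
            continue
        if "validates_delusion" in turn.labels:
            report["risk_turn_count"] += 1
            if report["first_risk_turn"] is None:
                report["first_risk_turn"] = i
        if "safety_referral" in turn.labels:
            intervention_turns.append(i)
    if intervention_turns:
        report["first_intervention_turn"] = intervention_turns[0]
        # crude persistence check: the intervention recurs after first use
        report["intervention_persists"] = len(intervention_turns) > 1
    return report
```

An end-point evaluation would inspect only the last turn; this sketch instead surfaces when risk cues first emerged, how often they recurred, and whether safety interventions persisted, which is the kind of turn-by-turn reporting the authors call for.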
Results
The study highlights that traditional safety evaluations often overlook critical relational cues and dynamics that emerge throughout conversations. By focusing on the trajectory of interactions, researchers can better identify when users may be drifting toward harmful beliefs or behaviors, even if the chatbot’s responses do not explicitly indicate danger.
Impact and Implications
This shift in assessment methodology could significantly enhance the safety and effectiveness of mental health chatbots. By aligning safety evaluations with real-world user experiences, we can develop more robust benchmarks that reflect the complexities of human-AI interactions. This could lead to improved mental health outcomes and a better understanding of how to support users effectively.
Conclusion
The call to move from end-point assessments to trajectory evaluations in chatbot safety is a crucial step toward enhancing mental health support through AI. By adopting this comprehensive approach, we can better safeguard users and ensure that these technologies serve their intended purpose of promoting well-being. Continued research in this area is essential for developing effective safety measures and improving user experiences.
Your comments
What are your thoughts on this new approach to assessing chatbot safety? We would love to hear your insights! Leave your comments below or connect with us on social media.
It Is the Journey, Not the Destination: Moving From End Points to Trajectories When Assessing Chatbot Mental Health Safety.
Abstract
Large language models are rapidly becoming embedded in everyday life through artificial intelligence (AI) chatbots that people use for practical assistance and companionship, as well as for support with mental health and emotional well-being. Alongside clear benefits, clinicians and public reports increasingly describe a minority of users whose interactions seem to drift over days or weeks toward strongly questionable convictions, delusions, or suicidal crises. Importantly, clinically meaningful deterioration can occur even without overtly unsafe text outputs, via more insidious processes, such as compulsive use, sleep disruption, withdrawal from human contact, and progressive narrowing of attention around the chatbot relationship. In this Viewpoint, we argue that risk often arises not at a single tipping point but through trajectory effects that accumulate across extended dialogue and that prevailing safety evaluation approaches are misaligned with this reality because they primarily score risk at discrete conversational end points often reached through scripted dialogues lasting just a single turn or several turns. Mental health benchmarks and safety suites (including clinician-informed efforts) have advanced the field by testing refusal behavior, toxicity, and adversarial prompting. However, they often treat the last message as the unit of analysis and, therefore, miss when risk-relevant relational cues, signs of validation, contradiction handling, and shifts in certainty first emerge and how they compound. 
We propose that mental health safety assessment should shift from end points to trajectories by (1) treating the whole dialogue, not just the end result, as the focus of evaluation; (2) reporting turn-by-turn dynamics, such as delusion confirmation and harm enablement, and timing and persistence of safety interventions; and (3) calibrating short multiturn tests against longer, clinically realistic interaction sequences that can reveal context-length effects and drift. We further argue that transcript-only evaluation is insufficient in mental health contexts. Similar language can reflect very different internal states, and the relationship between expressed psychopathology and real-world harm is nonlinear. Therefore, safety research should incorporate proximal human outcomes following interactions (eg, shifts in certainty, openness to counterevidence, arousal, urge to continue, and subsequent sleep or behavior) and build a prospective clinical surveillance infrastructure that supports transcript donation with consent and linkage to health outcomes. Together, these steps would enable benchmarks that are clinically relevant and better aligned with the types of harms now being observed in real-world chatbot use.
Authors: Morrin H, Au Yeung J, Agnew Z, Østergaard SD, Pollak TA
Journal: JMIR Ment Health
Citation: Morrin H, et al. It Is the Journey, Not the Destination: Moving From End Points to Trajectories When Assessing Chatbot Mental Health Safety. JMIR Ment Health. 2026;13:e91454. doi: 10.2196/91454