⚡ Quick Summary
The study introduces Kinematic Adaptive Frame Recognition (KAFR), a novel framework for video segmentation of surgical procedures. By leveraging frame similarity and surgical tool tracking, KAFR achieves a tenfold reduction in redundant frames on the Gastrojejunostomy dataset while improving accuracy by 4.32%.
📌 Key Details
- 📂 Datasets: Newly annotated Gastrojejunostomy (GJ) dataset (2017-2021) and previously annotated Pancreaticojejunostomy (PJ) dataset (2011-2022).
- ⚙️ Technology: YOLOv8 for tool detection and an X3D CNN for phase classification.
- 📈 Performance: GJ dataset accuracy improved from 0.749 to 0.7814; PJ dataset accuracy improved from 0.8801 to 0.8982.
💡 Key Takeaways
- 🤖 KAFR effectively reduces video data size while maintaining critical information.
- 🎞️ Tenfold reduction in frames for the GJ dataset, improving accuracy by 4.32%.
- ✂️ Fivefold reduction in frames for the PJ dataset, with a 2.05% accuracy improvement.
- 🔄 Adaptability of KAFR allows application to broader surgical datasets beyond GJ and PJ.
- 🏆 Competitive performance compared to state-of-the-art approaches in video segmentation.
- 🛠️ Tool tracking is central to computing frame similarity, enhancing segmentation accuracy.
- 📊 F1 score improvements of 0.16% for GJ and 2.54% for PJ demonstrate KAFR's effectiveness.
- 🔗 KAFR can complement existing AI models, enhancing their performance by reducing redundant data.
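The adaptive frame reduction described above can be pictured as a greedy filter: a frame is kept only when it differs enough from the last kept frame. This is a minimal sketch of that idea, not the paper's actual selection rule; the similarity function and threshold here are hypothetical placeholders.

```python
def select_frames(frames, similarity, threshold=0.9):
    """Greedy adaptive selection: keep a frame only when its similarity
    to the most recently kept frame drops below `threshold`.

    `similarity` is any callable returning a score in [0, 1], where 1.0
    means the two frames are effectively identical (redundant).
    """
    if not frames:
        return []
    kept = [frames[0]]  # always keep the first frame as the anchor
    for frame in frames[1:]:
        if similarity(kept[-1], frame) < threshold:
            kept.append(frame)  # frame carries new information
    return kept


# Toy usage: "frames" are scalars, similarity decays with distance.
frames = [0.0, 0.01, 0.02, 5.0, 5.01, 10.0]
sim = lambda a, b: 1.0 / (1.0 + abs(a - b))
reduced = select_frames(frames, sim, threshold=0.5)
```

With this toy similarity, near-duplicate values are dropped and only the three distinct "scenes" survive, mirroring how KAFR discards redundant frames while retaining informative ones.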
📚 Background
The integration of Artificial Intelligence (AI) in surgical procedures has gained momentum, particularly in automating the analysis of lengthy operative videos. These videos, often spanning from thirty minutes to several hours, present challenges for AI models due to their size and complexity. As the volume of surgical videos is expected to increase, innovative techniques like KAFR are essential for efficient data processing and analysis.
🔬 Study
The study conducted a comprehensive evaluation of KAFR using two datasets from referral centers. The GJ dataset involved segmenting robotic GJ videos into six distinct phases, while the PJ dataset provided a historical perspective on pancreatic surgeries. The methodology included tracking surgical tools using a YOLOv8 model and computing frame similarities based on their spatial positions and velocities.
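The summary states that frame similarity is computed from the spatial positions and velocities of the detected tools, but does not give the exact metric. The sketch below is one plausible formulation under stated assumptions: per-tool position/velocity vectors from the YOLOv8 tracking phase, hypothetical weights `w_pos` and `w_vel`, and a simple inverse mapping from motion change to a similarity score.

```python
import numpy as np


def frame_similarity(prev_tools, curr_tools, w_pos=1.0, w_vel=1.0):
    """Hypothetical similarity between consecutive frames, based on how
    much the tracked tools moved and changed velocity.

    `prev_tools`/`curr_tools` map a tool id to a (position, velocity)
    pair of 2-D NumPy arrays. Small changes -> similarity near 1.0.
    """
    shared = prev_tools.keys() & curr_tools.keys()
    if not shared:
        return 0.0  # no common tools: treat frames as dissimilar
    changes = []
    for tid in shared:
        p0, v0 = prev_tools[tid]
        p1, v1 = curr_tools[tid]
        changes.append(w_pos * np.linalg.norm(p1 - p0)
                       + w_vel * np.linalg.norm(v1 - v0))
    # Map mean kinematic change to a score in (0, 1].
    return 1.0 / (1.0 + float(np.mean(changes)))
```

An identical pair of frames scores 1.0, while a large tool displacement pushes the score toward 0, which is the behaviour a downstream frame-selection step would need.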
📊 Results
KAFR demonstrated remarkable results, achieving a tenfold reduction in frames for the GJ dataset while improving accuracy from 0.749 to 0.7814. Similarly, the PJ dataset showed a fivefold reduction in frames with an accuracy increase from 0.8801 to 0.8982. The improvements in F1 scores further validate the effectiveness of KAFR in surgical video segmentation.
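Note that the percentage gains quoted here are relative to the baseline score, not absolute percentage points (0.749 to 0.7814 is a 3.24-point absolute change but roughly a 4.3% relative gain). A quick check; small rounding differences from the paper's reported figures are expected:

```python
def relative_gain(before, after):
    """Relative improvement, expressed as a percentage of the baseline."""
    return round(100.0 * (after - before) / before, 2)


# Figures reported above (baseline -> KAFR):
gj_acc = relative_gain(0.7490, 0.7814)  # GJ accuracy, ~4.3%
pj_acc = relative_gain(0.8801, 0.8982)  # PJ accuracy, ~2.1%
pj_f1 = relative_gain(0.8534, 0.8751)   # PJ F1 score, ~2.5%
```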
🌍 Impact and Implications
The implications of KAFR are significant for the field of surgical analysis. By reducing redundant data while retaining essential information, KAFR not only enhances the efficiency of AI models but also improves the accuracy of surgical performance assessments. This advancement could lead to better training and evaluation of surgical techniques, ultimately improving patient outcomes and surgical education.
🔮 Conclusion
The introduction of Kinematic Adaptive Frame Recognition (KAFR) marks a significant step forward in the analysis of surgical videos. By effectively managing data size and improving accuracy, KAFR showcases the potential of AI in enhancing surgical procedures. Continued research and development in this area could lead to even more innovative solutions in surgical analytics.
💬 Your comments
What are your thoughts on the KAFR framework and its potential applications in surgical video analysis? We invite you to share your insights! 💬 Leave your comments below or connect with us on social media.
Kinematic Adaptive Frame Recognition (KAFR): A Novel Framework for Video Segmentation via Frame Similarity and Surgical Tool Tracking.
Abstract
The interest in leveraging Artificial Intelligence (AI) for surgical procedures to automate analysis has witnessed a significant surge in recent years. Video is one of the primary means of recording surgical procedures and conducting subsequent analyses, such as performance assessment. However, these operative videos tend to be notably lengthy compared to other fields, spanning from thirty minutes to several hours, which poses a challenge for AI models to effectively learn from them. Despite this challenge, the foreseeable increase in the volume of such videos in the near future necessitates the development and implementation of innovative techniques to tackle this issue effectively. In this article, we propose a novel technique called Kinematic Adaptive Frame Recognition (KAFR) that can efficiently eliminate redundant frames to reduce dataset size and computation time while retaining useful frames to improve accuracy. Specifically, we compute the similarity between consecutive frames by tracking the movement of surgical tools. Our approach follows these steps: 1) Tracking phase: a YOLOv8 model is utilized to detect tools present in the scene; 2) Similarity phase: similarities between consecutive frames are computed by estimating variation in the spatial positions and velocities of the tools; 3) Classification phase: an X3D CNN is trained to classify the segmented phases. We evaluate the effectiveness of our approach by analyzing datasets obtained through retrospective reviews of cases at two referral centers. The newly annotated Gastrojejunostomy (GJ) dataset covers procedures performed between 2017 and 2021, while the previously annotated Pancreaticojejunostomy (PJ) dataset spans from 2011 to 2022 at the same centers. In the GJ dataset, each robotic GJ video is segmented into six distinct phases.
By adaptively selecting relevant frames, we achieve a tenfold reduction in the number of frames while improving accuracy by 4.32% (from 0.749 to 0.7814) and the F1 score by 0.16%. Our approach is also evaluated on the PJ dataset, demonstrating its efficacy with a fivefold reduction of data and a 2.05% accuracy improvement (from 0.8801 to 0.8982), along with a 2.54% increase in F1 score (from 0.8534 to 0.8751). In addition, we compare our approach with state-of-the-art approaches to highlight its competitiveness in terms of performance and efficiency. Although we examined our approach on the GJ and PJ datasets for phase segmentation, it could also be applied to broader, more general surgical datasets. Furthermore, KAFR can serve as a supplement to existing approaches, enhancing their performance by reducing redundant data while retaining key information, making it a valuable addition to other AI models.
Authors: Nguyen HP, Khairnar SM, Palacios SG, Al-Abbas A, Hogg ME, Zureikat AH, Polanco PM, Zeh HJ, Sankaranarayanan G
Journal: IEEE Access
Citation: Nguyen HP, et al. Kinematic Adaptive Frame Recognition (KAFR): A Novel Framework for Video Segmentation via Frame Similarity and Surgical Tool Tracking. IEEE Access. 2025;13:101681-101697. doi: 10.1109/ACCESS.2025.3573264