Quick Summary
This study reports what its authors describe as the first international consensus on priority scoring metrics for evaluating and comparing computer-aided detection (CADe) programs in colonoscopy. Using a modified Delphi approach, the panel narrowed 121 candidate criteria to six highest-priority metrics intended to inform the development and refinement of CADe software.
Key Details
- Participants: 25 global leaders in CADe in colonoscopy, including endoscopists, researchers, and industry representatives
- Methodology: Modified Delphi approach, conducted as an online survey over 8 months
- Scoring Criteria: 121 criteria generated, 54 deemed within the study scope
- Final Criteria: Six highest-priority metrics identified
Key Takeaways
- Sensitivity and separate, independent validation of the CADe algorithm tied as the top priorities (mean score 4.16 each).
- Adenoma detection rate scored 4.08, placing it third.
- False positive rate ranked fourth, with a mean score of 4.00.
- Latency and adenoma miss rate completed the final six, scoring 3.84 and 3.68, respectively.
- The authors state that these scoring criteria should inform CADe software development and refinement.
- Future research should validate these metrics on a benchmark video data set.
Background
Multiple CADe software products have received regulatory approval in the US, Europe, and Asia and are used in routine clinical practice to support colorectal cancer screening. However, no objective methodology exists for comparing different CADe algorithms, which leaves uncertainty about how they perform relative to one another. This study aims to address that gap by establishing a consensus on the priority metrics for evaluation and comparison.
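Study
Conducted over eight months, the study involved 25 experts in CADe in colonoscopy, including endoscopists, researchers, and industry representatives. Through an online survey, participants generated 121 scoring criteria, 54 of which were judged to be within the study scope and circulated for open comment. Participants scored these for priority on a 5-point Likert scale. The top eleven criteria were then redistributed for further open comment and a final round of scoring, which produced the six final criteria.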
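The ranking step described above can be illustrated with a minimal sketch. It assumes each criterion is ranked by the simple mean of its 1 to 5 Likert ratings, which matches the mean priority scores the study reports but is not confirmed as the authors' exact procedure. The function name `rank_by_mean_priority` and the example ratings are hypothetical and are not data from the study.

```python
from statistics import mean


def rank_by_mean_priority(scores, top_n):
    """Return the top_n (criterion, mean score) pairs, highest mean first.

    scores maps a criterion name to the list of 1-5 Likert ratings it received.
    """
    for name, ratings in scores.items():
        if not ratings or any(r not in (1, 2, 3, 4, 5) for r in ratings):
            raise ValueError(f"invalid Likert ratings for {name!r}")
    # Round to two decimals, the precision used for the reported scores.
    means = {name: round(mean(ratings), 2) for name, ratings in scores.items()}
    # sorted() is stable, so tied criteria keep their input order.
    ranked = sorted(means.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Hypothetical ratings from five panelists (illustrative only).
example_scores = {
    "sensitivity": [5, 4, 4, 5, 4],
    "latency": [4, 4, 3, 4, 4],
    "user interface color": [2, 3, 2, 2, 3],
}
top_two = rank_by_mean_priority(example_scores, top_n=2)
print(top_two)
```

In this sketch, a first round would call the function with `top_n=11` on all in-scope criteria, and a final round would call it again with `top_n=6` on fresh ratings of the shortlisted criteria.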
Results
The six final criteria were sensitivity, separate and independent validation of the CADe algorithm, adenoma detection rate, false positive rate, latency, and adenoma miss rate. Their mean priority scores ranged from 3.68 to 4.16 on the 5-point scale.
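Four of these six metrics are ratios of counts. The sketch below shows how they are conventionally computed. The study summary does not specify the unit of analysis (per frame, per polyp, or per patient) or the exact formulas, so the definitions here are standard textbook forms used as assumptions, and all function and parameter names are hypothetical. Latency and independent validation are not count-based and are left out.

```python
def _ratio(numerator, denominator):
    """Divide two counts, rejecting negative counts and empty denominators."""
    if numerator < 0 or denominator <= 0:
        raise ValueError("counts must be non-negative with a positive total")
    return numerator / denominator


def sensitivity(true_positives, false_negatives):
    # Share of real lesions that the CADe system flagged.
    return _ratio(true_positives, true_positives + false_negatives)


def false_positive_rate(false_positives, true_negatives):
    # Share of lesion-free units (for example, frames) that were flagged anyway.
    return _ratio(false_positives, false_positives + true_negatives)


def adenoma_detection_rate(procedures_with_adenoma, total_procedures):
    # Share of colonoscopies in which at least one adenoma was found.
    return _ratio(procedures_with_adenoma, total_procedures)


def adenoma_miss_rate(missed_adenomas, total_adenomas):
    # Share of all adenomas that the first examination missed,
    # typically estimated from tandem colonoscopy studies.
    return _ratio(missed_adenomas, total_adenomas)


# Illustrative counts only (not results from the study).
print(sensitivity(true_positives=90, false_negatives=10))
print(false_positive_rate(false_positives=5, true_negatives=95))
print(adenoma_detection_rate(procedures_with_adenoma=40, total_procedures=100))
print(adenoma_miss_rate(missed_adenomas=15, total_adenomas=100))
```

Keeping each metric as a separate function with explicit count arguments makes the chosen unit of analysis visible at the call site, which matters when comparing figures reported by different CADe systems.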
Impact and Implications
A shared set of priority metrics gives developers, clinicians, and researchers a common framework for evaluating and comparing CADe software. The authors expect these criteria to inform software development and refinement. Whether that leads to better detection and fewer false positives in practice remains to be shown, since the metrics have not yet been validated on a benchmark data set. Similar consensus methods could plausibly be applied in other areas of medical imaging, although the study does not address them.
Conclusion
This study provides the first reported international consensus statement on priority scoring metrics for CADe in colonoscopy. The next step proposed by the authors is to validate these metrics on a benchmark video data set in order to develop a validated scoring instrument.
Creating a Standardized Tool for the Evaluation and Comparison of Artificial Intelligence-Based Computer-Aided Detection Programs in Colonoscopy: a Modified Delphi Approach.
Abstract
BACKGROUND AND AIM: Multiple computer-aided detection (CADe) software have now achieved regulatory approval in the US, Europe, and Asia and are being used in routine clinical practice to support colorectal cancer screening. There is uncertainty regarding how different CADe algorithms may perform. No objective methodology exists for comparing different algorithms. We aimed to identify priority scoring metrics for CADe evaluation and comparison.
METHODS: A modified Delphi approach was used. Twenty-five global leaders in CADe in colonoscopy, including endoscopists, researchers, and industry representatives, participated in an online survey over the course of 8 months. Participants generated 121 scoring criteria, 54 of which were deemed within the study scope and distributed for review and asynchronous email-based open comment. Participants then scored criteria in order of priority on a 5-point Likert scale during ranking round one. The top eleven highest-priority criteria were re-distributed, with another opportunity for open-comment, followed by a final round of priority scoring to identify the final 6 criteria.
RESULTS: Mean priority scores for the 54 criteria ranged from 2.25 to 4.38 following the first ranking round. The top eleven criteria following ranking round one yielded mean priority scores ranging from 3.04 to 4.16. The final six highest priority criteria were 1) sensitivity (average = 4.16) and 2) separate and independent validation of the CADe algorithm (4.16), 3) adenoma detection rate (4.08), 4) false positive rate (4.00), 5) latency (3.84), and 6) adenoma miss rate (3.68).
CONCLUSIONS: This is the first reported international consensus statement of priority scoring metrics for CADe in colonoscopy. These scoring criteria should inform CADe software development and refinement. Future research should validate these metrics on a benchmark video data set to develop a validated scoring instrument.
Authors: Gadi SRV, Mori Y, Misawa M, East JE, Hassan C, Repici A, Byrne MF, von Renteln D, Hewett DG, Wang P, Saito Y, Matsubayashi CO, Ahmad OF, Sharma P, Gross SA, Sengupta N, Mansour N, Cherubini A, Dinh NN, Xiao X, Mountney P, González-Bueno Puyal J, Little G, LaRocco S, Conjeti S, Seibt H, Zur D, Shimada H, Berzin TM, Glissen Brown JR
Journal: Gastrointest Endosc
Citation: Gadi SRV, et al. Creating a Standardized Tool for the Evaluation and Comparison of Artificial Intelligence-Based Computer-Aided Detection Programs in Colonoscopy: a Modified Delphi Approach. Gastrointest Endosc. 2024. doi: 10.1016/j.gie.2024.11.042