Journal of Shanghai Jiao Tong University (Medical Science) ›› 2024, Vol. 44 ›› Issue (6): 762-772. doi: 10.3969/j.issn.1674-8115.2024.06.012

• Original Article · Techniques and Methods •

  • About the first author: CHEN Jian (1987—), male, associate chief physician, master's degree; E-mail: szcsdocter@gmail.com

Comparative study on methods for colon polyp endoscopic image segmentation and classification based on deep learning

CHEN Jian1(), WANG Zhenni1, XIA Kaijian2, WANG Ganhong3, LIU Luojie1(), XU Xiaodan1()   

  1. Department of Gastroenterology, Changshu No. 1 People's Hospital (Changshu Hospital Affiliated to Soochow University), Changshu 215500, Jiangsu, China
    2. Changshu Key Laboratory of Medical Artificial Intelligence and Big Data, Changshu 215500, Jiangsu, China
    3. Department of Gastroenterology, Changshu Traditional Chinese Medicine Hospital, Changshu 215500, Jiangsu, China
  • Received: 2024-01-01 Accepted: 2024-03-15 Online: 2024-06-28 Published: 2024-06-28
  • Contact: LIU Luojie, XU Xiaodan E-mail: szcsdocter@gmail.com; Luojie13542@163.com; xxddocter@gmail.com
  • Supported by:
    Suzhou City's 23rd Science and Technology Development Plan (SLT2023006); Changshu Key Laboratory of Medical Artificial Intelligence and Big Data Capability Enhancement Project (CYZ202301); Scientific Research Project of Suzhou Nursing Association (SZHL-B-202407)


Abstract:

Objective · To compare the performance of different deep learning methods in the segmentation and classification of colorectal polyp endoscopic images, and to identify the most effective approach. Methods · Four colorectal polyp datasets were collected from three hospitals, encompassing 1,534 static images and 15 colonoscopy videos. All samples were pathologically validated and categorized into two types: serrated lesions and adenomatous polyps. Polygonal annotations were created with the LabelMe tool, and the annotation results were converted into integer mask format. These data were used to train deep neural networks of different architectures, including convolutional neural networks (CNNs), Transformers, and a fusion of the two, in order to develop an effective semantic segmentation model. Multiple performance metrics for automatic diagnosis of colorectal polyps were compared across the architectures, including mean intersection over union (mIoU), overall accuracy (aAcc), mean accuracy (mAcc), mean Dice coefficient (mDice), mean F-score (mFscore), mean precision (mPrecision), and mean recall (mRecall). Results · Four semantic segmentation models of different architectures were developed, including two deep CNN architectures (Fast-SCNN and DeepLabV3plus), one Transformer architecture (Segformer), and one hybrid architecture (KNet). In a comprehensive performance evaluation on 291 test images, KNet achieved the highest mIoU of 84.59%, significantly surpassing Fast-SCNN (75.32%), DeepLabV3plus (78.63%), and Segformer (80.17%). Across the categories of “background”, “serrated lesions”, and “adenomatous polyps”, KNet's IoU values were 98.91%, 74.12%, and 80.73%, respectively, all exceeding those of the other models. KNet also excelled in the other key metrics, with aAcc, mAcc, mDice, mFscore, and mRecall reaching 98.59%, 91.24%, 91.31%, 91.31%, and 91.24%, respectively, all superior to the other models. Although its mPrecision of 91.46% was not the highest, KNet's overall performance remained the best. In inference testing on 80 external test images, KNet maintained an mIoU of 81.53%, demonstrating strong generalization capability. Conclusion · The semantic segmentation model for colorectal polyp endoscopic images built on the KNet hybrid deep neural network architecture exhibits superior predictive performance and has the potential to become an efficient tool for detecting colorectal polyps.
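The Methods describe converting LabelMe polygon annotations into integer class masks. The abstract does not say how the authors implemented this step (LabelMe ships its own conversion utilities), so the following is only a minimal pure-Python sketch of the idea, using even-odd ray-casting rasterization; the function names and the class ids (0 = background, 1 = serrated lesion, 2 = adenomatous polyp) are illustrative assumptions.

```python
def point_in_polygon(x, y, poly):
    """Even-odd ray casting: is the pixel center (x, y) inside the polygon?"""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray at height y
            xc = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < xc:            # crossing lies to the right of the point
                inside = not inside
    return inside

def polygons_to_mask(width, height, polygons, class_ids):
    """Rasterize labelled polygons into an integer mask (0 = background)."""
    mask = [[0] * width for _ in range(height)]
    for poly, cid in zip(polygons, class_ids):
        for y in range(height):
            for x in range(width):
                # sample at the pixel center (x + 0.5, y + 0.5)
                if point_in_polygon(x + 0.5, y + 0.5, poly):
                    mask[y][x] = cid
    return mask

# Toy example: one square "adenomatous polyp" annotation (class 2) on a 6x6 image
mask = polygons_to_mask(6, 6, [[(1, 1), (5, 1), (5, 5), (1, 5)]], [2])
```

In practice a real pipeline would use an optimized rasterizer (e.g. scanline fill) rather than testing every pixel, but the resulting integer mask format is the same.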
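The metrics reported above all derive from a pixel-wise confusion matrix between the predicted and ground-truth masks. The authors' actual evaluation tooling is not specified in the abstract, so the sketch below is only an illustrative pure-Python implementation of the standard definitions, operating on two flattened label sequences. Note that mAcc here is the mean of per-class accuracy tp/(tp+fn), which is the same quantity as mRecall, consistent with the identical 91.24% values in the Results.

```python
def segmentation_metrics(gt, pred, num_classes):
    """Per-class IoU/Dice plus aAcc/mIoU/mDice/mAcc from flat label lists."""
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for g, p in zip(gt, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1
            fn[g] += 1
    iou = [tp[c] / (tp[c] + fp[c] + fn[c]) if tp[c] + fp[c] + fn[c] else 0.0
           for c in range(num_classes)]
    dice = [2 * tp[c] / (2 * tp[c] + fp[c] + fn[c]) if tp[c] + fp[c] + fn[c] else 0.0
            for c in range(num_classes)]
    acc = [tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
           for c in range(num_classes)]
    return {
        "IoU": iou,
        "Dice": dice,
        "aAcc": sum(tp) / len(gt),        # overall pixel accuracy
        "mIoU": sum(iou) / num_classes,
        "mDice": sum(dice) / num_classes,
        "mAcc": sum(acc) / num_classes,   # equals mRecall by definition
    }

# Toy example with 6 pixels and 3 classes (0 = background, 1, 2)
m = segmentation_metrics([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2], 3)
# e.g. mIoU = (1.0 + 0.5 + 2/3) / 3
```

Averaging per-class scores (mIoU, mDice) weights the small polyp classes equally with the dominant background, which is why aAcc (98.59%) can be much higher than mIoU (84.59%) in the reported results.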

Key words: deep learning, colorectal polyp, convolutional neural network, Transformer, image segmentation
