Journal of Shanghai Jiao Tong University (Medical Science) ›› 2024, Vol. 44 ›› Issue (9): 1169-1181. doi: 10.3969/j.issn.1674-8115.2024.09.012

• Original Article · Clinical Research •

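  • About the authors: WU Qizhen (born 1997), female, PhD candidate; e-mail: wuqizhen@sjtu.edu.cn
    LIU Qiming (born 1996), male, master's degree; e-mail: 090503liu@sjtu.edu.cn. WU Qizhen and LIU Qiming are co-first authors.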

Evaluation of machine learning prediction of altered inflammatory metabolic state after neoadjuvant therapy for breast cancer

WU Qizhen, LIU Qiming, CHAI Yezi, TAO Zhengyu, WANG Yinan, GUO Xinning, JIANG Meng, PU Jun

  1. Department of Cardiology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200127, China
  • Received: 2024-01-29; Accepted: 2024-06-04; Online: 2024-09-28; Published: 2024-10-09
  • Contact: JIANG Meng, PU Jun. E-mail: wuqizhen@sjtu.edu.cn; 090503liu@sjtu.edu.cn; jiangmeng0919@163.com; pujun310@hotmail.com
  • Supported by:
    National Natural Science Foundation of China (U21A20341); Project of Science and Technology Commission of Shanghai Municipality (20Y11910500); Advanced Technology Leader of Science and Technology Commission of Shanghai Municipality (21XD1432100); Three-year Action Plan of Shanghai Shenkang Hospital Development Center (SHDC2020CR2025B); Project of Shanghai Cancer Institute (ZZ-20-22SYL); “Two-hundred Talents” Program of Shanghai Jiao Tong University School of Medicine (20172014)


Abstract:

Objective · To explore, using machine learning with common clinical laboratory indices and transthoracic echocardiography indices, an approach for the early identification and prediction of breast cancer patients at high risk of cardiovascular disease related to metabolic state changes after neoadjuvant therapy.
Methods · Female patients diagnosed with primary invasive breast cancer at the Department of Breast Surgery, Renji Hospital, Shanghai Jiao Tong University School of Medicine, between September 2020 and September 2022 were consecutively enrolled. General patient information, laboratory test results, and transthoracic echocardiography results were collected. After feature extraction, five machine learning algorithms, namely gradient boosting (GB), support vector machine (SVM), decision tree (DT), K-nearest neighbor (KNN), and random forest (RF), were used to construct models predicting changes in patients' inflammatory metabolic state after neoadjuvant therapy, and the predictive performance of the five models was compared.
Results · A total of 232 cases with valid clinical data were included: 135 before neoadjuvant therapy and 97 after completing 4 cycles of neoadjuvant therapy. Feature extraction selected five features: white blood cell count, hemoglobin, high-density lipoprotein (HDL), interleukin-2 receptor, and interleukin-8. In the multi-feature analysis, the area under the receiver operating characteristic curve (AUC) was higher for the combination of white blood cell count, hemoglobin, and HDL than for the combination of interleukin-2 receptor and interleukin-8 (RF: 0.928 vs 0.772; GB: 0.900 vs 0.792; SVM: 0.941 vs 0.764; KNN: 0.907 vs 0.762; DT: 0.799 vs 0.714). With the former combination, the RF, SVM, and GB models had relatively high AUC (0.928, 0.941, and 0.900, respectively) and accuracy (0.914, 0.897, and 0.776, respectively). Compared with the RF and GB models (P=0.122 and P=0.097, respectively), the SVM model showed better calibration on the training set (P=0.394).
Conclusion · By incorporating five common clinical indices (white blood cell count, hemoglobin, high-density lipoprotein, interleukin-2 receptor, and interleukin-8), the SVM model can be used to build a model for early prediction of cardiovascular disease risk related to metabolic state changes after neoadjuvant therapy in breast cancer patients, which may help clinicians establish individualized screening protocols based on a patient's inflammatory metabolic state.

Key words: breast cancer, neoadjuvant therapy, machine learning, support vector machine
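
The abstract compares models by AUC and accuracy. As a rough illustration of what those two metrics measure, the following minimal Python sketch computes both from predicted scores. It is not the authors' code: the function names, the 0.5 decision threshold, the label coding, and the toy scores for two unnamed models are all assumptions made for this example, and none of the numbers come from the study.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the chance that a random positive case outscores a random negative.

    A tie between a positive and a negative counts as half a win
    (the Mann-Whitney formulation of the area under the ROC curve).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases classified correctly at an assumed decision threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Hypothetical toy labels, NOT study data: 1 = state changed, 0 = unchanged.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]

# Made-up predicted probabilities from two hypothetical models.
model_scores: Dict[str, Sequence[float]] = {
    "model_a": [0.1, 0.2, 0.3, 0.6, 0.4, 0.7, 0.8, 0.9],
    "model_b": [0.2, 0.6, 0.4, 0.7, 0.3, 0.5, 0.8, 0.9],
}

results: Dict[str, Tuple[float, float]] = {
    name: (roc_auc(y_true, s), accuracy(y_true, s))
    for name, s in model_scores.items()
}
for name, (auc, acc) in results.items():
    print(f"{name}: AUC={auc:.3f}, accuracy={acc:.3f}")
```

AUC depends only on how the scores rank the cases, whereas accuracy depends on the chosen threshold, so the two metrics can order models differently. Neither one measures calibration, which the abstract reports separately through P values.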

CLC number: