Journal of Shanghai Jiao Tong University (Medical Science) ›› 2019, Vol. 39 ›› Issue (10): 1156-. doi: 10.3969/j.issn.1674-8115.2019.10.009

• Original article · Clinical research •

Improvement of liver fibrosis diagnostic models based on Youden index

SANG Chao1, XIE Guo-xiang2, LIANG Dan-dan1, ZHAO Ai-hua1, JIA Wei1, 2, CHEN Tian-lu1

  1. Center for Translational Medicine, Shanghai Sixth People's Hospital, Shanghai Jiao Tong University, Shanghai 200233, China; 2. University of Hawaii Cancer Center, Honolulu 96813, USA
  • Online: 2019-10-28 Published: 2019-11-22
  • Corresponding author: CHEN Tian-lu, e-mail: chentianlu@sjtu.edu.cn
  • About the author: SANG Chao (born 1994), female, master's student; e-mail: sang_chao@163.com
  • Supported by:
    National Natural Science Foundation of China (81772530, 31501079, 31500954)

Abstract:

Objective · To improve the performance of liver fibrosis diagnostic models using the Youden index, and to address the imbalance in diagnostic sensitivity that arises when the sample sizes of the two groups differ greatly. Methods · Two hepatitis B virus (HBV) datasets publicly available on GitHub were used: 482 HBV-infected patients from Shuguang Hospital Affiliated to Shanghai University of Traditional Chinese Medicine (training set) and 86 HBV-infected patients from Xiamen Hospital of Traditional Chinese Medicine (validation set). Based on the patients' age and three blood test results (platelet count and serum levels of glutamic-oxaloacetic transaminase and glutamic-pyruvic transaminase), four machine learning models (linear discriminant analysis, random forest, gradient boosting, and decision tree) were established to distinguish early from advanced liver fibrosis, and liver fibrosis from cirrhosis. The Youden index was used to adjust the classification threshold, and hence the diagnostic result, of each model. The diagnostic performance of each machine learning model and of the fibrosis index based on the 4 factors (FIB-4), which is commonly used in clinical practice, was compared in terms of total accuracy, area under the receiver operating characteristic curve (AUC), and sensitivity. Results · In liver fibrosis diagnosis, all four machine learning models showed imbalanced sensitivity between groups. After the Youden index was introduced, the difference in intergroup sensitivity was greatly reduced in every model; the total accuracy and AUC of the machine learning models were generally higher than those of FIB-4. Conclusion · Diagnostic models based on the Youden index can balance the sensitivity across groups and help improve the overall performance of liver fibrosis diagnostic models.

Key words: Youden index, machine learning, liver fibrosis, disease diagnosis
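
The threshold adjustment described in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything below is an assumption: the function names `youden_threshold` and `group_sensitivities` are hypothetical, the scores and labels are made-up toy values, and the cutoff is chosen by the standard definition of the Youden index, J = sensitivity + specificity - 1, evaluated at each observed score.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) that maximizes J = sensitivity + specificity - 1.

    scores: predicted probability of the positive class for each sample
    labels: 1 for the positive group, 0 for the negative group
    A sample is called positive when its score >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both groups must be present")
    best_t, best_j = None, -1.0
    # Only observed scores need to be tried as candidate cutoffs.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # ties keep the lowest cutoff
            best_t, best_j = t, j
    return best_t, best_j


def group_sensitivities(scores, labels, threshold):
    """Fraction of each group classified correctly at the given cutoff."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct_pos = sum(1 for s in pos if s >= threshold) / len(pos)
    correct_neg = sum(1 for s in neg if s < threshold) / len(neg)
    return correct_pos, correct_neg


# Toy, invented scores for an imbalanced sample: 8 negatives, 2 positives.
scores = [0.05, 0.10, 0.12, 0.15, 0.20, 0.22, 0.25, 0.45, 0.35, 0.40]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

threshold, j = youden_threshold(scores, labels)
print("default cutoff 0.5:", group_sensitivities(scores, labels, 0.5))
print("Youden cutoff", threshold, ":", group_sensitivities(scores, labels, threshold))
```

On this toy sample the default cutoff of 0.5 classifies every negative correctly and every positive incorrectly, while the cutoff chosen by the Youden index brings the two per-group rates close together. This is only meant to show the mechanism; it says nothing about the magnitudes reported in the study.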

CLC number: