Journal of Shanghai Jiao Tong University (Medical Science) ›› 2022, Vol. 42 ›› Issue (10): 1394-1403. doi: 10.3969/j.issn.1674-8115.2022.10.004

• Original article · Basic research •

Construction of a prediction model of immune-related long non-coding RNA in gastric cancer

CHEN Bin (陈彬), CUI Hongquan (崔洪全), YANG Yijin (杨懿瑾), XU Haiyan (徐海燕), ZHANG Ling (张玲)

  1. Department of Oncology, Suzhou Kowloon Hospital, Shanghai Jiao Tong University School of Medicine, Suzhou 215021, China
  • Received: 2022-03-29 Accepted: 2022-06-14 Online: 2022-10-28 Published: 2022-12-02
  • Contact: ZHANG Ling E-mail: chb083@126.com; mfwl3@163.com
  • About the author: CHEN Bin (born 1981), male, Hui ethnicity, associate chief physician, master's degree; E-mail: chb083@126.com
  • Supported by:
    Youth Science and Technology Project of Suzhou for Developing Healthcare through Science and Education (kjxw2018081)

Abstract:

Objective · To construct a prediction model based on immune-related long non-coding RNAs (lncRNAs) for patients with gastric cancer using bioinformatics methods, and to explore its application value. Methods · RNA sequencing (RNA-seq) transcriptome data for 413 gastric cancer samples, comprising 32 normal samples and 381 tumor samples, were downloaded from The Cancer Genome Atlas (TCGA) database. Immune-related genes were obtained from the ImmPort website. Immune-related lncRNAs (irlncRNAs) were identified by correlation analysis. Differentially expressed irlncRNAs (DEirlncRNAs) were identified with the limma R package, and heatmaps and volcano plots were drawn. The problem of batch correction between samples was addressed by constructing DEirlncRNA pairs. Clinicopathological data of the TCGA gastric cancer patients were downloaded; prognosis-related DEirlncRNA pairs were identified by univariate analysis and then screened by LASSO regression analysis. Finally, a risk prediction model was constructed by Cox proportional hazards regression analysis. The predictive performance of the model was compared with that of traditional clinicopathological features by calculating the area under the curve (AUC). A risk score was calculated for each patient according to the model formula, and patients were divided into high-risk and low-risk groups according to the optimal cutoff value. Survival curves were plotted with the Kaplan-Meier method, and the difference in survival rate between the two groups was compared by the log-rank test. The relationship between risk score and clinicopathological characteristics was analyzed by the Wilcoxon signed-rank test. Independent prognostic factors for gastric cancer patients were verified by univariate and multivariate analyses. The relationships between risk score and both infiltrating immune cells and immune-related genes were examined by Spearman correlation analysis. The half-maximal inhibitory concentration (IC50) values of drugs in the high-risk and low-risk groups were compared by using the pRRophetic R package. Results · Compared with normal tissues, 106 irlncRNAs were differentially expressed in gastric cancer tissues, of which 11 were down-regulated and 95 were up-regulated. A total of 32 DEirlncRNA pairs were included in the Cox proportional hazards model, 20 of which were independent prognostic factors for gastric cancer. The 1-, 2- and 3-year AUC values of the risk prediction model were 0.889, 0.966 and 0.935, respectively, which were significantly higher than those of traditional clinicopathological features. The survival rate of patients in the high-risk group was significantly lower than that in the low-risk group (P<0.001). A high risk score was closely associated with advanced tumor stage, distant metastasis and patient death. Univariate analysis showed that age, TNM stage, T stage, N stage, M stage and risk score were closely related to the prognosis of gastric cancer patients (all P<0.05). Multivariate analysis showed that age, TNM stage and risk score were independent prognostic factors for gastric cancer patients (all P<0.05). Risk score was negatively correlated with several types of T cells and with mast cells, and positively correlated with tumor-associated fibroblasts, macrophages and endothelial cells. The expression levels of the immune-related genes IFNG and MSH2 were lower in the high-risk group than in the low-risk group. Patients in the high-risk group were less sensitive to doxorubicin, cisplatin, tipifarnib and mitomycin than those in the low-risk group (all P<0.05). Conclusion · The Cox proportional hazards model constructed with DEirlncRNA pairs can accurately predict the survival status and survival rate of patients with gastric cancer and their sensitivity to chemotherapy drugs.
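The pairing step and the risk score described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code (the study used R packages), and several details are assumptions that the abstract does not state: a pair is coded 1 when the first lncRNA is expressed higher than the second in the same sample and 0 otherwise, pairs that are nearly constant across samples are dropped using hypothetical 0.2/0.8 bounds, and the lncRNA names, coefficient and cutoff are toy values. Because each comparison is made within a single sample, rescaling a sample's expression values leaves the pair matrix unchanged, which is one plausible reason why pairing sidesteps batch correction.

```python
from itertools import combinations


def build_pair_matrix(expr, min_frac=0.2, max_frac=0.8):
    """Turn expression values into 0/1 lncRNA-pair indicators.

    expr maps an lncRNA name to its expression values, one per sample.
    A pair "A|B" is 1 in a sample when A is expressed higher than B.
    Pairs that are almost always 0 or almost always 1 are dropped
    (the 0.2/0.8 bounds are an assumed filter, not from the abstract).
    """
    pairs = {}
    for a, b in combinations(sorted(expr), 2):
        flags = [1 if x > y else 0 for x, y in zip(expr[a], expr[b])]
        frac = sum(flags) / len(flags)
        if min_frac < frac < max_frac:
            pairs[f"{a}|{b}"] = flags
    return pairs


def risk_scores(pairs, coefs):
    """Cox-style linear predictor: sum of coefficient * pair value per sample."""
    n_samples = len(next(iter(pairs.values())))
    return [
        sum(coefs[name] * pairs[name][i] for name in coefs)
        for i in range(n_samples)
    ]


def split_by_cutoff(scores, cutoff):
    """Label each sample as high or low risk relative to a chosen cutoff."""
    return ["high" if s > cutoff else "low" for s in scores]


# Toy data: three hypothetical lncRNAs measured in five samples.
expr = {
    "lncA": [5.0, 1.0, 4.0, 0.5, 3.0],
    "lncB": [2.0, 3.0, 1.0, 2.5, 3.5],
    "lncC": [9.0, 9.0, 9.0, 9.0, 9.0],
}
pairs = build_pair_matrix(expr)
coefs = {"lncA|lncB": 0.8}          # hypothetical model coefficient
scores = risk_scores(pairs, coefs)
groups = split_by_cutoff(scores, cutoff=0.4)  # hypothetical cutoff
```

In this toy run only the pair lncA|lncB survives the filter, since lncC is always the highest and so gives constant indicators. In the study itself the coefficients would come from the fitted Cox model and the cutoff from the optimal threshold.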

Key words: gastric cancer, immune-related long non-coding RNA (irlncRNA), Cox proportional hazards model, LASSO regression, area under the curve (AUC)

CLC number: