Establishment and validation of prognostic prediction model of colorectal cancer based on single-cell RNA sequencing
Received date: 2020-05-21
Online published: 2021-02-28
Supported by
Shanghai Municipal Education Commission—Gaofeng Clinical Medicine Grant Support(20161309);Innovative Research Team of High-Level Local Universities in Shanghai(SSMU-ZLCX20180200)
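Ma Yanru, Ji Linhua, Tong Tianying, Yan Yuqing, Shen Chaoqin, Zhang Xinyu, Cao Yingying, Hong Jie, Chen Haoyan. Establishment and validation of prognostic prediction model of colorectal cancer based on single-cell RNA sequencing[J]. Journal of Shanghai Jiao Tong University (Medical Science), 2021, 41(2): 159-165. DOI: 10.3969/j.issn.1674-8115.2021.02.006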
Objective·To establish a prognostic prediction model for patients with colorectal cancer (CRC) based on single-cell RNA sequencing (scRNA-seq).
Methods·scRNA-seq datasets of CRC samples were obtained from the Gene Expression Omnibus (GEO) database, and genes differentially expressed in association with CRC metastasis were screened as candidate genes for the model. Least absolute shrinkage and selection operator (LASSO) regression, Logistic regression and Kaplan-Meier survival analysis were then used in The Cancer Genome Atlas (TCGA) database to select and validate the gene set associated with CRC prognosis and to build the prognostic prediction model. Decision curve analysis and the receiver operating characteristic (ROC) curve were used to assess the clinical value of the model.
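The decision curve analysis mentioned in the methods compares strategies by their net benefit at a chosen threshold probability. The following is a minimal sketch of that standard calculation, not the authors' code; the function names and the toy labels and probabilities are assumptions made purely for illustration.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def treat_all_net_benefit(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Toy data (hypothetical, not from the study): 1 = recurrence, 0 = no recurrence.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.7, 0.4, 0.1, 0.6, 0.8, 0.3]

for pt in (0.2, 0.5):
    print(pt, net_benefit(y_true, y_prob, pt), treat_all_net_benefit(y_true, pt))
```

A model is usually considered clinically useful over the range of thresholds where its net benefit exceeds both the treat-all reference and the treat-none reference, whose net benefit is zero.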
Results·Thirty differentially expressed genes were screened from the scRNA-seq data obtained from the GEO database, and 9 hub genes were then selected by LASSO regression in the TCGA database. A score based on hub-gene expression was calculated for each patient. The scores differed significantly between patients with and without recurrence in both the training set and the validation set (P<0.05). Logistic regression analysis was used to incorporate two independent clinical variables, primary tumor stage (T stage) and distant metastasis status (M stage), into the score-clinical variable integration model. The areas under the ROC curve of the integrated model were 0.775 in the training set and 0.705 in the validation set.
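The abstract does not give the scoring formula, so the sketch below assumes a common form: a per-patient score computed as a coefficient-weighted sum of gene expression values. It then evaluates the score with the area under the ROC curve, computed as the fraction of positive-negative pairs ranked correctly. The gene names, coefficients and patient values are hypothetical placeholders, not the study's 9 hub genes or its data.

```python
def risk_score(expression, coefficients):
    """Weighted sum of gene expression values, one weight per gene."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical coefficients, standing in for weights from a fitted LASSO model.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.25, "GENE_C": 1.0}

# Hypothetical patients: (expression profile, recurrence label).
patients = [
    ({"GENE_A": 2.0, "GENE_B": 4.0, "GENE_C": 1.0}, 1),
    ({"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 0.5}, 0),
    ({"GENE_A": 4.0, "GENE_B": 0.0, "GENE_C": 0.0}, 1),
    ({"GENE_A": 0.0, "GENE_B": 4.0, "GENE_C": 2.0}, 0),
    ({"GENE_A": 2.0, "GENE_B": 0.0, "GENE_C": 0.0}, 0),
]

scores = [risk_score(expr, coefficients) for expr, _ in patients]
labels = [label for _, label in patients]
print(scores, roc_auc(labels, scores))
```

In an integrated model of the kind described, such a score would be one predictor alongside T stage and M stage in a Logistic regression, and the AUC would be computed on the fitted probabilities separately for the training and validation sets.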
Conclusion·A relatively stable prognostic prediction model for CRC was constructed based on scRNA-seq results; it may serve as a reference for treatment decisions and for clinical assessment of patient prognosis.