JOURNAL OF SHANGHAI JIAOTONG UNIVERSITY (MEDICAL SCIENCE) ›› 2021, Vol. 41 ›› Issue (9): 1197-1206. doi: 10.3969/j.issn.1674-8115.2021.09.010

• Basic research •

Construction of a metastasis prediction model of microsatellite instability-high colorectal cancer based on differentially expressed gene assembly

Ying XU, Yi-min CHU, Da-ming YANG, Ji LI, Hai-qin ZHANG, Hai-xia PENG

  1. Endoscopy Center, Tongren Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200336, China
  • Received: 2021-03-10 Online: 2021-08-24 Published: 2021-08-24
  • Contact: Hai-xia PENG
  • Supported by:
    Fund of Shanghai Municipal Science and Technology Committee (18ZR1434900); General Program of Science and Technology Commission of Changning District of Shanghai (CNKW2018Y02); Interdisciplinary Program of Shanghai Jiao Tong University (ZH2018QNB24); Research Fund of Shanghai Sixth People's Hospital Medical Group (ly202003)

Abstract: Objective

·To explore the potential key genes and the gene expression characteristics of microsatellite instability-high (MSI-H) colorectal cancer (CRC) with metastasis at the transcriptome level, and establish a metastasis prediction gene model.


Methods

·The transcriptome data of MSI-H CRC patients were obtained from The Cancer Genome Atlas database. The patients were divided into a metastatic group (21 patients) and a non-metastatic group (42 patients). The differentially expressed genes (DEGs) between the two groups were annotated and clustered by Gene Ontology (GO) analysis, and the enriched signaling pathways were identified by Gene Set Enrichment Analysis (GSEA). STRING and Cytoscape were used to select the hub genes. A nomogram was constructed from the selected DEGs, and the model was validated by the bootstrap method. Survival analysis was performed to explore the influence of each gene in the nomogram on progression-free survival (PFS) in MSI-H CRC.
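
As an illustration of the DEG screening step described above, the following is a minimal sketch and not the authors' pipeline. It assumes a Benjamini-Hochberg adjustment and cutoffs of |log2 fold change| ≥ 1 and adjusted P < 0.05; the abstract states neither the adjustment method nor the thresholds, and the names `GeneStat`, `benjamini_hochberg` and `classify_degs` are hypothetical.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log2_fold_change: float  # metastatic vs. non-metastatic group
    p_value: float           # raw P value from the differential test


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_degs(stats, lfc_cutoff=1.0, alpha=0.05):
    """Split genes into up- and down-regulated DEGs (cutoffs are assumptions)."""
    adjusted = benjamini_hochberg([s.p_value for s in stats])
    up, down = [], []
    for s, q in zip(stats, adjusted):
        if q >= alpha or abs(s.log2_fold_change) < lfc_cutoff:
            continue  # not significant, or effect size too small
        (up if s.log2_fold_change > 0 else down).append(s.gene)
    return up, down


# Toy input with placeholder gene names, not data from the study.
toy = [
    GeneStat("GENE_A", 2.0, 0.001),
    GeneStat("GENE_B", -1.5, 0.002),
    GeneStat("GENE_C", 0.2, 0.0005),
    GeneStat("GENE_D", 3.0, 0.8),
]
up_genes, down_genes = classify_degs(toy)
```

In practice this step is usually delegated to a dedicated differential expression package; the sketch only makes the up-regulated versus down-regulated split explicit.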


Results

·A total of 245 DEGs were identified between the metastatic and non-metastatic groups, of which 204 were up-regulated and 41 were down-regulated. GO analysis showed that, in terms of biological process and molecular function, the DEGs were mainly clustered in ion transmembrane transport, chloride transmembrane transport and chloride channel activity; in terms of cellular component, they were mainly clustered in extracellular region and extracellular space. GSEA showed that the neuroactive ligand-receptor interaction and metabolic pathways were enriched in the up-regulated genes. The top 10 hub genes in the protein-protein interaction network of the up-regulated genes were screened by Cytoscape. The metastasis prediction gene model, built on the 10 DEGs with the lowest adjusted P values and high physiological relevance to tumors, showed good discriminative performance [area under curve (AUC)=0.975 for training, AUC=0.920 for validation]. The expression levels of AC078993.1 and IGLJ2 (immunoglobulin lambda joining 2) were significantly negatively correlated with PFS in MSI-H CRC (P=0.011 and P=0.005, respectively).
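
The AUC values above measure how well the model's risk scores separate metastatic from non-metastatic patients. The sketch below shows one simple way to compute an AUC and to resample it by bootstrap. It is a simplified assumption: it resamples fixed scores rather than refitting the model in each resample, and the abstract does not specify the exact bootstrap procedure used. The function names and the synthetic scores are hypothetical; only the group sizes (21 and 42) come from the text.

```python
import random


def auc(scores, labels):
    """AUC as the chance a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(scores, labels, n_boot=1000, seed=0):
    """Return AUCs computed on n_boot resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    results = []
    while len(results) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lost one of the classes
            results.append(auc([scores[i] for i in idx], ys))
    return results


# Synthetic risk scores for illustration only (not study data).
rng = random.Random(42)
labels = [1] * 21 + [0] * 42
scores = [rng.gauss(1.0, 1.0) if y else rng.gauss(0.0, 1.0) for y in labels]

apparent_auc = auc(scores, labels)
resampled = sorted(bootstrap_auc(scores, labels, n_boot=500))
# Rough 95% percentile interval from the sorted bootstrap AUCs.
interval = (resampled[int(0.025 * 500)], resampled[int(0.975 * 500) - 1])
```

A fuller internal validation would refit the model on each resample and compare its performance on the resample with its performance on the original data.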


Conclusion

·Changes in ion channels and the extracellular environment may have an important impact on metastasis of MSI-H CRC. Neuroactive ligand-receptor interaction and metabolic pathways may be two important signaling pathways in this process. The metastasis prediction gene model established here may serve as a reference for subsequent clinical research.

Key words: colorectal cancer (CRC), microsatellite instability-high (MSI-H), metastasis, differentially expressed gene (DEG), bioinformatics, nomogram
