2018, Vol. 38, Issue (9): 1019-. doi: 10.3969/j.issn.1674-8115.2018.09.004

• Original article (Basic research) •

Bacterial signatures for diagnosis of colorectal cancer by fecal metagenomics analysis

ZHANG Xin-yu1*, ZHANG Jing2*, ZHU Xiao-qiang1, CAO Ying-ying1, CHEN Hao-yan1   

  1. Department of Gastroenterology and Hepatology, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai Institute of Digestive Disease, Shanghai 200001, China; 2. Medical Record Statistics Center, Renji Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200001, China
  • Online: 2018-09-28 Published: 2018-10-15
  • Supported by:
    National Natural Science Foundation of China, 31371273; “Youth Eastern Scholar” at Shanghai Institutions of Higher Learning, QD2015003; Shanghai Municipal Education Commission— Gaofeng Clinical Medicine Support, 20161309

Abstract: Objective · To construct bacterial signatures by analyzing fecal metagenomics for the screening and diagnosis of colorectal cancer (CRC). Methods · A total of 285 samples were included in the study. Diagnostic models for CRC based on six different machine learning algorithms were developed using the featured bacteria selected by the random forest algorithm, and were validated in validation sets. Results · Nine bacteria that differentiated CRC from the control were identified, with which six models were established. The best model was the random forest model, with an accuracy of 0.8477 in the training set. Its accuracy in the two test sets was 0.8158 and 0.7344, respectively. The area under the receiver operating characteristic curve (AUC) of the random forest model in the set including all samples was 0.894. Conclusion · Bacterial signatures based on the random forest algorithm can effectively differentiate patients with CRC from controls, which suggests their potential clinical value.

Key words: colorectal cancer, diagnosis, intestinal bacteria, machine learning, random forest
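
The abstract reports model performance as accuracy and as the area under the ROC curve (AUC). As a minimal sketch of what these two metrics measure, the Python below computes both from scratch on a small made-up example. The function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions for illustration; they are not the study's data, and the abstract does not state how the authors computed their figures.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the true label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win, which matches the trapezoidal area under
    the ROC curve.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = CRC, 0 = control; scores are invented model outputs.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

toy_accuracy = accuracy(toy_labels, toy_scores)  # depends on the chosen threshold
toy_auc = roc_auc(toy_labels, toy_scores)        # threshold-free ranking quality
```

Accuracy depends on where the decision threshold is set, whereas AUC summarizes how well the scores rank CRC samples above controls across all thresholds, which is one reason a paper may report both.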
