Keywords: Advanced adenomas; Enteroviruses; Gut bacteria; Metagenomic sequencing; Prediction model

MeSH: Humans; Gastrointestinal Microbiome / genetics; Bacteria / genetics, classification, isolation & purification; Adenoma / microbiology, virology; Feces / microbiology, virology; Colorectal Neoplasms / microbiology, virology; Male; Middle Aged; Female; Viruses / isolation & purification, classification, genetics, pathogenicity; High-Throughput Nucleotide Sequencing; Aged; Machine Learning

Source: DOI:10.1186/s12866-024-03416-z

Abstract:
BACKGROUND: More than 90% of colorectal cancers (CRC) arise from advanced adenomas (AA), and gut microbes are closely associated with the initiation and progression of both AA and CRC.
OBJECTIVE: To analyze the characteristic microbes in AA.
METHODS: Fecal samples were collected from 92 AA cases and 184 negative controls (NC). Microbial populations were sequenced on the Illumina HiSeq X high-throughput platform. The sequencing results were annotated against the NCBI RefSeq database to identify the microbial characteristics of AA. The R package vegan was used to analyze α and β diversity: α diversity was visualized with box plots, and β diversity with principal component analysis (PCA), principal coordinates analysis (PCoA), and non-metric multidimensional scaling (NMDS). AA risk prediction models were constructed using six machine learning algorithms. In addition, unsupervised clustering methods were used to classify bacteria and viruses, and the characteristics of bacteria and viruses in the resulting subtypes were analyzed.
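As a rough illustration of the diversity measures mentioned in the methods, the sketch below computes a Shannon index (α diversity) and a Bray-Curtis dissimilarity (β diversity) in plain Python. The abstract does not state which index or distance the authors used; Shannon and Bray-Curtis are assumed here only because they are the defaults of vegan's `diversity()` and `vegdist()`. The function names and the toy counts are hypothetical.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    h = 0.0
    for c in counts:
        if c > 0:
            p = c / total
            h -= p * math.log(p)
    return h


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length abundance vectors."""
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    denom = sum(u) + sum(v)
    if denom == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / denom


# Hypothetical toy counts for four taxa in two samples (not data from the study).
even_sample = [10, 10, 10, 10]
skewed_sample = [40, 0, 0, 0]

alpha_even = shannon_index(even_sample)          # maximal for 4 taxa: ln(4)
alpha_skewed = shannon_index(skewed_sample)      # a single taxon gives 0
beta = bray_curtis(even_sample, skewed_sample)   # 0 = identical, 1 = disjoint
```

A pairwise matrix of such dissimilarities is what ordination methods like PCoA and NMDS typically take as input.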
RESULTS: The abundances of Prevotella sp900557255, Alistipes putredinis, and Megamonas funiformis were higher in AA, as were the abundances of Lilyvirus, Felixounavirus, and Drulisvirus. The CatBoost-based model for predicting the risk of AA had the highest accuracy (bacteria test set: 87.27%; virus test set: 83.33%). In addition, 4 subtypes (B1V1, B1V2, B2V1, and B2V2) were distinguished based on the abundance of gut bacteria and enteroviruses (EVs). Escherichia coli D, Prevotella sp900557255, CAG-180 sp000432435, Phocaeicola plebeius A, Teseptimavirus, Svunavirus, Felixounavirus, and Jiaodavirus were the characteristic bacteria and viruses of the 4 subtypes. The CatBoost model results indicated that prediction accuracy improved after incorporating subtypes: the accuracy on the discovery sets was 100%, 96.34%, 100%, and 98.46% in the 4 subtypes, respectively.
CONCLUSIONS: Prevotella sp900557255 and Felixounavirus are highly valuable as early-warning markers of AA. As promising non-invasive biomarkers, gut microbes are potential diagnostic targets for AA, and the accuracy of predicting AA can be improved by subtyping.
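To make the subtype-stratified accuracy figures concrete, here is a minimal sketch of how per-subtype accuracy could be tabulated. It assumes, based only on the labels B1V1 through B2V2, that each subtype combines a bacterial cluster and a viral cluster; the abstract does not describe the actual procedure. The helper names and toy labels are hypothetical, and no classifier is trained here.

```python
from collections import defaultdict


def combine_subtype(bacterial_cluster, viral_cluster):
    """Build a label such as 'B1V2' from two cluster indices (assumed scheme)."""
    return f"B{bacterial_cluster}V{viral_cluster}"


def accuracy_by_subtype(subtypes, y_true, y_pred):
    """Return the fraction of correct predictions within each subtype."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for subtype, truth, pred in zip(subtypes, y_true, y_pred):
        totals[subtype] += 1
        hits[subtype] += int(truth == pred)
    return {subtype: hits[subtype] / totals[subtype] for subtype in totals}


# Hypothetical toy data: (bacterial cluster, viral cluster) per sample,
# true labels (1 = AA, 0 = NC), and predictions from some model.
clusters = [(1, 1), (1, 1), (1, 2), (2, 1), (2, 2), (2, 2)]
y_true = [1, 0, 1, 0, 1, 0]
y_pred = [1, 1, 1, 0, 1, 0]

subtypes = [combine_subtype(b, v) for b, v in clusters]
per_subtype = accuracy_by_subtype(subtypes, y_true, y_pred)
```

Reporting accuracy within each subtype, rather than pooled, is what allows the four values in the results to be compared against the unstratified model.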