Data Mining

  • Article type: Journal Article
    This study conducted a pharmacovigilance analysis based on the FDA Adverse Event Reporting System (FAERS) database to compare the infection risk of inhaled or nasal beclomethasone, fluticasone, budesonide, ciclesonide, mometasone, and triamcinolone acetonide.
    We used disproportionality analysis to evaluate the association between inhaled corticosteroids (ICS) or intranasal corticosteroids (INCs) and infection events. The data were extracted from the FAERS database for April 2015 to September 2023. We further analyzed the clinical characteristics, sites of infection, and pathogens of the infection adverse events (AEs) reported with ICS and INCs. Bubble charts were used to display the top 5 infection AEs for each drug.
    We analyzed 21,837 reports of infection AEs related to ICS and INCs, with an average patient age of 62.12 years. Of these reports, 61.14% concerned females. One-third of the infections reported with fluticasone, budesonide, ciclesonide, and mometasone occurred in the lower respiratory tract; over 40% of the infections reported with triamcinolone acetonide were eye infections; and the rate of oral infections reported with beclomethasone was 7.39%. The reported rates of fungal and viral infections with beclomethasone were 21.15% and 19.2%, respectively. Mycobacterial infections accounted for 3.29% and 2.03% of the infections reported with budesonide and ciclesonide, respectively. Bubble plots showed that the ICS group had more fungal infections, oral infections, pneumonia, tracheitis, etc., whereas the INCs group had more eye symptoms, rhinitis, sinusitis, nasopharyngitis, etc.
    Women who use ICS and INCs are more prone to infection events. Compared with budesonide, fluticasone seemed to carry a higher risk of pneumonia and oral candidiasis. Mometasone might lead to more upper respiratory tract infections. The risk of oral infection was higher with beclomethasone. Beclomethasone was associated with more fungal and viral infections, while ciclesonide and budesonide were more often associated with mycobacterial infections.
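As an illustration of disproportionality analysis, the sketch below computes a reporting odds ratio (ROR) with a 95% confidence interval from a 2x2 table of report counts. The abstract does not say which disproportionality measure the authors used, so the choice of ROR is an assumption, and the counts in the example are invented rather than taken from FAERS.

```python
import math


def reporting_odds_ratio(a, b, c, d):
    """Return (ROR, lower, upper) for a 2x2 table of spontaneous reports.

    a: reports with the target drug and the target event
    b: reports with the target drug and any other event
    c: reports with all other drugs and the target event
    d: reports with all other drugs and any other event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR), used for the Wald-type 95% interval.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - 1.96 * se)
    upper = math.exp(math.log(ror) + 1.96 * se)
    return ror, lower, upper


# Invented counts, for illustration only.
ror, lower, upper = reporting_odds_ratio(a=40, b=160, c=100, d=1600)
print(f"ROR = {ror:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

A signal is conventionally flagged when the lower bound of the interval exceeds 1, usually with a minimum number of reports as well. The exact thresholds vary between studies and are not given in the abstract.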

  • Article type: Journal Article
    Appendicitis is an inflammation caused by obstruction of the appendiceal lumen or interruption of the blood supply, leading to appendiceal necrosis followed by secondary bacterial infection. The relationship between the TYROBP gene and the nursing of appendicitis remains unclear. The appendicitis dataset GSE9579, generated on the GPL571 platform, was downloaded from the Gene Expression Omnibus database. Differentially expressed genes were screened, followed by weighted gene co-expression network analysis, functional enrichment analysis, gene set enrichment analysis, construction and analysis of a protein-protein interaction network, Comparative Toxicogenomics Database analysis, and immune infiltration analysis. Heatmaps of gene expression levels were plotted. A total of 1570 differentially expressed genes were identified. According to Gene Ontology analysis, they were mainly enriched in organic acid metabolic process, condensed chromosome kinetochore, and oxidoreductase activity. In Kyoto Encyclopedia of Genes and Genomes analysis, they were mainly concentrated in metabolic pathways, the p53 signaling pathway, and the PPAR signaling pathway. The soft threshold power in weighted gene co-expression network analysis was set to 12. Through the construction and analysis of the protein-protein interaction network, 5 core genes (FCGR2A, IL1B, ITGAM, TLR2, TYROBP) were obtained. The heatmap of core gene expression levels revealed high expression of TYROBP in appendicitis samples. Comparative Toxicogenomics Database analysis found that the core genes (FCGR2A, IL1B, ITGAM, TLR2, TYROBP) were closely related to abdominal pain, gastrointestinal dysfunction, fever, and the occurrence of inflammation. The TYROBP gene is highly expressed in appendicitis, and higher TYROBP expression indicates a worse prognosis. TYROBP may serve as a molecular target for appendicitis and its nursing.
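To show what a soft threshold power of 12 does in weighted gene co-expression network analysis, the sketch below builds an unsigned adjacency by raising the absolute Pearson correlation between two expression profiles to that power. It is a minimal pure-Python illustration, not the authors' pipeline; the gene labels and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def soft_threshold_adjacency(expression, power=12):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** power for every gene pair."""
    genes = list(expression)
    return {
        (g, h): abs(pearson(expression[g], expression[h])) ** power
        for g in genes
        for h in genes
        if g != h
    }


# Hypothetical expression profiles across five samples.
expression = {
    "gene_a": [1, 2, 3, 4, 5],
    "gene_b": [2, 4, 6, 8, 10],  # perfectly correlated with gene_a
    "gene_c": [5, 1, 4, 2, 3],   # weakly anti-correlated with gene_a
}
adjacency = soft_threshold_adjacency(expression, power=12)
print(adjacency[("gene_a", "gene_b")], adjacency[("gene_a", "gene_c")])
```

Raising correlations to a high power keeps strong co-expression links close to 1 and pushes weak ones toward 0, which is why the power is tuned per dataset.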

  • Article type: Journal Article
    Video action recognition based on skeleton nodes is a prominent problem in computer vision. In real application scenarios, the large number of skeleton nodes and occlusion between individuals seriously degrade recognition speed and accuracy. We therefore propose a lightweight multi-stream feature cross-fusion (L-MSFCF) model to recognize abnormal behaviors such as fighting, vicious kicking, and climbing over walls. The model improves recognition speed through lightweight skeleton node calculation and improves recognition accuracy through prediction of occluded skeleton nodes, which addresses the behavior occlusion problem. Experiments show that our All-MSFCF model achieves an average video action recognition accuracy of 92.7% across eight kinds of abnormal behavior. Although the lightweight L-MSFCF model has an average accuracy of 87.3%, its average recognition speed is 62.7% higher than that of the full-skeleton recognition model, making it more suitable for real-time tracking. Moreover, our Trajectory Prediction Tracking (TPT) model can predict moving positions in real time based on dynamically selected core skeleton nodes, with lower average loss errors for short-term prediction within 15 and 30 frames.

  • Article type: Journal Article
    A regular tetrahedron model was established to probe the fractionation of dissolved organic matter (DOM) among quaternary components using high-resolution mass spectrometry. The model stereoscopically visualizes the molecular formulas of DOM, showing each formula's preference for a component by its position in a regular tetrahedron. A classification method was subsequently developed to divide molecular formulas into 15 categories related to fractionation ratios, whose relative change was shown to be consistent with the uncertainty of mass peak area. The practicality of the regular tetrahedron model was verified on seven kinds of sludge from waste leachate treatment and sewage treatment plants, using stratification of extracellular polymeric substances coupled with Orbitrap MS as an example, which revealed the DOM chemodiversity in stratified sludge flocs. Sensitivity analysis showed that the classification results were relatively stable under perturbation of the four model parameters. Multinomial logistic regression analysis could further help identify the effect of molecular properties on the fractionation of DOM based on the classification results of the model. The model offers a methodology for assessing the specificity of sequential extraction of DOM from solid or semisolid components and simplifies the complex mathematical expression of fractionation coefficients for quaternary components.
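One plausible way to place a molecular formula inside a regular tetrahedron is to treat its four peak-area fractions as barycentric coordinates. The abstract does not give the exact mapping, so the sketch below is an assumption for illustration: the vertex coordinates and the function name `tetrahedron_position` are hypothetical choices, not the published model.

```python
# Vertices of a regular tetrahedron (all edges have length sqrt(8)),
# one vertex per component of the quaternary system.
VERTICES = [
    (1.0, 1.0, 1.0),
    (1.0, -1.0, -1.0),
    (-1.0, 1.0, -1.0),
    (-1.0, -1.0, 1.0),
]


def tetrahedron_position(peak_areas):
    """Map four non-negative peak areas to a 3D point via barycentric weights."""
    if len(peak_areas) != 4:
        raise ValueError("expected exactly four peak areas")
    if any(area < 0 for area in peak_areas):
        raise ValueError("peak areas must be non-negative")
    total = sum(peak_areas)
    if total <= 0:
        raise ValueError("at least one peak area must be positive")
    fractions = [area / total for area in peak_areas]
    return tuple(
        sum(f * vertex[axis] for f, vertex in zip(fractions, VERTICES))
        for axis in range(3)
    )


# A formula found equally in all four components sits at the centroid;
# one found in a single component sits on that component's vertex.
print(tetrahedron_position([5, 5, 5, 5]))
print(tetrahedron_position([3, 0, 0, 0]))
```

Under this mapping, points near a vertex indicate a strong preference for one component, points on an edge or face indicate sharing among two or three components, and points near the centroid indicate no preference.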

  • Article type: Journal Article
    OBJECTIVE: Keratoconus (KC) is a condition characterized by progressive corneal steepening and thinning. However, its pathophysiological mechanism remains unclear. We mainly performed literature mining to extract bioinformatic and related data on KC at the RNA level. The objective of this study was to explore the potential pathological mechanisms of KC by identifying hub genes and key molecular pathways at the RNA level.
    METHODS: We performed an exhaustive search of the PubMed database and identified studies of gene transcripts derived from different corneal layers in patients with KC. The differentially expressed genes identified were intersected, and the overlapping genes were extracted for further analyses. Significantly enriched genes were screened using "Gene Ontology" (GO) and "Kyoto Encyclopedia of Genes and Genomes" (KEGG) analyses with the "Database for Annotation, Visualization, and Integrated Discovery" (DAVID). A protein-protein interaction (PPI) network was constructed for the significantly enriched genes using the STRING database. The PPI network was visualized using the Cytoscape software, and hub genes were screened by their betweenness centrality values. Pathways that play a critical role in the pathophysiology of KC were identified using GO and KEGG analyses of the hub genes.
    RESULTS: Sixty-eight overlapping genes were obtained. Fifty genes were significantly enriched in 67 biological processes, and 16 genes were identified in 7 KEGG pathways. Moreover, 14 nodes and 32 edges were identified in the PPI network constructed using the STRING database. Multiple analyses identified 4 hub genes, 12 enriched biological processes, and 6 KEGG pathways. GO enrichment analysis showed that the hub genes are mainly involved in the positive regulation of apoptotic process, and KEGG analysis showed that the hub genes are primarily associated with the interleukin-17 (IL-17) and tumor necrosis factor (TNF) pathways. Overall, matrix metalloproteinase 9, IL-6, estrogen receptor 1, and prostaglandin-endoperoxide synthase 2 were the potentially important genes associated with KC.
    CONCLUSIONS: Four genes, matrix metalloproteinase 9, IL-6, estrogen receptor 1, and prostaglandin-endoperoxide synthase 2, as well as the IL-17 and TNF pathways, are critical in the development of KC. Inflammation and apoptosis may contribute to the pathogenesis of KC.
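Screening hub genes by betweenness centrality can be sketched with Brandes' algorithm on an unweighted, undirected graph. The authors used Cytoscape, so this pure-Python version only illustrates the measure and is not their workflow. The node labels `P1` to `P5` and the edges between them are hypothetical, not interactions from the study.

```python
from collections import deque


def betweenness_centrality(graph):
    """Unnormalized betweenness for an undirected graph {node: set_of_neighbors}."""
    centrality = {v: 0.0 for v in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from the source.
        order = []
        predecessors = {v: [] for v in graph}
        n_paths = {v: 0 for v in graph}
        distance = {v: -1 for v in graph}
        n_paths[source] = 1
        distance[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if distance[w] < 0:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate pair dependencies, farthest nodes first.
        dependency = {v: 0.0 for v in graph}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each unordered pair was counted from both endpoints.
    return {v: value / 2.0 for v, value in centrality.items()}


# Hypothetical toy network: P1 links P2, P3 and P4; P4 also links P5.
toy_network = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1"},
    "P3": {"P1"},
    "P4": {"P1", "P5"},
    "P5": {"P4"},
}
scores = betweenness_centrality(toy_network)
for node, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(node, score)
```

Nodes with the highest scores lie on the most shortest paths between other nodes, which is the property a betweenness-based hub screen relies on.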

  • Article type: Journal Article
    Engineering change (EC) risk may negatively affect project schedule, cost, quality, and stakeholder satisfaction. However, existing methods for managing EC risk have shortcomings in evidence selection and do not adequately consider the quality and reliability of the evidence associated with EC risks. Evidence grading plays a crucial role in ensuring the reliability of decisions related to EC risks and can provide essential scientific support for decision-making. To explore the potential risks associated with architectural engineering changes (ECs) and identify the most significant ones, this study proposes a methodology that combines evidence grading theory with Latent Dirichlet Allocation (LDA) topic analysis. First, evidence grading theory was used to create a grading table for evidence sources related to EC risk. Specifically, we categorized the evidence sources into three levels based on their credibility. We then selected the evidence with higher credibility levels for textual analysis using the LDA topic model. This involved analyzing regulations, industry standards, and judgment documents related to EC, ultimately identifying the themes associated with EC risks. In addition, by combining the EC risk topics with relevant literature, we identified the factors influencing EC risks. Finally, we designed an expert survey questionnaire to determine the key risks and the most important risk topics. The results show that synthesizing information from Class A and Class B evidence yields five prominent risk themes, namely contract, technology, funds, personnel, and other hazards. Among them, technical risk scored highest, indicating that it is the most important, and the key risks are engineering design defects, errors, and omissions.

  • Article type: Journal Article
    Green design involves the entire life cycle of a product, including stages such as raw material acquisition, production and manufacturing, sales and transportation, use, recycling, and disposal. Extracting customer requirements (CRs) related to product green design (PGD) is one of the necessary conditions for achieving the dual carbon goal. However, only a few studies have evaluated CRs for PGD from a full life cycle perspective. This study obtained 20,000 online reviews of washing machines from e-commerce platforms. The customers' sentiment tendencies toward the requirements of washing machines at various stages of their life cycle were analyzed and evaluated. The CRs contained in the online washing machine reviews were identified through cluster analysis. Based on life cycle theory, the product green design requirements (PGDRs) were extracted from the CRs and analyzed. This study can provide theoretical and methodological support for green product design.

  • Article type: Journal Article
    China's "Belt and Road Initiative" (BRI) is a top-level initiative for cooperation among countries, proposed by China, which has promoted China's cooperation with relevant countries in various aspects and fields. Research reports from think tanks and experts that evaluate, analyze, and draw conclusions about the BRI can reflect the stances, opinions, and demands of other countries regarding the initiative. This paper takes the BRI reports of important think tanks listed in the "Global Go To Think Tank Index Report 2020" as its research subject and analyzes the key points and development trends of foreign think tank research on the BRI using text mining, topic evolution, and social network analysis. It provides suggestions and ideas for China to advance the construction of the BRI and deepen related cooperation. The research shows that the themes of think tank reports on the BRI are mainly concentrated in the fields of politics, economy, and military. The research areas are relatively stable, and there is no strong trend of thematic evolution. The evolution paths are also mainly distributed across politics, economy, and military. The directions of thematic evolution have not expanded much over the years, and themes are strongly inherited. The connection between the research themes and the main purpose of the BRI is somewhat inadequate, indicating a certain limitation in the understanding of the BRI.

  • Article type: Journal Article
    Screening and prioritizing research on mixture systems that are frequently detected in the environment is of great significance, because conducting toxicity testing on all mixtures is impractical. Therefore, frequent itemset mining (FIM) was introduced and applied in this paper to identify variables that commonly co-occur in a dataset. Based on a dataset of quaternary ammonium compounds (QACs) in the water environment, four frequent QAC mixture systems with a detection rate ≥ 35% were found: [BDMM]⁺Cl⁻-[BTMM]⁺Cl⁻ (M1), [BDMM]⁺Cl⁻-[BHMM]⁺Cl⁻ (M2), [BTMM]⁺Cl⁻-[BHMM]⁺Cl⁻ (M3), and [BDMM]⁺Cl⁻-[BTMM]⁺Cl⁻-[BHMM]⁺Cl⁻ (M4). [BDMM]⁺Cl⁻, [BTMM]⁺Cl⁻, and [BHMM]⁺Cl⁻ are benzyl dodecyl dimethyl ammonium chloride, benzyl tetradecyl dimethyl ammonium chloride, and benzyl hexadecyl dimethyl ammonium chloride, respectively. The toxicity of the representative mixture rays and of the components of the four frequently detected mixture systems was then tested using Vibrio qinghaiensis sp.-Q67 (Q67) as a luminescent indicator organism at 0.25 and 12 h. The toxicity of the mixtures was predicted using concentration addition (CA) and independent action (IA) models. Both the components and the representative mixture rays of the four mixture systems exhibited obvious acute and chronic toxicity to Q67, with median effective concentrations (EC50) below 7 mg/L. Both the CA and IA models predicted the toxicity of the four mixture systems well. However, at 12 h the CA model predicted the toxicity of the M3 and M4 mixtures better than the IA model.
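Frequent itemset mining with a minimum detection rate can be sketched as a level-wise search over combinations of compounds. The abstract does not name the FIM algorithm used, so this brute-force version with Apriori-style early stopping is an assumption. The sample data are invented, and `QAC_X` is a hypothetical placeholder compound.

```python
from itertools import combinations


def frequent_itemsets(samples, min_support=0.35, min_size=2):
    """Return {itemset_tuple: support} for itemsets present in >= min_support of samples."""
    n_samples = len(samples)
    items = sorted(set().union(*samples))
    frequent = {}
    for size in range(min_size, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            # Support = fraction of samples that contain every item of the combo.
            support = sum(1 for s in samples if set(combo) <= s) / n_samples
            if support >= min_support:
                frequent[combo] = support
                found_any = True
        if not found_any:
            # Apriori property: supersets of infrequent itemsets cannot be frequent.
            break
    return frequent


# Invented detection records: each set lists the QACs found in one water sample.
samples = [
    {"BDMM", "BTMM", "BHMM"},
    {"BDMM", "BTMM", "BHMM"},
    {"BDMM", "BTMM"},
    {"BHMM"},
    {"BTMM", "QAC_X"},
]
for itemset, support in frequent_itemsets(samples).items():
    print(itemset, f"{support:.0%}")
```

For large datasets a dedicated algorithm such as Apriori or FP-growth avoids enumerating every combination, but the support threshold plays the same role as the 35% detection rate in the abstract.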

  • Article type: Journal Article
    BACKGROUND: Accurately predicting cancer driver genes is of great significance for research on carcinogenesis and cancer progression and for cancer treatment. In recent years, more and more deep-learning-based methods have been used to predict cancer driver genes. However, deep-learning algorithms often behave as black boxes and cannot explain their output. Here, we propose a novel cancer driver gene mining method based on heterogeneous network meta-paths (MCDHGN), which uses meta-path aggregation to enhance the interpretability of predictions.
    RESULTS: MCDHGN constructs a heterogeneous network from several types of multi-omics data that are biologically linked to genes. The differential probabilities of SNV, DNA methylation, and gene expression data between cancerous and normal tissues are extracted as the initial features of genes. Nine meta-paths are manually selected, and the representation vectors obtained by aggregating information within and across meta-path nodes are used as new features for the subsequent classification and prediction tasks. Compared with eight homogeneous and heterogeneous network models on two pan-cancer datasets, MCDHGN achieves better AUC and AUPR values. Additionally, MCDHGN provides interpretability for the predicted cancer driver genes through the varying weights of biologically meaningful meta-paths.
    AVAILABILITY: https://github.com/1160300611/MCDHGN.
    SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.