disease prediction

  • Article type: Journal Article
    Graph neural networks (GNNs) have gained significant attention in disease prediction, where the latent embeddings of patients are modeled as nodes and the similarities among patients are represented by edges. The graph structure, which determines how information is aggregated and propagated, plays a crucial role in graph learning. Recent approaches typically create graphs based on patients' latent embeddings, which may not accurately reflect their real-world closeness. Our analysis reveals that raw data, such as demographic attributes and laboratory results, offers a wealth of information for assessing patient similarity and can complement graphs constructed exclusively from latent embeddings. In this study, we first construct adaptive graphs separately from latent representations and from raw data, and then merge these graphs via weighted summation. Because the graphs may contain extraneous and noisy connections, we apply degree-sensitive edge pruning and kNN sparsification to selectively sparsify and prune these edges. We conducted extensive experiments on two diagnostic prediction datasets, and the results demonstrate that our proposed method surpasses current state-of-the-art techniques.
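The merge-then-sparsify idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: cosine similarity stands in for their learned adaptive graphs, the degree-sensitive pruning step is omitted, and the names `cosine_similarity_matrix`, `knn_sparsify`, `merge_graphs`, `alpha`, and `k` are all hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(x):
    """Pairwise cosine similarity between the rows of x (one row per patient)."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x_unit = x / np.clip(norms, 1e-12, None)  # guard against zero-length rows
    return x_unit @ x_unit.T


def knn_sparsify(adj, k):
    """Keep each node's k strongest edges (no self-loops), then symmetrize."""
    n = adj.shape[0]
    scores = adj.copy()
    np.fill_diagonal(scores, -np.inf)  # a node never selects itself
    top_k = np.argsort(-scores, axis=1)[:, :k]  # indices of the k largest scores
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], top_k] = True
    mask |= mask.T  # keep an edge if either endpoint selected it
    return np.where(mask, adj, 0.0)


def merge_graphs(latent, raw, alpha=0.5, k=2):
    """Weighted sum of a latent-embedding graph and a raw-data graph, then kNN sparsification.

    alpha is an assumed mixing weight; the paper does not state how it is chosen.
    """
    merged = alpha * cosine_similarity_matrix(latent) + (1.0 - alpha) * cosine_similarity_matrix(raw)
    return knn_sparsify(merged, k)
```

Calling `merge_graphs(latent, raw, alpha=0.7, k=2)` on two feature matrices with one row per patient returns a symmetric adjacency matrix with a zero diagonal, in which every patient keeps at least its `k` strongest neighbours.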

  • Article type: Journal Article
    BACKGROUND: Early diagnosis of oral cancer is essential to reduce both morbidity and mortality. This study explores the use of uncertainty estimation in deep learning for early oral cancer diagnosis.
    METHODS: We develop a Bayesian deep learning model termed 'Probabilistic HRNet', which applies the ensemble MC dropout method to HRNet. Additionally, two oral lesion datasets with distinct distributions are created. We conduct a retrospective study to assess the predictive performance and uncertainty of Probabilistic HRNet across these datasets.
    RESULTS: Probabilistic HRNet performs best on the In-domain test set, achieving an F1 score of 95.3% and an AUC of 96.9% after excluding the 30% of samples with the highest uncertainty. On the Domain-shift test set, the results show an F1 score of 64.9% and an AUC of 80.3%. After excluding the 30% of samples with the highest uncertainty, these metrics improve to an F1 score of 74.4% and an AUC of 85.6%.
    CONCLUSIONS: Redirecting samples with high uncertainty to experts for subsequent diagnosis significantly decreases the rates of misdiagnosis, which highlights that uncertainty estimation is vital to ensure safe decision making for computer-aided early oral cancer diagnosis.
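As a rough illustration of the uncertainty-based referral described in this abstract, the sketch below averages class probabilities over several stochastic forward passes (as MC dropout would produce) and sets aside the most uncertain fraction of samples for expert review. It assumes predictive entropy as the uncertainty score, which the abstract does not specify, and the function names and the `reject_fraction` parameter are hypothetical.

```python
import numpy as np


def predictive_entropy(mc_probs):
    """mc_probs has shape (passes, samples, classes), one slice per stochastic forward pass."""
    mean_probs = mc_probs.mean(axis=0)  # average prediction per sample
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)
    return mean_probs, entropy


def reject_most_uncertain(mc_probs, reject_fraction=0.3):
    """Split sample indices into those kept for automatic diagnosis and those referred to experts."""
    mean_probs, entropy = predictive_entropy(mc_probs)
    n_samples = entropy.shape[0]
    n_reject = int(round(reject_fraction * n_samples))
    order = np.argsort(entropy, kind="stable")  # ascending: most confident first
    kept = np.sort(order[: n_samples - n_reject])
    referred = np.sort(order[n_samples - n_reject:])
    predictions = mean_probs.argmax(axis=1)
    return kept, referred, predictions
```

Metrics such as F1 and AUC would then be computed on the `kept` indices only, which is the sense in which the reported scores improve once the high-uncertainty samples are excluded.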

  • Article type: Journal Article
    Graph neural networks (GNNs) have recently grown in popularity for disease prediction. Existing GNN-based methods primarily build the graph topology around a single modality and combine it with other modalities to obtain feature representations of acquisitions. The complicated relationships within each modality, however, may not be well captured because of each modality's specificity. Further, relatively shallow networks restrict adequate extraction of high-level features, which affects disease prediction performance. Accordingly, this paper develops a new interactive deep cascade spectral graph convolutional network with multi-relational graphs (IDCGN) for disease prediction tasks. Its two key components are the construction of multiple relational graphs and dual cascade spectral graph convolution branches with interaction (DCSGBI). Specifically, the former devises two learnable networks, a pairwise imaging-based edge generator and a pairwise non-imaging-based edge generator, for the different modalities; these adaptively capture graph structures and provide multiple views of the same acquisition to aid disease diagnosis. In addition, DCSGBI is established to enrich both the high-level semantic information and the low-level details of disease data. It devises a cascade spectral graph convolution operator for each branch and incorporates an interaction strategy between the branches into the network, forming a deep model that captures complementary information from the different branches. In this manner, more favorable and sufficient features are learned for a reliable diagnosis. Experiments on several disease datasets show that IDCGN exceeds state-of-the-art models and achieves promising results.

  • Article type: Journal Article
    BACKGROUND: Integrating multi-omics data is emerging as a critical approach in enhancing our understanding of complex diseases. Innovative computational methods capable of managing high-dimensional and heterogeneous datasets are required to unlock the full potential of such rich and diverse data.
    METHODS: We propose a Multi-Omics integration framework with auxiliary Classifiers-enhanced AuToencoders (MOCAT) to utilize intra- and inter-omics information comprehensively. Additionally, attention mechanisms with confidence learning are incorporated for enhanced feature representation and trustworthy prediction.
    RESULTS: Extensive experiments were conducted on four benchmark datasets to evaluate the effectiveness of our proposed model, including BRCA, ROSMAP, LGG, and KIPAN. Our model significantly improved most evaluation measurements and consistently surpassed the state-of-the-art methods. Ablation studies showed that the auxiliary classifiers significantly boosted classification accuracy in the ROSMAP and LGG datasets. Moreover, the attention mechanisms and confidence evaluation block contributed to improvements in the predictive accuracy and generalizability of our model.
    CONCLUSIONS: The proposed framework exhibits superior performance in disease classification and biomarker discovery, establishing itself as a robust and versatile tool for analyzing multi-layer biological data. This study highlights the significance of elaborately designed deep learning methodologies in dissecting complex disease phenotypes and improving the accuracy of disease predictions.

  • Article type: Journal Article
    The prediction of disease can facilitate early intervention and comprehensive diagnosis and treatment, thereby benefiting healthcare and reducing medical costs. While single-class and multi-class learning methods have been applied to disease prediction, they are inadequate for distinguishing between primary and secondary diagnoses, which is crucial for treatment. In this paper, a label distribution is proposed to describe the diagnosis; it assigns each diagnosis a description degree that quantifies it. Additionally, a novel hierarchical label distribution learning (HLDL) model is proposed to make fine-grained predictions based on the hierarchical classification of diseases, taking into account the relationships among diseases. The experimental results on real-world datasets demonstrate that the HLDL model outperforms the baselines with statistical significance.

  • Article type: Journal Article
    BACKGROUND: Muscle weakness is a prominent feature of Parkinson's disease (PD), but whether the occurrence of this deficit in healthy adults is associated with subsequent PD diagnosis remains unclear.
    OBJECTIVE: This study sought to examine the relationship between muscle strength, represented by grip strength and walking pace, and the risk of incident PD.
    METHODS: A total of 422,531 participants from the UK Biobank were included in this study. Longitudinal associations of grip strength and walking pace with the risk of incident PD were investigated using Cox proportional hazards models adjusted for several well-established risk factors. Subgroup and sensitivity analyses were also conducted for further validation.
    RESULTS: After a median follow-up of 9.23 years, 2,118 (0.5%) individuals developed incident PD. Each 5 kg increment in absolute grip strength was associated with a significant 10.2% reduction in the risk of incident PD (HR = 0.898, 95% CI [0.872-0.924], P < 0.001). Similarly, each 0.05 kg/kg increment in relative grip strength was associated with a 9.2% reduction in the risk of incident PD (HR = 0.908, 95% CI [0.887-0.929], P < 0.001). Notably, the associations remained consistent when grip strength was categorized into quintiles. Moreover, participants with a slower walking pace demonstrated an elevated risk of incident PD (HR = 1.231, 95% CI [1.075-1.409], P = 0.003). Subgroup and sensitivity analyses further validated the robustness of the observed associations.
    CONCLUSIONS: Our findings showed a negative association of grip strength and walking pace with the risk of incident PD, independent of important confounding factors. These results hold potential implications for the early screening of people at high risk of PD.
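The percentage changes quoted in this abstract follow directly from the hazard ratios, since the relative change in risk is (HR - 1) x 100%. The short sketch below reproduces that arithmetic. The rescaling helper is an added illustration that assumes the log-linear dose-response of a Cox model with a continuous covariate; the per-10-kg figure it yields is not reported in the abstract, and both function names are hypothetical.

```python
def percent_risk_change(hazard_ratio: float) -> float:
    """Relative change in hazard, in percent; negative values mean reduced risk."""
    return (hazard_ratio - 1.0) * 100.0


def rescale_hazard_ratio(hazard_ratio: float, original_step: float, new_step: float) -> float:
    """Hazard ratio for a different increment, assuming a log-linear effect (HR = exp(beta * step))."""
    return hazard_ratio ** (new_step / original_step)


# Hazard ratios quoted in the abstract.
grip_absolute = percent_risk_change(0.898)  # per 5 kg of absolute grip strength
grip_relative = percent_risk_change(0.908)  # per 0.05 kg/kg of relative grip strength
slow_walking = percent_risk_change(1.231)   # slower walking pace

# Illustrative only: the implied hazard ratio per 10 kg under the log-linear assumption.
grip_per_10kg = rescale_hazard_ratio(0.898, original_step=5.0, new_step=10.0)
```

Here `grip_absolute` and `grip_relative` come out at about -10.2 and -9.2, matching the reported reductions, while `slow_walking` comes out at about +23.1, that is, a roughly 23% higher hazard for participants with a slower walking pace.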

  • Article type: Journal Article
    Background: A growing number of studies in the last decade have led to the widespread understanding that C4d, a split product of complement component 4 (C4), is a potential biomarker for systemic lupus erythematosus (SLE) and lupus nephritis (LN).
    Purpose: The aim of this review is to summarize the highlights of studies investigating the use of C4d as a biomarker for diagnosing and monitoring SLE and LN patients.
    Data collection: We searched the PubMed/Medline and Wanfang databases using the terms "C4d and systemic lupus erythematosus", "C4d and lupus nephritis", and "Complement C4d".
    Results: The deposition of C4d on circulating blood cells has been shown in several clinical studies to be a potential diagnostic marker that can be used to monitor patients with SLE. In addition, C4d deposits on circulating blood cells may be a helpful diagnostic marker for LN, one of the most severe complications of SLE. Meanwhile, studies utilizing renal biopsy specimens have indicated that C4d deposition in the renal peritubular capillaries of LN patients may predict more severe LN or a worse patient prognosis. Generally, a high plasma C4d level and a high plasma C4d/C4 ratio may also be promising indicators that can be used to monitor patients with SLE and LN.
    Conclusions: C4d detection may be a novel strategy for further clinical prediction and therapy.

  • Article type: Journal Article
    Parkinson's disease (PD) is one of the most common neurodegenerative diseases, characterized by α-synuclein accumulation and degeneration of dopaminergic neurons. Employing genome-wide sequencing, we identified a polymorphic USP8 allele (USP8D442G) significantly enriched in Chinese PD patients. To test the involvement of this polymorphism in PD pathogenesis, we derived dopaminergic neurons (DAn) from human induced pluripotent stem cells (hiPSCs) reprogrammed from fibroblasts of PD patients harboring the USP8D442G allele and of their healthy siblings. In addition, we knocked the D442G polymorphic site into the endogenous USP8 gene of human embryonic stem cells (hESCs) and derived DAn from these knock-in hESCs to explore their cellular phenotypes and molecular mechanisms. We found that expression of USP8D442G in DAn induces the accumulation and abnormal subcellular localization of α-synuclein (α-Syn). Mechanistically, we demonstrate that the D442G polymorphism enhances the interaction between α-Syn and USP8 and thus increases the K63-specific deubiquitination and stability of α-Syn. We thus identify a pathogenic polymorphism for PD that represents a promising therapeutic and diagnostic target.

  • Article type: Journal Article
    Patient representation learning aims to encode meaningful information from a patient's electronic health records (EHR) in the form of a mathematical representation. Recent advances in deep learning have given patient representation learning methods greater representational power, allowing the learned representations to significantly improve the performance of disease prediction models. However, the inherent shortcomings of deep learning models, such as the need for massive amounts of labeled data and their inexplicability, limit further improvement of deep learning-based patient representation learning methods. In particular, learning robust patient representations is challenging when patient data is missing or insufficient. Although data augmentation techniques can tackle this deficiency, the complex data processing further weakens the explainability of patient representation learning models. To address these challenges, this paper proposes Explainable and Augmented Patient Representation Learning for disease prediction (EAPR). EAPR uses data augmentation controlled by a confidence interval to enhance patient representations when patient data is limited. Moreover, EAPR uses two-stage gradient backpropagation to address the loss of explainability caused by the complex data augmentation process. The experimental results on real clinical data validate the effectiveness and explainability of the proposed approach.

  • Article type: Journal Article
    The gut microbiome has been regarded as one of the fundamental determinants regulating human health, and multi-omics data profiling has been increasingly utilized to bolster the deep understanding of this complex system. However, stemming from cost or other constraints, the integration of multi-omics often suffers from incomplete views, which poses a great challenge for the comprehensive analysis. In this work, a novel deep model named Incomplete Multi-Omics Variational Neural Networks (IMOVNN) is proposed for incomplete data integration, disease prediction application and biomarker identification. Benefiting from the information bottleneck and the marginal-to-joint distribution integration mechanism, the IMOVNN can learn the marginal latent representation of each individual omics and the joint latent representation for better disease prediction. Moreover, owing to the feature-selective layer predicated upon the concrete distribution, the model is interpretable and can identify the most relevant features. Experiments on inflammatory bowel disease multi-omics datasets demonstrate that our method outperforms several state-of-the-art methods for disease prediction. In addition, IMOVNN has identified significant biomarkers from multi-omics data sources.