Ensemble learning

  • Article type: Journal Article
    Breast cancer's high mortality rate is often linked to late diagnosis, with mammograms serving as key but sometimes limited tools for early detection. To enhance diagnostic accuracy and speed, this study introduces a novel computer-aided detection (CAD) ensemble system. The system incorporates advanced deep learning networks (EfficientNet, Xception, MobileNetV2, InceptionV3, and ResNet50) integrated via our innovative consensus-adaptive weighting (CAW) method. This method permits the dynamic adjustment of multiple deep networks, bolstering the system's detection capabilities. Our approach also addresses a major challenge in pixel-level data annotation of Faster R-CNNs, highlighted in a prominent previous study. Evaluations on various datasets, including the cropped DDSM (Digital Database for Screening Mammography), DDSM, and INbreast, demonstrated the system's superior performance. In particular, our CAD system showed marked improvement on the cropped DDSM dataset, enhancing detection rates by approximately 1.59% and achieving an accuracy of 95.48%. This innovative system represents a significant advancement in early breast cancer detection, offering the potential for more precise and timely diagnosis, ultimately fostering improved patient outcomes.
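
The abstract does not say how CAW computes or adapts its weights, so the sketch below is not the authors' method. It only illustrates the general idea of combining several networks' class probabilities with per-model weights (plain weighted soft voting). The function name `weighted_soft_vote`, the toy probabilities, and the use of validation accuracies as weights are all assumptions made for illustration.

```python
import numpy as np


def weighted_soft_vote(probs, weights):
    """Combine per-model class probabilities using normalized weights.

    probs:   array-like of shape (n_models, n_samples, n_classes)
    weights: array-like of shape (n_models,), non-negative, not all zero
    Returns (combined probabilities, predicted class index per sample).
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Weighted average over the model axis -> shape (n_samples, n_classes)
    combined = np.tensordot(weights, probs, axes=1)
    return combined, combined.argmax(axis=1)


# Toy example: 3 hypothetical models, 2 samples, 2 classes.
toy_probs = [
    [[0.9, 0.1], [0.4, 0.6]],
    [[0.6, 0.4], [0.3, 0.7]],
    [[0.2, 0.8], [0.8, 0.2]],
]
# Hypothetical validation accuracies used as (unnormalized) weights.
toy_weights = [0.95, 0.90, 0.60]

combined, labels = weighted_soft_vote(toy_probs, toy_weights)
print(labels)
```

An adaptive scheme would update `weights` as evidence about each network changes; how CAW does this is not described in the abstract.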

  • Article type: Journal Article
    BACKGROUND: Cluster analysis is an integral part of precision medicine and systems biology, used to define groups of patients or biomolecules. Consensus clustering is an ensemble approach that is widely used in these areas, which combines the output from multiple runs of a non-deterministic clustering algorithm. Here we consider the application of consensus clustering to a broad class of heuristic clustering algorithms that can be derived from Bayesian mixture models (and extensions thereof) by adopting an early stopping criterion when performing sampling-based inference for these models. While the resulting approach is non-Bayesian, it inherits the usual benefits of consensus clustering, particularly in terms of computational scalability and providing assessments of clustering stability/robustness.
    RESULTS: In simulation studies, we show that our approach can successfully uncover the target clustering structure, while also exploring different plausible clusterings of the data. We show that, when a parallel computation environment is available, our approach offers significant reductions in runtime compared to performing sampling-based Bayesian inference for the underlying model, while retaining many of the practical benefits of the Bayesian approach, such as exploring different numbers of clusters. We propose a heuristic to decide upon ensemble size and the early stopping criterion, and then apply consensus clustering to a clustering algorithm derived from a Bayesian integrative clustering method. We use the resulting approach to perform an integrative analysis of three 'omics datasets for budding yeast and find clusters of co-expressed genes with shared regulatory proteins. We validate these clusters using data external to the analysis.
    CONCLUSIONS: Our approach can be used as a wrapper for essentially any existing sampling-based Bayesian clustering implementation, and enables meaningful clustering analyses to be performed using such implementations, even when computational Bayesian inference is not feasible, e.g. due to poor exploration of the target density (often as a result of increasing numbers of features) or a limited computational budget that does not allow sufficient samples to be drawn from a single chain. This enables researchers to straightforwardly extend the applicability of existing software to much larger datasets, including implementations of sophisticated models such as those that jointly model multiple datasets.
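
The core step of consensus clustering, combining the outputs of many runs of a non-deterministic algorithm, can be sketched as a consensus matrix: for each pair of items, the fraction of runs that placed them in the same cluster. This is a minimal illustration, not the authors' implementation. The function name `consensus_matrix` and the toy labelings are assumptions. In the setting described above, each row would be the final partition from one early-stopped sampler chain.

```python
import numpy as np


def consensus_matrix(labelings):
    """Fraction of runs in which each pair of items shares a cluster.

    labelings: array-like of shape (n_runs, n_items). Labels do not need
    to be aligned across runs; only co-membership within a run matters.
    """
    labelings = np.asarray(labelings)
    n_runs, n_items = labelings.shape
    total = np.zeros((n_items, n_items))
    for labels in labelings:
        # Boolean (n_items, n_items) indicator: same cluster in this run?
        total += labels[:, None] == labels[None, :]
    return total / n_runs


# Toy example: 3 hypothetical runs over 4 items.
toy_runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same partition as run 1 with the labels swapped
    [0, 0, 0, 1],
]
cm = consensus_matrix(toy_runs)
print(cm)
```

Entries near 0 or 1 indicate pairs that the runs agree on, while intermediate values flag unstable assignments, which is one way the ensemble provides an assessment of clustering stability.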