Autoencoder

  • Article type: Journal Article
    The growing use of interconnected devices in the Internet of Things (IoT) and Industrial IoT (IIoT) has significantly improved efficiency and utility in both personal and industrial settings, but it has also heightened cybersecurity vulnerabilities, particularly through IoT malware. This paper explores one-class classification, an unsupervised learning method that is especially suitable for unlabeled data, dynamic environments, and malware detection, itself a form of anomaly detection. We introduce the TF-IDF method for transforming nominal features into numerical formats that avoid information loss and manage dimensionality effectively, which is crucial for enhancing pattern recognition when combined with n-grams. Furthermore, we compare multi-class and one-class classification models, including Isolation Forest and a deep autoencoder, trained respectively on both benign and malicious NetFlow samples and exclusively on benign NetFlow samples. Using one-class classification, we achieve 100% recall with precision rates above 80% and 90% across various test datasets. These models show the adaptability of unsupervised learning, especially one-class classification, to the evolving malware threats in the IoT domain, offering insights into enhancing IoT security frameworks and suggesting directions for future research in this critical area.
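    The TF-IDF-over-n-grams encoding mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which n-gram unit or TF-IDF variant the authors use, so character bigrams, an unsmoothed IDF, and the sample field values are all hypothetical.

```python
import math
from collections import Counter


def char_ngrams(token, n=2):
    """Return the overlapping character n-grams of a string."""
    if len(token) < n:
        return [token]
    return [token[i:i + n] for i in range(len(token) - n + 1)]


def tfidf_vectors(records, n=2):
    """Map each nominal string to a sparse {n-gram: tf-idf weight} dict."""
    docs = [Counter(char_ngrams(r, n)) for r in records]
    num_docs = len(docs)
    # Document frequency: in how many records each n-gram occurs.
    df = Counter(g for doc in docs for g in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append({
            # Term frequency times unsmoothed inverse document frequency.
            g: (count / total) * math.log(num_docs / df[g])
            for g, count in doc.items()
        })
    return vectors


# Hypothetical nominal values, e.g. a protocol field of a flow record.
records = ["tcp", "tcp", "udp", "icmp"]
vectors = tfidf_vectors(records)
```

    N-grams that occur in few records (here those of "udp") receive larger weights than n-grams shared by many records (those of "tcp"), which is what lets a downstream one-class model separate rare patterns from common ones.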

  • Article type: Journal Article
    Network traffic anomaly detection, as an effective analysis method for network security, can identify differentiated traffic information and provide secure operation in complex and changing network environments. To avoid the information loss incurred when handling traffic data while improving the detection performance of traffic feature information, this paper proposes a multi-information fusion model based on a convolutional neural network and an autoencoder. The model uses the convolutional neural network to extract features directly from the raw traffic data, and the autoencoder to encode statistical features extracted from the raw traffic data, which compensate for the information lost through cropping. The two feature sets are combined into a new integrated feature for network traffic that carries both the load information from the original traffic data and the global information obtained from the statistical features, thus providing a complete representation of the information contained in the network traffic and improving the detection performance of the model. Experiments show that the classification accuracy of network traffic anomaly detection using this model exceeds that of classical machine learning methods.

  • Article type: Journal Article
    The identification of plant leaf diseases is crucial in precision agriculture and plays a pivotal role in advancing the modernization of agriculture. Timely detection and diagnosis of leaf diseases for preventive measures significantly contribute to enhancing both the quantity and quality of agricultural products, thereby fostering the in-depth development of precision agriculture. However, despite the rapid development of research on plant leaf disease identification, it still faces challenges such as insufficient agricultural datasets and deep learning-based disease identification models that have numerous training parameters and insufficient accuracy. To address these issues, this paper proposes a plant leaf disease identification method based on an improved SinGAN and an improved ResNet34. Firstly, an improved SinGAN called Reconstruction-Based Single Image Generation Network (ReSinGN) is proposed for image enhancement. This network accelerates model training by using an autoencoder to replace the GAN in SinGAN and incorporates a Convolutional Block Attention Module (CBAM) into the autoencoder to more accurately capture important features and structural information in the images. Random pixel shuffling is introduced in ReSinGN to enable the model to learn richer data representations, further enhancing the quality of generated images. Secondly, an improved ResNet34 is proposed for plant leaf disease identification. This involves adding CBAM modules to ResNet34 to alleviate the limitations of parameter sharing, replacing the ReLU activation function with the LeakyReLU activation function to address the problem of neuron death, and using transfer learning-based training methods to accelerate network training.
    This paper takes tomato leaf diseases as the experimental subject, and the experimental results demonstrate that: (1) ReSinGN generates high-quality images with a training speed at least 44.6 times faster than SinGAN. (2) The Tenengrad score of images generated by the ReSinGN model is 67.3, an improvement of 30.2 over SinGAN, resulting in clearer images. (3) The ReSinGN model with random pixel shuffling outperforms SinGAN in both image clarity and distortion, achieving the optimal balance between the two. (4) The improved ResNet34 achieved an average recognition accuracy, recognition precision, recognition accuracy (redundant as it's similar to precision), recall, and F1 score of 98.57, 96.57, 98.68, 97.7, and 98.17%, respectively, for tomato leaf disease identification. Compared to the original ResNet34, this represents enhancements of 3.65, 4.66, 0.88, 4.1, and 2.47%, respectively.

  • Article type: Journal Article
    OBJECTIVE: The aim of this study was to develop a predictive model for uncorrected/actual fluid intelligence scores in 9-10 year old children using magnetic resonance T1-weighted imaging, and to explore the predictive performance of an autoencoder model based on reconstruction regularization for fluid intelligence in adolescents.
    METHODS: We collected actual fluid intelligence scores and T1-weighted MRIs of 11,534 adolescents who completed baseline tasks from ABCD Data Release 3.0. A total of 148 ROIs were selected and 604 features were derived from FreeSurfer segmentation. The training and test sets were divided in a ratio of 7:3. To predict fluid intelligence scores, we used AE, MLP, and classical machine learning models, and compared their performance on the test set. In addition, we explored their performance across gender subpopulations. Moreover, we evaluated the importance of features using the SHapley Additive exPlanations (SHAP) method.
    RESULTS: The proposed model achieves the best performance on the test set for predicting actual fluid intelligence scores (PCC = 0.209 ± 0.02, MSE = 105.212 ± 2.53). The results show that autoencoders with reconstruction regularization are significantly more effective than MLPs and classical machine learning models. In addition, all models performed better on female adolescents than on male adolescents. Further analysis of relevant characteristics in different populations revealed that this may be related to gender differences in underlying fluid intelligence mechanisms.
    CONCLUSION: We construct a weak but stable correlation between brain structural features and raw fluid intelligence using autoencoders. Future research may need to explore ensemble regression strategies utilizing multiple machine learning algorithms on multimodal data in order to improve the predictive performance of fluid intelligence based on neuroimaging features.
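    The two metrics reported in this abstract, the Pearson correlation coefficient (PCC) and the mean squared error (MSE), can be computed as in the sketch below. The function names and sample values are hypothetical and not taken from the study; the sketch only illustrates why a model can show a weak PCC alongside a sizeable MSE.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation between observed and predicted scores."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical scores: predictions follow the trend but are offset in scale.
observed = [1.0, 2.0, 3.0]
predicted = [2.0, 4.0, 6.0]
r = pearson_r(observed, predicted)
err = mse(observed, predicted)
```

    PCC measures only how well predictions track the ordering and linear trend of the observed scores, while MSE also penalizes errors in scale and offset, so the two are reported together.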

  • Article type: Journal Article
    BACKGROUND: Radiography plays an important role in medical care, and accurate positioning is essential for providing optimal quality images. Radiographs with insufficient diagnostic value are rejected, and retakes are required. However, determining whether a radiograph should be retaken is a qualitative evaluation.
    PURPOSE: To evaluate skull radiograph accuracy automatically using an unsupervised learning-based autoencoder (AE) and a variational autoencoder (VAE). In this study, we eliminated visual qualitative evaluation and used unsupervised learning to identify skull radiography retakes through quantitative evaluation.
    METHODS: Five skull phantoms were radiographed, and 1,680 images were acquired. These images correspond to two categories: normal images captured at appropriate positions and images captured at inappropriate positions. This study verified the discriminatory ability for skull radiographs using anomaly detection methods.
    RESULTS: The areas under the curves for AE and VAE were 0.7060 and 0.6707, respectively, in receiver operating characteristic analysis. Our proposed method showed a higher discrimination ability than that of previous studies, which reported an accuracy of 52%.
    CONCLUSION: Our findings suggest that the proposed method has high classification accuracy in determining whether skull radiographs should be retaken. Automating the decision on whether an image is optimal or a retake is needed contributes to improving operational efficiency in busy X-ray imaging operations.
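    The area under the ROC curve reported for the AE and VAE can be computed directly from anomaly scores, as in this sketch. It assumes that the anomaly score is a per-image reconstruction error, which the abstract does not state explicitly, and the labels and scores below are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    labels: 1 for anomalous (e.g. mispositioned), 0 for normal.
    scores: anomaly scores, higher meaning more anomalous.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical reconstruction errors for three normal and two retake images.
labels = [0, 0, 0, 1, 1]
errors = [0.10, 0.20, 0.45, 0.40, 0.90]
auc = roc_auc(labels, errors)
```

    An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported values of 0.7060 and 0.6707.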

  • Article type: Journal Article
    OBJECTIVE: In patients having naïve glioblastoma multiforme (GBM), this study aims to assess the efficacy of deep learning algorithms in automating the segmentation of brain magnetic resonance (MR) images to accurately determine 3D masks for 4 distinct regions: enhanced tumor, peritumoral edema, non-enhanced/necrotic tumor, and total tumor.
    METHODS: A 3D U-Net neural network algorithm was developed for semantic segmentation of GBM. The training dataset was manually delineated by a group of expert neuroradiologists on MR images from the Brain Tumor Segmentation Challenge 2021 (BraTS2021) image repository, as ground truth labels for diverse glioma (GBM and low-grade glioma) subregions across four MR sequences (T1w, T1w-contrast enhanced, T2w, and FLAIR) in 1251 patients. The in-house test was performed on 50 GBM patients from our cohort (PerProGlio project). The network's performance was optimized by exploring various hyperparameters, and the optimal parameter configuration was identified. The optimized network's performance was assessed using Dice scores, precision, and sensitivity metrics.
    RESULTS: Our adaptation of the 3D U-Net with additional residual blocks demonstrated reliable performance on both the BraTS2021 dataset and the in-house PerProGlio cohort, employing only T1w-ce sequences for the enhanced and non-enhanced/necrotic tumor models and T1w-ce + T2w + FLAIR for peritumoral edema and total tumor. The mean Dice scores (training and test) were 0.89 and 0.75; 0.75 and 0.64; 0.79 and 0.71; and 0.60 and 0.55, for total tumor, edema, enhanced tumor, and non-enhanced/necrotic tumor, respectively.
    CONCLUSIONS: The results underscore the high precision with which our network can segment GBM tumors and their distinct subregions. The level of accuracy achieved agrees with the coefficients recorded in previous GBM studies. In particular, our approach allows model specialization for each of the different tumor subregions, employing only those MR sequences that provide value for segmentation.
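    The Dice score used to evaluate the segmentations above measures the overlap between a predicted mask and a ground-truth mask. A minimal sketch on flattened binary masks follows; the masks are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here rather than one stated in the abstract.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for flattened binary masks."""
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / size_sum


# Hypothetical flattened voxel masks for one tumor subregion.
pred = [1, 1, 0, 0, 1, 0]
truth = [1, 0, 0, 0, 1, 1]
score = dice_score(pred, truth)
```

    Because the denominator is the sum of the two mask sizes, small structures such as the non-enhanced/necrotic region are penalized heavily for a few misclassified voxels, which is consistent with their lower reported scores.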

  • Article type: Journal Article
    Increasing research findings suggest that circular RNA (circRNA) plays a crucial role in the pathogenesis of complex human diseases by binding to miRNA. Identifying their potential interactions is of paramount importance for the diagnosis and treatment of diseases. However, traditional biological wet-lab experiments are characterized by long cycles, small scales, and time-consuming processes. Consequently, using efficient computational models to predict interactions between circRNA and miRNA is gradually becoming mainstream. In this study, we present a new prediction model named BJLD-CMI. The model extracts circRNA and miRNA sequence features using the Jaccard and BERT methods and integrates them to obtain CMI attribute features, and then uses the graph embedding method LINE to extract CMI behavioral features from the known circRNA-miRNA association graph. We then predict potential circRNA-miRNA interactions by fusing the multi-angle feature information, such as attribute and behavior features, through Autoencoder in Autoencoder Networks. BJLD-CMI attained areas under the ROC curve of 94.95% and 90.69% on the CMI-9589 and CMI-9905 datasets, respectively. Compared with existing models, BJLD-CMI exhibits the best overall performance. In the case study, a PubMed literature search confirmed that seven of the top 10 predicted CMI pairs do exist. These results suggest that BJLD-CMI is an effective method for predicting interactions between circRNAs and miRNAs. It provides valuable candidates for biological wet-lab experiments and can reduce the burden on researchers.

  • Article type: Journal Article
    Preprocessing plays a key role in Raman spectral analysis. However, classical preprocessing algorithms often reduce Raman peak intensities and change the peak shape when processing spectra. This paper introduces a unified preprocessing solution based on convolutional autoencoders to enhance Raman spectroscopy data. It comprises two algorithms: a denoising algorithm that uses a convolutional denoising autoencoder (CDAE model), and a baseline correction algorithm based on a convolutional autoencoder (CAE+ model). The CDAE model incorporates two additional convolutional layers in its bottleneck layer for enhanced noise reduction. The CAE+ model not only adds convolutional layers at the bottleneck but also includes a comparison function after decoding for effective baseline correction. The proposed models were validated using both simulated spectra and experimental spectra measured with a Raman spectrometer system. Compared with traditional signal processing techniques, the CDAE-CAE+ model shows improvements in noise reduction and Raman peak preservation.

  • Article type: Journal Article
    Accurately detecting defects while reconstructing a high-quality normal background remains a significant challenge in unsupervised surface defect detection. This study proposes an unsupervised method that addresses this challenge by achieving both accurate defect detection and high-quality, noise-free reconstruction of the normal background. We propose an adaptive weighted structural similarity (AW-SSIM) loss for focused feature learning. AW-SSIM improves the structural similarity (SSIM) loss by assigning different weights to its luminance, contrast, and structure sub-functions based on their relative importance for a specific training sample. Moreover, it dynamically adjusts the standard deviation (σ) of the Gaussian window during loss calculation to balance noise reduction and detail preservation. An artificial defect generation algorithm (ADGA) is proposed to generate artificial defects that closely resemble real ones. We use a two-stage training strategy. In the first stage, the model trains only on normal samples using the AW-SSIM loss, allowing it to learn robust representations of normal features. In the second stage, the weights obtained from the first stage are used to train the model on both normal and artificially defective training samples. Additionally, the second stage employs a combined Learned Perceptual Image Patch Similarity (LPIPS) and AW-SSIM loss. The combined loss helps the model achieve high-quality normal background reconstruction while maintaining accurate defect detection. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art defect detection accuracy, with an average area under the receiver operating characteristic curve (AuROC) of 97.69% on six samples from the MVTec anomaly detection dataset.
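    The idea of weighting the three SSIM sub-functions can be illustrated with the general SSIM form, in which luminance, contrast, and structure are raised to exponents alpha, beta, and gamma. This sketch rests on several assumptions: it uses global statistics instead of a Gaussian window, fixed exponents instead of the adaptive per-sample rule (which the abstract does not describe), and it clamps negative structure values to zero so that fractional exponents stay real. It is not the authors' AW-SSIM.

```python
import math


def ssim_components(x, y, data_range=1.0, k1=0.01, k2=0.03):
    """Global luminance, contrast, and structure terms for two signals."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sd_x, sd_y = math.sqrt(var_x), math.sqrt(var_y)
    # Standard stabilizing constants of the SSIM definition.
    c1 = (k1 * data_range) ** 2
    c2 = (k2 * data_range) ** 2
    c3 = c2 / 2.0
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov + c3) / (sd_x * sd_y + c3)
    return luminance, contrast, structure


def weighted_ssim_loss(x, y, alpha=1.0, beta=1.0, gamma=1.0):
    """1 - l^alpha * c^beta * s^gamma, with structure clamped at zero."""
    lum, con, struct = ssim_components(x, y)
    return 1.0 - (lum ** alpha) * (con ** beta) * (max(struct, 0.0) ** gamma)


# Hypothetical pixel intensities in [0, 1].
patch = [0.1, 0.5, 0.9, 0.3]
inverted = [0.9, 0.5, 0.1, 0.7]
loss_same = weighted_ssim_loss(patch, patch, alpha=0.5, beta=1.0, gamma=2.0)
loss_diff = weighted_ssim_loss(patch, inverted, alpha=0.5, beta=1.0, gamma=2.0)
```

    Raising one exponent makes the loss more sensitive to that component, which is one plausible way to emphasize, for example, structure over luminance for a given training sample.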

  • Article type: Journal Article
    Since scientific investigations have demonstrated that aberrant expression of miRNAs contributes to the incidence of numerous complex diseases, precise determination of miRNA-disease relationships greatly contributes to the advancement of human medicine. To tackle the inefficiency of conventional experimental approaches, numerous computational methods have been proposed to predict miRNA-disease associations with enhanced accuracy. However, constructing a miRNA-gene-disease heterogeneous network by incorporating gene information has been relatively under-explored in existing computational techniques. Accordingly, this paper puts forward a technique that predicts miRNA-disease associations by applying an autoencoder and a random walk on a miRNA-gene-disease heterogeneous network (AE-RW). Firstly, we integrate association information and similarities between miRNAs, genes, and diseases to construct a miRNA-gene-disease heterogeneous network. Subsequently, we combine two network feature representations extracted independently via an autoencoder and a random walk procedure. Finally, a deep neural network (DNN) is used to perform association prediction. The experimental results demonstrate that the AE-RW model achieved an AUC of 0.9478 through 5-fold cross-validation on the HMDD v3.2 dataset, outperforming five state-of-the-art models. Additionally, case studies on breast and lung cancer further validated the superior predictive capabilities of our model.
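    The random-walk feature extraction on a heterogeneous network can be sketched as below. The abstract only says "random walk", so the restart variant, the restart probability, the step count, and the toy node names are all assumptions made for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.3, steps=50):
    """Approximate visiting probabilities of a walker that restarts at seed.

    adj: dict mapping each node to a list of neighbor nodes (all neighbors
    must also be keys). Returns a dict of node -> probability.
    """
    nodes = sorted(adj)
    prob = {v: 1.0 if v == seed else 0.0 for v in nodes}
    for _ in range(steps):
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            neighbors = adj[u]
            if not neighbors:
                # A node without neighbors sends its mass back to the seed.
                nxt[seed] += (1.0 - restart) * prob[u]
                continue
            share = (1.0 - restart) * prob[u] / len(neighbors)
            for v in neighbors:
                nxt[v] += share
        # Total mass is 1, so the restart step adds exactly `restart` to seed.
        nxt[seed] += restart
        prob = nxt
    return prob


# Hypothetical miRNA-gene-disease network: mirA - g1 - d1, mirB - g2 - d2,
# with the two genes linked to each other.
adj = {
    "mirA": ["g1"],
    "g1": ["mirA", "d1", "g2"],
    "d1": ["g1"],
    "g2": ["g1", "d2", "mirB"],
    "d2": ["g2"],
    "mirB": ["g2"],
}
profile = random_walk_with_restart(adj, seed="mirA")
```

    The resulting probability vector can serve as a network feature for the seed miRNA: diseases reachable through shared genes (d1 here) score higher than more distant ones (d2).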