Autoencoder

  • Article type: Journal Article
    This work presents the application of an encoder-decoder convolutional neural network (ED-CNN) model to the automatic segmentation of COVID-19 computerised tomography (CT) data. The aim is an alternative to the models in the current literature that is easy to follow and reproduce, and therefore more accessible for real-world applications, since little training is required to use it. Our simple approach achieves results comparable to those of previously published studies that use more complex deep-learning networks. We demonstrate a high-quality automated segmentation prediction of thoracic CT scans that correctly delineates the infected regions of the lungs. This automated segmentation can be used as a tool to speed up the contouring process, either to check manual contours when a peer check is not possible, or to give a rapid indication of infection so that the patient can be referred for further treatment, thus saving time and resources. In contrast, manual contouring is a time-consuming process in which a professional contours each patient one by one, and the result is later checked by another professional. The proposed model uses approximately 49 k parameters, while other models average over 1,000 times more. Because our approach relies on a very compact model, training times are shorter, which makes it possible to retrain the model easily on other data and potentially to support "personalised medicine" workflows. The model achieves similarity scores of Specificity (Sp) = 0.996 ± 0.001, Accuracy (Acc) = 0.994 ± 0.002 and Mean absolute error (MAE) = 0.0075 ± 0.0005.
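
    The three scores quoted above can be illustrated on a toy example. The following is a minimal sketch, assuming flat binary masks (1 = infected pixel, 0 = background); the abstract does not say how the authors computed their metrics, and the function name `segmentation_scores` and the sample masks are made up for illustration.

```python
def segmentation_scores(pred, truth):
    """Return (specificity, accuracy, MAE) for two flat binary masks."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("masks must be non-empty and of equal length")
    pairs = list(zip(pred, truth))
    tp = sum(1 for p, t in pairs if p == 1 and t == 1)  # infected, found
    tn = sum(1 for p, t in pairs if p == 0 and t == 0)  # background, kept
    fp = sum(1 for p, t in pairs if p == 1 and t == 0)  # false alarm
    # Specificity: fraction of true background pixels predicted as background.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    # Accuracy: fraction of all pixels predicted correctly.
    accuracy = (tp + tn) / len(truth)
    # Mean absolute error between predicted and reference pixel values.
    mae = sum(abs(p - t) for p, t in pairs) / len(truth)
    return specificity, accuracy, mae


# Hypothetical 10-pixel masks: one background pixel is wrongly flagged.
truth = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]
pred = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
sp, acc, mae = segmentation_scores(pred, truth)
# sp = 7/8 = 0.875, acc = 9/10, mae = 1/10
```

    For hard 0/1 masks the MAE is simply 1 minus the accuracy; the two differ only when the prediction is a soft probability map, which may be why the reported Acc and MAE do not sum exactly to one.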

  • Article type: Journal Article
    Social anxiety disorder (SAD) is characterized by heightened sensitivity to social interactions or settings, which disrupts daily activities and social relationships. This study aimed to explore the feasibility of using digital phenotypes to predict the severity of these symptoms and to elucidate how the main predictive digital phenotypes differ depending on symptom severity.
    We collected 511 behavioral and physiological data over 7 to 13 weeks from 27 individuals with SAD and 31 healthy individuals using smartphones and smartbands, from which we extracted 76 digital phenotype features. To reduce data dimensionality, we employed an autoencoder, an unsupervised machine learning model that transformed these features into low-dimensional latent representations. Symptom severity was assessed with three social anxiety-specific scales and nine additional psychological scales. For each symptom, we developed an individual classifier to predict severity and applied integrated gradients to identify critical predictive features.
    Classifiers targeting social anxiety symptoms outperformed baseline accuracy, achieving mean accuracy and F1 scores of 87% (with both metrics in the range 84-90%). For secondary psychological symptoms, classifiers achieved mean accuracy and F1 scores of 85%. Integrated gradients revealed key digital phenotypes with substantial influence on the predictive models, differentiated by symptom type and level of severity.
    Leveraging digital phenotypes through feature representation learning could effectively classify symptom severity in SAD. The approach identifies distinct digital phenotypes associated with the cognitive, emotional, and behavioral dimensions of SAD, thereby advancing the understanding of the disorder. These findings underscore the potential utility of digital phenotypes in informing clinical management.
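
    Integrated gradients, the attribution method named above, can be sketched on a model small enough to check by hand. This is an illustration only: the logistic model, its weights, the all-zero baseline and the midpoint-rule approximation are assumptions made for the example, not the classifiers or settings used in the study.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def model(x, w, b):
    """Hypothetical classifier: logistic regression on the feature vector x."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def model_grad(x, w, b):
    """Analytic gradient of the logistic output with respect to each input."""
    p = model(x, w, b)
    return [p * (1.0 - p) * wi for wi in w]


def integrated_gradients(x, baseline, w, b, steps=200):
    """Approximate the path integral of the gradient from baseline to x."""
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [bi + alpha * (xi - bi) for xi, bi in zip(x, baseline)]
        grad = model_grad(point, w, b)
        total = [t + g for t, g in zip(total, grad)]
    # Scale the averaged gradient by the distance travelled per feature.
    return [(xi - bi) * t / steps for xi, bi, t in zip(x, baseline, total)]


w, b = [2.0, -1.0, 0.5], 0.1       # made-up weights
x = [1.0, 2.0, -1.0]               # made-up input features
baseline = [0.0, 0.0, 0.0]         # assumed "no signal" reference
attributions = integrated_gradients(x, baseline, w, b)
```

    A useful sanity check is the completeness property: the attributions should sum to the difference between the model output at the input and at the baseline, up to the integration error.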

  • Article type: Journal Article
    Skin cancer is a lethal disease, and its early detection plays a pivotal role in preventing its spread to other organs and tissues. Automated methods based on artificial intelligence (AI) can play a significant role in early detection. This study presents a novel AI-based approach, termed 'DualAutoELM', for the effective identification of various types of skin cancer. The proposed method uses a network of two distinct autoencoders: a spatial autoencoder and an FFT (Fast Fourier Transform) autoencoder. The spatial autoencoder specializes in learning spatial features within input lesion images, whereas the FFT autoencoder learns, through the reconstruction process, to capture textural and distinguishing frequency patterns within transformed input skin lesion images. Attention modules at various levels within the encoder part of these autoencoders significantly improve their discriminative feature learning capabilities. An Extreme Learning Machine (ELM) with a single feedforward layer is trained to classify skin malignancies using the features extracted from the bottleneck layers of these autoencoders. Two publicly available datasets, 'HAM10000' and 'ISIC-2017', are used to thoroughly assess the proposed approach. The experimental findings demonstrate the accuracy and robustness of the proposed technique, with AUC, precision, and accuracy values of 0.98, 97.68% and 97.66% for the 'HAM10000' dataset and 0.95, 86.75% and 86.68% for the 'ISIC-2017' dataset, respectively. This study highlights the potential of the proposed approach for accurate detection of skin cancer.
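
    To make the idea of a frequency-domain input concrete, the sketch below computes the magnitude spectrum of a tiny image with a naive 2D discrete Fourier transform. It is an illustration under stated assumptions: the abstract does not describe how images are transformed before entering the FFT autoencoder, the function name `dft2_magnitude` is hypothetical, and the direct O(N^4) sum is used only to stay dependency-free (an FFT routine such as `numpy.fft.fft2` computes the same transform far faster).

```python
import cmath
import math


def dft2_magnitude(image):
    """Magnitude of the 2D DFT of a small image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    magnitude = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):          # vertical frequency index
        for v in range(cols):      # horizontal frequency index
            acc = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = -2.0 * math.pi * (u * r / rows + v * c / cols)
                    acc += image[r][c] * cmath.exp(1j * angle)
            magnitude[u][v] = abs(acc)
    return magnitude


# Hypothetical 4x4 "texture": vertical stripes alternating 1, 0, 1, 0.
stripes = [[1.0 if c % 2 == 0 else 0.0 for c in range(4)] for _ in range(4)]
spectrum = dft2_magnitude(stripes)
# Energy sits at the DC term [0][0] and at the stripe frequency [0][2].
```

    The striped pattern, which is spread over every pixel in the spatial domain, collapses to two coefficients in the frequency domain, which is the kind of textural regularity a frequency-domain encoder can pick up compactly.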

  • Article type: Journal Article
    Single-cell multi-omics data reveal complex cellular states, providing significant insights into cellular dynamics and disease. Yet integrating multi-omics data is challenging. Some modalities have not reached the robustness or clarity of established transcriptomics. Together with the scarcity of data for less established modalities and the intricacies of integration, these challenges limit our ability to maximize the benefits of single-cell omics. We introduce scCross, a tool that leverages variational autoencoders, generative adversarial networks, and the mutual nearest neighbors (MNN) technique for modality alignment. By enabling single-cell cross-modal data generation, multi-omics data simulation, and in silico cellular perturbations, scCross enhances the utility of single-cell multi-omics studies.
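
    The mutual nearest neighbors idea can be sketched in a few lines: a pair of cells from two modalities is kept only if each is among the other's nearest neighbors. The code below is a generic illustration, not the scCross implementation; the abstract does not state in which space or with which neighborhood size MNN is applied, and the coordinates and function names are made up.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest(query, candidates, k):
    """Indices of the k candidates closest to the query point."""
    order = sorted(range(len(candidates)),
                   key=lambda j: squared_distance(query, candidates[j]))
    return set(order[:k])


def mutual_nearest_neighbors(set_a, set_b, k=1):
    """Pairs (i, j) where b_j is a neighbor of a_i and a_i is a neighbor of b_j."""
    nn_ab = [k_nearest(a, set_b, k) for a in set_a]
    nn_ba = [k_nearest(b, set_a, k) for b in set_b]
    return [(i, j)
            for i in range(len(set_a))
            for j in sorted(nn_ab[i])
            if i in nn_ba[j]]


# Hypothetical 2-D embeddings of cells measured in two modalities.
cells_a = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
cells_b = [(0.1, 0.0), (1.2, 0.9), (9.0, 9.0)]
pairs = mutual_nearest_neighbors(cells_a, cells_b, k=1)
# pairs == [(0, 0), (1, 1)]
```

    The third cell of each set is left unpaired: its nearest neighbor in the other set prefers a different partner, so the one-sided match is discarded. This filtering is what makes MNN pairs usable as anchors for alignment.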

  • Article type: Journal Article
    We propose an artificial intelligence approach based on deep neural networks to tackle a canonical 2D scalar inverse source problem. The learned singular value decomposition (L-SVD) based on hybrid autoencoding is considered. We compare the reconstruction performance of L-SVD with that of truncated SVD (TSVD) regularized inversion, a canonical regularization scheme for solving an ill-posed linear inverse problem. Numerical tests on far-field acquisitions show that L-SVD, when properly trained on a well-organized dataset, achieves lower reconstruction errors than TSVD and allows faster spatial variations of the source to be retrieved. Indeed, L-SVD accommodates a priori information on the set of relevant unknown current distributions. Unlike TSVD, which performs linear processing on a linear problem, L-SVD operates non-linearly on the data. The numerical analysis also shows how the performance of L-SVD degrades when the unknown source does not match the training dataset.

  • Article type: Journal Article
    Medicine is one of the fields in which advances in computer science are driving significant progress. Some diseases require an immediate diagnosis in order to improve patient outcomes. The use of computers in medicine improves precision and accelerates data processing and diagnosis. To categorize biological images, this research used hybrid machine learning, a combination of various deep learning approaches, together with a meta-heuristic algorithm. Two different medical datasets were used, one covering magnetic resonance imaging (MRI) of brain tumors and the other chest X-rays (CXRs) of COVID-19. These datasets were fed to a combined network in which a deep learning technique, based on either a convolutional neural network (CNN) or an autoencoder, extracts features; a meta-heuristic step then selects the optimal features using the particle swarm optimization (PSO) algorithm. This combination sought to reduce the dimensionality of the datasets while maintaining the original performance. The method is considered innovative and yields highly accurate classification results across the medical datasets. Several classifiers were employed to predict the diseases. On the COVID-19 dataset, the highest accuracy was 99.76%, obtained with the CNN-PSO-SVM combination. On the brain tumor dataset, the highest accuracy was 99.51%, obtained with the autoencoder-PSO-KNN combination.

  • Article type: Journal Article
    Background and objectives: Breast cancer is the most prevalent type of cancer among women. The effectiveness of anticancer pharmacological therapy may be adversely affected by tumor heterogeneity, which includes genetic and transcriptomic features. This leads to clinical variability in patient response to therapeutic drugs. Anticancer drug design and the understanding of cancer both require precise identification of cancer drug responses. The performance of drug response prediction models can be improved by integrating multi-omics data and drug structure data. Methods: In this paper, we propose an Autoencoder (AE) and Graph Convolutional Network (AGCN) for drug response prediction, which integrates multi-omics data and drug structure data. Specifically, we first convert the high-dimensional representation of each omic dataset to a lower-dimensional representation using a separate AE for each omic dataset. These individual features are then combined with drug structure features obtained using a Graph Convolutional Network and passed to a Convolutional Neural Network, which calculates IC[Formula: see text] values for every combination of cell line and drug. A threshold IC[Formula: see text] value is then obtained for each drug by K-means clustering of its known IC[Formula: see text] values. Finally, using this threshold, cell lines are classified as either sensitive or resistant to each drug. Results: Experimental results indicate that AGCN has an accuracy of 0.82 and performs better than many existing methods. In addition, we externally validated AGCN using data from The Cancer Genome Atlas (TCGA) clinical database and obtained an accuracy of 0.91. Conclusion: According to the results obtained, concatenating multi-omics data with drug structure data using AGCN greatly improves the accuracy of drug response prediction.
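
    The thresholding step can be sketched as one-dimensional K-means with two clusters, taking the midpoint between the two centroids as the cut-off. This is a minimal illustration under assumptions: the abstract does not give the number of clusters, the initialisation, or how the threshold is read off the clusters, and the response values below are invented.

```python
def two_means_threshold(values, max_iter=100):
    """1-D K-means with k=2; returns the midpoint between the two centroids."""
    low, high = min(values), max(values)  # deterministic initial centroids
    if low == high:
        raise ValueError("need at least two distinct values")
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        # Update step: centroids move to the mean of their group.
        new_low = sum(low_group) / len(low_group)
        new_high = sum(high_group) / len(high_group)
        if new_low == low and new_high == high:
            break  # converged
        low, high = new_low, new_high
    return (low + high) / 2.0


# Hypothetical known response values for one drug across six cell lines.
known_values = [0.2, 0.4, 0.5, 3.8, 4.0, 4.3]
threshold = two_means_threshold(known_values)  # approximately 2.2

# Assumed convention: a lower inhibitory concentration means a more
# sensitive cell line.
labels = ["sensitive" if v <= threshold else "resistant" for v in known_values]
```

    Deriving the threshold per drug, rather than using one global cut-off, lets each drug's own distribution of responses decide what counts as sensitive.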

  • Article type: Journal Article
    Batik holds profound cultural significance within Indonesia, serving as a tangible expression of the nation's rich heritage and intricate philosophical narratives. This paper introduces the Batik Nitik Sarimbit 120 dataset, originating from Yogyakarta, Indonesia, as a pivotal resource for researchers and enthusiasts alike. Comprising images of 60 Nitik patterns meticulously sourced from fabric samples, the dataset represents a curated selection of batik motifs emblematic of the region's artistic tradition. It offers a collection of 120 motif pairs distributed across 60 distinct categories. By providing a comprehensive repository of batik motifs, the Batik Nitik Sarimbit 120 dataset facilitates the training and validation of machine learning algorithms, particularly generative methods. This enables researchers to explore and innovate in batik pattern generation, fostering new avenues for creativity and expression within this venerable art form. In essence, the dataset stands as a testament to the collaborative efforts of cultural institutions and academia in preserving and promoting Indonesia's rich batik heritage. Its accessibility and richness make it a valuable resource for scholars, artists, and enthusiasts seeking to delve deeper into the intricate world of Indonesian batik.

  • Article type: Journal Article
    The increasing use of interconnected devices within the Internet of Things (IoT) and Industrial IoT (IIoT) has significantly enhanced efficiency and utility in both personal and industrial settings, but it has also heightened cybersecurity vulnerabilities, particularly through IoT malware. This paper explores the use of one-class classification, a method of unsupervised learning that is especially suitable for unlabeled data, dynamic environments, and malware detection, which is a form of anomaly detection. We introduce the TF-IDF method for transforming nominal features into numerical formats that avoid information loss and manage dimensionality effectively; combined with n-grams, this is crucial for enhancing pattern recognition. Furthermore, we compare the performance of multi-class and one-class classification models, including Isolation Forest and a deep autoencoder: the multi-class models are trained on both benign and malicious NetFlow samples, whereas the one-class models are trained exclusively on benign NetFlow samples. Using one-class classification, we achieve 100% recall with precision rates above 80% and 90% across various test datasets. These models show the adaptability of unsupervised learning, especially one-class classification, to the evolving malware threats in the IoT domain, offering insights into enhancing IoT security frameworks and suggesting directions for future research in this critical area.
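
    The combination of n-grams and TF-IDF can be sketched as follows. This is a textbook variant (term frequency times log(N / document frequency)) written for illustration; the abstract does not say how the NetFlow fields are tokenised or which TF-IDF weighting is used, and the field values and function names below are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Join each run of n consecutive tokens into a single term."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(documents, n=2):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    grams = [ngrams(doc, n) for doc in documents]
    doc_freq = Counter()
    for g in grams:
        doc_freq.update(set(g))    # count each term once per document
    vocab = sorted(doc_freq)
    n_docs = len(documents)
    vectors = []
    for g in grams:
        counts = Counter(g)
        total = len(g) or 1        # avoid dividing by zero on short inputs
        vectors.append([(counts[t] / total) * math.log(n_docs / doc_freq[t])
                        for t in vocab])
    return vocab, vectors


# Hypothetical nominal fields per flow: protocol, destination port, state.
flows = [
    ["tcp", "80", "S0"],
    ["tcp", "80", "SF"],
    ["udp", "53", "SF"],
]
vocab, vectors = tfidf(flows, n=2)
# vocab == ['53 SF', '80 S0', '80 SF', 'tcp 80', 'udp 53']
```

    A bigram that appears in a single flow (such as "80 S0") receives a higher weight than one shared by several flows (such as "tcp 80"), so rare field combinations stand out numerically without one-hot encoding every nominal value.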

  • Article type: Journal Article
    Network traffic anomaly detection, as an effective analysis method for network security, can identify differentiated traffic information and support secure operation in complex and changing network environments. To avoid the information loss that occurs when handling traffic data, while improving the detection performance on traffic feature information, this paper proposes a multi-information fusion model based on a convolutional neural network and an autoencoder. The model uses the convolutional neural network to extract features directly from the raw traffic data, and the autoencoder to encode statistical features extracted from the raw traffic data, which compensate for the information lost through cropping. The two sets of features are combined into a new integrated feature for network traffic, which carries both the load information from the original traffic data and the global information obtained from the statistical features. This provides a complete representation of the information contained in the network traffic and improves the detection performance of the model. Experiments show that the classification accuracy of network traffic anomaly detection with this model exceeds that of classical machine learning methods.