Neural Networks, Computer

  • Article type: Journal Article
    Prostate cancer is one of the most common and deadly diseases among men, and early diagnosis can significantly improve treatment and reduce mortality. Because the disease has no apparent clinical symptoms in its early stages, it is difficult to diagnose. Disagreement among experts when interpreting magnetic resonance images is a further challenge. In recent years, many studies have shown that deep learning, especially convolutional neural networks, performs well in machine vision, particularly in medical image analysis. In this study, a deep learning approach was applied to multiparametric magnetic resonance images, and the synergistic effect of clinical and pathological data on model accuracy was investigated. The data were collected from Trita Hospital in Tehran and included 343 patients; data augmentation and transfer learning were used during the process. In the designed model, four different types of images are analyzed by four separate ResNet50 deep convolutional networks, and the extracted features are passed to a fully connected neural network, where they are combined with clinical and pathological features. The model without clinical and pathological data reached a maximum accuracy of 88%; adding these data raised the accuracy to 96%, which shows that clinical and pathological data have a significant impact on diagnostic accuracy.

  • Article type: Journal Article
    The measurement of chemical oxygen demand (COD) is very important in sewage treatment. The COD value reflects, to a certain extent, the effectiveness and trend of the treatment, but obtaining accurate data is costly and labor-intensive. To solve this problem, this paper proposes an online soft-measurement method for COD based on a Convolutional Neural Network-Bidirectional Long Short-Term Memory Network-Attention Mechanism (CNN-BiLSTM-Attention) algorithm. First, by analyzing the mechanism of the aerobic tank stage in the Anaerobic-Anoxic-Oxic (A2O) wastewater treatment process, the selection range of input variables was preliminarily determined, and the collected sample dataset was subjected to correlation analysis. On this basis, pH, dissolved oxygen (DO), electrical conductivity (EC), and water temperature (T) were selected as the input variables for soft-measurement prediction of COD. Then, drawing on the feature extraction ability of the CNN, the ability of the BiLSTM to capture both backward and forward dependencies in time series data, and an attention mechanism that assigns higher weights to key data, a CNN-BiLSTM-Attention model was established to soft-measure COD in the effluent from the aerobic zone of the A2O wastewater treatment process. Root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and the coefficient of determination (R2) were used to evaluate the model, and the results showed that the model predicts the COD value with high accuracy. In addition, in a comparison with CNN-LSTM-Attention, CNN-BiLSTM, CNN-LSTM, LSTM, RNN, BP, SVM, XGBoost, and RF models, the CNN-BiLSTM-Attention model performed best, supporting the superiority of the proposed algorithm. The Wilcoxon signed-rank test indicates significant differences between the CNN-BiLSTM-Attention model and the other models.
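    The four evaluation metrics named in this abstract have standard definitions, which can be sketched as follows. This is a minimal illustration rather than the authors' evaluation code: the function name `regression_metrics` and the sample COD values are hypothetical, and R2 is assumed to be computed as one minus the ratio of residual to total sum of squares.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (percent), and R2 for two equal-length sequences.

    Hypothetical helper for illustration; assumes no true value is zero
    (MAPE is undefined otherwise) and that y_true is not constant.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)            # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot

    return {"rmse": rmse, "mae": mae, "mape": mape, "r2": r2}


# Made-up measured and predicted COD values (mg/L), for illustration only.
metrics = regression_metrics([40.0, 50.0, 60.0], [42.0, 50.0, 57.0])
print(metrics)
```

    Lower RMSE, MAE, and MAPE and an R2 closer to 1 indicate a better fit, which is the sense in which the abstract compares the models.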

  • Article type: Journal Article
    This study develops a "Skill Talent Ecological Evaluation Model" spanning five ecologies: cultivation, potential energy, kinetic energy, innovation, and service and support. The AHP-entropy method determines indicator weights, a Hopfield neural network assesses talent ecology levels, and a PVAR model analyzes the effects of digital transformation. The findings are as follows: the cultivation ecology rates A, the potential ecology B+, the kinetic ecology B-, the service and support ecology B-, and the innovation ecology C. Digital transformation spurs skill demand, affecting talent and economic contributions. The kinetic ecology sees increased demand, potentially with a positive impact on traditional industries. The innovation ecology necessitates continuous skill learning. The service and support ecology sees growth in digital entrepreneurship, which requires policy incentives and incubation center support.
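    The entropy half of an AHP-entropy weighting scheme can be sketched as below. The abstract does not give the authors' exact formulation, so this follows the common textbook entropy weight method; the function names, the sample matrix, and the multiplicative rule for combining AHP and entropy weights are all assumptions made for illustration.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: rows are evaluated objects, columns are indicators.

    Assumes non-negative, benefit-type values with a positive column sum.
    Indicators whose values vary more across objects receive larger weights.
    """
    n = len(matrix)
    if n < 2:
        raise ValueError("need at least two rows")
    m = len(matrix[0])
    k = 1.0 / math.log(n)  # scales entropy into [0, 1]

    divergences = []
    for j in range(m):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for x in column:
            p = x / total
            if p > 0:  # treat 0 * log(0) as 0
                entropy -= k * p * math.log(p)
        divergences.append(max(0.0, 1.0 - entropy))

    total_divergence = sum(divergences)
    if total_divergence <= 0:
        raise ValueError("all indicators are constant; weights are undefined")
    return [d / total_divergence for d in divergences]


def combine_weights(ahp_weights, ent_weights):
    """Combine subjective (AHP) and objective (entropy) weights.

    Uses a normalized product, one common choice; this is an assumption,
    not necessarily the combination rule used in the study.
    """
    products = [a * e for a, e in zip(ahp_weights, ent_weights)]
    total = sum(products)
    return [p / total for p in products]


# Made-up scores: the first indicator is constant, so it carries no information.
scores = [[1.0, 5.0], [1.0, 1.0], [1.0, 3.0]]
print(entropy_weights(scores))
print(combine_weights([0.5, 0.5], [0.2, 0.8]))
```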

  • Article type: Journal Article
    This study explores how effectively a deep learning (DL) network model can be applied to Internet of Things (IoT) database query and optimization. It first analyzes the architecture of IoT database queries, then explores the DL network model, and finally refines the DL network model through optimization strategies. The advantages of the optimized model are verified through experiments. The experimental results show that the optimized model is more efficient than the other models in the model training and parameter optimization stages. In particular, when the data volume is 2000, the model training time and parameter optimization time of the optimized model are markedly lower than those of the traditional model. In terms of resource consumption, Central Processing Unit usage, Graphics Processing Unit usage, and memory usage increase for all models as the data volume rises; however, the optimized model performs better on energy consumption. In the throughput analysis, the optimized model maintains a high number of transactions and a high data volume per second when handling large data requests. In particular, at a data volume of 4000, its peak-time processing capacity exceeds that of the other models. Regarding latency, although the latency of all models increases with data volume, the optimized model performs better in database query response time and data processing latency. The results not only show the optimized model's superior performance in processing and optimizing IoT database queries but also provide a reference for IoT data processing and DL model optimization. These findings help promote the application of DL technology in the IoT field, especially in scenarios that involve large-scale data and require efficient processing.

  • Article type: Journal Article
    In this study, we employed various machine learning models to predict metabolic phenotypes, focusing on thyroid function, using a dataset from the National Health and Nutrition Examination Survey (NHANES) from 2007 to 2012. Our analysis used laboratory parameters relevant to thyroid function or metabolic dysregulation, in addition to demographic features, with the aim of uncovering potential associations between thyroid function and metabolic phenotypes. Multinomial Logistic Regression performed best in identifying the relationship between thyroid function and metabolic phenotypes, achieving an area under the receiver operating characteristic curve (AUROC) of 0.818, followed closely by Neural Network (AUROC: 0.814). Random Forest, Boosted Trees, and K Nearest Neighbors performed below these two methods (AUROC 0.811, 0.811, and 0.786, respectively). In Random Forest, homeostatic model assessment for insulin resistance, serum uric acid, serum albumin, gamma glutamyl transferase, and the triiodothyronine/thyroxine ratio ranked highest in variable importance. These results highlight the potential of machine learning for understanding complex relationships in health data. However, it is important to note that model performance may vary depending on data characteristics and specific requirements. Furthermore, we emphasize the significance of accounting for sampling weights in complex survey data analysis and the potential benefits of incorporating additional variables to enhance model accuracy and insights. Future research can explore advanced methodologies that combine machine learning, sample weights, and expanded variable sets to further advance survey data analysis.
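    To make the AUROC figures and the remark about sampling weights concrete, the sketch below computes a binary AUROC as the weighted share of positive-negative pairs that the scores rank correctly. It is an illustration under stated assumptions, not the study's pipeline: the function name `weighted_auroc` and the toy data are hypothetical, the multinomial case is assumed to be handled one class at a time (one-vs-rest), and the quadratic pairwise loop is only practical for small samples.

```python
def weighted_auroc(labels, scores, weights=None):
    """Binary AUROC as the weighted fraction of correctly ordered pairs.

    labels: 1 for positive, 0 for negative.
    weights: optional per-observation survey weights (default: all 1.0).
    A tied pair counts as half correct.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    positives = [(s, w) for y, s, w in zip(labels, scores, weights) if y == 1]
    negatives = [(s, w) for y, s, w in zip(labels, scores, weights) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")

    concordant = 0.0
    total = 0.0
    for score_pos, weight_pos in positives:
        for score_neg, weight_neg in negatives:
            pair_weight = weight_pos * weight_neg
            total += pair_weight
            if score_pos > score_neg:
                concordant += pair_weight
            elif score_pos == score_neg:
                concordant += 0.5 * pair_weight
    return concordant / total


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(weighted_auroc(labels, scores))                        # unweighted
print(weighted_auroc(labels, scores, [1.0, 3.0, 1.0, 1.0]))  # with weights
```

    In the toy data, up-weighting the poorly ranked positive observation lowers the AUROC, which illustrates why ignoring survey weights can change the apparent performance of a model.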

  • Article type: Journal Article
    This study examines the effectiveness of four gridded precipitation datasets, namely Integrated Multi-satellite Retrievals for GPM (IMERG), Tropical Rainfall Measuring Mission (TRMM), Modern-Era Retrospective Analysis for Research and Applications Version 2 (MERRA-2), and Precipitation Estimation from Remotely Sensed Information using Artificial Neural Networks (PERSIANN), against observed rainfall data from eight rain gauge stations of the India Meteorological Department (IMD) in the Kosi River basin, India, from 2001 to 2019. Various statistical metrics, contingency tests, trend analysis, and the rainfall anomaly index were applied at daily, monthly, seasonal, and annual time scales. The categorical metrics, namely probability of detection (POD) and false alarm ratio (FAR), indicate that the MERRA-2 and IMERG datasets agree most closely with the observed daily data. Statistical comparison of the gridded datasets with the observed IMD dataset showed that the IMERG dataset performs better than the MERRA-2, PERSIANN, and TRMM datasets, with "very good" coefficient of determination (R2) and Nash-Sutcliffe Efficiency values for monthly data. Trend analysis of the gridded seasonal IMERG data showed trends similar to those of the observed seasonal data, whereas the other datasets differ. IMERG also performed well in identifying wet and dry years based on annual data. Discrepancies in how the satellite sensors capture precipitation are also discussed. Thus, the IMERG dataset can be used effectively for hydro-meteorological and climatological investigations where observed datasets are lacking.
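    The two categorical metrics mentioned above come from a rain/no-rain contingency table and can be sketched as follows. The function name `pod_far`, the sample rainfall values, and the 1.0 mm/day rain-day threshold are assumptions for illustration; the abstract does not state which threshold the study used.

```python
def pod_far(observed, estimated, threshold=1.0):
    """Probability of detection (POD) and false alarm ratio (FAR).

    observed / estimated: daily rainfall (mm) from gauges and a gridded product.
    threshold: rain-day cutoff in mm/day (hypothetical default).
    POD = hits / (hits + misses); FAR = false alarms / (hits + false alarms).
    """
    hits = misses = false_alarms = 0
    for obs, est in zip(observed, estimated):
        obs_rain = obs >= threshold
        est_rain = est >= threshold
        if obs_rain and est_rain:
            hits += 1
        elif obs_rain:
            misses += 1
        elif est_rain:
            false_alarms += 1

    pod = hits / (hits + misses) if (hits + misses) else float("nan")
    far = false_alarms / (hits + false_alarms) if (hits + false_alarms) else float("nan")
    return pod, far


# Made-up daily values (mm), for illustration only.
gauge = [0.0, 5.0, 12.0, 0.2, 3.0]
satellite = [0.0, 4.0, 0.5, 2.0, 6.0]
print(pod_far(gauge, satellite))
```

    A POD close to 1 and a FAR close to 0 indicate that the gridded product detects observed rain days without reporting many spurious ones.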

  • Article type: Journal Article
    BACKGROUND: Electronic health records (EHRs) represent a comprehensive resource of a patient's medical history. EHRs are essential for utilizing advanced technologies such as deep learning (DL), enabling healthcare providers to analyze extensive data, extract valuable insights, and make precise, data-driven clinical decisions. DL methods such as recurrent neural networks (RNNs) have been used to analyze EHRs to model disease progression and predict diagnosis. However, these methods do not address some inherent irregularities in EHR data, such as irregular time intervals between clinical visits. Furthermore, most DL models are not interpretable. In this study, we propose two interpretable DL architectures based on RNNs, namely time-aware RNN (TA-RNN) and TA-RNN-autoencoder (TA-RNN-AE), to predict a patient's clinical outcome in EHRs at the next visit and multiple visits ahead, respectively. To mitigate the impact of irregular time intervals, we propose incorporating a time embedding of the elapsed times between visits. For interpretability, we propose employing a dual-level attention mechanism that operates between visits and between features within each visit.
    RESULTS: Experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) and National Alzheimer's Coordinating Center (NACC) datasets indicated that the proposed models outperform state-of-the-art and baseline approaches in predicting Alzheimer's Disease (AD), as measured by F2 score and sensitivity. Additionally, TA-RNN showed superior performance on the Medical Information Mart for Intensive Care (MIMIC-III) dataset for mortality prediction. In our ablation study, we observed enhanced predictive performance from incorporating time embedding and attention mechanisms. Finally, investigating attention weights helped identify influential visits and features in predictions.
    METHODS: https://github.com/bozdaglab/TA-RNN.
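    The F2 score cited in these results is the F-beta measure with beta = 2, which weights recall (sensitivity) more heavily than precision. The sketch below uses the standard definition; the function name `f_beta` and the counts are hypothetical and are not taken from the TA-RNN repository.

```python
def f_beta(true_pos, false_pos, false_neg, beta=2.0):
    """F-beta score from confusion-matrix counts.

    beta > 1 favors recall over precision; beta = 2 gives the F2 score.
    Returns 0.0 when there are no true positives.
    """
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)  # also called sensitivity
    beta_sq = beta * beta
    return (1.0 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Made-up counts: 8 true positives, 4 false positives, 2 false negatives.
print(f_beta(8, 4, 2))            # F2
print(f_beta(8, 4, 2, beta=1.0))  # F1, for comparison
```

    Because a missed diagnosis is usually costlier than a false alarm in this setting, a recall-weighted score such as F2 is a natural choice alongside sensitivity.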

  • Article type: Journal Article
    BACKGROUND: RNA design has growing applications in synthetic biology and therapeutics, driven by the crucial role of RNA in various biological processes. A fundamental challenge is to find functional RNA sequences that satisfy given structural constraints, known as the inverse folding problem. Computational approaches based on secondary structures have emerged to address this problem. However, designing RNA sequences directly from 3D structures remains challenging because of the scarcity of data, the nonunique structure-sequence mapping, and the flexibility of RNA conformation.
    RESULTS: In this study, we propose RiboDiffusion, a generative diffusion model for RNA inverse folding that can learn the conditional distribution of RNA sequences given 3D backbone structures. Our model consists of a graph neural network-based structure module and a Transformer-based sequence module, and it iteratively transforms random sequences into desired sequences. By tuning the sampling weight, our model allows a trade-off between sequence recovery and diversity in order to explore more candidates. We split test sets based on RNA clustering with different cut-offs for sequence or structure similarity. Our model outperforms baselines in sequence recovery, with an average relative improvement of 11% for sequence similarity splits and 16% for structure similarity splits. Moreover, RiboDiffusion performs consistently well across various RNA length categories and RNA types. We also apply in silico folding to validate whether the generated sequences can fold into the given 3D RNA backbones. Our method could be a powerful tool for RNA design that explores the vast sequence space and finds novel solutions to 3D structural constraints.
    METHODS: The source code is available at https://github.com/ml4bio/RiboDiffusion.
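    Sequence recovery and relative improvement, the two quantities reported in these results, can be sketched as follows. This assumes the usual definition of recovery as the fraction of positions where the designed sequence matches the native one; the function names and the example sequences and rates are hypothetical and are not taken from the RiboDiffusion code.

```python
def sequence_recovery(native, designed):
    """Fraction of positions at which the designed sequence matches the native one."""
    if len(native) != len(designed) or not native:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(native, designed))
    return matches / len(native)


def relative_improvement(model_value, baseline_value):
    """Relative gain of a model over a baseline, e.g. 0.11 for an 11% improvement."""
    return (model_value - baseline_value) / baseline_value


# Made-up sequences: 6 of 8 nucleotides agree.
print(sequence_recovery("GGCAUCGA", "GGCUUCGG"))
# Made-up recovery rates: 0.555 against a baseline of 0.50.
print(relative_improvement(0.555, 0.50))
```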

  • Article type: Journal Article
    BACKGROUND: Genetic perturbations (e.g. knockouts, variants) have laid the foundation for our understanding of many diseases, implicating pathogenic mechanisms and indicating therapeutic targets. However, experimental assays are fundamentally limited by the number of measurable perturbations. Computational methods can fill this gap by predicting perturbation effects under novel conditions, but accurately predicting the transcriptional responses of cells to unseen perturbations remains a significant challenge.
    RESULTS: We address this by developing a novel attention-based neural network, AttentionPert, which accurately predicts gene expression under multiplexed perturbations and generalizes to unseen conditions. AttentionPert integrates global and local effects in a multi-scale model, representing both the nonuniform system-wide impact of the genetic perturbation and the localized disturbance in a network of gene-gene similarities, which enhances its ability to predict nuanced transcriptional responses to both single and multi-gene perturbations. In comprehensive experiments, AttentionPert demonstrates superior performance across multiple datasets, outperforming the state-of-the-art method in predicting differential gene expression and revealing novel gene regulations. AttentionPert marks a significant improvement over current methods, particularly in handling the diversity of gene perturbations and in predicting out-of-distribution scenarios.
    METHODS: Code is available at https://github.com/BaiDing1234/AttentionPert.

  • Article type: Journal Article
    CONCLUSIONS: Cis-acting mRNA elements play a key role in the regulation of mRNA stability and translation efficiency. Revealing the interactions of these elements and their impact is crucial for understanding the regulation of the mRNA translation process, which supports the development of mRNA-based medicines and vaccines. Deep neural networks (DNNs) can learn complex cis-regulatory codes from RNA sequences. However, extracting these cis-regulatory codes efficiently from DNNs remains a significant challenge. Here, we propose a method based on our toolkit NeuronMotif and motif mutagenesis, which not only enables the discovery of diverse and high-quality motifs but also efficiently reveals motif interactions. By interpreting deep-learning models, we have discovered several crucial motifs that impact mRNA translation efficiency and stability, as well as some unknown motifs or motif syntax, offering novel insights for biologists. Furthermore, we note that it is challenging to enrich motif syntax in datasets composed of randomly generated sequences, which may not contain sufficient biological signals.
    METHODS: The source code and data used to produce the results and analyses presented in this manuscript are available from GitHub (https://github.com/WangLabTHU/combmotif).