Structural similarity

  • Article type: Journal Article
    The structures of the Fc base of various IgG antibodies have been examined with a view to understanding how this region can be used to conjugate IgG to nanoparticles. The base structure is found to be largely consistent across a range of species and subtypes, comprising a hydrophobic region surrounded by hydrophilic residues, some of which are charged under physiological conditions. In addition, atomistic molecular dynamics simulations were performed with neutral and negatively charged gold nanoparticles to explore how model nanoparticles interact with the base. Both types of nanoparticle interacted readily with the base, leading to an adaptation of the antibody base surface that enhanced the interactions. Furthermore, these interactions left the rest of the domain at the base of the Fc region structurally intact. This implies that coupling nanoparticles to the base of an IgG molecule is both feasible and desirable, since it leaves the antibody free to interact with its surroundings so that antigen-binding functionality can be retained. These results will therefore help guide future attempts to develop new nanotechnologies that exploit the unique properties of both antibodies and nanoparticles.

  • Article type: Journal Article
    Secondary active transporters shuttle substrates across eukaryotic and prokaryotic membranes, utilizing different electrochemical gradients. They are recognized as one class of antimicrobial efflux pumps among pathogens. While primary active transporters within the genome of C. difficile 630 have been completely cataloged, the systematic study of secondary active transporters remains incomplete. Here, we not only identify secondary active transporters but also disclose their evolution and role in drug resistance in C. difficile 630. Our analysis reveals that C. difficile 630 carries 147 secondary active transporters belonging to 27 (super)families. Notably, 50 (34%) of them potentially contribute to antimicrobial resistance (AMR). AMR-secondary active transporters are structurally classified into five (super)families: the p-aminobenzoyl-glutamate transporter (AbgT) family, the drug/metabolite transporter (DMT) superfamily, the major facilitator superfamily (MFS), the multidrug and toxic compound extrusion (MATE) family, and the resistance-nodulation-division (RND) family. Surprisingly, complete RND genes found in C. difficile 630 are likely an evolutionary leftover from the common ancestor with diderm bacteria. Through protein structure comparisons, we have potentially identified six novel AMR-secondary active transporters from the DMT, MATE, and MFS (super)families. Pangenome analysis revealed that half of the AMR-secondary transporters are accessory genes, which indicates an important role in adaptive AMR function rather than innate physiological homeostasis. Gene expression profiles firmly support their ability to respond to a wide spectrum of antibiotics. Our findings highlight the evolution of AMR-secondary active transporters and their integral role in antibiotic responses. This marks AMR-secondary active transporters as interesting therapeutic targets to synergize with other antibiotic activities.

  • Article type: Journal Article
    Small molecule identification is a crucial task in analytical chemistry and life sciences. One of the most commonly used technologies to elucidate small molecule structures is mass spectrometry. Spectral library search of product ion spectra (MS/MS) is a popular strategy to identify molecules or find structural analogues. This approach relies on the assumption that spectral similarity and structural similarity are correlated. However, popular spectral similarity measures, usually calculated based on identical fragment matches between the MS/MS spectra, do not always accurately reflect the structural similarity. In this study, we propose TransExION, a Transformer-based Explainable similarity metric for IONS. TransExION detects related fragments between MS/MS spectra through their mass difference and uses these to estimate spectral similarity. These related fragments can be nearly identical, but can also share a substructure. TransExION also provides a post-hoc explanation of its estimation, which can be used to support scientists in evaluating the spectral library search results and thus in structure elucidation of unknown molecules. Our model has a Transformer-based architecture and is trained on data derived from GNPS MS/MS libraries. The experimental results show that it improves on existing spectral similarity measures in searching and interpreting structural analogues as well as in molecular networking. SCIENTIFIC CONTRIBUTION: We propose a Transformer-based spectral similarity metric that improves the comparison of small molecule tandem mass spectra. We provide a post-hoc explanation that can serve as a good starting point for unknown spectra annotation based on database spectra.
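
    The "identical fragment match" baseline that the abstract contrasts with TransExION can be sketched as a matched-peak cosine score. This is a minimal illustration, not the TransExION model: the function name, the greedy matching rule, and the tolerance `tol` are assumptions chosen for the example.

```python
import math

def cosine_similarity(spec_a, spec_b, tol=0.01):
    """Greedy matched-peak cosine score between two spectra.

    Each spectrum is a list of (m/z, intensity) pairs. Two peaks may match
    only when their m/z values differ by at most `tol` (assumed, in Da).
    """
    # All admissible peak pairs, scored by the product of their intensities.
    candidates = [
        (ia * ib, i, j)
        for i, (mza, ia) in enumerate(spec_a)
        for j, (mzb, ib) in enumerate(spec_b)
        if abs(mza - mzb) <= tol
    ]
    candidates.sort(reverse=True)  # strongest pairs are assigned first

    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        # Each peak is used at most once.
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product

    norm_a = math.sqrt(sum(inten * inten for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten * inten for _, inten in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

    With this score, a spectrum whose fragments are all shifted by the same mass (for example, a structural analogue carrying one extra group) shares no matched peaks with the original and scores 0, which is the kind of case where identical-fragment measures fail to reflect structural similarity.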

  • Article type: Journal Article
    Accurately detecting defects while reconstructing a high-quality normal background in surface defect detection using unsupervised methods remains a significant challenge. This study proposes an unsupervised method that effectively addresses this challenge by achieving both accurate defect detection and a high-quality normal background reconstruction without noise. We propose an adaptive weighted structural similarity (AW-SSIM) loss for focused feature learning. AW-SSIM improves structural similarity (SSIM) loss by assigning different weights to its sub-functions of luminance, contrast, and structure based on their relative importance for a specific training sample. Moreover, it dynamically adjusts the Gaussian window's standard deviation (σ) during loss calculation to balance noise reduction and detail preservation. An artificial defect generation algorithm (ADGA) is proposed to generate artificial defects that closely resemble real ones. We use a two-stage training strategy. In the first stage, the model trains only on normal samples using AW-SSIM loss, allowing it to learn robust representations of normal features. In the second stage of training, the weights obtained from the first stage are used to train the model on both normal and artificially defective training samples. Additionally, the second stage employs a combined Learned Perceptual Image Patch Similarity (LPIPS) and AW-SSIM loss. The combined loss helps the model achieve high-quality normal background reconstruction while maintaining accurate defect detection. Extensive experimental results demonstrate that our proposed method achieves state-of-the-art defect detection accuracy. The proposed method achieved an average area under the receiver operating characteristic curve (AuROC) of 97.69% on six samples from the MVTec anomaly detection dataset.
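
    The three SSIM sub-functions named above can be sketched as follows. This is a simplified illustration, not the AW-SSIM loss itself: it uses global image statistics instead of a sliding Gaussian window (so the adaptive σ is not modelled), and the weighted-mean combination in `weighted_ssim_loss` is a hypothetical rule, since the abstract does not state how the weights are applied. The constants follow the conventional SSIM choices (0.01 and 0.03 times the dynamic range).

```python
def ssim_terms(x, y, dynamic_range=1.0):
    """Return (luminance, contrast, structure) for two equal-length pixel lists."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    sd_x, sd_y = var_x ** 0.5, var_y ** 0.5

    # Small stabilising constants, as in the conventional SSIM definition.
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    c3 = c2 / 2

    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov + c3) / (sd_x * sd_y + c3)
    return luminance, contrast, structure


def weighted_ssim_loss(x, y, weights=(1.0, 1.0, 1.0)):
    """Hypothetical weighted loss: 1 minus the weighted mean of the three terms."""
    terms = ssim_terms(x, y)
    score = sum(w * t for w, t in zip(weights, terms)) / sum(weights)
    return 1.0 - score
```

    For a pure brightness shift, only the luminance term drops below 1, so a weight vector that emphasises luminance penalises the shift while one that emphasises contrast and structure nearly ignores it. That is the kind of per-sample trade-off a weighting scheme can control.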

  • Article type: Journal Article
    Protein structure determination and prediction, active site detection, and protein sequence alignment techniques all exploit information about protein structure and structural relationships. For membrane proteins, however, there is limited agreement among available online tools for highlighting and mapping such structural similarities. Moreover, no available resource provides a systematic overview of quaternary and internal symmetries, and their orientation relative to the membrane, despite the fact that these properties can provide key insights into membrane protein function and evolution. Here, we describe the Encyclopedia of Membrane Proteins Analyzed by Structure and Symmetry (EncoMPASS), a database for relating integral membrane proteins of known structure from the points of view of sequence, structure, and symmetry. EncoMPASS is accessible through a web interface, and its contents can be easily downloaded. This allows the user not only to focus on specific proteins, but also to study general properties of the structure and evolution of membrane proteins.

  • Article type: Journal Article
    The read-across method is a popular data gap filling technique with established applications for multiple purposes, including regulatory use. Read-across has been widely used to fill chemical toxicity data gaps within the US Environmental Protection Agency's (US EPA) New Chemicals Program under the Toxic Substances Control Act (TSCA), as well as within technical guidance published by the Organization for Economic Co-operation and Development, the European Chemicals Agency, and the European Center for Ecotoxicology and Toxicology of Chemicals. Under the TSCA New Chemicals Review Program, US EPA is tasked with reviewing proposed new chemical applications before commercial manufacturing within, or import into, the United States begins. The primary goal of this review is to identify any unreasonable human health and environmental risks arising from environmental releases/emissions during manufacturing and the resulting exposure from these environmental releases. The authors propose the application of read-across techniques for the development and use of a framework for estimating the emissions arising during the chemical manufacturing process. This methodology utilizes available emissions data from a structurally similar analogue chemical, or a group of structurally similar chemicals in a chemical family, taking into consideration their physicochemical properties under specified chemical process unit operations and conditions. This framework is also designed to apply existing knowledge of read-across principles previously utilized in toxicity estimation for an analogue or category of chemicals, and is introduced and extended with a concurrent case study.

  • Article type: Journal Article
    BACKGROUND: Image registration is a challenging problem in many clinical tasks, but deep learning has made significant progress in this area over the past few years. Real-time and robust registration has been made possible by supervised transformation estimation. However, the quality of registrations using this framework depends on the quality of ground truth labels such as displacement fields.
    OBJECTIVE: To propose a simple and reliable method for registering medical images based on image structure similarity in a completely unsupervised manner.
    METHODS: We proposed a deep cascade unsupervised deformable registration approach to align images without reliable clinical data labels. Our basic network was composed of a displacement estimation module (ResUnet) and a deformation module (spatial transformer layers). We adopted the l2-norm to regularize the deformation field instead of the traditional l1-norm regularization. Additionally, we utilized structural similarity (SSIM) estimation during the training stage to enhance the structural consistency between the deformed images and the reference images.
    RESULTS: Experimental results indicated that by incorporating SSIM loss, our cascaded methods not only achieved a higher Dice score of 0.9873, SSIM score of 0.9559, and normalized cross-correlation (NCC) score of 0.9950, and a lower relative sum of squared difference (SSD) error of 0.0313 on CT images, but also outperformed the comparative methods on the ultrasound dataset. Statistical t-tests also indicated that these improvements were statistically significant.
    CONCLUSIONS: In this study, the promising results based on diverse evaluation metrics have demonstrated that our model is simple and effective in deformable image registration (DIR). The generalization ability of the model was also verified through experiments on liver CT images and cardiac ultrasound images.
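
    Two of the evaluation metrics reported above can be sketched in a few lines. This is an illustrative sketch under assumptions: masks and images are flattened to plain lists, and the zero-mean form of NCC is used. The abstract does not say which NCC variant the authors computed.

```python
def dice_score(mask_a, mask_b):
    """Dice overlap of two binary masks given as flat lists of 0/1 values."""
    intersection = sum(a * b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    # Two empty masks are treated as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def ncc(x, y):
    """Zero-mean normalized cross-correlation of two flat intensity lists."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    num = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y))
    den = (sum((a - mu_x) ** 2 for a in x)
           * sum((b - mu_y) ** 2 for b in y)) ** 0.5
    # A constant image has no variance; report zero correlation in that case.
    return 0.0 if den == 0 else num / den
```

    Both scores approach 1 as the deformed image and the reference agree, which is why higher Dice and NCC values are read as better registration.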

  • Article type: Journal Article
    Patients with multiple sclerosis (MS) often take multiple drugs at the same time to modify the course of disease, alleviate neurological symptoms and manage co-existing conditions. A major consequence for a patient taking different medications is a higher risk of treatment failure and side effects. This is because a drug may alter the pharmacokinetic and/or pharmacodynamic properties of another drug, which is referred to as drug-drug interaction (DDI). We aimed to predict interactions of drugs that are used by patients with MS based on a deep neural network (DNN) using structural information as input. We further aimed to identify potential drug-food interactions (DFIs), which can affect drug efficacy and patient safety as well. We used DeepDDI, a multi-label classification model of specific DDI types, to predict changes in pharmacological effects and/or the risk of adverse drug events when two or more drugs are taken together. The original model with ~34 million trainable parameters was updated using >1 million DDIs recorded in the DrugBank database. Structure data of food components were obtained from the FooDB database. The medication plans of patients with MS (n = 627) were then searched for pairwise interactions between drug and food compounds. The updated DeepDDI model achieved accuracies of 92.2% and 92.1% on the validation and testing sets, respectively. The patients with MS used 312 different small molecule drugs as prescription or over-the-counter medications. In the medication plans, we identified 3748 DDIs in DrugBank and 13,365 DDIs using DeepDDI. At least one DDI was found for most patients (n = 509 or 81.2% based on the DNN model). The predictions revealed that many patients would be at increased risk of bleeding and bradycardic complications due to a potential DDI if they were to start a disease-modifying therapy with cladribine (n = 242 or 38.6%) and fingolimod (n = 279 or 44.5%), respectively. We also obtained numerous potential interactions for Bruton's tyrosine kinase inhibitors that are in clinical development for MS, such as evobrutinib (n = 434 DDIs). Food sources most often related to DFIs were corn (n = 5456 DFIs) and cow's milk (n = 4243 DFIs). We demonstrate that deep learning techniques can exploit chemical structure similarity to accurately predict DDIs and DFIs in patients with MS. Our study specifies drug pairs that potentially interact, suggests mechanisms causing adverse drug effects, informs about whether interacting drugs can be replaced with alternative drugs to avoid critical DDIs and provides dietary recommendations for MS patients who are taking certain drugs.

  • Article type: Journal Article
    Most supramolecular systems were discovered by using a trial-and-error approach, leading to numerous synthetic efforts to obtain optimal supramolecular building blocks for selective guest encapsulation. Here, we report a simple coassembly strategy for preparing tamoxifen-selective supramolecular nanomaterials in an aqueous solution. The synthetic amphiphile molecule, 1,1,2,2-tetraphenylethylene (TPE), promotes large tamoxifen aggregate disassembly into smaller, discrete aggregates such as ribbon-like and micellar assemblies in coassembled solutions, enhancing the solubility and dispersion. The TPE moiety exhibits enhanced emission upon tamoxifen interaction, enabling the observation of the coassembled species in an aqueous solution for cell imaging. The tamoxifen-selective fluorescent micelles in the presence of a 1:1 molar ratio of TPE derivative with tamoxifen show enhanced tamoxifen absorption and anticancer effects against MCF-7 breast cancer cells. These supramolecular approaches, based on the coassembly of building blocks with molecular structural similarity, can provide a novel strategy for the efficient development of selective molecular carriers with enhanced biological activities.

  • Article type: Journal Article
    BACKGROUND: Microsporidia are a large taxon of intracellular pathogens characterized by extraordinarily streamlined genomes with unusually high sequence divergence and many species-specific adaptations. These unique factors pose challenges for traditional genome annotation methods based on sequence similarity. As a result, many of the microsporidian genomes sequenced to date contain numerous genes of unknown function. Recent innovations in rapid and accurate structure prediction and comparison, together with the growing amount of data in structural databases, provide new opportunities to assist in the functional annotation of newly sequenced genomes.
    RESULTS: In this study, we established a workflow that combines sequence and structure-based functional gene annotation approaches employing a ChimeraX plugin named ANNOTEX (Annotation Extension for ChimeraX), allowing for visual inspection and manual curation. We employed this workflow on a high-quality telomere-to-telomere sequenced tetraploid genome of Vairimorpha necatrix. First, the 3080 predicted protein-coding DNA sequences, of which 89% were confirmed with RNA sequencing data, were used as input. Next, ColabFold was used to create protein structure predictions, followed by a Foldseek search for structural matching to the PDB and AlphaFold databases. The subsequent manual curation, using sequence and structure-based hits, increased the accuracy and quality of the functional genome annotation compared to results using only traditional annotation tools. Our workflow resulted in a comprehensive description of the V. necatrix genome, along with a structural summary of the most prevalent protein groups, such as the ricin B lectin family. In addition, and to test our tool, we identified the functions of several previously uncharacterized Encephalitozoon cuniculi genes.
    CONCLUSIONS: We provide a new functional annotation tool for divergent organisms and employ it on a newly sequenced, high-quality microsporidian genome to shed light on this uncharacterized intracellular pathogen of Lepidoptera. The addition of a structure-based annotation approach can serve as a valuable template for studying other microsporidian or similarly divergent species.