ORFAN

ORFan
  • Article type: Journal Article
    With the enormous progress in the field of cardiovascular imaging in recent years, computed tomography (CT) has become readily available to phenotype atherosclerotic coronary artery disease. New analytical methods using artificial intelligence (AI) enable the analysis of complex phenotypic information of atherosclerotic plaques. In particular, deep learning-based approaches using convolutional neural networks (CNNs) facilitate tasks such as lesion detection, segmentation, and classification. New radiotranscriptomic techniques even capture underlying bio-histochemical processes through higher-order structural analysis of voxels on CT images. In the near future, the international large-scale Oxford Risk Factors And Non-invasive Imaging (ORFAN) study will provide a powerful platform for testing and validating prognostic AI-based models. The goal is to move these new approaches from research settings into the clinical workflow. In this review, we present an overview of existing AI-based techniques with a focus on imaging biomarkers to determine the degree of coronary inflammation, coronary plaques, and the associated risk. Further, current limitations of AI-based approaches, as well as the priorities to address these challenges, will be discussed. This will pave the way for an AI-enabled risk assessment tool to detect vulnerable atherosclerotic plaques and to guide treatment strategies for patients.

  • Article type: Journal Article
    High-throughput sequencing has allowed the discovery of many new viruses and viral organizations, increasing our comprehension of virus origin and evolution. Most RNA viruses are currently characterized through similarity searches of annotated virus databases. This approach limits the possibility of detecting completely new virus-encoded proteins with no detectable similarities to existing ones, i.e. ORFan proteins. A strong indication of the viral origin of an ORFan in a metatranscriptome is the lack of DNA corresponding to an assembled RNA sequence in the biological sample. Furthermore, sequence homology among ORFans and evidence of co-occurrence of these ORFans in specific host individuals provide further indication of a viral origin. Here, we use this theoretical framework to report the finding of three conserved clades of protein-coding RNA segments without a corresponding DNA in fungi. Protein sequence and structural alignment suggest these proteins are distantly related to viral RNA-dependent RNA polymerases (RdRP). In these new putative viral RdRP clades, no GDD catalytic triad is present; the most common putative catalytic triad is NDD, and one clade carries GDQ, a triad previously unreported at that site. SDD, HDD, and ADD are also represented. For most members of these three clades, we were able to associate a second genomic segment, coding for a protein of unknown function. We provisionally named this new group of viruses ormycovirus. Interestingly, all the members of one of these sub-clades (gammaormycovirus) accumulate more minus-sense RNA than plus-sense RNA during infection.

  • Article type: Journal Article
    The bacterial DNA gyrase complex (GyrA/GyrB) plays a crucial role during DNA replication and serves as a target for multiple antibiotics, including the fluoroquinolones. Despite it being a valuable antibiotic target, the emergence of resistance in pathogens including Pseudomonas aeruginosa is proving problematic. Here, we describe Igy, a peptide inhibitor of gyrase, encoded by Pseudomonas bacteriophage LUZ24 and other members of the Bruynoghevirus genus. Igy (5.6 kDa) inhibits in vitro gyrase activity and interacts with the P. aeruginosa GyrB subunit, possibly by DNA mimicry, as indicated by a de novo model of the peptide and mutagenesis. In vivo, overproduction of Igy blocks DNA replication and leads to cell death, also in fluoroquinolone-resistant bacterial isolates. These data highlight the potential of discovering phage-inspired leads for antibiotics development, supported by co-evolution, as Igy may serve as a scaffold for small molecule mimicry to target the DNA gyrase complex, without cross-resistance to existing molecules.

  • Article type: Journal Article
    Cryphonectria parasitica, the causal agent of chestnut blight, is controlled in Europe through the natural spread of Cryphonectria hypovirus 1 (CHV1), a mycovirus able to induce hypovirulence in the host. In recent years C. parasitica was reported infecting the Azerbaijani population of chestnut, but the presence of CHV1 still needs to be confirmed. The aim of this work was to investigate fifty-five C. parasitica isolates collected in Azerbaijan to describe the associated viruses. Our work found i) the first negative-sense ssRNA virus known to infect C. parasitica naturally, for which we propose the name Cryphonectria parasitica sclerotimonavirus 1 (CpSV1), and ii) an RNA sequence showing peculiar features suggesting a viral nature, for which we propose the name Cryphonectria parasitica ambivirus 1 (CpaV1). The discovery of CpaV1 expands our knowledge of the RNA virosphere, suggesting the existence of a new lineage that cannot presently be reliably associated with the monophyletic Riboviria.

  • Article type: Journal Article
    Here we report the discovery of Yaravirus, a lineage of amoebal virus with a puzzling origin and evolution. Yaravirus presents 80-nm-sized particles and a 44,924-bp dsDNA genome encoding 74 predicted proteins. Yaravirus genome annotation showed that none of its genes matched sequences of known organisms at the nucleotide level; at the amino acid level, six predicted proteins had distant matches in the nr database. Complementary prediction of three-dimensional structures indicated a possible function for 17 proteins in total. Furthermore, we were not able to retrieve viral genomes closely related to Yaravirus in 8,535 publicly available metagenomes spanning diverse habitats around the globe. The Yaravirus genome also contained six types of tRNAs that did not match commonly used codons. Proteomics revealed that Yaravirus particles contain 26 viral proteins, one of which potentially represents a divergent major capsid protein (MCP) with a predicted double jelly-roll domain. Structure-guided phylogeny of MCP suggests that Yaravirus groups together with the MCPs of Pleurochrysis endemic viruses. Yaravirus expands our knowledge of the diversity of DNA viruses. The phylogenetic distance between Yaravirus and all other viruses highlights our still preliminary assessment of the genomic diversity of eukaryotic viruses, reinforcing the need for the isolation of new viruses of protists.

  • Article type: Journal Article
    Orphan genes (also known as ORFans [i.e., orphan open reading frames]) are new genes that enable an organism to adapt to its specific living environment. Our focus in this study is to compare ORFans between pathogens (P) and nonpathogens (NP) of the same genus. Using the pangenome idea, we have identified 130,169 ORFans in nine bacterial genera (505 genomes) and classified these ORFans into four groups: (i) SS-ORFans (P), which are only found in a single pathogenic genome; (ii) SS-ORFans (NP), which are only found in a single nonpathogenic genome; (iii) PS-ORFans (P), which are found in multiple pathogenic genomes; and (iv) NS-ORFans (NP), which are found in multiple nonpathogenic genomes. Within the same genus, pathogens do not always have more genes, more ORFans, or more pathogenicity-related genes (PRGs), including prophages, pathogenicity islands (PAIs), virulence factors (VFs), and horizontal gene transfers (HGTs), than nonpathogens. Interestingly, in pathogens of the nine genera, the percentages of PS-ORFans are consistently higher than those of SS-ORFans, which is not true in nonpathogens. Similarly, in pathogens of the nine genera, the percentages of PS-ORFans matching the four types of PRGs are also always higher than those of SS-ORFans, but this is not true in nonpathogens. All of these findings suggest the greater importance of PS-ORFans for bacterial pathogenicity.
    IMPORTANCE: Recent pangenome analyses of numerous bacterial species have suggested that each genome of a single species may have a significant fraction of its gene content unique or shared by very few genomes (i.e., ORFans). We selected nine bacterial genera, each containing at least five pathogenic and five nonpathogenic genomes, to compare their ORFans in relation to pathogenicity-related genes. Pathogens in these genera are known to cause a number of common and devastating human diseases such as pneumonia, diphtheria, melioidosis, and tuberculosis. Thus, they are worthy of in-depth systems microbiology investigations, including the comparative study of ORFans between pathogens and nonpathogens. We provide direct evidence to suggest that ORFans shared by more pathogens are more associated with pathogenicity-related genes and thus are more important targets for development of new diagnostic markers or therapeutic drugs for bacterial infectious diseases.
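The four-group scheme in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `classify_orfan`, the genome identifiers, and the choice to return `None` for a gene family seen in both pathogenic and nonpathogenic genomes are assumptions made here for illustration only.

```python
from typing import Dict, Optional, Set


def classify_orfan(genomes_with_gene: Set[str],
                   is_pathogen: Dict[str, bool]) -> Optional[str]:
    """Assign an ORFan family to one of the four groups named in the abstract.

    genomes_with_gene: genomes (within one genus) that carry the gene family.
    is_pathogen: hypothetical lookup of genome id -> pathogenic or not.
    Returns None when the family is empty or occurs in both pathogenic and
    nonpathogenic genomes (an assumption; the abstract does not cover this case).
    """
    if not genomes_with_gene:
        return None
    statuses = {is_pathogen[g] for g in genomes_with_gene}
    if len(statuses) > 1:
        return None  # present in both P and NP genomes
    pathogenic = statuses.pop()
    single = len(genomes_with_gene) == 1
    if pathogenic:
        return "SS-ORFan (P)" if single else "PS-ORFan (P)"
    return "SS-ORFan (NP)" if single else "NS-ORFan (NP)"


# Hypothetical genus with two pathogenic and two nonpathogenic genomes.
status = {"p1": True, "p2": True, "n1": False, "n2": False}
print(classify_orfan({"p1"}, status))
print(classify_orfan({"p1", "p2"}, status))
print(classify_orfan({"n1", "n2"}, status))
```

In this sketch, group membership depends only on how many genomes carry the family and on whether those genomes are all pathogenic or all nonpathogenic.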

  • Article type: Journal Article
    The family Tectiviridae comprises a group of tailless, icosahedral, membrane-containing bacteriophages that can be divided into two groups by their hosts, either Gram-negative or Gram-positive bacteria. While the first group is composed of PRD1 and nearly identical well-characterized lytic viruses, the second one includes more variable temperate phages, like GIL16 or Bam35, whose hosts are Bacillus cereus and related Gram-positive bacteria. In the genome of Bam35, nearly half of the 32 annotated open reading frames (ORFs) have no homologs in databases (ORFans), being putative proteins of unknown function, which hinders the understanding of their biology. With the aim of increasing knowledge about the viral proteome, we carried out a comprehensive yeast two-hybrid analysis of all the putative proteins encoded by the Bam35 genome. The resulting protein interactome comprised 76 unique interactions among 24 proteins, of which 12 have an unknown function. These results suggest that the P17 protein is the minor capsid protein of Bam35 and P24 is the penton protein, with the latter finding also being supported by iterative threading protein modeling. Moreover, the inner membrane transglycosylase protein P26 could have an additional structural role. We also detected interactions involving nonstructural proteins, such as the DNA-binding protein P1 and the genome terminal protein (P4), which was confirmed by coimmunoprecipitation of recombinant proteins. Altogether, our results provide a functional view of the Bam35 viral proteome, with a focus on the composition and organization of the viral particle.
    IMPORTANCE: Tailless viruses of the family Tectiviridae can infect commensal and pathogenic Gram-positive and Gram-negative bacteria. Moreover, they have been proposed to be at the evolutionary origin of several groups of large eukaryotic DNA viruses and self-replicating plasmids. However, due to their ancient origin and complex diversity, many tectiviral proteins are ORFans of unknown function. Comprehensive protein-protein interaction (PPI) analysis of viral proteins can eventually disclose biological mechanisms and thus provide new insights into protein function unattainable by studying proteins one by one. Here we comprehensively describe intraviral PPIs among tectivirus Bam35 proteins determined using multivector yeast two-hybrid screening, and these PPIs were further supported by the results of coimmunoprecipitation assays and protein structural models. This approach allowed us to propose new functions for known proteins and hypothesize about the biological role of the localization of some viral ORFan proteins within the viral particle that will be helpful for understanding the biology of tectiviruses infecting Gram-positive bacteria.

  • Article type: Journal Article
    Predicted open reading frames (ORFs) that lack detectable homology to known proteins are termed ORFans. Despite their prevalence in metagenomes, the extent to which ORFans encode real proteins, the degree to which they can be annotated, and their functional contributions, remain unclear. To gain insights into these questions, we applied sensitive remote-homology detection methods to functionally analyze ORFans from soil, marine, and human gut metagenome collections. ORFans were identified, clustered into sequence families, and annotated through profile-profile comparison to proteins of known structure. We found that a considerable number of metagenomic ORFans (73,896 of 484,121, 15.3%) exhibit significant remote homology to structurally characterized proteins, providing a means for ORFan functional profiling. The extent of detected remote homology far exceeds that obtained for artificial protein families (1.4%). As expected for real genes, the predicted functions of ORFans are significantly similar to the functions of their gene neighbors (p < 0.001). Compared to the functional profiles predicted through standard homology searches, ORFans show biologically intriguing differences. Many ORFan-enriched functions are virus-related and tend to reflect biological processes associated with extreme sequence diversity. Each environment also possesses a large number of unique ORFan families and functions, including some known to play important community roles such as gut microbial polysaccharide digestion. Lastly, ORFans are a valuable resource for finding novel enzymes of interest, as we demonstrate through the identification of hundreds of novel ORFan metalloproteases that all possess a signature catalytic motif despite a general lack of similarity to known proteins. Our ORFan functional predictions are a valuable resource for discovering novel protein families and exploring the boundaries of protein sequence space. All remote homology predictions are available at http://doxey.uwaterloo.ca/ORFans.
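The headline proportion in this abstract can be checked with a short worked calculation. The counts (73,896 of 484,121) and the 1.4% figure for artificial protein families come from the abstract; the fold-difference is a derived quantity that the abstract does not state, and the variable names are illustrative.

```python
# Counts reported in the abstract.
orfans_with_remote_homology = 73_896
total_orfans = 484_121
artificial_family_rate_pct = 1.4  # reported rate for artificial protein families

# Share of metagenomic ORFans with significant remote homology.
observed_pct = 100 * orfans_with_remote_homology / total_orfans

# Derived here, not reported in the abstract: ratio of observed to artificial rate.
fold_over_artificial = observed_pct / artificial_family_rate_pct

print(f"{observed_pct:.1f}% of ORFans show remote homology")
print(f"about {fold_over_artificial:.1f}-fold the artificial-family rate")
```

The computed share rounds to the 15.3% quoted in the abstract.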

  • Article type: Journal Article
    ORFans are hypothetical proteins lacking any significant sequence similarity with other proteins. Here, we highlighted by quantitative proteomics the TGAM_1934 ORFan from the hyperradioresistant Thermococcus gammatolerans archaeon as one of the most abundant hypothetical proteins. This protein has been selected as a priority target for structure determination on the basis of its abundance in three cellular conditions. Its solution structure has been determined using multidimensional heteronuclear NMR spectroscopy. TGAM_1934 displays an original fold, although sharing some similarities with the 3D structure of the bacterial ortholog of frataxin, CyaY, a protein conserved in bacteria and eukaryotes and involved in iron-sulfur cluster biogenesis. These results highlight the potential of structural proteomics in prioritizing ORFan targets for structure determination based on quantitative proteomics data. The proteomic data and structure coordinates have been deposited to the ProteomeXchange with identifier PXD000402 (http://proteomecentral.proteomexchange.org/dataset/PXD000402) and Protein Data Bank under the accession number 2mcf, respectively.

  • Article type: Journal Article
    One consistent finding among studies using shotgun metagenomics to analyze whole viral communities is that most viral sequences show no significant homology to known sequences. Thus, bioinformatic analyses based on sequence collections such as GenBank nr, which are largely composed of sequences from known organisms, tend to ignore a majority of sequences within most shotgun viral metagenome libraries. Here we describe a bioinformatic pipeline, the Viral Informatics Resource for Metagenome Exploration (VIROME), that emphasizes the classification of viral metagenome sequences (predicted open reading frames) based on homology search results against both known and environmental sequences. Functional and taxonomic information is derived from five annotated sequence databases which are linked to the UniRef 100 database. Environmental classifications are obtained from hits against a custom database, MetaGenomes On-Line, which contains 49 million predicted environmental peptides. Each predicted viral metagenomic ORF run through the VIROME pipeline is placed into one of seven ORF classes; thus, every sequence receives a meaningful annotation. Additionally, the pipeline includes quality control measures to remove contaminating and poor-quality sequences and assesses the potential amount of cellular DNA contamination in a viral metagenome library by screening for rRNA genes. Access to the VIROME pipeline and analysis results is provided through a web-application interface that is dynamically linked to a relational back-end database. The VIROME web-application interface is designed to allow users flexibility in retrieving sequences (reads, ORFs, predicted peptides) and search results for focused secondary analyses.