MeSH: Algorithms; Amino Acid Sequence / genetics; Databases, Protein; Humans; Proteins / genetics; Sequence Alignment / methods; Sequence Analysis, Protein / methods; Sequence Analysis, RNA; Sequence Homology; Software

Source: DOI:10.1093/nar/gkz342

Abstract:
Here, we describe a web server that integrates structural alignments with the MAFFT multiple sequence alignment (MSA) tool. For this purpose, we have prepared a web-based Database of Aligned Structural Homologs (DASH), which provides structural alignments at the domain and chain levels for all proteins in the Protein Data Bank (PDB), and can be queried interactively or by a simple REST-like API. MAFFT-DASH integration can be invoked with a single flag on either the web (https://mafft.cbrc.jp/alignment/server/) or command-line versions of MAFFT. In our benchmarks using 878 cases from the BAliBase, HomFam, OXFam, Mattbench and SISYPHUS datasets, MAFFT-DASH showed 10-20% improvement over standard MAFFT for MSA problems with weak similarity, in terms of Sum-of-Pairs (SP), a measure of how well a program succeeds at aligning input sequences in comparison to a reference alignment. When MAFFT alignments were supplemented with homologous sequences, further improvement was observed. Potential applications of DASH beyond MSA enrichment include functional annotation through detection of remote homology and assembly of template libraries for homology modeling.
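The Sum-of-Pairs (SP) measure mentioned above can be illustrated with a minimal Python sketch. It assumes the common definition of SP: the fraction of residue pairs aligned in the reference alignment that are also aligned in the test alignment. The function names `aligned_pairs` and `sp_score` are hypothetical, and the benchmarks cited in the abstract may apply further conventions (for example, scoring only core reference columns) that this sketch does not model.

```python
from itertools import combinations


def aligned_pairs(alignment):
    """Return the set of residue pairs that share a column.

    `alignment` is a list of equal-length gapped strings ('-' = gap),
    with sequences in the same order in every alignment compared.
    Each residue is identified as (sequence index, ungapped position).
    """
    if not alignment:
        return set()
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows must have the same length")

    position = [0] * len(alignment)  # next ungapped index per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != "-":
                present.append((seq_idx, position[seq_idx]))
                position[seq_idx] += 1
        # Every pair of residues in this column counts as aligned.
        pairs.update(combinations(present, 2))
    return pairs


def sp_score(test, reference):
    """Fraction of reference-aligned residue pairs recovered by `test`."""
    ref_pairs = aligned_pairs(reference)
    if not ref_pairs:
        raise ValueError("reference alignment contains no aligned pairs")
    return len(ref_pairs & aligned_pairs(test)) / len(ref_pairs)


# Toy example (made-up sequences, not from any benchmark dataset).
reference = ["AC-GT", "ACAGT"]
test = ["ACG-T", "ACAGT"]
print(sp_score(test, reference))  # 3 of 4 reference pairs recovered
```

An SP of 1.0 means the test alignment reproduces every aligned residue pair in the reference; lower values indicate pairs that were placed in different columns.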