software benchmark

  • Article type: Journal Article
    Metagenome community analyses, driven by the continued development of sequencing technology, are rapidly providing insights into many aspects of microbiology and are becoming a cornerstone tool. Illumina, Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PacBio) are the leading technologies, each with its own advantages and drawbacks. Illumina provides accurate reads at a low cost, but the reads are too short to close bacterial genomes. Long reads overcome this limitation, but these technologies produce reads with lower accuracy (ONT) or with lower throughput (PacBio high-fidelity reads). In a critical first analysis step, reads are assembled to reconstruct genomes or individual genes within the community. However, to date, the performance of existing assemblers has never been challenged with a complex mock metagenome. Here, we evaluate the performance of current assemblers that use short reads, long reads or both on a complex mock metagenome consisting of 227 bacterial strains with varying degrees of relatedness. We show that many of the current assemblers are not suited to handling such a complex metagenome. In addition, hybrid assemblies do not fulfil their potential. We conclude that ONT reads assembled with CANU and Illumina reads assembled with SPAdes offer the best value for reconstructing genomes and individual genes of complex metagenomes, respectively.

  • Article type: Journal Article
    Since the seminal work of Cundall and Strack (1979), the Discrete Element Method (DEM) has become accepted as a key tool among researchers exploring the fundamental behavior of granular materials. Along with a sustained increase in the number of publications documenting the use of DEM in research, intensive development of new open-source and commercial DEM codes has taken place in recent decades. The credibility of these software packages depends on their capacity to replicate physical observations and to reproduce theoretical expressions. Researchers often calibrate DEM codes against laboratory data to gain confidence in their predictions; however, theoretical verifications at the macro and particle levels are often omitted or not explicitly documented or acknowledged. Validating DEM codes against theoretical expressions is fundamental to guaranteeing the reproducibility and generality of the software, and to avoiding bias in more complex simulations. In this article, a dataset providing numerical simulation data along with input files is presented. The dataset relates to a series of theoretical validation approaches, previously documented in the literature, that were applied here to verify the open-source DEM code LAMMPS. The ability of LAMMPS to capture the macroscopic behavior of granular packages is evaluated by shearing a face-centered cubic (FCC) array of monosized spheres. The calculation of particle translational/rotational motions and forces/torques is checked by considering a clump rolling down an inclined plane. Additionally, the stress-strain behavior of Toyoura sand under "drained" and "undrained" shearing is characterized by a series of LAMMPS outputs. The dataset collected from these simulations can be used to benchmark new or existing DEM codes. Both the LAMMPS input scripts and the simulation results for all the cases are available in a public repository.
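The inclined-plane check mentioned in this abstract compares simulated motion against a closed-form rigid-body result. The following is a minimal sketch of such an analytic reference, assuming a single solid sphere rolling without slipping rather than the clump geometry used in the dataset, which the abstract does not specify. The function names, the default inertia factor of 2/5 and the value of g are illustrative assumptions, not taken from the article.

```python
import math


def rolling_acceleration(theta_rad, inertia_factor=2.0 / 5.0, g=9.81):
    """Acceleration along the slope for rolling without slipping.

    a = g * sin(theta) / (1 + k), where k = I / (m * r**2).
    k = 2/5 is the value for a solid sphere (an assumption here;
    a clump would need its own moment of inertia).
    """
    return g * math.sin(theta_rad) / (1.0 + inertia_factor)


def min_friction_coefficient(theta_rad, inertia_factor=2.0 / 5.0):
    """Smallest friction coefficient that prevents slipping.

    The friction force is k*m*a and the normal force is m*g*cos(theta),
    which gives mu >= tan(theta) * k / (1 + k).
    """
    return math.tan(theta_rad) * inertia_factor / (1.0 + inertia_factor)


def distance_travelled(t, theta_rad, inertia_factor=2.0 / 5.0, g=9.81):
    """Distance along the slope after time t, starting from rest."""
    a = rolling_acceleration(theta_rad, inertia_factor, g)
    return 0.5 * a * t ** 2


if __name__ == "__main__":
    theta = math.radians(30.0)
    print("acceleration [m/s^2]:", rolling_acceleration(theta))
    print("minimum friction coefficient:", min_friction_coefficient(theta))
    print("distance after 1 s [m]:", distance_travelled(1.0, theta))
```

A DEM trajectory for the same slope angle could then be compared against `distance_travelled` at each output step, with the contact friction set above the value returned by `min_friction_coefficient` so that the no-slip assumption holds.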
