Keywords: biodiversity; cloud computing; imputation; plants; research data commons

MeSH: Ecosystem; Genotype; Computational Biology; Software

Source: DOI:10.1515/jib-2022-0033

Abstract:
In recent years, advances in data collection in the life sciences have created growing demand and new opportunities for advanced bioinformatics. This covers data management as well as individual data analyses, and often spans the entire data life cycle. A variety of tools have been developed to store, share, or reuse the data produced in different domains, such as genotyping. Imputation in particular, as a subfield of genotyping, requires good Research Data Management (RDM) strategies to enable the use and reuse of genotypic data. Sustainable software requires tools, and ecosystems around them, that are reusable and maintainable. For streamlined tools, reusability can be achieved, for example, by standardizing the input and output of the different tools and adopting open, widely used file formats. Using such established file formats also allows the tools to be connected with others, improving the overall interoperability of the software. Finally, it is important to build strong communities that maintain the tools by developing and contributing new features and maintenance updates. This article presents concepts for these aims as applied to an imputation service.