RESULTS: Here, we present the Omics Dataset Curation Toolkit (OMD Curation Toolkit), a Python 3 package designed to guide researchers through the curation of metadata and FASTQ files from public omics datasets. The workflow provides a standardized framework with multiple capabilities (collection, control checks, treatment, and integration) that ease the arduous task of curating public sequencing data projects. Although centered on the European Nucleotide Archive (ENA), most of the provided tools are generic and can be used to curate datasets from other sources.
CONCLUSIONS: The toolkit thus offers valuable tools for the in-house curation that is needed before public omics data can be re-used. Owing to its workflow structure and capabilities, it is easy to use and can benefit investigators developing novel omics meta-analyses based on sequencing data.