Keywords: bioinformatics; cloud computing; data science; education; training

MeSH: Cloud Computing; Software; Computational Biology / methods; Programming Languages; High-Throughput Nucleotide Sequencing / methods; Genomics / methods; Humans

Source: DOI:10.1093/bib/bbae244

Abstract:
This manuscript describes the development of a resource module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning', https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial authored by the National Institute of General Medical Sciences, 'NIGMS Sandbox: A Learning Platform toward Democratizing Cloud Computing for Biomedical Research', at the beginning of this supplement. This module delivers learning materials that introduce the BASH (Bourne Again Shell) programming language for genomic data analysis, in an interactive format that uses appropriate cloud resources for data access and analyses. The next-generation sequencing revolution has generated massive amounts of novel biological data from a multitude of platforms that survey an ever-growing list of genomic modalities. These data require significant downstream computational and statistical analyses to yield meaningful biological insights. However, the skills required to generate these data are vastly different from the skills required to analyze them. Bench scientists who generate next-generation sequencing data often lack the training needed to analyze these datasets and require support from bioinformatics specialists. Dedicated computational training is required to empower biologists in genomic data analysis; however, learning to use a command line interface efficiently is a significant barrier to learning common analytical tools. Cloud platforms have the potential to democratize access to the technical tools and computational resources necessary to work with modern sequencing data, providing an effective framework for bioinformatics education. This module aims to provide an interactive platform that gradually builds the technical skills and knowledge needed to interact with genomics data on the command line in the Cloud.
The sandbox format of this module enables users to move through the material at their own pace and test their grasp of the material with knowledge self-checks before building on that material in the next sub-module. This manuscript describes the development of a resource module that is part of a learning platform named 'NIGMS Sandbox for Cloud-based Learning', https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial NIGMS Sandbox [1] at the beginning of this Supplement. This module delivers learning materials on the analysis of bulk and single-cell ATAC-seq data in an interactive format that uses appropriate cloud resources for data access and analyses.