GenomeTrakr

  • Article type: Journal Article
    This document outlines the steps necessary to assemble and submit the standard data package required for contributing to the global genomic surveillance of enteric pathogens. Although targeted to GenomeTrakr laboratories and collaborators, these protocols are broadly applicable for enteric pathogens collected for different purposes. There are five protocols included in this chapter: (1) quality control (QC) assessment for the genome sequence data, (2) validation for the contextual data, (3) data submission for the standard pathogen package or Pathogen Data Object Model (DOM) to the public repository, (4) viewing and querying data at NCBI, and (5) data curation for maintaining relevance of public data. The data are available through one of the International Nucleotide Sequence Database Consortium (INSDC) members, with the National Center for Biotechnology Information (NCBI) being the primary focus of this document. NCBI Pathogen Detection is a custom dashboard at NCBI that provides easy access to pathogen data plus results for a standard suite of automated cluster and genotyping analyses important for informing public health and regulatory decision-making.

  • Article type: Journal Article
    Wastewater surveillance has emerged as a crucial public health tool for population-level pathogen surveillance. Supported by funding from the American Rescue Plan Act of 2021, the FDA's genomic epidemiology program, GenomeTrakr, was leveraged to sequence SARS-CoV-2 from wastewater sites across the United States. This initiative required the evaluation, optimization, development, and publication of new methods and analytical tools spanning sample collection through variant analyses. Version-controlled protocols for each step of the process were developed and published on protocols.io. A custom data analysis tool and a publicly accessible dashboard were built to facilitate real-time visualization of the collected data, focusing on the relative abundance of SARS-CoV-2 variants and sub-lineages across different samples and sites throughout the project. From September 2021 through June 2023, a total of 3,389 wastewater samples were collected, with 2,517 undergoing sequencing and submission to NCBI under the umbrella BioProject, PRJNA757291. Sequence data were released with explicit quality control (QC) tags on all sequence records, communicating our confidence in the quality of data. Variant analysis revealed wide circulation of Delta in the fall of 2021 and captured the sweep of Omicron and subsequent diversification of this lineage through the end of the sampling period. This project successfully achieved two important goals for the FDA's GenomeTrakr program: first, contributing timely genomic data for the SARS-CoV-2 pandemic response, and second, establishing both capacity and best practices for culture-independent, population-level environmental surveillance for other pathogens of interest to the FDA.
    OBJECTIVE: This paper serves two primary objectives. First, it summarizes the genomic and contextual data collected during a COVID-19 pandemic response project, which utilized the FDA's laboratory network, traditionally employed for sequencing foodborne pathogens, for sequencing SARS-CoV-2 from wastewater samples. Second, it outlines best practices for gathering and organizing population-level next-generation sequencing (NGS) data collected for culture-free surveillance of pathogens sourced from environmental samples.
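
    The "relative abundance" reported per sample can be illustrated with a minimal sketch. The function name and the lineage counts below are hypothetical and are not taken from the project's analysis tool, which the abstract does not describe in detail; the sketch only shows how per-lineage counts for one sample reduce to fractions, plus the share of collected samples that were sequenced (2,517 of 3,389, about 74%).

```python
def relative_abundance(lineage_counts):
    """Convert per-lineage read counts for one sample into fractions summing to 1.

    `lineage_counts` is a hypothetical mapping of lineage name -> supporting reads.
    Returns an empty dict when the sample has no assigned reads.
    """
    total = sum(lineage_counts.values())
    if total == 0:
        return {}
    return {lineage: count / total for lineage, count in lineage_counts.items()}


# Hypothetical counts for a single wastewater sample (illustration only).
sample_counts = {"Delta": 750, "Omicron": 250}
fractions = relative_abundance(sample_counts)

# Figures stated in the abstract: 2,517 of 3,389 collected samples were sequenced.
sequenced_fraction = 2517 / 3389

print(fractions)
print(f"{sequenced_fraction:.1%} of collected samples were sequenced")
```

    Real wastewater data are mixtures, so production tools typically estimate lineage proportions from mutation frequencies rather than from simple read tallies; the normalization step shown here is only the final, simplest part of that process.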

  • Article type: Journal Article
    BACKGROUND: Processing and analyzing whole genome sequencing (WGS) data is computationally intense: a single Illumina MiSeq WGS run produces ~1 million 250-base-pair reads for each of 24 samples. This poses significant obstacles for smaller laboratories, or laboratories not affiliated with larger projects, which may not have dedicated bioinformatics staff or computing power to effectively use genomic data to protect public health. Building on the success of the cloud-based Galaxy bioinformatics platform (http://galaxyproject.org), already known for its user-friendliness and powerful WGS analytical tools, the Center for Food Safety and Applied Nutrition (CFSAN) at the U.S. Food and Drug Administration (FDA) created a customized 'instance' of the Galaxy environment, called GalaxyTrakr (https://www.galaxytrakr.org), for use by laboratory scientists performing food-safety regulatory research. The goal was to enable laboratories outside of the FDA internal network to (1) perform quality assessments of sequence data, (2) identify links between clinical isolates and positive food/environmental samples, including those at the National Center for Biotechnology Information sequence read archive (https://www.ncbi.nlm.nih.gov/sra/), and (3) explore new methodologies such as metagenomics. GalaxyTrakr hosts a variety of free and adaptable tools and provides the data storage and computing power to run the tools. These tools support coordinated analytic methods and consistent interpretation of results across laboratories. Users can create and share tools for their specific needs and use sequence data generated locally and elsewhere.
    RESULTS: In its first full year (2018), GalaxyTrakr processed over 85,000 jobs and went from 25 to 250 users, representing 53 different public and state health laboratories, academic institutions, international health laboratories, and federal organizations. By mid-2020, it had grown to 600 registered users and processed over 450,000 analytical jobs. To illustrate how laboratories are making use of this resource, we describe how six institutions use GalaxyTrakr to quickly analyze and review their data. Instructions for participating in GalaxyTrakr are provided.
    CONCLUSIONS: GalaxyTrakr advances food safety by providing reliable and harmonized WGS analyses for public health laboratories and promoting collaboration across laboratories with differing resources. Anticipated enhancements to this resource will include workflows for additional foodborne pathogens, viruses, and parasites, as well as new tools and services.
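
    The data volume cited in the BACKGROUND (about 1 million 250-base-pair reads for each of 24 samples) can be turned into rough numbers with a short calculation. The 5 Mbp genome size below is an assumption chosen as a round figure for a typical enteric bacterium; it does not come from the abstract, and the function name is illustrative.

```python
def estimated_coverage(read_count, read_length_bp, genome_size_bp):
    """Average depth of coverage = total sequenced bases / genome size."""
    return read_count * read_length_bp / genome_size_bp


# Figures stated in the abstract.
READS_PER_SAMPLE = 1_000_000
READ_LENGTH_BP = 250
SAMPLES_PER_RUN = 24

# Assumption (not from the abstract): a genome of roughly 5 million base pairs.
ASSUMED_GENOME_SIZE_BP = 5_000_000

bases_per_sample = READS_PER_SAMPLE * READ_LENGTH_BP   # 250 million bases
bases_per_run = bases_per_sample * SAMPLES_PER_RUN     # 6 billion bases
coverage = estimated_coverage(
    READS_PER_SAMPLE, READ_LENGTH_BP, ASSUMED_GENOME_SIZE_BP
)

print(f"Bases per sample: {bases_per_sample:,}")
print(f"Bases per run:    {bases_per_run:,}")
print(f"Estimated coverage at 5 Mbp: {coverage:.0f}x")
```

    Under these assumptions a single run yields about 6 billion bases, which helps explain why laboratories without dedicated computing resources benefit from a hosted platform.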

  • Article type: Journal Article
    The holistic approach of One Health, which sees human, animal, plant, and environmental health as a unit rather than as discrete parts, requires not only interdisciplinary cooperation but also standardized methods for communicating and archiving data, enabling participants to easily share what they have learned and allowing others to build upon their findings. Ongoing work by NCBI and the GenomeTrakr project illustrates how open data platforms can help meet the needs of federal and state regulators, public health laboratories, departments of agriculture, and universities. Here we describe how microbial pathogen surveillance can be transformed by having an open access database along with Best Practices for contributors to follow. First, we describe the open pathogen surveillance framework, hosted on the NCBI platform. We cover the current community standards for WGS quality, provide an SOP for assessing your own sequence quality, and recommend QC thresholds for all submitters to follow. We then provide an overview of NCBI data submission along with step-by-step details. Finally, we provide curation guidance and an SOP for keeping your public data current within the database. These Best Practices can be models for other open data projects, thereby advancing the One Health goals of Findable, Accessible, Interoperable and Re-usable (FAIR) data.

  • Article type: Journal Article
    We review how FDA surveillance identifies several ways that whole genome sequencing (WGS) improves actionable outcomes for public health and compliance in a case involving Listeria monocytogenes contamination in an ice cream facility. In late August 2017, FDA conducted environmental sampling inside an ice cream facility. These isolates were sequenced and deposited into the GenomeTrakr databases. In September 2018, the Centers for Disease Control and Prevention contacted the Florida Department of Health after finding that the pathogen analyses of three clinical cases of listeriosis (two in 2013, one in 2018) were highly related to the aforementioned L. monocytogenes isolates collected from the ice cream facility in 2017. FDA returned to the ice cream facility in late September 2018, conducted further environmental sampling, and again recovered L. monocytogenes from environmental subsamples that were genetically related to the clinical cases. A voluntary recall was issued to include all ice cream manufactured from August 2017 to October 2018. Subsequently, FDA suspended this food facility's registration. WGS results for L. monocytogenes found in the facility and from clinical samples clustered together by 0-31 single nucleotide polymorphisms (SNPs). The FDA worked together with the Centers for Disease Control and Prevention, as well as the Florida Department of Health and the Florida Department of Agriculture and Consumer Services, to recall all ice cream products produced by this facility. Our data suggest that when available isolates from food facility inspections are subject to whole genome sequencing and the subsequent sequence data point to linkages between these strains and recent clinical isolates (i.e., <20 nucleotide differences), compliance officials should take regulatory actions early to prevent further potential illness.
    The utility of WGS for applications related to enforcement of FDA compliance programs in the context of foodborne pathogens is reviewed.
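
    The linkage criterion mentioned above (fewer than 20 nucleotide differences between facility and clinical isolates) can be illustrated with a minimal sketch. This is not the FDA's or NCBI's clustering pipeline: it assumes the sequences are already aligned to equal length, counts only positions where both isolates carry an unambiguous base, and uses hypothetical isolate names and toy sequences. The default threshold of 20 follows the abstract; the toy example uses a smaller cutoff because the sequences are only ten bases long.

```python
from itertools import combinations

UNAMBIGUOUS_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count differing positions between two aligned sequences.

    Positions where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in UNAMBIGUOUS_BASES and b in UNAMBIGUOUS_BASES and a != b
    )


def closely_related_pairs(alignment, max_differences=20):
    """Return (name_a, name_b, distance) for pairs with distance < max_differences."""
    pairs = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(sorted(alignment.items()), 2):
        distance = snp_distance(seq_a, seq_b)
        if distance < max_differences:
            pairs.append((name_a, name_b, distance))
    return pairs


# Hypothetical toy alignment (illustration only, not real isolate data).
toy_alignment = {
    "clinical_2018": "ACGTACGTAC",
    "facility_2017": "ACGTACGTAT",
    "unrelated": "TGCATGCANN",
}

# A cutoff of 3 is used here only because the toy sequences are very short.
linked = closely_related_pairs(toy_alignment, max_differences=3)
print(linked)
```

    In practice, SNP distances are computed over whole-genome alignments or reference-based variant calls, and a distance threshold is interpreted alongside epidemiological and inspection evidence rather than on its own.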

  • Article type: Journal Article
    Describing baseline microbiota associated with agricultural commodities in the field is an important step towards improving our understanding of a wide range of important objectives, from plant pathology and horticultural sustainability to food safety. Environmental pressures on plants (wind, dust, drought, water, temperature) vary by geography, and characterizing the impact of these variable pressures on phyllosphere microbiota will contribute to improved stewardship of fresh produce for both plant and human health. A higher resolution understanding of the incidence of human pathogens on food plants and co-occurring phytobiota using metagenomic approaches (metagenome tracking) may contribute to improved source attribution and risk assessment in cases where human pathogens become introduced to agro-ecologies. Between 1990 and 2007, as many as 1990 culture-confirmed Salmonella illnesses were linked to tomatoes from as many as 12 multistate outbreaks (Bell et al., 2012; Bell et al., 2015; Bennett et al., 2014; CDC, 2004; CDC, 2007; Greene et al., 2005a; Gruszynski et al., 2014). When possible, source attribution for these incidents revealed a biogeographic trend: most events were associated with eastern growing regions. To improve our understanding of potential biogeographically linked trends in contamination of tomatoes by Salmonella, we profiled microbiota from the surfaces of tomatoes from Virginia, Maryland, North Carolina and California. Bacterial profiles from California tomatoes were completely different from those of Maryland, Virginia and North Carolina (which were highly similar to each other). A statistically significant enrichment of Firmicutes taxa was observed in California phytobiota compared to the three eastern states. Rhizobiaceae, Sphingobacteriaceae and Xanthobacteraceae were the most abundant bacterial families associated with tomatoes grown in eastern states.
    These baseline metagenomic profiles of phyllosphere microbiota may contribute to improved understanding of how certain ecologies provide supportive resources for human pathogens on plants and how components of certain agro-ecologies may play a role in the introduction of human pathogens to plants.

  • Article type: Journal Article
    This protocol outlines all the steps necessary to become a GenomeTrakr data contributor. GenomeTrakr is an international genomic reference database of mostly food and environmental isolates of foodborne pathogens. The data and analyses are housed at the National Center for Biotechnology Information (NCBI), which is a database freely available to anyone in the world. The Pathogen Detection browser at NCBI computes daily cluster results, adding the newly submitted data to the existing phylogenetic clusters of closely related genomes. Contributors to this database can see how their new isolates are related to the real-time foodborne pathogen surveillance program established in the USA and a few other countries, while at the same time adding valuable new data to the reference database.

  • Article type: Journal Article
    Using whole-genome sequence (WGS) data from the GenomeTrakr network, a globally distributed network of laboratories sequencing foodborne pathogens, we present a new phylogeny of Salmonella enterica comprising 445 isolates from 266 distinct serovars and originating from 52 countries. This phylogeny includes two previously unidentified S. enterica subsp. enterica clades. Serovar Typhi is shown to be nested within clade A. Our findings are supported by both phylogenetic analysis, based on a core genome alignment, and Bayesian approaches, based on single-nucleotide polymorphisms. Serovar assignments were refined by in silico analysis using SeqSero. More than 10% of serovars were either polyphyletic or paraphyletic. We found variable genetic content in these isolates relating to gene mobilization and virulence factors which have different distributions within clades. Gifsy-1- and Gifsy-2-like phages appear more prevalent in clade A; other viruses are more evenly distributed. Our analyses reveal IncFII is the predominant plasmid replicon in S. enterica. Few core or clade-defining virulence genes are observed, and their distributions appear probabilistic in nature. Together, these patterns demonstrate that genetic exchange within S. enterica is more extensive and frequent than previously realized, which significantly alters how we view the genetic structure of the bacterial species.
    IMPORTANCE: Rapid improvements in nucleotide sequencing access and affordability have led to a drastic increase in availability of genetic information. This information will improve the accuracy of molecular descriptions, including serovars, within S. enterica. Although the concept of serovars continues to be useful, it may have more significant limitations than previously understood. Furthermore, the discrete absence or presence of specific genes can be an unstable indicator of phylogenetic identity. Whole-genome sequencing provides more rigorous tools for assessing the distributions of these genes. Our phylogenetic and genetic content analyses reveal how active genetic elements are dynamically distributed within a species, allowing us to better understand genetic reservoirs and underlying bacterial evolution.

  • Article type: Journal Article
    Pathogen monitoring is becoming more precise as sequencing technologies become more affordable and accessible worldwide. This transition is especially apparent in the field of food safety, which has demonstrated how whole-genome sequencing (WGS) can be used on a global scale to protect public health. GenomeTrakr coordinates the WGS performed by public-health agencies and other partners by providing a public database with real-time cluster analysis for foodborne pathogen surveillance. Because WGS is being used to support enforcement decisions, it is essential to have confidence in the quality of the data being used and the downstream data analyses that guide these decisions. Routine proficiency tests, such as the one described here, have an important role in ensuring the validity of both data and procedures. In 2015, the GenomeTrakr proficiency test distributed eight isolates of common foodborne pathogens to participating laboratories, which were required to follow a specific protocol for performing WGS. Resulting sequence data were evaluated for several metrics, including proper labelling, sequence quality and new single nucleotide polymorphisms (SNPs). Illumina MiSeq sequence data collected for the same set of strains across 21 different laboratories exhibited high reproducibility, while revealing a narrow range of technical and biological variance. The numbers of SNPs reported for sequencing runs of the same isolates across multiple laboratories support the robustness of our cluster analysis pipeline, in that each individual isolate, cultured and resequenced multiple times in multiple places, is easily identifiable as originating from the same source.