
  • Article type: Journal Article
    This document outlines the steps necessary to assemble and submit the standard data package required for contributing to the global genomic surveillance of enteric pathogens. Although targeted to GenomeTrakr laboratories and collaborators, these protocols are broadly applicable for enteric pathogens collected for different purposes. There are five protocols included in this chapter: (1) quality control (QC) assessment for the genome sequence data, (2) validation for the contextual data, (3) data submission for the standard pathogen package or Pathogen Data Object Model (DOM) to the public repository, (4) viewing and querying data at NCBI, and (5) data curation for maintaining relevance of public data. The data are available through one of the International Nucleotide Sequence Database Consortium (INSDC) members, with the National Center for Biotechnology Information (NCBI) being the primary focus of this document. NCBI Pathogen Detection is a custom dashboard at NCBI that provides easy access to pathogen data plus results for a standard suite of automated cluster and genotyping analyses important for informing public health and regulatory decision-making.






  • Article type: Journal Article
    Wastewater surveillance has emerged as a crucial public health tool for population-level pathogen surveillance. Supported by funding from the American Rescue Plan Act of 2021, the FDA's genomic epidemiology program, GenomeTrakr, was leveraged to sequence SARS-CoV-2 from wastewater sites across the United States. This initiative required the evaluation, optimization, development, and publication of new methods and analytical tools spanning sample collection through variant analyses. Version-controlled protocols for each step of the process were developed and published. A custom data analysis tool and a publicly accessible dashboard were built to facilitate real-time visualization of the collected data, focusing on the relative abundance of SARS-CoV-2 variants and sub-lineages across different samples and sites throughout the project. From September 2021 through June 2023, a total of 3,389 wastewater samples were collected, with 2,517 undergoing sequencing and submission to NCBI under the umbrella BioProject, PRJNA757291. Sequence data were released with explicit quality control (QC) tags on all sequence records, communicating our confidence in the quality of data. Variant analysis revealed wide circulation of Delta in the fall of 2021 and captured the sweep of Omicron and subsequent diversification of this lineage through the end of the sampling period. This project successfully achieved two important goals for the FDA's GenomeTrakr program: first, contributing timely genomic data for the SARS-CoV-2 pandemic response, and second, establishing both capacity and best practices for culture-independent, population-level environmental surveillance for other pathogens of interest to the FDA.
    OBJECTIVE: This paper serves two primary objectives. First, it summarizes the genomic and contextual data collected during a COVID-19 pandemic response project, which utilized the FDA's laboratory network, traditionally employed for sequencing foodborne pathogens, for sequencing SARS-CoV-2 from wastewater samples. Second, it outlines best practices for gathering and organizing population-level next generation sequencing (NGS) data collected for culture-free surveillance of pathogens sourced from environmental samples.






  • Article type: Journal Article
    BACKGROUND: Processing and analyzing whole genome sequencing (WGS) is computationally intense: a single Illumina MiSeq WGS run produces ~1 million 250-base-pair reads for each of 24 samples. This poses significant obstacles for smaller laboratories, or laboratories not affiliated with larger projects, which may not have dedicated bioinformatics staff or computing power to effectively use genomic data to protect public health. Building on the success of the cloud-based Galaxy bioinformatics platform, already known for its user-friendliness and powerful WGS analytical tools, the Center for Food Safety and Applied Nutrition (CFSAN) at the U.S. Food and Drug Administration (FDA) created a customized 'instance' of the Galaxy environment, called GalaxyTrakr, for use by laboratory scientists performing food-safety regulatory research. The goal was to enable laboratories outside of the FDA internal network to (1) perform quality assessments of sequence data, (2) identify links between clinical isolates and positive food/environmental samples, including those at the National Center for Biotechnology Information Sequence Read Archive, and (3) explore new methodologies such as metagenomics. GalaxyTrakr hosts a variety of free and adaptable tools and provides the data storage and computing power to run the tools. These tools support coordinated analytic methods and consistent interpretation of results across laboratories. Users can create and share tools for their specific needs and use sequence data generated locally and elsewhere.
    RESULTS: In its first full year (2018), GalaxyTrakr processed over 85,000 jobs and went from 25 to 250 users, representing 53 different public and state health laboratories, academic institutions, international health laboratories, and federal organizations. By mid-2020, it had grown to 600 registered users and processed over 450,000 analytical jobs. To illustrate how laboratories are making use of this resource, we describe how six institutions use GalaxyTrakr to quickly analyze and review their data. Instructions for participating in GalaxyTrakr are provided.
    CONCLUSIONS: GalaxyTrakr advances food safety by providing reliable and harmonized WGS analyses for public health laboratories and promoting collaboration across laboratories with differing resources. Anticipated enhancements to this resource will include workflows for additional foodborne pathogens, viruses, and parasites, as well as new tools and services.
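    The data volume quoted in the BACKGROUND above (about 1 million 250-base-pair reads for each of 24 samples) can be made concrete with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the 5 Mb genome size is an assumed round figure for a bacterial genome, not a value taken from the abstract.

```python
def run_yield_bases(samples: int = 24,
                    reads_per_sample: int = 1_000_000,
                    read_length: int = 250) -> int:
    """Total sequenced bases in one run, using the figures quoted in the abstract."""
    return samples * reads_per_sample * read_length


def per_sample_depth(reads_per_sample: int, read_length: int, genome_size: int) -> float:
    """Approximate average depth of coverage for one sample.

    genome_size is an assumption supplied by the caller (in bases).
    """
    return reads_per_sample * read_length / genome_size


if __name__ == "__main__":
    total = run_yield_bases()
    print(f"bases per run: {total:,}")
    # Assumed 5 Mb genome, purely for illustration.
    depth = per_sample_depth(1_000_000, 250, 5_000_000)
    print(f"approximate depth per sample: {depth:.0f}x")
```

    Under these assumptions a single run amounts to several billion bases, which gives a sense of why storage and compute can be a barrier for smaller laboratories.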







  • Article type: Journal Article
    The holistic approach of One Health, which sees human, animal, plant, and environmental health as a unit, rather than discrete parts, requires not only interdisciplinary cooperation, but standardized methods for communicating and archiving data, enabling participants to easily share what they have learned and allow others to build upon their findings. Ongoing work by NCBI and the GenomeTrakr project illustrates how open data platforms can help meet the needs of federal and state regulators, public health laboratories, departments of agriculture, and universities. Here we describe how microbial pathogen surveillance can be transformed by having an open access database along with Best Practices for contributors to follow. First, we describe the open pathogen surveillance framework, hosted on the NCBI platform. We cover the current community standards for WGS quality, provide an SOP for assessing your own sequence quality, and recommend QC thresholds for all submitters to follow. We then provide an overview of NCBI data submission along with step-by-step details. Finally, we provide curation guidance and an SOP for keeping your public data current within the database. These Best Practices can be models for other open data projects, thereby advancing the One Health goals of Findable, Accessible, Interoperable and Re-usable (FAIR) data.







  • Article type: Journal Article
    We review how FDA surveillance identifies several ways that whole genome sequencing (WGS) improves actionable outcomes for public health and compliance in a case involving Listeria monocytogenes contamination in an ice cream facility. In late August 2017 FDA conducted environmental sampling inside an ice cream facility. These isolates were sequenced and deposited into the GenomeTrakr databases. In September 2018 the Centers for Disease Control and Prevention contacted the Florida Department of Health after finding that the pathogen analyses of three clinical cases of listeriosis (two in 2013, one in 2018) were highly related to the aforementioned L. monocytogenes isolates collected from the ice cream facility in 2017. FDA returned to the ice cream facility in late September 2018, conducted further environmental sampling, and again recovered L. monocytogenes from environmental subsamples that were genetically related to the clinical cases. A voluntary recall was issued to include all ice cream manufactured from August 2017 to October 2018. Subsequently, FDA suspended this food facility's registration. WGS results for L. monocytogenes found in the facility and from clinical samples clustered together by 0-31 single nucleotide polymorphisms (SNPs). The FDA worked together with the Centers for Disease Control and Prevention, as well as the Florida Department of Health and the Florida Department of Agriculture and Consumer Services, to recall all ice cream products produced by this facility. Our data suggest that when available isolates from food facility inspections are subject to whole genome sequencing and the subsequent sequence data point to linkages between these strains and recent clinical isolates (i.e., <20 nucleotide differences), compliance officials should take regulatory actions early to prevent further potential illness. The utility of WGS for applications related to enforcement of FDA compliance programs in the context of foodborne pathogens is reviewed.
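    The "<20 nucleotide differences" rule of thumb mentioned above can be illustrated with a minimal sketch. It assumes the isolates have already been reduced to equal-length aligned sequences; the function names, isolate labels, and toy sequences are hypothetical, and this is not the pipeline used by FDA or NCBI, which is considerably more involved.

```python
from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count aligned positions where two sequences differ, ignoring gaps and Ns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    ignore = {"N", "-"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in ignore and y not in ignore
    )


def linked_pairs(isolates, threshold=20):
    """Return (name1, name2, distance) for pairs with fewer than `threshold` differences."""
    hits = []
    for (n1, s1), (n2, s2) in combinations(isolates.items(), 2):
        d = snp_distance(s1, s2)
        if d < threshold:
            hits.append((n1, n2, d))
    return hits


if __name__ == "__main__":
    # Toy data only; real comparisons span whole-genome alignments.
    toy = {
        "env_2017": "ACGTACGT",
        "clin_2018": "ACGTACGA",
        "unrelated": "TTTTTTTT",
    }
    for n1, n2, d in linked_pairs(toy, threshold=2):
        print(n1, n2, d)
```

    The threshold is a parameter rather than a constant because the abstract presents the cutoff as guidance for early regulatory action, not as a fixed definition of relatedness.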






  • Article type: Journal Article
    Describing baseline microbiota associated with agricultural commodities in the field is an important step towards improving our understanding of a wide range of important objectives, from plant pathology and horticultural sustainability to food safety. Environmental pressures on plants (wind, dust, drought, water, temperature) vary by geography, and characterizing the impact of these variable pressures on phyllosphere microbiota will contribute to improved stewardship of fresh produce for both plant and human health. A higher resolution understanding of the incidence of human pathogens on food plants and co-occurring phytobiota using metagenomic approaches (metagenome tracking) may contribute to improved source attribution and risk assessment in cases where human pathogens become introduced to agro-ecologies. Between 1990 and 2007, as many as 1990 culture-confirmed Salmonella illnesses were linked to tomatoes from as many as 12 multistate outbreaks (Bell et al., 2012; Bell et al., 2015; Bennett et al., 2014; CDC, 2004; CDC, 2007; Greene et al., 2005a; Gruszynski et al., 2014). When possible, source attribution for these incidents revealed a biogeographic trend: most events were associated with eastern growing regions. To improve our understanding of potential biogeographically linked trends in contamination of tomatoes by Salmonella, we profiled microbiota from the surfaces of tomatoes from Virginia, Maryland, North Carolina and California. Bacterial profiles from California tomatoes were completely different from those of Maryland, Virginia and North Carolina (which were highly similar to each other). A statistically significant enrichment of Firmicutes taxa was observed in California phytobiota compared to the three eastern states. Rhizobiaceae, Sphingobacteriaceae and Xanthobacteraceae were the most abundant bacterial families associated with tomatoes grown in eastern states. These baseline metagenomic profiles of phyllosphere microbiota may contribute to improved understanding of how certain ecologies provide supportive resources for human pathogens on plants and how components of certain agro-ecologies may play a role in the introduction of human pathogens to plants.






  • Article type: Journal Article
    This protocol outlines all the steps necessary to become a GenomeTrakr data contributor. GenomeTrakr is an international genomic reference database of mostly food and environmental isolates from foodborne pathogens. The data and analyses are housed at the National Center for Biotechnology Information (NCBI), which is a database freely available to anyone in the world. The Pathogen Detection browser at NCBI computes daily cluster results, adding the newly submitted data to the existing phylogenetic clusters of closely related genomes. Contributors to this database can see how their new isolates are related to the real-time foodborne pathogen surveillance program established in the USA and a few other countries, while at the same time adding valuable new data to the reference database.






  • Article type: Journal Article
    Using whole-genome sequence (WGS) data from the GenomeTrakr network, a globally distributed network of laboratories sequencing foodborne pathogens, we present a new phylogeny of Salmonella enterica comprising 445 isolates from 266 distinct serovars and originating from 52 countries. This phylogeny includes two previously unidentified S. enterica subsp. enterica clades. Serovar Typhi is shown to be nested within clade A. Our findings are supported by both phylogenetic support, based on a core genome alignment, and Bayesian approaches, based on single-nucleotide polymorphisms. Serovar assignments were refined by in silico analysis using SeqSero. More than 10% of serovars were either polyphyletic or paraphyletic. We found variable genetic content in these isolates relating to gene mobilization and virulence factors which have different distributions within clades. Gifsy-1- and Gifsy-2-like phages appear more prevalent in clade A; other viruses are more evenly distributed. Our analyses reveal IncFII is the predominant plasmid replicon in S. enterica. Few core or clade-defining virulence genes are observed, and their distributions appear probabilistic in nature. Together, these patterns demonstrate that genetic exchange within S. enterica is more extensive and frequent than previously realized, which significantly alters how we view the genetic structure of the bacterial species.
    IMPORTANCE: Rapid improvements in nucleotide sequencing access and affordability have led to a drastic increase in availability of genetic information. This information will improve the accuracy of molecular descriptions, including serovars, within S. enterica. Although the concept of serovars continues to be useful, it may have more significant limitations than previously understood. Furthermore, the discrete absence or presence of specific genes can be an unstable indicator of phylogenetic identity. Whole-genome sequencing provides more rigorous tools for assessing the distributions of these genes. Our phylogenetic and genetic content analyses reveal how active genetic elements are dynamically distributed within a species, allowing us to better understand genetic reservoirs and underlying bacterial evolution.







  • Article type: Journal Article
    Pathogen monitoring is becoming more precise as sequencing technologies become more affordable and accessible worldwide. This transition is especially apparent in the field of food safety, which has demonstrated how whole-genome sequencing (WGS) can be used on a global scale to protect public health. GenomeTrakr coordinates the WGS performed by public-health agencies and other partners by providing a public database with real-time cluster analysis for foodborne pathogen surveillance. Because WGS is being used to support enforcement decisions, it is essential to have confidence in the quality of the data being used and the downstream data analyses that guide these decisions. Routine proficiency tests, such as the one described here, have an important role in ensuring the validity of both data and procedures. In 2015, the GenomeTrakr proficiency test distributed eight isolates of common foodborne pathogens to participating laboratories, which were required to follow a specific protocol for performing WGS. Resulting sequence data were evaluated for several metrics, including proper labelling, sequence quality and new single nucleotide polymorphisms (SNPs). Illumina MiSeq sequence data collected for the same set of strains across 21 different laboratories exhibited high reproducibility, while revealing a narrow range of technical and biological variance. The numbers of SNPs reported for sequencing runs of the same isolates across multiple laboratories support the robustness of our cluster analysis pipeline, in that isolates cultured and resequenced multiple times in multiple places are all easily identifiable as originating from the same source.





