The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform the public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN-VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts. The goal of VirusSeq was to allow open access to Canadian SARS-CoV-2 genomic sequences and to enhanced, standardized contextual data that were unavailable in other repositories and that meet FAIR standards (Findable, Accessible, Interoperable and Reusable). In addition, the Portal data submission pipeline includes data quality checking procedures and appropriate acknowledgement of data generators, both of which encourage collaboration. From inception to execution, the Portal was developed with a conscientious focus on strong data governance principles and practices. Extensive efforts went into adhering to Canadian privacy laws, data security standards, and organizational processes. The Portal has been coupled with other resources such as Viral AI and was further leveraged by the Coronavirus Variants Rapid Response Network (CoVaRR-Net) to produce a suite of continually updated analytical tools and notebooks. Here we highlight the Portal, including its contextual data not available elsewhere, and Duotang, a web platform that presents key genomic epidemiology and modeling analyses on circulating and emerging SARS-CoV-2 variants in Canada. Duotang presents dynamic changes in the variant composition of SARS-CoV-2 in Canada and by province, estimates variant growth, and displays complementary interactive visualizations, with a text overview of the current situation.
The VirusSeq Data Portal and Duotang resources, alongside additional analyses and resources computed from the Portal (COVID-MVP, CoVizu), are all open-source and freely available. Together, they provide an updated picture of SARS-CoV-2 evolution to spur scientific discussions, inform public discourse, and support communication with and within public health authorities. They also serve as a framework for other jurisdictions interested in open, collaborative sequence data sharing and analyses.