Keywords: Bayesian phylogeography; COVID-19; SARS-CoV-2; genomics; metadata; public health

MeSH terms: Humans; SARS-CoV-2 / genetics; Phylogeography; COVID-19 / epidemiology; Bayes Theorem; Metadata; Pandemics; Victoria

Source: DOI:10.1099/mgen.0.001099

Abstract:
Inferring the spatiotemporal spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) via Bayesian phylogeography has been complicated by the overwhelming sampling bias present in the global genomic dataset. Previous work has demonstrated the utility of metadata in addressing this bias. Specifically, including the recent travel history of SARS-CoV-2-positive individuals in extended phylogeographical models has increased the accuracy of estimates and suggested alternative hypotheses that were not apparent from genomic and geographical data alone. However, as the availability of comprehensive epidemiological metadata is limited, many of the current estimates rely on sequence data and basic metadata (i.e. sample date and location). As the bias within the SARS-CoV-2 sequence dataset is extensive, the degree to which we can rely on results drawn from standard phylogeographical models (i.e. discrete trait analysis) that lack integrated metadata is of great concern. This is particularly important when estimates influence and inform public health policy. We compared results generated from the same dataset using two discrete phylogeographical models: one including travel history metadata and one without. We used sequences from Victoria, Australia, in this case study because of two unique properties. First, a high proportion of cases was sequenced throughout 2020 within Victoria and the rest of Australia. Second, individual travel history was collected from returning travellers in Victoria during the first wave (January to May) of the coronavirus disease 2019 (COVID-19) pandemic. We found that including individual travel history was essential for estimating SARS-CoV-2 movement via discrete phylogeography models. Without the additional information provided by the travel history metadata, the discrete trait analysis could not be fitted to the data due to numerical instability. We also suggest that during the first wave of the COVID-19 pandemic in Australia, the primary driving force behind the spread of SARS-CoV-2 was viral importation from international locations. This case study demonstrates the necessity of robust genomic datasets supplemented with epidemiological metadata for generating accurate estimates from phylogeographical models in datasets that have significant sampling bias. For future work, we recommend the collection of metadata in conjunction with genomic data. Furthermore, we highlight the risk of applying phylogeographical models to biased datasets without incorporating appropriate metadata, especially when estimates influence public health policy decision making.