Keywords: GWAS, benchmarking, bioinformatics, genealogy, kinship, python, simulation

Source: DOI:10.12688/f1000research.122840.1

Abstract:
Motivation: Genotyping error can impact downstream single nucleotide polymorphism (SNP)-based analyses. Simulating various modes and levels of error can help investigators better understand potential biases caused by miscalled genotypes. Methods: We have developed and validated vcferr, a tool to probabilistically simulate genotyping error and missingness in variant call format (VCF) files. We demonstrate how vcferr can be used to address a research question by introducing varying levels of different types of error into a sample in a simulated pedigree and assessing how kinship analysis degrades as a function of the type and level of error. Software availability: vcferr is available for installation via PyPI (https://pypi.org/project/vcferr/) or conda (https://anaconda.org/bioconda/vcferr). The software is released under the MIT license with source code available on GitHub (https://github.com/signaturescience/vcferr).
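The idea of probabilistically simulating genotyping error and missingness can be illustrated with a minimal Python sketch. This is not vcferr's implementation or API: the function name `simulate_genotype`, the `error_probs` mapping, and the `p_missing` parameter are hypothetical names chosen for illustration, and the sketch assumes unphased biallelic genotypes written as `0/0`, `0/1`, or `1/1`.

```python
import random

# Assumed genotype encoding: unphased, biallelic VCF GT strings.
GENOTYPES = ("0/0", "0/1", "1/1")

def simulate_genotype(genotype, error_probs, p_missing=0.0, rng=random):
    """Return a possibly miscalled or missing version of `genotype`.

    error_probs (hypothetical structure) maps a true genotype to a dict of
    {erroneous genotype: probability}. p_missing is the probability that
    the call is dropped. The probabilities for one true genotype, plus
    p_missing, are assumed to sum to at most 1.
    """
    if genotype not in GENOTYPES:
        # Leave already-missing or multiallelic calls untouched.
        return genotype
    draw = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    for wrong_call, prob in error_probs.get(genotype, {}).items():
        cumulative += prob
        if draw < cumulative:
            return wrong_call
    if draw < cumulative + p_missing:
        return "./."
    return genotype

# Usage: 5% heterozygous dropout to homozygous reference, 1% missingness.
rng = random.Random(42)  # seeded so a run is reproducible
probs = {"0/1": {"0/0": 0.05}}
calls = ["0/1", "0/0", "1/1", "0/1"]
noisy = [simulate_genotype(g, probs, p_missing=0.01, rng=rng) for g in calls]
```

A single uniform draw per genotype keeps the error and missingness outcomes mutually exclusive, so the configured probabilities can be read directly as per-call rates.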