    The black-footed cat (Felis nigripes) is endemic to the arid regions of southern Africa. One of the world's smallest wild felids, the species occurs at low densities and is secretive and elusive, which makes ecological studies difficult. Genetic data could provide key information such as estimates of population size, sex ratios, and genetic diversity. In this study, we tested whether microsatellite loci can be successfully amplified from scat samples that can be collected noninvasively in the field. Using 21 blood and scat samples collected from the same individuals, we statistically tested whether nine microsatellites previously designed for use in domestic cats can be used to identify individual black-footed cats. Genotypes recovered from blood and scat samples were compared to assess loss of heterozygosity, allele dropout, and false alleles resulting from DNA degradation or PCR inhibitors present in scat samples. The microsatellite markers were also used to identify individuals from scats collected in the field that were not linked to any blood samples. All nine microsatellites used in this study amplified successfully and were polymorphic. The microsatellite loci had sufficient discriminatory power to distinguish individuals and identify clones. In conclusion, these molecular markers can be used to monitor populations of wild black-footed cats noninvasively. The resulting genetic data can contribute important information to guide future conservation initiatives.
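The blood-versus-scat comparison described above can be sketched as a per-locus classification. This is a minimal illustration, not the authors' procedure: it assumes the blood genotype is the error-free reference, that genotypes are stored as pairs of allele sizes, and that any scat allele absent from the reference counts as a false allele. The function names and locus names are hypothetical.

```python
def classify_locus(reference, test):
    """Compare one scat genotype (pair of allele sizes) with its blood reference.

    Returns 'match', 'dropout', 'false_allele', or 'missing'.
    """
    if test is None:
        return "missing"
    ref, obs = set(reference), set(test)
    if obs == ref:
        return "match"
    # Assumption: any allele absent from the reference is a false allele.
    if not obs <= ref:
        return "false_allele"
    # Remaining case: a heterozygous reference scored as a homozygote.
    return "dropout"


def error_rates(blood, scat):
    """Summarise dropout and false-allele rates across loci for one individual."""
    counts = {"match": 0, "dropout": 0, "false_allele": 0, "missing": 0}
    het_refs = 0  # scored loci where the reference is heterozygous
    for locus, ref in blood.items():
        outcome = classify_locus(ref, scat.get(locus))
        counts[outcome] += 1
        if outcome != "missing" and len(set(ref)) == 2:
            het_refs += 1
    scored = sum(counts.values()) - counts["missing"]
    return {
        # Dropout can only occur at heterozygous reference loci.
        "dropout_rate": counts["dropout"] / het_refs if het_refs else 0.0,
        "false_allele_rate": counts["false_allele"] / scored if scored else 0.0,
        "counts": counts,
    }


# Hypothetical genotypes for one individual (allele sizes in base pairs).
blood = {"locusA": (150, 154), "locusB": (200, 200), "locusC": (120, 126)}
scat = {"locusA": (150, 150), "locusB": (200, 202), "locusC": (120, 126)}
result = error_rates(blood, scat)
```

Dividing dropout counts by the number of heterozygous reference loci, rather than by all loci, is a common convention because homozygous loci cannot show dropout.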






    Linkage maps are essential for genetic mapping of phenotypic traits, map-based gene cloning, and marker-assisted selection in breeding applications. Construction of a high-quality saturated map requires high-quality genotypic data on a large number of molecular markers. Errors in genotyping cannot be completely avoided, no matter what platform is used. When the genotyping error rate reaches a threshold level, it seriously affects the accuracy of the constructed map and the reliability of subsequent genetic studies. In this study, repeated genotyping of two recombinant inbred line (RIL) populations derived from the crosses Yangxiaomai × Zhongyou 9507 and Jingshuang 16 × Bainong 64 was used to investigate the effect of genotyping errors on linkage map construction. Inconsistent data points between the two replications were regarded as genotyping errors, which were classified into three types. Treating these errors as missing values generated a non-erroneous data set. Firstly, linkage maps were constructed using the two replicates as well as the non-erroneous data set. Secondly, error correction methods implemented in the software packages QTL IciMapping (EC) and Genotype-Corrector (GC) were applied to the two replicates. Linkage maps were then constructed from the corrected genotypes and compared with those from the non-erroneous data set. A simulation study considering different levels of genotyping error was performed to investigate the impact of errors and the accuracy of the error correction methods. Results indicated that map length and marker order differed among the two replicates and the non-erroneous data sets in both RIL populations. For both actual and simulated populations, map length expanded as the error rate increased, and the correlation coefficient between linkage and physical maps decreased. Map quality can be improved by repeated genotyping and error correction algorithms. When it is impossible to genotype the whole mapping population repeatedly, repeated genotyping of 30% of the population is recommended. The EC method had a much lower false positive rate than the GC method under different error rates. This study systematically examined the impact of genotyping errors on linkage analysis, providing potential guidelines for improving the accuracy of linkage maps in the presence of genotyping errors.
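The replicate-comparison step can be illustrated with a short sketch: calls that disagree between the two genotyping runs are set to missing, which yields the kind of non-erroneous data set described above. The genotype codes, the missing-value symbol, and the handling of calls that are missing in only one replicate (kept as missing and excluded from the comparison) are assumptions made for this example, not details taken from the study.

```python
MISSING = "-"  # assumed symbol for a missing genotype call


def consensus_from_replicates(rep1, rep2):
    """Merge two genotyping replicates of the same lines.

    rep1, rep2: dict mapping line name -> list of genotype codes per marker.
    Returns (consensus, discordance_rate), where discordant calls are set
    to MISSING and the rate is computed over calls present in both replicates.
    """
    consensus = {}
    compared = 0
    discordant = 0
    for line, calls1 in rep1.items():
        calls2 = rep2[line]
        merged = []
        for a, b in zip(calls1, calls2):
            if a == MISSING or b == MISSING:
                # Assumption: a call missing in either replicate stays missing.
                merged.append(MISSING)
                continue
            compared += 1
            if a == b:
                merged.append(a)
            else:
                discordant += 1
                merged.append(MISSING)
        consensus[line] = merged
    rate = discordant / compared if compared else 0.0
    return consensus, rate


# Hypothetical RIL genotypes: "A" and "B" denote the two parental alleles.
rep1 = {"RIL1": ["A", "B", "A", "-"], "RIL2": ["B", "B", "A", "A"]}
rep2 = {"RIL1": ["A", "A", "A", "B"], "RIL2": ["B", "B", "A", "A"]}
consensus, rate = consensus_from_replicates(rep1, rep2)
```

Note that the discordance rate between replicates is only a proxy for the per-run error rate, since a mismatch can arise from an error in either replicate.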






    Genotyping-by-sequencing (GBS) provides affordable methods for genotyping hundreds of individuals using millions of markers. However, this challenges bioinformatic procedures that must overcome possible artifacts such as the bias generated by polymerase chain reaction duplicates and sequencing errors. Genotyping errors lead to data that deviate from what is expected from regular meiosis. This, in turn, leads to difficulties in grouping and ordering markers, resulting in inflated and incorrect linkage maps. Therefore, genotyping errors can be easily detected by linkage map quality evaluations.
    We developed and used the Reads2Map workflow to build linkage maps with simulated and empirical GBS data of diploid outcrossing populations. The workflows run GATK, Stacks, TASSEL, and Freebayes for single-nucleotide polymorphism calling and updog, polyRAD, and SuperMASSA for genotype calling, as well as OneMap and GUSMap to build linkage maps. Using simulated data, we observed which genotype call software fails in identifying common errors in GBS sequencing data and proposed specific filters to better handle them. We tested whether it is possible to overcome errors in a linkage map using genotype probabilities from each software or global error rates to estimate genetic distances with an updated version of OneMap. We also evaluated the impact of segregation distortion, contaminant samples, and haplotype-based multiallelic markers in the final linkage maps. Through our evaluations, we observed that some of the approaches produce different results depending on the dataset (dataset dependent) and others produce consistent advantageous results among them (dataset independent).
    We set as defaults in the Reads2Map workflows the approaches that proved to be dataset independent for GBS datasets according to our results. This reduces the number of tests required to identify optimal pipelines and parameters for other empirical datasets. Using Reads2Map, users can select the pipeline and parameters that best fit their data context. The Reads2MapApp shiny app provides a graphical representation of the results to facilitate their interpretation.






    Linkage mapping is an approach to order markers based on recombination events. Mapping algorithms cannot easily handle genotyping errors, which are common in high-throughput genotyping data. To solve this issue, strategies have been developed, aimed mostly at identifying and eliminating these errors. One such strategy is SMOOTH, an iterative algorithm to detect genotyping errors. Unlike other approaches, SMOOTH can also be used to impute the most probable alternative genotypes, but its application is limited to diploid species and to markers heterozygous in only one of the parents. In this study we adapted SMOOTH to expand its use to any marker type and to autopolyploids with the use of identity-by-descent probabilities, naming the updated algorithm Smooth Descent (SD). We applied SD to real and simulated data, showing that in the presence of genotyping errors this method produces better genetic maps in terms of marker order and map length. SD is particularly useful for error rates between 5% and 20% and when error rates are not homogeneous among markers or individuals. With a starting error rate of 10%, SD reduced it to ∼5% in diploids, ∼7% in tetraploids and ∼8.5% in hexaploids. Conversely, the correlation between true and estimated genetic maps increased by 0.03 in tetraploids and by 0.2 in hexaploids, while worsening slightly in diploids (∼0.0011). We also show that the combination of genotype curation and map re-estimation allowed us to obtain better genetic maps while correcting wrong genotypes. We have implemented this algorithm in the R package Smooth Descent.






    Comprehensive decisions on the management of commercially produced bees depend largely on knowledge of their genetic diversity. In this study, we present novel microsatellite markers to support the breeding, management, and conservation of the blue orchard bee, Osmia lignaria Say (Hymenoptera: Megachilidae). Native to North America, O. lignaria has been trapped from wildlands, propagated on-crop, and used to pollinate certain fruit, nut, and berry crops. Harnessing the O. lignaria genome assembly, we identified 59,632 candidate microsatellite loci in silico, of which 22 were tested using molecular techniques. Of the 22 loci, 12 were in Hardy-Weinberg equilibrium (HWE), demonstrated no linkage disequilibrium (LD), and achieved low genotyping error in two Intermountain North American wild populations in Idaho and Utah, USA. We found no difference in population genetic diversity between the two populations, but there was evidence for low but significant population differentiation. Also, to determine if these markers amplify in other Osmia, we assessed 23 species across the clades apicata, bicornis, emarginata, and ribifloris. Nine loci amplified in three species/subspecies of apicata, 22 loci amplified in 11 species/subspecies of bicornis, 11 loci amplified in seven species/subspecies of emarginata, and 22 loci amplified in two species/subspecies of ribifloris. Further testing is necessary to determine the capacity of these microsatellite loci to characterize genetic diversity and structure under the assumption of HWE and LD for species beyond O. lignaria. These markers will inform the conservation and commercial use of trapped and managed O. lignaria and other Osmia species for both agricultural and nonagricultural systems.






    Estimating the relationships between individuals is a fundamental challenge in many fields. In particular, relationship estimation could provide valuable information for missing persons cases. The recently developed investigative genetic genealogy approach uses high-density single nucleotide polymorphisms (SNPs) to determine close and more distant relationships, in which hundreds of thousands to tens of millions of SNPs are generated either by microarray genotyping or whole-genome sequencing. Current studies usually assume that the SNP profiles were generated with minimal errors. However, in missing persons cases, the DNA samples can be highly degraded, and the SNP profiles generated from these samples usually contain many errors. In this study, a machine learning approach was developed for estimating relationships from high-error SNP profiles. In this approach, a hierarchical classification strategy was employed to classify the relationships first by degree and then by relationship type within each degree separately. For each classification, feature selection was implemented to improve performance. Both simulated and real data sets with various genotyping error rates were used to evaluate this approach, which was more accurate and robust than the individual measures for SNP profiles with genotyping errors. In addition, the highest accuracy was obtained when the training and test sets had the same genotyping error rate; estimating the genotyping error rate of the SNP profiles is therefore critical to obtaining high accuracy in relationship estimation.






    Clinical use of genotype data requires high positive predictive value (PPV) and thorough understanding of the genotyping platform characteristics. BeadChip arrays, such as the Global Screening Array (GSA), potentially offer a high-throughput, low-cost clinical screen for known variants. We hypothesize that quality assessment and comparison to whole-genome sequence and benchmark data establish the analytical validity of GSA genotyping.
    To test this hypothesis, we selected 263 samples from Coriell, generated GSA genotypes in triplicate, generated whole genome sequence (rWGS) genotypes, assessed the quality of each set of genotypes, and compared each set of genotypes to each other and to the 1000 Genomes Phase 3 (1KG) genotypes, a performance benchmark. For 59 genes (MAP59), we also performed theoretical and empirical evaluation of variants deemed medically actionable predispositions.
    Quality analyses detected sample contamination and increased assay failure along the chip margins. Comparison to benchmark data demonstrated that > 82% of the GSA assays had a PPV of 1. GSA assays targeting transitions, genomic regions of high complexity, and common variants performed better than those targeting transversions, regions of low complexity, and rare variants. Comparison of GSA data to rWGS and 1KG data showed > 99% performance across all measured parameters. Consistent with predictions from prior studies, the GSA detection of variation within the MAP59 genes was 3/261.
    We establish the analytical validity of GSA assays using quality analytics and comparison to benchmark and rWGS data. GSA assays meet the standards of a clinical screen although assays interrogating rare variants, transversions, and variants within low-complexity regions require careful evaluation.
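A per-assay positive predictive value of the kind reported above can be computed as in the following sketch. The definitions used here are assumptions for illustration only: genotypes are coded as alternate-allele counts (0, 1, 2), a positive is any non-reference array call, and a positive is counted as true only when it matches the benchmark genotype exactly. The function name and data are hypothetical.

```python
def assay_ppv(array_calls, benchmark_calls):
    """Positive predictive value of one assay across samples.

    Genotypes are alternate-allele counts (0, 1, 2); None marks a no-call.
    Returns None when the array made no positive (non-reference) calls.
    """
    true_pos = 0
    false_pos = 0
    for arr, ref in zip(array_calls, benchmark_calls):
        # Skip no-calls and reference-homozygous array calls (not positives).
        if arr is None or ref is None or arr == 0:
            continue
        if arr == ref:
            true_pos += 1
        else:
            false_pos += 1
    total = true_pos + false_pos
    return true_pos / total if total else None


# Hypothetical calls for one assay across six samples.
array_calls = [0, 1, 1, 2, None, 1]
benchmark_calls = [0, 1, 0, 2, 1, 1]
ppv = assay_ppv(array_calls, benchmark_calls)
```

Returning None instead of 0 for assays with no positive calls keeps uninformative assays distinguishable from assays whose positive calls were all wrong.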






    We have entered an era of direct-to-consumer (DTC) genomics. Patients have relayed many success stories in which DTC genomics revealed causal mutations of genetic diseases before any symptoms appeared, allowing them to take precautions. However, consumers may also take unnecessary medical actions based on false alarms of "pathogenic alleles". The severity of this problem is not well known. Using publicly available data, we compared DTC microarray genotyping data with deep-sequencing data of 5 individuals and manually checked each inconsistently reported single nucleotide variant (SNV). We estimated that, on average, a person would have ~5 "pathogenic" alleles reported due to wrongly reported genotypes if using a 23andMe genotyping microarray. We also found that the number of wrongly classified "pathogenic" alleles per person is at least as large as the number due to wrongly reported genotypes. We show that the scale of the false alarm problem could be large enough that the medical costs will become a burden to public health.







    This study aimed to determine the effect of different rates of marker genotyping error on the accuracy of genomic prediction, examined under distinct marker and quantitative trait loci (QTL) densities and different heritability estimates using a stochastic simulation approach. For each simulation scenario, a reference population with phenotypic and genotypic records and a validation population with only genotypic records were considered. Marker effects were estimated in the reference population, and the estimated effects were then applied to the genotypic records of the validation population to predict its genomic breeding values. The prediction accuracy was calculated as the correlation between estimated and true breeding values. The prediction bias was examined by computing the regression of true genomic breeding value on estimated genomic breeding value. The accuracy of the genomic evaluation was highest in the scenario with no marker genotyping error, ranging from 0.731 to 0.934, and lowest in the scenario with a marker genotyping error rate of 20%, ranging from 0.517 to 0.762. Unbiased regression coefficients of true genomic breeding value on estimated genomic breeding value were obtained in the reference and validation populations when the marker genotyping error rate was zero. The results showed that marker genotyping error can reduce the accuracy of genomic evaluations. Moreover, marker genotyping error can produce biased estimates of genomic breeding values. Therefore, to obtain accurate results, it is recommended to minimize marker genotyping errors in genomic evaluation programs.
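The two evaluation metrics named above can be written out directly: accuracy as the Pearson correlation between true and estimated breeding values, and bias as the slope of the regression of true on estimated values, where a slope of 1 indicates no bias. The following is a minimal sketch using only the standard library; the function name and the toy values are hypothetical and do not come from the study.

```python
from statistics import mean


def accuracy_and_bias(true_bv, est_bv):
    """Return (accuracy, slope) for paired true and estimated breeding values.

    accuracy: Pearson correlation between true and estimated values.
    slope: regression coefficient of true on estimated values
           (1.0 means unbiased; below 1.0 means estimates are over-dispersed).
    """
    mean_true, mean_est = mean(true_bv), mean(est_bv)
    s_te = sum((t - mean_true) * (e - mean_est) for t, e in zip(true_bv, est_bv))
    s_tt = sum((t - mean_true) ** 2 for t in true_bv)
    s_ee = sum((e - mean_est) ** 2 for e in est_bv)
    accuracy = s_te / (s_tt * s_ee) ** 0.5
    slope = s_te / s_ee
    return accuracy, slope


# Toy example: estimates are perfectly correlated with the truth
# but inflated by a factor of two, so the slope is 0.5.
true_bv = [1.0, 2.0, 3.0, 4.0]
est_bv = [2.0, 4.0, 6.0, 8.0]
accuracy, slope = accuracy_and_bias(true_bv, est_bv)
```

The toy data show why both metrics are reported: a high correlation alone does not reveal that the estimates are inflated.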






    BACKGROUND: Alpha-1-antitrypsin (A1AT) deficiency is a hereditary condition caused by mutations in the SERPINA1 gene and associated with lung emphysema and liver disease. Laboratory testing in suspected A1AT deficiency involves quantifying serum A1AT concentration and identifying specific alleles by genotyping and phenotyping. The aim of this report was to present a case of a null allele carrier with consequent genotype/phenotype/concentration discrepancies and potential misclassification of the Z variant in a 42-year-old white man presenting with symptoms of chronic obstructive pulmonary disease (COPD).
    METHODS: Serum A1AT concentration was measured using an immunoturbidimetric assay. A1AT phenotype was determined using isoelectric focusing followed by immunofixation (IEF-IF). Genotyping specifically for the S and Z alleles was performed by melting curve analysis using real-time PCR and checked by an alternative PCR-RFLP method. Genotype/phenotype ambiguity and discrepancy were resolved using gene sequencing.
    RESULTS: Laboratory testing revealed highly reduced A1AT concentration (less than 0.30 g/L), mild to moderate deficient genotype (Pi*Z allele: M/Z and Pi*S allele: M/M) and severe deficient Z homozygous phenotype (Pi ZZ). After repeated sampling, the same discordant results were verified by these tests. Further sequencing revealed two clinically relevant and defective variants: rs199422210 (a rare null allele) and rs28929474 (the Z allele).
    CONCLUSIONS: Because the genotyping kit probes could not detect the null/Z allele combination (which mimics the Pi ZZ phenotype), our patient was misclassified as a mild to moderate deficient Pi*MZ heterozygote. In all unclear cases, whole-gene sequencing is highly recommended to determine the definitive cause of A1AT deficiency.





