Forensic DNA phenotyping

  • Article type: Journal Article
    In some criminal cases, the identity of the suspect is unknown and there is no matching DNA profile in the DNA database. Forensic DNA Phenotyping can provide useful investigative information in these cases. Most forensic studies focus on visible characteristics rather than behavioral characteristics. However, smoking is prevalent in the Chinese population, and DNA methylation is the most promising biomarker for smoking. We collected 204 whole blood samples from the Chinese population and measured methylation levels of 9 smoking-related CpG loci using the methylation-sensitive single-nucleotide primer extension method (Ms-SnuPE). Because the single-base extension primers for loci cg12803068 and cg21566642 contained other CpG sites, which may introduce bias, only the other 7 CpG loci were included in the subsequent statistical analysis. The methylation level of locus cg05575921, near the aryl hydrocarbon receptor repressor (AHRR) gene, was much lower in the current smoker group than in the never smoker group. To evaluate the ability of each of the 7 CpG loci to predict smoking status, logistic regression (LR) models were established separately; locus cg05575921 predicted smoking status better than the other 6 loci. Then, combined (including loci cg19572487, cg05575921, cg23480021, cg23576855, cg21161138, cg01940273, and cg09935388) and stepwise (including loci cg05575921 and cg01940273) multinomial logistic regression (MLR) models were also established. Both the combined and stepwise MLR models predicted smoking status well and outperformed the 7 single-locus LR models. However, the accuracy, specificity and area under the curve (AUC) of the stepwise MLR model in the testing dataset were slightly higher than those of the combined MLR model, and the stepwise MLR model required information from fewer loci. 
Therefore, the stepwise MLR model based on 2 significant CpG loci is the recommended model for predicting smoking status in the Chinese population, with the following formula: $P = 1/(1 + e^{-(10.621 - 10.005 \times \mathrm{cg05575921} - 8.770 \times \mathrm{cg01940273})})$. Two CpG loci (cg05575921 and cg01940273) played the major role in predicting smoking status, and the other 5 CpG loci contributed less. Moreover, to evaluate the ability of each of the 7 CpG loci to predict cigarette consumption, polynomial regression formulas were established separately. As the adjusted R² was between 0.00 and 0.20, the methylation levels of these 7 loci were not closely associated with cigarette consumption. Our methylation assay is simple, economical, and usable in conventional forensic laboratories, and may be useful in assessing the smoking status of unknown suspects.
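
The stepwise model's formula can be evaluated directly. The following is a minimal sketch, assuming that methylation levels are expressed as fractions between 0 and 1 and that P is the predicted probability of being a current smoker. The abstract states neither explicitly, although the negative coefficients are consistent with the lower methylation reported in smokers. The function name and the example inputs are illustrative only.

```python
import math


def smoking_probability(cg05575921: float, cg01940273: float) -> float:
    """Evaluate the stepwise model quoted in the abstract.

    Inputs are methylation levels, assumed here to be fractions in [0, 1].
    """
    # Linear predictor with the coefficients given in the abstract.
    z = 10.621 - 10.005 * cg05575921 - 8.770 * cg01940273
    # Logistic transform: P = 1 / (1 + e^(-z)).
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Hypothetical methylation values, chosen only for illustration.
    print(round(smoking_probability(0.55, 0.50), 3))
    print(round(smoking_probability(0.90, 0.70), 3))
```

Under these assumptions, lower methylation at either locus raises P, which matches the direction of the group difference described for cg05575921.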

  • Article type: Journal Article
    Ocular phenotypes, including eyelid fold, fissure inclination, and canthal index, are recognizable among Asians. Here we screened 27 facial phenotype-associated SNPs and report a preliminary study in 246 Chinese individuals of Han origin in Guangdong province. Results showed that rs17760296 could explain 6.2% of the eyelid fold variation, and double eyelids were more likely to appear when the genotype was TT. rs4791774 and rs642961 were significantly associated with the canthal index. However, no individual SNP was associated with fissure inclination. We further constructed two models to predict eyelid fold and canthal index and evaluated them with receiver operating characteristic (ROC) curves and support vector machine (SVM) regression, respectively. The models showed a moderate-to-high predictive capacity for the eyelid fold (AUC = 0.75, sensitivity = 76%, and specificity = 72%) but only a mild performance for the canthal index (R² = 0.1074, MSE = 0.0005, P-value = 0.024). In conclusion, our study indicates that rs17760296 could be included in the facial phenotype prediction system for the Southern Han Chinese population. Additional SNPs beyond rs4791774 and rs642961 are needed to improve the prediction accuracy for the canthal index.

  • Article type: Journal Article
    Predicting appearance phenotypes from genotypes is relevant for various areas of human genetic research and applications such as genetic epidemiology, human history, anthropology, and particularly in forensics. Many appearance phenotypes, and thus their underlying genotypes, are highly correlated, with pigmentation traits serving as primary examples. However, all available genetic prediction models, including those for pigmentation traits currently used in forensic DNA phenotyping, ignore phenotype correlations. Here, we investigated the impact of appearance phenotype correlations on genetic appearance prediction in the exemplary case of three pigmentation traits. We used data for categorical eye, hair and skin colour as well as 41 DNA markers utilized in the recently established HIrisPlex-S system from 762 individuals with complete phenotype and genotype information. Based on these data, we performed genetic prediction modelling of eye, hair and skin colour via three different strategies, namely the established approach of predicting phenotypes solely based on genotypes while not considering phenotype correlations, and two novel approaches that considered phenotype correlations, either incorporating truly observed correlated phenotypes or DNA-predicted correlated phenotypes in addition to the DNA predictors. We found that using truly observed correlated pigmentation phenotypes as additional predictors increased the DNA-based prediction accuracies for almost all eye, hair and skin colour categories, with the largest increase for intermediate eye colour, brown hair colour, dark to black skin colour, and particularly for dark skin colour. Outcomes of dedicated computer simulations suggest that this prediction accuracy increase is due to the additional genetic information that is implicitly provided by the truly observed correlated pigmentation phenotypes used, yet not covered by the DNA predictors applied. 
In contrast, considering DNA-predicted correlated pigmentation phenotypes as additional predictors did not improve the performance of the genetic prediction of eye, hair and skin colour, which was in line with the results from our computer simulations. Hence, in practical applications of DNA-based appearance prediction where no phenotype knowledge is available, such as in forensic DNA phenotyping, it is not advised to use DNA-predicted correlated phenotypes as predictors in addition to the DNA predictors. At the very least, this is not recommended for the pigmentation traits and the established pigmentation DNA predictors tested here.

  • Article type: Journal Article
    Predicting adult height from DNA has important implications in forensic DNA phenotyping. In 2014, we introduced a prediction model consisting of 180 height-associated SNPs based on data from 10,361 Northwestern Europeans enriched with tall individuals (770 > 1.88 standard deviations), which yielded a mid-range accuracy (AUC = 0.75 for binary prediction of tall stature and R² = 0.12 for quantitative prediction of adult height). Here, we provide an update on DNA-based height predictability considering an enlarged list of subsequently published height-associated SNPs, using data from the same set of 10,361 Europeans. A prediction model based on the full set of 689 SNPs showed improved accuracy relative to previous models for both tall stature (AUC = 0.79) and quantitative height (R² = 0.21). A feature selection analysis revealed a subset of the 412 most informative SNPs, and the corresponding prediction model retained most of the accuracy (AUC = 0.76 and R² = 0.19) achieved with the full model. Overall, our study empirically exemplifies that the accuracy of predicting human appearance phenotypes with very complex underlying genetic architectures, such as adult height, can be improved by increasing the number of phenotype-associated DNA variants. Our work also demonstrates that careful sub-selection allows a considerable reduction in the number of DNA predictors while achieving prediction accuracy similar to that of the full set. This is forensically relevant because of restrictions on the number of SNPs that can be analyzed simultaneously with forensically suitable DNA technologies in the current era of targeted massively parallel sequencing in forensic genetics.
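
The AUC values quoted for binary tall-stature prediction can be read as the probability that a randomly chosen tall individual receives a higher model score than a randomly chosen non-tall individual. The sketch below computes AUC in exactly that pairwise way. It is a generic illustration, not the authors' evaluation code, and the scores in the example are made up.

```python
def auc(scores_pos, scores_neg):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Hypothetical model scores for tall and non-tall individuals.
    tall = [0.9, 0.4]
    not_tall = [0.6, 0.3]
    print(auc(tall, not_tall))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values between 0.75 and 0.79 are described as mid-range.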

  • Article type: Journal Article
    Accurate genomic profiling for adult height is of high practical relevance in forensic genetics. Adult height is a classical reference trait in human complex trait genetics, characterized by a highly polygenic nature and relatively high heritability. A meta-analysis of genome-wide association studies by the Genetic Investigation of Anthropometric Traits (GIANT) consortium identified 697 DNA variants associated with adult height in Europeans; however, whether these variants remain informative in non-Europeans is still in question. The present study investigated the predictive power of these 697 height-associated SNPs in 687 Uyghurs of admixed European-Asian origin. Among all GIANT SNPs, 11% showed nominally significant association (6.78 × 10⁻⁴ < p < 0.05) with adult height in the Uyghur population, and among the significant SNPs, 77% of allele effects were in the same direction as those reported for Europeans in the GIANT study. Fitting linear and logistic models using a polygenic score consisting of all GIANT SNPs resulted in an 80-20 cross-validated mean R² of 10.08% (95% CI 3.16-18.40%) for quantitative height prediction and a mean AUC of 0.65 (95% CI 0.57-0.72) for qualitative "above average" prediction. Fine-tuning the SNP set using association p values considerably improved the prediction results (number of SNPs = 62; R² = 15.59%, 95% CI 6.80-25.71%; AUC = 0.70, 95% CI 0.62-0.77) in the Uyghurs. Overall, our findings demonstrate substantial differences between the European and Asian populations in the genetics of adult height, emphasizing the importance of population heterogeneity underlying the genetic architecture of adult height.
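
A polygenic score of the kind mentioned above is commonly built as a weighted sum of effect-allele counts. The sketch below assumes that form, with per-allele effect sizes as weights; the abstract does not spell out the weighting the authors used, and the SNP names, weights, and genotypes here are hypothetical placeholders.

```python
def polygenic_score(genotypes, effects):
    """Weighted sum of effect-allele counts.

    genotypes: dict mapping SNP id -> effect-allele count (0, 1 or 2).
    effects:   dict mapping SNP id -> per-allele effect size (the weight).
    SNPs without a genotype are skipped (an assumption of this sketch).
    """
    score = 0.0
    for snp, beta in effects.items():
        count = genotypes.get(snp)
        if count is None:
            continue  # missing genotype: contributes nothing
        score += beta * count
    return score


if __name__ == "__main__":
    # Hypothetical SNP ids and weights, for illustration only.
    effects = {"snp_A": 0.03, "snp_B": -0.02, "snp_C": 0.05}
    genotypes = {"snp_A": 2, "snp_B": 1}  # snp_C not genotyped
    print(polygenic_score(genotypes, effects))
```

The resulting score would then serve as the single predictor in a linear model for quantitative height or a logistic model for the "above average" outcome.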

  • Article type: Journal Article
    Human head hair shape, commonly classified as straight, wavy, curly or frizzy, is an attractive target for Forensic DNA Phenotyping and other applications of human appearance prediction from DNA such as in paleogenetics. The genetic knowledge underlying head hair shape variation was recently improved by the outcome of a series of genome-wide association and replication studies in a total of 26,964 subjects, highlighting 12 loci of which 8 were novel and introducing a prediction model for Europeans based on 14 SNPs. In the present study, we evaluated the capacity of DNA-based head hair shape prediction by investigating an extended set of candidate SNP predictors and by using an independent set of samples for model validation. Prediction model building was carried out in 9674 subjects (6068 from Europe, 2899 from Asia and 707 of admixed European and Asian ancestries), used previously, by considering a novel list of 90 candidate SNPs. For model validation, genotype and phenotype data were newly collected in 2415 independent subjects (2138 Europeans and 277 non-Europeans) by applying two targeted massively parallel sequencing platforms, Ion Torrent PGM and MiSeq, or the MassARRAY platform. A binomial model was developed to predict straight vs. non-straight hair based on 32 SNPs from 26 genetic loci we identified as significantly contributing to the model. This model achieved prediction accuracies, expressed as AUC, of 0.664 in Europeans and 0.789 in non-Europeans; the statistically significant difference was explained mostly by the effect of one EDAR SNP in non-Europeans. Considering sex and age, in addition to the SNPs, slightly and insignificantly increased the prediction accuracies (AUC of 0.680 and 0.800, respectively). 
Based on the sample size and candidate DNA markers investigated, this study provides the most robust, validated, and accurate statistical prediction models and SNP predictor marker sets currently available for predicting head hair shape from DNA, providing the next step towards broadening Forensic DNA Phenotyping beyond pigmentation traits.

  • Article type: Journal Article
    Forensic DNA Phenotyping (FDP), i.e. the prediction of human externally visible traits from DNA, has become a fast growing subfield within forensic genetics due to the intelligence information it can provide from DNA traces. FDP outcomes can help focus police investigations in search of unknown perpetrators, who are generally unidentifiable with standard DNA profiling. Therefore, we previously developed and forensically validated the IrisPlex DNA test system for eye colour prediction and the HIrisPlex system for combined eye and hair colour prediction from DNA traces. Here we introduce and forensically validate the HIrisPlex-S DNA test system (S for skin) for the simultaneous prediction of eye, hair, and skin colour from trace DNA. This FDP system consists of two SNaPshot-based multiplex assays targeting a total of 41 SNPs via a novel multiplex assay for 17 skin colour predictive SNPs and the previous HIrisPlex assay for 24 eye and hair colour predictive SNPs, 19 of which also contribute to skin colour prediction. The HIrisPlex-S system further comprises three statistical prediction models, the previously developed IrisPlex model for eye colour prediction based on 6 SNPs, the previous HIrisPlex model for hair colour prediction based on 22 SNPs, and the recently introduced HIrisPlex-S model for skin colour prediction based on 36 SNPs. In the forensic developmental validation testing, the novel 17-plex assay performed in full agreement with the Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines, as previously shown for the 24-plex assay. Sensitivity testing of the 17-plex assay revealed complete SNP profiles from as little as 63 pg of input DNA, equalling the previously demonstrated sensitivity threshold of the 24-plex HIrisPlex assay. 
Testing of simulated forensic casework samples such as blood, semen, and saliva stains, of inhibited DNA samples, of low-quantity touch (trace) DNA samples, and of artificially degraded DNA samples, as well as concordance testing, demonstrated the robustness, efficiency, and forensic suitability of the new 17-plex assay, as previously shown for the 24-plex assay. Finally, we provide an update to the publicly available HIrisPlex website https://hirisplex.erasmusmc.nl/, now allowing the estimation of individual probabilities for 3 eye, 4 hair, and 5 skin colour categories from HIrisPlex-S input genotypes. The HIrisPlex-S DNA test represents the first forensically validated tool for skin colour prediction, and the first forensically validated tool for simultaneous eye, hair and skin colour prediction from DNA.

  • Article type: Journal Article
    Estimating individual age from biomarkers may provide key information facilitating forensic investigations. Recent progress has shown DNA methylation at age-associated CpG sites as the most informative biomarkers for estimating the individual age of an unknown donor. Optimal feature selection plays a critical role in determining the performance of the final prediction model. In this study we investigate methylation levels at 153 age-associated CpG sites from 21 previously reported genomic regions using the EpiTYPER system for their predictive power on individual age in 390 Han Chinese males ranging from 15 to 75 years of age. We conducted a systematic feature selection using a stepwise backward multiple linear regression analysis as well as an exhaustive searching algorithm. Both approaches identified the same subset of 9 CpG sites, which in linear combination provided the optimal model fitting with mean absolute deviation (MAD) of 2.89 years of age and explainable variance (R2) of 0.92. The final model was validated in two independent Han Chinese male samples (validation set 1, N = 65, MAD = 2.49, R2 = 0.95, and validation set 2, N = 62, MAD = 3.36, R2 = 0.89). Other competing models such as support vector machine and artificial neural network did not outperform the linear model to any noticeable degree. The validation set 1 was additionally analyzed using Pyrosequencing technology for cross-platform validation and was termed as validation set 3. Directly applying our model, in which the methylation levels were detected by the EpiTYPER system, to the data from pyrosequencing technology showed, however, less accurate results in terms of MAD (validation set 3, N = 65 Han Chinese males, MAD = 4.20, R2 = 0.93), suggesting the presence of a batch effect between different data generation platforms. This batch effect could be partially overcome by a z-score transformation (MAD = 2.76, R2 = 0.93). 
Overall, our systematic feature selection identified 9 CpG sites as the optimal subset for forensic age estimation and the prediction model consisting of these 9 markers demonstrated high potential in forensic practice. An age estimator implementing our prediction model allowing missing markers is freely available at http://liufan.big.ac.cn/AgePrediction.
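
The two quantities central to this abstract, the mean absolute deviation (MAD) and the z-score transformation used against the batch effect, can be sketched as follows. This assumes MAD is the mean absolute difference between predicted and chronological age and that z-scores are computed per CpG marker within each platform's data; the abstract does not give these details, and the numbers below are invented.

```python
from statistics import mean, pstdev


def mean_absolute_deviation(predicted, actual):
    """Mean absolute difference between predicted and actual ages (years)."""
    return mean(abs(p - a) for p, a in zip(predicted, actual))


def z_transform(values):
    """Standardize one marker's methylation values to mean 0 and SD 1."""
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


if __name__ == "__main__":
    # Hypothetical ages, for illustration only.
    print(mean_absolute_deviation([30.0, 40.0], [32.0, 37.0]))

    # A constant platform offset disappears after the z-transformation.
    platform_a = [0.2, 0.4, 0.6]
    platform_b = [v + 0.1 for v in platform_a]
    print(z_transform(platform_a))
    print(z_transform(platform_b))
```

A per-marker z-transformation removes constant shifts and scale differences between platforms, which is one plausible reason it only partially overcomes the batch effect: platform differences that are not a simple shift or rescaling would remain.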

  • Article type: Journal Article
    It has become widely accepted in forensics that, owing to a lack of sensible priors, the evidential value of matching DNA profiles in trace donor identification or kinship analysis is most sensibly communicated in the form of a likelihood ratio (LR). This restraint does not alter the fact that the posterior odds (PO) would be the preferred basis for returning a verdict. A completely different situation holds for Forensic DNA Phenotyping (FDP), which is aimed at predicting externally visible characteristics (EVCs) of a trace donor from DNA left behind at the crime scene. FDP is intended to provide leads for the police investigation, helping to find unknown trace donors who are unidentifiable by DNA profiling. The statistical models underlying FDP typically yield the PO for an individual possessing a certain EVC. This apparent discrepancy has led to confusion as to when the LR or the PO is the appropriate outcome of forensic DNA analysis to be communicated to the investigating authorities. We thus set out to clarify the distinction between LR and PO in the context of forensic DNA profiling and FDP from a statistical point of view. In so doing, we also addressed the influence of population affiliation on LR and PO. In contrast to the well-known population dependency of the LR in DNA profiling, the PO as obtained in FDP may be widely population-independent. The actual degree of independence, however, depends on (i) how much of the causality of the respective EVC is captured by the genetic markers used for FDP and (ii) the extent to which non-genetic causal factors of the same EVC, such as environmental factors, are distributed equally across populations. The fact that an LR should be communicated in cases of DNA profiling whereas the PO is suitable for FDP does not conflict with theory, but rather reflects the inherent differences between these two forensic applications of DNA information.
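
The relationship between the two quantities discussed here is Bayes' theorem in odds form: posterior odds = prior odds × likelihood ratio. The sketch below makes that arithmetic explicit; the function names and the example prior are illustrative assumptions, not values from the article.

```python
def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Bayes' theorem in odds form: PO = prior odds * LR."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: float) -> float:
    """Convert odds in favour of a hypothesis to a probability."""
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Hypothetical numbers: the same LR leads to very different posterior
    # odds depending on the prior, which is why the LR alone is reported
    # when no sensible prior is available.
    lr = 1_000_000.0
    for prior in (1e-3, 1e-7):
        po = posterior_odds(prior, lr)
        print(prior, po, odds_to_probability(po))
```

This illustrates the abstract's point: reporting an LR leaves the choice of prior to the recipient, whereas a PO already has a prior built in.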