Dirichlet

  • Article type: Journal Article
    We present a novel unsupervised deep learning approach called BindVAE, based on Dirichlet variational autoencoders, for jointly decoding multiple transcription factor (TF) binding signals from open chromatin regions. BindVAE can disentangle an input DNA sequence into distinct latent factors that encode cell-type-specific in vivo binding signals for individual TFs, composite patterns for TFs involved in cooperative binding, and the genomic context surrounding the binding sites. On the task of retrieving the motifs of expressed TFs in a given cell type, BindVAE is competitive with existing motif discovery approaches.

  • Article type: Journal Article
    This work solves a case-study singular model involving Neumann-Robin, Dirichlet, and Neumann boundary conditions using a novel computing framework based on an artificial neural network (ANN), a global-search genetic algorithm (GA), and a local-search sequential quadratic programming method (SQPM), i.e., ANN-GA-SQPM. The framework is motivated by the goal of a reliable structure that combines the operative features of ANNs with soft-computing optimization procedures to handle such systems. Four problems based on singular equations with Neumann-Robin, Dirichlet, and Neumann boundary conditions are used to assess the robustness, stability, and proficiency of the designed ANN-GA-SQPM. The ANN-GA-SQPM results are compared with the exact results, and the efficiency of the scheme is checked through statistical performance measures over fifty independent trials. A neuron analysis based on 3 and 15 neurons is also performed to check the validity of the proposed ANN-GA-SQPM.

  • Article type: Journal Article
    Precise estimation of uncertainty in predictions for AI systems is a critical factor in ensuring trust and safety. Deep neural networks trained with conventional methods are prone to over-confident predictions. In contrast to Bayesian neural networks, which learn approximate distributions on weights to infer prediction confidence, we propose a novel method, Information Aware Dirichlet networks, which learn an explicit Dirichlet prior distribution on predictive distributions by minimizing a bound on the expected max norm of the prediction error and penalizing information associated with incorrect outcomes. Properties of the new cost function are derived to indicate how improved uncertainty estimation is achieved. Experiments using real datasets show that our technique outperforms, by a large margin, state-of-the-art neural networks for estimating within-distribution and out-of-distribution uncertainty and for detecting adversarial examples.

  • Article type: Journal Article
    In a multinomial set-up with k possible outcomes, we develop estimation under a "middle censoring" paradigm, as defined in Jammalamadaka and Mangalam (2003). This problem has many special features because of the inter-dependent probabilities, which we explore here.

  • Article type: Journal Article
    Molecular ecology regularly requires the analysis of count data that reflect the relative abundance of features of a composition (e.g., taxa in a community, gene transcripts in a tissue). The sampling process that generates these data can be modelled using the multinomial distribution. Replicate multinomial samples inform the relative abundances of features in an underlying Dirichlet distribution. These distributions together form a hierarchical model for relative abundances among replicates and sampling groups. This type of Dirichlet-multinomial modelling (DMM) has been described previously, but its benefits and limitations are largely untested. With simulated data, we quantified the ability of DMM to detect differences in proportions between treatment and control groups, and compared the efficacy of three computational methods to implement DMM: Hamiltonian Monte Carlo (HMC), variational inference (VI), and Gibbs Markov chain Monte Carlo. We report that DMM was better able to detect shifts in relative abundances than analogous analytical tools, while identifying an acceptably low number of false positives. Among methods for implementing DMM, HMC provided the most accurate estimates of relative abundances, and VI was the most computationally efficient. The sensitivity of DMM was exemplified through analysis of previously published data describing lung microbiomes. We report that DMM identified several potentially pathogenic bacterial taxa as more abundant in the lungs of children who aspirated foreign material during swallowing; these differences went undetected with different statistical approaches. Our results suggest that DMM has strong potential as a statistical method to guide inference in molecular ecology.
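
    The Dirichlet-multinomial structure described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the concentration values, read depth, number of replicates, and flat prior are all hypothetical, and pooling the replicates before a conjugate update is a simplification of the full hierarchical model that the abstract fits with HMC, VI, or Gibbs sampling.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_multinomial(n, probs, rng):
    """Draw multinomial counts for n observations given feature proportions."""
    counts = [0] * len(probs)
    for idx in rng.choices(range(len(probs)), weights=probs, k=n):
        counts[idx] += 1
    return counts


def posterior_mean(prior_alpha, counts):
    """Conjugate update: the posterior is Dirichlet(prior_alpha + counts)."""
    post = [a + c for a, c in zip(prior_alpha, counts)]
    total = sum(post)
    return [p / total for p in post]


rng = random.Random(0)
group_alpha = [8.0, 3.0, 1.0]  # hypothetical concentrations for three taxa

# Each replicate gets its own proportions from the group-level Dirichlet,
# then counts from a multinomial with a hypothetical depth of 500.
replicates = [
    sample_multinomial(500, sample_dirichlet(group_alpha, rng), rng)
    for _ in range(4)
]

# Simplification: pool replicates and update a flat Dirichlet(1, 1, 1) prior.
pooled = [sum(column) for column in zip(*replicates)]
estimate = posterior_mean([1.0, 1.0, 1.0], pooled)
print(replicates)
print(estimate)
```

    Comparing `estimate` between a treatment and a control group simulated this way gives a rough feel for the kind of shift in relative abundance the abstract sets out to detect.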

  • Article type: Journal Article
    Preventing and controlling zoonoses through the design and implementation of public health policies requires a thorough understanding of transmission pathways. Modelling jointly the epidemiological data and genetic information of microbial isolates derived from cases provides a methodology for tracing back the source of infection. In this paper, the attribution probability for human cases of campylobacteriosis for each source, conditional on the extent to which each case resides in a rural compared to urban environment, is estimated. A model that incorporates genetic data and evolutionary processes is applied alongside a newly developed genetic-free model. We show that inference from each model is comparable except for rare microbial genotypes. Further, the effect of 'rurality' may be modelled linearly on the logit scale, with increasing rurality leading to an increasing likelihood of ruminant-sourced campylobacteriosis.

  • Article type: Journal Article
    In population genetics, the Dirichlet (also called the Balding-Nichols) model has for 20 years been considered the key model to approximate the distribution of allele fractions within populations in a multi-allelic setting. It has often been noted that the Dirichlet assumption is approximate because positive correlations among alleles cannot be accommodated under the Dirichlet model. However, the validity of the Dirichlet distribution has never been systematically investigated in a general framework. This paper attempts to address this problem by providing a general overview of how allele fraction data under the most common multi-allelic mutational structures should be modeled. The Dirichlet and alternative models are investigated by simulating allele fractions from a diffusion approximation of the multi-allelic Wright-Fisher process with mutation, and applying a moment-based analysis method. The study shows that the optimal modeling strategy for the distribution of allele fractions depends on the specific mutation process. The Dirichlet model is only an exceptionally good approximation for the pure drift, Jukes-Cantor and parent-independent mutation processes with small mutation rates. Alternative models are required and proposed for the other mutation processes, such as a Beta-Dirichlet model for the infinite alleles mutation process, and a Hierarchical Beta model for the Kimura, Hasegawa-Kishino-Yano and Tamura-Nei processes. Finally, a novel Hierarchical Beta approximation is developed, a Pyramidal Hierarchical Beta model, for the generalized time-reversible and single-step mutation processes.
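
    For orientation, the Dirichlet (Balding-Nichols) model mentioned in this abstract is commonly written as follows. The symbols are assumptions introduced here rather than taken from the abstract: $x_i$ is the fraction of allele $i$ in a subpopulation, $p_i$ its ancestral frequency, $K$ the number of alleles, and $F$ the differentiation parameter ($F_{ST}$). The paper itself may use a different parameterisation.

```latex
\[
(x_1,\dots,x_K) \sim \mathrm{Dirichlet}\!\left(\tfrac{1-F}{F}\,p_1,\;\dots,\;\tfrac{1-F}{F}\,p_K\right),
\]
\[
\mathbb{E}[x_i] = p_i, \qquad
\operatorname{Var}(x_i) = F\,p_i\,(1-p_i), \qquad
\operatorname{Cov}(x_i, x_j) = -F\,p_i\,p_j \quad (i \neq j).
\]
```

    Under this form every pairwise covariance is negative, which is consistent with the abstract's remark that positive correlations among alleles cannot be accommodated.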

  • Article type: Journal Article
    Planned interventions and/or natural conditions often effect change on an ordinal categorical outcome (e.g., symptom severity). In such scenarios, it is sometimes desirable to assign a priori scores to observed changes in status, typically giving higher weight to changes of greater magnitude. We define change indices for such data based upon a multinomial model for each row of a c × c table, where the rows represent the baseline status categories. We distinguish an index designed to assess conditional changes within each baseline category from two others designed to capture overall change. One of these overall indices measures expected change across a target population. The other is scaled to capture the proportion of total possible change in the direction indicated by the data, so that it ranges from -1 (when all subjects finish in the least favorable category) to +1 (when all finish in the most favorable category). The conditional assessment of change can be informative regardless of how subjects are sampled into the baseline categories. In contrast, the overall indices become relevant when subjects are randomly sampled at baseline from the target population of interest, or when the investigator is able to make certain assumptions about the baseline status distribution in that population. We use a Dirichlet-multinomial model to obtain Bayesian credible intervals for the conditional change index that exhibit favorable small-sample frequentist properties. Simulation studies illustrate the methods, and we apply them to examples involving changes in ordinal responses for studies of sleep deprivation and activities of daily living.
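
    One way to picture the Bayesian credible interval mentioned in this abstract is a Monte Carlo sketch for a single baseline row. Everything specific here is hypothetical: the counts, the a priori scores, the flat Dirichlet prior, and the index itself (taken as the score-weighted sum of the row's outcome probabilities), since the abstract does not give the authors' exact definition.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet(alpha) vector by normalising independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def change_index_interval(counts, scores, prior=1.0, draws=4000, level=0.95, seed=0):
    """Equal-tailed credible interval for sum_j scores[j] * p[j].

    p has a Dirichlet(prior + counts) posterior, the conjugate update for
    multinomial counts observed in one baseline row.
    """
    rng = random.Random(seed)
    post_alpha = [prior + c for c in counts]
    samples = sorted(
        sum(s * p for s, p in zip(scores, sample_dirichlet(post_alpha, rng)))
        for _ in range(draws)
    )
    lo_idx = int((1 - level) / 2 * draws)
    hi_idx = int((1 + level) / 2 * draws) - 1
    return samples[lo_idx], samples[hi_idx]


# Hypothetical row: 20 subjects starting in the middle of three categories,
# ending worse (score -1), unchanged (score 0), or better (score +1).
counts = [2, 5, 13]
scores = [-1.0, 0.0, 1.0]
lower, upper = change_index_interval(counts, scores)
print(lower, upper)
```

    With scores bounded by -1 and +1, the sampled index stays in that range, matching the scaling the abstract describes for its overall index.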