Poisson Distribution

  • Article type: Journal Article
    The Centre for Disease Control and Prevention (CDC) in Yangquan, China, has taken a series of preventive and control measures in response to the increasing trend of Kala-Azar. We propose a new model to evaluate the effectiveness of these interventions more rigorously.
    We obtained Kala-Azar incidence data for 2017 to 2021 from the Yangquan CDC. We constructed a Poisson segmented regression model, a harmonic Poisson segmented regression model, and an improved harmonic Poisson segmented regression model, and used each of the three to estimate the intervention effect. Finally, we selected the optimal model by comparing the goodness of fit of the three models.
    The primary analysis showed an underlying upward trend of Kala-Azar before the intervention [incidence rate ratio (IRR): 1.045, 95% confidence interval (CI): 1.027-1.063, p < 0.001]. In terms of long-term effects, the rise of Kala-Azar slowed significantly after the intervention (IRR: 0.960, 95% CI: 0.927-0.995, p = 0.026), and the risk of Kala-Azar increased by 0.3% for each additional month after the intervention (β1 + β3 = 0.003, IRR = 1.003). The improved harmonic Poisson segmented regression model fitted best, with the lowest MSE, MAE, and RMSE (0.017, 0.101, and 0.130, respectively).
    In the long term, the intervention measures taken by the Yangquan CDC effectively curbed the upward trend of Kala-Azar. The improved harmonic Poisson segmented regression model fits better than the alternatives and can serve as a reference for evaluating the effect of interventions on seasonal infectious diseases.
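The reported slope figures can be checked with a few lines of arithmetic. The sketch below assumes the usual segmented-regression parametrisation, in which the pre-intervention IRR is exp(β1) and the slope-change IRR is exp(β3); the abstract does not spell out the model equation, so that mapping (and the variable names) is an assumption.

```python
import math

# IRRs as reported in the abstract.
irr_pre_trend = 1.045     # monthly trend before the intervention, assumed exp(beta1)
irr_slope_change = 0.960  # change in monthly trend afterwards, assumed exp(beta3)

# On the log scale the Poisson model is linear, so slopes add.
beta1 = math.log(irr_pre_trend)
beta3 = math.log(irr_slope_change)
post_slope = beta1 + beta3

# Back on the rate scale, adding slopes is the same as multiplying IRRs.
irr_post_trend = math.exp(post_slope)

print(round(post_slope, 3), round(irr_post_trend, 3))  # prints: 0.003 1.003
```

Under that assumption the numbers are mutually consistent: a 4.5% monthly increase before the intervention combined with a 4.0% relative reduction in slope leaves a residual increase of about 0.3% per month.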

  • Article type: Journal Article
    Single-molecule surface-enhanced Raman spectroscopy (SM-SERS) holds great potential to revolutionize ultratrace quantitative analysis. However, achieving quantitative SM-SERS is challenging because of strong intensity fluctuation and blinking characteristics. In this study, we reveal the relation P = 1 - e^{-α} between the statistical SERS probability P and the microscopic average molecule number α in SERS spectra, which lays the physical foundation for a statistical route to SM-SERS quantitation. Using SERS probability calibration, we achieve quantitative SERS analysis with batch-to-batch robustness, an extremely wide concentration detection range covering 9 orders of magnitude, and an ultralow detection limit far below the single-molecule level. These results indicate the physical feasibility of robust SERS quantitation through a statistical route and open a new avenue for implementing SERS as a practical analysis tool in various application scenarios.
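The relation P = 1 - e^{-α} has the same form as the probability that a Poisson count with mean α is at least one. Reading it that way is an assumption here, as are the function names below; the sketch only shows how the relation can be evaluated and inverted to recover α from a measured probability.

```python
import math

def sers_probability(alpha):
    """P = 1 - exp(-alpha): chance of at least one event when the mean count is alpha."""
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    return -math.expm1(-alpha)  # numerically stable form of 1 - exp(-alpha)

def mean_molecule_number(p):
    """Invert the relation: alpha = -ln(1 - P), defined for 0 <= P < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log1p(-p)

# For very small alpha, P is approximately alpha, so the probability
# stays informative even when the mean is far below one molecule.
for alpha in (1e-4, 0.1, 1.0, 5.0):
    print(alpha, sers_probability(alpha))
```

The inversion saturates as P approaches 1, which suggests why a probability-based calibration is most useful at low mean molecule numbers.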

  • Article type: Journal Article
    To understand the transmissibility and spread of infectious diseases, epidemiologists turn to estimates of the instantaneous reproduction number. While many estimation approaches exist, their utility may be limited. Challenges of surveillance data collection, model assumptions that are unverifiable with data alone, and computationally inefficient frameworks are critical limitations for many existing approaches. We propose a discrete spline-based approach that solves a convex optimization problem (Poisson trend filtering) using the proximal Newton method. It produces a locally adaptive estimator of the instantaneous reproduction number with heterogeneous smoothness. Our methodology remains accurate even under some process misspecifications and is computationally efficient, even for large-scale data. The implementation is easily accessible in a lightweight R package, rtestim.

  • Article type: English Abstract
    The aim of this study was to develop a methodology for estimating cancer incidence in Brazil and its regions. Using data from population-based cancer registries (RCBP, acronym in Portuguese) and the Brazilian Mortality Information System (SIM, acronym in Portuguese), annual incidence/mortality (I/M) ratios were calculated by type of cancer, age group, and sex in each RCBP. Poisson longitudinal multilevel models were applied to estimate the I/M ratios by region in 2018. The number of new cancer cases in 2018 was estimated by applying the estimated I/M ratios to the number of SIM-corrected deaths that occurred that year. The North and Northeast had the lowest I/M ratios. Pancreatic, lung, liver, and esophageal cancers had the lowest I/M ratios, whereas the highest were estimated for thyroid, testicular, prostate, and female breast cancers. For 2018, 506,462 new cancer cases were estimated in Brazil. Female breast and prostate cancers were the two main types of cancer in all regions. In the North and Northeast, cervical and stomach cancers stood out. Differences in the I/M ratios between regions were observed and may be related to socioeconomic development and access to health services.
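The final estimation step described above is a multiplication of an I/M ratio by a corrected death count within each stratum. A minimal sketch of that arithmetic follows; the strata, ratios, and death counts are invented for illustration and are not values from the study, and the function name is hypothetical.

```python
def estimate_new_cases(im_ratio, corrected_deaths):
    """Estimated incident cases = I/M ratio x corrected number of deaths."""
    if im_ratio < 0 or corrected_deaths < 0:
        raise ValueError("inputs must be non-negative")
    return im_ratio * corrected_deaths

# Hypothetical strata: (cancer type, region) -> (I/M ratio, corrected deaths).
# These numbers are made up; they only mimic the direction reported in the
# abstract (low ratio for lung cancer, high ratio for thyroid cancer).
hypothetical_strata = {
    ("lung", "region A"): (1.5, 1000),
    ("thyroid", "region A"): (12.0, 50),
}

estimates = {
    stratum: estimate_new_cases(ratio, deaths)
    for stratum, (ratio, deaths) in hypothetical_strata.items()
}
total_new_cases = sum(estimates.values())
print(estimates, total_new_cases)
```

In the study the ratios themselves come from Poisson longitudinal multilevel models; this sketch takes them as given.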

  • Article type: Journal Article
    In recent decades, pension reforms have been implemented to address the financial sustainability of social security systems, resulting in an increase in the retirement age. This adjustment has led to ongoing debates about the relationship between retirement and health. This study investigates the impact of time spent in retirement on the risk of cardiovascular disease (CVD) in Italy. It uses a comprehensive dataset that includes socioeconomic, health, and behavioural risk factors and is linked to administrative hospitalisation and mortality registers. To address the potential endogeneity of retirement, we employ an instrumental variables approach embedded in a Poisson rate model. The results show that, on average, years spent in retirement have a beneficial effect on the risk of CVD for both men and women. Each additional year spent in retirement reduces the incidence of such diseases by about 17% for men and 29% for women. Stratified analyses and robustness tests show that the benefits of retirement appear to be more robust and pronounced in men and in certain groups, particularly men in manual occupations or with poor ergonomic conditions at work. These results highlight that delaying access to retirement may lead to an increased burden of CVD in the older population. In addition, the protective effect of retirement on the development of CVD among workers with poorer ergonomic conditions underlines the different impact of increasing the retirement age on different categories of workers and the need for targeted and differentiated policies to avoid hitting the more vulnerable.

  • Article type: Journal Article
    Dimension reduction on neural activity paves the way for unsupervised neural decoding by dissociating the measurement of internal neural pattern reactivation from the measurement of external variable tuning. With assumptions only on the smoothness of latent dynamics and of internal tuning curves, the Poisson Gaussian-process latent variable model (P-GPLVM; Wu et al., 2017) is a powerful tool for discovering the low-dimensional latent structure of high-dimensional spike trains. However, when given novel neural data, the original model lacks a method to infer their latent trajectories in the learned latent space, limiting its ability to estimate neural reactivation. Here, we extend the P-GPLVM to enable latent variable inference for new data, constrained by previously learned smoothness and mapping information. We also describe a principled approach to constrained latent variable inference for temporally compressed patterns of activity, such as those found in population burst events during hippocampal sharp-wave ripples, as well as metrics for assessing the validity of neural pattern reactivation and inferring the encoded experience. Applying these approaches to hippocampal ensemble recordings during active maze exploration, we replicate the result that P-GPLVM learns a latent space encoding the animal's position. We further demonstrate that this latent space can differentiate one maze context from another. By inferring the latent variables of new neural data during running, certain neural patterns are observed to reactivate, in accordance with the similarity of experiences encoded by their nearby neural trajectories in the training data manifold. Finally, reactivation of neural patterns can also be estimated for neural activity during population burst events, allowing the identification of replay events of versatile behaviors and more general experiences. Thus, our extension of the P-GPLVM framework for unsupervised analysis of neural activity can be used to answer critical questions related to scientific discovery.

  • Article type: Journal Article
    BACKGROUND: Schistosomiasis is a neglected disease prevalent in tropical and sub-tropical areas of the world, especially in Africa. Detecting the presence of the disease is based on the detection of the parasites in the stool or urine of children and adults. In such studies, data collected on schistosomiasis infection typically include many negative individuals, leading to high zero inflation. Count data with excess zeros are therefore common in practice. The purpose of this analysis is to apply statistical models to such count data and evaluate their performance and results.
    METHODS: This is a secondary analysis of previously collected data. As part of the modelling process, Poisson regression, negative binomial regression, and their associated zero-inflated and hurdle models were compared to determine which offered the best fit to the count data.
    RESULTS: Overall, 94.1% of the 1345 people tested did not have any schistosomiasis eggs, resulting in high zero inflation. The negative binomial regression models (hurdle negative binomial (HNB), zero-inflated negative binomial (ZINB), and the standard negative binomial) performed better than the Poisson-based regression models (Poisson, zero-inflated Poisson, hurdle Poisson). The best models were the ZINB and HNB, and their performances were indistinguishable according to information-based criteria.
    CONCLUSIONS: The zero-inflated negative binomial and hurdle negative binomial models were found to be the most satisfactory fit for modelling the over-dispersed, zero-inflated count data and are recommended for use in future statistical modelling analyses.
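To illustrate why a plain Poisson model struggles with this kind of data, the sketch below fits an intercept-only Poisson and an intercept-only zero-inflated Poisson (ZIP) to made-up counts with 94% zeros and compares them by AIC. It is a simplified illustration, not the study's analysis: the data are invented, there are no covariates, and the negative binomial and hurdle variants that the study found best are not implemented here.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability mass, computed on the log scale for stability."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def zip_pmf(k, pi, lam):
    """Zero-inflated Poisson: extra zeros with probability pi, else Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

def fit_zip(counts, iterations=200):
    """Intercept-only ZIP maximum likelihood, assuming more zeros than Poisson predicts.

    lam solves mean(positive counts) = lam / (1 - exp(-lam)), found here by
    fixed-point iteration; then pi = 1 - mean(counts) / lam.
    """
    positives = [k for k in counts if k > 0]
    mean_pos = sum(positives) / len(positives)
    lam = mean_pos
    for _ in range(iterations):
        lam = mean_pos * (1.0 - math.exp(-lam))
    pi = 1.0 - (sum(counts) / len(counts)) / lam
    return pi, lam

def aic(log_lik, n_params):
    return 2 * n_params - 2 * log_lik

# Made-up egg counts: 94 zeros and 6 positives (not data from the study).
counts = [0] * 94 + [3, 5, 2, 8, 4, 6]

lam_poisson = sum(counts) / len(counts)  # Poisson MLE is the sample mean
ll_poisson = sum(math.log(poisson_pmf(k, lam_poisson)) for k in counts)

pi_hat, lam_zip = fit_zip(counts)
ll_zip = sum(math.log(zip_pmf(k, pi_hat, lam_zip)) for k in counts)

aic_poisson = aic(ll_poisson, 1)
aic_zip = aic(ll_zip, 2)
print("Poisson AIC:", aic_poisson, "ZIP AIC:", aic_zip)
```

On these invented counts the ZIP model has the lower AIC, because the single Poisson mean cannot account for both the mass at zero and the size of the positive counts.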

  • Article type: Journal Article
    This study investigates the factors contributing to bicycle accidents, focusing on four types of bicycle lanes and other exposure and built environment characteristics of census blocks. Using Seoul as a case study, three years of bicycle accident spot data from 2018 to 2020 were collected, yielding 1,330 bicycle accident spots and a total of 2,072 accidents. A geographically weighted Poisson regression (GWPR) model was used to investigate the spatially varying relationships between accident frequency and the explanatory variables, in contrast to a global Poisson regression model. The results indicated that the GWPR model outperforms the global Poisson regression model in capturing unobserved spatial heterogeneity. For example, the value of deviance that determines the goodness of fit for a model was 0.244 for the Poisson regression model and 0.500 for the far better-fitting GWPR model. Further findings revealed that the factors affecting bicycle accidents have varying impacts depending on the location and distribution of accidents. For example, despite the presence of bicycle lanes, some census blocks, particularly in the northeast part of the city, still pose a risk for bicycle accidents. These findings can provide valuable insights for urban planners and policymakers in developing bicycle safety measures and regulations.

  • Article type: Journal Article
    Droplet-based single-cell sequencing techniques rely on the fundamental assumption that each droplet encapsulates a single cell, enabling individual cell omics profiling. However, the inevitable issue of multiplets, where two or more cells are encapsulated within a single droplet, can lead to spurious cell type annotations and obscure true biological findings. The issue of multiplets is exacerbated in single-cell multiomics settings, where integrating cross-modality information for clustering can inadvertently promote the aggregation of multiplet clusters and increase the risk of erroneous cell type annotations. Here, we propose a compound Poisson model-based framework for multiplet detection in single-cell multiomics data. Leveraging experimental cell hashing results as the ground truth for multiplet status, we conducted trimodal DOGMA-seq experiments and generated 17 benchmarking datasets from two tissues, involving a total of 280,123 droplets. We demonstrated that the proposed method is an essential tool for integrating cross-modality multiplet signals, effectively eliminating multiplet clusters in single-cell multiomics data, a task at which the benchmarked single-omics methods proved inadequate.

  • Article type: Journal Article
    Chromosomal dicentrics and translocations are commonly employed as biomarkers to estimate radiation doses. The main goal of this article is to perform a comparative analysis of the yields of both types of aberrations. The objective is to determine whether there are relevant distinctions between the two yields, allowing for a comprehensive assessment of their respective suitability and accuracy in the estimation of radiation doses.
    The analysis involved data from a partial-radiation simulation study, with the calibration data obtained through two scoring methods: conventional and PAINT modified. Subsequently, a Bayesian bivariate zero-inflated Poisson model was employed to compare the posterior marginal densities of the mean numbers of dicentrics and translocations and to assess the differences between them.
    With the conventional scoring method, the findings indicate no notable disparity between the yields of observed translocations and dicentrics. With the PAINT modified method, however, a notable discrepancy is observed at higher doses, indicating a relevant difference in the mean number of the two types of aberrations.
    The choice of scoring method significantly influences the analysis of radiation-induced aberrations, especially when distinguishing between complex and simple chromosomal formations. Further research and analysis are necessary to gain a deeper understanding of the factors and mechanisms affecting the formation of dicentrics and translocations.