Poisson Distribution

  • Article type: Journal Article
    BACKGROUND: In sub-Saharan African countries, preventable and manageable diseases such as diarrhea and acute respiratory infections still claim the lives of children. This study therefore aims to estimate the rate of change in the log expected number of days a child suffers from diarrhea (NOD) and from flu or common cold (NOF) among children who were aged 6 to 11 months at the study baseline.
    METHODS: This study used secondary data that exhibit a longitudinal and multilevel structure. Based on the results of exploratory analysis, a multilevel zero-inflated Poisson regression model was proposed, with the rate of change in the log expected NOD and NOF described by a quadratic trend. The model analyzes both outcomes efficiently while accounting for correlation between observations and individuals through random effects. Residual plots were used to assess the goodness of fit of the model.
    RESULTS: Considering subject- and cluster-specific random effects, the results revealed a quadratic trend in the rate of change of the log expected NOD. Initially, users of low-dose iron Micronutrient Powder (MNP) exhibited a higher rate of change than non-users, but this trend reversed over time. Similarly, the log expected NOF decreased for children who used MNP and were exclusively breastfed for six months, in comparison to their counterparts. In addition, the odds of not having flu decreased with each two-week increment for MNP users, as compared to non-MNP users. Furthermore, an increase in NOD resulted in an increase in the log expected NOF. Region and exclusive breastfeeding also had significant relationships with both NOD and NOF.
    CONCLUSIONS: The findings of this study underscore the importance of beginning the analysis of study data with exploratory analysis. The study highlights the critical role of promoting exclusive breastfeeding (EBF) for the first six months and supporting children with additional food after six months to reduce the burden of infectious diseases.
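
    The quadratic trend described in the abstract above can be illustrated with a short sketch. It assumes a log expected count of the form b0 + b1*t + b2*t^2, so that the rate of change is b1 + 2*b2*t. The coefficient values and the time unit are hypothetical and are not taken from the study; they only show how a rate of change can start positive and reverse over time.

```python
import math

def log_expected_days(t, b0, b1, b2):
    """Log expected count under a quadratic time trend: b0 + b1*t + b2*t**2."""
    return b0 + b1 * t + b2 * t ** 2

def rate_of_change(t, b1, b2):
    """Derivative of the log expected count with respect to time."""
    return b1 + 2.0 * b2 * t

# Hypothetical coefficients, chosen only for illustration (not study estimates)
b0, b1, b2 = 0.5, 0.30, -0.05

# Time at which the rate of change crosses zero and the trend reverses
turning_point = -b1 / (2.0 * b2)

for t in range(0, 7):  # t in assumed follow-up units, e.g. visits
    slope = rate_of_change(t, b1, b2)
    expected = math.exp(log_expected_days(t, b0, b1, b2))
    print(f"t={t}  rate of change={slope:+.2f}  expected days={expected:.2f}")
```

    With a negative quadratic coefficient, the expected count rises while the rate of change is positive and falls once the turning point has passed.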






  • Article type: Journal Article
    Recent research has established the existence of a correlation between women's education and fertility, suggesting that they share similar risk factors. However, in many studies the two variables were analysed separately, which could bias the conclusions by ignoring the apparent correlation between such paired outcomes. In this article, univariate and bivariate Poisson regression models were applied to a nationally representative sample of 24,562 women from the 2015-16 Malawi Demographic and Health Survey to examine the risk factors of women's education levels and fertility. R version 4.1.2 was used for the analyses. The results showed that estimates from the bivariate Poisson model were consistent with those obtained from the separate univariate Poisson models. The sizes of the coefficient estimates, their standard errors, p-values, and directions were comparable in the bivariate and univariate Poisson models. Using either the univariate or the bivariate Poisson model, it was found that the age of a woman at first sexual experience, her current age, household wealth index, and contraceptive usage were significantly associated with both the woman's schooling and fertility. The study further revealed that ethnicity, religion, and region of residence affected education level only and not fertility. Similarly, marital status and occupation affected fertility only and not education. The study also found that higher education levels were linked to a lower number of children, with a strong negative correlation of -0.62 between the two variables. The study recommends using bivariate Poisson regression for analysing paired count response data when there is an apparent covariance between the outcome variables. The results suggest that efforts by policymakers to achieve the desired sexual and reproductive health outcomes for women in sub-Saharan Africa should be intertwined with improving the educational attainment of women and girls in the region.






  • Article type: Journal Article
    BACKGROUND: The Centre for Disease Control and Prevention (CDC) in Yangquan, China, has taken a series of preventive and control measures in response to the increasing trend of Kala-Azar. We propose a new model to evaluate the effectiveness of these interventions more rigorously.
    METHODS: We obtained Kala-Azar incidence data from 2017 to 2021 from the Yangquan CDC. We constructed a Poisson segmented regression model, a harmonic Poisson segmented regression model, and an improved harmonic Poisson segmented regression model, and used each of the three models to explain the intervention effect. Finally, we selected the optimal model by comparing the fit of the three models.
    RESULTS: The primary analysis showed an underlying upward trend of Kala-Azar before the intervention [incidence rate ratio (IRR): 1.045, 95% confidence interval (CI): 1.027-1.063, p < 0.001]. In terms of long-term effects, the rise of Kala-Azar slowed significantly after the intervention (IRR: 0.960, 95% CI: 0.927-0.995, p = 0.026), and the risk of Kala-Azar increased by 0.3% for each additional month after the intervention (β1 + β3 = 0.003, IRR = 1.003). The improved harmonic Poisson segmented regression model had the best fit, with the lowest values of MSE, MAE, and RMSE (0.017, 0.101, and 0.130, respectively).
    CONCLUSIONS: In the long term, the intervention measures taken by the Yangquan CDC effectively curbed the upward trend of Kala-Azar. The improved harmonic Poisson segmented regression model fits the data better and can serve as a reference for evaluating the effect of interventions on seasonal infectious diseases.
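
    The arithmetic behind the post-intervention trend reported above can be checked with a few lines. This sketch assumes the usual segmented-regression reading, in which β1 is the log of the pre-intervention trend IRR and β3 is the log of the trend-change IRR, so that the post-intervention monthly trend is exp(β1 + β3). Only the IRR and MSE values quoted in the abstract are used; the variable names are illustrative.

```python
import math

irr_pre_trend = 1.045     # reported IRR for the monthly trend before intervention
irr_trend_change = 0.960  # reported IRR for the change in trend after intervention

beta1 = math.log(irr_pre_trend)     # pre-intervention slope on the log scale
beta3 = math.log(irr_trend_change)  # change in slope after intervention

post_slope = beta1 + beta3          # post-intervention slope on the log scale
irr_post_trend = math.exp(post_slope)

print(f"beta1 + beta3 = {post_slope:.3f}")
print(f"post-intervention IRR per month = {irr_post_trend:.3f}")

# RMSE is the square root of MSE, so the reported fit metrics can be cross-checked
reported_mse = 0.017
print(f"sqrt(MSE) = {math.sqrt(reported_mse):.3f}")
```

    Under these assumptions the computed values agree with the reported β1 + β3 = 0.003, IRR = 1.003, and RMSE = 0.130.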






  • Article type: Journal Article
    To understand the transmissibility and spread of infectious diseases, epidemiologists turn to estimates of the instantaneous reproduction number. While many estimation approaches exist, their utility may be limited. Challenges of surveillance data collection, model assumptions that are unverifiable with data alone, and computationally inefficient frameworks are critical limitations for many existing approaches. We propose a discrete spline-based approach that solves a convex optimization problem (Poisson trend filtering) using the proximal Newton method. It produces a locally adaptive estimator of the instantaneous reproduction number that accommodates heterogeneous smoothness. Our methodology remains accurate even under some process misspecifications and is computationally efficient, even for large-scale data. The implementation is easily accessible in a lightweight R package, rtestim.
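
    As a rough illustration of the kind of objective that Poisson trend filtering minimizes, the sketch below evaluates a Poisson negative log-likelihood plus an L1 penalty on discrete differences of theta = log R_t. The specific formulation (expected incidence equal to R_t times past incidence weighted by a delay distribution), the function names, and the toy numbers are assumptions made for this example. It does not reproduce the rtestim API or its proximal Newton solver; it only evaluates the objective.

```python
import math

def weighted_past_incidence(cases, delay_pmf):
    """w_t = sum over s >= 1 of delay_pmf[s-1] * cases[t-s] (assumed delay weights)."""
    weights = []
    for t in range(len(cases)):
        total = 0.0
        for s, p in enumerate(delay_pmf, start=1):
            if t - s >= 0:
                total += p * cases[t - s]
        weights.append(total)
    return weights

def differences(values, order):
    """Apply the first-difference operator `order` times."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

def objective(theta, cases, w, lam, order=2):
    """Poisson negative log-likelihood (up to a constant) plus an L1 difference penalty."""
    nll = sum(
        w_t * math.exp(th) - y * th
        for th, y, w_t in zip(theta, cases, w)
        if w_t > 0  # skip time points with no past incidence
    )
    penalty = lam * sum(abs(d) for d in differences(theta, order))
    return nll / len(cases) + penalty

# Toy data, hypothetical
cases = [10, 12, 15, 18, 22]
delay_pmf = [0.5, 0.3, 0.2]
w = weighted_past_incidence(cases, delay_pmf)

linear_theta = [0.0, 0.5, 1.0, 1.5, 2.0]   # second differences are all zero
wiggly_theta = [0.0, 1.0, 0.0, 1.0, 0.0]   # second differences are large
print(objective(linear_theta, cases, w, lam=1.0))
print(objective(wiggly_theta, cases, w, lam=1.0))
```

    With order=2 the penalty is zero for any theta that is linear in time and grows with the number and size of kinks, which is what makes the fitted curve piecewise smooth with locally adaptive changes.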






  • Article type: English Abstract
    The aim of this study was to develop a methodology for estimating cancer incidence in Brazil and its regions. Using data from population-based cancer registries (RCBP, acronym in Portuguese) and the Brazilian Mortality Information System (SIM, acronym in Portuguese), annual incidence/mortality (I/M) ratios were calculated by type of cancer, age group and sex in each RCBP. Poisson longitudinal multilevel models were applied to estimate the I/M ratios by region in 2018. The estimate of new cancer cases in 2018 was calculated by applying the estimated I/M ratios to the number of SIM-corrected deaths that occurred that year. North and Northeast concentrated the lowest I/M ratios. Pancreatic, lung, liver and esophageal cancers had the lowest I/M ratios, whereas the highest were estimated for thyroid, testicular, prostate and female breast cancers. For 2018, 506,462 new cancer cases were estimated in Brazil. Female breast and prostate were the two main types of cancer in all regions. In the North and Northeast, cervical and stomach cancers stood out. Differences in the I/M ratios between regions were observed and may be related to socioeconomic development and access to health services.
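
    The final step of the methodology above, applying an estimated I/M ratio to corrected death counts, amounts to a multiplication. The sketch below shows that step only. The ratios and death counts are hypothetical placeholders; in the study the ratios came from Poisson longitudinal multilevel models, which are not reproduced here.

```python
def im_ratio(incident_cases, deaths):
    """Incidence/mortality ratio observed in a registry for one cancer type."""
    if deaths <= 0:
        raise ValueError("deaths must be positive to form an I/M ratio")
    return incident_cases / deaths

def estimate_new_cases(ratio, corrected_deaths):
    """Estimated new cases = I/M ratio applied to corrected deaths."""
    return ratio * corrected_deaths

# Hypothetical inputs for illustration only (not values from the study)
ratios = {"pancreas": 1.1, "thyroid": 12.0}
corrected_deaths = {"pancreas": 1000, "thyroid": 50}

estimates = {
    site: estimate_new_cases(ratios[site], corrected_deaths[site])
    for site in ratios
}
for site, value in estimates.items():
    print(f"{site}: about {value:.0f} estimated new cases")
```

    A cancer with high lethality has a ratio close to 1, so estimated cases track deaths closely, whereas a cancer with good survival has a large ratio and many more estimated cases than deaths.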






  • Article type: Journal Article
    In recent decades, pension reforms have been implemented to address the financial sustainability of social security systems, resulting in an increase in the retirement age. This adjustment has led to ongoing debates about the relationship between retirement and health. This study investigates the impact of time spent in retirement on the risk of cardiovascular disease (CVD) in Italy. It uses a comprehensive dataset that includes socioeconomic, health, and behavioural risk factors, which is linked to administrative hospitalisation and mortality registers. To address the potential endogeneity of retirement, we employ an instrumental variables approach embedded in a Poisson rate model. The results show that, on average, years spent in retirement have a beneficial effect on the risk of CVD for both men and women. Each additional year spent in retirement reduces the incidence of such diseases by about 17% for men and 29% for women. Stratified analyses and robustness tests show that the benefits of retirement appear to be more robust and pronounced in men and in certain groups, particularly men in manual occupations or with poor ergonomic conditions at work. These results highlight that delaying access to retirement may lead to an increased burden of CVD in the older population. In addition, the protective effect of retirement on the development of CVD among workers with poorer ergonomic conditions underlines the different impact of increasing the retirement age on different categories of workers and the need for targeted and differentiated policies to avoid hitting the more vulnerable.






  • Article type: Journal Article
    Measuring the abundance of microbes in a sample is a common procedure with a long history, but best practices are not well-conserved across microbiological fields. Serial dilution methods are commonly used to dilute bacterial cultures to produce countable numbers of colonies, and from these counts, to infer bacterial concentrations measured in colony-forming units (CFUs). The most common methods to generate data for CFU point estimates involve plating bacteria on (or in) a solid growth medium and counting the resulting colonies, or counting the number of tubes at a given dilution that show growth. Traditionally, these types of data have been analyzed separately using different analytic methods. Here, we build a direct correspondence between these approaches, which allows one to extend the use of the most probable number method from the liquid tube experiments for which it was developed to growth plates, by viewing colony-sized patches of a plate as equivalent to individual tubes. We also discuss how to combine measurements taken at different dilutions, and we review several ways of analyzing colony counts, including the Poisson and truncated Poisson methods. We test all point estimate methods computationally using simulated data. For all methods, we discuss their relevant error bounds, assumptions, strengths, and weaknesses. We provide an online calculator for these estimators.
    Estimation of the number of microbes in a sample is an important problem with a long history. Yet common practices, such as combining results from different measurements, remain sub-optimal. We provide a comparison of methods for estimating the abundance of microbes and detail a mapping between different methods, which allows their range of applicability to be extended. This mapping enables higher-precision estimates of colony-forming units (CFUs) using the same data already collected for traditional CFU estimation methods. Furthermore, we provide recommendations for how to combine measurements of colony counts taken across dilutions, correcting several misconceptions in the literature.
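
    Two of the estimators mentioned above can be sketched under simple textbook assumptions. The first pools plates from several dilutions using the Poisson maximum-likelihood estimate (total colonies divided by total effective volume plated). The second is a single-dilution most probable number estimate from the fraction of tubes without growth. The function names and the example counts are hypothetical, and the sketch is not the calculator referred to in the abstract.

```python
import math

def cfu_poisson_combined(counts, volumes_ml, dilutions):
    """Pooled Poisson MLE of CFU/mL across plates.

    Each plate is assumed to have expected count = concentration * volume * dilution,
    where dilution is the fraction of original sample (e.g. 1e-4).
    """
    effective_volume = sum(v * d for v, d in zip(volumes_ml, dilutions))
    return sum(counts) / effective_volume

def mpn_single_dilution(n_tubes, n_positive, volume_ml, dilution):
    """MPN estimate of CFU/mL from one dilution: P(no growth) = exp(-conc * v * d)."""
    if n_positive >= n_tubes:
        raise ValueError("all tubes positive: the estimate is unbounded")
    fraction_negative = (n_tubes - n_positive) / n_tubes
    return -math.log(fraction_negative) / (volume_ml * dilution)

# Hypothetical plate counts at two dilutions, 0.1 mL plated each
counts = [212, 19]
volumes_ml = [0.1, 0.1]
dilutions = [1e-4, 1e-5]

pooled = cfu_poisson_combined(counts, volumes_ml, dilutions)
per_plate = [c / (v * d) for c, v, d in zip(counts, volumes_ml, dilutions)]
naive_mean = sum(per_plate) / len(per_plate)

print(f"pooled Poisson MLE: {pooled:.3e} CFU/mL")
print(f"mean of per-plate estimates: {naive_mean:.3e} CFU/mL")
print(f"MPN, 5 of 10 tubes positive: {mpn_single_dilution(10, 5, 1.0, 1e-6):.3e} CFU/mL")
```

    The pooled estimate weights each plate by the amount of original sample it represents, so it differs from the unweighted mean of per-plate estimates, which gives the noisier high-dilution plate the same influence as the low-dilution one.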






  • Article type: Journal Article
    Ecological momentary assessment (EMA), a data collection method commonly employed in mHealth studies, allows for repeated real-time sampling of individuals' psychological, behavioral, and contextual states. Due to the frequent measurements, data collected using EMA are useful for understanding both the temporal dynamics in individuals' states and how these states relate to adverse health events. Motivated by data from a smoking cessation study, we propose a joint model for analyzing longitudinal EMA data to determine whether certain latent psychological states are associated with repeated cigarette use. Our method consists of a longitudinal submodel (a dynamic factor model) that models changes in the time-varying latent states, and a cumulative risk submodel (a Poisson regression model) that connects the latent states with the total number of events. In the motivating data, both the predictors (the underlying psychological states) and the event outcome (the number of cigarettes smoked) are partially unobservable; we account for this incomplete information in our proposed model and estimation method. We take a two-stage approach to estimation that leverages existing software and uses importance sampling-based weights to reduce potential bias. We demonstrate via simulation that these weights are effective at reducing bias in the cumulative risk submodel parameters. We apply our method to a subset of data from a smoking cessation study to assess the association between psychological state and cigarette smoking. The analysis shows that above-average intensities of negative mood are associated with increased cigarette use.






  • Article type: Journal Article
    BACKGROUND: Schistosomiasis is a neglected disease prevalent in tropical and sub-tropical areas of the world, especially in Africa. Diagnosis is based on the detection of the parasites in the stool or urine of children and adults. In such studies, data collected on schistosomiasis infection typically include many negative individuals, leading to high zero inflation. Count data with excessive zeros are therefore common in practice. The purpose of this analysis is to apply statistical models to such count data and evaluate their performance and results.
    METHODS: This is a secondary analysis of previously collected data. As part of the modelling process, Poisson regression, negative binomial regression, and their associated zero-inflated and hurdle models were compared to determine which offered the best fit to the count data.
    RESULTS: Overall, 94.1% of the 1345 people tested did not have any schistosomiasis eggs, resulting in high zero inflation. The negative binomial regression models (hurdle negative binomial (HNB), zero-inflated negative binomial (ZINB), and the standard negative binomial) performed better than the Poisson-based regression models (Poisson, zero-inflated Poisson, hurdle Poisson). The best models were the ZINB and HNB, and their performances were indistinguishable according to information-based criteria.
    CONCLUSIONS: The zero-inflated negative binomial and hurdle negative binomial models provided the most satisfactory fit for the over-dispersed, zero-inflated count data and are recommended for use in future statistical modelling analyses.
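
    The difference between the zero-inflated and hurdle constructions compared above, and the way an information criterion ranks them against a plain Poisson model, can be sketched as follows. The toy data (94 zeros out of 100 observations) and the hand-set zero-inflated parameters are hypothetical. Because those parameters are not fitted by maximum likelihood, the AIC values are illustrative only, and the Poisson versions are used instead of the negative binomial ones to keep the example short.

```python
import math

def poisson_pmf(k, mu):
    return math.exp(-mu) * mu ** k / math.factorial(k)

def zip_pmf(k, pi_zero, mu):
    """Zero-inflated Poisson: extra zeros are mixed with an ordinary Poisson."""
    base = (1.0 - pi_zero) * poisson_pmf(k, mu)
    return pi_zero + base if k == 0 else base

def hurdle_poisson_pmf(k, p_zero, mu):
    """Hurdle Poisson: all zeros come from one process, positives from a truncated Poisson."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * poisson_pmf(k, mu) / (1.0 - math.exp(-mu))

def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood

# Hypothetical counts: 94 zeros and 6 positive values
data = [0] * 94 + [1, 2, 3, 5, 8, 13]

mu_hat = sum(data) / len(data)  # Poisson MLE of the mean
loglik_poisson = sum(math.log(poisson_pmf(k, mu_hat)) for k in data)

# Hand-picked (not fitted) zero-inflated parameters, for illustration
loglik_zip = sum(math.log(zip_pmf(k, 0.93, 5.0)) for k in data)

print(f"Poisson AIC: {aic(loglik_poisson, 1):.1f}")
print(f"ZIP AIC:     {aic(loglik_zip, 2):.1f}")
```

    In the zero-inflated model a zero can arise from either component, while in the hurdle model every zero comes from the binary part. With data this heavy in zeros, either construction fits far better than a single Poisson mean.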






  • Article type: Journal Article
    Droplet-based single-cell sequencing techniques rely on the fundamental assumption that each droplet encapsulates a single cell, enabling individual cell omics profiling. However, the inevitable issue of multiplets, where two or more cells are encapsulated within a single droplet, can lead to spurious cell type annotations and obscure true biological findings. The issue of multiplets is exacerbated in single-cell multiomics settings, where integrating cross-modality information for clustering can inadvertently promote the aggregation of multiplet clusters and increase the risk of erroneous cell type annotations. Here, we propose a compound Poisson model-based framework for multiplet detection in single-cell multiomics data. Leveraging experimental cell hashing results as the ground truth for multiplet status, we conducted trimodal DOGMA-seq experiments and generated 17 benchmarking datasets from two tissues, involving a total of 280,123 droplets. We demonstrated that the proposed method is an essential tool for integrating cross-modality multiplet signals, effectively eliminating multiplet clusters in single-cell multiomics data, a task at which the benchmarked single-omics methods proved inadequate.
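
    To give a sense of why multiplets are unavoidable, the sketch below uses the basic Poisson loading model, in which the number of cells per droplet is assumed to follow a Poisson distribution with mean lam. This is a simplification chosen for illustration and is not the compound Poisson detection framework proposed in the abstract. The loading rates are hypothetical.

```python
import math

def multiplet_fraction(lam):
    """Share of non-empty droplets holding two or more cells under Poisson loading."""
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    return (1.0 - p_empty - p_single) / (1.0 - p_empty)

# Hypothetical mean numbers of cells per droplet
for lam in (0.05, 0.1, 0.2, 0.5):
    print(f"lam={lam}: multiplet fraction among non-empty droplets = {multiplet_fraction(lam):.3f}")
```

    Under this model the multiplet fraction rises with the loading rate, so increasing throughput by loading more cells also increases the share of droplets that need to be detected and removed.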





