Poisson Distribution

  • Article type: Journal Article
    BACKGROUND: In sub-Saharan African countries, preventable and manageable diseases such as diarrhea and acute respiratory infections still claim the lives of children. Hence, this study aims to estimate the rate of change in the log expected number of days a child suffers from diarrhea (NOD) and flu/common cold (NOF) among children aged 6 to 11 months at the baseline of the study.
    METHODS: This study used secondary data which exhibit a longitudinal and multilevel structure. Based on the results of exploratory analysis, a multilevel zero-inflated Poisson regression model with a rate of change in the log expected NOD and NOF described by a quadratic trend was proposed to efficiently analyze both outcomes accounting for correlation between observations and individuals through random effects. Furthermore, residual plots were used to assess the goodness of fit of the model.
    RESULTS: Considering subject and cluster-specific random effects, the results revealed a quadratic trend in the rate of change of the log expected NOD. Initially, low-dose iron Micronutrient Powder (MNP) users exhibited a higher rate of change compared to non-users, but this trend reversed over time. Similarly, the log expected NOF decreased for children who used MNP and were exclusively breastfed for six months, in comparison to their counterparts. In addition, the odds of not having flu decreased with each two-week increment for MNP users, as compared to non-MNP users. Furthermore, an increase in NOD resulted in an increase in the log expected NOF. Region and exclusive breastfeeding also had significant relationships with both NOD and NOF.
    CONCLUSIONS: The findings of this study underscore the importance of commencing analysis of data generated from a study with exploratory analysis. The study highlights the critical role of promoting EBF for the first six months and supporting children with additional food after six months to reduce the burden of infectious diseases.
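
As a rough illustration of the zero-inflated Poisson (ZIP) building block named in the methods, the following Python sketch evaluates the ZIP probability mass function and log-likelihood. It is a minimal sketch under stated assumptions: it omits the study's random effects and quadratic time trend, the function names are hypothetical, and the counts are invented for illustration rather than taken from the study.

```python
import math

def poisson_pmf(k, lam):
    """P(Y = k) for a Poisson variable with mean lam (lam > 0)."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def zip_pmf(k, pi_zero, lam):
    """Zero-inflated Poisson: with probability pi_zero the count is a
    structural zero, otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi_zero) * poisson_pmf(k, lam)
    return pi_zero + base if k == 0 else base

def zip_log_likelihood(counts, pi_zero, lam):
    """Log-likelihood of independent counts under the ZIP model."""
    return sum(math.log(zip_pmf(k, pi_zero, lam)) for k in counts)

# Hypothetical illness-day counts (illustration only, not study data).
days_ill = [0, 0, 0, 0, 3, 0, 1, 0, 0, 5, 0, 2]

# ZIP with an assumed 50% structural-zero share versus a plain Poisson
# (pi_zero = 0) fitted at the sample mean.
ll_zip = zip_log_likelihood(days_ill, pi_zero=0.5, lam=2.0)
ll_pois = zip_log_likelihood(days_ill, pi_zero=0.0,
                             lam=sum(days_ill) / len(days_ill))
print(ll_zip, ll_pois)
```

With many zeros alongside a few large counts, the ZIP log-likelihood tends to exceed the plain Poisson one, which is the kind of pattern an exploratory analysis would flag before a zero-inflated model is chosen.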

  • Article type: Journal Article
    Recent research has established the existence of a correlation between women's education and fertility, suggesting that they share similar risk factors. However, in many studies, the two variables were analysed separately, which could bias the conclusions by undermining the apparent correlations of such paired outcomes. In this article, univariate and bivariate Poisson regression models were applied to a nationally representative sample of 24,562 women from the 2015-16 Malawi demographic and health survey to examine the risk factors of women's education levels and fertility. The R software version 4.1.2 was used for the analyses. The results showed that estimates from the bivariate Poisson model were consistent with those obtained from the separate univariate Poisson models. The sizes of the coefficient estimates, their standard errors, p-values, and directions were comparable in both bivariate and univariate Poisson models. Using either the univariate or bivariate Poisson model, it was found that the age of a woman at first sexual experience, her current age, household wealth index, and contraceptive usage were significantly associated with both the woman's schooling and fertility. The study further revealed that ethnicity, religion, and region of residence impacted education level only and not fertility. Similarly, marital status and occupation impacted fertility only and not education. The study also found that higher education levels were linked to a lower number of children, with a strong negative correlation of -0.62 between the two variables. The study recommends using bivariate Poisson regression for analysing paired count response data when there is an apparent covariance between the outcome variables. The results suggest that efforts by policymakers to achieve the desired women's sexual and reproductive health in sub-Saharan Africa should be intertwined with improving women's and girls' educational attainment in the region.
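
For readers unfamiliar with bivariate Poisson models, the sketch below shows one classic construction (trivariate reduction), in which two counts share a common Poisson component. This is an assumption for illustration: the abstract does not say which bivariate Poisson formulation was fitted, and this particular construction only allows non-negative covariance, so a model reporting a negative correlation would need a different formulation. All names and parameter values are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam); lam = 0 gives a point mass at 0."""
    if lam == 0.0:
        return 1.0 if k == 0 else 0.0
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def bivariate_poisson_pmf(y1, y2, lam1, lam2, lam0):
    """P(Y1 = y1, Y2 = y2) where Y1 = X1 + X0 and Y2 = X2 + X0 with
    independent Xi ~ Poisson(lam_i). The shared part X0 induces
    Cov(Y1, Y2) = lam0."""
    return sum(
        poisson_pmf(y1 - k, lam1) * poisson_pmf(y2 - k, lam2) * poisson_pmf(k, lam0)
        for k in range(min(y1, y2) + 1)
    )

def bivariate_poisson_correlation(lam1, lam2, lam0):
    """Correlation implied by the construction above (always >= 0)."""
    return lam0 / math.sqrt((lam1 + lam0) * (lam2 + lam0))

# Hypothetical rates: marginal means are 2.0 and 2.5, covariance 0.5.
print(bivariate_poisson_pmf(2, 3, lam1=1.5, lam2=2.0, lam0=0.5))
print(bivariate_poisson_correlation(1.5, 2.0, 0.5))
```

Setting lam0 to zero makes the joint probability factor into two independent univariate Poisson terms, which is one way to see why bivariate and separate univariate fits can give similar coefficient estimates when the shared component is small.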

  • Article type: Journal Article
    The Centre for Disease Control and Prevention (CDC) in Yangquan, China, has taken a series of preventive and control measures in response to the rising incidence of Kala-Azar. We propose a new model to evaluate the effectiveness of these interventions more rigorously.
    We obtained Kala-Azar incidence data for 2017 to 2021 from the Yangquan CDC. We constructed a Poisson segmented regression model, a harmonic Poisson segmented regression model, and an improved harmonic Poisson segmented regression model, and used each of the three models to estimate the intervention effect. Finally, we selected the optimal model by comparing the fit of the three models.
    The primary analysis showed an underlying upward trend in Kala-Azar before the intervention [incidence rate ratio (IRR): 1.045, 95% confidence interval (CI): 1.027-1.063, p < 0.001]. In terms of long-term effects, the rise in Kala-Azar slowed significantly after the intervention (IRR: 0.960, 95% CI: 0.927-0.995, p = 0.026), and the risk of Kala-Azar increased by 0.3% for each additional month after the intervention (β1 + β3 = 0.003, IRR = 1.003). The improved harmonic Poisson segmented regression model fitted best, with the lowest MSE, MAE, and RMSE (0.017, 0.101, and 0.130, respectively).
    In the long term, the intervention measures taken by the Yangquan CDC can effectively curb the upward trend in Kala-Azar. The improved harmonic Poisson segmented regression model has higher fitting performance and can provide a scientific reference for evaluating the effect of interventions on seasonal infectious diseases.
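
The reported post-intervention slope can be checked from the two incidence rate ratios, since an IRR is the exponential of a regression coefficient. The Python sketch below does that arithmetic and also writes out one common parameterization of a segmented Poisson regression with a single harmonic pair. The parameterization, the 12-month period, the intervention month, and all names are assumptions for illustration; the abstract does not give the exact form of the "improved" model.

```python
import math

# Reported IRRs from the abstract.
irr_pre_trend = 1.045     # monthly trend before the intervention
irr_trend_change = 0.960  # change in monthly trend after the intervention

beta1 = math.log(irr_pre_trend)
beta3 = math.log(irr_trend_change)
post_slope = beta1 + beta3       # monthly log-rate slope after intervention
post_irr = math.exp(post_slope)  # monthly IRR after intervention
print(round(post_slope, 3), round(post_irr, 3))

def log_mean(month, start, b0, b1, b2, b3, g_sin=0.0, g_cos=0.0, period=12):
    """Log expected count in a segmented Poisson regression (assumed form):
    b1 = pre-intervention trend, b2 = level change at `start`,
    b3 = trend change after `start`, plus one seasonal harmonic pair."""
    after = 1.0 if month >= start else 0.0
    seasonal = (g_sin * math.sin(2.0 * math.pi * month / period)
                + g_cos * math.cos(2.0 * math.pi * month / period))
    return b0 + b1 * month + b2 * after + b3 * (month - start) * after + seasonal

# Hypothetical intervention at month 24; slope between two later months.
slope_check = (log_mean(31, 24, 0.0, beta1, -0.2, beta3)
               - log_mean(30, 24, 0.0, beta1, -0.2, beta3))
```

Rounded to three decimals, the computed slope and IRR agree with the values quoted in the abstract (0.003 and 1.003).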

  • Article type: Journal Article
    To understand the transmissibility and spread of infectious diseases, epidemiologists turn to estimates of the instantaneous reproduction number. While many estimation approaches exist, their utility may be limited. Challenges of surveillance data collection, model assumptions that are unverifiable with data alone, and computationally inefficient frameworks are critical limitations for many existing approaches. We propose a discrete spline-based approach that solves a convex optimization problem (Poisson trend filtering) using the proximal Newton method. It produces a locally adaptive estimator for instantaneous reproduction number estimation with heterogeneous smoothness. Our methodology remains accurate even under some process misspecifications and is computationally efficient, even for large-scale data. The implementation is easily accessible in a lightweight R package, rtestim.
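
To make the idea of Poisson trend filtering more concrete, the following Python sketch evaluates a penalized Poisson objective of the general kind described: a Poisson negative log-likelihood for incidence plus an L1 penalty on discrete differences of the log reproduction number. It is a sketch under stated assumptions. The exact objective, scaling, and defaults used by rtestim are not given in the abstract, no optimizer is included, and every name and number below is hypothetical.

```python
import math

def diff(values, order):
    """Apply the first-difference operator `order` times."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values

def poisson_trend_filter_objective(theta, cases, weighted_past, lam, degree=1):
    """Penalized negative Poisson log-likelihood, up to a constant.

    theta[t] is the log reproduction number, cases[t] the observed
    incidence, and weighted_past[t] an assumed serial-interval-weighted
    sum of earlier cases, so the model mean is
    weighted_past[t] * exp(theta[t]).
    """
    n = len(cases)
    nll = sum(w * math.exp(th) - y * th
              for th, y, w in zip(theta, cases, weighted_past)) / n
    penalty = lam * sum(abs(d) for d in diff(theta, degree + 1))
    return nll + penalty

# Hypothetical incidence series and weighted past incidence.
cases = [10, 12, 15, 19, 24]
weighted_past = [9.0, 10.5, 12.8, 16.0, 20.1]

flat = [math.log(1.2)] * 5             # constant reproduction number
kinked = [0.0, 0.1, 0.2, 0.1, 0.0]     # slope changes once

print(poisson_trend_filter_objective(flat, cases, weighted_past, lam=1.0))
print(poisson_trend_filter_objective(kinked, cases, weighted_past, lam=1.0))
```

With degree 1, only changes in slope are penalized, so a constant or linear log reproduction number incurs no penalty while a kink does. That is what makes the resulting estimate piecewise smooth and locally adaptive.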

  • Article type: English Abstract
    The aim of this study was to develop a methodology for estimating cancer incidence in Brazil and its regions. Using data from population-based cancer registries (RCBP, acronym in Portuguese) and the Brazilian Mortality Information System (SIM, acronym in Portuguese), annual incidence/mortality (I/M) ratios were calculated by type of cancer, age group and sex in each RCBP. Poisson longitudinal multilevel models were applied to estimate the I/M ratios by region in 2018. The estimate of new cancer cases in 2018 was calculated by applying the estimated I/M ratios to the number of SIM-corrected deaths that occurred that year. North and Northeast concentrated the lowest I/M ratios. Pancreatic, lung, liver and esophageal cancers had the lowest I/M ratios, whereas the highest were estimated for thyroid, testicular, prostate and female breast cancers. For 2018, 506,462 new cancer cases were estimated in Brazil. Female breast and prostate were the two main types of cancer in all regions. In the North and Northeast, cervical and stomach cancers stood out. Differences in the I/M ratios between regions were observed and may be related to socioeconomic development and access to health services.

  • Article type: Journal Article
    In recent decades, pension reforms have been implemented to address the financial sustainability of social security systems, resulting in an increase in the retirement age. This adjustment has led to ongoing debates about the relationship between retirement and health. This study investigates the impact of time spent in retirement on the risk of cardiovascular disease (CVD) in Italy. It uses a comprehensive dataset that includes socioeconomic, health, and behavioural risk factors, which is linked to administrative hospitalisation and mortality registers. To address the potential endogeneity of retirement, we employ an instrumental variables approach embedded in a Poisson rate model. The results show that, on average, years spent in retirement have a beneficial effect on the risk of CVD for both men and women. Each additional year spent in retirement reduces the incidence of such diseases by about 17% for men and 29% for women. Stratified analyses and robustness tests show that the benefits of retirement appear to be more robust and pronounced in men and in certain groups, particularly men in manual occupations or with poor ergonomic conditions at work. These results highlight that delaying access to retirement may lead to an increased burden of CVD in the older population. In addition, the protective effect of retirement on the development of CVD among workers with poorer ergonomic conditions underlines the different impact of increasing the retirement age on different categories of workers and the need for targeted and differentiated policies to avoid hitting the more vulnerable.

  • Article type: Journal Article
    Measuring the abundance of microbes in a sample is a common procedure with a long history, but best practices are not well conserved across microbiological fields. Serial dilution methods are commonly used to dilute bacterial cultures to produce countable numbers of colonies and, from these counts, to infer bacterial concentrations measured in colony-forming units (CFUs). The most common methods to generate data for CFU point estimates involve plating bacteria on (or in) a solid growth medium and counting the resulting colonies, or counting the number of tubes at a given dilution that show growth. Traditionally, these types of data have been analyzed separately using different analytic methods. Here, we build a direct correspondence between these approaches, which allows one to extend the use of the most probable number method from the liquid-tube experiments for which it was developed to growth plates, by viewing colony-sized patches of a plate as equivalent to individual tubes. We also discuss how to combine measurements taken at different dilutions, and we review several ways of analyzing colony counts, including the Poisson and truncated Poisson methods. We test all point estimate methods computationally using simulated data. For all methods, we discuss their relevant error bounds, assumptions, strengths, and weaknesses. We provide an online calculator for these estimators. Estimation of the number of microbes in a sample is an important problem with a long history. Yet common practices, such as combining results from different measurements, remain sub-optimal. We provide a comparison of methods for estimating the abundance of microbes and detail a mapping between different methods, which allows one to extend their range of applicability. This mapping enables higher-precision estimates of colony-forming units (CFUs) using the same data already collected for traditional CFU estimation methods. Furthermore, we provide recommendations for how to combine measurements of colony counts taken across dilutions, correcting several misconceptions in the literature.
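
The two families of estimators mentioned above can be sketched in a few lines of Python. The first function pools plate counts across dilutions under a simple Poisson model; the second computes a most probable number (MPN) estimate from growth/no-growth tubes by solving the likelihood equation with bisection. These are textbook forms written under stated assumptions (independent plates or tubes, no truncation for uncountable plates); they are not claimed to match the authors' online calculator, and the function names and example numbers are hypothetical.

```python
import math

def pooled_poisson_cfu(counts, volumes_ml, dilutions):
    """Poisson maximum-likelihood estimate of concentration (CFU per mL
    of undiluted sample) when each plate count is Poisson with mean
    c * volume * dilution: total colonies / total sample plated."""
    sampled = sum(v * d for v, d in zip(volumes_ml, dilutions))
    return sum(counts) / sampled

def mpn_cfu(positive, total, volumes_ml, dilutions, iterations=200):
    """Most probable number: MLE of c when a tube shows growth with
    probability 1 - exp(-c * volume * dilution)."""
    amounts = [v * d for v, d in zip(volumes_ml, dilutions)]
    if sum(positive) == 0:
        return 0.0
    if all(g == n for g, n in zip(positive, total)):
        raise ValueError("all tubes positive: no finite estimate")

    def score(c):
        # Derivative of the log-likelihood; strictly decreasing in c.
        s = 0.0
        for g, n, a in zip(positive, total, amounts):
            x = c * a
            s += g * a * math.exp(-x) / -math.expm1(-x) - (n - g) * a
        return s

    lo, hi = 0.0, 1.0 / max(amounts)
    while score(hi) > 0.0:       # expand until the root is bracketed
        hi *= 2.0
    for _ in range(iterations):  # plain bisection on the score
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical plates: 0.1 mL plated at 1e-3 and 1e-4 dilutions.
print(pooled_poisson_cfu([212, 25], [0.1, 0.1], [1e-3, 1e-4]))
# Hypothetical tube series: 5 tubes per dilution, 1 mL each.
print(mpn_cfu([5, 3, 0], [5, 5, 5], [1.0, 1.0, 1.0], [1e-1, 1e-2, 1e-3]))
```

Under this Poisson model, pooling divides the total colony count by the total amount of original sample plated, rather than averaging per-dilution estimates. Treating colony-sized patches of a plate as tubes, as the abstract proposes, would let the same MPN routine be applied to plate data.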

  • Article type: Journal Article
    Ecological momentary assessment (EMA), a data collection method commonly employed in mHealth studies, allows for repeated real-time sampling of individuals' psychological, behavioral, and contextual states. Due to the frequent measurements, data collected using EMA are useful for understanding both the temporal dynamics in individuals' states and how these states relate to adverse health events. Motivated by data from a smoking cessation study, we propose a joint model for analyzing longitudinal EMA data to determine whether certain latent psychological states are associated with repeated cigarette use. Our method consists of a longitudinal submodel (a dynamic factor model) that models changes in the time-varying latent states and a cumulative risk submodel (a Poisson regression model) that connects the latent states with the total number of events. In the motivating data, both the predictors (the underlying psychological states) and the event outcome (the number of cigarettes smoked) are partially unobservable; we account for this incomplete information in our proposed model and estimation method. We take a two-stage approach to estimation that leverages existing software and uses importance sampling-based weights to reduce potential bias. We demonstrate via simulation that these weights are effective at reducing bias in the cumulative risk submodel parameters. We apply our method to a subset of data from a smoking cessation study to assess the association between psychological state and cigarette smoking. The analysis shows that above-average intensities of negative mood are associated with increased cigarette use.

  • Article type: Journal Article
    BACKGROUND: Schistosomiasis is a neglected disease prevalent in tropical and sub-tropical areas of the world, especially in Africa. Detecting the presence of the disease is based on the detection of the parasites in the stool or urine of children and adults. In such studies, data collected on schistosomiasis infection typically include many negative individuals, leading to high zero inflation. Thus, in practice, count data with excessive zeros are common. The purpose of this analysis is to apply statistical models to such count data and evaluate their performance and results.
    METHODS: This is a secondary analysis of previously collected data. As part of the modelling process, Poisson regression, negative binomial regression, and their associated zero-inflated and hurdle models were compared to determine which offered the best fit to the count data.
    RESULTS: Overall, 94.1% of the 1345 people tested did not have any schistosomiasis eggs, resulting in high zero inflation. The negative binomial regression models (hurdle negative binomial (HNB), zero-inflated negative binomial (ZINB), and the standard negative binomial) performed better than the Poisson-based regression models (Poisson, zero-inflated Poisson, hurdle Poisson). The best models were the ZINB and HNB, whose performances were indistinguishable according to information-based criteria.
    CONCLUSIONS: The zero-inflated negative binomial and hurdle negative binomial models were found to be the most satisfactory fit for modelling the over-dispersed, zero-inflated count data and are recommended for use in future statistical modelling analyses.
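
The difference between the two best-performing model families can be illustrated with their probability mass functions. The Python sketch below writes out a negative binomial in mean/dispersion form, the zero-inflated (ZINB) and hurdle (HNB) variants, and the AIC used to compare fits. It is a minimal sketch under stated assumptions: covariates are omitted, the parameter values are hypothetical, and nothing here reproduces the study's fitted models.

```python
import math

def nb_pmf(k, mu, size):
    """Negative binomial with mean mu and dispersion `size`
    (variance = mu + mu**2 / size)."""
    log_p = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
             + size * math.log(size / (size + mu))
             + k * math.log(mu / (size + mu)))
    return math.exp(log_p)

def zinb_pmf(k, pi_zero, mu, size):
    """Zero-inflated NB: zeros come from a structural-zero class
    (probability pi_zero) or from the NB count process itself."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, size)
    return pi_zero + base if k == 0 else base

def hnb_pmf(k, p_zero, mu, size):
    """Hurdle NB: every zero comes from the binary part; positive
    counts follow an NB truncated at zero."""
    if k == 0:
        return p_zero
    return (1.0 - p_zero) * nb_pmf(k, mu, size) / (1.0 - nb_pmf(0, mu, size))

def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical parameters: both models put 94.1% of the mass at zero.
nb_zero = nb_pmf(0, mu=5.0, size=1.0)
pi_zero = (0.941 - nb_zero) / (1.0 - nb_zero)  # ZINB structural-zero share
print(zinb_pmf(0, pi_zero, 5.0, 1.0), hnb_pmf(0, 0.941, 5.0, 1.0))
```

Both formulations can reproduce the same share of zeros and the same distribution of positive counts, which gives some intuition for why information criteria may fail to separate ZINB from HNB on a data set like this one.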

  • Article type: Journal Article
    Droplet-based single-cell sequencing techniques rely on the fundamental assumption that each droplet encapsulates a single cell, enabling individual cell omics profiling. However, the inevitable issue of multiplets, where two or more cells are encapsulated within a single droplet, can lead to spurious cell type annotations and obscure true biological findings. The issue of multiplets is exacerbated in single-cell multiomics settings, where integrating cross-modality information for clustering can inadvertently promote the aggregation of multiplet clusters and increase the risk of erroneous cell type annotations. Here, we propose a compound Poisson model-based framework for multiplet detection in single-cell multiomics data. Leveraging experimental cell hashing results as the ground truth for multiplet status, we conducted trimodal DOGMA-seq experiments and generated 17 benchmarking datasets from two tissues, involving a total of 280,123 droplets. We demonstrated that the proposed method is an essential tool for integrating cross-modality multiplet signals, effectively eliminating multiplet clusters in single-cell multiomics data, a task at which the benchmarked single-omics methods proved inadequate.