Sandwich variance estimator

  • Article type: Journal Article
    Linear mixed models are commonly used in analyzing stepped-wedge cluster randomized trials. A key consideration for analyzing a stepped-wedge cluster randomized trial is accounting for the potentially complex correlation structure, which can be achieved by specifying random-effects. The simplest random effects structure is random intercept but more complex structures such as random cluster-by-period, discrete-time decay, and more recently, the random intervention structure, have been proposed. Specifying appropriate random effects in practice can be challenging: assuming more complex correlation structures may be reasonable but they are vulnerable to computational challenges. To circumvent these challenges, robust variance estimators may be applied to linear mixed models to provide consistent estimators of standard errors of fixed effect parameters in the presence of random-effects misspecification. However, there has been no empirical investigation of robust variance estimators for stepped-wedge cluster randomized trials. In this article, we review six robust variance estimators (both standard and small-sample bias-corrected robust variance estimators) that are available for linear mixed models in R, and then describe a comprehensive simulation study to examine the performance of these robust variance estimators for stepped-wedge cluster randomized trials with a continuous outcome under different data generators. For each data generator, we investigate whether the use of a robust variance estimator with either the random intercept model or the random cluster-by-period model is sufficient to provide valid statistical inference for fixed effect parameters, when these working models are subject to random-effect misspecification. Our results indicate that the random intercept and random cluster-by-period models with robust variance estimators performed adequately. 
    The CR3 (approximate jackknife) robust variance estimator, coupled with a degrees-of-freedom correction equal to the number of clusters minus two, consistently gave the best coverage results, but could be slightly conservative when the number of clusters was below 16. We summarize the implications of our results for the linear mixed model analysis of stepped-wedge cluster randomized trials and offer some practical recommendations on the choice of the analytic model.
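To make the CR0 and CR3 labels in the abstract above concrete, here is a minimal sketch of a cluster-robust sandwich covariance. It is a simplification, not the authors' method: it assumes an ordinary least squares working model in place of a linear mixed model, applies no multiplicative constant to CR3, and the function name `cluster_robust_cov` and its arguments are hypothetical.

```python
import numpy as np

def cluster_robust_cov(X, y, cluster, kind="CR0"):
    """Cluster-robust sandwich covariance for OLS coefficients (illustrative).

    kind="CR0": plain sandwich estimator built from raw cluster residuals.
    kind="CR3": residuals of each cluster are inflated by (I - H_gg)^{-1},
                which approximates a leave-one-cluster-out jackknife.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)

    bread = np.linalg.inv(X.T @ X)          # (X'X)^{-1}
    beta = bread @ X.T @ y                  # OLS point estimate
    resid = y - X @ beta

    meat = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(cluster):
        idx = cluster == g
        Xg, eg = X[idx], resid[idx]
        if kind == "CR3":
            # H_gg = X_g (X'X)^{-1} X_g' is the cluster's block of the hat matrix
            Hgg = Xg @ bread @ Xg.T
            eg = np.linalg.solve(np.eye(len(eg)) - Hgg, eg)
        score = Xg.T @ eg                   # cluster-level score contribution
        meat += np.outer(score, score)

    return beta, bread @ meat @ bread       # bread * meat * bread


# Small simulated example: 10 clusters of 6 observations, cluster-level treatment
rng = np.random.default_rng(0)
G, m = 10, 6
cluster = np.repeat(np.arange(G), m)
treat = (cluster % 2).astype(float)
x = rng.normal(size=G * m)
X = np.column_stack([np.ones(G * m), treat, x])
y = 1.0 + 0.5 * treat + 0.3 * x + rng.normal(size=G)[cluster] + rng.normal(size=G * m)

beta, V_cr0 = cluster_robust_cov(X, y, cluster, kind="CR0")
_, V_cr3 = cluster_robust_cov(X, y, cluster, kind="CR3")
print(np.sqrt(np.diag(V_cr0)), np.sqrt(np.diag(V_cr3)))
```

The degrees-of-freedom correction mentioned in the abstract would enter when these standard errors are turned into t-based confidence intervals; it is not part of the covariance computation sketched here.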

  • Article type: Journal Article
    Individually randomized group treatment (IRGT) trials, in which the clustering of outcomes is induced by group-based treatment delivery, are increasingly popular in public health research. IRGT trials frequently incorporate longitudinal measurements, and proper sample size calculations should account for correlation structures reflecting both the treatment-induced clustering and the repeated outcome measurements. Given the relatively sparse literature on designing longitudinal IRGT trials, we propose sample size procedures for continuous and binary outcomes based on the generalized estimating equations approach, employing block exchangeable correlation structures with different correlation parameters for the treatment arm and for the control arm, and surveying five marginal mean models with different assumptions about the time effect: no-time constant treatment effect, linear-time constant treatment effect, categorical-time constant treatment effect, linear time by treatment interaction, and categorical time by treatment interaction. Closed-form sample size formulas are derived for continuous outcomes, which depend on the eigenvalues of the correlation matrices; detailed numerical sample size procedures are proposed for binary outcomes. Through simulations, we demonstrate that the empirical power agrees well with the predicted power, for as few as eight groups formed in the treatment arm, when data are analyzed using the matrix-adjusted estimating equations for the correlation parameters with a bias-corrected sandwich variance estimator.

  • Article type: Journal Article
    Multivariate interval-censored data arise when there are multiple types of events or clusters of study subjects, such that the event times are potentially correlated and when each event is only known to occur over a particular time interval. We formulate the effects of potentially time-varying covariates on the multivariate event times through marginal proportional hazards models while leaving the dependence structures of the related event times unspecified. We construct the nonparametric pseudolikelihood under the working assumption that all event times are independent, and we provide a simple and stable EM-type algorithm. The resulting nonparametric maximum pseudolikelihood estimators for the regression parameters are shown to be consistent and asymptotically normal, with a limiting covariance matrix that can be consistently estimated by a sandwich estimator under arbitrary dependence structures for the related event times. We evaluate the performance of the proposed methods through extensive simulation studies and present an application to data from the Atherosclerosis Risk in Communities Study.

  • Article type: Journal Article
    OBJECTIVE: Marginal models with generalized estimating equations (GEE) are usually recommended for analyzing correlated ordinal outcomes which are commonly seen in a longitudinal study or clustered randomized trial (CRT). Within-cluster association is often of interest in longitudinal studies or CRTs, and can be estimated with paired estimating equations. However, the estimators for within-cluster association parameters and variances may be subject to finite-sample biases when the number of clusters is small. The objective of this article is to introduce a newly developed R package ORTH.Ord for analyzing correlated ordinal outcomes using GEE models with finite-sample bias corrections.
    METHODS: The R package ORTH.Ord implements a modified version of alternating logistic regressions with estimation based on orthogonalized residuals (ORTH), which use paired estimating equations to jointly estimate parameters in marginal mean and association models. The within-cluster association between ordinal responses is modeled by global pairwise odds ratios (POR). The R package also provides a finite-sample bias correction to POR parameter estimates based on matrix multiplicative adjusted orthogonalized residuals (MMORTH) for correcting estimating equations, and bias-corrected sandwich estimators with different options for covariance estimation.
    RESULTS: A simulation study shows that MMORTH provides less biased global POR estimates and coverage of their 95% confidence intervals closer to the nominal level than uncorrected ORTH. An analysis of patient-reported outcomes from an orthognathic surgery clinical trial illustrates features of ORTH.Ord.
    CONCLUSIONS: This article provides an overview of the ORTH method with bias-correction on both estimating equations and sandwich estimators for analyzing correlated ordinal data, describes the features of the ORTH.Ord R package, evaluates the performance of the package using a simulation study, and finally illustrates its application in an analysis of a clinical trial.

  • Article type: Journal Article
    Cluster-randomized trials (CRTs) involve randomizing entire groups of participants, called clusters, to treatment arms but often comprise a limited or fixed number of available clusters. While covariate adjustment can account for chance imbalances between treatment arms and increase statistical efficiency in individually randomized trials, analytical methods for individual-level covariate adjustment in small CRTs have received little attention to date. In this paper, we systematically investigate, through extensive simulations, the operating characteristics of propensity score weighting and multivariable regression as two individual-level covariate adjustment strategies for estimating the participant-average causal effect in small CRTs with a rare binary outcome, and we identify scenarios where each adjustment strategy has a relative efficiency advantage over the other in order to make practical recommendations. We also examine the finite-sample performance of the bias-corrected sandwich variance estimators associated with propensity score weighting and multivariable regression for quantifying the uncertainty in estimating the participant-average treatment effect. To illustrate the methods for individual-level covariate adjustment, we reanalyze a recent CRT testing a sedation protocol in 31 pediatric intensive care units.

  • Article type: Journal Article
    Although logistic regression is the most popular method for modelling regression relationships with binary responses, many find relative risk (RR), or risk ratio, easier to interpret and prefer to use this measure of risk in regression analysis. Indeed, since Zou published his modified Poisson regression approach for modelling RR for cross-sectional data, his paper has been cited over 7,000 times, demonstrating the popularity of this alternative measure of risk in regression analysis involving binary responses. As longitudinal studies have become increasingly popular in clinical trials and observational studies, it is imperative to extend Zou's approach to longitudinal data. The two most popular approaches for longitudinal data analysis are the generalised linear mixed-effects model (GLMM) and generalised estimating equations (GEE). However, the parametric GLMM cannot be used for the extension within the current context, because Zou's approach treats the binary response as a Poisson variable, which is at odds with the Bernoulli distribution for the binary response. On the other hand, as it imposes no mathematical model on data distributions, the semiparametric GEE is coherent with Zou's modified Poisson regression. In this paper, we develop a GEE-based longitudinal model for binary responses to provide inference about RR.
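For readers unfamiliar with the cross-sectional starting point described in the abstract above, the following is a minimal sketch of modified Poisson regression: a log-link Poisson fit to a binary outcome, paired with a sandwich variance so that the misspecified Poisson variance does not drive the standard errors. It covers only the independent-data case, not the GEE-based longitudinal extension the paper develops. The function name `modified_poisson` is hypothetical, and the code assumes the first column of `X` is an intercept.

```python
import numpy as np

def modified_poisson(X, y, n_iter=50, tol=1e-10):
    """Log-link Poisson fit to a binary outcome with a sandwich covariance.

    Returns (beta, cov); exp(beta[j]) is the estimated relative risk for
    column j. Assumes column 0 of X is an intercept and that y.mean() > 0.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)

    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())              # start at the overall risk
    for _ in range(n_iter):                 # Newton-Raphson on the Poisson score
        mu = np.exp(X @ beta)
        score = X.T @ (y - mu)
        info = X.T @ (X * mu[:, None])
        step = np.linalg.solve(info, score)
        beta += step
        if np.max(np.abs(step)) < tol:
            break

    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))   # inverse model-based information
    Xr = X * (y - mu)[:, None]
    meat = Xr.T @ Xr                                  # empirical score variance
    return beta, bread @ meat @ bread


# Toy data: risk 20/100 in the unexposed group and 40/100 in the exposed group
X = np.column_stack([np.ones(200), np.repeat([0.0, 1.0], 100)])
y = np.concatenate([np.ones(20), np.zeros(80), np.ones(40), np.zeros(60)])

beta, cov = modified_poisson(X, y)
rr = np.exp(beta[1])                        # estimated relative risk
se_log_rr = np.sqrt(cov[1, 1])              # robust standard error of log(RR)
print(rr, se_log_rr)
```

In this saturated two-group example the estimated relative risk is the ratio of the two observed risks, and the sandwich variance of log(RR) reduces to the familiar delta-method expression (1 - p0)/(n0 p0) + (1 - p1)/(n1 p1).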

  • Article type: Journal Article
    While statistical methods for analyzing cluster randomized trials with continuous and binary outcomes have been extensively studied and compared, little comparative evidence has been provided for analyzing cluster randomized trials with survival outcomes in the presence of competing risks. Motivated by the Strategies to Reduce Injuries and Develop Confidence in Elders trial, we carried out a simulation study to compare the operating characteristics of several existing population-averaged survival models, including the marginal Cox, marginal Fine and Gray, and marginal multi-state models. For each model, we found that adjusting for the intraclass correlations through the sandwich variance estimator effectively maintained the type I error rate when the number of clusters was large. With no more than 30 clusters, however, the sandwich variance estimator can exhibit notable negative bias, and a permutation test provides better control of type I error inflation. Under the alternative, the power for each model is differentially affected by two types of intraclass correlations: the within-individual and between-individual correlations. Furthermore, the marginal Fine and Gray model occasionally leads to higher power than the marginal Cox model or the marginal multi-state model, especially when the competing event rate is high. Finally, we provide an illustrative analysis of the Strategies to Reduce Injuries and Develop Confidence in Elders trial using each analytical strategy considered.

  • Article type: Journal Article
    The marginal Fine-Gray proportional subdistribution hazards model is a popular approach to directly study the association between covariates and the cumulative incidence function with clustered competing risks data, which often arise in multicenter randomized trials or multilevel observational studies. To account for the within-cluster correlations between failure times, the uncertainty of the regression parameter estimators is quantified by the robust sandwich variance estimator, which may have unsatisfactory performance with a limited number of clusters. To overcome this limitation, we propose four bias-corrected variance estimators to reduce the negative bias of the usual sandwich variance estimator, extending the bias-correction techniques from generalized estimating equations with noncensored exponential family outcomes to clustered competing risks outcomes. We further compare their finite-sample operating characteristics through simulations and two real data examples. In particular, we found that the Mancl and DeRouen (MD) type sandwich variance estimator generally has the smallest bias. Furthermore, with a small number of clusters, the Wald t-confidence interval with the MD sandwich variance estimator carries close to nominal coverage for the cluster-level effect parameter. The t-confidence intervals based on the sandwich variance estimator with any one of the three types of multiplicative bias correction, or the z-confidence interval with the Morel, Bokossa and Neerchal (MBN) type sandwich variance estimator, have close to nominal coverage for the individual-level effect parameter. Finally, we develop a user-friendly R package crrcbcv implementing the proposed sandwich variance estimators to assist practical applications.

  • Article type: Journal Article
    As our understanding of the microbiome has expanded, so has the recognition of its critical role in human health and disease, thereby emphasizing the importance of testing whether microbes are associated with environmental factors or clinical outcomes. However, many of the fundamental challenges that concern microbiome surveys arise from statistical and experimental design issues, such as the sparse and overdispersed nature of microbiome count data and the complex correlation structure among samples. For example, in the human microbiome project (HMP) dataset, the repeated observations across time points (level 1) are nested within body sites (level 2), which are further nested within subjects (level 3). Therefore, there is a great need for the development of specialized and sophisticated statistical tests. In this paper, we propose multilevel zero-inflated negative-binomial models for association analysis in microbiome surveys. We develop a variational approximation method for maximum likelihood estimation and inference. It uses optimization, rather than sampling, to approximate the log-likelihood and compute parameter estimates, provides a robust estimate of the covariance of parameter estimates and constructs a Wald-type test statistic for association testing. We evaluate and demonstrate the performance of our method using extensive simulation studies and an application to the HMP dataset. We have developed an R package MZINBVA to implement the proposed method, which is available from the GitHub repository https://github.com/liudoubletian/MZINBVA.

  • Article type: Journal Article
    Inverse probability weighted Cox models can be used to estimate marginal hazard ratios under different point treatments in observational studies. To obtain variance estimates, the robust sandwich variance estimator is often recommended to account for the induced correlation among weighted observations. However, this estimator does not incorporate the uncertainty in estimating the weights and tends to overestimate the variance, leading to inefficient inference. Here we propose a new variance estimator that combines the estimation procedures for the hazard ratio and weights using stacked estimating equations, with additional adjustments for the sum of terms that are not independently and identically distributed in a Cox partial likelihood score equation. We prove analytically that the robust sandwich variance estimator is conservative and establish the asymptotic equivalence between the proposed variance estimator and one obtained through linearization by Hajage et al. in 2018. In addition, we extend our proposed variance estimator to accommodate clustered data. We compare the finite sample performance of the proposed method with alternative methods through simulation studies. We illustrate these different variance methods in both independent and clustered data settings, using a bariatric surgery dataset and a multiple readmission dataset, respectively. To facilitate implementation of the proposed method, we have developed an R package ipwCoxCSV.