Interval censoring

  • Article type: Journal Article
    Interval-censored failure time data frequently arise in scientific studies where each subject is examined periodically for the occurrence of the failure event of interest, so that the failure time is only known to lie in a specific time interval. In addition, the collected data may include multiple correlated observed variables, leading to severe multicollinearity. This work proposes a factor-augmented transformation model to analyze interval-censored failure time data while reducing model dimensionality and avoiding the multicollinearity induced by multiple correlated covariates. We provide a joint modeling framework comprising a factor analysis model, which groups multiple observed variables into a few latent factors, and a class of semiparametric transformation models with the augmented factors, which examines the effects of these factors and of other covariates on the failure event. Furthermore, we propose a nonparametric maximum likelihood estimation approach and develop a computationally stable and reliable expectation-maximization algorithm for its implementation. We establish the asymptotic properties of the proposed estimators and conduct simulation studies to assess the empirical performance of the proposed method. An application to the Alzheimer's Disease Neuroimaging Initiative (ADNI) study is provided. An R package, ICTransCFA, is also available for practitioners. Data used in preparation of this article were obtained from the ADNI database.
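
    For orientation, the likelihood that such models typically maximize can be written down in its generic form. The block below is a sketch of the standard interval-censored likelihood under a semiparametric transformation model, not the paper's exact formulation: the symbols $(L_i, R_i]$, $Z_i$, $G$, $\Lambda_0$, and $\beta$ are notation assumed here, and $Z_i$ stands for all covariates without showing how the latent factors enter.

```latex
% Assumed notation: subject i has the failure time bracketed by (L_i, R_i],
% covariate vector Z_i, baseline cumulative hazard \Lambda_0, and a known
% increasing transformation G.
\Lambda(t \mid Z_i) = G\!\left\{ \Lambda_0(t)\, e^{\beta^{\top} Z_i} \right\},
\qquad
S(t \mid Z_i) = \exp\!\left\{ -\Lambda(t \mid Z_i) \right\},

L_n(\beta, \Lambda_0) = \prod_{i=1}^{n}
  \left[ S(L_i \mid Z_i) - S(R_i \mid Z_i) \right],
\qquad S(\infty \mid Z_i) = 0 .
```

    In this family, $G(x) = x$ gives the proportional hazards model and $G(x) = \log(1 + x)$ gives the proportional odds model; a right-censored subject corresponds to $R_i = \infty$.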

  • Article type: Journal Article
    BACKGROUND: Estimation of the SARS-CoV-2 incubation time distribution is hampered by incomplete data about infection. We discuss two biases that may result from incorrect handling of such data. Notified cases may recall recent exposures more precisely (differential recall). This creates bias if the analysis is restricted to observations with well-defined exposures, as longer incubation times are more likely to be excluded. Another bias occurred in the initial estimates based on data concerning travellers from Wuhan. Only individuals who developed symptoms after their departure were included, leading to under-representation of cases with shorter incubation times (left truncation). This issue was not addressed in the analyses performed in the literature.
    METHODS: We performed simulations and provide a literature review to investigate the amount of bias in estimated percentiles of the SARS-CoV-2 incubation time distribution.
    RESULTS: Depending on the rate of differential recall, restricting the analysis to a subset of narrow exposure windows resulted in underestimation in the median and even more in the 95th percentile. Failing to account for left truncation led to an overestimation of multiple days in both the median and the 95th percentile.
    CONCLUSIONS: We examined two overlooked sources of bias concerning exposure information that the researcher engaged in incubation time estimation needs to be aware of.
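
    The direction of the left-truncation bias described above can be illustrated with a small simulation. This is a minimal sketch under assumed settings, not the authors' simulation design: the log-normal parameters, the uniform 0 to 10 day gap between infection and departure, and the function name `simulate_left_truncation` are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_left_truncation(n=20000, seed=0):
    """Compare the median incubation time of all cases with the median among
    cases that developed symptoms only after departure (left-truncated sample).

    All numeric settings are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    all_times, observed = [], []
    for _ in range(n):
        # Hypothetical log-normal incubation time in days (median about 5 days).
        incubation = rng.lognormvariate(1.6, 0.4)
        # Hypothetical time from infection to departure, uniform on 0-10 days.
        days_to_departure = rng.uniform(0.0, 10.0)
        all_times.append(incubation)
        # A case enters the data set only if symptoms start after departure.
        if incubation > days_to_departure:
            observed.append(incubation)
    return statistics.median(all_times), statistics.median(observed)


full_median, truncated_median = simulate_left_truncation()
print(f"median, all cases:        {full_median:.2f} days")
print(f"median, truncated sample: {truncated_median:.2f} days")
```

    Because cases with short incubation times are less likely to satisfy the inclusion condition, the naive median of the truncated sample is larger than the median of all cases, which matches the overestimation reported in the abstract.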

  • Article type: Journal Article
    Quantitative bias analysis (QBA) permits assessment of the expected impact of various imperfections of the available data on the results and conclusions of a particular real-world study. This article extends QBA methodology to multivariable time-to-event analyses with right-censored endpoints, possibly including time-varying exposures or covariates. The proposed approach employs data-driven simulations, which preserve important features of the data at hand while offering flexibility in controlling the parameters and assumptions that may affect the results. First, the steps required to perform data-driven simulations are described, and then two examples of real-world time-to-event analyses illustrate their implementation and the insights they may offer. The first example focuses on the omission of an important time-invariant predictor of the outcome in a prognostic study of cancer mortality, and permits separating the expected impact of confounding bias from non-collapsibility. The second example assesses how imprecise timing of an interval-censored event - ascertained only at sparse times of clinic visits - affects its estimated association with a time-varying drug exposure. The simulation results also provide a basis for comparing the performance of two alternative strategies for imputing the unknown event times in this setting. The R scripts that permit the reproduction of our examples are provided.

  • Article type: Journal Article
    Many longitudinal studies are designed to monitor participants for major events related to the progression of diseases. Data arising from such longitudinal studies are usually subject to interval censoring since the events are only known to occur between two monitoring visits. In this work, we propose a new method to handle interval-censored multistate data within a proportional hazards model framework where the hazard rate of events is modeled by a nonparametric function of time and the covariates affect the hazard rate proportionally. The main idea of this method is to simplify the likelihood functions of a discrete-time multistate model through an approximation and the application of data augmentation techniques, where the assumed presence of censored information facilitates a simpler parameterization. Then the expectation-maximization algorithm is used to estimate the parameters in the model. The performance of the proposed method is evaluated by numerical studies. Finally, the method is employed to analyze a dataset on tracking the advancement of coronary allograft vasculopathy following heart transplantation.

  • Article type: Journal Article
    BACKGROUND: Progression-free survival (PFS) is used to evaluate treatment effects in cancer clinical trials. Disease progression (DP) in patients is typically determined by radiological testing at several scheduled tumor-assessment time points. This produces a discrepancy between the true progression time and the observed progression time. When the observed progression time is considered as the true progression time, a positively biased PFS is obtained for some patients, and the estimated survival function derived by the Kaplan-Meier method is also biased.
    METHODS: While the midpoint imputation method is available and replaces interval-censored data with midpoint data, it unrealistically assumes that several DPs occur at the same time point when several DPs are observed within the same tumor-assessment interval. We enhanced the midpoint imputation method by replacing interval-censored data with equally spaced timepoint data based on the number of observed interval-censored data within the same tumor-assessment interval.
    RESULTS: The root mean square error of the median of the enhanced method is almost always smaller than that of the midpoint imputation regardless of the tumor-assessment frequency. The coverage probability of the enhanced method is close to the nominal confidence level of 95% in most scenarios.
    CONCLUSIONS: We believe that the enhanced method, which builds upon the midpoint imputation method, is more effective than the midpoint imputation method itself.
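
    The contrast between the two imputation rules in METHODS can be sketched in a few lines. This is an illustration only: the abstract does not give the exact spacing rule, so the formula used here (k events in one interval placed at l + j(r - l)/(k + 1) for j = 1..k) and the function names are assumptions.

```python
from collections import defaultdict


def midpoint_impute(intervals):
    """Replace each interval (l, r] by its midpoint."""
    return [(l + r) / 2 for l, r in intervals]


def equally_spaced_impute(intervals):
    """Spread the k events observed in the same interval (l, r] evenly over it.

    Assumed rule: the j-th event is placed at l + j * (r - l) / (k + 1),
    which reduces to the midpoint when k == 1.
    """
    groups = defaultdict(list)
    for idx, interval in enumerate(intervals):
        groups[interval].append(idx)

    imputed = [None] * len(intervals)
    for (l, r), indices in groups.items():
        k = len(indices)
        for j, idx in enumerate(indices, start=1):
            imputed[idx] = l + j * (r - l) / (k + 1)
    return imputed


# Three progressions detected at the week-8 assessment, one at week 16.
intervals = [(0, 8), (0, 8), (0, 8), (8, 16)]
mid = midpoint_impute(intervals)           # [4.0, 4.0, 4.0, 12.0]
spaced = equally_spaced_impute(intervals)  # [2.0, 4.0, 6.0, 12.0]
```

    Midpoint imputation stacks all three week-8 progressions at week 4, which produces ties in the Kaplan-Meier estimate, whereas the equally spaced rule distributes them across the assessment interval.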

  • Article type: Journal Article
    To examine whether perceptions of the harmfulness and addictiveness of hookah and cigarettes affect the age of initiation of hookah and cigarettes, respectively, among US youth. Youth (12-17 years old) users and never users of hookah and cigarettes during their first wave of PATH participation were analyzed for each tobacco product (TP) independently. The effect of perceptions of (i) harmfulness and (ii) addictiveness at the first wave of PATH participation on the age of initiation of ever use of hookah was estimated using interval-censored Cox proportional hazards models.
    Users and never users of hookah at their first wave of PATH participation were balanced by multiplying the sampling weight and the 100 balanced repeated replication weights by the inverse probability weight (IPW). The IPW was based on the probability of being a user in their first wave of PATH participation. A Fay's factor of 0.3 was included for variance estimation. Crude hazard ratios (HRs) and 95% confidence intervals (CIs) are reported. A similar process was repeated for cigarettes.
    Compared to youth who perceived each TP as "a lot of harm", youth who perceived "some harm" had younger ages of initiation of these tobacco products, with an HR of 2.53 (95% CI: 2.87-4.34) for hookah and an HR of 2.35 (95% CI: 2.10-2.62) for cigarettes. Similarly, youth who perceived each TP as "no/little harm" had an earlier age of initiation of these TPs than those who perceived them as "a lot of harm", with an HR of 2.23 (95% CI: 1.82-2.71) for hookah and an HR of 1.85 (95% CI: 1.72-1.98) for cigarettes. Compared to youth who reported each TP as "somewhat/very likely" to be addictive, youth who reported "neither likely nor unlikely" and "very/somewhat unlikely" as their perception of the addictiveness of hookah had an older age of initiation, with an HR of 0.75 (95% CI: 0.67-0.83) and an HR of 0.55 (95% CI: 0.47-0.63), respectively.
    Perceptions of the harmfulness and addictiveness of these tobacco products (TPs) should be addressed in education campaigns for youth to prevent early initiation of cigarettes and hookah.

  • Article type: Journal Article
    Multivariate panel count data arise when there are multiple types of recurrent events, and the observation for each study subject consists of the number of recurrent events of each type between two successive examinations. We formulate the effects of potentially time-dependent covariates on multiple types of recurrent events through proportional rates models, while leaving the dependence structures of the related recurrent events completely unspecified. We employ nonparametric maximum pseudo-likelihood estimation under the working assumptions that all types of events are independent and each type of event is a nonhomogeneous Poisson process, and we develop a simple and stable EM-type algorithm. We show that the resulting estimators of the regression parameters are consistent and asymptotically normal, with a covariance matrix that can be estimated consistently by a sandwich estimator. In addition, we develop a class of graphical and numerical methods for checking the adequacy of the fitted model. Finally, we evaluate the performance of the proposed methods through simulation studies and analysis of a skin cancer clinical trial.

  • Article type: Journal Article
    Panel count data and interval-censored data are two types of incomplete data that often occur in event history studies. Almost all existing statistical methods are developed for their separate analysis. In this paper, we investigate a more general situation where a recurrent event process and an interval-censored failure event occur together. To explain the relationship between the recurrent event process and the failure event intuitively and clearly, we propose a failure time-dependent mean model through a completely unspecified link function. To overcome the challenges arising from the blending of nonparametric components and parametric regression coefficients, we develop a two-stage conditional expected likelihood-based estimation procedure. We establish the consistency, convergence rate, and asymptotic normality of the proposed two-stage estimator. Furthermore, we construct a class of two-sample tests for comparing mean functions from different groups. The proposed methods are evaluated by extensive simulation studies and are illustrated with the skin cancer data that motivated this study.

  • Article type: Journal Article
    In this article, a competitive risk survival model is considered in which the initial number of risks, assumed to follow a negative binomial distribution, is subject to a destructive mechanism. Assuming that the population of interest has a cure component and that the data are interval-censored, and treating both the number of initial risks and the number of risks remaining active after destruction as missing data, we develop two distinct estimation algorithms for this model. Making use of the conditional distributions of the missing data, we develop an expectation-maximization (EM) algorithm in which the conditional expected complete log-likelihood function is decomposed into simpler functions that are then maximized independently. A variation of the EM algorithm, called the stochastic EM (SEM) algorithm, is also developed with the goal of avoiding the calculation of complicated expectations and improving performance at parameter recovery. A Monte Carlo simulation study is carried out to evaluate the performance of both estimation methods through calculated bias, root mean square error, and coverage probability of the asymptotic confidence interval. We demonstrate through simulation that the proposed SEM algorithm is the preferred estimation method, and we further illustrate the advantage of the SEM algorithm, as well as the use of a destructive model, with data from a children's mortality study.

  • Article type: Journal Article
    The approximate Bernstein polynomial model, a mixture of beta distributions, is applied to obtain maximum likelihood estimates of the regression coefficients, the baseline density and the survival functions in an accelerated failure time model based on interval censored data including current status data. The estimators of the regression coefficients and the underlying baseline density function are shown to be consistent with almost parametric rates of convergence under some conditions for uncensored and/or interval censored data. Simulation shows that the proposed method is better than its competitors. The proposed method is illustrated by fitting the Breast Cosmetic and the HIV infection time data using the accelerated failure time model.
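
    The phrase "a mixture of beta distributions" can be made concrete with a short sketch. It evaluates a degree-m Bernstein polynomial density on the unit interval as a weighted sum of Beta(i + 1, m - i + 1) densities; it assumes the data have been rescaled to [0, 1], and the function names and weight vector are illustrative rather than taken from the paper or any package.

```python
import math


def beta_pdf(t, a, b):
    """Density of the Beta(a, b) distribution at t (zero outside [0, 1])."""
    if not 0.0 <= t <= 1.0:
        return 0.0
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * t ** (a - 1) * (1 - t) ** (b - 1)


def bernstein_density(t, weights):
    """Bernstein polynomial density of degree m = len(weights) - 1.

    f(t) = sum_i weights[i] * Beta(i + 1, m - i + 1) density at t,
    where the weights are non-negative and sum to one.
    """
    m = len(weights) - 1
    return sum(w * beta_pdf(t, i + 1, m - i + 1) for i, w in enumerate(weights))


# Equal weights reproduce the uniform density on [0, 1].
uniform_value = bernstein_density(0.3, [0.25, 0.25, 0.25, 0.25])

# Unequal weights shift mass; here most of it sits near the right end.
skewed_value = bernstein_density(0.9, [0.05, 0.05, 0.10, 0.80])
```

    Estimation then amounts to choosing the weights (together with the regression coefficients) by maximum likelihood, so the degree m controls how flexible the baseline density is.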