Interval censoring

  • Article type: Journal Article
    Interval-censored failure time data frequently arise in scientific studies where each subject undergoes periodic examinations for the occurrence of the failure event of interest, and the failure time is known only to lie in a specific time interval. In addition, the collected data may include multiple correlated observed variables, leading to severe multicollinearity. This work proposes a factor-augmented transformation model to analyze interval-censored failure time data while reducing model dimensionality and avoiding the multicollinearity induced by multiple correlated covariates. We provide a joint modeling framework comprising a factor analysis model, which groups multiple observed variables into a few latent factors, and a class of semiparametric transformation models with the augmented factors, which examines the effects of the factors and other covariates on the failure event. Furthermore, we propose a nonparametric maximum likelihood estimation approach and develop a computationally stable and reliable expectation-maximization algorithm for its implementation. We establish the asymptotic properties of the proposed estimators and conduct simulation studies to assess the empirical performance of the proposed method. An application to the Alzheimer's Disease Neuroimaging Initiative (ADNI) study is provided. An R package, ICTransCFA, is also available for practitioners. Data used in preparation of this article were obtained from the ADNI database.

  • Article type: Journal Article
    BACKGROUND: Estimation of the SARS-CoV-2 incubation time distribution is hampered by incomplete data about infection. We discuss two biases that may result from incorrect handling of such data. Notified cases may recall recent exposures more precisely (differential recall). This creates bias if the analysis is restricted to observations with well-defined exposures, as longer incubation times are more likely to be excluded. Another bias occurred in the initial estimates based on data concerning travellers from Wuhan. Only individuals who developed symptoms after their departure were included, leading to under-representation of cases with shorter incubation times (left truncation). This issue was not addressed in the analyses performed in the literature.
    METHODS: We performed simulations and provide a literature review to investigate the amount of bias in estimated percentiles of the SARS-CoV-2 incubation time distribution.
    RESULTS: Depending on the rate of differential recall, restricting the analysis to a subset of narrow exposure windows resulted in underestimation in the median and even more in the 95th percentile. Failing to account for left truncation led to an overestimation of multiple days in both the median and the 95th percentile.
    CONCLUSIONS: We examined two overlooked sources of bias concerning exposure information that the researcher engaged in incubation time estimation needs to be aware of.
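The left-truncation mechanism described in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' simulation design: the lognormal incubation distribution (median 5 days, sigma 0.5), the uniform time from infection to departure, the 10-day window, and the function name `simulate_left_truncation` are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_left_truncation(n=200_000, window=10.0, seed=0):
    """Compare the true median incubation time with the naive median
    computed only on cases whose symptoms started after departure."""
    rng = np.random.default_rng(seed)

    # Assumed incubation distribution: lognormal with median 5 days.
    incubation = rng.lognormal(mean=np.log(5.0), sigma=0.5, size=n)

    # Assumed time from infection to departure: uniform on [0, window] days.
    time_to_departure = rng.uniform(0.0, window, size=n)

    # Left truncation: a case is observed only if symptom onset
    # happens after departure, which favours long incubation times.
    included = incubation > time_to_departure

    true_median = float(np.median(incubation))
    naive_median = float(np.median(incubation[included]))
    return true_median, naive_median


true_median, naive_median = simulate_left_truncation()
print(f"true median:  {true_median:.2f} days")
print(f"naive median: {naive_median:.2f} days (truncation ignored)")
```

Because short incubation times are less likely to satisfy the inclusion rule, the naive median computed on the included cases exceeds the true median, which is the direction of bias reported in the RESULTS.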

  • Article type: Journal Article
    Quantitative bias analysis (QBA) permits assessment of the expected impact of various imperfections of the available data on the results and conclusions of a particular real-world study. This article extends QBA methodology to multivariable time-to-event analyses with right-censored endpoints, possibly including time-varying exposures or covariates. The proposed approach employs data-driven simulations, which preserve important features of the data at hand while offering flexibility in controlling the parameters and assumptions that may affect the results. First, the steps required to perform data-driven simulations are described, and then two examples of real-world time-to-event analyses illustrate their implementation and the insights they may offer. The first example focuses on the omission of an important time-invariant predictor of the outcome in a prognostic study of cancer mortality, and permits separating the expected impact of confounding bias from non-collapsibility. The second example assesses how imprecise timing of an interval-censored event - ascertained only at sparse times of clinic visits - affects its estimated association with a time-varying drug exposure. The simulation results also provide a basis for comparing the performance of two alternative strategies for imputing the unknown event times in this setting. The R scripts that permit the reproduction of our examples are provided.

  • Article type: Journal Article
    Many longitudinal studies are designed to monitor participants for major events related to the progression of diseases. Data arising from such longitudinal studies are usually subject to interval censoring since the events are only known to occur between two monitoring visits. In this work, we propose a new method to handle interval-censored multistate data within a proportional hazards model framework where the hazard rate of events is modeled by a nonparametric function of time and the covariates affect the hazard rate proportionally. The main idea of this method is to simplify the likelihood functions of a discrete-time multistate model through an approximation and the application of data augmentation techniques, where the assumed presence of censored information facilitates a simpler parameterization. Then the expectation-maximization algorithm is used to estimate the parameters in the model. The performance of the proposed method is evaluated by numerical studies. Finally, the method is employed to analyze a dataset on tracking the advancement of coronary allograft vasculopathy following heart transplantation.

  • Article type: Journal Article
    BACKGROUND: Progression-free survival (PFS) is used to evaluate treatment effects in cancer clinical trials. Disease progression (DP) in patients is typically determined by radiological testing at several scheduled tumor-assessment time points. This produces a discrepancy between the true progression time and the observed progression time. When the observed progression time is considered as the true progression time, a positively biased PFS is obtained for some patients, and the estimated survival function derived by the Kaplan-Meier method is also biased.
    METHODS: While the midpoint imputation method is available and replaces interval-censored data with midpoint data, it unrealistically assumes that several DPs occur at the same time point when several DPs are observed within the same tumor-assessment interval. We enhanced the midpoint imputation method by replacing interval-censored data with equally spaced timepoint data based on the number of observed interval-censored data within the same tumor-assessment interval.
    RESULTS: The root mean square error of the median of the enhanced method is almost always smaller than that of the midpoint imputation regardless of the tumor-assessment frequency. The coverage probability of the enhanced method is close to the nominal confidence level of 95% in most scenarios.
    CONCLUSIONS: We believe that the enhanced method, which builds upon the midpoint imputation method, is more effective than the midpoint imputation method itself.
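The idea in METHODS can be sketched as follows. The abstract does not give the exact spacing rule, so this sketch assumes that k progressions observed in the same assessment interval (L, R] are placed at L + j(R - L)/(k + 1) for j = 1, ..., k. The function name `equally_spaced_imputation` and the order in which subjects are assigned to the spaced points are illustrative assumptions.

```python
from collections import defaultdict


def equally_spaced_imputation(intervals):
    """Impute one event time per interval-censored observation.

    intervals: list of (left, right) tumor-assessment intervals,
               one tuple per subject with an observed progression.
    Returns a list of imputed times in the same order as the input.
    """
    # Group subject indices by the assessment interval they fall in.
    groups = defaultdict(list)
    for idx, (left, right) in enumerate(intervals):
        groups[(left, right)].append(idx)

    imputed = [None] * len(intervals)
    for (left, right), members in groups.items():
        k = len(members)
        width = right - left
        # Assumed rule: k equally spaced points strictly inside (left, right).
        for j, idx in enumerate(members, start=1):
            imputed[idx] = left + j * width / (k + 1)
    return imputed


# Three progressions detected in weeks (0, 8], one in weeks (8, 16].
print(equally_spaced_imputation([(0, 8), (0, 8), (0, 8), (8, 16)]))
```

With a single progression in an interval, the assumed rule reduces to ordinary midpoint imputation, so the two methods differ only when several progressions share an interval.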

  • Article type: Journal Article
    To examine if perceptions of harmfulness and addictiveness of hookah and cigarettes impact the age of initiation of hookah and cigarettes, respectively, among US youth. Youth (12-17 years old) users and never users of hookah and cigarettes during their first wave of PATH participation were analyzed by each tobacco product (TP) independently. The effect of perceptions of (i) harmfulness and (ii) addictiveness at the first wave of PATH participation on the age of initiation of ever use of hookah was estimated using interval-censored Cox proportional hazards models.
    Users and never users of hookah at their first wave of PATH participation were balanced by multiplying the sampling weight and the 100 balanced repeated replicate weights with the inverse probability weight (IPW). The IPW was based on the probability of being a user in their first wave of PATH participation. A Fay's factor of 0.3 was included for variance estimation. Crude hazard ratios (HR) and 95% confidence intervals (CIs) are reported. A similar process was repeated for cigarettes.
    Compared to youth who perceived each TP as "a lot of harm", youth who reported perceived "some harm" had younger ages of initiation of these tobacco products, HR: 2.53 (95% CI: 2.87-4.34) for hookah and HR: 2.35 (95% CI: 2.10-2.62) for cigarettes. Similarly, youth who perceived each TP as "no/little harm" had an earlier age of initiation of these TPs compared to those who perceived them as "a lot of harm", with an HR: 2.23 (95% CI: 1.82-2.71) for hookah and an HR: 1.85 (95% CI: 1.72-1.98) for cigarettes. Compared to youth who reported each TP as "somewhat/very likely" as their perception of addictiveness, youth who reported "neither likely nor unlikely" and "very/somewhat unlikely" as their perception of addictiveness of hookah had an older age of initiation, with an HR: 0.75 (95% CI: 0.67-0.83) and an HR: 0.55 (95% CI: 0.47-0.63), respectively.
    Perceptions of the harmfulness and addictiveness of these tobacco products (TPs) should be addressed in education campaigns for youth to prevent early ages of initiation of cigarettes and hookah.

  • Article type: Journal Article
    Multivariate panel count data arise when there are multiple types of recurrent events, and the observation for each study subject consists of the number of recurrent events of each type between two successive examinations. We formulate the effects of potentially time-dependent covariates on multiple types of recurrent events through proportional rates models, while leaving the dependence structures of the related recurrent events completely unspecified. We employ nonparametric maximum pseudo-likelihood estimation under the working assumptions that all types of events are independent and each type of event is a nonhomogeneous Poisson process, and we develop a simple and stable EM-type algorithm. We show that the resulting estimators of the regression parameters are consistent and asymptotically normal, with a covariance matrix that can be estimated consistently by a sandwich estimator. In addition, we develop a class of graphical and numerical methods for checking the adequacy of the fitted model. Finally, we evaluate the performance of the proposed methods through simulation studies and analysis of a skin cancer clinical trial.

  • Article type: Journal Article
    Panel count data and interval-censored data are two types of incomplete data that often occur in event history studies. Almost all existing statistical methods are developed for their separate analysis. In this paper, we investigate a more general situation where a recurrent event process and an interval-censored failure event occur together. To explain the relationship between the recurrent event process and the failure event intuitively and clearly, we propose a failure time-dependent mean model through a completely unspecified link function. To overcome the challenges arising from the blending of nonparametric components and parametric regression coefficients, we develop a two-stage conditional expected likelihood-based estimation procedure. We establish the consistency, the convergence rate and the asymptotic normality of the proposed two-stage estimator. Furthermore, we construct a class of two-sample tests for comparison of mean functions from different groups. The proposed methods are evaluated by extensive simulation studies and are illustrated with the skin cancer data that motivated this study.

  • Article type: Journal Article
    In this article, a competitive risk survival model is considered in which the initial number of risks, assumed to follow a negative binomial distribution, is subject to a destructive mechanism. Assuming the population of interest to have a cure component, the form of the data as interval-censored, and considering both the number of initial risks and risks remaining active after destruction to be missing data, we develop two distinct estimation algorithms for this model. Making use of the conditional distributions of the missing data, we develop an expectation maximization (EM) algorithm, in which the conditional expected complete log-likelihood function is decomposed into simpler functions which are then maximized independently. A variation of the EM algorithm, called the stochastic EM (SEM) algorithm, is also developed with the goal of avoiding the calculation of complicated expectations and improving performance at parameter recovery. A Monte Carlo simulation study is carried out to evaluate the performance of both estimation methods through calculated bias, root mean square error, and coverage probability of the asymptotic confidence interval. We demonstrate the proposed SEM algorithm as the preferred estimation method through simulation and further illustrate the advantage of the SEM algorithm, as well as the use of a destructive model, with data from a children's mortality study.

  • Article type: Journal Article
    The approximate Bernstein polynomial model, a mixture of beta distributions, is applied to obtain maximum likelihood estimates of the regression coefficients, the baseline density and the survival functions in an accelerated failure time model based on interval censored data including current status data. The estimators of the regression coefficients and the underlying baseline density function are shown to be consistent with almost parametric rates of convergence under some conditions for uncensored and/or interval censored data. Simulation shows that the proposed method is better than its competitors. The proposed method is illustrated by fitting the Breast Cosmetic and the HIV infection time data using the accelerated failure time model.
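The "mixture of beta distributions" mentioned in this abstract can be sketched as a density on [0, 1]: f(x) = sum over i = 0..m of p_i times the Beta(i + 1, m - i + 1) density at x. The sketch below only evaluates such a density for given weights. It assumes the data have been rescaled to the unit interval and says nothing about how the authors estimate the weights or the regression coefficients. The function name `bernstein_density` is hypothetical.

```python
from math import comb


def bernstein_density(x, weights):
    """Evaluate a degree-m Bernstein polynomial density at x in [0, 1].

    weights: mixture weights p_0, ..., p_m (non-negative, summing to 1).
    The i-th component is the Beta(i + 1, m - i + 1) density, which
    equals (m + 1) * C(m, i) * x**i * (1 - x)**(m - i).
    """
    m = len(weights) - 1
    return sum(
        p * (m + 1) * comb(m, i) * x**i * (1 - x) ** (m - i)
        for i, p in enumerate(weights)
    )


# Equal weights recover the uniform density on [0, 1].
print(bernstein_density(0.3, [0.25, 0.25, 0.25, 0.25]))

# Putting more weight on high-index components shifts mass toward 1.
print(bernstein_density(0.9, [0.1, 0.2, 0.3, 0.4]))
```

With equal weights the Bernstein basis functions sum to one, so the mixture is the uniform density. This gives a quick sanity check on the component normalization.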
