Interim monitoring

  • Article type: Journal Article
    BACKGROUND: Clinical trials often involve some form of interim monitoring to determine futility before planned trial completion. While many options for interim monitoring exist (e.g., alpha-spending, conditional power), nonparametric based interim monitoring methods are also needed to account for more complex trial designs and analyses. The upstrap is one recently proposed nonparametric method that may be applied for interim monitoring.
    METHODS: Upstrapping is motivated by the case resampling bootstrap and involves repeatedly sampling with replacement from the interim data to simulate thousands of fully enrolled trials. The p-value is calculated for each upstrapped trial and the proportion of upstrapped trials for which the p-value criteria are met is compared with a pre-specified decision threshold. To evaluate the potential utility for upstrapping as a form of interim futility monitoring, we conducted a simulation study considering different sample sizes with several different proposed calibration strategies for the upstrap. We first compared trial rejection rates across a selection of threshold combinations to validate the upstrapping method. Then, we applied upstrapping methods to simulated clinical trial data, directly comparing their performance with more traditional alpha-spending and conditional power interim monitoring methods for futility.
    RESULTS: The method validation demonstrated that upstrapping is much more likely to find evidence of futility in the null scenario than in the alternative across a variety of simulation settings. Our three proposed approaches for calibration of the upstrap had different strengths depending on the stopping rules used. Compared to O'Brien-Fleming group sequential methods, upstrapped approaches had type I error rates that differed by at most 1.7% and expected sample size was 2-22% lower in the null scenario, while in the alternative scenario power fluctuated between 15.7% lower and 0.2% higher and expected sample size was 0-15% lower.
    CONCLUSIONS: In this proof-of-concept simulation study, we evaluated the potential for upstrapping as a resampling-based method for futility monitoring in clinical trials. The trade-offs in expected sample size, power, and type I error rate control indicate that the upstrap can be calibrated to implement futility monitoring with varying degrees of aggressiveness and that performance similarities can be identified relative to considered alpha-spending and conditional power futility monitoring methods.
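
    The upstrapping procedure described in the METHODS paragraph can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a two-arm trial with a continuous outcome, a large-sample z-test for the p-value, and hypothetical names (`upstrap_proportion`, `futility_stop`) and default thresholds that do not come from the abstract.

```python
import math
import random
from statistics import NormalDist, mean, variance


def two_sample_p_value(x, y):
    """Two-sided p-value from a large-sample z-test on the difference in means."""
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    if se == 0:
        return 1.0 if mean(x) == mean(y) else 0.0
    z = (mean(x) - mean(y)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def upstrap_proportion(treat, ctrl, n_full_per_arm,
                       n_upstraps=1000, alpha=0.05, rng=None):
    """Share of upstrapped full-size trials whose p-value falls below alpha.

    Each upstrapped trial resamples the interim data with replacement,
    arm by arm, up to the planned full sample size per arm.
    """
    rng = rng or random.Random()
    hits = 0
    for _ in range(n_upstraps):
        t = rng.choices(treat, k=n_full_per_arm)
        c = rng.choices(ctrl, k=n_full_per_arm)
        if two_sample_p_value(t, c) < alpha:
            hits += 1
    return hits / n_upstraps


def futility_stop(proportion, threshold=0.05):
    """Hypothetical decision rule: stop if too few upstrapped trials succeed."""
    return proportion < threshold
```

    In use, `upstrap_proportion` would be called once per interim look and its result passed to `futility_stop`; the calibration strategies compared in the study correspond to different choices of `alpha` and `threshold`, whose values here are placeholders.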

  • Article type: Journal Article
    BACKGROUND: Adaptive clinical trials are growing in popularity as they are more flexible, efficient and ethical than traditional fixed designs. However, notwithstanding their increased use in assessing treatments for COVID-19, their use in critical care trials remains limited. A better understanding of the relative benefits of various adaptive designs may increase their use and interpretation.
    METHODS: Using two large critical care trials (ADRENAL, ClinicalTrials.gov number NCT01448109, updated 12-12-2017; NICE-SUGAR, ClinicalTrials.gov number NCT00220987, updated 01-29-2009), we assessed the performance of three frequentist and two Bayesian adaptive approaches. We retrospectively re-analysed the trials with one, two, four, and nine equally spaced interims. Using the original hypotheses, we conducted 10,000 simulations to derive error rates, probabilities of making an early correct and incorrect decision, expected sample size and treatment effect estimates under the null scenario (no treatment effect) and alternative scenario (a positive treatment effect). We used a logistic regression model with 90-day mortality as the outcome and the treatment arm as the covariate. The null hypothesis was tested using a two-sided significance level (α) of 0.05.
    RESULTS: Across all approaches, increasing the number of interims led to a decreased expected sample size. Under the null scenario, group sequential approaches provided good control of the type I error rate; however, type I error rate inflation was an issue for the Bayesian approaches. The Bayesian Predictive Probability and O'Brien-Fleming approaches showed the highest probability of correctly stopping the trials (around 95%). Under the alternative scenario, the Bayesian approaches showed the highest overall probability of correctly stopping the ADRENAL trial for efficacy (around 91%), whereas the Haybittle-Peto approach achieved the greatest power for the NICE-SUGAR trial. Treatment effect estimates became increasingly underestimated as the number of interims increased.
    CONCLUSIONS: This study confirms that the right adaptive design can reach the same conclusion as a fixed design with a much-reduced sample size. The efficiency gain associated with an increased number of interims is highly relevant to late-phase critical care trials with large sample sizes and short follow-up times. Systematically exploring adaptive methods at the trial design stage will aid the choice of the most appropriate method.

  • Article type: Journal Article
    An industry-academic collaboration was established to evaluate the choice of statistical test and study design for A/B testing in larger-scale industry experiments. Specifically, the standard approach at the industry partner was to apply a t-test for all outcomes, both continuous and binary, and to apply naïve interim monitoring strategies whose implications for operating characteristics such as power and type I error rates had not been evaluated. Although many papers have summarized the robustness of the t-test, an evaluation of its performance in the A/B testing context of large-scale proportion data, with or without interim analyses, is needed. Investigating the effect of interim analyses on the robustness of the t-test is important, because interim analyses rely on a fraction of the total sample size and one should ensure that desired properties are maintained when a t-test is implemented not just at the end of the study, but also for making interim decisions. Through simulation studies, the performance of the t-test, Chi-squared test, and Chi-squared test with Yates's correction when applied to binary outcomes data is evaluated. Further, interim monitoring through a naïve approach with no correction for multiple testing is compared with the O'Brien-Fleming boundary in designs that allow early termination for futility, difference, or both. Results indicate that the t-test achieves similar power and type I error rates for binary outcomes data with the large sample sizes used in industrial A/B tests with and without interim monitoring, and that naïve interim monitoring without corrections leads to poorly performing studies.
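
    The cost of uncorrected interim looks can be illustrated with a small Monte Carlo sketch. It is not the simulation from the study: it assumes normally distributed z-statistics with equal information between looks, four looks, and an O'Brien-Fleming constant of about 2.024, the commonly tabulated value for four equally spaced looks at a two-sided level of 0.05. The function names are hypothetical.

```python
import math
import random


def obf_boundaries(n_looks, constant):
    """O'Brien-Fleming-type critical values: constant * sqrt(K / k) at look k."""
    return [constant * math.sqrt(n_looks / k) for k in range(1, n_looks + 1)]


def crossing_rate(boundaries, n_sims=20000, seed=0):
    """Estimate the type I error rate under the null hypothesis.

    The z-statistic at look k is the cumulative sum of k independent
    standard normal increments divided by sqrt(k). A simulated trial counts
    as a rejection if |z| exceeds the boundary at any look.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        cumulative = 0.0
        for k, bound in enumerate(boundaries, start=1):
            cumulative += rng.gauss(0.0, 1.0)
            if abs(cumulative / math.sqrt(k)) > bound:
                rejections += 1
                break
    return rejections / n_sims


if __name__ == "__main__":
    looks = 4
    print("naive 1.96 at every look:", crossing_rate([1.96] * looks))
    print("O'Brien-Fleming:", crossing_rate(obf_boundaries(looks, 2.024)))
```

    Under these assumptions the naive rule rejects a true null hypothesis far more often than the nominal 5%, while the O'Brien-Fleming boundaries stay close to it, which is consistent with the abstract's conclusion about uncorrected monitoring.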

  • Article type: Journal Article
    A sequential multiple assignment randomized trial (SMART) facilitates the comparison of multiple adaptive treatment strategies (ATSs) simultaneously. Previous studies have established a framework to test the homogeneity of multiple ATSs by a global Wald test through inverse probability weighting. SMARTs are generally lengthier than classical clinical trials due to the sequential nature of treatment randomization in multiple stages. Thus, it would be beneficial to add interim analyses allowing for an early stop if overwhelming efficacy is observed. We introduce group sequential methods to SMARTs to facilitate interim monitoring based on the multivariate chi-square distribution. Simulation studies demonstrate that the proposed interim monitoring in SMART (IM-SMART) maintains the desired type I error and power with reduced expected sample size compared to the classical SMART. Finally, we illustrate our method by reanalyzing a SMART assessing the effects of cognitive behavioral and physical therapies in patients with knee osteoarthritis and comorbid subsyndromal depressive symptoms.

  • Article type: Journal Article
    Cluster-randomized trials (CRTs) of infectious disease preventions often yield correlated, interval-censored data: dependencies may exist between observations from the same cluster, and event occurrence may be assessed only at intermittent study visits. This data structure must be accounted for when conducting interim monitoring and futility assessment for CRTs. In this article, we propose a flexible framework for conditional power estimation when outcomes are correlated and interval-censored. Under the assumption that the survival times follow a shared frailty model, we first characterize the correspondence between the marginal and cluster-conditional survival functions, and then use this relationship to semiparametrically estimate the cluster-specific survival distributions from the available interim data. We incorporate assumptions about changes to the event process over the remainder of the trial, as well as estimates of the dependency among observations in the same cluster, to extend these survival curves through the end of the study. Based on these projected survival functions, we generate correlated interval-censored observations, and then calculate the conditional power as the proportion of times (across multiple full-data generation steps) that the null hypothesis of no treatment effect is rejected. We evaluate the performance of the proposed method through extensive simulation studies, and illustrate its use on a large cluster-randomized HIV prevention trial.

  • Article type: Journal Article
    When a clinical trial has a composite endpoint and a comparison of treatment strategies with multiple intervention components, interim data reviews by a data safety and monitoring board (DSMB) can be challenging as the data evolve on multiple fronts. We illustrate with a study in the treatment of Kaposi sarcoma (KS), an HIV-associated cancer with a multi-faceted disease presentation. The study, ACTG-A5264/AMC-067, was a 1:1 randomized trial to compare two strategies: immediate initiation of etoposide with antiretroviral therapy (ART), or ART with delayed etoposide upon disease progression. The outcome was a composite endpoint that included the following events, ordered from worst to best in the following three categories: (1) KS progression at 48 weeks, death, initiation of alternate KS treatment, loss to study follow-up; (2) stable KS; and (3) partial or complete KS response at 48 weeks. We present the interim results on the composite endpoint and the individual components, where components favored different study arms at an interim review. To facilitate interim data monitoring for complex trials, we recommend clear communications between the study team and the DSMB prior to the initiation of the trial on the need for a composite endpoint, the intentions behind the defined strategies, and relative importance of individual components of the composite endpoint. We also recommend flexibility in the timing of data reviews by the DSMB to interpret emerging data in multiple dimensions. ClinicalTrials.gov: NCT01352117.

  • Article type: Journal Article
    In an ongoing clinical trial, there will always be a risk for unanticipated critical safety problems, such as excessive occurrence of serious adverse events. When such a problem arises, the trial administrators must conduct an immediate evaluation to determine whether the trial should be terminated to protect patients. This decision is complicated but may be aided by statistical stopping rules. Sequential stopping rules are appropriate for immediate decisions, but frequentist approaches may not be useful because the unknown truncated end of the trial makes it impossible to define type I errors. Thus, a Bayesian stopping rule is proposed that is based on the posterior distribution with an informative prior distribution, and a guideline to construct this stopping rule is presented. Some operating characteristics are evaluated and compared with those of the modified sequential probability ratio test (SPRT), the maximized SPRT, and Pocock's test. The proposed method has flexibility for construction and could provide a more desirable performance than the other compared methods.
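
    A posterior-based safety stopping rule of the general kind described above might look like the following sketch. The abstract does not give the model, so everything specific here is an assumption: a binomial count of serious adverse events, a Beta prior with integer parameters (Beta(1, 9), prior mean 0.10), a hypothetical maximum tolerated event rate of 0.10, and a posterior probability cutoff of 0.95.

```python
from math import comb


def beta_tail_prob(a, b, p):
    """P(theta > p) for theta ~ Beta(a, b) with positive integer a and b.

    Uses the identity P(Beta(a, b) > p) = P(Binomial(a + b - 1, p) <= a - 1),
    which avoids any dependency beyond the standard library.
    """
    n = a + b - 1
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(a))


def should_stop(events, patients, prior_a=1, prior_b=9,
                max_rate=0.10, cutoff=0.95):
    """Return (stop, posterior probability that the event rate exceeds max_rate)."""
    post_a = prior_a + events               # prior successes + observed events
    post_b = prior_b + patients - events    # prior failures + event-free patients
    prob = beta_tail_prob(post_a, post_b, max_rate)
    return prob > cutoff, prob
```

    Because the rule depends only on the current posterior, it can be evaluated after every new patient without fixing the number of looks in advance, which is the property that motivates a Bayesian rule in this setting.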

  • Article type: Journal Article
    Recent guidelines have de-emphasized the role of routine surveillance computed tomography (CT) scans for diffuse large B-cell lymphoma (DLBCL) patients who achieve a complete response to front-line therapy. This shift in practice recommendations was prompted by retrospective studies that failed to demonstrate clear clinical utility for surveillance CT in unselected DLBCL patients. Controversy remains, however, over the role of routine surveillance CT in the patients at highest risk of treatment failure who would remain candidates for aggressive salvage therapies. Novel high-throughput sequencing methods can non-invasively monitor tumor-specific DNA in the blood and offer clear advantages designed to overcome fundamental limitations of CT scans. This review will discuss the current controversies surrounding monitoring clinical outcomes in aggressive B-cell lymphomas, with a specific emphasis on DLBCL. Fundamental limitations of imaging scans will be addressed and the potential of monitoring circulating tumor DNA as an adjunct or replacement for CT scans will be discussed.

  • Article type: Journal Article
    In clinical research and development, interim monitoring is critical for better decision-making and minimizing the risk of exposing patients to possibly ineffective therapies. For interim futility or efficacy monitoring, predictive probability methods are widely adopted in practice. Those methods have been well studied for univariate variables. However, for longitudinal studies, predictive probability methods using univariate information from only completers may not be most efficient, and data from on-going subjects can be utilized to improve efficiency. In addition, leveraging information from on-going subjects could allow an interim analysis to be conducted once a sufficient number of subjects reach an earlier time point. For longitudinal outcomes, we derive closed-form formulas for predictive probabilities, including Bayesian predictive probability, predictive power, and conditional power, and also give closed-form solutions for the predictive probability of success in a future trial and the predictive probability of success of the best dose. When predictive probabilities are used for interim monitoring, we study their distributions and discuss their analytical cutoff values or stopping boundaries that have desired operating characteristics. We show that predictive probabilities utilizing all longitudinal information are more efficient for interim monitoring than those using information from completers only. To illustrate their practical application for longitudinal data, we analyze 2 real data examples from clinical trials.
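
    For readers unfamiliar with Bayesian predictive probability, the sketch below shows the idea in the simplest univariate case, not the longitudinal closed forms derived in the paper. It assumes a single-arm trial with a binary response, a Beta prior with integer parameters, and a hypothetical final success criterion (posterior probability that the response rate exceeds `p0` is above `eta`); all names and default values are illustrative.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(y, m, a, b):
    """P(Y = y) for y future responders out of m when theta ~ Beta(a, b)."""
    return comb(m, y) * exp(log_beta(a + y, b + m - y) - log_beta(a, b))


def beta_tail_prob(a, b, p):
    """P(theta > p) for theta ~ Beta(a, b) with positive integer a and b."""
    n = a + b - 1
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(a))


def predictive_probability(responses, n_interim, n_max,
                           p0=0.3, eta=0.95, prior_a=1, prior_b=1):
    """Probability, given interim data, that the final analysis succeeds.

    Sums the posterior predictive probability of each possible number of
    future responders y over those y for which the final criterion is met.
    """
    a = prior_a + responses
    b = prior_b + n_interim - responses
    m = n_max - n_interim  # patients still to be observed
    total = 0.0
    for y in range(m + 1):
        if beta_tail_prob(a + y, b + m - y, p0) > eta:
            total += beta_binomial_pmf(y, m, a, b)
    return total
```

    A futility rule would stop the trial when this value drops below a small cutoff and an efficacy rule when it rises above a large one; choosing those cutoffs to obtain the desired operating characteristics is the calibration question the abstract addresses.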

  • Article type: Comparative Study
    Interim monitoring is a key component of randomized clinical trial design from both ethical and efficiency perspectives. In studies with time-to-event endpoints, timing of interim analyses is typically based on observing a pre-specified proportion of the total number of events required for the final analysis. While most randomized clinical trial designs pool events over the experimental and control arms in determining the analysis times, some designs use only the control-arm events for scheduling interim looks.
    To evaluate the performance of the pooled and control-arm-based interim monitoring approaches and to propose a new procedure, the earliest information time procedure, that combines the benefits of the two approaches.
    The analytical and logistical considerations for the procedures are presented. The methodology is illustrated on data from three published randomized clinical trials. The procedures are compared in a simulation study.
    The control-arm approach results in a slight inflation of the study type I error in one-sided randomized clinical trial designs. When the new treatment is no better than the control treatment, the pooled-arm approach results in, on average, earlier stopping times than the control-arm approach. When the new treatment works exceptionally well, the average stopping times under the control-arm approach are earlier than those under the pooled approach. The proposed earliest information time procedure is shown to result in stopping times corresponding to the best (earliest) of the two approaches over the entire range of alternatives.
    The earliest information time procedure may result in a slight inflation of the type I error (especially in small trials); when exact control of the type I error is required, it is necessary to use a simulation-based method to correct the inflation.
    In time-to-event settings, the earliest information time procedure is an attractive alternative to the pooled and control-arm approaches. Improving the timing of interim analyses helps to minimize patient exposure to inferior treatments and to accelerate dissemination of the study results.