Futility monitoring

  • Article type: Journal Article
    BACKGROUND: Clinical trials often involve some form of interim monitoring to determine futility before planned trial completion. While many options for interim monitoring exist (e.g., alpha-spending, conditional power), nonparametric based interim monitoring methods are also needed to account for more complex trial designs and analyses. The upstrap is one recently proposed nonparametric method that may be applied for interim monitoring.
    METHODS: Upstrapping is motivated by the case resampling bootstrap and involves repeatedly sampling with replacement from the interim data to simulate thousands of fully enrolled trials. The p-value is calculated for each upstrapped trial and the proportion of upstrapped trials for which the p-value criteria are met is compared with a pre-specified decision threshold. To evaluate the potential utility for upstrapping as a form of interim futility monitoring, we conducted a simulation study considering different sample sizes with several different proposed calibration strategies for the upstrap. We first compared trial rejection rates across a selection of threshold combinations to validate the upstrapping method. Then, we applied upstrapping methods to simulated clinical trial data, directly comparing their performance with more traditional alpha-spending and conditional power interim monitoring methods for futility.
    RESULTS: The method validation demonstrated that upstrapping is much more likely to find evidence of futility in the null scenario than in the alternative scenario across a variety of simulation settings. Our three proposed approaches for calibration of the upstrap had different strengths depending on the stopping rules used. Compared to O'Brien-Fleming group sequential methods, upstrapped approaches had type I error rates that differed by at most 1.7% and expected sample size was 2-22% lower in the null scenario, while in the alternative scenario power fluctuated between 15.7% lower and 0.2% higher and expected sample size was 0-15% lower.
    CONCLUSIONS: In this proof-of-concept simulation study, we evaluated the potential for upstrapping as a resampling-based method for futility monitoring in clinical trials. The trade-offs in expected sample size, power, and type I error rate control indicate that the upstrap can be calibrated to implement futility monitoring with varying degrees of aggressiveness and that performance similarities can be identified relative to considered alpha-spending and conditional power futility monitoring methods.
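
The upstrap procedure described in the METHODS paragraph above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a two-arm trial with a continuous outcome, uses a large-sample z-test in place of whatever test the trial would actually use, and the function names, the `alpha` criterion, and the futility `threshold` are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean, variance


def two_sample_p_value(x, y):
    """Two-sided p-value from a large-sample z-test for a difference in means.

    Assumption: a z-test stands in for the trial's planned analysis.
    """
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    if se == 0:
        return 1.0  # degenerate resample with no variability: no evidence
    z = (mean(x) - mean(y)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def upstrap_proportion(treat, control, n_full_per_arm, alpha=0.05,
                       n_upstraps=1000, rng=None):
    """Proportion of upstrapped full-size trials with p-value below alpha.

    Each upstrapped trial resamples the interim data with replacement
    up to the planned full enrollment per arm.
    """
    rng = rng or random.Random()
    hits = 0
    for _ in range(n_upstraps):
        t = rng.choices(treat, k=n_full_per_arm)
        c = rng.choices(control, k=n_full_per_arm)
        if two_sample_p_value(t, c) < alpha:
            hits += 1
    return hits / n_upstraps


def stop_for_futility(proportion, threshold=0.05):
    """Hypothetical rule: stop when too few upstrapped trials reach significance."""
    return proportion < threshold
```

Calling `upstrap_proportion(interim_treat, interim_control, n_full_per_arm=100)` and passing the result to `stop_for_futility` gives one interim decision; calibrating `alpha` and `threshold` jointly is the part the abstract's simulation study addresses.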

  • Article type: Journal Article
    Late-phase clinical trials for neurodegenerative diseases have a low probability of success. In this study, we introduce an algorithm that optimizes the planning of interim analyses for clinical trials in amyotrophic lateral sclerosis (ALS) to better use the time and resources available and minimize the exposure of patients to ineffective or harmful drugs.
    A simulation-based algorithm was developed to determine the optimal interim analysis scheme by integrating prior knowledge about the success rate of ALS clinical trials with drug-specific information obtained in early-phase studies. Interim analysis schemes were optimized by varying the number and timing of interim analyses, together with their decision rules about when to stop a trial. The algorithm was applied retrospectively to 3 clinical trials that investigated the efficacy of diaphragm pacing or ceftriaxone on survival in patients with ALS. Outcomes were additionally compared with conventional interim designs.
    We evaluated 183-1,351 unique interim analysis schemes for each trial. Application of the optimal designs correctly established lack of efficacy, would have concluded all studies 1.2-19.4 months earlier (reduction of 4.6%-57.7% in trial duration), and could have reduced the number of randomized patients by 1.7%-58.1%. By means of simulation, we illustrate the efficiency for other treatment scenarios. The optimized interim analysis schemes outperformed conventional interim designs in most scenarios.
    Our algorithm uses prior knowledge to determine the uncertainty of the expected treatment effect in ALS clinical trials and optimizes the planning of interim analyses. Improving futility monitoring in ALS could minimize the exposure of patients to ineffective or harmful treatments and result in significant ethical and efficiency gains.
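
The idea of scoring candidate interim analysis schemes by simulation, while weighting scenarios by a prior probability of success, can be illustrated roughly as follows. This is only a sketch under strong assumptions and not the published algorithm: the test statistic is modeled as Brownian motion with drift on the information scale, only futility stopping is simulated, and `simulate_scheme`, `looks`, `bounds`, and `prior_success` are hypothetical names.

```python
import random
from statistics import NormalDist


def simulate_scheme(looks, bounds, prior_success, drift_if_effective,
                    n_sims=2000, alpha=0.025, rng=None):
    """Estimate (expected information fraction used, probability of a positive trial).

    looks:  strictly increasing interim information fractions in (0, 1)
    bounds: futility bounds on the interim z-statistic, one per look
            (stop when z falls below the bound)
    prior_success: assumed prior probability that the drug is effective
    """
    rng = rng or random.Random()
    z_crit = NormalDist().inv_cdf(1 - alpha)
    total_info, wins = 0.0, 0
    for _ in range(n_sims):
        # Draw the "truth" from a two-point prior: effective or null.
        drift = drift_if_effective if rng.random() < prior_success else 0.0
        b, t_prev, stopped_at = 0.0, 0.0, None
        for t, bound in zip(looks, bounds):
            # Brownian-motion increment of the score process B(t).
            b += drift * (t - t_prev) + rng.gauss(0.0, (t - t_prev) ** 0.5)
            t_prev = t
            if b / t ** 0.5 < bound:  # interim z-statistic below futility bound
                stopped_at = t
                break
        if stopped_at is None:
            b += drift * (1 - t_prev) + rng.gauss(0.0, (1 - t_prev) ** 0.5)
            wins += b > z_crit  # at full information, z equals B(1)
            total_info += 1.0
        else:
            total_info += stopped_at
    return total_info / n_sims, wins / n_sims
```

Looping this function over a grid of `looks` and `bounds` and ranking the results by expected information used, subject to an acceptable loss in the probability of a positive trial, gives a crude version of the scheme comparison the abstract describes.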

  • Article type: Journal Article
    Efforts to develop biomarker-targeted anti-cancer therapies have progressed rapidly in recent years. With efforts to expedite regulatory reviews of promising therapies, several targeted cancer therapies have been granted accelerated approval on the basis of evidence acquired in single-arm phase II clinical trials. And yet, in the absence of randomization, patient prognosis for progression-free survival and overall survival may not have been studied under standard of care chemotherapies for emerging biomarker subpopulations prior to the submission of an accelerated approval application. Historical control rates used to design and evaluate emerging targeted therapies often arise as population averages, lacking specificity to the targeted genetic or immunophenotypic profile. Thus, historical trial results are inherently limited for inferring the potential "comparative efficacy" of novel targeted therapies. Consequently, randomization may be unavoidable in this setting. Innovations in design methodology are needed, however, to enable efficient implementation of randomized trials for agents that target biomarker subpopulations.
    This article proposes three randomized designs for early phase biomarker-guided oncology clinical trials. Each design utilizes the optimal efficiency predictive probability method to monitor multiple biomarker subpopulations for futility. Only designs with type I error between 0.05 and 0.1 and power of at least 0.8 were considered when selecting an optimal efficiency design from among the candidate designs formed by different combinations of posterior and predictive thresholds. A simulation study motivated by the results reported in a recent clinical trial studying atezolizumab treatment in patients with locally advanced or metastatic urothelial carcinoma is used to evaluate the operating characteristics of the various designs.
    Out of a maximum of 300 total patients, we find that the enrichment design has an average total sample size under the null of 101.0 and a total average sample size under the alternative of 218.0, as compared to 144.8 and 213.8 under the null and alternative, respectively, for the stratified control arm design. The pooled control arm design enrolled a total of 113.2 patients under the null and 159.6 under the alternative, out of a maximum of 200. These average sample sizes are 23-48% smaller under the alternative and 47-64% smaller under the null, as compared to the realized sample size of 310 patients in the phase II study of atezolizumab.
    Our findings suggest that phase II trials potentially smaller than those used in practice can be designed using randomization and futility stopping to efficiently obtain more information about both the treatment and control groups prior to phase III study.
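
The interplay between a posterior threshold and a predictive threshold can be made concrete with a beta-binomial sketch. This is a deliberately simplified illustration and not the designs from the abstract: it assumes a single arm with a binary response compared against a fixed null rate `p0` (the proposed designs are randomized), a Beta prior with positive integer parameters, and hypothetical function names.

```python
from math import comb, exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(y, m, a, b):
    """P(Y = y) for y future responses among m patients, given p ~ Beta(a, b)."""
    return comb(m, y) * exp(log_beta(y + a, m - y + b) - log_beta(a, b))


def prob_rate_exceeds(p0, a, b):
    """P(p > p0) for p ~ Beta(a, b), valid for positive integer a and b.

    Uses the identity P(Beta(a, b) > p0) = P(Binomial(a + b - 1, p0) <= a - 1).
    """
    n = a + b - 1
    return sum(comb(n, j) * p0 ** j * (1 - p0) ** (n - j) for j in range(a))


def predictive_probability(x, n, n_max, p0, posterior_threshold, a=1, b=1):
    """Probability that the trial ends in 'success' given x responses in n patients.

    'Success' means the final posterior P(p > p0) exceeds posterior_threshold
    once all n_max patients are observed.
    """
    m = n_max - n  # patients still to be enrolled
    pp = 0.0
    for y in range(m + 1):
        final_a, final_b = a + x + y, b + n_max - x - y
        if prob_rate_exceeds(p0, final_a, final_b) > posterior_threshold:
            pp += beta_binomial_pmf(y, m, a + x, b + n - x)
    return pp
```

A futility rule then stops enrollment when `predictive_probability(...)` drops below the predictive threshold. Searching over pairs of posterior and predictive thresholds and keeping those that meet the type I error and power constraints is the calibration step the abstract refers to.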

  • Article type: Clinical Trial, Phase II
    A robust Bayesian design is presented for a single-arm phase II trial with an early stopping rule to monitor a time-to-event endpoint. The assumed model is a piecewise exponential distribution with non-informative gamma priors on the hazard parameters in subintervals of a fixed follow-up interval. As an additional comparator, we also define and evaluate a version of the design based on an assumed Weibull distribution. Except for the assumed models, the piecewise exponential and Weibull model-based designs are identical to an established design that assumes an exponential event time distribution with an inverse gamma prior on the mean event time. The three designs are compared by simulation under several log-logistic and Weibull distributions having different shape parameters, and for different monitoring schedules. The simulations show that, compared to the exponential inverse gamma model-based design, the piecewise exponential design has substantially better performance, with much higher probabilities of correctly stopping the trial early, and shorter and less variable trial duration, when the assumed median event time is unacceptably low. Compared to the Weibull model-based design, the piecewise exponential design does a much better job of maintaining small incorrect stopping probabilities in cases where the true median survival time is desirably large.
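
The conjugate structure behind a piecewise exponential model with gamma priors can be sketched as below. This is an illustrative reading of the model, not the paper's design: the prior values, the Monte Carlo evaluation, and the stopping criterion (posterior probability that the median event time falls below a target) are assumptions, and all function names are hypothetical. With a Gamma(shape, rate) prior on each interval's hazard, the posterior is Gamma(shape + events, rate + exposure time) in that interval.

```python
import math
import random


def exposure_and_events(times, events, cuts):
    """Per-interval exposure time and event counts for intervals (cuts[k], cuts[k+1]].

    times:  observed follow-up time per patient
    events: 1 if the event occurred at that time, 0 if censored
    Follow-up beyond cuts[-1] is ignored (treated as censored at cuts[-1]).
    """
    k_int = len(cuts) - 1
    exposure = [0.0] * k_int
    deaths = [0] * k_int
    for t, d in zip(times, events):
        for k in range(k_int):
            lo, hi = cuts[k], cuts[k + 1]
            if t <= lo:
                break
            exposure[k] += min(t, hi) - lo
            if d and t <= hi:
                deaths[k] += 1
    return exposure, deaths


def posterior_params(exposure, deaths, prior_shape=0.01, prior_rate=0.01):
    """Gamma (shape, rate) posterior for each interval's hazard."""
    return [(prior_shape + d, prior_rate + e) for d, e in zip(deaths, exposure)]


def prob_median_below(target, cuts, post, n_draws=5000, rng=None):
    """Monte Carlo estimate of P(median event time < target | data).

    The median is below target exactly when the cumulative hazard at
    target exceeds log(2). Assumes target <= cuts[-1].
    """
    rng = rng or random.Random()
    hits = 0
    for _ in range(n_draws):
        cum_hazard = 0.0
        for k, (shape, rate) in enumerate(post):
            overlap = max(0.0, min(target, cuts[k + 1]) - cuts[k])
            # random.gammavariate takes (shape, scale), so scale = 1 / rate.
            cum_hazard += rng.gammavariate(shape, 1.0 / rate) * overlap
        if cum_hazard > math.log(2):
            hits += 1
    return hits / n_draws
```

An early stopping rule of the kind described would compare `prob_median_below(...)` with a pre-specified cut-off at each monitoring time.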

  • Article type: Comparative Study
    When designing phase II clinical trials, it is important to construct interim monitoring rules that achieve a balance between reliable early stopping for futility or safety and maintaining a high true positive probability (TPP), which is the probability of not stopping if the new treatment is truly safe and effective. We define and compare several methods for specifying early stopping boundaries as functions of interim sample size, rather than as fixed cut-offs, using Bayesian posterior probabilities as decision criteria. We consider boundaries with constant, linear, or exponential shapes. For design optimization criteria, we use the TPP and mean number of patients enrolled in the trial. Simulations to evaluate and compare the designs' operating characteristics under a range of scenarios show that, while there is no uniformly optimal boundary, an appropriately calibrated exponential shape maintains high TPP while limiting the number of patients assigned to a treatment with an inferior response rate or an excessive toxicity rate.
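
To show what a stopping boundary that depends on the interim sample size might look like, here is a small sketch. The parameterizations are assumptions chosen for illustration, since the abstract does not give the exact functional forms: each boundary rises from `c_min` at the start to `c_max` at full enrollment, and the trial stops for futility when the posterior probability that the response rate exceeds `p0` falls below the boundary. A Beta(1, 1) prior and a binary endpoint are also assumed.

```python
from math import comb, exp


def constant_boundary(n, n_max, c=0.05):
    """Fixed cut-off, independent of the interim sample size."""
    return c


def linear_boundary(n, n_max, c_min=0.0, c_max=0.10):
    """Cut-off rising linearly in the enrolled fraction n / n_max."""
    return c_min + (c_max - c_min) * n / n_max


def exponential_boundary(n, n_max, c_min=0.0, c_max=0.10, rate=3.0):
    """Cut-off rising exponentially: lenient early, stricter near full enrollment."""
    frac = (exp(rate * n / n_max) - 1) / (exp(rate) - 1)
    return c_min + (c_max - c_min) * frac


def posterior_prob_exceeds(x, n, p0, a=1, b=1):
    """P(p > p0 | x responses in n patients) under a Beta(a, b) prior.

    Valid for positive integer a and b, via
    P(Beta(A, B) > p0) = P(Binomial(A + B - 1, p0) <= A - 1).
    """
    post_a, post_b = a + x, b + n - x
    m = post_a + post_b - 1
    return sum(comb(m, j) * p0 ** j * (1 - p0) ** (m - j) for j in range(post_a))


def should_stop(x, n, n_max, p0, boundary):
    """Stop for futility when the posterior probability is below the boundary."""
    return posterior_prob_exceeds(x, n, p0) < boundary(n, n_max)
```

For example, `should_stop(2, 20, 40, 0.3, exponential_boundary)` evaluates one interim look; simulating many trials under each boundary shape would give the TPP and mean enrollment that the abstract uses as optimization criteria.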

  • Article type: Comparative Study
    BACKGROUND: Futility (inefficacy) interim monitoring is an important component in the conduct of phase III clinical trials, especially in life-threatening diseases. Desirable futility monitoring guidelines allow timely stopping if the new therapy is harmful or if it is unlikely to be shown sufficiently effective were the trial to continue to its final analysis. There are a number of analytical approaches that are used to construct futility monitoring boundaries. The most common approaches are based on conditional power, sequential testing of the alternative hypothesis, or sequential confidence intervals. The resulting futility boundaries vary considerably with respect to the level of evidence required for recommending stopping the study.
    OBJECTIVE: We evaluate the performance of commonly used methods using event histories from completed phase III clinical trials of the Radiation Therapy Oncology Group, Cancer and Leukemia Group B, and North Central Cancer Treatment Group.
    METHODS: We considered published superiority phase III trials with survival endpoints initiated after 1990. There are 52 studies available for this analysis from different disease sites. Total sample size and maximum number of events (statistical information) for each study were calculated using protocol-specified effect size, type I and type II error rates. In addition to the common futility approaches, we considered a recently proposed linear inefficacy boundary approach with an early harm look followed by several lack-of-efficacy analyses. For each futility approach, interim test statistics were generated for three schedules with different analysis frequency, and early stopping was recommended if the interim result crossed a futility stopping boundary. For trials not demonstrating superiority, the impact of each rule is summarized as savings on sample size, study duration, and information time scales.
    RESULTS: For negative studies, our results show that the futility approaches based on testing the alternative hypothesis and repeated confidence interval rules yielded smaller savings (compared to the other two rules). These boundaries are too conservative, especially during the first half of the study (<50% of information). The conditional power rules are too aggressive during the second half of the study (>50% of information) and may stop a trial even when there is a clinically meaningful treatment effect. The linear inefficacy boundary with three or more interim analyses provided the best results. For positive studies, we demonstrated that none of the futility rules would have stopped the trials.
    CONCLUSIONS: The linear inefficacy boundary futility approach is attractive from statistical, clinical, and logistical standpoints in clinical trials evaluating new anti-cancer agents.
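
Of the approaches compared above, conditional power has a compact closed form that is easy to sketch. The code below uses the standard B-value formulation for a one-sided test on the information-time scale; it is a generic illustration and not the specific rules evaluated in the study, and the 0.2 futility cut-off and the function names are assumptions.

```python
from statistics import NormalDist

_norm = NormalDist()


def conditional_power(z_interim, info_frac, drift, alpha=0.025):
    """Probability of crossing the final critical value given the interim result.

    z_interim: interim z-statistic
    info_frac: information fraction t, strictly between 0 and 1
    drift:     assumed expected value of the final z-statistic
    """
    z_crit = _norm.inv_cdf(1 - alpha)
    b = z_interim * info_frac ** 0.5  # B-value: B(t) = Z(t) * sqrt(t)
    remaining = 1 - info_frac
    return 1 - _norm.cdf((z_crit - b - drift * remaining) / remaining ** 0.5)


def design_drift(alpha=0.025, power=0.9):
    """Drift implied by the protocol-specified effect: z_alpha + z_beta."""
    return _norm.inv_cdf(1 - alpha) + _norm.inv_cdf(power)


def current_trend_drift(z_interim, info_frac):
    """Drift estimated from the interim data: B(t) / t."""
    return z_interim / info_frac ** 0.5


def stop_for_futility(cp, threshold=0.2):
    """Hypothetical rule: stop when conditional power falls below the threshold."""
    return cp < threshold
```

Evaluating `conditional_power` under `design_drift()` versus `current_trend_drift(...)` shows how strongly the choice of assumed effect changes the decision, which is one reason conditional power rules can behave aggressively late in a trial.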