randomization test

  • Article type: Journal Article
    Single-case experiments have become increasingly popular in psychological and educational research. However, the analysis of single-case data is often complicated by the frequent occurrence of missing or incomplete data. If missingness or incompleteness cannot be avoided, it becomes important to know which strategies are optimal, because the presence of missing data or inadequate data handling strategies may lead to experiments no longer "meeting standards" set by, for example, the What Works Clearinghouse. For the examination and comparison of strategies to handle missing data, we simulated complete datasets for ABAB phase designs, randomized block designs, and multiple-baseline designs. We introduced different levels of missingness in the simulated datasets by randomly deleting 10%, 30%, and 50% of the data. We evaluated the type I error rate and statistical power of a randomization test for the null hypothesis that there was no treatment effect under these different levels of missingness, using different strategies for handling missing data: (1) randomizing a missing-data marker and calculating all reference statistics only for the available data points, (2) estimating the missing data points by single imputation using the state space representation of a time series model, and (3) multiple imputation based on regressing the available data points on preceding and succeeding data points. The results are conclusive for the conditions simulated: The randomized-marker method outperforms the other two methods in terms of statistical power in a randomization test, while keeping the type I error rate under control.
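    As an illustration of the randomization-test logic this abstract describes, here is a minimal Python sketch for a randomized block design in which the within-block A/B order was randomized at design time, and missing observations (NaN) are simply excluded from the test statistic, in the spirit of the "available data points only" strategy. The function name, the Monte Carlo sampling of assignments, and the choice of the mean difference as test statistic are illustrative assumptions, not the authors' exact implementation.

    ```python
    import numpy as np

    def randomization_test_rbd(data, n_rand=5000, seed=0):
        """Two-sided randomization test for a randomized block design.

        `data` has shape (n_blocks, 2): column 0 holds the A observation
        and column 1 the B observation of each block.  Missing values
        are NaN and are excluded from the test statistic, mirroring the
        'available data points only' strategy from the abstract.
        """
        rng = np.random.default_rng(seed)
        data = np.asarray(data, dtype=float)

        def stat(d):
            # Mean difference B - A over the available (non-NaN) points.
            return np.nanmean(d[:, 1]) - np.nanmean(d[:, 0])

        observed = stat(data)
        count = 0
        for _ in range(n_rand):
            # Re-randomize the within-block A/B order, as at design time.
            flips = rng.integers(0, 2, size=len(data)).astype(bool)
            permuted = data.copy()
            permuted[flips] = permuted[flips][:, ::-1]
            if abs(stat(permuted)) >= abs(observed):
                count += 1
        # Monte Carlo p value, counting the observed assignment once.
        return (count + 1) / (n_rand + 1)
    ```

    With few blocks, exhaustively enumerating all 2^k within-block orders instead of sampling gives the exact reference distribution.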

  • Article type: Journal Article
    Multilevel models (MLMs) have been proposed in single-case research, to synthesize data from a group of cases in a multiple-baseline design (MBD). A limitation of this approach is that MLMs require several statistical assumptions that are often violated in single-case research. In this article we propose a solution to this limitation by presenting a randomization test (RT) wrapper for MLMs that offers a nonparametric way to evaluate treatment effects, without making distributional assumptions or an assumption of random sampling. We present the rationale underlying the proposed technique and validate its performance (with respect to Type I error rate and power) as compared to parametric statistical inference in MLMs, in the context of evaluating the average treatment effect across cases in an MBD. We performed a simulation study that manipulated the numbers of cases and of observations per case in a dataset, the data variability between cases, the distributional characteristics of the data, the level of autocorrelation, and the size of the treatment effect in the data. The results showed that the power of the RT wrapper is superior to the power of parametric tests based on F distributions for MBDs with fewer than five cases, and that the Type I error rate of the RT wrapper is controlled for bimodal data, whereas this is not the case for traditional MLMs.
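    The "wrapper" idea above is statistic-agnostic, which a short sketch makes concrete: the statistic is recomputed under re-drawn admissible intervention start points. The article wraps an MLM fixed-effect estimate; the simpler average mean difference used below, and all function names, are illustrative assumptions.

    ```python
    import numpy as np

    def mbd_randomization_test(cases, start_points, candidate_starts,
                               stat, n_rand=2000, seed=0):
        """Randomization-test 'wrapper' sketch for a multiple-baseline design.

        Any summary statistic can be plugged in via `stat` (the article
        wraps an MLM fixed-effect estimate; a plain average mean
        difference is used here for illustration).  `candidate_starts[i]`
        lists the intervention start points admissible for case i at
        design time; `start_points` are the starts actually realized.
        """
        rng = np.random.default_rng(seed)
        observed = stat(cases, start_points)
        count = 0
        for _ in range(n_rand):
            # Re-draw an admissible start point for every case.
            starts = [rng.choice(c) for c in candidate_starts]
            if abs(stat(cases, starts)) >= abs(observed):
                count += 1
        return (count + 1) / (n_rand + 1)

    def average_effect(cases, starts):
        """Average across cases of mean(treatment phase) - mean(baseline)."""
        return float(np.mean([y[s:].mean() - y[:s].mean()
                              for y, s in zip(cases, starts)]))
    ```

    Because no distributional model is fitted inside the reference distribution itself, the p value needs neither normality nor random-sampling assumptions, which is the point of the wrapper.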

  • Article type: Journal Article
    The conditional power (CP) of the randomization test (RT) was investigated in a simulation study in which three different single-case effect size (ES) measures were used as the test statistics: the mean difference (MD), the percentage of nonoverlapping data (PND), and the nonoverlap of all pairs (NAP). Furthermore, we studied the effect of the experimental design on the RT's CP for three different single-case designs with rapid treatment alternation: the completely randomized design (CRD), the randomized block design (RBD), and the restricted randomized alternation design (RRAD). As a third goal, we evaluated the CP of the RT for three types of simulated data: data generated from a standard normal distribution, data generated from a uniform distribution, and data generated from a first-order autoregressive Gaussian process. The results showed that the MD and NAP perform very similarly in terms of CP, whereas the PND performs substantially worse. Furthermore, the RRAD yielded marginally higher power in the RT, followed by the CRD and then the RBD. Finally, the power of the RT was almost unaffected by the type of the simulated data. On the basis of the results of the simulation study, we recommend at least 20 measurement occasions for single-case designs with a randomized treatment order that are to be evaluated with an RT using a 5% significance level. Furthermore, we do not recommend use of the PND, because of its low power in the RT.
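    The three test statistics compared in this study can be written down compactly. The sketch below follows the standard formulations of MD, PND (percentage of treatment-phase points above the highest baseline point, assuming an increase reflects improvement), and NAP (proportion of all A-B pairs in which the B point is higher, ties counted as one half); the function names are my own.

    ```python
    import numpy as np

    def mean_difference(a, b):
        """MD: mean of phase B minus mean of phase A."""
        return np.mean(b) - np.mean(a)

    def pnd(a, b):
        """PND: percentage of B points exceeding the highest A point
        (assumes that an increase reflects improvement)."""
        return 100.0 * np.mean(np.asarray(b) > np.max(a))

    def nap(a, b):
        """NAP: proportion of all (A, B) pairs in which the B point is
        higher than the A point, counting ties as one half."""
        a = np.asarray(a, dtype=float)
        b = np.asarray(b, dtype=float)
        diffs = b[None, :] - a[:, None]   # all pairwise B - A differences
        wins = (diffs > 0).sum() + 0.5 * (diffs == 0).sum()
        return wins / diffs.size
    ```

    In a randomization test, any of these functions can serve as the test statistic; the study's finding is that MD and NAP yield similar conditional power while PND lags substantially behind.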
