complex survey sampling

  • Article type: Journal Article
    Population-based surveys are possible sources from which to draw representative control data for case-control studies. However, these surveys involve complex sampling that could lead to biased estimates of measures of association if not properly accounted for in analyses. Approaches to incorporating complex-sampled controls in density-sampled case-control designs have not been examined.
    We used a simulation study to evaluate the performance of different approaches to estimating incidence density ratios (IDR) from case-control studies with controls drawn from complex survey data using risk-set sampling. In simulated population data, we applied four survey sampling approaches, with varying survey sizes, and assessed the performance of four analysis methods for incorporating survey-based controls.
    Estimates of the IDR were unbiased for methods that conducted risk-set sampling with probability of selection proportional to survey weights. Estimates of the IDR were biased when sampling weights were not incorporated, or only included in regression modeling. The unbiased analysis methods performed comparably and produced estimates with variance comparable to biased methods. Variance increased and confidence interval coverage decreased as survey size decreased.
    Unbiased estimates are obtainable in risk-set sampled case-control studies using controls drawn from complex survey data when weights are properly incorporated.
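
The weighted risk-set sampling step described in this abstract can be illustrated with a minimal sketch. This is not the authors' code. The function name `sample_controls`, the record fields (`id`, `exit_time`, `weight`), and the toy data are all hypothetical, and drawing without replacement within a risk set is an assumption, since the abstract does not say which scheme was used.

```python
import random


def sample_controls(case_time, survey, n_controls, rng):
    """Draw controls for one case from the survey respondents at risk.

    Selection probability is proportional to each respondent's survey
    weight. Sampling is without replacement within the risk set
    (an assumption made for this sketch).
    """
    # Risk set: respondents still event-free and under follow-up
    # at the moment the case occurs.
    pool = [p for p in survey if p["exit_time"] > case_time]
    chosen = []
    for _ in range(min(n_controls, len(pool))):
        weights = [p["weight"] for p in pool]
        # One weighted draw, then remove the pick so it cannot repeat.
        pick = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(pick))
    return chosen


# Hypothetical survey respondents: follow-up end time and survey weight.
survey = [
    {"id": 1, "exit_time": 2.0, "weight": 150.0},
    {"id": 2, "exit_time": 5.0, "weight": 400.0},
    {"id": 3, "exit_time": 6.5, "weight": 90.0},
    {"id": 4, "exit_time": 8.0, "weight": 250.0},
    {"id": 5, "exit_time": 9.0, "weight": 120.0},
]

rng = random.Random(0)
controls = sample_controls(case_time=3.0, survey=survey, n_controls=2, rng=rng)
print([c["id"] for c in controls])
```

Because the same respondent may be at risk for several cases, a full analysis would repeat this draw for every case and then fit a conditional logistic model on the matched sets; that step is omitted here.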

  • Article type: Journal Article
    The issue of equality in the between- and within-level structures in Multilevel Confirmatory Factor Analysis (MCFA) models has been influential for obtaining unbiased parameter estimates and statistical inferences. A commonly seen condition is the inequality of factor loadings under equal level-varying structures. With mathematical investigation and Monte Carlo simulation, this study compared the robustness of five statistical models, including two model-based models (a true and a mis-specified model), one design-based model, and two maximum models (two models in which the full-rank variance-covariance matrix is estimated at the between level and the within level, respectively), in analyzing complex survey measurement data with level-varying factor loadings. Empirical data on the perceived Harter competence scale from 120 3rd graders (from 40 classrooms) were modeled using MCFA, and the parameter estimates were used as true parameters to perform the Monte Carlo simulation study. Results showed that the maximum models were robust to unequal factor loadings, while the design-based and the mis-specified model-based approaches produced conflated results and spurious statistical inferences. We recommend the use of maximum models if researchers have limited information about the pattern of factor loadings and measurement structures. Measurement models are key components of Structural Equation Modeling (SEM); therefore, the findings can be generalized to multilevel SEM and CFA models. Mplus code is provided for the maximum models and the other analytical models.