item response model

  • Article type: Journal Article
    OBJECTIVE: The aim of this study was to validate the Neuropathic Pain for Post-Surgical Patients (NeuPPS) scale against clinically verified neuropathic pain (NP) by quantitative sensory testing (QST) as well as evaluation of other psychometric properties. The NeuPPS is a validated 5-item scale designed to evaluate NP in surgical populations.
    METHODS: Data from 537 women aged >18 years scheduled for primary breast cancer surgery enrolled in a previous study for assessing risk factors for persistent pain after breast cancer treatment were used. Exclusion criteria were any other breast surgery or relevant comorbidity. A total of 448 eligible questionnaires were available at 6 months and 455 at 12 months. At 12 months, 290 patients completed a clinical examination and QST. NeuPPS and PainDETECT were analyzed against patients with and without clinically verified NP. NP was assessed using a standardized QST protocol including a clinical assessment. Furthermore, the NeuPPS and PainDETECT scores were psychometrically tested with an item response theory method, the Rasch analysis, to assess construct validity. Primary outcomes were the diagnostic accuracy measures for the NeuPPS, and secondary measures were psychometric analyses of the NeuPPS after 6 and 12 months. PainDETECT was also compared to clinically verified NP as well as NeuPPS comparing the stability of the estimates.
    RESULTS: Comparing the NeuPPS scores with verified NP using a receiver operating characteristic curve, the NeuPPS had an area under the curve of 0.80. Using a cutoff of 1, the NeuPPS had a sensitivity of 88% and a specificity of 59%; using a cutoff of 3, the values were 35% and 96%, respectively. Analysis of the PainDETECT indicated that the cutoffs currently in use may be inappropriate in a surgical population.
    CONCLUSIONS: The present study supports the validity of the NeuPPS as a screening tool for NP in a surgical population.
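
The sensitivity/specificity trade-off reported for the two cutoffs can be illustrated with a minimal sketch. The function name, the toy data, and the convention that a score at or above the cutoff counts as screen-positive are assumptions made for illustration; none of them come from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[int], has_np: Sequence[bool], cutoff: int
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for a screening cutoff.

    Assumed convention: a score >= cutoff is a positive screen.
    """
    pairs = list(zip(scores, has_np))
    tp = sum(1 for s, d in pairs if d and s >= cutoff)      # true positives
    fn = sum(1 for s, d in pairs if d and s < cutoff)       # false negatives
    tn = sum(1 for s, d in pairs if not d and s < cutoff)   # true negatives
    fp = sum(1 for s, d in pairs if not d and s >= cutoff)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data (NOT the study data): scale scores and reference diagnosis.
toy_scores = [0, 1, 3, 4, 0, 0, 1, 2]
toy_has_np = [False, True, True, True, False, False, False, True]

for cutoff in (1, 3):
    sens, spec = sensitivity_specificity(toy_scores, toy_has_np, cutoff)
    print(f"cutoff={cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

As in the abstract, raising the cutoff in this toy example lowers sensitivity and raises specificity.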

  • Article type: Journal Article
    How social networks influence human behavior has been an interesting topic in applied research. Existing methods often utilize scale-level behavioral data (e.g., total number of positive responses) to estimate the influence of a social network on human behavior. This study proposes a novel approach to studying social influence that utilizes item-level behavioral measures. Under the latent space modeling framework, we integrate the two latent spaces for respondents' social network data and item-level behavior measures into a single space we call the 'interaction map'. The interaction map visualizes the association between the latent homophily among respondents and their item-level behaviors, revealing differential social influence effects across item-level behaviors. We also measure overall social influence by assessing the impact of the interaction map. We evaluate the properties of the proposed approach via extensive simulation studies and demonstrate the proposed approach with real data in the context of studying how students' friendship network influences their participation in school activities.

  • Article type: Journal Article
    Heywood cases are known from the linear factor analysis literature as variables with communalities larger than 1.00, and in present-day factor models, the problem also shows up as negative residual variances. For binary data, factor models for ordinal data can be applied with either delta parameterization or theta parameterization. The former is more common than the latter and can yield Heywood cases when limited information estimation is used. The same problem shows up as nonconvergence in theta-parameterized factor models and as extremely large discriminations in item response theory (IRT) models. In this study, we explain why the same problem appears in different forms depending on the method of analysis. We first discuss this issue using equations and then illustrate our conclusions using a small simulation study, where all three methods, delta- and theta-parameterized ordinal factor models (with estimation based on polychoric correlations and thresholds) and an IRT model (with full information estimation), are used to analyze the same datasets. The results generalize across WLS, WLSMV, and ULS estimators for the factor models for ordinal data. Finally, we analyze real data with the same three approaches. The results of the simulation study and the analysis of real data confirm the theoretical conclusions.
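
One way to see why the same problem takes different forms is the textbook relationship between the parameterizations. This is a sketch assuming a single factor with unit variance and a probit link; it is not necessarily the derivation used in the article. Under delta parameterization the residual variance is 1 − λ², which turns negative when the standardized loading λ exceeds 1. The corresponding theta-parameterized loading, which is also the normal-ogive IRT discrimination, is λ / √(1 − λ²), which grows without bound as λ approaches 1. The function name below is hypothetical.

```python
import math


def theta_loading_from_delta(lam: float) -> float:
    """Convert a standardized (delta-parameterization) loading to the
    theta-parameterization loading / normal-ogive discrimination.

    Assumes one factor with unit variance and a probit link.
    """
    if not -1.0 < lam < 1.0:
        # Residual variance 1 - lam**2 would be zero or negative: a Heywood case.
        raise ValueError("Heywood case: |loading| >= 1 has no finite counterpart")
    return lam / math.sqrt(1.0 - lam * lam)


for lam in (0.6, 0.9, 0.99, 0.999):
    print(f"delta loading {lam}: theta loading {theta_loading_from_delta(lam):.2f}")
```

A loading just below 1 maps to a very large discrimination, and a loading at or above 1 has no finite counterpart, which is consistent with the nonconvergence and extreme discriminations described above.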

  • Article type: Journal Article
    The Musical Emotion Discrimination Task (MEDT) is a short, non-adaptive test of the ability to discriminate emotions in music. Test-takers hear two performances of the same melody, both played by the same performer but each trying to communicate a different basic emotion, and are asked to determine which one is "happier", for example. The goal of the current study was to construct a new version of the MEDT using a larger set of shorter, more diverse music clips and an adaptive framework to expand the ability range for which the test can deliver measurements. The first study analysed responses from a large sample of participants (N = 624) to determine how musical features contributed to item difficulty, which resulted in a quantitative model of musical emotion discrimination ability rooted in Item Response Theory (IRT). This model informed the construction of the adaptive MEDT. A second study contributed preliminary evidence for the validity and reliability of the adaptive MEDT, and demonstrated that the new version of the test is suitable for a wider range of abilities. This paper therefore presents the first adaptive musical emotion discrimination test, a new resource for investigating emotion processing which is freely available for research use.

  • Article type: Journal Article
    In educational large-scale assessment (LSA) studies such as PISA, item response theory (IRT) scaling models summarize students' performance on cognitive test items across countries. This article investigates the impact of different factors in model specifications for the PISA 2018 mathematics study. In the social sciences, exploring such diverse model specification options is also known under the labels multiverse analysis or specification curve analysis. In this article, we investigate the following five factors of model specification in the PISA scaling model for obtaining the two country distribution parameters, country means and country standard deviations: (1) the choice of the functional form of the IRT model, (2) the treatment of differential item functioning at the country level, (3) the treatment of missing item responses, (4) the impact of item selection in the PISA test, and (5) the impact of test position effects. In our multiverse analysis, it turned out that model uncertainty had almost the same impact on variability in the country means as sampling errors due to the sampling of students. Model uncertainty had an even larger impact than standard errors for country standard deviations. Overall, each of the five specification factors in the multiverse analysis had at least a moderate effect on either country means or standard deviations. In the discussion section, we critically evaluate the current practice of model specification decisions in LSA studies. It is argued that we would either prefer reporting the variability in model uncertainty or choosing a particular model specification that might provide the strategy that is most valid. It is emphasized that model fit should not play a role in selecting a scaling strategy for LSA applications.

  • Article type: Journal Article
    In educational large-scale assessment studies such as PISA, item response theory (IRT) models are used to summarize students' performance on cognitive test items across countries. In this article, the impact of the choice of the IRT model on the distribution parameters of countries (i.e., mean, standard deviation, percentiles) is investigated. Eleven different IRT models are compared using information criteria. Moreover, model uncertainty is quantified by estimating model error, which can be compared with the sampling error associated with the sampling of students. The PISA 2009 dataset for the cognitive domains mathematics, reading, and science is used as an example of the choice of the IRT model. It turned out that the three-parameter logistic IRT model with residual heterogeneity and a three-parameter IRT model with a quadratic effect of the ability θ provided the best model fit. Furthermore, model uncertainty was relatively small compared to sampling error regarding country means in most cases but was substantial for country standard deviations and percentiles. Consequently, it can be argued that model error should be included in the statistical inference of educational large-scale assessment studies.
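
For orientation, the standard three-parameter logistic (3PL) item response function can be sketched as follows. This is the plain 3PL only; the extensions named in the abstract (residual heterogeneity, a quadratic effect of θ) are not shown. The function and parameter names are chosen here for illustration.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under the standard 3PL model.

    theta: ability, a: discrimination, b: difficulty, c: guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# When ability equals difficulty, the probability is halfway between c and 1.
print(p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2))
# With c = 0 the function reduces to the two-parameter logistic (2PL) model.
print(p_correct_3pl(theta=1.0, a=1.5, b=0.0, c=0.0))
```

Different functional forms of this kind imply different scorings of the same responses, which is why the choice of model can shift country means, standard deviations, and percentiles.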

  • Article type: Journal Article
    Integrative data analysis (IDA) involves obtaining multiple datasets, scaling the data to a common metric, and jointly analyzing the data. The first step in IDA is to scale the multisample item-level data to a common metric, which is often done with multiple group item response models (MGM). With invariance constraints tested and imposed, the estimated latent variable scores from the MGM serve as an observed variable in subsequent analyses. This approach was used with empirical multiple group data and different latent variable estimates were obtained for individuals with the same response pattern from different studies. A Monte Carlo simulation study was then conducted to compare the accuracy of latent variable estimates from the MGM, a single-group item response model, and an MGM where group differences are ignored. Results suggest that these alternative approaches led to consistent and equally accurate latent variable estimates. Implications for IDA are discussed.

  • Article type: Journal Article
    Missing item responses are prevalent in educational large-scale assessment studies such as the Programme for International Student Assessment (PISA). The current operational practice scores missing item responses as wrong, but several psychometricians have advocated for a model-based treatment based on the latent ignorability assumption. In this approach, item responses and response indicators are jointly modeled conditional on a latent ability and a latent response propensity variable. Alternatively, imputation-based approaches can be used. The latent ignorability assumption is weakened in the Mislevy-Wu model, which characterizes a nonignorable missingness mechanism and allows the missingness of an item to depend on the item itself. The scoring of missing item responses as wrong and the latent ignorable model are submodels of the Mislevy-Wu model. In an illustrative simulation study, it is shown that the Mislevy-Wu model provides unbiased model parameters. Moreover, the simulation replicates the finding from various simulation studies in the literature that scoring missing item responses as wrong provides biased estimates if the latent ignorability assumption holds in the data-generating model. However, if missing item responses are generated such that they can only arise from incorrect item responses, applying an item response model that relies on latent ignorability results in biased estimates. The Mislevy-Wu model guarantees unbiased parameter estimates if the more general Mislevy-Wu model holds in the data-generating model. In addition, this article uses the PISA 2018 mathematics dataset as a case study to investigate the consequences of different missing data treatments on country means and country standard deviations. The obtained country means and country standard deviations can differ substantially across the scaling models. In contrast to previous statements in the literature, the scoring of missing item responses as incorrect provided a better model fit than a latent ignorable model for most countries. Furthermore, the dependence of the missingness of an item on the item itself after conditioning on the latent response propensity was much more pronounced for constructed-response items than for multiple-choice items. As a consequence, scaling models that presuppose latent ignorability should be rejected from two perspectives. First, the Mislevy-Wu model is preferred over the latent ignorable model for reasons of model fit. Second, in the discussion section, we argue that model fit should only play a minor role in choosing psychometric models in large-scale assessment studies because validity aspects are most relevant. Missing data treatments that countries (and, hence, their students) can simply manipulate result in unfair country comparisons.
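
A raw-score toy example shows why the treatment of missing responses matters. It only contrasts scoring omissions as wrong with dropping them from the score, so it is a deliberately crude stand-in: it implements neither the latent ignorable model nor the Mislevy-Wu model, both of which are model-based. The function name and data are hypothetical.

```python
from typing import Optional, Sequence


def proportion_correct(
    responses: Sequence[Optional[int]], missing_as_wrong: bool
) -> float:
    """Proportion correct for 0/1 responses, with None marking a missing response."""
    if missing_as_wrong:
        # Operational practice described in the abstract: an omission counts as 0.
        scored = [0 if r is None else r for r in responses]
    else:
        # Crude alternative: treat omitted items as if they were not administered.
        scored = [r for r in responses if r is not None]
    if not scored:
        raise ValueError("no scorable responses")
    return sum(scored) / len(scored)


# Hypothetical response vector: two correct, one incorrect, two omitted.
toy_responses = [1, 0, None, 1, None]
print(proportion_correct(toy_responses, missing_as_wrong=True))
print(proportion_correct(toy_responses, missing_as_wrong=False))
```

The same response pattern yields a noticeably different score under the two treatments, which hints at how test-takers' omission behavior can move country-level results.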

  • Article type: Journal Article
    Compositional items - a form of forced-choice items - require respondents to allocate a fixed total number of points to a set of statements. To describe the responses to these items, the Thurstonian item response theory (IRT) model was developed. Despite its prominence, the model requires that items composed of parts of statements result in a factor loading matrix with full rank. Without this requirement, the model cannot be identified, and the latent trait estimates would be seriously biased. Besides, the estimation of the Thurstonian IRT model often results in convergence problems. To address these issues, this study developed a new version of the Thurstonian IRT model for analyzing compositional items - the lognormal ipsative model (LIM) - that would be sufficient for tests using items with all statements positively phrased and with equal factor loadings. We developed an online value test following Schwartz's values theory using compositional items and collected response data from a sample of N = 512 participants aged 13 to 51 years. The results showed that our LIM had an acceptable fit to the data, and that the reliabilities exceeded 0.85. A simulation study showed good parameter recovery, a high convergence rate, and sufficient precision of estimation across the various conditions of covariance matrices between traits, test lengths, and sample sizes. Overall, our results indicate that the proposed model can overcome the problems of the Thurstonian IRT model when all statements are positively phrased and factor loadings are similar.

  • Article type: Journal Article
    The COVID-19 pandemic has spread widely around the world. Many mathematical models have been proposed to investigate the inflection point (IP) and the spread pattern of COVID-19. However, no researchers have applied social network analysis (SNA) to cluster their characteristics. We aimed to illustrate the use of SNA to identify the spread clusters of COVID-19. Cumulative numbers of infected cases (CNICs) in countries/regions were downloaded from GitHub. The CNIC patterns were extracted from SNA based on CNICs between countries/regions. The item response model (IRT) was applied to create a general predictive model for each country/region. The IP days were obtained from the IRT model. The location parameters in continents, China, and the United States were compared. The results showed that (1) three clusters (255, n = 51, 130, and 74 in patterns from Eastern Asia and Europe to America) were separated using SNA, (2) China had a shorter mean IP and smaller mean location parameter than other counterparts, and (3) an online dashboard was used to display the clusters along with IP days for each country/region. Spatiotemporal spread patterns can be clustered using SNA and correlation coefficients (CCs). A dashboard with spread clusters and IP days is recommended to epidemiologists and researchers and is not limited to the COVID-19 pandemic.