Validity

  • Article type: Journal Article
    OBJECTIVE: To evaluate the veracity of self-reports of month-level health insurance coverage in the Current Population Survey Annual Social and Economic Supplement (CPS).
    METHODS: The CHIME (Comparing Health Insurance Measurement Error) study used health insurance enrollment records from a large regional Midwest insurer as the sample for primary data collection in spring 2015.
    METHODS: A sample of individuals enrolled in a range of public and private coverage types (including Medicaid and marketplace) was administered the CPS health insurance module, which included questions about month-level coverage, by type, over a 17-18-month time span. Survey data were then matched to enrollment records covering that same time frame, and concordance between the records and self-reports was assessed.
    METHODS: The sample was drawn by the insurer's informatics specialists, and Census Bureau interviewers conducted the survey. Following data collection, updated enrollment records were matched to the survey data to produce a person-level file of coverage by type at the month level.
    RESULTS: For 91% of the overall sample, coverage status and type were reported accurately for at least 75% of observed months. Results varied somewhat by stability of coverage. Among those who were continuously covered throughout the 17-18-month observation period (64% of the overall sample), that level of reporting accuracy was observed for 94% of the sample; for those who had censored spells (34% of the overall sample), the figure was 87%; and among those with gaps and/or changes according to the records (2% of the overall sample), at least 75% of months were reported accurately for 82% of the group.
    CONCLUSIONS: Findings suggest that reporting accuracy of month-level coverage in the CPS is high and that the survey could become a valuable new data source for studying the dynamics of coverage, including the Medicaid unwinding.
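
The accuracy criterion in the results above (coverage status and type reported correctly for at least 75% of observed months) can be illustrated with a minimal sketch. The function names, the month keys, and the example person are hypothetical and are not taken from the CHIME study; the sketch only shows how a per-person month-level concordance share could be computed once records and self-reports are matched.

```python
from typing import Dict


def share_accurate_months(record: Dict[str, str], survey: Dict[str, str]) -> float:
    """Fraction of observed months in which the self-report matches the record."""
    # Only months present in both sources count as observed.
    months = [m for m in record if m in survey]
    if not months:
        raise ValueError("no overlapping months to compare")
    hits = sum(record[m] == survey[m] for m in months)
    return hits / len(months)


def meets_threshold(record: Dict[str, str], survey: Dict[str, str],
                    threshold: float = 0.75) -> bool:
    """True if at least `threshold` of observed months are reported accurately."""
    return share_accurate_months(record, survey) >= threshold


# Hypothetical person: four observed months, coverage type misreported in one.
record = {"m1": "medicaid", "m2": "medicaid", "m3": "marketplace", "m4": "marketplace"}
survey = {"m1": "medicaid", "m2": "medicaid", "m3": "medicaid", "m4": "marketplace"}

print(share_accurate_months(record, survey))
print(meets_threshold(record, survey))
```

In this made-up case three of four months agree, so the person sits exactly at the 75% cutoff and is counted as accurately reported.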

  • Article type: Journal Article
    BACKGROUND: Psychological autopsy methods often include measures of impulsivity and aggression. The aim is to assess their reliability and validity in a Spanish sample.
    METHODS: A cross-sectional web-based survey was completed by 184 proband-proxy pairs. Data were collected on sociodemographic characteristics, impulsivity through the Barratt Impulsiveness Scale (BIS-11), aggression through the Buss-Perry Aggression Questionnaire (BPAQ), and history of suicide ideation. Proxies filled out the BIS-11, BPAQ, and suicide ideation items with the responses they would expect from the probands. Reliability was assessed using intraclass correlation coefficients (ICC) between probands and proxies. Logistic regression analysis was performed to assess the validity of proxy reports in predicting probands' suicide ideation.
    RESULTS: Bivariate analysis showed differences in BPAQ (median 68 vs. 62; p=0.001), but not in BIS-11 (p>.050). BIS-11 showed good concordance (ICC=0.754; CI 95% 0.671-0.816) and BPAQ showed acceptable concordance (ICC=0.592; CI 95% 0.442-0.699). In the proband regression model, BPAQ predicted suicide ideation (OR 1.038; CI 95% 1.016-1.061) but BIS-11 did not (OR 0.991; CI 95% 0.958-1.025). In the proxy-report model, BPAQ also predicted probands' suicide ideation (OR 1.036; CI 95% 1.014-1.058) but BIS-11 did not (OR 0.973; CI 95% 0.942-1.004).
    CONCLUSIONS: Used as proxy-reported assessment tools, BIS-11 showed better reliability than the BPAQ. However, both showed validity in a Spanish population and could be included in psychological autopsy protocols.

  • Article type: Journal Article
    To evaluate the positive predictive value (PPV) of an endometrial cancer case-finding algorithm using International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes from US insurance claims for implementation in a planned post-marketing safety study. Two algorithm variants were evaluated.
    Provisional incident endometrial cancer cases were identified from 2016 through 2020 among women aged ≥50 years. One algorithm variant used diagnosis codes for malignant neoplasms of uterine sites (C54.x), excluding C54.2 (malignant neoplasm of myometrium); the other used only C54.1 (malignant neoplasm of endometrium). A random sample of medical records of recent incident provisional cases (2018-2020) was requested for adjudication. Confirmed cases showed biopsy evidence of endometrial cancer, documentation of cancer staging, or hysterectomy following diagnosis. We estimated the PPV of the variants with 95% confidence intervals (CI) excluding cases that had insufficient information.
    Of 294 provisional cases adjudicated, 85% were from outpatient settings (n = 249). Mean age at diagnosis was 69.3 years. Among the 294 adjudicated cases (identified with the broader algorithm variant), the same 223 were confirmed as endometrial cancer cases under both algorithm variants. The PPV (95% CI) was 84.2% (79.2%-88.3%) for the broader algorithm variant and 85.8% (80.9%-89.8%) for the variant using only C54.1.
    We developed and validated an algorithm using ICD-10-CM diagnosis codes to identify endometrial cancer cases in health insurance claims with a sufficiently high PPV to use in a planned post-marketing safety study.
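
The PPV figures above are proportions of confirmed cases among adjudicated cases with sufficient information. The sketch below shows that arithmetic with a Wilson score interval. Two things are assumptions and are not stated in the abstract: the denominators (265 and 260), which are back-calculated from the reported percentages and the 223 confirmed cases, and the choice of the Wilson method, since the abstract does not say how its confidence intervals were computed. The bounds may therefore differ slightly from the published ones.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


confirmed = 223       # confirmed cases reported in the abstract
n_broad = 265         # ASSUMPTION: inferred denominator, broader variant
n_c541 = 260          # ASSUMPTION: inferred denominator, C54.1-only variant

ppv_broad = confirmed / n_broad
ppv_c541 = confirmed / n_c541
lo_broad, hi_broad = wilson_interval(confirmed, n_broad)

print(f"broader variant PPV: {ppv_broad:.1%} ({lo_broad:.1%} to {hi_broad:.1%})")
print(f"C54.1-only PPV: {ppv_c541:.1%}")
```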

  • Article type: Journal Article
    Objectives: This study aimed to assess the validity and reliability of a questionnaire on knowledge of, attitude toward, and practice (KAP) of the Case-mix and Diagnosis Related Group (DRG) system. Methods: A convenience sample of 238 health care providers from three public hospitals in Turkey was enrolled in a cross-sectional study from September 1 to November 30, 2012. The mean age was 38.63 years (standard deviation [SD] 10.52), ranging from 21 to 60 years. More than one-half were male (52.1%), nearly two-fifths were medical doctors (39.9%), one-third were nurses (33.2%), one-sixth were auxiliary staff (16.4%), and the remainder were coders (10.5%). Only one-third (33.6%) of respondents had attended a workshop or training program on the Case-mix or DRG system. After content validity was examined, factor analysis was conducted, internal consistency of the questionnaire was assessed with Cronbach's alpha, and test-retest reliability was evaluated. Results: Sampling adequacy for factor extraction was confirmed by the Kaiser-Meyer-Olkin test (0.915) and the Bartlett test (1052). Factor analysis showed three factors, attitude (36.43%), practice (23.39%), and knowledge (17%), which together explained 76.82% of the total variance. The reliability of each section of the questionnaire was as follows: knowledge (0.963), attitude (0.964), and practice (0.973). The overall Cronbach's alpha was 0.941, indicating excellent internal consistency. Conclusions: This study demonstrated that the designed questionnaire has high construct validity and reliability and can be used to measure KAP regarding the Case-mix and DRG system among health care staff in Turkey.
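
For readers unfamiliar with the internal-consistency statistic reported above, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response matrix is made up for illustration and has nothing to do with the KAP questionnaire data.

```python
from statistics import variance
from typing import List, Sequence


def cronbach_alpha(scores: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5 respondents x 3 items on a 1-5 scale.
responses: List[List[int]] = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 3, 2],
    [1, 2, 1],
]
print(round(cronbach_alpha(responses), 3))
```

When every item ranks respondents identically, the statistic reaches its maximum of 1; values near the 0.94-0.97 range reported above indicate that the items move together very closely.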

  • Article type: Journal Article
    We aimed to examine the reliability and validity of Chengdu pediatric emergency triage criteria in order to provide a reference for the development of pediatric emergency triage within other hospitals.
    We developed the Chengdu pediatric emergency triage criteria based on conditions/symptoms, vital signs, and the Pediatric Early Warning Score system within our hospital using the Delphi method in 2020. Simulation-scenario triage and real-life triage, conducted in our hospital from January to March 2021, and a retrospective study of triage records extracted from our hospital's health information system in February 2022 were used to measure the agreement in triage decisions between the triage nurses, and between the triage nurses and the expert team.
    For the 20 simulation cases, the Kappa value of triage decisions between the triage nurses was 0.6 (95% CI 0.352-0.849), and the Kappa value of triage decisions between the triage nurses and the expert team was 0.73 (95% CI 0.540-0.911). For the 252 cases in the real-life triage, the Kappa value of triage decisions between the triage nurses and the expert team was 0.824 (95% CI 0.680-0.962). For the 20,540 cases selected for the retrospective study of triage records, the Kappa value of triage decisions between the triage nurses was 0.702 (95% CI 0.691-0.713); that between Triage Nurse 1 and the expert team was 0.634 (95% CI 0.623-0.647); and that between Triage Nurse 2 and the expert team was 0.725 (95% CI 0.713-0.736). The overall agreement rate in triage decisions between the triage nurses and the expert team in the simulation scenario triage was 80%; that between the triage nurses and the expert team in the real-life triage was 97.6%; and that between the triage nurses in the retrospective study was 91.9%. In the retrospective study, the agreement rates in triage decisions between Triage Nurse 1 and the expert team, and between Triage Nurse 2 and the expert team, were 88.0% and 92.3%, respectively.
    The Chengdu pediatric emergency triage criteria developed within our hospital are reliable and valid and can promote rapid and effective triage by triage nurses.
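
The abstract above reports both Kappa values and overall agreement rates. The sketch below shows how the two differ, using unweighted Cohen's kappa for two raters. The abstract does not state whether its Kappa was weighted, so the unweighted form is an assumption, and the triage levels in the example are invented.

```python
from collections import Counter
from typing import Hashable, Sequence


def agreement_rate(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Share of cases on which the two raters chose the same category."""
    if len(a) != len(b) or not a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    observed = agreement_rate(a, b)
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal category frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical triage levels (1 = most urgent) assigned to eight cases.
nurse = [1, 2, 2, 3, 3, 3, 4, 4]
expert = [1, 2, 3, 3, 3, 3, 4, 3]

print(agreement_rate(nurse, expert))
print(cohen_kappa(nurse, expert))
```

Because kappa subtracts the agreement expected by chance, it is lower than the raw agreement rate whenever chance agreement is nonzero, which is why a 97.6% agreement rate can accompany a Kappa of 0.824.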

  • Article type: Journal Article
    This article presents key findings from a research project that evaluated the validity and probative value of cartridge-case comparisons under field-based conditions. Decisions provided by 228 trained firearm examiners across the US showed that forensic cartridge-case comparison is characterized by low error rates. However, inconclusive decisions constituted over one-fifth of all decisions rendered, complicating evaluation of the technique's ability to yield unambiguously correct decisions. Specifically, restricting evaluation to only the conclusive decisions of identification and elimination yielded true-positive and true-negative rates exceeding 99%, but incorporating inconclusives caused these values to drop to 93.4% and 63.5%, respectively. The asymmetric effect on the two rates occurred because inconclusive decisions were rendered six times more frequently for different-source than same-source comparisons. Considering probative value, which is a decision's usefulness for determining a comparison's ground-truth state, conclusive decisions predicted their corresponding ground-truth states with near perfection. Likelihood ratios (LRs) further showed that conclusive decisions greatly increase the odds of a comparison's ground-truth state matching the ground-truth state asserted by the decision. Inconclusive decisions also possessed probative value, predicting different-source status and having an LR indicating that they increase the odds of different-source status. The study also manipulated comparison difficulty by using two firearm models that produce dissimilar cartridge-case markings. The model chosen for being more difficult received more inconclusive decisions for same-source comparisons, resulting in a lower true-positive rate compared to the less difficult model. Relatedly, inconclusive decisions for the less difficult model exhibited more probative value, being more strongly predictive of different-source status.
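
The effect of inconclusive decisions on the true-positive and true-negative rates, and the likelihood ratio of an inconclusive decision, can be worked through with a short sketch. All counts below are made up so that the arithmetic is easy to follow; they are not the study's data, and the function name is hypothetical.

```python
from typing import Dict


def decision_rates(correct: int, inconclusive: int, wrong: int) -> Dict[str, float]:
    """Rates for one ground-truth state (same-source or different-source)."""
    conclusive = correct + wrong
    total = conclusive + inconclusive
    return {
        # Correct decisions among conclusive decisions only.
        "conclusive_only": correct / conclusive,
        # Correct decisions among all decisions, inconclusives included.
        "with_inconclusives": correct / total,
        "inconclusive_rate": inconclusive / total,
    }


# HYPOTHETICAL counts per 1000 comparisons of each kind.
same_source = decision_rates(correct=934, inconclusive=60, wrong=6)
diff_source = decision_rates(correct=635, inconclusive=360, wrong=5)

# LR of an inconclusive decision in favor of different-source status:
# P(inconclusive | different source) / P(inconclusive | same source).
lr_inconclusive = diff_source["inconclusive_rate"] / same_source["inconclusive_rate"]

print(same_source)
print(diff_source)
print(lr_inconclusive)
```

With these illustrative counts both conclusive-only rates stay above 99%, while including inconclusives pulls the different-source rate down much further than the same-source rate, because inconclusives are six times as frequent for different-source comparisons.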

  • Article type: Journal Article
    Many authors have used reporting checklists as an assessment tool to analyze the reporting quality of diverse types of evidence. We aimed to analyze the methodological approaches used by researchers assessing the reporting quality of evidence in randomized controlled trials, systematic reviews, and observational studies.
    We analyzed articles, published up to 18 July 2021, that assessed the reporting quality of evidence with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), CONsolidated Standards of Reporting Trials (CONSORT), or Strengthening the Reporting of Observational studies in Epidemiology (STROBE) checklists. We analyzed the methods used for assessing reporting quality.
    Among 356 analyzed articles, 293 (88%) investigated a specific thematic field. The CONSORT checklist (N = 225; 67%) was used most often, in its original, modified, or partial form, or as an extension. Numerical scores were given for adherence to checklist items in 252 articles (75%), of which 36 articles (11%) used various reporting quality thresholds. In 158 (47%) articles, predictors of adherence to the reporting checklist were analyzed. The most studied factor associated with adherence to the reporting checklist was the year of article publication (N = 82; 52%).
    The methodology used for assessing reporting quality of evidence varied considerably. The research community needs a consensus on a consistent methodology for assessing the quality of reporting.

  • Article type: Journal Article
    Childhood maltreatment is associated with the etiology and clinical course of bipolar disorder. Most studies employ retrospective maltreatment self-reports, which are vulnerable to bias, raising questions about their validity and reliability. This study examined the test-retest reliability over 10 years, the convergent validity, and the impact of current mood on retrospective reports of childhood maltreatment in a bipolar sample. Eighty-five participants with bipolar I disorder completed the Childhood Trauma Questionnaire (CTQ) and the Parental Bonding Instrument (PBI) at baseline. The Beck Depression Inventory and the Self Report Mania Inventory assessed depressive and manic symptoms, respectively. Fifty-three participants completed the CTQ at baseline and at 10-year follow-up. Good levels of convergent validity were observed between the CTQ and PBI. Correlations ranged from rs = -0.35 (CTQ emotional abuse and PBI paternal care) to rs = -0.65 (CTQ emotional neglect and PBI maternal care). Good agreement between CTQ reports at baseline and at 10-year follow-up was found (range: κ = 0.41 for physical neglect to κ = 0.83 for sexual abuse). Higher depression and mania scores were recorded among participants who reported abuse (but not neglect) compared to those without such reports. These findings support using this method in research and clinical practice, though current mood should be taken into account.

  • Article type: Journal Article
    The aims of this study were to examine the psychometric properties of the Brief Assessment of Impaired Cognition (BASIC) case-finding instrument in clinical settings focusing on (i) test-retest reliability, (ii) the discriminative validity of BASIC and its components for identification of Alzheimer disease (AD) dementia and non-AD dementia, and (iii) the association of expert clinical rating of cognitive status with BASIC performance.
    The test-retest reliability analysis was based on a sample of general practice patients (n = 59) retested with a mean interval of 19 days. Discriminative validity analyses and analysis of the association of cognitive status with BASIC performance were based on data from the primary validation study of BASIC in memory clinics.
    The test-retest reliability of BASIC was high (r = 0.861). No significant difference in discriminative validity was found for identification of AD dementia (sensitivity = 0.99, specificity = 0.98) and non-AD dementia (sensitivity = 0.90, specificity = 0.98). All components of BASIC contributed to the high discriminative validity of both AD and non-AD dementia. BASIC performance was significantly correlated with expert clinical rating of the cognitive status of patients. A crude staging model for cognitive status using BASIC score intervals had superior classification accuracy (70%) compared to a Mini-Mental State Examination (MMSE) score range-based model (58% accuracy).
    BASIC is a reliable and valid case-finding instrument for AD dementia and non-AD dementia in clinical settings. BASIC performance is significantly associated with the degree of cognitive impairment, and BASIC seems to be superior to MMSE for staging of impairment.

  • Article type: Journal Article
    BACKGROUND: Story recall is a simple and sensitive cognitive test that is commonly used to measure changes in episodic memory function in early Alzheimer disease (AD). Recent advances in digital technology and natural language processing methods make this test a candidate for automated administration and scoring. Multiple parallel test stimuli are required for higher-frequency disease monitoring.
    OBJECTIVE: This study aims to develop and validate a remote and fully automated story recall task, suitable for longitudinal assessment, in a population of older adults with and without mild cognitive impairment (MCI) or mild AD.
    METHODS: The "Amyloid Prediction in Early Stage Alzheimer's disease" (AMYPRED) studies recruited participants in the United Kingdom (AMYPRED-UK: NCT04828122) and the United States (AMYPRED-US: NCT04928976). Participants were asked to complete optional daily self-administered assessments remotely on their smart devices over 7 to 8 days. Assessments included immediate and delayed recall of 3 stories from the Automatic Story Recall Task (ASRT), a test with multiple parallel stimuli (18 short stories and 18 long stories) balanced for key linguistic and discourse metrics. Verbal responses were recorded and securely transferred from participants' personal devices and automatically transcribed and scored using text similarity metrics between the source text and retelling to derive a generalized match score. Group differences in adherence and task performance were examined using logistic and linear mixed models, respectively. Correlational analysis examined parallel-forms reliability of ASRTs and convergent validity with cognitive tests (Logical Memory Test and Preclinical Alzheimer's Cognitive Composite with semantic processing). Acceptability and usability data were obtained using a remotely administered questionnaire.
    RESULTS: Of the 200 participants recruited in the AMYPRED studies, 151 (75.5%), comprising 78 cognitively unimpaired (CU) participants and 73 with MCI or mild AD, engaged in optional remote assessments. Adherence to daily assessment was moderate and did not decline over time but was higher in CU participants (ASRTs were completed each day by 73/106 [68.9%] participants with MCI or mild AD and 78/94 [83%] CU participants). Participants reported favorable task usability: infrequent technical problems, easy use of the app, and a broad interest in the tasks. Task performance improved modestly across the week and was better for immediate recall. The generalized match scores were lower in participants with MCI or mild AD (Cohen d=1.54). Parallel-forms reliability of ASRT stories was moderate to strong for immediate recall (mean rho=0.73, range=0.56-0.88) and delayed recall (mean rho=0.73, range=0.54-0.86). The ASRTs showed moderate convergent validity with established cognitive tests.
    CONCLUSIONS: The unsupervised, self-administered ASRT task is sensitive to cognitive impairments in MCI and mild AD. The task showed good usability, high parallel-forms reliability, and high convergent validity with established cognitive tests. Remote, low-cost, low-burden, and automatically scored speech assessments could support diagnostic screening, health care, and treatment monitoring.