Effect size

  • Article type: Journal Article
    Absence of statistical significance (i.e., p > 0.05) in the results of a frequentist test comparing two samples is often used as evidence of absence of difference, or absence of effect of a treatment, on the measured variable. Such conclusions are often wrong because absence of significance may merely result from a sample size that is too small to reveal an effect. To conclude that there is no meaningful effect of a treatment/condition, it is necessary to use an appropriate statistical approach. For frequentist statistics, a simple tool for this goal is the 'two one-sided t-test,' a form of equivalence test that relies on the a priori definition of a minimal difference considered to be relevant. In other words, the smallest effect size of interest should be established in advance. We present the principles of this test and give examples where it allows correct interpretation of the results of a classical t-test assuming absence of difference. Equivalence tests are also very useful in probing whether certain significant results are also biologically meaningful, because when comparing large samples it is possible to find significant results in both an equivalence test and a two-sample t-test that assumes no difference as the null hypothesis.
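    The equivalence logic described in this abstract can be sketched in a few lines. The version below is a minimal illustration, not the authors' procedure: it assumes a large-sample normal approximation in place of the t distribution, and the function name `tost_equivalence` and the `margin` parameter (the smallest effect size of interest, fixed in advance) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def tost_equivalence(x, y, margin, alpha=0.05):
    """Two one-sided tests for equivalence of two means within +/- margin.

    Assumption: p-values use the normal approximation, so this is only a
    large-sample stand-in for the t-based two one-sided t-test.
    """
    diff = mean(x) - mean(y)
    # Standard error of the difference, unequal variances allowed.
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    z = NormalDist()
    # Null 1: diff <= -margin. Small p means diff is credibly above -margin.
    p_lower = 1.0 - z.cdf((diff + margin) / se)
    # Null 2: diff >= +margin. Small p means diff is credibly below +margin.
    p_upper = z.cdf((diff - margin) / se)
    # Equivalence is claimed only if BOTH one-sided nulls are rejected.
    p_tost = max(p_lower, p_upper)
    return {"diff": diff, "p_tost": p_tost, "equivalent": p_tost < alpha}


# Hypothetical data: two samples with identical means.
sample_a = [i % 10 for i in range(200)]
sample_b = [i % 10 for i in range(200)]
print(tost_equivalence(sample_a, sample_b, margin=1.0))
```

    With the same data, a wide margin supports equivalence while a very narrow margin does not, which is why the margin has to be chosen before looking at the results.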

  • Article type: Systematic Review
    It has been recognized that HIV-related stigma hinders efforts in testing, treatment, and prevention. In this systematic review, we aimed to summarize available findings on the association between HIV-related stigma and age, social support, educational status, depression, employment status, wealth index, gender, residence, knowledge about HIV, marital status, duration since diagnosis, and disclosure status using a large number of studies.
    Electronic databases including Scopus, Medline/PubMed, Web of Science (WOS), Cochrane Library, Google Scholar, and Open Research Dataset Challenge were systematically searched until 15 April 2023. We included all kinds of HIV-stigma studies, regardless of language, publishing date, or geographic location. The inclusion criteria were met by 40 studies, with a total of 171,627 patients. A mixed-effects model was used to pool estimates and evaluate publication bias, as well as to conduct sensitivity analysis.
    Factors such as older age, social support, greater education, higher socioeconomic status, good knowledge of HIV, and more years living with HIV significantly lowered the likelihood of HIV-related stigma. Conversely, factors such as depression, residing in rural areas, female gender, and non-disclosure of HIV status were significantly associated with a high risk of HIV-related stigma.
    To combat systemic HIV-associated stigma, it is crucial to develop wholesome and comprehensive social methods by raising community-level HIV awareness. In addition to activism, local economic development is also crucial for creating thriving communities with a strong social fabric.

  • Article type: Journal Article
    A nonparametric method proposed by DeLong et al. in 1988 for comparing areas under correlated receiver operating characteristic curves is used widely in practice. However, the DeLong method as implemented in popular software quietly deletes individuals with any missing values, yielding potentially invalid and/or inefficient results. We simplify the DeLong algorithm using ranks and extend it to accommodate missing data by using a mixed model approach for multivariate data. Simulation results demonstrate the validity and efficiency of our procedure for data missing at random. We illustrate our proposed procedure in SAS, Stata, and R using the original DeLong data.
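    As background for the rank-based view mentioned here, the area under a single ROC curve can be computed from midranks through the Mann-Whitney relationship. The sketch below shows only that building block; it is not the authors' simplified DeLong algorithm, it does not compare correlated curves, and it does not handle missing data. The helper names `midranks` and `auc_from_ranks` are hypothetical.

```python
def midranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def auc_from_ranks(scores, labels):
    """AUC via the Mann-Whitney statistic; labels are 1 (case) or 0 (control)."""
    ranks = midranks(scores)
    n_pos = sum(1 for label in labels if label == 1)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, label in zip(ranks, labels) if label == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


# Hypothetical scores: three of the four case/control pairs are ordered correctly.
print(auc_from_ranks([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```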

  • Article type: Journal Article
    BACKGROUND: The proper interpretation of a study's results requires both excellent understanding of good methodological practices and deep knowledge of prior results, aided by the availability of effect sizes.
    METHODS: This review takes the form of an expository essay exploring the complex and nuanced relationships among statistical significance, clinical importance, and effect sizes.
    RESULTS: Careful attention to study design and methodology will increase the likelihood of obtaining statistical significance and may enhance the ability of investigators/readers to accurately interpret results. Measures of effect size show how well the variables used in a study account for/explain the variability in the data. Studies reporting strong effects may have greater practical value/utility than studies reporting weak effects. Effect sizes need to be interpreted in context. Verbal summary characterizations of effect sizes (e.g., "weak", "strong") are fundamentally flawed and can lead to inappropriate characterization of results. Common language effect size (CLES) indicators are a relatively new approach to effect sizes that may offer a more accessible interpretation of results that can benefit providers, patients, and the public at large.
    CONCLUSIONS: It is important to convey research findings in ways that are clear to both the research community and the public. At a minimum, this requires inclusion of standard effect size data in research reports. Proper selection of measures and careful design of studies are foundational to the interpretation of a study's results. The ability to draw useful conclusions from a study is increased when investigators enhance the methodological quality of their work.
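    To make the common language effect size concrete, one widely used nonparametric form is the probability that a randomly chosen score from one group exceeds a randomly chosen score from the other, with ties counted as half. The sketch below assumes that form; the abstract does not say which CLES indicator the review favours, and the function name is hypothetical.

```python
def common_language_effect_size(group_a, group_b):
    """Probability that a random value from group_a exceeds one from group_b.

    Ties contribute 0.5, so two identical groups give exactly 0.5.
    """
    total = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                total += 1.0
            elif a == b:
                total += 0.5
    return total / (len(group_a) * len(group_b))


# Hypothetical outcome scores for a treated and a control group.
treated = [3, 4, 5]
control = [1, 2, 3]
print(common_language_effect_size(treated, control))
```

    A value near 0.94 can be read as "in about 94 of 100 random pairings the treated score is higher", which is the kind of plain-language statement the abstract argues for.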

  • Article type: Journal Article
    OBJECTIVE: Temporomandibular disorder (TMD) is the term used to describe a pathology (dysfunction and pain) in the masticatory muscles and temporomandibular joint (TMJ). There is an apparent upward trend in the publication of dental research and a need to continually improve the quality of research. Therefore, this study was conducted to analyse the use of sample size and effect size calculations in TMD randomised controlled trials.
    METHODS: The search period was restricted to five full years, i.e., papers published in 2019, 2020, 2021, 2022, and 2023. The article-type filter "Randomized Controlled Trial" was applied. The studies were graded on a two-level scale (0-1), with 1 assigned when sample size (SS) and effect size (ES) were calculated.
    RESULTS: In the entire study sample, SS was used in 58% of studies, while ES was used in 15% of studies.
    CONCLUSIONS: Quality should improve as research increases. One factor that influences quality is the level of statistics. SS and ES calculations provide a basis for understanding the results obtained by the authors. Access to formulas, online calculators, and software facilitates these analyses. High-quality trials provide a solid foundation for medical progress, fostering the development of personalized therapies that provide more precise and effective treatment and increase patients' chances of recovery. Improving the quality of TMD research, and medical research in general, helps to increase public confidence in medical advances and raises the standard of patient care.

  • Article type: Journal Article
    Background: There is limited knowledge regarding comparative patient-reported outcome measures (PROMs) and effect sizes (ESs) across elective orthopedic surgery.
    Methods: All patient data between January 2020 and December 2022 were collected, and treatment outcomes were assessed as the PROM difference between baseline and one-year follow-up. The cohort was divided into subgroups (hand, elbow, shoulder, spine, hip, knee, and foot/ankle). The PROM ESs were calculated for each patient separately, and patients with ES > 0.5 were considered responders.
    Results: In total, 7695 patients were operated on. The mean ES across all patient groups was 1.81 (SD 1.41); the largest ES was observed in shoulder patients and the smallest in hand patients. Overall, shoulder, hip, and knee patients had a larger ES than hand, spine, and foot/ankle patients (p < 0.0001). The proportion of positive responders was 91-94% in the knee, shoulder, and hip subgroups and 69-70% in the hand, spine, and foot/ankle subgroups.
    Conclusions: The ESs are generally high throughout elective orthopedic surgery. However, based on our institutional observations, shoulder, hip, and knee patients experience larger treatment effects than hand, spine, and foot/ankle patients, among whom there are also more non-responders. The expected treatment outcomes should be clearly communicated to patients when considering elective surgery. Because of the study limitations, the results should be approached with some caution.
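    The responder rule in this abstract (ES > 0.5 per patient) can be illustrated with a short sketch. The abstract does not state how the per-patient ES was standardised, so the code below assumes the individual change score divided by the baseline standard deviation of the group, and assumes that higher PROM scores mean better outcomes. The function name and the data are hypothetical.

```python
from statistics import stdev


def responder_rate(baseline, follow_up, threshold=0.5):
    """Per-patient effect sizes and the share of patients above the threshold.

    Assumption: ES_i = (follow_up_i - baseline_i) / SD(baseline scores).
    """
    sd_baseline = stdev(baseline)
    effect_sizes = [(f - b) / sd_baseline for b, f in zip(baseline, follow_up)]
    responders = sum(1 for es in effect_sizes if es > threshold)
    return effect_sizes, responders / len(effect_sizes)


# Hypothetical PROM scores for four patients at baseline and one year.
baseline_scores = [10, 20, 30, 40]
one_year_scores = [20, 22, 50, 41]
effect_sizes, rate = responder_rate(baseline_scores, one_year_scores)
print([round(es, 2) for es in effect_sizes], rate)
```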

  • Article type: Journal Article
    Restricted mean survival time is the expected duration of survival up to a chosen time of restriction τ. For comparison studies, the difference in restricted mean survival times between two groups provides a summary measure of the treatment effect that is free of assumptions regarding the relative shape of the two survival curves, such as proportional hazards. However, it can be difficult to judge the magnitude of the effect from a comparison of restricted means due to the truncation of observation at time τ.
    In this article, we describe additional ways of expressing the treatment effect based on restricted means that can be helpful in this regard. These include the ratio of restricted means, the ratio of life-years (or time) lost, and the average integrated difference between the survival curves, equal to the difference in restricted means divided by τ. These alternative metrics are straightforward to calculate and provide a means for scaling the effect size as an aid to interpretation. Examples from two randomized, multicenter clinical trials in prostate cancer, NRG/RTOG 0521 and NRG/RTOG 0534, with primary endpoints of overall survival and biochemical/radiological progression-free survival, respectively, are presented to illustrate the ideas.
    The four effect measures (restricted mean survival time difference, restricted mean survival time ratio, time lost ratio, and average survival rate difference) were 0.45 years, 1.05, 0.81, and 0.038 for RTOG 0521 and 1.36 years, 1.17, 0.56, and 0.12 for RTOG 0534 with τ = 12 and 11 years, respectively. Thus, for example, the 0.45-year difference in the first trial translates into a 19% reduction in time lost and a 3.8% average absolute difference between the survival curves over the 12-year horizon, a modest effect size, whereas the 1.36-year difference in the second trial corresponds to a 44% reduction in time lost and a 12% absolute survival difference, a rather large effect.
    In addition to the difference in restricted mean survival times, these alternative measures can be helpful in determining whether the magnitude of the treatment effect is clinically meaningful.
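    The four measures named in this abstract follow directly from the two restricted means and τ. The sketch below assumes that "time lost" in an arm is τ minus that arm's restricted mean; the function name is hypothetical, and the input values are illustrative only, since the abstract does not report the per-arm restricted means of the two trials.

```python
def rmst_effect_measures(rmst_treatment, rmst_control, tau):
    """Four summaries of a treatment effect based on restricted means up to tau."""
    if not (0 < rmst_control < tau and 0 < rmst_treatment <= tau):
        raise ValueError("restricted means must be positive and bounded by tau")
    difference = rmst_treatment - rmst_control
    return {
        "rmst_difference": difference,
        "rmst_ratio": rmst_treatment / rmst_control,
        # Assumed definition: time lost in an arm = tau - restricted mean.
        "time_lost_ratio": (tau - rmst_treatment) / (tau - rmst_control),
        # Average vertical gap between the two survival curves on [0, tau].
        "average_survival_difference": difference / tau,
    }


# Illustrative (made-up) restricted means over a 10-year horizon.
print(rmst_effect_measures(rmst_treatment=9.0, rmst_control=8.0, tau=10.0))
```

    In this made-up case a 1-year difference corresponds to a time lost ratio of 0.5, that is, a 50% reduction in time lost, and a 10% average absolute difference between the curves.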

  • Article type: Journal Article
    The statistical significance of a clinical trial analysis result is determined by a mathematical calculation and probability based on null hypothesis significance testing. However, statistical significance does not always align with meaningful clinical effects; thus, assigning clinical relevance to statistical significance is unreasonable. A statistical result incorporating a clinically meaningful difference is a better approach to present statistical significance. Thus, the minimal clinically important difference (MCID), which requires integrating minimum clinically relevant changes from the early stages of research design, has been introduced. As a follow-up to the previous statistical round article on P values, confidence intervals, and effect sizes, in this article, we present hands-on examples of MCID and various effect sizes and discuss the terms statistical significance and clinical relevance, including cautions regarding their use.

  • Article type: Journal Article
    Background: Small-group discussions (SGDs) are pivotal in medical education, facilitating the development of critical thinking, communication skills, and teamwork. However, traditional SGDs face challenges such as scalability and maintaining student engagement. This study aims to evaluate the "Distribute, Discuss, and Develop" (3D) method for enhancing learning outcomes in medical education.
    Methods: A single-blinded interventional study was conducted with 125 first-year Bachelor of Medicine and Bachelor of Surgery students, who were divided into intervention and control groups through random assignment. The intervention group employed the 3D method across two thematic units: hematology and muscle nerve physiology. The study assessed learning outcomes using pre- and posttests, class-average normalized gain ("g"), and feedback questionnaires to capture student perceptions of interaction, communication enhancement, and session summarization.
    Results: The intervention group showed significantly improved learning outcomes in both thematic units, with larger effect sizes (hematology: 1.55; muscle nerve physiology: 1.4) compared to the control group. The normalized gain "g" indicated a medium effectiveness level for the intervention group in both themes, suggesting enhanced learning. Feedback questionnaires revealed higher satisfaction levels within the intervention group regarding interaction, communication skills, and session summarization.
    Conclusions: The 3D method addresses the challenges faced by traditional SGDs, providing a scalable and engaging approach to medical education. By fostering more effective student-centered learning, the method enhances the comprehension of complex physiological concepts and improves communication skills. The 3D method significantly improves learning outcomes, interaction, and communication skills in medical education. This innovative approach to SGDs offers a promising strategy for enhancing the educational experience in medical schools, supporting the development of more articulate and professionally competent medical graduates.
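    The class-average normalized gain mentioned in this abstract is commonly computed as the achieved improvement divided by the maximum possible improvement. The sketch below assumes that definition and the conventional cut-offs (below 0.3 low, 0.3 to below 0.7 medium, 0.7 and above high); the abstract does not confirm which thresholds the study used, and the function names and scores are hypothetical.

```python
def normalized_gain(pre_percent, post_percent):
    """Class-average normalized gain from mean pre- and posttest scores (0-100)."""
    if not 0 <= pre_percent < 100:
        raise ValueError("pretest average must be in [0, 100)")
    return (post_percent - pre_percent) / (100.0 - pre_percent)


def gain_level(g):
    """Label a gain using the conventional (assumed) 0.3 and 0.7 cut-offs."""
    if g >= 0.7:
        return "high"
    if g >= 0.3:
        return "medium"
    return "low"


# Hypothetical class averages: 40% before and 70% after the sessions.
g = normalized_gain(40.0, 70.0)
print(g, gain_level(g))
```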

  • Article type: Clinical Trial, Phase III
    To illustrate how (standardised) effect sizes (ES) vary based on calculation method and to provide considerations for improved reporting.
    Data from three trials of tanezumab in subjects with osteoarthritis were analyzed. ES of tanezumab versus comparator for WOMAC Pain (outcome) was defined as the least squares difference between means (mixed model for repeated measures analysis) divided by a pooled standard deviation (SD) of outcome scores. Three approaches to computing the SD were evaluated: Baseline (the pooled SD of WOMAC Pain values at baseline [pooled across treatments]); Endpoint (the pooled SD of these values at the time primary endpoints were assessed); and Median (the median pooled SD of these values based on the pooled SDs across available timepoints). Bootstrap analyses were used to compute 95% confidence intervals (CI).
    ES (95% CI) of tanezumab 2.5 mg based on Baseline, Endpoint, and Median SDs in one study were -0.416 (-0.796, -0.060), -0.195 (-0.371, -0.028), and -0.196 (-0.373, -0.028), respectively; negative values indicate pain improvement. This pattern of ES differences (largest with Baseline SD, smallest with Endpoint SD, Median SD similar to Endpoint SD) was consistent across all studies and doses of tanezumab.
    Differences in ES affect interpretation of treatment effect. Therefore, we advocate clearly reporting individual elements of ES in addition to its overall calculation. This is particularly important when ES estimates are used to determine sample sizes for clinical trials, as larger ES will lead to smaller sample sizes and potentially underpowered studies.
    ClinicalTrials.gov NCT02697773, NCT02709486, and NCT02528188.
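    The dependence of a standardised ES on the chosen SD, and its knock-on effect on planned sample size, can be shown with a small sketch. All numbers below are made up for illustration and are not trial data. The sample-size function assumes the usual normal-approximation formula for a two-sided, two-group comparison of means, which is not necessarily the method used for the trials, and the function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def standardized_effect_size(mean_difference, sd):
    """Mean difference expressed in units of the chosen SD."""
    return mean_difference / sd


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate n per arm: 2 * (z_(1-alpha/2) + z_power)^2 / ES^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / effect_size**2)


# Made-up example: the same mean difference scaled by two different SDs.
mean_difference = -0.8
sd_baseline = pooled_sd(2.0, 100, 2.0, 100)  # narrower spread at baseline
sd_endpoint = pooled_sd(4.0, 100, 4.0, 100)  # wider spread at endpoint
for label, sd in (("baseline", sd_baseline), ("endpoint", sd_endpoint)):
    es = standardized_effect_size(mean_difference, sd)
    print(label, round(es, 3), n_per_group(es))
```

    Here the baseline SD yields an ES twice as large in magnitude as the endpoint SD, and a planned sample size roughly one quarter as large, which is the underpowering risk the abstract warns about.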