statistical significance

  • Article type: Journal Article
    BACKGROUND: The proper interpretation of a study's results requires both an excellent understanding of good methodological practices and deep knowledge of prior results, aided by the availability of effect sizes.
    METHODS: This review takes the form of an expository essay exploring the complex and nuanced relationships among statistical significance, clinical importance, and effect sizes.
    RESULTS: Careful attention to study design and methodology will increase the likelihood of obtaining statistical significance and may enhance the ability of investigators and readers to interpret results accurately. Measures of effect size show how well the variables used in a study account for the variability in the data. Studies reporting strong effects may have greater practical value than studies reporting weak effects. Effect sizes need to be interpreted in context. Verbal summary characterizations of effect sizes (e.g., "weak", "strong") are fundamentally flawed and can lead to inappropriate characterization of results. Common language effect size (CLES) indicators are a relatively new approach to effect sizes that may offer a more accessible interpretation of results, to the benefit of providers, patients, and the public at large.
    CONCLUSIONS: It is important to convey research findings in ways that are clear to both the research community and the public. At a minimum, this requires inclusion of standard effect size data in research reports. Proper selection of measures and careful design of studies are foundational to the interpretation of a study's results. The ability to draw useful conclusions from a study increases when investigators enhance the methodological quality of their work.
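    To make the CLES idea concrete, the following is a minimal sketch of one common reading of it, the "probability of superiority": the chance that a randomly chosen score from one group exceeds a randomly chosen score from the other. The function name and the scores are hypothetical and invented for illustration; the abstract does not specify a particular CLES formula.

```python
from itertools import product


def common_language_effect_size(treatment, control):
    """Probability that a random treatment score exceeds a random control score.

    Computed by brute-force counting over all pairs; ties count as half a win.
    This is the nonparametric "probability of superiority" form (an assumption,
    not a formula taken from the abstract).
    """
    wins = 0.0
    for t, c in product(treatment, control):
        if t > c:
            wins += 1.0
        elif t == c:
            wins += 0.5
    return wins / (len(treatment) * len(control))


# Hypothetical outcome scores, invented purely for illustration.
treatment = [12, 15, 14, 18, 16]
control = [11, 13, 12, 14, 10]

cles = common_language_effect_size(treatment, control)
print(f"A random treated participant outscores a random control "
      f"{cles:.0%} of the time")
```

    A statement of this kind ("a treated participant outscores a control in roughly X of 100 pairings") is the sort of plain-language summary the abstract suggests may be easier for patients and the public to interpret than a standardized difference.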

  • Article type: Journal Article
    The genomic evaluation process relies on the assumption of linkage disequilibrium between dense single-nucleotide polymorphism (SNP) markers at the genome level and quantitative trait loci (QTL). The present study was conducted with the aim of evaluating four frequentist methods, namely Ridge Regression, Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, and Genomic Best Linear Unbiased Prediction (GBLUP), and five Bayesian methods, namely Bayes Ridge Regression (BRR), Bayes A, Bayesian LASSO, Bayes C, and Bayes B, in genomic selection using simulated data. Differences in prediction accuracy were assessed pairwise based on statistical significance (p-values from the t test and the Mann-Whitney U test) and practical significance (Cohen's d effect size). For this purpose, the data were simulated under two scenarios with different marker densities (4000 and 8000 in the whole genome). The simulated data comprised a genome with four chromosomes of 1 Morgan each, on which 100 randomly distributed QTL and two different densities of evenly distributed SNPs (1000 and 2000) were considered, at a heritability level of 0.4. For the frequentist methods other than GBLUP, the regularization parameter λ was calculated using five-fold cross-validation. In both scenarios, among the frequentist methods, the highest prediction accuracy was observed for Ridge Regression and GBLUP. The lowest and the highest bias were shown by Ridge Regression and GBLUP, respectively. Among the Bayesian methods, Bayes B and BRR showed the highest and the lowest prediction accuracy, respectively. The lowest bias in both scenarios was registered by Bayesian LASSO, and the highest bias in the first and the second scenario was shown by BRR and Bayes B, respectively. Across all the studied methods in both scenarios, the highest accuracy was shown by Bayes B and the lowest by LASSO and Elastic Net. As expected, the greatest similarity in performance was observed between GBLUP and BRR (d = 0.007 in the first scenario and d = 0.003 in the second scenario). The results obtained from the parametric t test and the non-parametric Mann-Whitney U test were similar. Out of 36 t tests between the performance of the studied methods in each scenario, 14 (P < .001) and 2 (P < .05) comparisons were significant in the first and the second scenario, respectively, which indicates that as the number of predictors increases, the difference in performance between methods decreases. This was supported by Cohen's d effect size: as model complexity increased, the effect sizes were not very large. The regularization parameters of frequentist methods should be optimized by cross-validation before these methods are used in genomic evaluation.
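    The figure of 36 t tests follows from comparing the nine methods pairwise, since C(9, 2) = 36. The sketch below checks that count and shows a pooled-standard-deviation Cohen's d for one pair. The accuracy values are hypothetical replicate results invented for illustration, and the pooled-SD form of d is an assumption, as the abstract does not state which variant was used.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# The nine methods named in the abstract: four frequentist, five Bayesian.
methods = ["Ridge Regression", "LASSO", "Elastic Net", "GBLUP",
           "BRR", "Bayes A", "Bayesian LASSO", "Bayes C", "Bayes B"]

pairs = list(combinations(methods, 2))
n_pairs = len(pairs)
print(f"{len(methods)} methods give {n_pairs} pairwise comparisons")

# Hypothetical prediction accuracies over five replicates (not from the study).
acc_method_a = [0.60, 0.62, 0.61, 0.63, 0.59]
acc_method_b = [0.55, 0.57, 0.56, 0.58, 0.54]

d = cohens_d(acc_method_a, acc_method_b)
print(f"Cohen's d for the hypothetical pair: {d:.2f}")
```

    Reporting d next to each p-value, as the study does, separates "is the difference detectable" from "is the difference large enough to matter", which is why near-identical methods such as GBLUP and BRR can be summarized by a d close to zero.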

  • Article type: Journal Article
    In this work, we propose a novel method for constructing whole-brain spatio-temporal multilayer functional connectivity networks (FCNs) and four innovative rich-club metrics.
    Spatio-temporal multilayer FCNs achieve a high-order representation of the spatio-temporal dynamic characteristics of brain networks by combining the sliding time window method with graph theory and hypergraph theory. The four proposed rich-club metrics are based on dynamic changes in rich-club node identity and provide a parameterized description of the topological dynamic characteristics of brain networks from both temporal and spatial perspectives. The proposed method was validated in three independent differential analysis experiments: male-female gender difference analysis, analysis of abnormality in patients with autism spectrum disorders (ASD), and individual difference analysis.
    The proposed method yielded results consistent with previous relevant studies and revealed some new findings. For instance, the dynamic topological characteristics of specific white matter regions effectively reflected individual differences. Abnormally increased internal functional connectivity within the basal ganglia may be a contributing factor to the occurrence of repetitive or restrictive behaviors in ASD patients.
    The proposed methodology provides an effective approach for constructing whole-brain spatio-temporal multilayer FCNs and analyzing their dynamic topological structures. The dynamic topological characteristics of spatio-temporal multilayer FCNs may offer new insights into physiological variations and pathological abnormalities in neuroscience.

  • Article type: Journal Article
    This article traces the development of Creative Nursing from its origin in 1981 as a newsletter about Primary Nursing to its current position as a quarterly international, interdisciplinary, peer-reviewed, indexed, themed journal that continues to nurture novice authors, welcome international submissions, review articles that other journals won't consider, and address subjects that many journals avoid. Future directions include content in multiple languages, new author guidelines that invite submissions of research methods papers, moving beyond statistical significance based on p-value thresholds, asking authors to make explicit the implications for knowledge translation in their papers, and thinking creatively about how artificial intelligence can be leveraged for research, education, and practice.

  • Article type: Journal Article
    This research analyzed the real-world NOx and particle number (PN) emissions of 21 China VI heavy-duty diesel trucks (HDDTs). On-road emission conformity was first evaluated with a portable emission measurement system (PEMS). Only 76.19%, 71.43% and 61.90% of the vehicles passed the NOx test, the PN test and both tests, respectively. The impacts of vehicle features, including exhaust gas recirculation (EGR) equipment, mileage and tractive tonnage, were then assessed. Results demonstrated that EGR helped reduce NOx emission factors (EFs) while increasing PN EFs. Larger mileages and tractive tonnages corresponded to higher NOx and PN EFs, respectively. In-depth analyses of the influence of operating conditions on emissions were conducted with both numerical comparisons and statistical tests. Results showed that, in general, HDDTs generated higher NOx EFs at low speeds or large vehicle specific powers (VSPs), and higher PN EFs at high speeds or small VSPs. In addition, unqualified vehicles generated significantly higher NOx EFs than qualified vehicles on freeways or at speeds ≥ 40 km/h, and significantly higher PN EFs on suburban roads, on freeways, or in operating modes with positive VSPs. The reliability and accuracy of on-board diagnostic (OBD) NOx data were finally investigated. Results revealed that 43% of the test vehicles did not report reliable OBD data. Correlation analyses between OBD NOx and PEMS measurements further demonstrated that the consistency of instantaneous concentrations was generally low. However, sliding-window averaged concentrations showed better correlations; e.g., the Pearson correlation coefficients for 20 s window averaged concentrations exceeded 0.85 for most vehicles. The research results provide valuable insights for emission regulation, e.g., focusing more on medium- to high-speed operations to identify unqualified vehicles, setting higher standards to improve the quality of OBD data, and adopting window-averaged OBD NOx concentrations in evaluating vehicle emission performance.
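    The contrast between instantaneous and window-averaged agreement can be sketched as follows. This is an illustrative toy, not the study's procedure: the two signals are synthetic, the 1 Hz sampling rate (so that 20 samples span 20 s) is an assumption, and the helper names are hypothetical.

```python
from math import sqrt


def moving_average(values, window):
    """Mean of each run of `window` consecutive samples."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Synthetic "PEMS" trace: a slow sawtooth (assumed 1 Hz sampling, 200 s).
pems = [i % 37 for i in range(200)]
# Synthetic "OBD" trace: same trend plus fast alternating sensor jitter.
obd = [p + (10 if i % 2 == 0 else -10) for i, p in enumerate(pems)]

r_instant = pearson_r(pems, obd)

window = 20  # 20 samples = 20 s under the assumed 1 Hz rate
r_window = pearson_r(moving_average(pems, window), moving_average(obd, window))

print(f"instantaneous r = {r_instant:.2f}, 20 s window r = {r_window:.2f}")
```

    Averaging suppresses sample-to-sample jitter while preserving the slower trend, which is one plausible reason window-averaged OBD values can track PEMS measurements more closely than instantaneous ones.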

  • Article type: Journal Article
    Deep learning (DL) has demonstrated its innate capacity to independently learn hierarchical features from complex and multi-dimensional data. A common understanding is that its performance scales up with the amount of training data. However, the data must also exhibit variety to enable improved learning. In medical imaging data, semantic redundancy, which is the presence of similar or repetitive information, can occur due to the presence of multiple images that have highly similar presentations for the disease of interest. Also, the common use of augmentation methods to generate variety in DL training could limit performance when indiscriminately applied to such data. We hypothesize that semantic redundancy would therefore tend to lower performance and limit generalizability to unseen data, and we question its impact on classifier performance even with large data. We propose an entropy-based sample scoring approach to identify and remove semantically redundant training data, and we demonstrate using the publicly available NIH chest X-ray dataset that the model trained on the resulting informative subset of training data significantly outperforms the model trained on the full training set during both internal testing (recall: 0.7164 vs 0.6597, p < 0.05) and external testing (recall: 0.3185 vs 0.2589, p < 0.05). Our findings emphasize the importance of information-oriented training sample selection as opposed to the conventional practice of using all available training data.

  • Article type: Journal Article
    Null hypothesis significance testing (NHST) is the dominant statistical approach in the geriatric and rehabilitation fields. However, NHST is routinely misunderstood or misused. In such cases, findings from clinical trials are taken as evidence of no effect when, in fact, a clinically relevant question may have a "non-significant" p-value. Conversely, findings are considered clinically relevant when significant differences are observed between groups. Given that the p-value is not the sole indicator of an association or of the existence of an effect, researchers should be encouraged to report other statistical approaches, such as Bayesian analysis, and complementary statistical tools alongside the p-value (e.g., effect size, confidence intervals, minimal clinically important difference, and magnitude-based inference) to improve the interpretation of clinical trial findings by presenting a more efficient and comprehensive analysis. However, the focus on Bayesian analysis and secondary statistical analyses does not mean that NHST is less important. Rather, to observe a real intervention effect, researchers should use a combination of secondary statistical analyses in conjunction with NHST or Bayesian statistical analysis to reveal what p-values cannot show in geriatric and rehabilitation studies (e.g., the clinical importance of a 1 kg increase in handgrip strength in an intervention group of long-lived older adults compared with a control group). This paper provides potential insights for improving the interpretation of scientific data in the rehabilitation and geriatric fields by using Bayesian and secondary statistical analyses to better scrutinize the results of clinical trials in which a p-value alone may not be appropriate for determining the efficacy of an intervention.

  • Article type: Journal Article
    BACKGROUND: Fragility analysis is a method of further characterizing outcomes in terms of the stability of statistical findings. This study assesses the statistical fragility of recent randomized controlled trials (RCTs) evaluating robotic-assisted versus conventional total knee arthroplasty (RA-TKA versus C-TKA).
    METHODS: We queried PubMed for RCTs comparing alignment, function, and outcomes between RA-TKA and C-TKA. Fragility index (FI) and reverse fragility index (RFI) (collectively, "FI") were calculated for dichotomous outcomes as the number of outcome reversals needed to change statistical significance. Fragility quotient (FQ) was calculated by dividing the FI by the sample size for that outcome event. Median FI and FQ were calculated for all outcomes collectively as well as for each individual outcome. Subanalyses were performed to assess FI and FQ based on outcome event type and statistical significance, as well as study loss to follow-up and year of publication.
    RESULTS: The overall median FI was 3.0 (interquartile range [IQR], 1.0 to 6.3) and the median RFI was 3.0 (IQR, 2.0 to 4.0). The overall median FQ was 0.027 (IQR, 0.012 to 0.050). Loss to follow-up was greater than the FI for 23 of the 38 outcomes assessed.
    CONCLUSIONS: A small number of alternative outcomes is often enough to reverse the statistical significance of findings in RCTs evaluating dichotomous outcomes in RA-TKA versus C-TKA. We recommend reporting FI and FQ alongside P values to improve the interpretability of RCT results.
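    The FI and FQ definitions above can be illustrated with a minimal sketch. It assumes a two-sided Fisher's exact test at α = 0.05 and the common convention of adding events one at a time to the arm with fewer events; the abstract does not state which test or convention the authors used, and the trial counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Outcome reversals needed to lose significance; None if not significant."""
    def p_value(ea, eb):
        return fisher_exact_two_sided(ea, n_a - ea, eb, n_b - eb)

    if p_value(events_a, events_b) >= alpha:
        return None  # the reverse fragility index would apply instead
    ea, eb, flips = events_a, events_b, 0
    modify_a = events_a <= events_b  # assumed convention: arm with fewer events
    while p_value(ea, eb) < alpha:
        if modify_a and ea < n_a:
            ea += 1
        elif not modify_a and eb < n_b:
            eb += 1
        else:
            break  # arm exhausted; significance could not be reversed this way
        flips += 1
    return flips


# Hypothetical trial: 1/40 events in one arm versus 9/40 in the other.
fi = fragility_index(1, 40, 9, 40)
fq = fi / (40 + 40)  # fragility quotient: FI divided by the sample size
print(f"FI = {fi}, FQ = {fq:.3f}")
```

    Comparing the resulting FI with the number of participants lost to follow-up, as the study does, indicates whether missing outcomes alone could have been enough to change the conclusion.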

  • Article type: Journal Article
    Evidence-based care relies on robust research. The fragility index (FI) is used to assess the robustness of statistically significant findings in randomized controlled trials (RCTs). While the traditional FI is limited to dichotomous outcomes, a novel tool, the continuous fragility index (CFI), allows for the assessment of the robustness of continuous outcomes.
    To calculate the CFI of statistically significant continuous outcomes in RCTs evaluating interventions for managing anterior shoulder instability (ASI).
    Meta-analysis; Level of evidence, 2.
    A search was conducted across the MEDLINE, Embase, and CENTRAL databases for RCTs assessing management strategies for ASI from inception to October 6, 2022. Studies that reported a statistically significant difference between study groups in ≥1 continuous outcome were included. The CFI was calculated and applied to all available RCTs reporting interventions for ASI. Multivariable linear regression was performed between the CFI and various study characteristics as predictors.
    A total of 27 RCTs comprising 1846 shoulders were included. The median sample size was 61 shoulders (IQR, 43). The median CFI across the 27 RCTs was 8.2 (IQR, 17.2; 95% CI, 3.6-15.4). The median CFI was 7.9 (IQR, 21; 95% CI, 1-22) for 11 studies comparing surgical methods, 22.6 (IQR, 16; 95% CI, 8.2-30.4) for 6 studies comparing nonsurgical reduction interventions, 2.8 for 3 studies comparing immobilization methods, and 2.4 for 3 studies comparing surgical versus nonsurgical interventions. Notably, 22 of 57 included outcomes (38.6%) from studies with completed follow-up data had a loss to follow-up exceeding their CFI. Multivariable regression demonstrated a statistically significant positive correlation between a trial's sample size and the CFI of its outcomes (r = 0.23 [95% CI, 0.13-0.33]; P < .001).
    More than a third of continuous outcomes in ASI trials had a CFI less than the reported loss to follow-up. This carries a significant risk of reversing trial findings and should be considered when evaluating available RCT data. We recommend including the FI, CFI, and loss to follow-up in the abstracts of future RCTs.

  • Article type: Journal Article
    No abstract available.