null hypothesis

  • Article type: Journal Article
    No abstract available.

  • Article type: Journal Article
    As widely noted in the literature and by international bodies such as the American Statistical Association, severe misinterpretations of P-values, confidence intervals, and statistical significance are sadly common in public health. This scenario poses serious risks concerning terminal decisions such as the approval or rejection of therapies. Cognitive distortions about statistics likely stem from poor teaching in schools and universities, overly simplified interpretations, and - as we suggest - the reckless use of calculation software with predefined standardized procedures. In light of this, we present a framework to recalibrate the role of frequentist-inferential statistics within clinical and epidemiological research. In particular, we stress that statistics is only a set of rules and numbers that make sense only when properly placed within a well-defined scientific context beforehand. Practical examples are discussed for educational purposes. Alongside this, we propose some tools to better evaluate statistical outcomes, such as multiple compatibility or surprisal intervals or tuples of various point hypotheses. Lastly, we emphasize that every conclusion must be informed by different kinds of scientific evidence (e.g., biochemical, clinical, statistical, etc.) and must be based on a careful examination of costs, risks, and benefits.

  • Article type: Journal Article
    In several large-scale replication projects, statistically non-significant results in both the original and the replication study have been interpreted as a 'replication success.' Here, we discuss the logical problems with this approach: Non-significance in both studies does not ensure that the studies provide evidence for the absence of an effect, and 'replication success' can virtually always be achieved if the sample sizes are small enough. In addition, the relevant error rates are not controlled. We show how methods, such as equivalence testing and Bayes factors, can be used to adequately quantify the evidence for the absence of an effect and how they can be applied in the replication setting. Using data from the Reproducibility Project: Cancer Biology, the Experimental Philosophy Replicability Project, and the Reproducibility Project: Psychology, we illustrate that many original and replication studies with 'null results' are in fact inconclusive. We conclude that it is important to also replicate studies with statistically non-significant results, but that they should be designed, analyzed, and interpreted appropriately.
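
The contrast this abstract draws between "non-significant" and "evidence of absence" can be sketched with a two one-sided tests (TOST) equivalence procedure. This is only an illustrative sketch, not the authors' analysis: it assumes a z-test with known unit standard deviation in both groups, and the observed difference (0.1), the equivalence margin (0.3), the group sizes, and the function names are all hypothetical values chosen for the example.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-x / math.sqrt(2))


def nhst_p(diff, se):
    """Two-sided p-value for the null hypothesis of zero difference."""
    return math.erfc(abs(diff / se) / math.sqrt(2))


def tost_p(diff, se, margin):
    """TOST p-value: the larger of the two one-sided p-values.

    A small value supports the claim that the true difference
    lies inside (-margin, +margin).
    """
    p_lower = 1.0 - normal_cdf((diff + margin) / se)  # H0: true diff <= -margin
    p_upper = normal_cdf((diff - margin) / se)        # H0: true diff >= +margin
    return max(p_lower, p_upper)


def standard_error(n_per_group, sd=1.0):
    """Standard error of a difference in means (assumed known sd)."""
    return sd * math.sqrt(2.0 / n_per_group)


# Hypothetical numbers: same observed difference, two sample sizes.
diff, margin = 0.1, 0.3
for n in (10, 400):
    se = standard_error(n)
    print(n, round(nhst_p(diff, se), 3), round(tost_p(diff, se, margin), 4))
```

Under these assumptions both sample sizes give a non-significant result against zero, but only the larger one lets the equivalence test rule out differences beyond the margin. The small study is inconclusive rather than evidence of no effect.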

  • Article type: Journal Article
    A paradigm shift away from null hypothesis significance testing seems in progress. Based on simulations, we illustrate some of the underlying motivations. First, p-values vary strongly from study to study, hence dichotomous inference using significance thresholds is usually unjustified. Second, 'statistically significant' results have overestimated effect sizes, a bias declining with increasing statistical power. Third, 'statistically non-significant' results have underestimated effect sizes, and this bias gets stronger with higher statistical power. Fourth, the tested statistical hypotheses usually lack biological justification and are often uninformative. Despite these problems, a screen of 48 papers from the 2020 volume of the Journal of Evolutionary Biology exemplifies that significance testing is still used almost universally in evolutionary biology. All screened studies tested default null hypotheses of zero effect with the default significance threshold of p = 0.05; none presented a pre-specified alternative hypothesis, pre-study power calculation and the probability of 'false negatives' (beta error rate). The results sections of the papers presented 49 significance tests on average (median 23, range 0-390). Of 41 studies that contained verbal descriptions of a 'statistically non-significant' result, 26 (63%) falsely claimed the absence of an effect. We conclude that studies in ecology and evolutionary biology are mostly exploratory and descriptive. We should thus shift from claiming to 'test' specific hypotheses statistically to describing and discussing many hypotheses (possible true effect sizes) that are most compatible with our data, given our statistical model. We already have the means for doing so, because we routinely present compatibility ('confidence') intervals covering these hypotheses.
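
The first three points of this abstract can be reproduced in a small simulation. The sketch below is not the authors' code. It assumes two normally distributed groups with known unit standard deviation, a hypothetical true difference of 0.5, and 20 observations per group, and it uses a z-test so that only the Python standard library is needed. The function names and all parameter values are assumptions made for illustration.

```python
import math
import random
import statistics


def two_sided_p(z):
    """Two-sided p-value of a z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


def simulate(true_effect=0.5, n=20, sims=5000, seed=1):
    """Repeat the same two-group study many times and summarise the results."""
    rng = random.Random(seed)
    se = math.sqrt(2.0 / n)  # standard error of the mean difference, sd = 1
    p_values, significant, non_significant = [], [], []
    for _ in range(sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        p = two_sided_p(diff / se)
        p_values.append(p)
        (significant if p < 0.05 else non_significant).append(diff)
    deciles = statistics.quantiles(p_values, n=10)
    return {
        "power": len(significant) / sims,
        "mean_sig": statistics.fmean(significant),
        "mean_nonsig": statistics.fmean(non_significant),
        "p_10th_percentile": deciles[0],
        "p_90th_percentile": deciles[-1],
    }


if __name__ == "__main__":
    for key, value in simulate().items():
        print(f"{key}: {value:.3f}")
```

With these assumed settings the study has low power, the p-values from identical replications spread over a wide range, and the average estimate among significant runs lies above the true effect while the average among non-significant runs lies below it.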

  • Article type: Journal Article
    In the United States, the majority of physicians have been sued and those who have not will be. Defendants share the notion that the lawsuit is totally fallacious. To be fallacious, the outcome of a medical intervention must be an unpreventable random maloccurrence. This is the only alternative to a medical error. The conflict over outcomes that are random and outcomes that are medical errors results in 46,000 malpractice suits every year in the USA. The burden of proof is a preponderance of evidence, but this is insufficient to do more than just infer, not prove, a relationship between the medical intervention and the outcome. Plaintiffs generally prove a malpractice case using inductive reasoning. Inductive reasoning leaves much to intuition. They use inductive reasoning because, by definition, preponderance of evidence also leaves much to intuition. Deductive reasoning is objective and there is no place for intuition. With deductive reasoning, the burden of proof is now sufficient to distinguish whether or not the cause relates to the effect with 95% confidence. A model for deductive reasoning in malpractice which is completely consistent with the scientific method is presented. This should and would derail frivolous lawsuits.

  • Article type: Journal Article
    Null hypothesis significance testing (NHST) and p-values are widespread in the cardiac surgical literature but are frequently misunderstood and misused. The purpose of the review is to discuss major disadvantages of p-values and suggest alternatives. We describe diagnostic tests, the prosecutor's fallacy in the courtroom, and NHST, which involve inter-related conditional probabilities, to help clarify the meaning of p-values, and discuss the enormous sampling variability, or unreliability, of p-values. Finally, we use a cardiac surgical database and simulations to explore further issues involving p-values. In clinical studies, p-values provide a poor summary of the observed treatment effect, whereas the three-number summary provided by effect estimates and confidence intervals is more informative and minimizes over-interpretation of a "significant" result. p-values are an unreliable measure of the strength of evidence; if used at all they give only, at best, a very rough guide to decision making. Researchers should adopt Open Science practices to improve the trustworthiness of research and, where possible, use estimation (three-number summaries) or other better techniques.

  • Article type: Editorial
    No abstract available.

  • Article type: Journal Article
    Over the last decade, gene set analysis has become the first choice for gaining insights into underlying complex biology of diseases through gene expression and gene association studies. It also reduces the complexity of statistical analysis and enhances the explanatory power of the obtained results. Although gene set analysis approaches are extensively used in gene expression and genome wide association data analysis, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. In this article, we provide a comprehensive overview, statistical structure and steps of gene set analysis approaches used for microarrays, RNA-sequencing and genome wide association data analysis. Further, we also classify the gene set analysis approaches and tools by the type of genomic study, null hypothesis, sampling model and nature of the test statistic, etc. Rather than reviewing the gene set analysis approaches individually, we provide the generation-wise evolution of such approaches for microarrays, RNA-sequencing and genome wide association studies and discuss their relative merits and limitations. Here, we identify the key biological and statistical challenges in current gene set analysis, which will be addressed by statisticians and biologists collectively in order to develop the next generation of gene set analysis approaches. Further, this study will serve as a catalog and provide guidelines to genome researchers and experimental biologists for choosing the proper gene set analysis approach based on several factors.

  • Article type: Journal Article
    We review evidence for Macphail's (1982, 1985, 1987) Null Hypothesis, that nonhuman animals do not differ either qualitatively or quantitatively in their cognitive capacities. Our review supports the Null Hypothesis insofar as there are no qualitative differences among nonhuman vertebrate animals, and any observed differences along the qualitative dimension can be attributed to failures to account for contextual variables. We argue species do differ quantitatively, however, and that the main difference in "intelligence" among animals lies in the degree to which one must account for contextual variables.

  • Article type: Journal Article
    Macphail famously criticized two foundational assumptions that underlie the evolutionary approach to comparative psychology: that there are differences in intelligence across species, and that intelligent behavior in animals is based on more than associative learning. Here, we provide evidence from recent work in avian cognition that supports both these assumptions: intelligence across species varies, and animals can perform intelligent behaviors that are not guided solely by associative learning mechanisms. Finally, we reflect on the limitations of comparative psychology that led to Macphail's claims and suggest strategies researchers can use to make more advances in the field.