Keywords: Directed acyclic graph; False discovery rate; False exceedance rate; Familywise error rate; Multiple testing; Nested hypothesis; Partially ordered hypothesis

Source: DOI:10.1093/biomet/asab041

Abstract:
We consider the problem of multiple hypothesis testing when there is a logical nested structure to the hypotheses. When one hypothesis is nested inside another, the outer hypothesis must be false if the inner hypothesis is false. We model the nested structure as a directed acyclic graph, including chain and tree graphs as special cases. Each node in the graph is a hypothesis and rejecting a node requires also rejecting all of its ancestors. We propose a general framework for adjusting node-level test statistics using the known logical constraints. Within this framework, we study a smoothing procedure that combines each node with all of its descendants to form a more powerful statistic. We prove a broad class of smoothing strategies can be used with existing selection procedures to control the familywise error rate, false discovery exceedance rate, or false discovery rate, so long as the original test statistics are independent under the null. When the null statistics are not independent but are derived from positively-correlated normal observations, we prove control for all three error rates when the smoothing method is arithmetic averaging of the observations. Simulations and an application to a real biology dataset demonstrate that smoothing leads to substantial power gains.
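As an illustration of the smoothing step described in the abstract, the following is a minimal sketch that combines each node's p-value with those of all its descendants. The abstract does not say which combining function is used, so Fisher's method is an assumption here, as are the function names `descendants`, `fisher_combine`, and `smooth_pvalues` and the child-adjacency dictionary format. The reasoning is that if rejecting a node requires rejecting all of its ancestors, then a true null at a node implies true nulls at all of its descendants, so under independence the combined statistic should remain a valid p-value for that node. The sketch covers only the smoothing step, not the subsequent selection procedure or any error-rate guarantee.

```python
import math


def descendants(children, node):
    """Return all descendants of `node` in a DAG given as {node: [children]}."""
    seen = set()
    stack = list(children.get(node, ()))
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(children.get(v, ()))
    return seen


def fisher_combine(pvals):
    """Combine independent p-values with Fisher's method (an assumed choice).

    Under independent nulls, -2 * sum(log p) is chi-square with 2k degrees of
    freedom. For even degrees of freedom the survival function has the closed
    form exp(-h) * sum_{i<k} h**i / i!, where h = -sum(log p).
    Assumes every p-value is strictly positive.
    """
    k = len(pvals)
    h = -sum(math.log(p) for p in pvals)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= h / i
        total += term
    return min(1.0, math.exp(-h) * total)


def smooth_pvalues(children, pvals):
    """Smooth each node's p-value by combining it with all of its descendants."""
    return {
        v: fisher_combine([pvals[v]] + [pvals[d] for d in sorted(descendants(children, v))])
        for v in pvals
    }


# Hypothetical chain a -> b -> c with made-up p-values.
children = {"a": ["b"], "b": ["c"], "c": []}
pvals = {"a": 0.04, "b": 0.03, "c": 0.02}
smoothed = smooth_pvalues(children, pvals)
```

In this toy chain the leaf `c` is unchanged because it has no descendants, while `a` and `b` borrow strength from the nodes below them and receive smaller smoothed p-values.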