sampling model

  • Article type: Journal Article
    Over the last decade, gene set analysis has become the first choice for gaining insights into the underlying complex biology of diseases through gene expression and gene association studies. It also reduces the complexity of statistical analysis and enhances the explanatory power of the results. Although gene set analysis approaches are used extensively in gene expression and genome-wide association data analysis, the statistical structure and steps common to these approaches have not yet been comprehensively discussed, which limits their utility. In this article, we provide a comprehensive overview of the statistical structure and steps of gene set analysis approaches used for microarray, RNA-sequencing and genome-wide association data analysis. We also classify gene set analysis approaches and tools by type of genomic study, null hypothesis, sampling model, nature of the test statistic, and other criteria. Rather than reviewing the approaches individually, we trace their generation-wise evolution for microarray, RNA-sequencing and genome-wide association studies and discuss their relative merits and limitations. We identify the key biological and statistical challenges in current gene set analysis, which statisticians and biologists will need to address collectively in order to develop the next generation of gene set analysis approaches. This study will also serve as a catalog and provide guidelines to genome researchers and experimental biologists for choosing the proper gene set analysis approach based on several factors.

  • Article type: Journal Article
    The choice of sampling locations in a spatial network is often guided by practical demands. In particular, many locations are preferentially chosen to capture high values of a response, for example, air pollution levels in environmental monitoring. Model estimation and prediction of the exposure surface then become biased because of the selective sampling. Since prediction is often the main purpose of the modeling, we suggest that the effect of preferential sampling matters more for the resulting predictive surface than for parameter estimation. Our contribution is to offer a direct simulation-based approach to assessing the effects of preferential sampling. We compare two predictive surfaces over the study region, one originating from the notion of an 'operating' intensity driving the selection of monitoring sites, the other under complete spatial randomness. We can consider a range of response models: they may reflect the operating intensity, introduce alternative informative covariates, or just propose a flexible spatial model. We can then generate data under the given model. Upon fitting the model and interpolating (kriging), we obtain two predictive surfaces to compare. Two points are important: we need suitable metrics to compare the surfaces, and, because the predictive surfaces are random, the comparisons must be made in expectation.
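The simulation idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' procedure: the exponential covariance, the selection weights proportional to `exp(beta * field)` standing in for an 'operating' intensity, the plug-in sample mean in the kriging step, and RMSE against the simulated truth as the comparison metric are all assumptions chosen for illustration, as are the function names.

```python
import numpy as np

def exp_cov(a, b, sill=1.0, range_=0.2):
    """Exponential covariance between two sets of 2-D points (assumed model)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sill * np.exp(-d / range_)

def pick_sites(field, n_sites, beta, rng):
    """Select grid cells with probability proportional to exp(beta * field).
    beta = 0 gives uniform selection, a discrete stand-in for complete
    spatial randomness; beta > 0 favors high values of the response."""
    w = np.exp(beta * (field - field.max()))  # subtract max for stability
    return rng.choice(len(field), size=n_sites, replace=False, p=w / w.sum())

def krige(grid, idx, values, nugget=1e-6):
    """Kriging predictor with known covariance and a plug-in sample mean."""
    obs = grid[idx]
    k_oo = exp_cov(obs, obs) + nugget * np.eye(len(idx))
    k_go = exp_cov(grid, obs)
    m = values.mean()
    return m + k_go @ np.linalg.solve(k_oo, values - m)

def compare_designs(n_reps=50, n_side=15, n_sites=30, beta=2.0, seed=0):
    """Average RMSE of the predictive surface under preferential and
    uniform site selection, over repeated simulated fields."""
    rng = np.random.default_rng(seed)
    axis = np.linspace(0.0, 1.0, n_side)
    grid = np.array([(x, y) for x in axis for y in axis])
    # Cholesky factor used to draw zero-mean Gaussian fields on the grid.
    chol = np.linalg.cholesky(exp_cov(grid, grid) + 1e-8 * np.eye(len(grid)))
    rmse_pref, rmse_csr = [], []
    for _ in range(n_reps):
        truth = chol @ rng.standard_normal(len(grid))
        for b, out in ((beta, rmse_pref), (0.0, rmse_csr)):
            idx = pick_sites(truth, n_sites, b, rng)
            pred = krige(grid, idx, truth[idx])
            out.append(np.sqrt(np.mean((pred - truth) ** 2)))
    return float(np.mean(rmse_pref)), float(np.mean(rmse_csr))

if __name__ == "__main__":
    pref, csr = compare_designs()
    print(f"mean RMSE, preferential sites: {pref:.3f}")
    print(f"mean RMSE, uniform sites:      {csr:.3f}")
```

Averaging over replicates reflects the point that the predictive surfaces are random, so a single pair of surfaces says little; other metrics (for example, the mean signed difference between the two surfaces) could be swapped in for RMSE.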