Tidy data

  • Article type: Journal Article
    BACKGROUND: Characterization of microbial growth is of both fundamental and applied interest. Modern platforms can automate collection of high-throughput microbial growth curves, necessitating the development of computational tools to handle and analyze these data to produce insights.
    RESULTS: To address this need, here I present a newly developed R package: gcplyr. gcplyr can flexibly import growth curve data in common tabular formats and reshape them under a tidy framework that is flexible and extensible, enabling users to design custom analyses or plot data with popular visualization packages. gcplyr can also incorporate metadata and generate or import experimental designs to merge with data. Finally, gcplyr carries out model-free (non-parametric) analyses. These analyses do not require mathematical assumptions about microbial growth dynamics, and gcplyr is able to extract a broad range of important traits, including growth rate, doubling time, lag time, maximum density and carrying capacity, diauxie, area under the curve, extinction time, and more.
    CONCLUSIONS: gcplyr makes scripted analyses of growth curve data in R straightforward, streamlines common data wrangling and analysis steps, and easily integrates with common visualization and statistical analyses.
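
    The tidy reshaping and model-free trait extraction described in this abstract can be illustrated with a minimal sketch. It is plain Python rather than R, and it does not use or reproduce gcplyr's API. The well names, density values, and function names (`to_tidy`, `max_percap_growth_rate`, `doubling_time`) are hypothetical, chosen only for illustration. The growth rate here is simply the largest slope of log density between consecutive timepoints, which is one possible non-parametric estimate and not necessarily the one gcplyr computes.

```python
import math

# Hypothetical wide-format plate-reader export: one row per timepoint,
# one column per well. All density values are invented for illustration.
wide = [
    {"time_h": 0.0, "A1": 0.010, "A2": 0.010},
    {"time_h": 1.0, "A1": 0.020, "A2": 0.015},
    {"time_h": 2.0, "A1": 0.040, "A2": 0.030},
    {"time_h": 3.0, "A1": 0.060, "A2": 0.060},
]


def to_tidy(rows, time_key="time_h"):
    """Reshape wide rows into tidy records: one (well, time, density) per record."""
    tidy = []
    for row in rows:
        for well, density in row.items():
            if well != time_key:
                tidy.append({"well": well, "time_h": row[time_key], "od": density})
    return tidy


def max_percap_growth_rate(tidy, well):
    """Largest slope of ln(density) between consecutive timepoints for one well.

    No growth model is fitted; this assumes strictly positive densities
    and distinct timepoints.
    """
    points = sorted((r["time_h"], r["od"]) for r in tidy if r["well"] == well)
    slopes = [
        (math.log(d1) - math.log(d0)) / (t1 - t0)
        for (t0, d0), (t1, d1) in zip(points, points[1:])
    ]
    return max(slopes)


def doubling_time(rate):
    """Doubling time implied by a per-capita growth rate (same time units)."""
    return math.log(2) / rate


tidy = to_tidy(wide)
rate_a1 = max_percap_growth_rate(tidy, "A1")
print(len(tidy), "tidy records")
print("A1 max per-capita rate (1/h):", round(rate_a1, 3))
print("A1 doubling time (h):", round(doubling_time(rate_a1), 3))
```

    Once the data are in the tidy layout, every trait calculation reduces to a per-well group operation, which is the property that makes the format convenient for custom analyses and plotting.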

  • Article type: Journal Article
    Within the statistics community, a number of guiding principles for sharing data have emerged; however, these principles are not always made clear to the collaborators who generate the data. To bridge this divide, we have established a set of guidelines for sharing data. In these, we highlight the need to provide raw data to the statistician, the importance of consistent formatting, and the necessity of giving the statistician all essential experimental information and a record of any pre-processing steps carried out. With these guidelines we hope to avoid errors and delays in data analysis.