Keywords: Bioconductor; QFeatures; bottom-up proteomics; data processing; differential expression; limma; mass spectrometry; proteomics; quality control; shotgun proteomics

MeSH: Humans; Proteomics; Workflow; HEK293 Cells; Proteins / analysis; Mass Spectrometry

Source: DOI:10.12688/f1000research.139116.1

Abstract:
Background: Expression proteomics involves the global evaluation of protein abundances within a system. In turn, differential expression analysis can be used to investigate changes in protein abundance upon perturbation of such a system. Methods: Here, we provide a workflow for the processing, analysis and interpretation of quantitative mass spectrometry-based expression proteomics data. This workflow uses open-source R software packages from the Bioconductor project and guides users step by step through every stage of the analysis. As a use case, we generated expression proteomics data from HEK293 cells with and without a treatment. Of note, the experiment included cellular proteins labelled using tandem mass tag (TMT) technology and secreted proteins quantified using label-free quantitation (LFQ). Results: The workflow explains the software infrastructure before focusing on data import, pre-processing and quality control, each carried out separately for the TMT and LFQ datasets. The application of statistical differential expression analysis is then demonstrated, followed by interpretation via gene ontology enrichment analysis. Conclusions: A comprehensive workflow for the processing, analysis and interpretation of expression proteomics data is presented. The workflow is a valuable resource for the proteomics community, particularly for beginners who are at least familiar with R and who wish to understand their analyses and make data-driven decisions about them.