Keywords: Leishmania donovani; experimental proteome; mass spectrometry; post-translational modifications (PTMs); proteogenomics

MeSH: Leishmania donovani / genetics, metabolism; Proteogenomics / methods; Protozoan Proteins / genetics, metabolism; Genome, Protozoan; Protein Processing, Post-Translational / genetics; Proteomics / methods; Proteome / genetics; Molecular Sequence Annotation

Source: DOI:10.3390/genes15060775   PDF (PubMed)

Abstract:
High-throughput proteomics data generated by increasingly sensitive mass spectrometers contribute greatly to our understanding of the molecular and cellular mechanisms operating in living organisms. Nevertheless, proteomics analyses depend on accurate genomic and protein annotations, and some information may be lost if these resources are incomplete. Here, we show that most proteomics data may be recovered by interconnecting genomics and proteomics approaches (i.e., following a proteogenomic strategy), which in turn improves gene/protein models. In this study, we generated proteomics data from Leishmania donovani (HU3 strain) promastigotes that allowed us to detect 1908 proteins in this developmental stage on the basis of the annotated proteins currently available in public databases. However, when the proteomics data were searched against all possible open reading frames in the L. donovani genome, twenty new protein-coding genes could be annotated. Additionally, 43 previously annotated proteins were extended at their N-terminal ends to accommodate peptides detected in the proteomics data. Different post-translational modifications (phosphorylation, acetylation, and methylation, among others) were also found in a large number of Leishmania proteins. A detailed comparative analysis of the L. donovani and Leishmania major experimental proteomes served to illustrate how inaccurate conclusions can be drawn if proteomes are compared solely on the basis of the lists of proteins identified in each proteome. Finally, we have created data entries (based on freely available repositories) to provide and maintain updated gene/protein models. Raw data are available via ProteomeXchange with the identifier PXD051920.
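The idea of searching against "all possible open reading frames" can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the standard genetic code, treats an ORF as any stop-to-stop stretch in the six reading frames, and stands in for a real spectrum-matching search engine with plain substring matching. The function names (`six_frame_orfs`, `unannotated_hits`) and the toy sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# standard code; the study's actual search settings are not given here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate DNA codon by codon; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_orfs(genome, min_len=3):
    """Yield (strand, frame, protein) for every stop-to-stop stretch
    of at least min_len residues in the six reading frames."""
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            for segment in translate(seq[frame:]).split("*"):
                if len(segment) >= min_len:
                    yield strand, frame, segment


def unannotated_hits(peptides, orfs, annotated):
    """Peptides found in some ORF but in no annotated protein.
    Substring matching is a stand-in for a real spectrum search."""
    return {
        p for p in peptides
        if any(p in o for o in orfs) and not any(p in a for a in annotated)
    }


# Toy data (hypothetical): one annotated protein on the forward strand.
toy_genome = "ATGGCCAAGTTCTGA"
toy_orfs = [protein for _, _, protein in six_frame_orfs(toy_genome)]
novel = unannotated_hits(["AKF", "ELG", "ZZZ"], toy_orfs, ["MAKF"])
```

In this toy case, `AKF` is explained by the annotated protein, `ZZZ` matches nothing, and `ELG` maps only to a reverse-strand ORF, which is the kind of peptide that would point to a candidate new gene or an extended gene model.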