Keywords: Charged clusters; Functional study; Green algae; Land plants; Mitochondrial proteins; Python detection algorithm

Source: DOI:10.1016/j.mito.2024.101938

Abstract:
Protein function depends on charge interactions and charge-biased regions, which are involved in a wide range of cellular and biochemical processes. We report the development of a new algorithm, implemented in Python, and its use to identify charge clusters (CC), namely negative (NCC), positive (PCC) and mixed (MCC) charge clusters, and to compare their presence in the mitochondrial proteins of plant groups. Statistical, structural and functional analyses were conducted to characterize the resulting CC. Screening of 105 399 protein sequences showed that 2.6 %, 0.48 % and 0.03 % of the proteins contain NCC, PCC and MCC, respectively. Mitochondrial proteins encoded by the nuclear genome of green algae have the largest proportion of both PCC (1.6 %) and MCC (0.4 %), while mitochondrial proteins encoded by the nuclear genome of the other plants group have the highest proportion of NCC (7.5 %). Mapping of the identified CC showed that they are mainly located in the terminal regions of the protein. Annotation showed that proteins with CC are classified as binding proteins, take part in transmembrane transport processes, and are mainly located in the membrane. CC scanning revealed 2373 and 784 sites and 192 and 149 motif profiles within NCC and PCC, respectively. Investigation of CC within pentatricopeptide repeat-containing proteins revealed that they are involved in correct and specific RNA editing. CC were shown to play a key role in providing structural and functional insight into complex protein assemblies, which could be useful in biotechnological applications.
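The abstract does not describe how the detection algorithm works, so the following is only a minimal illustrative sketch of one way charge clusters could be flagged in a sequence, not the authors' method. It assumes a fixed-length sliding window and treats D/E as negative and K/R as positive. The names `find_charge_clusters`, `classify_window` and `Cluster`, and the parameters `window`, `min_fraction` and `purity` with their default values, are hypothetical choices made for this example.

```python
from dataclasses import dataclass

NEGATIVE = frozenset("DE")  # assumption: Asp and Glu count as negative
POSITIVE = frozenset("KR")  # assumption: Lys and Arg count as positive (His ignored)


@dataclass(frozen=True)
class Cluster:
    kind: str   # "NCC", "PCC" or "MCC"
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def classify_window(window, min_fraction=0.6, purity=0.75):
    """Return 'NCC', 'PCC', 'MCC' or None for a single window."""
    neg = sum(residue in NEGATIVE for residue in window)
    pos = sum(residue in POSITIVE for residue in window)
    charged = neg + pos
    # Not charge-rich enough to count as a cluster.
    if charged == 0 or charged / len(window) < min_fraction:
        return None
    if neg / charged >= purity:
        return "NCC"
    if pos / charged >= purity:
        return "PCC"
    return "MCC"


def find_charge_clusters(sequence, window=10, min_fraction=0.6, purity=0.75):
    """Slide a window over the sequence and merge overlapping hits of the same kind."""
    sequence = sequence.upper()
    clusters = []
    for i in range(len(sequence) - window + 1):
        kind = classify_window(sequence[i:i + window], min_fraction, purity)
        if kind is None:
            continue
        last = clusters[-1] if clusters else None
        if last is not None and last.kind == kind and i <= last.end:
            # Extend the previous cluster instead of opening a new one.
            clusters[-1] = Cluster(kind, last.start, i + window)
        else:
            clusters.append(Cluster(kind, i, i + window))
    return clusters
```

With these assumed defaults, `find_charge_clusters("A" * 20 + "D" * 10 + "A" * 20)` returns `[Cluster(kind='NCC', start=16, end=34)]`. The reported span is wider than the ten-residue acidic run because every window that is at least 60 % charged contributes to the merged region. A real tool would probably rely on a statistical significance criterion rather than fixed thresholds.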