Myosin binding protein-C (MyBP-C) is a sarcomeric protein that regulates the force of contraction in striated muscles. Mutations in the MYBPC family of genes, which comprises slow skeletal (MYBPC1), fast skeletal (MYBPC2), and cardiac (MYBPC3) paralogs, can result in cardiac and skeletal myopathies. Nonetheless, their evolutionary pattern, pathogenicity, and impact on MyBP-C protein structure remain to be elucidated. Therefore, the present study aimed to systematically assess the evolutionarily conserved and epigenetic patterns of MYBPC family mutations. Using a machine learning (ML) approach, variants in the MYBPC1, MYBPC2, and MYBPC3 genes were obtained from the Genome Aggregation Database (gnomAD). These were then analyzed with Ensembl's Variant Effect Predictor (VEP), which identified 8,618, 3,871, and 3,071 variants in MYBPC1, MYBPC2, and MYBPC3, respectively. Missense variants comprised 61%-66% of total variants, and among them the third nucleotide position of the codon was highly altered. Arginine was the most frequently mutated amino acid, which is important because most disease-causing mutations in MyBP-C proteins originate at arginine residues. Domains C5 and C6 of MyBP-C were found to be hotspots for most mutations in the MyBP-C family of proteins. A high percentage of truncating mutations in cMyBP-C cause cardiomyopathies. Arginine and glutamate were the top hits in fMyBP-C and cMyBP-C, respectively, and tryptophan and tyrosine were the residues most commonly changed to premature stop codons across the three paralogs, causing protein truncations at the carboxyl terminus. A heterogeneous epigenetic pattern was identified among the three MyBP-C paralogs. Overall, the study showed that databases combined with computational approaches can facilitate diagnosis and drug discovery for muscle disorders caused by MYBPC mutations.