composition bias

  • Article type: Journal Article
    Cervical cancer is closely linked to specific strains of human papillomavirus (HPV), notably HPV-33 and HPV-58, which are highly prevalent among women in China. Nevertheless, codon usage bias in HPV-33 and HPV-58 is not well understood. The objective of this research is to analyze the codon usage patterns of HPV-33 and HPV-58 and to pinpoint the primary factors that influence codon preference. The overall codon usage bias in the two HPV genotypes is not significant. Both genotypes prefer codons that end with A/U. The GC3 content is 25.43% ± 0.35% for HPV-33 and 29.44% ± 0.57% for HPV-58. Of the 26 favored codons in HPV-33 and HPV-58 (relative synonymous codon usage (RSCU) > 1), 25 end with A/U. Principal component analysis (PCA) shows tight clustering of the whole-genome sequences of HPV-33 and HPV-58, suggesting that their RSCU preferences are similar. Moreover, an examination of dinucleotide abundance indicated that translational selection shaped a distinctive dinucleotide usage pattern in HPV-33 and HPV-58. Additionally, a combined analysis involving an effective number of codons (ENC) plot, parity rule 2, and neutrality analysis demonstrated that natural selection is the primary determinant of codon usage preference in HPV-33 and HPV-58. HPV-33 and HPV-58 share only a restricted set of favored codons with humans, potentially mitigating competition for translation resources. Our findings could provide valuable perspectives on the evolutionary patterns and codon usage preferences of HPV-33 and HPV-58, contributing to the development and application of vaccines against the relevant HPV subtypes.
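
The two statistics quoted in this abstract, RSCU and GC3, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses the standard genetic code, defines RSCU as the observed count of a codon divided by the mean count across its synonymous family, and computes GC3 over codons other than Met, Trp, and stop codons. The function names `split_codons`, `rscu`, and `gc3` are hypothetical.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(seq):
    """Split an in-frame coding sequence into codons (RNA 'U' is read as 'T')."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def rscu(seq):
    """RSCU per codon: observed count / mean count over the synonymous family."""
    counts = Counter(c for c in split_codons(seq) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)
    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue  # skip unused amino acids and single-codon Met/Trp
        expected = total / len(family)
        for c in family:
            values[c] = counts[c] / expected
    return values


def gc3(seq):
    """Fraction of G or C at the third position, excluding Met, Trp, and stops."""
    thirds = [
        c[2] for c in split_codons(seq)
        if c in CODON_TABLE and CODON_TABLE[c] not in "MW*"
    ]
    if not thirds:
        raise ValueError("no informative codons in sequence")
    return sum(b in "GC" for b in thirds) / len(thirds)
```

A codon with RSCU above 1 is used more often than expected under equal synonymous usage, which is the "favored codon" criterion the abstract applies. For the toy sequence `GCTGCTGCTGCA`, `rscu` gives 3.0 for GCT and 1.0 for GCA, and `gc3` gives 0.0.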

  • Article type: Journal Article
    Turnip mosaic virus (TuMV), an important pathogen that causes mosaic diseases in vegetable crops worldwide, belongs to the genus Potyvirus of the family Potyviridae. The genetic variation, population structure, timescale, and migration of TuMV have been well studied. However, the codon usage pattern and host adaptation of TuMV remain unclear. Here, the compositional bias and codon usage of TuMV were analyzed using 184 non-recombinant sequences. We found relatively stable variation in genomic composition and a slightly low codon usage bias in TuMV protein-coding sequences. Statistical analysis showed that the codon usage patterns of TuMV protein-coding sequences were mainly affected by natural selection and mutation pressure, with natural selection being the key factor. The codon adaptation index (CAI) and relative codon deoptimization index (RCDI) revealed that, based on the present data, TuMV genes were strongly adapted to Brassica oleracea. Similarity index (SiD) analysis also indicated that B. oleracea is potentially the preferred host of TuMV. Our study provides the first insights into the codon usage bias of TuMV based on complete genomes and will inform future research on TuMV origins and evolutionary patterns.
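
The codon adaptation index mentioned above can be illustrated with a short sketch. It assumes the usual formulation: each codon receives a relative adaptiveness w equal to its count in a reference gene set (here, host genes) divided by the count of the most used synonymous codon, and CAI is the geometric mean of w over the codons of the query gene. The pseudocount of 0.5 for codons absent from the reference is a common workaround and is an assumption here, as are the function names; the abstract does not state which software or reference set was used.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def split_codons(seq):
    """Split an in-frame coding sequence into codons (RNA 'U' is read as 'T')."""
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_seqs, pseudocount=0.5):
    """w per codon: reference count / count of the top synonymous codon."""
    counts = Counter()
    for seq in reference_seqs:
        counts.update(c for c in split_codons(seq) if c in CODON_TABLE)
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "MW*":  # Met, Trp, and stops carry no synonymous choice
            families[aa].append(codon)
    weights = {}
    for family in families.values():
        # Assumption: unseen codons get a pseudocount so that log(w) is defined.
        adjusted = {c: counts[c] or pseudocount for c in family}
        top = max(adjusted.values())
        for c in family:
            weights[c] = adjusted[c] / top
    return weights


def cai(seq, weights):
    """Geometric mean of w over the informative codons of seq."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    if not logs:
        raise ValueError("no informative codons in sequence")
    return math.exp(sum(logs) / len(logs))
```

With the toy reference `["GCTGCTGCTGCA"]`, a query made only of GCT codons scores 1.0, while `GCTGCA` scores the square root of 1/3, because GCA is used one third as often as GCT in the reference.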

  • Article type: Journal Article
    Schistosoma mansoni is a trematode flatworm that parasitizes humans and causes a disease called bilharzia. At the genomic level, it is characterized by a low genomic GC content and an "isochore-like" structure, in which the GC-richest regions, mainly located at the ends of the chromosomes, are interspersed with low-GC regions. Furthermore, the GC-richest regions are also the gene-richest and contain the most heavily expressed genes. Taking these features into account, we decided to reanalyze the codon usage of this flatworm. Our results show that a) when all genes are considered together, the strong mutational bias towards A + T leads to a predominance of A/T-ending codons, b) a multivariate analysis discriminates between highly and lowly expressed genes, c) the sequences expressed at the highest levels display a significant increase in G/C-ending codons, and d) when molecular distances to a closely related species are compared, the synonymous distance in highly expressed genes is significantly lower than in lowly expressed sequences. Therefore, we conclude that, contrary to previous results, which were obtained with a small sample of genes, codon usage in S. mansoni is the result of two forces that operate in opposite directions: while mutational bias leads to a predominance of A/T-ending codons, translational selection, working at the level of speed, increases the frequency of G/C-ending triplets.

  • Article type: Journal Article
    Horseshoe crabs (Xiphosura) are traditionally regarded as sister group to the clade of terrestrial chelicerates (Arachnida). This hypothesis has been challenged by recent phylogenomic analyses, but the non-monophyly of Arachnida has consistently been disregarded as artifactual. We re-evaluated the placement of Xiphosura among chelicerates using the most complete phylogenetic data set to date, expanding outgroup sampling, and including data from whole genome sequencing projects. In spite of uncertainty in the placement of some arachnid clades, all analyses show Xiphosura consistently nested within Arachnida as the sister group to Ricinulei (hooded tick spiders). It is apparent that the radiation of arachnids is an old one and occurred over a brief period of time, resulting in several consecutive short internodes, and thus is a potential case for the confounding effects of incomplete lineage sorting (ILS). We simulated coalescent gene trees to explore the effects of increasing levels of ILS on the placement of horseshoe crabs. In addition, common sources of systematic error were evaluated, as well as the effects of fast-evolving partitions and the dynamics of problematic long branch orders. Our results indicated that the placement of horseshoe crabs cannot be explained by missing data, compositional biases, saturation, or ILS. Interrogation of the phylogenetic signal showed that the majority of loci favor the derived placement of Xiphosura over a monophyletic Arachnida. Our analyses support the inference that horseshoe crabs represent a group of aquatic arachnids, comparable to aquatic mites, breaking a long-standing paradigm in chelicerate evolution and altering previous interpretations of the ancestral transition to the terrestrial habitat. Future studies testing chelicerate relationships should approach the task with a sampling strategy where the monophyly of Arachnida is not held as the premise.

  • Article type: Journal Article
    There are multiple definitions for low complexity regions (LCRs) in protein sequences, with all of them broadly considering LCRs as regions with fewer amino acid types compared to an average composition. Following this view, LCRs can also be defined as regions showing composition bias. In this critical review, we focus on the definition of sequence complexity of LCRs and their connection with structure. We present statistics and methodological approaches that measure low complexity (LC) and related sequence properties. Composition bias is often associated with LC and disorder, but repeats, while compositionally biased, might also induce ordered structures. We illustrate this dichotomy, and more generally the overlaps between different properties related to LCRs, using examples. We argue that statistical measures alone cannot capture all structural aspects of LCRs and recommend the combined usage of a variety of predictive tools and measurements. While the methodologies available to study LCRs are already very advanced, we foresee that a more comprehensive annotation of sequences in the databases will enable the improvement of predictions and a better understanding of the evolution and the connection between structure and function of LCRs. This will require the use of standards for the generation and exchange of data describing all aspects of LCRs.
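
One simple statistic of the kind this review discusses is the Shannon entropy of the residue composition in a sliding window: a window dominated by a few amino acid types has low entropy. The sketch below is illustrative only. The window size of 12 and the threshold of 2.2 bits are assumed example values, not recommendations taken from the review, and the function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(window):
    """Shannon entropy (bits) of the residue composition of a window."""
    n = len(window)
    counts = Counter(window)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def low_complexity_windows(seq, size=12, threshold=2.2):
    """Start positions of windows whose entropy is at or below the threshold.

    size and threshold are illustrative assumptions; tune them per data set.
    """
    hits = []
    for start in range(len(seq) - size + 1):
        if shannon_entropy(seq[start:start + size]) <= threshold:
            hits.append(start)
    return hits
```

A homopolymer run such as twelve consecutive Q residues has entropy 0 and is flagged, whereas a window of twelve distinct residues has entropy log2(12), about 3.58 bits, and is not. As the review argues, a statistic like this captures composition bias but says nothing by itself about whether the region is disordered or forms an ordered repeat structure.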