Keywords: Drosophila melanogaster; de novo transcripts; transcription factor motifs; transposable elements

MeSH: Animals; DNA Transposable Elements; Drosophila melanogaster / genetics; Transcription Factors / genetics, metabolism; Transcriptome; Transcription, Genetic; Nucleotide Motifs; Binding Sites; Open Reading Frames; Genome, Insect; Evolution, Molecular

Source: DOI:10.1093/gbe/evae134

Abstract:
De novo genes emerge from noncoding regions of genomes via a succession of mutations. Among other effects, such mutations activate transcription and create a new open reading frame (ORF). Although the mechanisms underlying ORF emergence are well documented, relatively little is known about the mechanisms enabling new transcription events. Yet, in many species a continuum between absent and very prominent transcription has been reported for essentially all regions of the genome. In this study, we searched for de novo transcripts by using newly assembled genomes and transcriptomes of seven inbred lines of Drosophila melanogaster, originating from six European populations and one African population. This setup allowed us to detect sample-specific de novo transcripts and compare them to their homologous nontranscribed regions in other samples, as well as to genic and intergenic control sequences. We studied the association with transposable elements (TEs) and the enrichment of transcription factor motifs upstream of de novo emerged transcripts and compared them with regulatory elements. We found that de novo transcripts overlap with TEs more often than expected by chance. The emergence of new transcripts correlates with regions of high guanine-cytosine content and TE expression. Moreover, upstream regions of de novo transcripts are highly enriched with regulatory motifs. Such motifs are more enriched in new transcripts overlapping with TEs, particularly DNA TEs, and are more conserved upstream of de novo transcripts than upstream of their 'nontranscribed homologs'. Overall, our study demonstrates that TE insertion is important for transcript emergence, partly by introducing new regulatory motifs from DNA TE families.