BACKGROUND: Parrots belong to a group of behaviorally advanced vertebrates and have an advanced ability of vocal learning relative to other vocal-learning birds. They can imitate human speech, synchronize their body movements to a rhythmic beat, and understand complex concepts of the referential meaning of sounds. However, little is known about the genetics of these traits. Elucidating the genetic bases would require whole genome sequencing and a robust assembly of a parrot genome.
RESULTS: We present a genomic resource for the budgerigar (Melopsittacus undulatus), an Australian parakeet and the most widely studied parrot species in neuroscience and behavior. We present genomic sequence data that include over 300× raw read coverage from multiple sequencing technologies and chromosome optical maps from a single male animal. The reads and optical maps were used to create three hybrid assemblies representing some of the largest genomic scaffolds to date for a bird. Two of these assemblies were annotated based on similarities to reference sets of non-redundant human, zebra finch and chicken proteins, and to budgerigar transcriptome sequence assemblies. The sequence reads for this project were in part generated and used for both the Assemblathon 2 competition and the first de novo assembly of a giga-scale vertebrate genome utilizing PacBio single-molecule sequencing.
CONCLUSIONS: Across several quality metrics, these budgerigar assemblies are comparable to or better than the chicken and zebra finch genome assemblies built from traditional Sanger sequencing reads, and are sufficient to analyze regions that are difficult to sequence and assemble, including those not yet assembled in prior bird genomes, and promoter regions of genes differentially regulated in vocal learning brain regions. This work provides valuable data and material for genome technology development and for investigating the genomics of complex behavioral traits.