Keywords: Apis mellifera; full-length transcriptome; nanopore sequencing; reference genome; third-generation sequencing

MeSH: Bees / genetics; Animals; Molecular Sequence Annotation; Transcriptome / genetics; Genome, Insect; Nosema / genetics; Nanopore Sequencing / methods; Gene Expression Profiling / methods

Source: DOI:10.3390/genes15060728

Abstract:
Honeybees are indispensable pollinators in nature with pivotal ecological, economic, and scientific value. However, a full-length transcriptome for Apis mellifera assembled with third-generation nanopore sequencing technology has yet to be reported. Here, nanopore sequencing of the midgut tissues of uninoculated and Nosema ceranae-inoculated A. mellifera workers was conducted, and the full-length transcriptome was then constructed and annotated based on high-quality long reads. The sequences and annotations of the current A. mellifera reference genome were subsequently improved. A total of 5,942,745 and 6,664,923 raw reads were produced from the midguts of workers at 7 and 10 days post-inoculation (dpi) with N. ceranae, respectively, while 7,100,161 and 6,506,665 raw reads were generated from the midguts of the corresponding uninoculated workers. After strict quality control, 6,928,170, 6,353,066, 5,745,048, and 6,416,987 clean reads were obtained, with a length distribution ranging from 1 kb to 10 kb. Additionally, 16,824, 17,708, 15,744, and 18,246 full-length transcripts were detected, respectively, including 28,019 nonredundant ones. Among these, 43,666, 30,945, 41,771, 26,442, and 24,532 full-length transcripts could be annotated to the Nr, KOG, eggNOG, GO, and KEGG databases, respectively. In addition, 501 novel genes (20,326 novel transcripts) were identified for the first time, among which 401 (20,255), 193 (13,365), 414 (19,186), 228 (12,093), and 202 (11,703) were annotated to the aforementioned five databases, respectively. The expression and sequences of three randomly selected novel transcripts were confirmed by RT-PCR and Sanger sequencing. The 5' UTR of 2082 genes, the 3' UTR of 2029 genes, and both the 5' and 3' UTRs of 730 genes were extended. Moreover, 17,345 SSRs, 14,789 complete ORFs, 1224 long non-coding RNAs (lncRNAs), and 650 transcription factors (TFs) from 37 families were detected.
Findings from this work not only refine the annotation of the A. mellifera reference genome, but also provide a valuable resource and basis for relevant molecular and -omics studies.