de novo assembly

  • Article type: Journal Article
    While the air microbiome and its diversity are essential for human health and ecosystem resilience, comprehensive air microbial diversity monitoring has remained rare, so that little is known about the air microbiome's composition, distribution, or functionality. Here we show that nanopore sequencing-based metagenomics can robustly assess the air microbiome in combination with active air sampling through liquid impingement and tailored computational analysis. We provide fast and portable laboratory and computational approaches for air microbiome profiling, which we leverage to robustly assess the taxonomic composition of the core air microbiome of a controlled greenhouse environment and of a natural outdoor environment. We show that long-read sequencing can resolve species-level annotations and specific ecosystem functions through de novo metagenomic assemblies despite the low amount of fragmented DNA used as an input for nanopore sequencing. We then apply our pipeline to assess the diversity and variability of an urban air microbiome, using Barcelona, Spain, as an example; this randomized experiment gives first insights into the presence of highly stable location-specific air microbiomes within the city's boundaries, and showcases the robust microbial assessments that can be achieved through automatable, fast, and portable nanopore sequencing technology.

  • Article type: Journal Article
    The shape of rice grains not only determines the thousand-grain weight but also correlates closely with grain quality. Here we identified an ultra-large grain accession (ULG) with a thousand-grain weight exceeding 60 g. An integrated analysis combining QTL mapping, BSA, de novo genome assembly, transcriptome sequencing, and gene editing was conducted to dissect the molecular basis of ULG formation. The ULG pyramided advantageous alleles from at least four known grain-shaping genes, OsLG3, OsMADS1, GS3, and GL3.1, and one novel locus, qULG2-b, which encodes a leucine-rich repeat receptor-like kinase. The collective impacts of OsLG3, OsMADS1, GS3, and GL3.1 on grain size were confirmed in transgenic plants and near-isogenic lines. Transcriptome analysis identified 112 genes cooperatively regulated by these four genes that were prominently involved in photosynthesis and carbon metabolism. By leveraging the pleiotropy of these genes, we enhanced the grain yield, appearance, and stress tolerance of rice var. SN265. Beyond showcasing that pyramiding multiple grain size regulation genes can produce ULG, our study provides a theoretical framework and valuable genomic resources for improving rice varieties by leveraging the pleiotropy of grain size-regulating genes.

  • Article type: Journal Article
    A pioneering pink cultivar of Auricularia cornea, first commercially cultivated in 2022, lacks genomic data, hindering research in genetic breeding, gene discovery, and product development. Here, we report the de novo assembly of the pink A. cornea Fen-A1 genome and provide a detailed functional annotation. The genome is 73.17 Mb in size, comprises 86 scaffolds (N50 ∼ 5.49 Mb), has a GC content of 59.09%, and encodes 19,120 predicted genes with a BUSCO completeness of 92.60%. Comparative genomic analysis reveals the phylogenetic relatedness of Fen-A1 and remarkable gene family dynamics. Putative genes were mapped to 3 antibiotic-related, 36 light-dependent and 25 terpene metabolites. In addition, 789 CAZyme genes were classified, revealing the dynamics of quality loss due to postharvest refrigeration. Overall, our work is the first report on a pink A. cornea genome and provides a comprehensive insight into its complex functions.
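
The scaffold N50 quoted in this abstract is a contiguity statistic: sort the scaffolds from longest to shortest, add up their lengths, and report the length of the scaffold at which the running total first reaches half of the total assembly length. A minimal Python sketch of that definition follows. The function name `n50` and the toy lengths are illustrative assumptions; they are not data from the Fen-A1 assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, in bp): the total is 100,
# and 40 + 30 = 70 is the first running sum to reach 50.
toy_lengths = [10, 20, 30, 40]
toy_n50 = n50(toy_lengths)
print(toy_n50)
```

In practice the lengths would be read from the assembly FASTA; that step is left out here because the abstract does not say how the authors computed the statistic.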

  • Article type: Journal Article
    Many questions in biology benefit greatly from the use of a variety of model systems. High-throughput sequencing methods have been a triumph in the democratization of diverse model systems. They allow for the economical sequencing of an entire genome or transcriptome of interest, and with technical variations can even provide insight into genome organization and the expression and regulation of genes. The analysis and biological interpretation of such large datasets can present significant challenges that depend on the 'scientific status' of the model system. While high-quality genome and transcriptome references are readily available for well-established model systems, the establishment of such references for an emerging model system often requires extensive resources such as finances, expertise and computational capabilities. The de novo assembly of a transcriptome represents an excellent entry point for genetic and molecular studies in emerging model systems as it can efficiently assess gene content while also serving as a reference for differential gene expression studies. However, the process of de novo transcriptome assembly is non-trivial, and as a rule must be empirically optimized for every dataset. For the researcher working with an emerging model system, and with little to no experience with assembling and quantifying short-read data from the Illumina platform, these processes can be daunting. In this guide we outline the major challenges faced when establishing a reference transcriptome de novo and we provide advice on how to approach such an endeavor. We describe the major experimental and bioinformatic steps, and provide some broad recommendations and cautions for the newcomer to de novo transcriptome assembly and differential gene expression analyses. Moreover, we provide an initial selection of tools that can assist in the journey from raw short-read data to assembled transcriptome and lists of differentially expressed genes.

  • Article type: Journal Article
    Genome-wide information has so far been unavailable for ribbon worms of the clade Hoplonemertea, the most species-rich class within the phylum Nemertea. While species within Pilidiophora, the sister clade of Hoplonemertea, possess a pilidium larval stage and lack stylets on their proboscis, Hoplonemertea species have a planuliform larva and are armed with stylets employed for the injection of toxins into their prey. To further compare these developmental, physiological, and behavioral differences from a genomic perspective, the availability of a reference genome for a Hoplonemertea species is crucial. Such data will be highly useful for future investigations toward a better understanding of molecular ecology, venom evolution, and regeneration not only in Nemertea but also in other marine invertebrate phyla. To this end, we herein present the annotated chromosome-level genome assembly for Emplectonema gracile (Nemertea; Hoplonemertea; Monostilifera; Emplectonematidae), an easily collected nemertean well suited for laboratory experimentation. The genome has an assembly size of 157.9 Mb. Hi-C scaffolding yielded chromosome-level scaffolds, with a scaffold N50 of 10.0 Mb and a score of 95.1% for complete BUSCO genes found as a single copy. Annotation predicted 20,684 protein-coding genes. The high-quality reference genome reaches an Earth BioGenome standard level of 7.C.Q50.

  • Article type: Journal Article
    Panax japonicus Meyer, a perennial herb of the dicotyledonous family Araliaceae, is a rare folk traditional Chinese medicine, known as "the king of herbal medicine" in China. To understand the genes involved in secondary pathways under drought and salt stress, transcriptomic analysis of P. japonicus is of vital importance. Transcriptome sequencing of underground rhizomes, stems, and leaves of P. japonicus under drought and salt stress was performed using the Illumina HiSeq platform. After de novo assembly of transcripts, expression profiling was performed and differentially expressed genes (DEGs) were identified. Furthermore, putative functions of identified DEGs correlated with ginsenoside in P. japonicus were explored using Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis. A total of 221,804 unigenes were obtained from the transcriptome of P. japonicus. Further analysis revealed that 10,839 unigenes were mapped to 91 KEGG pathways. In addition, two metabolic pathways of P. japonicus related to triterpene saponin synthesis and responsive to drought and salt stress were screened. The sesquiterpene and triterpene metabolic pathways were annotated, and correlation analysis between gene expression and ginsenoside content identified four putatively involved genes: β-amyrin synthase, isoprene synthase, squalene epoxidase, and 1-deoxy-D-ketose-5-phosphate synthase. Our results pave the way for screening highly expressed genes and mining genes related to triterpenoid saponin synthesis. They also provide valuable references for the study of genes involved in ginsenoside biosynthesis and signaling pathways of P. japonicus.

  • Article type: Journal Article
    The extremely high levels of genetic polymorphism within the human major histocompatibility complex (MHC) limit the usefulness of reference-based alignment methods for sequence assembly. We incorporate a short-read de novo assembly algorithm into a workflow for novel application to the MHC. MHConstructor is a containerized pipeline designed for high-throughput, haplotype-informed, reproducible assembly of both whole genome sequencing and target-capture short-read data in large population cohorts. To date, no other self-contained tool exists for the generation of de novo MHC assemblies from short-read data. MHConstructor facilitates widespread access to high-quality, alignment-free MHC sequence analysis.

  • Article type: Journal Article
    OBJECTIVE: Ottelia Pers. is in the Hydrocharitaceae family. Species in the genus are aquatic, and China is their centre of origin in Asia. Ottelia alismoides (L.) Pers., which is distributed worldwide, is a distinguishing element in China, while other species of this genus are endemic to China. However, O. alismoides is also considered endangered due to habitat loss and pollution in some Asian countries. Ottelia alismoides is the only submerged macrophyte that contains three carbon dioxide-concentrating mechanisms, i.e. bicarbonate (HCO3-) use, crassulacean acid metabolism and the C4 pathway. In this study, we present its first genome assembly to help illustrate the various carbon metabolism mechanisms and to enable genetic conservation in the future.
    METHODS: Using DNA and RNA extracted from one O. alismoides leaf, this work produced ∼ 73.4 Gb HiFi reads, ∼ 126.4 Gb whole genome sequencing short reads and ∼ 21.9 Gb RNA-seq reads. The de novo genome assembly was 6,455,939,835 bp in length, with 11,923 scaffolds/contigs and an N50 of 790,733 bp. Genome assembly completeness assessment with Benchmarking Universal Single-Copy Orthologs revealed a score of 94.4%. The repetitive sequence in the assembly was 4,875,817,144 bp (75.5%). A total of 116,176 genes were predicted. The protein sequences were functionally annotated against multiple databases, facilitating comparative genomic analysis.
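
The repeat fraction stated in the METHODS paragraph can be checked directly from the two base-pair counts given there. The short Python sketch below does only that arithmetic; the variable and function names are illustrative, and the two numbers are copied from the abstract.

```python
# Base-pair counts as reported in the abstract.
assembly_bp = 6_455_939_835  # total assembly length
repeat_bp = 4_875_817_144    # repetitive sequence in the assembly


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


repeat_pct = percent(repeat_bp, assembly_bp)
print(f"{repeat_pct:.1f}%")  # prints 75.5%, matching the reported value
```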

  • Article type: Journal Article
    In the realm of food nutritional security, the development of mineral-rich grains assumes a pivotal role in combating malnutrition. Within the scope of the current investigation, we endeavoured to discern the transcripts accountable for the improved accumulation of grain-Fe within Indian barnyard millet. This pursuit entailed transcriptome sequencing of genotypes BAR-1433 (with high Fe content) and BAR-1423 (with low Fe content) during two distinct stages of spike development: spike emergence and the milking stage. At spike emergence, we identified 895 up-regulated transcripts and 126 down-regulated transcripts that delineated the difference between the high and low grain-Fe genotypes. In contrast, during the milking stage, the tally of up-regulated transcripts reached 436, while down-regulated transcripts numbered 285. The transcripts that were consistently up-regulated in both developmental stages underwent functional annotation, aligning their roles with nucleolar proteins, metal-nicotianamine transporters, ribonucleoprotein complexes, vinorine synthases, cellulose synthases, auxin response factors, embryogenesis abundant proteins, cytochrome c oxidases, and zinc finger BED domain-containing proteins. Meanwhile, a heterogeneous spectrum of transcripts exhibited differential expression and upregulation throughout the distinct stages. These transcripts encompassed various facets, such as ABC Transporter family proteins, Calcium-dependent kinase family, Ferritin, Metal ion binding, Iron-sulfur cluster binding, Cytochrome family, Zinc finger transcription factor family, Ferredoxin-NADP reductase type 1 family, Putative laccase, Multicopper oxidase family, and Terpene synthase family. To authenticate the reliability of these transcripts, six contigs representing probable functions, including metal transporters, iron-sulfur coordination, metal ion binding, auxin-responsive GH3-like protein 2, and cytochrome P450 71B16, were harnessed for primer design. Subsequently, these primers were utilized in the validation process through qRT-PCR, with the outcomes aligning with the transcriptome results. This study chronicles a constellation of genes linked to elevated iron content within barnyard millet, showcasing a proof of concept for leveraging transcriptome insights in marker-assisted selection to fortify barnyard millet with iron. This marks the inaugural comprehensive transcriptome analysis delineating transcripts associated with varying levels of grain-iron content during the panicle developmental stages within the barnyard millet paradigm.

  • DOI:
    Article type: Journal Article
    The Rüppell's fox (Vulpes rueppellii) inhabits desert regions across North Africa, the Arabian Peninsula and southwestern Asia. Its phylogenetic relationship with other fox species, especially within the phylogeographic context of its sister species, V. vulpes, remains unclear. We here report the sequencing and de novo assembly of the first annotated mitogenome of V. rueppellii, analysed with data from other foxes (tribe Vulpini, subfamily Caninae). We used four bioinformatic approaches to reconstruct the V. rueppellii mitogenome, obtaining identical sequences except for the incompletely assembled tandem-repeat region within the D-loop. The mitogenome displayed the same organization, number and length of genes as that of V. vulpes. We found high support for clustering of both known subclades of V. rueppellii within the Palearctic clade of V. vulpes, rendering the latter species paraphyletic, consistent with previous analyses of shorter mtDNA fragments. More work is needed for a full understanding of the evolutionary drivers and consequences of hybridization in foxes.