Keywords: Aethionema arabicum; Dimorphic seeds; RNA-seq; Reference and reference-free; Transcriptome

MeSH: Brassicaceae / genetics, growth & development; Gene Expression Profiling; Gene Expression Regulation, Plant; Gene Ontology; Genome, Plant; Germination; High-Throughput Nucleotide Sequencing; Molecular Sequence Annotation; Plant Proteins / genetics; Seeds / genetics, growth & development; Transcriptome

Source: DOI:10.1186/s12864-019-5452-4

Abstract:
BACKGROUND: RNA-sequencing analysis is increasingly utilized to study gene expression in non-model organisms without sequenced genomes. Aethionema arabicum (Brassicaceae) exhibits seed dimorphism as a bet-hedging strategy - producing both a less dormant mucilaginous (M+) seed morph and a more dormant non-mucilaginous (NM) seed morph. Here, we compared de novo and reference-genome based transcriptome assemblies to investigate Ae. arabicum seed dimorphism and to evaluate the reference-free versus -dependent approach for identifying differentially expressed genes (DEGs).
RESULTS: A de novo transcriptome assembly was generated using sequences from M+ and NM Ae. arabicum dry seed morphs. The transcripts of the de novo assembly contained 63.1% complete Benchmarking Universal Single-Copy Orthologs (BUSCO) compared to 90.9% for the transcripts of the reference genome. DEG detection used the strict consensus of three methods (DESeq2, edgeR and NOISeq). Only 37% of 1533 differentially expressed de novo assembled transcripts paired with 1876 genome-derived DEGs. Gene Ontology (GO) terms distinguished the seed morphs: the terms translation and nucleosome assembly were overrepresented in DEGs higher in abundance in M+ dry seeds, whereas terms related to mRNA processing and transcription were overrepresented in DEGs higher in abundance in NM dry seeds. DEGs amongst these GO terms included ribosomal proteins and histones (higher in M+), RNA polymerase II subunits and related transcription and elongation factors (higher in NM). Expression of the inferred DEGs and other genes associated with seed maturation (e.g. those encoding late embryogenesis abundant proteins and transcription factors regulating seed development and maturation such as ABI3, FUS3, LEC1 and WRI1 homologs) were put in context with Arabidopsis thaliana seed maturation and indicated that M+ seeds may desiccate and mature faster than NM. The 1901 transcriptomic DEG set GO-terms had almost 90% overlap with the 2191 genome-derived DEG GO-terms.
CONCLUSIONS: Whilst there was only modest overlap of DEGs identified in reference-free versus -dependent approaches, the resulting GO analysis was concordant in both approaches. The identified differences in dry seed transcriptomes suggest mechanisms underpinning previously identified contrasts between morphology and germination behaviour of M+ and NM seeds.
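
The "strict consensus of three methods" mentioned in the results can be read as a set intersection: a gene counts as a DEG only if every method calls it. The sketch below illustrates that reading, along with a simple percentage-overlap measure of the kind used to compare two DEG sets. It is an illustration under stated assumptions, not the authors' pipeline: the function names `strict_consensus` and `percent_overlap` and all gene IDs are hypothetical placeholders, and the real per-method calls would come from DESeq2, edgeR and NOISeq output.

```python
def strict_consensus(*call_sets):
    """Return only the IDs that every method called differentially expressed."""
    return set.intersection(*(set(calls) for calls in call_sets))


def percent_overlap(query, reference):
    """Percentage of IDs in `query` that are also present in `reference`."""
    query = set(query)
    if not query:
        return 0.0
    return 100.0 * len(query & set(reference)) / len(query)


# Hypothetical per-method DEG calls (placeholder IDs, not data from the study).
deseq2_calls = {"g1", "g2", "g3", "g4"}
edger_calls = {"g2", "g3", "g4", "g5"}
noiseq_calls = {"g3", "g4", "g6"}

# Strict consensus: keep a gene only if all three methods agree.
consensus_degs = strict_consensus(deseq2_calls, edger_calls, noiseq_calls)
print(sorted(consensus_degs))

# Hypothetical de novo DEGs already mapped to gene IDs, compared with the consensus.
de_novo_degs = {"g3", "g7", "g8"}
print(round(percent_overlap(de_novo_degs, consensus_degs), 1))
```

Taking the intersection trades sensitivity for specificity: a gene missed by any one method is dropped, which is one reason consensus DEG lists tend to be shorter than any single method's list.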