repeat analysis

  • Article type: Journal Article
    Bromus (Poaceae: Bromeae) is a genus of forage grasses with high adaptability and ecological and economic value. Here, we sequenced the chloroplast genomes of Bromus ciliatus, Bromus benekenii, Bromus riparius, and Bromus rubens and compared them with those of four previously described species. The genome sizes of the Bromus species ranged from 136,934 bp (Bromus vulgaris) to 137,189 bp (Bromus ciliatus, Bromus biebersteinii), with a typical quadripartite structure. The studied species had 129 genes: 83 protein-coding, 38 tRNA-coding, and 8 rRNA-coding genes. GC content was highest in the inverted repeat (IR) region (43.85-44.15%), followed by the large single-copy (LSC) region (36.25-36.65%) and the small single-copy (SSC) region (32.21-32.46%). There were 33 high-frequency codons, of which 90.91% ended in A/U. A total of 350 simple sequence repeats (SSRs) were identified, with mononucleotide repeats being the most common (61.43%). A total of 228 forward and 141 palindromic repeats were identified; no reverse or complementary repeats were detected. Sequence identity was high across all genomes, especially in the protein-coding and inverted repeat regions. Seven highly variable regions were detected, which could be used for molecular marker development. The constructed phylogenetic tree indicates that Bromus is a monophyletic taxon closely related to Triticum. This comparative analysis of Bromus chloroplast genomes provides a scientific basis for species identification and phylogenetic studies.
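The regional GC content and SSR counts reported above can be illustrated with a minimal Python sketch. The abstract does not name the tools or thresholds used, so the minimum repeat counts in `MIN_REPEATS` and the function names `gc_content` and `find_ssrs` are assumptions for illustration only. The sketch reports perfect repeats and does not resolve overlaps between different motif lengths.

```python
import re

# Minimum number of repeat units per motif length.
# ASSUMPTION: illustrative thresholds; the abstract does not state the real ones.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def gc_content(seq: str) -> float:
    """Return the GC content of seq as a percentage of its length."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def is_primitive(motif: str) -> bool:
    """True if motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq: str, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs, 0-based start."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GCGCGC" + "A" * 12 + "CGC" + "AT" * 6 + "G"  # toy sequence, not real data
    print(round(gc_content(toy), 2))
    print(find_ssrs(toy))
```

Running `gc_content` separately on the LSC, SSC, and IR slices of an annotated genome would give per-region values of the kind quoted in the abstract.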

  • Article type: Journal Article
    Biallelic pathogenic repeat expansions in RFC1 were recently identified as the molecular origin of cerebellar ataxia, neuropathy, vestibular areflexia syndrome (CANVAS) and as one of the most common causes of adult-onset ataxia. In the meantime, the phenotypic spectrum has expanded massively and now includes mimics of multiple system atrophy and parkinsonism. After identifying a patient with a clinical diagnosis of amyotrophic lateral sclerosis (ALS) as a carrier of biallelic pathogenic repeat expansions in RFC1, we studied a cohort of 106 additional patients whose main clinical phenotype was motor neuron disease (MND) to analyze whether such repeat expansions are more common in MND patients. Indeed, two additional MND patients (one also with ALS and one with primary lateral sclerosis, PLS) were identified as carriers of biallelic pathogenic repeat expansions in RFC1 in the absence of another genetic alteration explaining the phenotype, suggesting motor neuron disease as another extreme phenotype of RFC1 spectrum disorder. Therefore, MND might belong to the expanding phenotypic spectrum of pathogenic RFC1 repeat expansions, particularly in MND patients with additional features such as sensory and/or autonomic neuropathy, vestibular deficits, or cerebellar signs. By systematically analyzing the RFC1 repeat array using Oxford Nanopore long-read sequencing, our study highlights the high intra- and interallelic heterogeneity of this locus and identifies the novel repeat motif 'ACAAG'.

  • Article type: Journal Article
    Brasenia schreberi, a plant species traditionally used in Chinese medicine and cuisine, represents an early evolutionary stage among flowering plants (angiosperms). While the plastid genome of this species has been published, its mitochondrial genome (mitogenome) has not been extensively explored, and thorough comparative analyses of its organellar genomes are notably absent. In our study, we assembled the entire mitogenome of B. schreberi using sequencing data from both the Illumina platform and Oxford Nanopore. The B. schreberi mitogenome mostly exists as six circular DNA molecules, the largest being 628,257 base pairs (bp) and the smallest 110,220 bp, amounting to 1.49 megabases (Mb). We then annotated the mitogenome of B. schreberi. The mitogenome contains a total of 71 genes: 40 protein-coding genes (PCGs), 28 transfer RNA (tRNA) genes, and 3 ribosomal RNA (rRNA) genes. In the analysis of codon usage, we noted a distinct codon preference for each amino acid. The most commonly used codons had an average RSCU of 1.36, indicating a noticeable bias in codon selection. The repeat sequence analysis identified 553 simple sequence repeats (SSRs), 1,822 dispersed repeats (comprising 1,015 forward and 807 palindromic repeats), and 608 long terminal repeats (LTRs). Additionally, in the analysis of homologous sequences between the organelle genomes, we detected 38 homologous sequences derived from the plastid genome, each exceeding 500 bp, within the B. schreberi mitochondrial genome. Notably, ten tRNA genes (trnC-GCA, trnM-CAU, trnI-CAU, trnQ-UUG, trnN-GUU, trnT-GGU, trnW-CCA, trnA-UGC, trnI-GAU, and trnV-GAC) appear to have been completely transferred from the chloroplast to the mitogenome. Using Deepred-mt to predict RNA editing sites in the mitogenome, we identified 675 high-quality RNA editing sites in the 40 mitochondrial PCGs. In the final stage of our study, we performed a collinearity analysis and inferred the phylogenetic relationship of B. schreberi with other angiosperms on the basis of the mitochondrial PCGs. The results showed that the non-coding regions of the B. schreberi mitogenome are characterized by an abundance of repetitive and exogenous sequences, and that B. schreberi is more closely related to Euryale ferox.
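The RSCU (relative synonymous codon usage) figure quoted above follows a standard definition: a codon's observed count divided by the mean count of all synonymous codons for the same amino acid. The sketch below is one way to compute it, assuming the standard genetic code and in-frame coding sequences. The function name `rscu` and the toy input are hypothetical, and the authors' actual software is not stated in the abstract.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
# ASSUMPTION: the standard code applies to the sequences analysed.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def rscu(cds_list):
    """Return {codon: RSCU} for sense codons of amino acids seen in cds_list."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(0, usable, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:  # skips codons with ambiguous bases
                counts[codon] += 1

    # Group codons into synonymous families by encoded amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if aa == "*" or total == 0:
            continue  # skip stop codons and amino acids that never occur
        for c in codons:
            # observed count / (family total / family size)
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    toy = ["GCTGCTGCTGCC"]  # four alanine codons; toy data, not from the paper
    for codon, value in sorted(rscu(toy).items()):
        print(codon, round(value, 2))
```

An RSCU above 1 means the codon is used more often than expected under equal usage of its synonyms, which is how a mean of 1.36 for the preferred codons indicates bias.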

  • Article type: Journal Article
    Corethrodendron fruticosum is a forage plant endemic to China with high ecological value. In this study, the complete chloroplast genome of C. fruticosum was sequenced using Illumina paired-end sequencing. The C. fruticosum chloroplast genome was 123,100 bp long and comprised 105 genes, including 74 protein-coding genes, 4 rRNA-coding genes, and 27 tRNA-coding genes. The genome had a GC content of 34.53% and contained 50 repetitive sequences and 63 simple sequence repeats, with no reverse repeats. The simple sequence repeats included 45 mononucleotide repeats, which accounted for the highest proportion and primarily comprised A/T repeats. A comparative analysis of C. fruticosum, C. multijugum, and four Hedysarum species revealed that the six genomes were highly conserved, with differences primarily located in the conserved non-coding regions. Moreover, the accD and clpP genes in the coding regions exhibited high nucleotide variability. Accordingly, these genes may serve as molecular markers for the classification and phylogenetic analysis of Corethrodendron species. Phylogenetic analysis further revealed that C. fruticosum and C. multijugum fell in clades distinct from the four Hedysarum species. The newly sequenced chloroplast genome provides further insights into the phylogenetic position of C. fruticosum, which is useful for the classification and identification of Corethrodendron.

  • Article type: Journal Article
    Phoenix Dancong tea, a variety of oolong tea, is produced in Chaozhou, Guangdong Province, China, and is characterized by numerous hybridizations and polyploidization. To assess the genetic diversity and phylogenetic relationships among Phoenix Dancong tea and other oolong teas, complete circular chloroplast genomes were constructed for thirty species of Phoenix Dancong tea from Chaozhou. The genome of Phoenix Dancong tea is a circular molecule of 157,041-157,137 bp, with a pair of inverted repeats (26,072-26,610 bp each) separated by a large single-copy region (86,615-86,658 bp) and a small single-copy region (18,264-18,284 bp). A total of 135 unique genes were encoded, including 90 protein-coding genes, 37 tRNAs, and 8 rRNAs. A comparative analysis with the other seven species in the oolong tea family that have been sequenced to date revealed similarities in structural organization, gene content, and gene arrangement. Repeat sequence analysis identified 17-23 tandem repeats, 20-24 forward repeats, and 25-27 palindromic repeats. Additionally, a total of 65-70 simple sequence repeats were detected, with mononucleotide repeats being the most common. Phylogenetic analyses showed that Phoenix Dancong tea and Fujian oolong tea clustered with other cultivated Camellia sinensis in the genus Camellia of the family Theaceae, while the two oolong tea species were relatively independently cross-embedded in the genus Camellia. Close genetic relationships were observed between Phoenix Dancong tea and other oolong teas, and the chloroplast genomes of oolong tea overall showed low variation and conserved evolution. The availability of Phoenix Dancong tea chloroplast genomes not only elucidates the relationships among oolong teas from different origins in Guangdong and Fujian but also provides valuable genetic resources to assist further molecular studies on the taxonomic and phylogenomic resolution of the genus Camellia.

  • Article type: Journal Article
    Physalis angulata var. villosa, rich in withanolides, has been used as a traditional Chinese medicine for many years. To date, few extensive molecular studies of this plant have been conducted. In the present study, the plastome of P. angulata var. villosa was sequenced, characterized, and compared with those of other Physalis species, and a phylogenetic analysis was conducted in the family Solanaceae. The plastome of P. angulata var. villosa was 156,898 bp in length with a GC content of 37.52%, and exhibited a quadripartite structure typical of land plants, consisting of a large single-copy (LSC, 87,108 bp) region, a small single-copy (SSC, 18,462 bp) region, and a pair of inverted repeats (IR: IRA and IRB, 25,664 bp each). The plastome contained 131 genes, of which 114 were unique and 17 were duplicated in the IR regions. The genome consisted of 85 protein-coding genes, eight rRNA genes, and 38 tRNA genes. A total of 38 long repeat sequences of three types were identified in the plastome, of which forward repeats had the highest frequency. Simple sequence repeat (SSR) analysis revealed a total of 57 SSRs, of which T mononucleotide repeats constituted the majority, with most SSRs located in the intergenic spacer regions. Comparative genomic analysis among nine Physalis species revealed that the single-copy regions were less conserved than the pair of inverted repeats, with most of the variation found in the intergenic spacer regions rather than in the coding regions. Phylogenetic analysis indicated a close relationship between Physalis and Withania. In addition, Iochroma, Dunalia, Saracha, and Eriolarynx were paraphyletic and clustered together in the phylogenetic tree. Our study reports the first sequence and assembly of the plastome of P. angulata var. villosa, offers a basic resource for evolutionary studies, and provides an important tool for evaluating phylogenetic relationships within the family Solanaceae.

  • Article type: Journal Article
    To systematically determine their phylogenetic relationships and develop molecular markers for species discrimination of Salvia bowleyana, S. splendens, and S. officinalis, we sequenced their chloroplast genomes using the Illumina HiSeq 2500 platform. The chloroplast genome lengths of S. bowleyana, S. splendens, and S. officinalis were 151,387 bp, 150,604 bp, and 151,163 bp, respectively. The six genes ndhB, rpl2, rpl23, rps7, rps12, and ycf2 were present in the IR regions. The chloroplast genomes of S. bowleyana, S. splendens, and S. officinalis contain 29 tandem repeats; 35, 29, and 24 simple sequence repeats; and 47, 49, and 40 interspersed repeats, respectively. Three specific intergenic sequences (IGS), rps16-trnQ-UUG, trnL-UAA-trnF-GAA, and trnM-CAU-atpE, were found to discriminate the 23 Salvia species. A total of 91 intergenic spacer sequences were identified through genetic distance analysis. Two specific IGS regions (trnG-GCC-trnM-CAU and ycf3-trnS-GGA) had the highest K2p values among the three studied Salvia species. Furthermore, the phylogenetic tree showed that the 23 Salvia species formed a monophyletic group. Two pairs of genus-specific DNA barcode primers were found. The results provide a solid foundation for understanding the phylogenetic classification of the three Salvia species. Moreover, the specific intergenic regions may make it possible to discriminate Salvia species on the basis of gene fragment differences as well as phenotype.
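The K2p values mentioned above refer to the Kimura two-parameter distance, which corrects the observed difference between two aligned sequences using separate proportions of transitions (P) and transversions (Q):

d = -(1/2) ln[(1 - 2P - Q) * sqrt(1 - 2Q)]

A minimal Python sketch of that formula follows. It assumes the two sequences are already aligned and skips any column containing a gap or ambiguous base. The function name `k2p_distance` is hypothetical and the abstract does not say which software the authors used.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID = PURINES | PYRIMIDINES


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in VALID or b not in VALID:
            continue  # skip gaps and ambiguous bases (assumed handling)
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if sites == 0:
        raise ValueError("no comparable sites")

    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError when the sequences are too divergent
    # for the correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


if __name__ == "__main__":
    # Toy alignment: one transition (A/G) and one transversion (A/C) in 10 sites.
    print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAC"))
```

Computing this distance per intergenic spacer across species pairs, then ranking the spacers, is one plausible way to find the most variable regions, though the abstract does not describe the exact procedure.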

  • Article type: Journal Article
    Instability of simple DNA repeats has been known as a common cause of hereditary ataxias for over 20 years. Routine genetic diagnostics of these phenotypically similar diseases still rely on an iterative workflow for quantification of repeat units by PCR-based methods of limited precision. We established and validated clinical nanopore Cas9-targeted sequencing (Clin-CATS), an amplification-free method for simultaneous analysis of 10 repeat loci associated with clinically overlapping hereditary ataxias. The method combines target enrichment by CRISPR-Cas9, Oxford Nanopore long-read sequencing, and a bioinformatics pipeline using the tools STRique and Megalodon for parallel detection of length, sequence, methylation, and composition of the repeat loci. Clin-CATS allowed for the precise and parallel analysis of 10 repeat loci associated with adult-onset ataxia and at the same time revealed additional parameters required for diagnosis, such as FMR1 promoter methylation and repeat sequence. Using Clin-CATS, we analysed 100 clinical samples of undiagnosed ataxia patients and identified causative repeat expansions in 28 patients. Parallel repeat analysis enabled a molecular diagnosis of ataxias independent of preconceptions based on clinical presentation. Biallelic expansions within RFC1 were identified as the most frequent cause of ataxia. We characterized the RFC1 repeat composition of all patients and identified a novel repeat motif, AGGGG. Our results highlight the power of Clin-CATS as a readily expandable workflow for the in-depth analysis and diagnosis of phenotypically overlapping repeat expansion disorders.
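To make the idea of "repeat composition" concrete, the following Python sketch run-length encodes a repeat array into consecutive pentanucleotide motifs and tallies how often each motif occurs. It is a toy illustration only, not the STRique/Megalodon pipeline described above: it assumes an error-free, already extracted repeat array that starts in frame, and the function names `motif_runs` and `motif_composition` and the example sequence are hypothetical.

```python
from collections import Counter


def motif_runs(array_seq: str, unit: int = 5):
    """Run-length encode a repeat array into (motif, copies) tuples.

    ASSUMPTION: the array starts in frame and contains no sequencing errors;
    a trailing partial unit is ignored.
    """
    seq = array_seq.upper()
    runs = []
    for i in range(0, len(seq) - unit + 1, unit):
        motif = seq[i:i + unit]
        if runs and runs[-1][0] == motif:
            runs[-1][1] += 1  # extend the current run of identical motifs
        else:
            runs.append([motif, 1])  # start a new run
    return [(motif, copies) for motif, copies in runs]


def motif_composition(array_seq: str, unit: int = 5) -> Counter:
    """Total copies of each motif across the whole array."""
    totals = Counter()
    for motif, copies in motif_runs(array_seq, unit):
        totals[motif] += copies
    return totals


if __name__ == "__main__":
    # Toy array mixing three pentanucleotide motifs; not patient data.
    toy = "AAAAG" * 3 + "AGGGG" * 4 + "AAAAG" * 2
    print(motif_runs(toy))
    print(motif_composition(toy))
```

Comparing the run structure of the two alleles of one sample, or of alleles across samples, is one simple way to visualise the intra- and interallelic heterogeneity that long-read data expose.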

  • Article type: Journal Article
    No abstract available.

  • Article type: Journal Article
    Chloroplast genome sequencing is an essential tool for understanding genome evolution and phylogenetic relationships. The available methods for constructing chloroplast genomes include chloroplast enrichment followed by long overlapping PCR, or extraction and assembly of chloroplast-specific reads from whole-genome datasets. In the present study, we propose an alternative strategy: extraction and assembly of chloroplast-specific reads from leaf transcriptome data of Pterocarpus santalinus using the bowtie2 aligner. The assembled genome was compared with the published chloroplast genome of P. santalinus for genome size, number of predicted genes, microsatellite repeat motifs, and nucleotide repeats. A near-complete chloroplast genome was assembled from the transcriptome reads. The proposed method requires less computational time and know-how and limited virtual memory, and is cost-effective compared with whole-genome sequencing. Assembly of the chloroplast (Cp) genome from transcriptome data will enhance the resolution of phylogenetic studies through comparative plastome analysis, facilitate accurate species/genotype discrimination, and accelerate the development of transplastomic plants with enhanced biotic and abiotic tolerance.
    The online version contains supplementary material available at 10.1007/s13205-021-02943-0.