Pacific Biosciences

  • Article type: Journal Article
    Our goal was to assess the accuracy of next-generation sequencing (NGS) compared with Sanger. We performed single genome amplification (SGA) of HIV-1 gp160 on extracted tissue DNA from two HIV+ individuals. Amplicons (n = 30) were sequenced with Sanger or reamplified with barcoded primers and pooled before sequencing using Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PB). For each amplicon, a consensus sequence for NGS reads was obtained by (1) mapping reads to the Sanger sequence when available ("reference-based") or (2) mapping reads to a "pseudo-reference" sequence, i.e., a consensus sequence of a subset of NGS reads ("reference-free"). PB reads were clustered based on genetic similarity. A Sanger consensus sequence was obtained for 23/30 amplicons, for which all NGS consensus sequences were identical (n = 9) or nearly identical (n = 14) compared with Sanger. For the nine mismatches between Sanger/NGS, the nucleotide in the NGS sequence matched all other sequences from that patient. Of the 7/30 amplicons without a Sanger sequence, NGS sequences had ≥35 ambiguous calls in five amplicons and 0 ambiguities in two amplicons. Analysis of the electropherograms showed failure of a single sequencing primer for the latter two amplicons (consistent with a single template) and overlapping peaks for the other five (consistent with multiple templates). Clustering results closely followed the Sanger/NGS consensus results, where amplicons derived from a single template also had a single cluster and vice versa (with one exception, which could be the result of barcode misidentification). Representative sequences from the clusters contained 2-13 differences compared with Sanger/NGS. In summary, we show that both ONT and PB can produce amplicon consensus sequences with similar or higher accuracy compared with Sanger and, importantly, without the need for a known reference sequence. Clustering could be useful in some circumstances to predict or confirm the presence of multiple starting templates.
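The consensus and ambiguity-counting idea in this abstract can be illustrated with a minimal Python sketch. It assumes the reads have already been aligned to a common coordinate system (the mapping step itself is not shown). The names `column_consensus` and `count_ambiguous` and the 0.7 majority threshold are illustrative assumptions, not the authors' pipeline.

```python
from collections import Counter


def column_consensus(aligned_reads, min_fraction=0.7):
    """Majority-vote consensus over equal-length, pre-aligned reads.

    A column is called 'N' (ambiguous) when its most common base is
    supported by fewer than `min_fraction` of the reads. The threshold
    is a hypothetical choice made for this sketch.
    """
    if not aligned_reads:
        raise ValueError("need at least one read")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    consensus = []
    for column in zip(*aligned_reads):
        base, count = Counter(column).most_common(1)[0]
        consensus.append(base if count / len(column) >= min_fraction else "N")
    return "".join(consensus)


def count_ambiguous(consensus):
    """Number of ambiguous ('N') calls in a consensus sequence."""
    return consensus.count("N")


# Reads from one dominant template give a clean consensus.
single = column_consensus(["ACGT", "ACGT", "ACGA", "ACGT"])
# An even mix of two templates produces ambiguous calls.
mixed = column_consensus(["ACGT", "ACGT", "TCGA", "TCGA"])
```

In this toy setting, a high count from `count_ambiguous` plays the role that the abstract assigns to many ambiguous calls: a hint that an amplicon may derive from more than one starting template.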

  • Article type: Journal Article
    Here, I report the complete genome sequence of Vibrio sp. strain AH4, which was isolated from moribund farmed Nile tilapia (Oreochromis niloticus). Assessment of the genome sequence of this strain revealed the presence of two linear chromosomes of 2,894,109 bp and 1,082,372 bp.

  • Article type: Journal Article
    Whole-genome sequencing of Staphylococcus epidermidis strain AH3, isolated from moribund farmed Nile tilapia (Oreochromis niloticus), was performed using a combination of the Illumina and Pacific Biosciences (PacBio) sequencing platforms. The genome sequence is composed of a single chromosome of 2,464,380 bp with a GC content of 32.2% and 2,220 predicted protein-coding genes.
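For readers unfamiliar with the GC content statistic quoted here, the following Python sketch shows one common way to compute it. The function name `gc_content` and the decision to ignore non-ACGT symbols are assumptions made for illustration. The abstract does not state how its 32.2% figure was calculated.

```python
def gc_content(sequence):
    """Percentage of G and C among unambiguous bases (A, C, G, T).

    Symbols outside ACGT (for example 'N') are skipped. That is an
    assumption of this sketch, not a universal convention.
    """
    called = [base for base in sequence.upper() if base in "ACGT"]
    if not called:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for base in called if base in "GC")
    return 100.0 * gc / len(called)


# Toy sequences only; no real genome data is used here.
half = gc_content("ATGC")
fifth = gc_content("aattgNNN")
```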

  • Article type: Journal Article
    Recent studies on marine organisms have made use of third-generation sequencing technologies such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). While these specialized bioinformatics tools have different algorithmic designs and performance capabilities, they offer scalability and can be applied to various datasets. We investigated the effectiveness of PacBio and ONT RNA sequencing methods in identifying the venom of the jellyfish species Nemopilema nomurai. We conducted a detailed analysis of the sequencing data from both methods, focusing on key characteristics such as CD, alternative splicing, long-chain noncoding RNA, simple sequence repeat, transcription factor, and functional transcript annotation. Our findings indicate that ONT generally produced higher raw data quality in the transcriptome analysis, while PacBio generated longer read lengths. PacBio was found to be superior in identifying CDs and long-chain noncoding RNA, whereas ONT was more cost-effective for predicting alternative splicing events, simple sequence repeats, and transcription factors. Based on these results, we conclude that PacBio is the most specific and sensitive method for identifying venom components, while ONT is the most cost-effective method for studying venogenesis, cnidocyst (venom gland) development, and transcription of virulence genes in jellyfish. Our study has implications for future sequencing technologies in marine jellyfish, and highlights the power of full-length transcriptome analysis in discovering potential therapeutic targets for jellyfish dermatitis.

  • Article type: Preprint
    Recently, Pacific Biosciences released a new highly accurate long-read sequencer called the Revio System that is projected to generate 30× HiFi whole-genome sequencing for the human genome within one sequencing SMRT Cell. Mouse and human genomes are similar in size. In this study, we sought to test this new sequencer by characterizing the genome and epigenome of the mouse neuronal cell line Neuro-2a. We generated long-read HiFi whole-genome sequencing on three Revio SMRT Cells, achieving a total coverage of 98×, with 30×, 32×, and 36× coverage respectively for each of the three Revio SMRT Cells. We performed several tests on these data including single-nucleotide variant and small insertion detection using GPU-accelerated DeepVariant, structural variant detection with pbsv, methylation detection with pb-CpG-tools, and generating de novo assemblies with the HiCanu and hifiasm assemblers. Overall, we find consistency across SMRT Cells in coverage, detection of variation, methylation, and de novo assemblies for each of the three SMRT Cells.

  • Article type: Journal Article
    The Lesser Prairie-Chicken (Tympanuchus pallidicinctus; LEPC) is an iconic North American prairie grouse, renowned for ornate and spectacular breeding season displays. Unfortunately, the species has disappeared across much of its historical range, with corresponding precipitous declines in contemporary population abundance, largely due to climatic and anthropogenic factors. These declines led to a 2022 US Fish and Wildlife decision to identify and list two distinct population segments (DPSs; i.e., northern and southern DPSs) as threatened or endangered under the 1973 Endangered Species Act. Herein, we describe an annotated reference genome that was generated from a LEPC sample collected from the southern DPS. We chose a representative from the southern DPS because of the potential for introgression in the northern DPS, where some populations hybridize with the Greater Prairie-Chicken (Tympanuchus cupido). This new LEPC reference assembly consists of 206 scaffolds, an N50 of 45 Mb, and 15,563 predicted protein-coding genes. We demonstrate the utility of this new genome assembly by estimating genome-wide heterozygosity in a representative LEPC and in related species. Heterozygosity in a LEPC sample was 0.0024, near the middle of the range (0.0003-0.0050) of related species. Overall, this new assembly provides a valuable resource that will enhance evolutionary and conservation genetic research in prairie grouse.

  • Article type: Journal Article
    Long-read sequencing technologies such as isoform sequencing can generate highly accurate sequences of full-length mRNA transcript isoforms. Such long-read transcriptomics may be especially useful in investigations of lymphocyte functional plasticity as it relates to human health and disease. However, no long-read isoform-aware reference transcriptomes of human circulating lymphocytes are readily available despite being valuable as benchmarks in a variety of transcriptomic studies. To begin to fill this gap, we purified 4 lymphocyte populations (CD4+ T, CD8+ T, NK, and Pan B cells) from the peripheral blood of a healthy male donor and obtained high-quality RNA (RIN > 8) for isoform sequencing and parallel RNA-Seq analyses. Many novel polyadenylated transcript isoforms, supported by both isoform sequencing and RNA-Seq data, were identified within each sample. The datasets met several metrics of high quality and have been deposited to the Gene Expression Omnibus database (GSE202327, GSE202328, GSE202329) as both raw and processed files to serve as long-read reference transcriptomes for future studies of human circulating lymphocytes.

  • Article type: Journal Article
    Throughout the entirety of human history, bacterial pathogens have played an important role and even shaped the fate of civilizations. The application of genomics within the last 27 years has radically changed the way we understand the biology and evolution of these pathogens. In this review, we discuss how the short- (Illumina) and long-read (PacBio, Oxford Nanopore) sequencing technologies have shaped the discipline of bacterial pathogen genomics, in terms of fundamental research (i.e., evolution of pathogenicity), forensics, food safety, and routine clinical microbiology. We have mined and discuss some of the most prominent data/bioinformatics resources such as NCBI pathogens, PATRIC, and Pathogenwatch. Based on this mining, we present some of the most popular sequencing technologies, hybrid approaches, assemblers, and annotation pipelines. A small number of bacterial pathogens are of very high importance, and we also present the wealth of the genomic data for these species (i.e., which ones they are, the number of antimicrobial resistance genes per genome, the number of virulence factors). Finally, we discuss how this discipline will probably be transformed in the near future, especially by transitioning into metagenome-assembled genomes (MAGs), thanks to long-read sequencing.

  • Article type: Journal Article
    Cervids are distinguished by the shedding and regrowth of antlers. Furthermore, they provide insights into prion and other diseases. Genomic resources can facilitate studies of the genetic underpinnings of deer phenotypes, behavior, and disease resistance. Widely distributed in North America, the white-tailed deer (Odocoileus virginianus) has recreational, commercial, and food source value for many households. We present a genome generated using DNA from a single Illinois white-tailed deer, sequenced on the PacBio Sequel II platform and assembled using Wtdbg2. Omni-C chromatin conformation capture sequencing was used to scaffold the genome contigs. The final assembly was 2.42 Gb, consisting of 508 scaffolds with a contig N50 of 21.7 Mb, a scaffold N50 of 52.4 Mb, and a BUSCO complete score of 93.1%. Thirty-six chromosome pseudomolecules comprised 93% of the entire sequenced genome length. A total of 20,651 genes predicted using the BRAKER pipeline were validated using InterProScan. Chromosome-length assembly sequences were aligned to the genomes of related species to reveal corresponding chromosomes.
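The contig and scaffold N50 values reported in this abstract follow a standard definition: the length L such that sequences of length L or longer together cover at least half of the total assembly. A minimal Python sketch is given below. The function name `n50` and the toy length lists are assumptions for illustration and do not reproduce the reported assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Lengths are sorted from longest to shortest and accumulated until
    the running sum reaches at least half of the total length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example: total length is 54, and 10 + 9 + 8 = 27 reaches half.
toy = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

Because N50 depends only on the length distribution, it describes contiguity and says nothing about base-level accuracy, which is why the abstract also reports a BUSCO score.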

  • Article type: Journal Article
    Recent development of long-read sequencing platforms has enabled researchers to explore bacterial community structure through analysis of the full-length 16S rRNA gene (∼1,500 bp) or the 16S-ITS-23S rRNA operon region (∼4,300 bp), resulting in higher taxonomic resolution than short-read sequencing platforms. Despite the potential of long-read sequencing in metagenomics, resources and protocols for this technology are scarce. Here, we describe MIrROR, a database and analysis tool for metataxonomics using the bacterial 16S-ITS-23S rRNA operon region. We collected 16S-ITS-23S rRNA operon sequences extracted from bacterial genomes in NCBI GenBank and curated them. A total of 97,781 16S-ITS-23S rRNA operon sequences covering 9,485 species from 43,653 genomes were obtained. For user convenience, we provide an analysis tool based on a mapping strategy that can be used for taxonomic profiling with the MIrROR database. To benchmark MIrROR, we compared its performance against publicly available databases and tools using mock communities and simulated data sets. Our platform showed promising results in terms of the number of species covered and the accuracy of classification. To encourage active 16S-ITS-23S rRNA operon analysis in the field, a BLAST function and taxonomic profiling results for 16S-ITS-23S rRNA operon studies that have been reported as BioProjects on NCBI are provided. MIrROR (http://mirror.egnome.co.kr/) will be a useful platform for researchers who want to perform high-resolution metagenome analysis with a cost-effective sequencer such as MinION from Oxford Nanopore Technologies.
    IMPORTANCE Metabarcoding is a powerful tool to investigate community diversity in an economical and efficient way by amplifying a specific gene marker region. With the advancement of long-read sequencing technologies, the field of metabarcoding has entered a new phase. These technologies have brought a need for development in several areas, including new markers that long reads can cover, databases for the markers, tools that reflect long-read characteristics, and compatibility with downstream analysis tools. By constructing MIrROR, we met the need for a database and tools for the 16S-ITS-23S rRNA operon region, which has recently been shown to have sufficient resolution at the species level. Bacterial community analysis using the 16S-ITS-23S rRNA operon region with MIrROR will provide new insights in various research fields.