Pacific Biosciences

  • Article type: Journal Article
    Our goal was to assess the accuracy of next generation sequencing (NGS) compared with Sanger. We performed single genome amplification (SGA) of HIV-1 gp160 on extracted tissue DNA from two HIV+ individuals. Amplicons (n = 30) were sequenced with Sanger or reamplified with barcoded primers and pooled before sequencing using Oxford Nanopore Technologies (ONT) and Pacific Biosciences (PB). For each amplicon, a consensus sequence for NGS reads was obtained by (1) mapping reads to the Sanger sequence when available ("reference-based") or (2) mapping reads to a "pseudo-reference" sequence, i.e., a consensus sequence of a subset of NGS reads ("reference-free"). PB reads were clustered based on genetic similarity. A Sanger consensus sequence was obtained for 23/30 amplicons, for which all NGS consensus sequences were identical (n = 9) or nearly identical (n = 14) compared with Sanger. For the nine mismatches between Sanger/NGS, the nucleotide in the NGS sequence matched all other sequences from that patient. Of the 7/30 amplicons without a Sanger sequence, NGS sequences had ≥35 ambiguous calls in five amplicons and 0 ambiguities in two amplicons. Analysis of the electropherograms showed failure of a single sequencing primer for the latter two amplicons (consistent with a single template) and overlapping peaks for the other five (consistent with multiple templates). Clustering results closely followed the Sanger/NGS consensus results, where amplicons derived from a single template also had a single cluster and vice versa (with one exception, which could be the result of barcode misidentification). Representative sequences from the clusters contained 2-13 differences compared with Sanger/NGS. In summary, we show that both ONT and PB can produce amplicon consensus sequences with similar or higher accuracy compared with Sanger and, importantly, without the need for a known reference sequence. Clustering could be useful in some circumstances to predict or confirm the presence of multiple starting templates.
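    The consensus step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes the reads have already been mapped and padded to equal length, and the function names `consensus` and `count_ambiguous` and the 0.7 majority threshold are hypothetical choices made for illustration. Columns where no base reaches the threshold are called as the ambiguity code N, which is one plausible way an amplicon derived from multiple templates would accumulate ambiguous calls.

```python
from collections import Counter


def consensus(aligned_reads, min_fraction=0.7):
    """Majority-vote consensus over equal-length, pre-aligned reads.

    min_fraction is an assumed threshold (not from the abstract): a column
    whose most common base falls below it is called as 'N'.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")

    called = []
    # zip(*reads) walks the alignment column by column.
    for column in zip(*aligned_reads):
        base, count = Counter(column).most_common(1)[0]
        called.append(base if count / len(column) >= min_fraction else "N")
    return "".join(called)


def count_ambiguous(sequence):
    """Number of ambiguous (N) calls in a consensus sequence."""
    return sequence.count("N")
```

    With reads from a single template, occasional read errors are outvoted and the consensus has no ambiguous calls; with an even mixture of two templates, every position where the templates differ is called N.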






  • Article type: Journal Article
    Here, I report the complete genome sequence of Vibrio sp. strain AH4, which had been isolated from moribund farmed Nile tilapia (Oreochromis niloticus). Assessment of the genome sequence of this strain revealed the presence of two linear chromosomes of 2,894,109 bp and 1,082,372 bp.






  • Article type: Journal Article
    Whole-genome sequencing of Staphylococcus epidermidis strain AH3, isolated from moribund farmed Nile tilapia (Oreochromis niloticus), was performed using a combination of the Illumina and Pacific Biosciences (PacBio) sequencing platforms. The genome sequence is composed of a single chromosome of 2,464,380 bp with a GC content of 32.2% and 2,220 predicted protein-coding genes.






  • Article type: Journal Article
    Recent studies on marine organisms have made use of third-generation sequencing technologies such as Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT). While these specialized bioinformatics tools have different algorithmic designs and performance capabilities, they offer scalability and can be applied to various datasets. We investigated the effectiveness of PacBio and ONT RNA sequencing methods in identifying the venom of the jellyfish species Nemopilema nomurai. We conducted a detailed analysis of the sequencing data from both methods, focusing on key characteristics such as coding sequence (CDS), alternative splicing, long-chain noncoding RNA, simple sequence repeat, transcription factor, and functional transcript annotation. Our findings indicate that ONT generally produced higher raw data quality in the transcriptome analysis, while PacBio generated longer read lengths. PacBio was found to be superior in identifying CDSs and long-chain noncoding RNA, whereas ONT was more cost-effective for predicting alternative splicing events, simple sequence repeats, and transcription factors. Based on these results, we conclude that PacBio is the most specific and sensitive method for identifying venom components, while ONT is the most cost-effective method for studying venogenesis, cnidocyst (venom gland) development, and transcription of virulence genes in jellyfish. Our study has implications for future sequencing technologies in marine jellyfish, and highlights the power of full-length transcriptome analysis in discovering potential therapeutic targets for jellyfish dermatitis.






  • Article type: Preprint
    Recently, Pacific Biosciences released a new highly accurate long-read sequencer called the Revio System that is projected to generate 30× HiFi whole-genome sequencing for the human genome within one sequencing SMRT Cell. Mouse and human genomes are similar in size. In this study, we sought to test this new sequencer by characterizing the genome and epigenome of the mouse neuronal cell line Neuro-2a. We generated long-read HiFi whole-genome sequencing on three Revio SMRT Cells, achieving a total coverage of 98×, with 30×, 32×, and 36× coverage respectively for each of the three Revio SMRT Cells. We performed several tests on these data including single-nucleotide variant and small insertion detection using GPU-accelerated DeepVariant, structural variant detection with pbsv, methylation detection with pb-CpG-tools, and generating de novo assemblies with the HiCanu and hifiasm assemblers. Overall, we find consistency across SMRT Cells in coverage, detection of variation, methylation, and de novo assemblies for each of the three SMRT Cells.






  • Article type: Journal Article
    The Lesser Prairie-Chicken (Tympanuchus pallidicinctus; LEPC) is an iconic North American prairie grouse, renowned for ornate and spectacular breeding season displays. Unfortunately, the species has disappeared across much of its historical range, with corresponding precipitous declines in contemporary population abundance, largely due to climatic and anthropogenic factors. These declines led to a 2022 US Fish and Wildlife decision to identify and list two distinct population segments (DPSs; i.e., northern and southern DPSs) as threatened or endangered under the 1973 Endangered Species Act. Herein, we describe an annotated reference genome that was generated from a LEPC sample collected from the southern DPS. We chose a representative from the southern DPS because of the potential for introgression in the northern DPS, where some populations hybridize with the Greater Prairie-Chicken (Tympanuchus cupido). This new LEPC reference assembly consists of 206 scaffolds, an N50 of 45 Mb, and 15,563 predicted protein-coding genes. We demonstrate the utility of this new genome assembly by estimating genome-wide heterozygosity in a representative LEPC and in related species. Heterozygosity in a LEPC sample was 0.0024, near the middle of the range (0.0003-0.0050) of related species. Overall, this new assembly provides a valuable resource that will enhance evolutionary and conservation genetic research in prairie grouse.
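    As a rough illustration of the heterozygosity figure quoted in this abstract, the sketch below computes genome-wide heterozygosity as the fraction of callable diploid sites at which the two alleles differ. The abstract does not state how the estimate was made, so this definition, the function name `heterozygosity`, and the input layout (one pair of alleles per site, with `None` marking an uncalled allele) are assumptions made for illustration only.

```python
def heterozygosity(genotypes):
    """Fraction of callable diploid sites whose two alleles differ.

    genotypes: iterable of (allele1, allele2) tuples, one per site.
    Sites with a missing (None) allele are treated as uncallable and skipped.
    """
    callable_sites = 0
    heterozygous_sites = 0
    for allele1, allele2 in genotypes:
        if allele1 is None or allele2 is None:
            continue  # uncallable site: excluded from the denominator
        callable_sites += 1
        if allele1 != allele2:
            heterozygous_sites += 1
    if callable_sites == 0:
        raise ValueError("no callable sites")
    return heterozygous_sites / callable_sites
```

    Under this definition, a value of 0.0024 corresponds to roughly 2.4 heterozygous sites per 1,000 callable sites.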






  • Article type: Journal Article
    Long-read sequencing technologies such as isoform sequencing can generate highly accurate sequences of full-length mRNA transcript isoforms. Such long-read transcriptomics may be especially useful in investigations of lymphocyte functional plasticity as it relates to human health and disease. However, no long-read isoform-aware reference transcriptomes of human circulating lymphocytes are readily available despite being valuable as benchmarks in a variety of transcriptomic studies. To begin to fill this gap, we purified 4 lymphocyte populations (CD4+ T, CD8+ T, NK, and Pan B cells) from the peripheral blood of a healthy male donor and obtained high-quality RNA (RIN > 8) for isoform sequencing and parallel RNA-Seq analyses. Many novel polyadenylated transcript isoforms, supported by both isoform sequencing and RNA-Seq data, were identified within each sample. The datasets met several metrics of high quality and have been deposited to the Gene Expression Omnibus database (GSE202327, GSE202328, GSE202329) as both raw and processed files to serve as long-read reference transcriptomes for future studies of human circulating lymphocytes.






  • Article type: Journal Article
    Throughout the entirety of human history, bacterial pathogens have played an important role and even shaped the fate of civilizations. The application of genomics within the last 27 years has radically changed the way we understand the biology and evolution of these pathogens. In this review, we discuss how the short- (Illumina) and long-read (PacBio, Oxford Nanopore) sequencing technologies have shaped the discipline of bacterial pathogen genomics, in terms of fundamental research (i.e., evolution of pathogenicity), forensics, food safety, and routine clinical microbiology. We have mined and discuss some of the most prominent data/bioinformatics resources such as NCBI pathogens, PATRIC, and Pathogenwatch. Based on this mining, we present some of the most popular sequencing technologies, hybrid approaches, assemblers, and annotation pipelines. A small number of bacterial pathogens are of very high importance, and we also present the wealth of the genomic data for these species (i.e., which ones they are, the number of antimicrobial resistance genes per genome, the number of virulence factors). Finally, we discuss how this discipline will probably be transformed in the near future, especially by transitioning into metagenome-assembled genomes (MAGs), thanks to long-read sequencing.






  • Article type: Journal Article
    Cervids are distinguished by the shedding and regrowth of antlers. Furthermore, they provide insights into prion and other diseases. Genomic resources can facilitate studies of the genetic underpinnings of deer phenotypes, behavior, and disease resistance. Widely distributed in North America, the white-tailed deer (Odocoileus virginianus) has recreational, commercial, and food source value for many households. We present a genome generated using DNA from a single Illinois white-tailed deer, sequenced on the PacBio Sequel II platform and assembled using Wtdbg2. Omni-C chromatin conformation capture sequencing was used to scaffold the genome contigs. The final assembly was 2.42 Gb, consisting of 508 scaffolds with a contig N50 of 21.7 Mb, a scaffold N50 of 52.4 Mb, and a BUSCO complete score of 93.1%. Thirty-six chromosome pseudomolecules comprised 93% of the entire sequenced genome length. A total of 20 651 genes predicted using the BRAKER pipeline were validated using InterProScan. Chromosome-length assembly sequences were aligned to the genomes of related species to reveal corresponding chromosomes.
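    The contig and scaffold N50 values reported in this abstract follow the standard definition: the length L such that sequences of length L or longer together contain at least half of the total assembly length. The sketch below is a generic implementation of that definition, not the tool the authors used; the function name `n50` is hypothetical.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # integer-safe test for running >= total / 2
            return length
```

    Applying the same function to contig lengths and to scaffold lengths gives the two statistics separately, which is why scaffolding can raise the scaffold N50 without changing the contig N50.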






  • Article type: Journal Article
    Recent development of long-read sequencing platforms has enabled researchers to explore bacterial community structure through analysis of the full-length 16S rRNA gene (∼1,500 bp) or the 16S-ITS-23S rRNA operon region (∼4,300 bp), resulting in higher taxonomic resolution than short-read sequencing platforms. Despite the potential of long-read sequencing in metagenomics, resources and protocols for this technology are scarce. Here, we describe MIrROR, the database and analysis tool for metataxonomics using the bacterial 16S-ITS-23S rRNA operon region. We collected 16S-ITS-23S rRNA operon sequences extracted from bacterial genomes from NCBI GenBank and performed curation. A total of 97,781 16S-ITS-23S rRNA operon sequences covering 9,485 species from 43,653 genomes were obtained. For user convenience, we provide an analysis tool based on a mapping strategy that can be used for taxonomic profiling with the MIrROR database. To benchmark MIrROR, we compared its performance against publicly available databases and tools using mock communities and simulated data sets. Our platform showed promising results in terms of the number of species covered and the accuracy of classification. To encourage active 16S-ITS-23S rRNA operon analysis in the field, a BLAST function and taxonomic profiling results for 16S-ITS-23S rRNA operon studies that have been reported as BioProjects on NCBI are provided. MIrROR will be a useful platform for researchers who want to perform high-resolution metagenome analysis with a cost-effective sequencer such as MinION from Oxford Nanopore Technologies. IMPORTANCE Metabarcoding is a powerful tool to investigate community diversity in an economical and efficient way by amplifying a specific gene marker region. With the advancement of long-read sequencing technologies, the field of metabarcoding has entered a new phase. The technologies have brought a need for development in several areas, including new markers that long reads can cover, databases for the markers, tools that reflect long-read characteristics, and compatibility with downstream analysis tools. By constructing MIrROR, we met the need for a database and tools for the 16S-ITS-23S rRNA operon region, which has recently been shown to have sufficient resolution at the species level. Bacterial community analysis using the 16S-ITS-23S rRNA operon region with MIrROR will provide new insights from various research fields.





