differential gene expression

  • Article type: Journal Article
    In differential gene expression data analysis, one objective is to identify groups of co-expressed genes in a large dataset in order to detect associations between such a group and an experimental condition. This is often done with a clustering approach, such as k-means or bipartition hierarchical clustering, which groups genes according to a chosen similarity measure. In such a dataset, differential expression itself is an innate attribute of each gene that can be used for feature extraction. For example, in a dataset consisting of multiple treatments versus their controls, the expression of a gene in each treatment has three possible behaviors: upregulated, downregulated, or unchanged. In this chapter, we present a differential expression feature extraction (DEFE) method that denotes this behavior with a string in which each character takes one of three numerical values, i.e., 1 = up, 2 = down, and 0 = unchanged, which results in up to 3^B differential expression patterns across all B comparisons. This approach has been applied successfully in many research projects; among these, we demonstrate the strength of DEFE in a case study on RNA-sequencing (RNA-seq) data analysis of wheat challenged with the phytopathogenic fungus Fusarium graminearum. Combinations of multiple schemes of DEFE patterns revealed groups of genes putatively associated with resistance or susceptibility to Fusarium head blight (FHB).
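The encoding described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy gene data, and the calling rule (adjusted p-value below 0.05 and absolute log2 fold change of at least 1) are all assumptions chosen for illustration. Only the 1/2/0 coding and the 3^B pattern count come from the abstract.

```python
from collections import defaultdict


def defe_pattern(log2_fold_changes, adjusted_p_values, lfc_cutoff=1.0, alpha=0.05):
    """Encode one gene's behavior across B comparisons as a B-character string.

    Coding from the abstract: 1 = up, 2 = down, 0 = unchanged.
    The significance rule (lfc_cutoff, alpha) is an assumed, hypothetical choice.
    """
    chars = []
    for lfc, padj in zip(log2_fold_changes, adjusted_p_values):
        if padj < alpha and lfc >= lfc_cutoff:
            chars.append("1")  # upregulated in this comparison
        elif padj < alpha and lfc <= -lfc_cutoff:
            chars.append("2")  # downregulated in this comparison
        else:
            chars.append("0")  # unchanged (not significant or small effect)
    return "".join(chars)


def group_by_pattern(genes):
    """Group gene IDs that share the same DEFE pattern string."""
    groups = defaultdict(list)
    for gene_id, (lfcs, padjs) in genes.items():
        groups[defe_pattern(lfcs, padjs)].append(gene_id)
    return dict(groups)


# Hypothetical toy data: gene -> (log2 fold changes, adjusted p-values), B = 3.
toy_genes = {
    "geneA": ([2.3, 1.8, 0.1], [0.001, 0.01, 0.90]),
    "geneB": ([2.1, 1.5, -0.2], [0.002, 0.03, 0.75]),
    "geneC": ([-1.9, 0.3, -2.4], [0.004, 0.60, 0.001]),
}

groups = group_by_pattern(toy_genes)
max_patterns = 3 ** 3  # upper bound on distinct patterns for B = 3 comparisons

print(groups)
print(max_patterns)
```

Genes that share a pattern string form a candidate co-expression group without any distance-based clustering step. The cutoffs above would need to be chosen to suit the actual study.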

  • Article type: Journal Article
    Direct sequencing of single molecules through nanopores allows for accurate quantification and full-length characterization of native RNA or complementary DNA (cDNA) without amplification. Both nanopore-based native RNA and cDNA approaches involve complex transcriptome procedures at a lower cost. However, there are several differences between the two approaches. In this study, we perform matched native RNA sequencing and cDNA sequencing to enable relevant comparisons and evaluation. Using Saccharomyces cerevisiae, a eukaryotic model organism widely used in industrial biotechnology, we compare two growth conditions: poly-A messenger RNA was isolated from yeast cells grown in minimal medium supplemented with glucose under respirofermentative conditions (glucose growth conditions) and from cells that had shifted to ethanol as a carbon source (ethanol growth conditions). Library preparation for direct RNA sequencing is shorter than that for direct cDNA sequencing. The sequence characteristics of the two methods differ, including sequence yield, read quality scores, read length distribution, and the ability of reads to map to the reference. However, differential gene expression analyses derived from the two approaches are comparable. The unique feature of direct RNA sequencing is the detection of RNA modifications; we found that RNA modification at the 5' end of a transcript was underestimated because of the 3' bias of direct RNA sequencing. Our comprehensive evaluation could help researchers make informed choices when selecting an appropriate long-read sequencing method for understanding gene functions, pathways, and detailed functional characterization.

  • Article type: Journal Article
    RNA-Seq is increasingly being used to measure human RNA expression on a genome-wide scale. Expression profiles can be interrogated to identify and functionally characterize treatment-responsive genes. Ultimately, such controlled studies promise to reveal insights into molecular mechanisms of treatment effects, identify biomarkers, and realize personalized medicine. RNA-Seq Reports (RSEQREP) is a new open-source cloud-enabled framework that allows users to execute start-to-end gene-level RNA-Seq analysis on a preconfigured RSEQREP Amazon Virtual Machine Image (AMI) hosted by AWS or on their own Ubuntu Linux machine via a Docker container or installation script. The framework works with unstranded, stranded, and paired-end sequence FASTQ files stored locally, on Amazon Simple Storage Service (S3), or at the Sequence Read Archive (SRA). RSEQREP automatically executes a series of customizable steps including reference alignment, CRAM compression, reference alignment QC, data normalization, multivariate data visualization, identification of differentially expressed genes, heatmaps, co-expressed gene clusters, enriched pathways, and a series of custom visualizations. The framework outputs a file collection that includes a dynamically generated PDF report using R, knitr, and LaTeX, as well as publication-ready table and figure files. A user-friendly configuration file handles sample metadata entry, processing, analysis, and reporting options. The configuration supports time series RNA-Seq experimental designs with at least one pre- and one post-treatment sample for each subject, as well as multiple treatment groups and specimen types. All RSEQREP analysis components are built using open-source R code and R/Bioconductor packages, allowing for further customization. As a use case, we provide RSEQREP results for a trivalent influenza vaccine (TIV) RNA-Seq study that collected 1 pre-TIV and 10 post-TIV vaccination samples (days 1-10) for 5 subjects and two specimen types (peripheral blood mononuclear cells and B-cells).