differential expression analysis

  • Article type: Journal Article
    Missing covariate data is a common problem that has not been addressed in observational studies of gene expression. Here, we present a multiple imputation method that accommodates high-dimensional gene expression data by incorporating principal component analysis of the transcriptome into the multiple imputation prediction models to avoid bias. Simulation studies using three datasets show that this method outperforms complete case and single imputation analyses at uncovering true positive differentially expressed genes, limiting false discovery rates, and minimizing bias. This method is easily implemented via an R Bioconductor package, RNAseqCovarImpute, which integrates with the limma-voom pipeline for differential expression analysis.
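
The idea of feeding transcriptome principal components into the imputation model can be sketched as follows. This is a minimal illustration in Python, not the RNAseqCovarImpute implementation: it uses only the first principal component (found by power iteration), a simple linear prediction model, and fixed regression coefficients across imputations, all of which are simplifying assumptions. The function names `first_pc_scores` and `impute_with_pc` are hypothetical.

```python
import random

def first_pc_scores(expr, n_iter=200, seed=0):
    """Return the PC1 score of each sample. expr is a list of samples,
    each a list of (log) expression values for the same genes."""
    n, p = len(expr), len(expr[0])
    means = [sum(row[j] for row in expr) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in expr]
    rng = random.Random(seed)
    v = [rng.random() for _ in range(p)]  # random start for power iteration
    for _ in range(n_iter):
        scores = [sum(c * w for c, w in zip(row, v)) for row in centered]
        v = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(p)]
        norm = sum(w * w for w in v) ** 0.5 or 1.0
        v = [w / norm for w in v]
    return [sum(c * w for c, w in zip(row, v)) for row in centered]

def impute_with_pc(covariate, scores, n_imputations=5, seed=0):
    """Create several completed copies of a covariate (None = missing) by
    regressing observed values on PC1 and adding residual noise to predictions."""
    rng = random.Random(seed)
    obs = [(s, y) for s, y in zip(scores, covariate) if y is not None]
    n = len(obs)
    mean_s = sum(s for s, _ in obs) / n
    mean_y = sum(y for _, y in obs) / n
    sxx = sum((s - mean_s) ** 2 for s, _ in obs)
    slope = sum((s - mean_s) * (y - mean_y) for s, y in obs) / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_s
    resid_ss = sum((y - intercept - slope * s) ** 2 for s, y in obs)
    sigma = (resid_ss / max(n - 2, 1)) ** 0.5
    return [
        [y if y is not None else intercept + slope * s + rng.gauss(0, sigma)
         for s, y in zip(scores, covariate)]
        for _ in range(n_imputations)
    ]

# Toy data (made up for illustration): 6 samples x 3 genes, one missing covariate.
expr = [[1.0, 2.0, 3.0], [2.0, 4.1, 6.2], [3.0, 6.0, 9.1],
        [4.0, 8.2, 12.0], [5.0, 10.1, 15.2], [6.0, 12.0, 18.1]]
covariate = [10.0, 12.0, None, 16.0, 18.0, 20.0]
imputations = impute_with_pc(covariate, first_pc_scores(expr))
```

Each completed dataset would then be analyzed separately and the results combined; a full multiple imputation procedure would also propagate uncertainty in the regression coefficients, which this sketch omits.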






  • Article type: Journal Article
    Breast cancer has the highest diagnosis rate among all cancers. Tumor budding (TB) has recently been recognized as a prognostic marker. Identifying genes specific to high-TB samples is crucial for hindering tumor progression and metastasis. In this study, we utilized an RNA sequencing technique, called TempO-Seq, to profile transcriptomic data from breast cancer samples, aiming to identify biomarkers for high-TB cases. Through differential expression analysis and mutual information, we identified seven genes (NOL4, STAR, C8G, NEIL1, SLC46A3, FRMD6, and SCARF2) that are potential biomarkers in breast cancer. To identify additional relevant proteins, further investigation based on a protein-protein interaction network and the network diffusion technique revealed enrichment in the Hippo signaling and Wnt signaling pathways, which promote tumor initiation, invasion, and metastasis in several cancer types. In conclusion, these novel genes, recognized as overexpressed in high-TB samples, along with their associated pathways, offer promising therapeutic targets, thus advancing treatment and diagnosis for breast cancer.






  • Article type: Journal Article
    The integration of data from multiple sources and analytical techniques to obtain novel insights and answer challenging questions is a hallmark of modern science. In arthropods, exocrine secretions may act as pheromones, defensive substances, antibiotics, and surface protectants, and as such they play a crucial role in ecology and evolution. Exocrine chemical compounds are frequently characterized by gas chromatography-mass spectrometry. Technological advances of recent years now allow us to routinely characterize the total gene complement transcribed in a particular biological tissue, often in the context of experimental treatment, via RNAseq. We here introduce a novel methodological approach to characterize both the exocrine secretions and the full transcriptome of the same individual oribatid mite. We found that chemical extraction prior to RNA extraction had only minor effects on total RNA integrity. De novo transcriptomes obtained from such combined extractions were of comparable quality to those assembled for samples that were subject to RNA extraction only, indicating that combined chemical/RNA extraction is well suited to phylotranscriptomic studies. However, in-depth analysis of RNA expression indicates that chemical extraction prior to RNAseq may affect transcript degradation rates, similar to the effects reported in previous studies comparing RNA extraction protocols. With this pilot study, we demonstrate that profiling chemical secretions and RNA expression levels from the same individual is methodologically feasible, paving the way for future research to understand the genes and pathways underlying the synthesis of biogenic chemical compounds. Our approach should be broadly applicable to most arachnids, insects, and other arthropods.






  • Article type: Journal Article
    BACKGROUND: Effective identification of differentially expressed genes (DEGs) has been challenging for single-cell RNA sequencing (scRNA-seq) profiles. Many existing algorithms have high false positive rates (FPRs) and often fail to identify weak biological signals.
    RESULTS: We present a novel method for identifying DEGs in scRNA-seq data called RankCompV3. It is based on the comparison of relative expression orderings (REOs) of gene pairs, which are determined by comparing the expression levels of a pair of genes in a set of single-cell profiles. The numbers of genes with consistently higher or lower expression levels than the gene of interest are counted in each of the two groups being compared, and the result is tabulated in a 3 × 3 contingency table, which is tested by McCullagh's method to determine if the gene is dysregulated. In both simulated and real scRNA-seq data, RankCompV3 tightly controlled the FPR and demonstrated high accuracy, outperforming 11 other common single-cell DEG detection algorithms. Analysis with either regular single-cell or synthetic pseudo-bulk profiles produced DEGs highly concordant with the ground truth. In addition, RankCompV3 demonstrates higher sensitivity to weak biological signals than other methods. The algorithm was implemented in Julia and can be called from R. The source code is available at https://github.com/pathint/RankCompV3.jl .
    CONCLUSIONS: The REOs-based algorithm is a valuable tool for analyzing single-cell RNA profiles and identifying DEGs with high accuracy and sensitivity.
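
The 3 × 3 tabulation described in the RESULTS can be sketched as below. This is an illustrative Python sketch, not the RankCompV3 code (which is written in Julia). It assumes that an ordering counts as "consistent" when it holds in at least a fraction `min_frac` of cells, which is a stand-in threshold chosen here for illustration, and it stops before the McCullagh test. The function names are hypothetical.

```python
def reo_state(gene_values, partner_values, min_frac=0.9):
    """Return +1 if the gene is higher than its partner in at least min_frac
    of cells, -1 if it is lower in at least min_frac of cells, else 0."""
    n = len(gene_values)
    higher = sum(g > p for g, p in zip(gene_values, partner_values)) / n
    lower = sum(g < p for g, p in zip(gene_values, partner_values)) / n
    if higher >= min_frac:
        return 1
    if lower >= min_frac:
        return -1
    return 0

def reo_contingency(group1, group2, gene, min_frac=0.9):
    """Build the 3 x 3 table for one gene of interest.
    group1 and group2 map gene name -> expression values across cells.
    Rows index the REO state in group1 (lower, unstable, higher);
    columns index the state in group2 in the same order."""
    table = [[0] * 3 for _ in range(3)]
    for partner in group1:
        if partner == gene:
            continue
        s1 = reo_state(group1[gene], group1[partner], min_frac)
        s2 = reo_state(group2[gene], group2[partner], min_frac)
        table[s1 + 1][s2 + 1] += 1
    return table

# Toy data (made up): g0 sits in the middle of the ranking in group1
# and rises above every partner gene in group2.
group1 = {"g0": [1, 1, 1], "g1": [2, 2, 2], "g2": [0, 0, 0], "g3": [1, 2, 0]}
group2 = {"g0": [5, 5, 5], "g1": [2, 2, 2], "g2": [0, 0, 0], "g3": [1, 2, 0]}
table = reo_contingency(group1, group2, "g0")
print(table)  # [[0, 0, 1], [0, 0, 1], [0, 0, 1]]
```

Counts concentrated off the diagonal, as in this toy case, indicate that the gene's rank relative to its partners shifts between groups; the published method assesses this with McCullagh's test rather than by inspection.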






  • Article type: Journal Article
    The watermelon (Citrullus lanatus L.) holds substantial economic value as a globally cultivated horticultural crop. However, the genetic architecture of watermelon fruit weight (FW) remains poorly understood. In this study, we used sh14-11 with small fruit and N14 with big fruit to construct 100 recombinant inbred lines (RILs). Based on whole-genome resequencing (WGR), 218,127 single nucleotide polymorphisms (SNPs) were detected to construct a high-quality genetic map. After quantitative trait loci (QTL) mapping, a candidate interval of 31-38 Mb on chromosome 2 was identified for FW. Simultaneously, the bulked segregant analysis (BSA) in the F2 population corroborated the identification of the same interval, encompassing the homologous gene linked to the known FW-related gene fas. Additionally, RNA-seq was carried out across 11 tissues from sh14-11 and N14, revealing expression profiles that identified 1695 new genes and corrected the annotation of 2941 genes. Subsequent differential expression analysis unveiled 8969 differentially expressed genes (DEGs), with 354 of these genes exhibiting significant differences across four key developmental stages. The integration of QTL mapping and differential expression analysis facilitated the identification of 14 FW-related genes, including annotated TGA and NAC transcription factors implicated in fruit development. This combined approach offers valuable insights into the genetic basis of FW, providing crucial resources for enhancing watermelon cultivation.
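
The integration step, keeping DEGs that fall inside the QTL candidate interval, can be illustrated with a short Python sketch. The annotation records, gene identifiers, and coordinates below are invented placeholders, and `genes_in_interval` is a hypothetical helper; only the interval itself (31-38 Mb on chromosome 2) comes from the abstract.

```python
def genes_in_interval(annotation, chrom, start_mb, end_mb):
    """Return IDs of genes overlapping [start_mb, end_mb] (in Mb) on chrom.
    annotation is a list of dicts with keys id, chrom, start, end (bp)."""
    lo, hi = start_mb * 1_000_000, end_mb * 1_000_000
    return {
        g["id"] for g in annotation
        if g["chrom"] == chrom and g["start"] <= hi and g["end"] >= lo
    }

# Placeholder annotation and DEG list (not real watermelon genes).
annotation = [
    {"id": "G1", "chrom": 2, "start": 32_000_000, "end": 32_005_000},
    {"id": "G2", "chrom": 2, "start": 10_000_000, "end": 10_004_000},
    {"id": "G3", "chrom": 2, "start": 37_500_000, "end": 37_502_000},
    {"id": "G4", "chrom": 5, "start": 33_000_000, "end": 33_001_000},
]
degs = {"G1", "G2", "G4"}

# Candidate genes are DEGs located within the QTL interval.
candidates = genes_in_interval(annotation, chrom=2, start_mb=31, end_mb=38) & degs
print(sorted(candidates))  # ['G1']
```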






  • Article type: Journal Article
    The high incidence of idiopathic recurrent pregnancy loss (iRPL) may stem from the limited research on male contributory factors. Many studies suggest that sperm DNA fragmentation and oxidative stress contribute to iRPL, but their roles are still debated. MicroRNAs (miRNAs) are short non-coding RNAs that regulate various biological processes by modulating gene expression. While differential expression of specific miRNAs has been observed in women suffering from recurrent miscarriages, paternal miRNAs remain unexplored. We hypothesize that analyzing sperm miRNAs can provide crucial insights into the pathophysiology of iRPL. Therefore, this study aims to identify dysregulated miRNAs in the spermatozoa of male partners of iRPL patients. Total mRNA was extracted from sperm samples of iRPL and control groups, followed by miRNA library preparation and high-output miRNA sequencing. Subsequently, raw sequence reads were processed for differential expression analysis, target prediction, and bioinformatics analysis. Twelve differentially expressed miRNAs were identified in the iRPL group, with eight miRNAs upregulated (hsa-miR-4454, hsa-miR-142-3p, hsa-miR-145-5p, hsa-miR-1290, hsa-miR-1246, hsa-miR-7977, hsa-miR-449c-5p, and hsa-miR-92b-3p) and four downregulated (hsa-miR-29c-3p, hsa-miR-30b-5p, hsa-miR-519a-2-5p, and hsa-miR-520b-5p). Functional enrichment analysis revealed that gene targets of the upregulated miRNAs are involved in various biological processes closely associated with sperm quality and embryonic development.






  • Article type: Journal Article
    Transcriptional profiling has become a common tool for investigating the nervous system. During analysis, differential expression results are often compared to functional ontology databases, which contain curated gene sets representing well-studied pathways. This dependence can cause neuroscience studies to be interpreted in terms of functional pathways documented in better-studied tissues (e.g., liver) and topics (e.g., cancer), and systematically emphasizes well-studied genes, leaving other findings in the obscurity of the brain "ignorome". To address this issue, we compiled a curated database of 918 gene sets related to nervous system function, tissue, and cell types ("Brain.GMT") that can be used within common analysis pipelines (GSEA, limma, edgeR) to interpret results from three species (rat, mouse, human). Brain.GMT includes brain-related gene sets curated from the Molecular Signatures Database (MSigDB) and extracted from public databases (GeneWeaver, Gemma, DropViz, BrainInABlender, HippoSeq) and published studies containing differential expression results. Although Brain.GMT is still undergoing development and currently only represents a fraction of available brain gene sets, "brain ignorome" genes are already better represented than in traditional Gene Ontology databases. Moreover, Brain.GMT substantially improves the quantity and quality of gene sets identified as enriched with differential expression in neuroscience studies, enhancing interpretation.
      • We compiled a curated database of 918 gene sets related to nervous system function, tissue, and cell types ("Brain.GMT").
      • Brain.GMT can be used within common analysis pipelines (GSEA, limma, edgeR) to interpret neuroscience transcriptional profiling results from three species (rat, mouse, human).
      • Although Brain.GMT is still undergoing development, it substantially improved the interpretation of differential expression results within our initial use cases.
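
For readers unfamiliar with the GMT format that such gene-set collections use, here is a small Python sketch. It assumes the standard tab-separated layout (set name, description, then gene symbols) and reports a plain overlap between each set and a list of differentially expressed genes. This is a deliberate simplification and not the enrichment statistic computed by GSEA, limma, or edgeR. The set names and gene symbols are placeholders, not entries from Brain.GMT.

```python
def parse_gmt(text):
    """Parse GMT-formatted text into {set_name: set_of_genes}.
    Each line is tab-separated: name, description, gene1, gene2, ..."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets

def overlap_with_hits(gene_sets, hits):
    """Return, for each gene set sharing genes with hits, the sorted shared genes."""
    hits = set(hits)
    return {
        name: sorted(genes & hits)
        for name, genes in gene_sets.items()
        if genes & hits
    }

# Placeholder GMT content with two made-up gene sets.
example_gmt = (
    "SET_A\tplaceholder description\tGENE1\tGENE2\tGENE3\n"
    "SET_B\tplaceholder description\tGENE4\tGENE5\n"
)
gene_sets = parse_gmt(example_gmt)
shared = overlap_with_hits(gene_sets, ["GENE2", "GENE3", "GENE9"])
print(shared)  # {'SET_A': ['GENE2', 'GENE3']}
```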






  • Article type: Letter






  • Article type: Journal Article
    Wooden Breast (WB) abnormality represents one of the major challenges that the poultry industry has faced in the last 10 years. Despite the enormous progress in understanding the mechanisms underlying WB, the precise initial causes remain to be clarified. The present research therefore aims to characterize the gene expression profiles of broiler Pectoralis major muscles affected by WB, comparing them to their unaffected counterparts, to provide new insights into the biological mechanisms underlying this defect and potentially identify novel genes likely involved in its occurrence. To this end, data obtained in a previous study through RNA sequencing were used to identify differentially expressed genes (DEGs) between the breast muscles of 6 affected and 5 unaffected broilers, using the newest reference genome assembly for Gallus gallus (GRCg7b). In addition, to investigate in depth the molecular and biological pathways involved in WB progression, pathway analyses were performed. The differential gene expression analysis mainly showed downregulation of glycogen metabolic processes, gluconeogenesis, and the tricarboxylic acid cycle in WB muscles, corroborating the evidence of dysregulated energy metabolism in breasts affected by this abnormality. Genes related to hypertrophic muscle growth were also identified as differentially expressed (e.g., WFIKKN1). In addition, downregulation of genes involved in mitochondrial biogenesis and function was detected. Among them, the chicken genes PPARGC1A and PPARGC1B are particularly noteworthy. These genes not only have essential roles in regulating mitochondrial biogenesis but also play pivotal roles in maintaining glucose and energy homeostasis. Their downregulation in WB-affected muscle may therefore be related to both the mitochondrial dysfunction and the altered glucose metabolism observed in WB muscles, suggesting a key involvement in the molecular alterations characterizing this muscular abnormality.






  • Article type: Journal Article
    The mechanism of spinal cord injury (SCI) is highly complex, and an increasing number of studies have indicated the involvement of pyroptosis in the physiological and pathological processes of secondary SCI. However, there is limited bioinformatics research on pyroptosis-related genes (PRGs) in SCI. This study aims to identify and validate differentially expressed PRGs in the GEO database, perform bioinformatics analysis, and construct regulatory networks to explore potential regulatory mechanisms and therapeutic targets for SCI. We obtained high-throughput sequencing datasets of SCI in rats and mice from the GEO database. Differential analysis was conducted using the "limma" package in R to identify differentially expressed genes (DEGs). These genes were then intersected with previously reported PRGs, resulting in a set of PRGs in SCI. GO and KEGG enrichment analyses, as well as correlation analysis, were performed on the PRGs in both rat and mouse models of SCI. Additionally, a protein-protein interaction (PPI) network was constructed using the STRING website to examine the relationships between proteins. Hub genes were identified using Cytoscape software, and the intersection of the top 5 hub genes in rats and mice was selected for subsequent experimental validation. Furthermore, a competing endogenous RNA (ceRNA) network was constructed to explore potential regulatory mechanisms. The gene expression profiles of GSE93249, GSE133093, GSE138637, GSE174549, GSE45376, GSE171441_3d and GSE171441_35d were selected in this study. We identified 10 and 12 PRGs in the rat and mouse datasets, respectively. Six common DEGs were identified in the intersection of the rat and mouse PRGs. GO enrichment analysis of these DEGs pointed mainly to inflammation-related factors, while KEGG analysis showed that most genes were enriched in the NOD-like receptor signaling pathway. We constructed a ceRNA regulatory network that consisted of five important PRGs, as well as 24 miRNAs and 34 lncRNAs. This network revealed potential regulatory mechanisms. Additionally, the three hub genes obtained from the intersection were validated in the rat model, showing high expression of PRGs in SCI. Pyroptosis is involved in secondary SCI and may play a significant role in its pathogenesis. The regulatory mechanisms associated with pyroptosis deserve further in-depth research.
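
The filtering logic (DEGs intersected with known PRGs in each species, then intersected across species) can be sketched in Python. The thresholds on adjusted p-value and log fold change are assumptions for illustration, since the abstract does not state them, and the gene names and statistics are placeholders rather than results from the study. The sketch also assumes that orthologs share one symbol across species.

```python
def pyroptosis_degs(de_table, prgs, max_adj_p=0.05, min_abs_logfc=1.0):
    """Return genes that pass the DE thresholds and appear in the PRG list.
    de_table maps gene -> (log fold change, adjusted p-value)."""
    return {
        gene for gene, (logfc, adj_p) in de_table.items()
        if adj_p < max_adj_p and abs(logfc) >= min_abs_logfc and gene in prgs
    }

# Placeholder differential expression results for two species.
rat_de = {"GeneA": (2.1, 0.001), "GeneB": (1.5, 0.01), "GeneC": (0.2, 0.5)}
mouse_de = {"GeneA": (1.8, 0.002), "GeneB": (0.3, 0.4), "GeneD": (2.5, 0.001)}
known_prgs = {"GeneA", "GeneB", "GeneD"}  # placeholder PRG list

rat_prgs = pyroptosis_degs(rat_de, known_prgs)
mouse_prgs = pyroptosis_degs(mouse_de, known_prgs)
common = rat_prgs & mouse_prgs  # PRGs differentially expressed in both species
print(sorted(common))  # ['GeneA']
```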





