dark genes

  • Article type: Journal Article
    BACKGROUND: Complex biological processes often pass through a critical transition, or tipping point, which is usually accompanied by catastrophic consequences. Identifying the tipping point or critical state is therefore important for preventing or delaying those consequences. However, predicting the critical state from high-dimensional, small-sample data is difficult, especially for single-cell expression data.
    RESULTS: In this study, we propose the comprehensive neighbourhood-based perturbed mutual information (CPMI) method to detect the critical states of complex biological processes. The CPMI method takes into account the relationship between genes and their neighbours to reduce noise and enhance robustness. The method is applied to a simulated dataset and six real datasets: an influenza dataset, two single-cell expression datasets and three bulk datasets. It not only detects the tipping points but also identifies their dynamic network biomarkers (DNBs). In addition, the discovery of transcription factors (TFs) that can regulate DNB genes, and of nondifferential 'dark genes', supports the effectiveness of the method. Numerical simulation shows that the CPMI method is robust under different noise strengths and outperforms existing methods at identifying critical states.
    CONCLUSIONS: We propose a robust computational method, CPMI, that is applicable to both bulk and single-cell datasets. The CPMI method holds great potential for providing early warning signals for complex biological processes and enabling early disease diagnosis.






  • Article type: Journal Article
    Haploinsufficiency of the PRR12 gene is implicated in a human neuro-ocular syndrome. Although identified as a nuclear protein highly expressed in the embryonic mouse brain, PRR12 molecular function remains elusive. This study explores the spatio-temporal expression of zebrafish PRR12 co-orthologs, prr12a and prr12b, as a first step to elucidate their function. In silico analysis reveals high evolutionary conservation in the DNA-interacting domains for both orthologs, with significant syntenic conservation observed for the prr12b locus. In situ hybridization and RT-qPCR analyses on zebrafish embryos and larvae reveal distinct expression patterns: prr12a is expressed early in zygotic development, mainly in the central nervous system, while prr12b expression initiates during gastrulation, localizing later to dopaminergic telencephalic and diencephalic cell clusters. Both transcripts are enriched in the ganglion cell and inner neural layers of the 72 hpf retina, with prr12b widely distributed in the ciliary marginal zone. In the adult brain, prr12a and prr12b are found in the cerebellum, amygdala and ventral telencephalon, which represent the main areas affected in autistic patients. Overall, this study suggests PRR12's potential involvement in eye and brain development, laying the groundwork for further investigations into PRR12-related neurobehavioral disorders.






  • Article type: Journal Article
    BACKGROUND: Early diagnosis of 5q-spinal muscular atrophy (5q-SMA) has become more important because early intervention can significantly improve clinical outcomes. In 96% of cases, 5q-SMA is caused by a homozygous deletion of SMN1. Around 4% of patients carry an SMN1 deletion and a single-nucleotide variant (SNV) on the other allele. Traditionally, diagnosis is based on multiplex ligation-dependent probe amplification (MLPA) to detect homozygous or heterozygous exon 7 deletions in SMN1. Because of the high sequence homology within the SMN1/SMN2 locus, standard Sanger or short-read next-generation sequencing (srNGS) methods cannot reliably identify SNVs in SMN1.
    OBJECTIVE: The objective was to overcome these limitations of high-throughput srNGS in order to give SMA patients a fast and reliable diagnosis that enables timely therapy.
    METHODS: A bioinformatics workflow to detect homozygous SMN1 deletions and SMN1 SNVs in srNGS data was applied to diagnostic whole-exome and panel testing for suspected neuromuscular disorders (1684 patients) and to fetal samples in prenatal diagnostics (260 patients). SNVs were detected by aligning sequencing reads from SMN1 and SMN2 to an SMN1 reference sequence. Homozygous SMN1 deletions were identified by filtering sequence reads for the "gene-determining variant" (GDV).
    RESULTS: Ten patients were diagnosed with 5q-SMA based on (i) SMN1 deletion and hemizygous SNV (2 patients), (ii) homozygous SMN1 deletion (6 patients), and (iii) compound heterozygous SNVs in SMN1 (2 patients).
    CONCLUSIONS: Applying our workflow to srNGS-based panel and whole-exome sequencing (WES) is crucial in a clinical laboratory, because patients with an atypical clinical presentation who are not initially suspected of having SMA would otherwise remain undiagnosed.
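    The read-filtering idea in the METHODS paragraph can be illustrated with a minimal sketch. It is not the authors' workflow. It assumes the GDV is the exon 7 position at which SMN1 carries C and SMN2 carries T, that the bases observed at that position have already been extracted from the aligned reads, and that the thresholds `min_informative_reads` and `max_smn1_fraction` are hypothetical values chosen only for illustration.

```python
from collections import Counter


def count_gdv_bases(bases_at_gdv, smn1_base="C", smn2_base="T"):
    """Count reads supporting SMN1 or SMN2 at the gene-determining position.

    bases_at_gdv: iterable of single-letter base calls, one per read.
    The default bases are an assumption for this sketch.
    """
    counts = Counter(base.upper() for base in bases_at_gdv)
    smn1 = counts[smn1_base]
    smn2 = counts[smn2_base]
    other = sum(counts.values()) - smn1 - smn2
    return {"SMN1": smn1, "SMN2": smn2, "other": other}


def possible_homozygous_smn1_deletion(
    bases_at_gdv, min_informative_reads=20, max_smn1_fraction=0.02
):
    """Return True/False, or None when coverage is too low to decide.

    Both thresholds are hypothetical; a real assay would calibrate them.
    """
    counts = count_gdv_bases(bases_at_gdv)
    informative = counts["SMN1"] + counts["SMN2"]
    if informative < min_informative_reads:
        return None  # not enough evidence either way
    return counts["SMN1"] / informative <= max_smn1_fraction


# Toy input: every read carries the SMN2 base, so SMN1 appears absent.
print(possible_homozygous_smn1_deletion(["T"] * 40))
```

    Returning `None` for low coverage keeps "no SMN1 reads" distinct from "no reads at all", which matters when a negative result would otherwise be read as a deletion.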






  • Article type: Journal Article
    Type 2 diabetes mellitus (T2DM) is a metabolic disease with multiple etiologies, and its development can be divided into three states: the normal state, the critical (pre-disease) state, and the disease state. To avoid irreversible progression, it is important to detect early warning signals before the onset of T2DM. However, detecting the critical states of complex diseases from high-throughput, strongly noisy data remains challenging. In this study, we developed a new method, degree matrix network entropy (DMNE), to detect the critical states of T2DM based on a sample-specific network (SSN). By applying the method to datasets from three different tissues in rat T2DM experiments, the critical states were detected and the dynamic network biomarkers (DNBs) were successfully identified. Specifically, for liver and muscle, the critical transitions occur at 4 and 16 weeks. For adipose, the critical transition is at 8 weeks. In addition, we found some "dark genes" that did not exhibit differential expression but were sensitive in terms of their DMNE score, which is closely related to the progression of T2DM. The information uncovered in our study not only provides further evidence regarding the molecular mechanisms of T2DM but may also assist in the development of strategies to prevent this disease.






  • Article type: Journal Article
    BACKGROUND: The evolution of complex diseases can be modeled as a time-dependent nonlinear dynamic system, and its progression can be divided into three states: the normal state, the pre-disease state and the disease state. The sudden deterioration of the disease can be regarded as a state transition of the dynamic system at the critical, or pre-disease, state. How to detect the critical state of an individual before the disease state from single-sample data has attracted the attention of many researchers.
    METHODS: In this study, we propose a novel approach, the single-sample-based Jensen-Shannon divergence (sJSD) method, to detect the early-warning signals of complex diseases before critical transitions based on individual single-sample data. The method constructs a score index based on sJSD, namely the inconsistency index (ICI).
    RESULTS: The method is applied to five real datasets: prostate cancer, bladder urothelial carcinoma, influenza virus infection, cervical squamous cell carcinoma and endocervical adenocarcinoma, and pancreatic adenocarcinoma. The critical states of the five datasets, with their corresponding sJSD signal biomarkers, are successfully identified to diagnose and predict each individual sample, and some "dark genes" that show no differential expression but are sensitive to the ICI score are revealed. The method is data-driven and model-free, and can be applied not only to disease prediction for individuals but also to targeted drug design for each disease. The identification of sJSD signal biomarkers is also of great significance for studying the molecular mechanism of disease progression from a dynamic perspective.
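    For readers unfamiliar with the quantity behind sJSD, the following is a minimal sketch of the standard Jensen-Shannon divergence between two discrete distributions. It is not the authors' ICI score, whose construction is not given in the abstract. The function names and the toy vectors are hypothetical.

```python
import math


def normalize(values):
    """Scale non-negative values so they sum to 1."""
    total = sum(values)
    if total <= 0:
        raise ValueError("values must have a positive sum")
    return [v / total for v in values]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits.

    Terms with p_i == 0 contribute nothing, by the usual convention.
    """
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; symmetric and bounded by [0, 1]."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    p, q = normalize(p), normalize(q)
    # The mixture m is positive wherever p or q is, so KL to m is finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Toy example: a reference profile versus one perturbed by a single sample.
reference = [10, 10, 10, 10]
perturbed = [10, 10, 25, 5]
print(js_divergence(reference, perturbed))
```

    Comparing against the mixture `m` rather than directly between `p` and `q` is what keeps the value finite when one distribution has zeros where the other does not.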






  • Article type: Journal Article
    The human genome contains "dark" gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads that we call dark by depth, and others that have ambiguous alignment, called camouflaged. We assess how well long-read or linked-read technologies resolve these regions.
    Based on standard whole-genome Illumina sequencing data, we identify 36,794 dark regions in 6054 gene bodies from pathways important to human health, development, and reproduction. Of these gene bodies, 8.7% are completely dark and 35.2% are ≥ 5% dark. We identify dark regions that are present in protein-coding exons across 748 genes. Linked-read or long-read sequencing technologies from 10x Genomics, PacBio, and Oxford Nanopore Technologies reduce dark protein-coding regions to approximately 50.5%, 35.6%, and 9.6%, respectively. We present an algorithm to resolve most camouflaged regions and apply it to the Alzheimer's Disease Sequencing Project. We rescue a rare ten-nucleotide frameshift deletion in CR1, a top Alzheimer's disease gene, found in disease cases but not in controls.
    While we could not formally assess the association of the CR1 frameshift mutation with Alzheimer's disease because of insufficient sample size, we believe it merits investigation in a larger cohort. There remain thousands of potentially important genomic regions overlooked by short-read sequencing that are largely resolved by long-read technologies.
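    The "dark by depth" notion can be sketched as a scan over per-base read depth. This is only an illustration under stated assumptions: the abstract does not give the depth cutoff, so `max_depth=5` is a hypothetical threshold, and the depth list stands in for output that a coverage tool would normally provide.

```python
def low_depth_regions(depths, max_depth=5, min_length=1):
    """Return half-open (start, end) intervals where depth <= max_depth.

    depths: per-base read depth along one gene body or contig.
    max_depth and min_length are hypothetical cutoffs for this sketch.
    """
    regions = []
    start = None
    for pos, depth in enumerate(depths):
        if depth <= max_depth:
            if start is None:
                start = pos  # open a new low-depth run
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None  # close the current run
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depths) - start >= min_length:
        regions.append((start, len(depths)))
    return regions


def dark_fraction(depths, max_depth=5):
    """Fraction of positions whose depth is at or below the cutoff."""
    if not depths:
        return 0.0
    return sum(1 for d in depths if d <= max_depth) / len(depths)


# Toy depth profile for a short stretch of sequence.
depths = [30, 2, 0, 1, 28, 31, 4, 3]
print(low_depth_regions(depths))
print(dark_fraction(depths))
```

    A per-gene `dark_fraction` of this kind is the sort of quantity behind statements such as a gene body being "≥ 5% dark". Camouflaged regions, which depend on ambiguous alignment rather than depth, are not covered by this sketch.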






