Biological network

  • Article type: Journal Article
    Link prediction (LP) is the task of identifying potential, missing, and spurious links in complex networks. Protein-protein interaction (PPI) networks are important for understanding the underlying biological mechanisms of diseases. Many complex networks have been constructed using LP methods; however, only a limited number of studies focus on disease-related gene prediction and evaluate the predicted genes using multiple evaluation criteria. The main objective of this study is to investigate the effect of a simple ensemble method on disease-related gene prediction. Disease-related gene predictions based on local similarity indices (LSIs) were integrated by a simple ensemble decision method, simple majority voting (SMV), on the PPI network to detect disease-related genes accurately. A human PPI network was used to discover potential disease-related genes with four LSIs. The LSIs discovered potential links between disease-related genes, which were obtained from the OMIM database for gastric, colorectal, breast, prostate, and lung cancers. LSI-based disease-related genes were ranked by their LSI scores in descending order to retrieve the top 10, 50, and 100 disease-related genes. SMV integrated the four LSI-based predictions to obtain the SMV-based top 10, 50, and 100 disease-related genes. The performance of the LSI-based and SMV-based gene lists was evaluated separately through overlap analyses against the GeneCard disease-gene relation dataset and Gene Ontology (GO) terms. The GO terms were used for biological assessment of the gene lists inferred by the LSIs and SMV for all cancer types. The Adamic-Adar (AA), Resource Allocation Index (RAI), and SMV-based gene lists generally achieved good performance on all cancers in both overlap analyses. SMV also outperformed the other methods on breast cancer data. Increasing the number of top-ranked disease-related genes selected also improved the performance of SMV.
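
The ranking-and-voting pipeline described above can be illustrated with a minimal sketch. Several details here are assumptions rather than the study's method: the abstract names only AA and RAI, so common neighbours and Jaccard stand in for the other two LSIs; candidate genes are scored by summing their similarity to the seed genes; and "majority" is taken to mean more than half of the indices. The toy network and gene names are placeholders, not real PPI data.

```python
import math
from collections import Counter

# Toy undirected PPI network with placeholder gene names (not real data).
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
         ("C", "D"), ("D", "E"), ("E", "F")]

def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors(adj, u, v):
    return float(len(adj[u] & adj[v]))

def jaccard(adj, u, v):
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

def adamic_adar(adj, u, v):
    # A neighbour shared by two distinct nodes has degree >= 2, so log > 0.
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])

def resource_allocation(adj, u, v):
    return sum(1.0 / len(adj[z]) for z in adj[u] & adj[v])

# AA and RAI are named in the text; CN and JC are assumed stand-ins.
INDICES = {"CN": common_neighbors, "JC": jaccard,
           "AA": adamic_adar, "RAI": resource_allocation}

def top_k(adj, seeds, index, k):
    """Rank non-seed genes by summed similarity to the seeds (assumed scoring)."""
    scores = {g: sum(index(adj, g, s) for s in seeds)
              for g in adj if g not in seeds}
    # Descending score; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda g: (-scores[g], g))[:k]

def simple_majority_vote(top_lists):
    """Keep genes that appear in more than half of the top-k lists."""
    votes = Counter(g for genes in top_lists for g in genes)
    needed = len(top_lists) // 2 + 1
    return sorted(g for g, n in votes.items() if n >= needed)

adj = build_adjacency(EDGES)
seeds = {"A"}  # hypothetical known disease gene
top_lists = [top_k(adj, seeds, fn, k=2) for fn in INDICES.values()]
consensus = simple_majority_vote(top_lists)
print(consensus)
```

On this small graph all four indices agree, so the vote is trivial. On a real network the indices weight hub neighbours differently, and the vote keeps only genes that most of them rank highly.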






  • Article type: Journal Article
    Systematic characterization of the biological effects of genetic perturbations is essential to the application of molecular biology and biomedicine. However, exhaustively testing genetic perturbations by experiment on a genome-wide scale is challenging. Here, we present TranscriptionNet, a deep learning model that integrates multiple biological networks to systematically predict transcriptional profiles for three types of genetic perturbations (RNA interference, clustered regularly interspaced short palindromic repeats (CRISPR), and overexpression), based on transcriptional profiles induced by genetic perturbations in the L1000 project. TranscriptionNet performs better than existing approaches in predicting inducible gene expression changes for all three types of genetic perturbations. TranscriptionNet can predict transcriptional profiles for all genes in existing biological networks and extends the coverage of perturbational gene expression changes for each type of genetic perturbation from a few thousand genes to 26 945 genes. TranscriptionNet demonstrates strong generalization ability when predicted and true gene expression changes are compared on different external tasks. Overall, TranscriptionNet can systematically predict the transcriptional consequences of perturbing genes on a genome-wide scale and thus holds promise for systematically detecting gene function and enhancing drug development and target discovery.






  • Article type: Journal Article
    Biological networks play a crucial role in elucidating intricate biological processes. While interspecies environmental interactions have been extensively studied, the exploration of gene interactions within species, particularly among individual microorganisms, is less developed. The increasing amount of microbiome genomic data necessitates a more nuanced analysis of microbial genome structures and functions. In this context, we introduce a complex structure based on higher-order network theory, "Solid Motif Structures (SMS)", via a hierarchical biological network analysis of genomes within the same genus, effectively linking microbial genome structure with its function. Leveraging 162 high-quality genomes of Microcystis, a key freshwater cyanobacterium within microbial ecosystems, we established a genome structure network. Employing deep learning techniques, such as an adaptive graph encoder, we uncovered 27 critical functional subnetworks and their associated SMSs. Incorporating metagenomic data from seven geographically distinct lakes, we investigated the functional stability of Microcystis under varying environmental conditions, unveiling unique functional interaction models for each lake. Our work compiles these insights into an extensive resource repository, providing novel perspectives on the functional dynamics within Microcystis. This research offers a hierarchical network analysis framework for understanding interactions between microbial genome structures and functions within the same genus.






  • Article type: Journal Article
    Deciphering the intricate relationships between transcription factors (TFs), enhancers, and genes through the inference of enhancer-driven gene regulatory networks (eGRNs) is crucial for understanding gene regulatory programs in a complex biological system. This study introduces STREAM, a novel method that leverages a Steiner forest problem model, a hybrid biclustering pipeline, and submodular optimization to infer eGRNs from jointly profiled single-cell transcriptome and chromatin accessibility data. Compared to existing methods, STREAM demonstrates enhanced performance in terms of TF recovery, TF-enhancer linkage prediction, and enhancer-gene relation discovery. Application of STREAM to an Alzheimer's disease dataset and a diffuse small lymphocytic lymphoma dataset reveals its ability to identify TF-enhancer-gene relations associated with pseudotime, as well as key TF-enhancer-gene relations and TF cooperation underlying tumor cells.






  • Article type: Journal Article
    The growing complexity of biological data has spurred the development of innovative computational techniques to extract meaningful information and uncover hidden patterns within vast datasets. Biological networks, such as gene regulatory networks and protein-protein interaction networks, hold critical insights into the connections and functions of biological features. Integrating and analyzing high-dimensional data, particularly in gene expression studies, is prominent among the challenges in deciphering these networks. Clustering methods play a crucial role in addressing these challenges, with spectral clustering emerging as a potent unsupervised technique that considers intrinsic geometric structures. However, the user-defined cluster number in spectral clustering can lead to inconsistent and sometimes orthogonal clustering regimes. We propose the Multi-layer Bundling (MLB) method to address this limitation, combining multiple prominent clustering regimes to offer a comprehensive view of the data. We call the resulting clusters "bundles". This approach refines clustering outcomes, unravels hierarchical organization, and identifies bridge elements mediating communication between network components. By layering clustering results, MLB provides a global-to-local view of biological feature clusters, enabling insights into intricate biological systems. Furthermore, the method enhances bundle network predictions by integrating the bundle co-cluster matrix with the affinity matrix. The versatility of MLB extends beyond biological networks, making it applicable to various domains where understanding complex relationships and patterns is needed.
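
The idea of layering several clustering results and combining a co-cluster matrix with an affinity matrix can be sketched as follows. This is a simplified illustration under stated assumptions, not the MLB implementation: the co-cluster entry is taken to be the fraction of layers in which two features share a cluster, and the integration is a hypothetical convex blend with weight `alpha`. The label vectors and affinity values are made up.

```python
def co_cluster_matrix(layers):
    """Entry (i, j) is the fraction of clustering layers in which
    features i and j were assigned to the same cluster."""
    n = len(layers[0])
    counts = [[0.0] * n for _ in range(n)]
    for labels in layers:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1.0
    return [[c / len(layers) for c in row] for row in counts]

def blend(co_cluster, affinity, alpha=0.5):
    """Convex combination of co-cluster and affinity matrices.
    The blending rule and alpha are assumptions made for illustration."""
    return [[alpha * c + (1.0 - alpha) * a for c, a in zip(c_row, a_row)]
            for c_row, a_row in zip(co_cluster, affinity)]

# Three clustering layers over four features, e.g. from runs with
# different cluster numbers (hypothetical label vectors).
layers = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]
co = co_cluster_matrix(layers)

# Hypothetical symmetric affinity matrix for the same four features.
affinity = [
    [1.0, 0.8, 0.4, 0.2],
    [0.8, 1.0, 0.6, 0.3],
    [0.4, 0.6, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
]
combined = blend(co, affinity, alpha=0.5)
```

Pairs that co-cluster in every layer keep a co-cluster value of 1, while pairs that co-cluster in only some layers get intermediate values. Such intermediate values are one way candidate bridge elements might show up.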






  • Article type: Journal Article
    BACKGROUND: Ginseng Radix and Astragali Radix are commonly combined to tonify Qi and alleviate fatigue. Previous studies have employed biological networks to investigate the mechanisms of herb pairs in treating different diseases. However, these studies have only elucidated a single network for each herb pair, without emphasizing the superiority of the herb combination over individual herbs.
    OBJECTIVE: This study proposes an approach of comparing biological networks to highlight the synergistic effect of the pair in treating cancer-related fatigue (CRF).
    METHODS: The compounds and targets of Ginseng Radix, Astragali Radix, and CRF diseases were collected and predicted using different databases. Subsequently, the overlapping targets between herbs and disease were imported into the STRING and DAVID tools to build protein-protein interaction (PPI) networks and analyze enriched KEGG pathways. The biological networks of Ginseng Radix and Astragali Radix were compared separately or together using the DyNet application. Molecular docking was used to verify the predicted results. Further, in vitro experiments were conducted to validate the synergistic pathways identified in in silico studies.
    RESULTS: In the PPI network comparison, the combination created 89 new interactions and had a higher average degree (11.260) than the single herbs (10.296 and 9.394). The new interactions were concentrated on HRAS, STAT3, JUN, and IL6. The topological analysis identified 20 core targets of the combination, including three Ginseng Radix-specific targets, three Astragali Radix-specific targets, and 14 shared targets. In KEGG enrichment analysis, the combination regulated more signaling pathways (152) than Ginseng Radix (146) or Astragali Radix (134) alone. The targets of the herb pair synergistically regulated cancer pathways, specifically the hypoxia-inducible factor 1 (HIF-1) signaling pathway. In vitro experiments, including enzyme-linked immunosorbent assay and Western blot, demonstrated that the two-herb combination could up-regulate the HIF-1α signaling pathway at different combined concentrations compared with either herb alone.
    CONCLUSIONS: The herb pair increased protein interactions and adjusted metabolic pathways more than single herbs. This study provides insights into the combination of Ginseng Radix and Astragali Radix in clinical practice.
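
The two comparison metrics reported in the RESULTS (new interactions and average degree) can be computed from plain edge lists, as in the minimal sketch below. It does not reproduce DyNet or the study's data. The protein names `P1` to `P4` and all edges are hypothetical. Average degree is taken as 2|E|/|N|, the usual definition for an undirected network.

```python
def normalize(edges):
    """Treat edges as undirected by storing each one as a frozenset."""
    return {frozenset(edge) for edge in edges}

def average_degree(edges):
    """Average degree of an undirected network: 2 * |E| / |N|."""
    edge_set = normalize(edges)
    nodes = set().union(*edge_set) if edge_set else set()
    return 2 * len(edge_set) / len(nodes) if nodes else 0.0

def new_interactions(combined, *singles):
    """Edges present in the combined network but in none of the single networks."""
    known = set().union(*(normalize(s) for s in singles))
    return normalize(combined) - known

# Hypothetical PPI edge lists for two single herbs and their combination.
herb_a = [("P1", "P2"), ("P2", "P3")]
herb_b = [("P3", "P4"), ("P1", "P3")]
combined = herb_a + herb_b + [("P2", "P4")]

print(average_degree(herb_a), average_degree(herb_b), average_degree(combined))
print(new_interactions(combined, herb_a, herb_b))
```

In this toy case the combined network gains one interaction that neither single-herb network contains, and its average degree is higher than either single network's.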






  • Article type: Journal Article
    Kidney transplantation is the preferred treatment for people suffering from end-stage renal disease. Even successful kidney transplants eventually fail, an event known as graft failure; however, the time to graft failure, or graft survival time, can vary significantly between recipients. A significant biological factor affecting graft survival times is the compatibility between the human leukocyte antigens (HLAs) of the donor and recipient. We propose to model HLA compatibility using a network, where the nodes denote different HLAs of the donor and recipient, and edge weights denote compatibilities of the HLAs, which can be positive or negative. The network is indirectly observed, as the edge weights are estimated from transplant outcomes rather than directly observed. We propose a latent space model for such indirectly observed weighted and signed networks. We demonstrate that our latent space model not only yields more accurate estimates of HLA compatibilities, but can also be incorporated into survival analysis models to improve accuracy for the downstream task of predicting graft survival times.






  • Article type: Journal Article
    BACKGROUND: Biological networks have proven invaluable for representing biological knowledge. Multilayer networks, which gather different types of nodes and edges in multiplex, heterogeneous, and bipartite networks, provide a natural way to integrate diverse and multi-scale data sources into a common framework. Recently, we developed MultiXrank, a Random Walk with Restart algorithm able to explore such multilayer networks. MultiXrank outputs scores reflecting the proximity between an initial set of seed node(s) and all the other nodes in the multilayer network. We illustrate here the versatility of bioinformatics tasks that can be performed using MultiXrank.
    RESULTS: We first show that MultiXrank can be used to prioritise genes and drugs of interest by exploring multilayer networks containing interactions between genes, drugs, and diseases. In a second study, we illustrate how MultiXrank scores can also be used in a supervised strategy to train a binary classifier to predict gene-disease associations. The classifier's performance is validated using outdated and novel gene-disease associations for training and evaluation, respectively. Finally, we show that MultiXrank scores can be used to compute diffusion profiles and use them as disease signatures. We computed the diffusion profiles of more than 100 immune diseases using a multilayer network that includes cell-type specific genomic information. The clustering of the immune disease diffusion profiles reveals shared phenotypic characteristics.
    CONCLUSIONS: Overall, we illustrate here diverse applications of MultiXrank to showcase its versatility. We expect that this can lead to further and broader bioinformatics applications.
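
For readers unfamiliar with Random Walk with Restart, the sketch below shows the basic iteration on a single-layer, unweighted, undirected network. It is a simplification for illustration only: it is not MultiXrank's API and does not handle multilayer, heterogeneous, or bipartite structure. The restart probability of 0.7, the toy path graph, and all function names are assumptions.

```python
def build_adjacency(edges):
    """Map each node to the set of its neighbours (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def random_walk_with_restart(adj, seeds, restart=0.7, tol=1e-12, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the scores stop changing.

    W spreads each node's probability evenly over its neighbours and
    p0 is uniform over the seed nodes. Every node is assumed to have at
    least one neighbour, which holds for graphs built from an edge list.
    """
    nodes = sorted(adj)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            share = (1.0 - restart) * p[u] / len(adj[u])
            for v in adj[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

# Path graph A - B - C - D with a single seed node A.
adj = build_adjacency([("A", "B"), ("B", "C"), ("C", "D")])
scores = random_walk_with_restart(adj, seeds={"A"})
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

The resulting scores sum to one and decrease with distance from the seed, which is the sense in which such scores reflect proximity to the seed node(s).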






  • Article type: Journal Article
    Biological networks are commonly used in biomedical and healthcare domains to effectively model the structure of complex biological systems with interactions linking biological entities. However, due to their characteristics of high dimensionality and low sample size, directly applying deep learning models on biological networks usually faces severe overfitting. In this work, we propose R-Mixup, a Mixup-based data augmentation technique that suits the symmetric positive definite (SPD) property of adjacency matrices from biological networks with optimized training efficiency. The interpolation process in R-Mixup leverages the log-Euclidean distance metrics from the Riemannian manifold, effectively addressing the swelling effect and arbitrarily incorrect label issues of vanilla Mixup. We demonstrate the effectiveness of R-Mixup with five real-world biological network datasets on both regression and classification tasks. Besides, we derive a commonly ignored necessary condition for identifying the SPD matrices of biological networks and empirically study its influence on the model performance. The code implementation can be found in Appendix E.
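
The abstract does not spell out the interpolation, so the block below gives the standard log-Euclidean interpolation between two SPD matrices as a sketch of what such a Mixup step could look like. The notation ($S_i$, $S_j$, $y_i$, $y_j$, $\lambda$) is assumed, and the label rule is assumed to follow vanilla Mixup.

```latex
% Vanilla Mixup on two samples (S_i, y_i) and (S_j, y_j), with \lambda \in [0, 1]:
\tilde{S}_{\mathrm{lin}} = (1-\lambda)\, S_i + \lambda\, S_j,
\qquad
\tilde{y} = (1-\lambda)\, y_i + \lambda\, y_j

% Log-Euclidean interpolation (matrix logarithm and matrix exponential):
\tilde{S} = \exp\!\bigl( (1-\lambda) \log S_i + \lambda \log S_j \bigr)

% Matrix logarithm of an SPD matrix S = U \,\mathrm{diag}(\sigma_1, \dots, \sigma_n)\, U^{\top},
% which requires every eigenvalue \sigma_k > 0:
\log S = U \,\mathrm{diag}(\log \sigma_1, \dots, \log \sigma_n)\, U^{\top}

% Since \det \exp(A) = \exp(\operatorname{tr} A) and \operatorname{tr} \log S = \log \det S:
\det \tilde{S} = (\det S_i)^{1-\lambda} \, (\det S_j)^{\lambda}
```

The last identity shows why this interpolation avoids the swelling effect: the determinant of the mixed matrix always lies between those of the two endpoints. A linear mix has no such bound. For example, $S_i = \mathrm{diag}(1, 4)$ and $S_j = \mathrm{diag}(4, 1)$ both have determinant 4, but their midpoint $\mathrm{diag}(2.5, 2.5)$ has determinant 6.25.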






  • Article type: Journal Article
    Biological macromolecules in living organisms, such as DNA, RNA, and proteins, form an intricate network that plays a key role in many biological processes. Many attempts have been made to build new networks by connecting non-communicating proteins with network mediators, especially antibodies. In this study, we devised an aptamer-based switching system that enables communication between non-interacting proteins. As a proof of concept, two proteins, Cas13a and T7 RNA polymerase (T7 RNAP), were rationally connected using an aptamer that specifically binds to T7 RNAP. The proposed switching system can be modulated in both signal-on and signal-off manners, and its responsiveness to the target activator can be controlled by adjusting the reaction time. This study paves the way for the expansion of biological networks by using aptamers to mediate interactions between proteins.





