Biocomputational method

  • Article type: Journal Article
    Deconvolution algorithms mostly rely on single-cell RNA-sequencing (scRNA-seq) data applied to bulk RNA-sequencing (bulk RNA-seq) data to estimate tissues' cell-type composition, with accuracy validated on deposited databases. Adipose tissues' cellular composition is highly variable, and adipocytes can only be captured by single-nucleus RNA-sequencing (snRNA-seq). Here we report the development of sNucConv, a deconvolution tool based on the Scaden deep-learning framework, trained on snRNA-seq data from 5 human subcutaneous adipose tissue (hSAT) and 7 human visceral adipose tissue (hVAT) samples, corrected by (i) genes highly correlated between snRNA-seq and bulk RNA-seq and (ii) individual cell-type regression models. Applying sNucConv to our bulk RNA-seq data yielded cell-type proportion estimates for 15 and 13 cell types, with accuracy of R = 0.93 (range: 0.76-0.97) and R = 0.95 (range: 0.92-0.98) for hVAT and hSAT, respectively. This level of performance was further validated on an independent set of samples (5 hSAT; 5 hVAT). The resulting model was depot specific, reflecting depot differences in gene expression patterns. Together, these results show that sNucConv provides proof of concept for producing validated deconvolution models for tissues not amenable to scRNA-seq.
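
    The abstract above describes a deep-learning deconvolution model; the sketch below is not that model. It only illustrates the underlying problem (recovering cell-type proportions from a bulk profile, then scoring the estimate with Pearson's R) using plain non-negative least squares on synthetic data. The signature matrix, noise level, and all variable names are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)

n_genes, n_types = 200, 5
# Hypothetical signature matrix: mean expression of each gene in each cell type.
signatures = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_types))

# Simulate a ground-truth composition and the bulk profile it would produce.
true_props = rng.dirichlet(np.ones(n_types))
bulk = signatures @ true_props + rng.normal(0.0, 0.01, size=n_genes)

# Solve min ||signatures @ w - bulk|| subject to w >= 0,
# then rescale the weights so they sum to 1 and read as proportions.
weights, _residual = nnls(signatures, bulk)
est_props = weights / weights.sum()

# Pearson correlation between estimated and true proportions.
r = float(np.corrcoef(true_props, est_props)[0, 1])
print(np.round(est_props, 3), round(r, 3))
```

    On real data the true proportions are unknown, which is why the abstract reports accuracy against snRNA-seq-derived proportions and an independent validation set rather than a simulation.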

  • Article type: Journal Article
    Ab initio computational reconstructions of protein-protein interaction (PPI) networks will provide invaluable insights into cellular systems, enabling the discovery of novel molecular interactions and elucidating biological mechanisms within and between organisms. Leveraging the latest-generation protein language models and recurrent neural networks, we present SENSE-PPI, a sequence-based deep learning model that efficiently reconstructs PPIs ab initio, distinguishing partners among tens of thousands of proteins and identifying specific interactions within functionally similar proteins. SENSE-PPI demonstrates high accuracy, limited training requirements, and versatility in cross-species predictions, even with non-model organisms and human-virus interactions. Its performance decreases for phylogenetically more distant model and non-model organisms, but the signal degrades very slowly. In this regard, it demonstrates the important role of parameters in protein language models. SENSE-PPI is very fast and can test 10,000 proteins against themselves in a matter of hours, enabling the reconstruction of genome-wide proteomes.

  • Article type: Journal Article
    Coronary artery disease (CAD) remains a leading cause of disease burden globally, and there is a persistent need for new therapeutic targets. Instrumental variable (IV) and genetic colocalization analyses can help identify novel therapeutic targets for human disease by nominating causal genes in genome-wide association study (GWAS) loci. We conducted cis-IV analyses for 20,125 genes and 1,746 plasma proteins with CAD using molecular quantitative trait locus (QTL) variant data from three different studies. Nineteen proteins and 119 genes were significantly associated with CAD risk by IV analyses and demonstrated evidence of genetic colocalization. Notably, our analyses validated well-established targets such as PCSK9 and ANGPTL4 while also identifying HTRA1 and endotrophin (a cleavage product of COL6A3) as proteins whose levels are causally associated with CAD risk. Further experimental studies are needed to confirm the causal role of the genes and proteins identified through our multiomic cis-IV analyses of human disease.
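
    The abstract does not say which IV estimator was used. As a minimal sketch of the idea, assuming a single cis variant as the instrument, the commonly used Wald ratio divides the variant's effect on the outcome (CAD) by its effect on the exposure (gene expression or protein level). The function name and the input numbers below are hypothetical.

```python
import math

def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument IV estimate (Wald ratio) with a first-order standard error."""
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    estimate = beta_outcome / beta_exposure
    # First-order delta-method SE; ignores uncertainty in beta_exposure.
    se = se_outcome / abs(beta_exposure)
    z = estimate / se
    # Two-sided p-value under a standard normal null.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return estimate, se, p_value

# Hypothetical summary statistics for one variant.
est, se, p = wald_ratio(beta_exposure=0.50, beta_outcome=-0.10, se_outcome=0.02)
print(est, se, p)
```

    A significant ratio alone does not establish causality, which is why the abstract pairs IV analysis with genetic colocalization.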

  • Article type: Journal Article
    Amyotrophic lateral sclerosis (ALS) is a universally fatal neurodegenerative disease with no cure. Human endogenous retroviruses (HERVs) have been implicated in its pathogenesis, but their relevance to ALS is not fully understood. We examined bulk RNA-seq data from almost 2,000 ALS and unaffected control samples derived from the cortex and spinal cord. Using different methods of feature selection, including differential expression analysis and machine learning, we discovered that transcription of HERV-K loci 1q22 and 8p23.1 was significantly upregulated in the spinal cord of individuals with ALS. Additionally, we identified a subset of ALS patients with upregulated HERV-K expression in the cortex and spinal cord. We also found that the expression of HERV-K loci 19q11 and 8p23.1 was correlated with protein-coding genes previously implicated in ALS and dysregulated in ALS patients in this study. These results clarify the association of HERV-K and ALS and highlight specific genes in the pathobiology of late-stage ALS.

  • Article type: Journal Article
    Evaluation of the binding affinities of drugs to proteins is a crucial process for identifying drug pharmacological actions, but it requires three-dimensional structures of proteins. Herein, we propose novel computational methods to predict the therapeutic indications and side effects of drug candidate compounds from the binding affinities to human protein structures on a proteome-wide scale. Large-scale docking simulations were performed for 7,582 drugs with 19,135 protein structures predicted by AlphaFold (including experimentally unresolved proteins), and machine learning models on the proteome-wide binding affinity score (PBAS) profiles were constructed. We demonstrated the usefulness of the method for predicting the therapeutic indications for 559 diseases and side effects for 285 toxicities. The method made it possible to predict drug indications for which the related protein structures had not been experimentally determined and to successfully extract proteins eliciting the side effects. The proposed method will be useful in various applications in drug discovery.

  • Article type: Journal Article
    Cytotoxic T lymphocyte (CTL) and terminal exhausted T lymphocyte (ETL) activities crucially influence immune checkpoint inhibitor (ICI) response. Despite this, the efficacy of ETL and CTL transcriptomic signatures for response prediction remains limited. Investigating this across the TCGA and publicly available single-cell cohorts, we find a strong positive correlation between ETL and CTL expression signatures in most cancers. We hence posited that their limited predictive power arises from their mutually canceling effects on ICI response. Thus, we developed DETACH, a computational method to identify a gene set whose expression pinpoints a subset of melanoma patients in whom the CTL and ETL correlation is low. DETACH enhances CTL's prediction accuracy, outperforming existing signatures. DETACH signature gene activity also correlates positively with lymphocyte infiltration and the prevalence of reactive T cells in the tumor microenvironment (TME), advancing our understanding of the CTL cell state within the TME.

  • Article type: Journal Article
    Accurate detection of pathogens, particularly distinguishing between Gram-positive and Gram-negative bacteria, could improve disease treatment. Host gene expression can capture the immune system's response to infections caused by various pathogens. Here, we present a deep neural network model, bvnGPS2, which incorporates the attention mechanism and is based on a large-scale integrated host transcriptome dataset, to precisely identify Gram-positive and Gram-negative bacterial infections as well as viral infections. We analyzed 4,949 blood samples across 40 cohorts from 10 countries using our previously designed omics data integration method, iPAGE, to select discriminant gene pairs and train bvnGPS2. The performance of the model was evaluated on six independent cohorts comprising 374 samples. Overall, our deep neural network model shows robust capability to accurately identify specific infections, paving the way for precision medicine strategies in infection treatment and potentially also for identifying subtypes of other diseases.

  • Article type: Journal Article
    Interactions within the tumor microenvironment (TME) significantly influence tumor progression and treatment responses. While single-cell RNA sequencing (scRNA-seq) and spatial genomics facilitate TME exploration, many clinical cohorts are assessed at the bulk tissue level. Integrating scRNA-seq and bulk tissue RNA-seq data through computational deconvolution is essential for obtaining clinically relevant insights. Our method, ProM, enables the examination of both major and minor cell types. Evaluated against existing methods using paired single-cell and bulk RNA sequencing of human urothelial cancer (UC) samples, ProM demonstrates superior performance. Application to UC cohorts treated with immune checkpoint inhibitors reveals pre-treatment cellular features associated with poor outcomes, such as elevated SPP1 expression in macrophages/monocytes (MM). Our deconvolution method and paired single-cell and bulk tissue RNA-seq dataset contribute novel insights into TME heterogeneity and resistance to immune checkpoint blockade.

  • Article type: Journal Article
    Autopsy rates are declining globally, impacting cause-of-death (CoD) diagnoses and quality control. Postmortem metabolomics was evaluated for CoD screening using 4,282 human cases, encompassing five CoD groups: acidosis, drug intoxication, hanging, ischemic heart disease (IHD), and pneumonia. Cases were split 3:1 into training and test sets. High-resolution mass spectrometry data from femoral blood were analyzed via orthogonal partial least squares discriminant analysis (OPLS-DA) to discriminate CoD groups. OPLS-DA achieved R² = 0.52 and Q² = 0.30, with true-positive prediction rates of 68% and 65% for the training and test sets, respectively, across all groups. Specificity-optimized thresholds predicted 56% of test cases with a unique CoD, with an average sensitivity of 45% and an average specificity of 96%. Prediction accuracies varied: 98.7% for acidosis, 80.5% for drug intoxication, 81.6% for hanging, 73.1% for IHD, and 93.6% for pneumonia. This study demonstrates the potential of large-scale postmortem metabolomics for CoD screening, offering high specificity and enhancing throughput and decision-making in human death investigations.
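
    The per-group sensitivity and specificity figures above are one-vs-rest quantities. As a minimal sketch of how such values are derived from predicted and true labels (not the study's OPLS-DA pipeline), the function below counts true/false positives and negatives for one class at a time. The function name and the six toy cases are invented for illustration.

```python
def sensitivity_specificity(y_true, y_pred, label):
    """One-vs-rest sensitivity and specificity for a single class label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    tn = sum(t != label and p != label for t, p in pairs)
    # Return NaN when a class (or its complement) is absent from y_true.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity

# Toy labels, not data from the study.
y_true = ["IHD", "IHD", "hanging", "pneumonia", "IHD", "hanging"]
y_pred = ["IHD", "hanging", "hanging", "pneumonia", "IHD", "IHD"]

for label in sorted(set(y_true)):
    sens, spec = sensitivity_specificity(y_true, y_pred, label)
    print(f"{label}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

    Raising a class's decision threshold trades sensitivity for specificity, which is consistent with the low average sensitivity and high average specificity reported for the specificity-optimized thresholds.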

  • Article type: Journal Article
    Microinjecting yeast cells has been challenging for decades, with no significant breakthrough, because of the ultra-tough cell wall and the low stiffness of the traditional injector tip at the micro-scale. Penetrating this protective wall is the key step for artificially introducing foreign substances into the yeast. In this paper, a yeast cell model was built using the finite element analysis (FEA) method to analyze the penetration process. The key parameters of the yeast cell wall in the model (the Young's modulus, the shear modulus, and the Lamé constant) were calibrated according to a general nanoindentation experiment. Then, using the calibrated model, the injection parameters were optimized to minimize cell damage (the maximum cell deformation at the critical stress of the cell wall). Key guidelines are suggested for penetrating the cell wall during microinjection.
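
    The three wall parameters named above are not independent if the material is treated as isotropic and linearly elastic, which is an assumption here and not something the abstract states. Under that assumption, the shear modulus and the first Lamé constant follow from the Young's modulus E and Poisson's ratio ν as G = E / (2(1 + ν)) and λ = Eν / ((1 + ν)(1 − 2ν)). The sketch below encodes these standard relations; the input values are arbitrary and are not yeast cell wall measurements.

```python
def shear_modulus(E, nu):
    """Shear modulus G of an isotropic linear elastic material."""
    return E / (2.0 * (1.0 + nu))

def lame_first_parameter(E, nu):
    """First Lame parameter (lambda) of an isotropic linear elastic material."""
    if not -1.0 < nu < 0.5:
        raise ValueError("Poisson's ratio must lie in (-1, 0.5)")
    return E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))

# Arbitrary illustrative inputs (any consistent stress unit).
E, nu = 120.0, 0.25
G = shear_modulus(E, nu)
lam = lame_first_parameter(E, nu)

# Consistency check: E can be recovered from (lambda, G).
E_back = G * (3.0 * lam + 2.0 * G) / (lam + G)
print(G, lam, E_back)
```

    If the model in the paper uses a different constitutive law (for example a hyperelastic one), these relations hold only in the small-strain limit.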