l2-regularized logistic regression

Literature (3 articles)

1 A Computational Framework for Predicting Direct Contacts and Substructures within Protein Complexes.

Impact index: 6.064
Published: 25 Oct 2019
Journal: Biomolecules PMID: 31717703

DOI: 10.3390/biom9110656
Article type: Journal Article

Understanding the physical arrangement of subunits within protein complexes can provide valuable clues about how the subunits work together and how the complexes function. Most recent research focuses on identifying protein complexes as a whole and seldom studies their inner structure. In this study, we propose a computational framework to predict direct contacts and substructures within protein complexes. In this framework, we first train a supervised l2-regularized logistic regression model to learn the patterns of direct and indirect interactions within complexes, from which physical subunit interaction networks are predicted. Then, to infer substructures within complexes, we apply a graph clustering method (maximum modularity clustering (MMC)) and a gene ontology (GO) semantic similarity based functional clustering to partially and fully connected networks, respectively. Computational results show that the proposed framework achieves fairly good performance in cross-validation and independent testing for detecting direct contacts between subunits. Functional analyses further support the partitioning of subunits into substructures via the MMC algorithm and functional clustering.
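As a rough illustration of the classifier named in this abstract, the sketch below fits an l2-regularized logistic regression by batch gradient descent in plain Python. It is not the authors' implementation: the feature vectors, the labels, the penalty weight `lam`, the learning rate, and the function names are all assumptions made up for the example.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_l2_logreg(X, y, lam=0.1, lr=0.5, epochs=2000):
    """Minimize mean log-loss + (lam / 2) * ||w||^2 by batch gradient descent.

    The bias b is left unpenalized (a common convention, assumed here).
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log-loss w.r.t. the score
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The l2 penalty adds lam * w_j to each weight gradient.
        w = [wj - lr * (gw / n + lam * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Made-up two-feature vectors for subunit pairs:
# label 1 = direct contact, label 0 = indirect interaction.
X = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
y = [1, 1, 1, 0, 0, 0]
w, b = train_l2_logreg(X, y)
```

Calling `predict_proba(w, b, x)` on a new pair's feature vector gives an estimated probability of direct contact. A larger `lam` shrinks the weights toward zero, trading some fit for robustness to noisy labels.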

2 Neglog: Homology-Based Negative Data Sampling Method for Genome-Scale Reconstruction of Human Protein-Protein Interaction Networks.

Impact index: 6.208
Published: 12 Oct 2019
Journal: Int J Mol Sci PMID: 31614890

DOI: 10.3390/ijms20205075
Article type: Journal Article

Rapid reconstruction of genome-scale protein-protein interaction (PPI) networks is instrumental in understanding cellular processes, disease pathogenesis, and drug reactions. However, the lack of experimentally verified negative data (i.e., pairs of proteins that do not interact) is still a major issue that needs to be properly addressed in computational modeling. In this study, we take advantage of the very limited experimentally verified negative data from Negatome to infer more negative data for computational modeling. We assume that the paralogs or orthologs of two non-interacting proteins also, with high probability, do not interact. We term this assumption "Neglog"; it is to some extent supported by paralogous/orthologous structure conservation. To reduce the risk of bias toward the negative data from Negatome, we combine Neglog with less biased random sampling according to a certain ratio to construct training data. L2-regularized logistic regression is used as the base classifier to counteract noise and train on a large dataset. Computational results show that the proposed Neglog method outperforms the pure random sampling method, with sound biological interpretability. In addition, we find that independent testing on negative data is indispensable for bias control, a step that is usually neglected by existing studies. Lastly, we use the Neglog method to validate the PPIs in STRING, and the validated PPIs are supported by gene ontology (GO) enrichment analyses.
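The sampling idea described in this abstract can be sketched as follows, under loud assumptions: the protein identifiers, the homolog map, the `random_ratio` value, and all function names are hypothetical, and the abstract does not state the actual ratio or the homology criteria used. The sketch expands each known non-interacting pair to the pairs formed by the two proteins' homologs, then adds randomly sampled pairs in a chosen proportion.

```python
import random
from itertools import product

def infer_homolog_negatives(known_negatives, homologs):
    """Expand each non-interacting pair (a, b) to all pairs of their homologs."""
    inferred = set()
    for a, b in known_negatives:
        for ha, hb in product(homologs.get(a, []), homologs.get(b, [])):
            if ha != hb:
                inferred.add(frozenset((ha, hb)))
    return inferred

def sample_random_negatives(proteins, excluded, k, rng):
    """Draw k distinct random protein pairs that are not in `excluded`."""
    proteins = sorted(proteins)  # fixed order so a seed gives repeatable draws
    max_pairs = len(proteins) * (len(proteins) - 1) // 2 - len(excluded)
    if k > max_pairs:
        raise ValueError("not enough candidate pairs to sample from")
    chosen = set()
    while len(chosen) < k:
        pair = frozenset(rng.sample(proteins, 2))
        if pair not in excluded:
            chosen.add(pair)
    return chosen

def build_negative_set(known_negatives, homologs, proteins, positives,
                       random_ratio, seed=0):
    """Return (homology-inferred negatives, random negatives).

    random_ratio is the number of random pairs per inferred pair (assumed).
    """
    rng = random.Random(seed)
    inferred = infer_homolog_negatives(known_negatives, homologs) - positives
    n_random = round(random_ratio * len(inferred))
    randoms = sample_random_negatives(proteins, positives | inferred, n_random, rng)
    return inferred, randoms

# Made-up toy data: one verified non-interacting pair and its homologs.
known_negatives = [("P1", "P2")]
homologs = {"P1": ["H1a", "H1b"], "P2": ["H2a", "H2b"]}
proteins = {"H1a", "H1b", "H2a", "H2b", "H3", "H4"}
positives = {frozenset(("H1a", "H3"))}
inferred, randoms = build_negative_set(
    known_negatives, homologs, proteins, positives, random_ratio=1.0
)
```

Pairs are stored as `frozenset` objects so that (A, B) and (B, A) count as the same pair. Random pairs are drawn only from pairs that are neither known positives nor already inferred, which keeps the two negative sources disjoint.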

3 Transferring knowledge of bacterial protein interaction networks to predict pathogen targeted human genes and immune signaling pathways: a case study on M. tuberculosis.

Impact index: 4.547
Published: 28 Jun 2018
Journal: BMC Genomics PMID: 29954330

DOI: 10.1186/s12864-018-4873-9
Article type: Journal Article

BACKGROUND: Bacterial invasive infection and the host immune response are fundamental to understanding pathogenesis and to discovering effective therapeutic drugs. However, there have been very few experimental studies to date on the signaling cross-talk between bacteria and the human host.
METHODS: In this work, taking M. tuberculosis H37Rv (MTB), which co-evolves with its human host, as an example, we propose a general computational framework that exploits the known bacterial pathogen protein interaction networks in the STRING database to predict pathogen-host protein interactions and their signaling cross-talk. In this framework, significant interlogs are derived from the known pathogen protein interaction networks to train a predictive l2-regularized logistic regression model.
RESULTS: The computational results show that the proposed method achieves excellent cross-validation performance as well as low predicted positive rates on the less significant interlogs and non-interlogs, indicating a low risk of false discovery. We further conduct gene ontology (GO) and pathway enrichment analyses of the predicted pathogen-host protein interaction networks, which potentially provide insights into the mechanisms by which M. tuberculosis H37Rv targets human genes and signaling pathways. In addition, we analyse the pathogen-host protein interactions related to drug resistance, inhibition of which potentially provides an alternative solution to M. tuberculosis H37Rv drug resistance.
CONCLUSIONS: The proposed machine learning framework has been shown to be effective for predicting bacteria-host protein interactions from known bacterial protein interaction networks. For the vast majority of bacterial pathogens, which lack experimental studies of bacteria-host protein interactions, this framework is expected to be generally applicable. The predicted protein interaction networks between M. tuberculosis H37Rv and Homo sapiens, provided in the Additional files, are promising for two applications: (1) providing an alternative solution to drug resistance; (2) revealing the patterns by which M. tuberculosis H37Rv genes target human immune signaling pathways.
