lasso regression model

  • Article type: Journal Article
    This study investigated the geographical spatial distribution of creatine kinase isoenzyme (CK-MB) in order to provide a scientific basis for clinical examination. CK-MB reference values for 8697 healthy adults in 137 cities in China were collected from a large body of published literature. The Moran index was used to determine the spatial relationship, and 24 geographical factors covering terrain, climate, and soil indexes were selected. Correlation analysis between CK-MB and the geographical factors was used to determine significance, and 9 significant factors were extracted. After the degree of multicollinearity was evaluated in R, ridge, lasso, and PCA models of CK-MB were established. The PCA model was chosen as the best model by comparing relative errors, the predicted values were tested for normality, and disjunctive kriging interpolation was used to map the geographical distribution. The results show that CK-MB reference values of healthy adults were generally correlated with latitude, annual sunshine duration, annual mean relative humidity, annual precipitation amount, and annual range of air temperature, and significantly correlated with annual mean air temperature, topsoil gravel content, topsoil cation exchange capacity in clay, and topsoil cation exchange capacity in silt. The geospatial distribution map shows that the reference value is, on the whole, higher in the north and lower in the south, and increases gradually from the southeast coastal area to the northwest inland area. If the geographical factors of a location are known, the CK-MB model can be used to predict the CK-MB reference value of healthy adults in that region, which provides a reference for considering regional differences in clinical diagnosis.
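The model-selection step described in this abstract (comparing ridge, lasso, and PCA models by relative error) can be sketched as follows. This is a minimal illustration only: the abstract does not define its relative-error measure, so the mean absolute relative error used here is an assumption, and the function names and all numbers are hypothetical rather than taken from the study.

```python
def mean_relative_error(observed, predicted):
    """Mean of |predicted - observed| / |observed| (assumed error definition)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    total = sum(abs(p - o) / abs(o) for o, p in zip(observed, predicted))
    return total / len(observed)


def pick_best_model(observed, predictions_by_model):
    """Return (name, error) for the model with the smallest mean relative error."""
    errors = {
        name: mean_relative_error(observed, preds)
        for name, preds in predictions_by_model.items()
    }
    best = min(errors, key=errors.get)
    return best, errors[best]


# Hypothetical values for illustration only; not data from the study.
observed = [10.0, 12.0, 8.0]
predictions = {
    "ridge": [11.0, 12.6, 8.8],
    "lasso": [10.5, 13.2, 8.4],
    "pca": [10.2, 12.3, 8.0],
}

best_name, best_error = pick_best_model(observed, predictions)
print(best_name, round(best_error, 4))
```

With real data, the predictions would come from models fitted on a training set and evaluated on held-out cities, so that the comparison reflects predictive rather than in-sample error.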

  • Article type: Journal Article
    Corporate carbon performance is a key driver of corporate sustainability, and identifying the factors that influence corporate carbon emissions is fundamental to improving it. Based on the Carbon Disclosure Project (CDP) database, we integrate the least absolute shrinkage and selection operator (LASSO) regression model and the fixed effects model to identify the determinants of carbon emissions. Furthermore, we rank the determining factors according to their importance. We find that Capx enters the models under all carbon contexts. For Scope 1 and Scope 2, financial-level factors play a greater role. For Scope 3, corporate internal incentive policies and emission reduction behaviors are important. In contrast to absolute carbon emissions, for relative carbon emissions, debt-paying ability among the financial-level factors is a vital reference indicator for corporate carbon emissions.
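For reference, the standard LASSO estimator that this kind of factor selection relies on can be written as below. The notation is an assumption made for illustration (the abstract gives no formula): y_i is an emissions measure for observation i, x_i is the vector of p candidate factors, and λ ≥ 0 is the penalty weight. How the authors combine this with the fixed effects model is not stated in the abstract and is not shown here.

```latex
\hat{\beta}(\lambda) = \arg\min_{\beta} \; \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - x_i^{\top}\beta \right)^2 + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

Because the penalty is the sum of absolute coefficient values, sufficiently large λ shrinks some coefficients exactly to zero, so the factors with nonzero coefficients are the ones that "enter the model".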

  • Article type: Journal Article
    BACKGROUND: ceRNAs have emerged as pivotal players in the regulation of gene expression and play a crucial role in the physiology and development of various cancers. Nevertheless, the function and underlying mechanisms of ceRNAs in esophageal cancer (EC) are still largely unknown.
    METHODS: In this study, profiles of DEmRNAs, DElncRNAs, and DEmiRNAs between normal and EC tumor tissue samples were obtained from The Cancer Genome Atlas database using the DESeq package in R, with adjusted P<0.05 and |log2(fold change)|>2 as the cutoffs. The ceRNA network (ceRNet) was initially constructed from miRcode, miRDB, miRTarBase, and TargetScan to reveal the interactions of these ceRNAs during carcinogenesis. Then, independent microarray datasets GSE6188, GSE89102, and GSE92396, together with correlation analysis, were used to validate molecular biomarkers in the initial ceRNet. Finally, a least absolute shrinkage and selection operator logistic regression model was built from the oncogenic ceRNet to diagnose EC more accurately.
    RESULTS: We successfully constructed an oncogenic ceRNet of EC: crosstalk of hsa-miR372-centered CADM2-ADAMTS9-AS2 and hsa-miR145-centered SERPINE1-PVT1. In addition, the risk-score model -0.0053*log2(CADM2) + 0.0168*log2(SERPINE1) - 0.0073*log2(ADAMTS9-AS2) + 0.0905*log2(PVT1) + 0.0047*log2(hsa-miR372) - 0.0193*log2(hsa-miR145), where each term is log2 of the gene count, could improve diagnosis of EC, with an AUC of 0.988.
    CONCLUSIONS: We identified two novel pairs of ceRNAs in EC and their diagnostic role. The hsa-miR372-centered CADM2-ADAMTS9-AS2 pair and the hsa-miR145-centered SERPINE1-PVT1 pair are likely carcinogenic mechanisms of EC, and their joint detection could improve diagnostic accuracy.
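The risk-score formula quoted in the RESULTS can be evaluated with a few lines of code. This is a sketch under stated assumptions: the coefficients are copied from the abstract, but the function name, the input format, and the example counts are hypothetical, and the abstract gives no intercept, decision threshold, or handling of zero counts, so none is applied here.

```python
import math

# Coefficients copied from the risk-score model quoted in the abstract.
COEFFICIENTS = {
    "CADM2": -0.0053,
    "SERPINE1": 0.0168,
    "ADAMTS9-AS2": -0.0073,
    "PVT1": 0.0905,
    "hsa-miR372": 0.0047,
    "hsa-miR145": -0.0193,
}


def risk_score(counts):
    """Sum of coefficient * log2(gene count) over the six markers.

    `counts` maps marker name to a positive count (hypothetical input format).
    """
    missing = [name for name in COEFFICIENTS if name not in counts]
    if missing:
        raise KeyError(f"missing counts for: {', '.join(missing)}")
    for name in COEFFICIENTS:
        if counts[name] <= 0:
            # log2 is undefined for non-positive values; the abstract does not
            # say how zero counts were treated, so this sketch rejects them.
            raise ValueError(f"count for {name} must be positive")
    return sum(coef * math.log2(counts[name]) for name, coef in COEFFICIENTS.items())


# Hypothetical counts for illustration only; not data from the study.
example_counts = {name: 2 for name in COEFFICIENTS}
print(round(risk_score(example_counts), 4))
```

When every count equals 2, each log2 term is 1, so the score reduces to the sum of the six coefficients, which is a quick sanity check on the transcription.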