Regression Analysis

  • Article type: Journal Article
    Patients with schizophrenia often exhibit poor life skills, posing significant clinical challenges. Life skills comprise cognitive functions crucial for planning daily activities, including divergent thinking. However, the cognitive deficits contributing to these diminished skills among patients with schizophrenia are underexplored. This study introduces a modified Tinkertoy Test (m-TTT) to investigate the correlations among life skills, divergent thinking, and psychological assessment tools in patients with schizophrenia.
    Fifty-two patients with schizophrenia and a control group matched for sex, age, and education were evaluated using psychological assessment tools. For the patient group, the Life Skills Profile (LSP) and the Positive and Negative Syndrome Scale were administered to measure functional abilities and psychiatric symptoms, respectively. Additionally, duration of disease and daily antipsychotic dosage were assessed exclusively in the patient group. Both groups were evaluated with the m-TTT, Idea Fluency Test (IFT), Design Fluency Test (DFT), and Brief Assessment of Cognition in Schizophrenia (BACS) to comprehensively assess cognitive functions. A stepwise multiple regression model was fitted to identify significant correlates of the LSP total score in the patient group.
    The schizophrenia group scored notably lower than the neurotypical controls on the m-TTT, IFT, DFT, and BACS. The stepwise multiple regression analysis showed that the LSP total score was significantly correlated with the total m-TTT score and the presence of negative symptoms.
    Divergent thinking could be a crucial factor in the life skills of individuals with schizophrenia. Rehabilitation programs based on this cognitive function might enhance their daily living capabilities.

  • Article type: Journal Article
    This paper introduces a novel approach to modeling malaria incidence in Nigeria by integrating clustering strategies with regression modeling and leveraging meteorological data. By decomposing the datasets into multiple subsets using clustering techniques, we increase the number of explanatory variables and elucidate the role of weather in predicting different ranges of incidence data. Our clustering-integrated regression models, accompanied by optimal barriers, provide insights into the complex relationship between malaria incidence and well-established weather factors such as rainfall and temperature. We explore two models. The first model incorporates lagged incidence and individual-specific effects. The second model focuses solely on weather components. The choice of model depends on decision-makers' priorities; model one is recommended for higher predictive accuracy. Moreover, our findings reveal significant variability in malaria incidence that is specific to certain geographic clusters and goes beyond what can be explained by observed weather variables alone. Notably, rainfall and temperature exhibit varying marginal effects across incidence clusters, indicating their differential impact on malaria transmission. High rainfall correlates with lower incidence, possibly due to its role in flushing mosquito breeding sites. On the other hand, temperature could not predict high-incidence cases, suggesting that factors other than temperature contribute to high case counts. Our study addresses the demand for comprehensive modeling of malaria incidence, particularly in regions like Nigeria where the disease remains prevalent. By integrating clustering techniques with regression analysis, we offer a nuanced understanding of how predetermined weather factors influence malaria transmission. This approach aids public health authorities in implementing targeted interventions. Our research underscores the importance of considering local contextual factors in malaria control efforts and highlights the potential of weather-based forecasting for proactive disease management.

  • Article type: Journal Article
    Imbalances in electrolyte concentrations can have severe consequences, but accurate and accessible measurements could improve patient outcomes. The current measurement method based on blood tests is accurate but invasive and time-consuming, and it is often unavailable, for example, in remote locations or in an ambulance setting. In this paper, we explore the use of deep neural networks (DNNs) for regression tasks to accurately predict continuous electrolyte concentrations from electrocardiograms (ECGs), a quick and widely adopted tool. We analyze our DNN models on a novel dataset of over 290,000 ECGs across four major electrolytes and compare their performance with traditional machine learning models. For improved understanding, we also study the full spectrum from continuous predictions to a binary classification of extreme concentration levels. Finally, we investigate probabilistic regression approaches and explore uncertainty estimates for enhanced clinical usefulness. Our results show that DNNs outperform traditional models, but model performance varies significantly across different electrolytes. While discretization leads to good classification performance, it does not address the original problem of continuous concentration level prediction. Probabilistic regression has practical potential, but our uncertainty estimates are not perfectly calibrated. Our study is therefore a first step towards developing an accurate and reliable ECG-based method for electrolyte concentration level prediction, a method with high potential impact within multiple clinical scenarios.
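    One common way to check whether uncertainty estimates from a probabilistic regression are calibrated is to compare the nominal level of a predictive interval with its empirical coverage. The sketch below assumes Gaussian predictive distributions (a predicted mean and standard deviation per sample); the abstract does not state which predictive distribution or calibration metric the authors used, so the function name `interval_coverage` and the toy numbers are purely illustrative.

```python
from statistics import NormalDist


def interval_coverage(y_true, mu, sigma, level=0.9):
    """Fraction of observations inside the central `level` Gaussian interval.

    Assumption: each prediction is a Gaussian with mean mu[i] and standard
    deviation sigma[i]. A well-calibrated model has coverage close to `level`.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided quantile
    inside = sum(
        1 for y, m, s in zip(y_true, mu, sigma) if abs(y - m) <= z * s
    )
    return inside / len(y_true)


# Synthetic illustration only (not data from the study).
y_true = [4.1, 3.9, 4.6, 5.8]
mu = [4.0, 4.0, 4.5, 4.2]
sigma = [0.3, 0.3, 0.3, 0.3]
coverage = interval_coverage(y_true, mu, sigma, level=0.9)
```

    In this toy example three of the four observations fall inside their 90% intervals, so the empirical coverage is below the nominal level, which is the kind of mismatch that "not perfectly calibrated" refers to.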

  • Article type: English Abstract
    OBJECTIVE: To investigate the feasibility of constructing the risk index of Echinococcus infection based on the classification of echinococcosis lesions, so as to provide insights into the management of echinococcosis.
    METHODS: The imaging data of echinococcosis cases were collected from epidemiological surveys of echinococcosis in China from 2012 to 2016, and the detection of incident echinococcosis cases was captured from the annual echinococcosis prevention and control reports across provinces (autonomous regions) and Xinjiang Production and Construction Corps in China from 2017 to 2022. After echinococcosis lesions were classified, a risk index of Echinococcus infection was constructed based on the principle of discrete distribution marginal probability and multi-group classification data tests. The correlation between the risk index of Echinococcus infection and the detection of incident echinococcosis cases was evaluated in the provinces (autonomous regions and corps) from 2017 to 2022, and the correlations between the short and medium-term risk indices and between the medium and long-term risk indices of Echinococcus infection were examined using a univariate linear regression model.
    RESULTS: A total of 4,014 echinococcosis cases in China from 2012 to 2016 were included in this study. The short-, medium- and long-term risk indices of E. granulosus infection varied in echinococcosis-endemic provinces (autonomous regions and corps) of China (χ² = 4.12 to 708.65, all P values < 0.05), with high short- (0.058), medium- (0.137) and long-term risk indices (0.104) in Tibet Autonomous Region. The short-, medium- and long-term risk indices of E. multilocularis infection also varied in echinococcosis-endemic provinces (autonomous regions and corps) of China (χ² = 6.74 to 122.60, all P values < 0.05), with a high short-term risk index in Sichuan Province (0.016) and high medium- (0.009) and long-term risk indices (0.018) in Qinghai Province. There were no significant correlations between the risk index of E. granulosus infection and the detection of incident cystic echinococcosis cases during the study period (t = -0.518 to 2.265, all P values > 0.05). Strong correlations were found between the risk indices of E. multilocularis infection and the detection of incident alveolar echinococcosis cases (including mixed type) in 2018, 2020, 2021 and 2022, and during the periods 2017 through 2020, 2017 through 2021 and 2017 through 2022 (all r values > 0.7, t = 2.521 to 3.692, all P values < 0.05). Linear regression models were established between the risk index of E. multilocularis infection and the detection of alveolar echinococcosis cases (including mixed type), and the models were all statistically significant (b = 0.214 to 2.168, t = 2.458 to 3.692, F = 6.044 to 13.629, all P values < 0.05). The regression coefficients for the relationships between the medium- and short-term, and between the long- and medium-term risk indices of E. granulosus infection were 2.339 and 0.765, and the corresponding regression coefficients for E. multilocularis infection were 0.280 and 1.842, with statistical significance seen in both the regression coefficients and the regression models (t = 16.479 to 197.304, F = 271.570 to 38,928.860, all P values < 0.05).
    CONCLUSIONS: The risk index of Echinococcus infection has been successfully established based on the classification of echinococcosis lesions, which may provide insights into the prevention and control, prediction, diagnosis and treatment, and classified management of echinococcosis.

  • Article type: Journal Article
    Scalable PTSD screening strategies must be brief, accurate and capable of administration by a non-specialized workforce.
    We used PTSD as determined by the structured clinical interview as our gold standard and considered predictor sets of (a) Posttraumatic Stress Checklist-5 (PCL-5) questions, (b) Primary Care PTSD Screen for the DSM-5 (PC-PTSD) questions, and (c) PCL-5 and PC-PTSD questions combined to identify the optimal items for PTSD screening in public sector settings in Kenya. A logistic regression model using LASSO was fit by minimizing the average squared error in the validation data. Area under the receiver operating characteristic curve (AUROC) measured discrimination performance.
    Penalized regression analysis suggested a screening tool that sums the Likert scale values of two PCL-5 questions: intrusive thoughts of the stressful experience (#1) and insomnia (#21). This had an AUROC of 0.85 (using hold-out test data) for predicting PTSD as evaluated by the MINI, which outperformed the PC-PTSD. The AUROC was similar in subgroups defined by age, sex, and number of categories of trauma experienced (all AUROCs > 0.83), except among those with no trauma history, for whom the AUROC was 0.78.
    In some East African settings, a 2-item PTSD screening tool may outperform longer screeners and is easily scaled by a non-specialist workforce.
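    The scoring rule described above (sum two Likert responses, then assess discrimination with the AUROC) can be sketched in a few lines. The AUROC is computed here from its pairwise definition: the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half. The responses and labels below are synthetic, and the function names are hypothetical; this is not the authors' code or data.

```python
def two_item_score(item_a, item_b):
    """Screening score: the sum of two Likert responses (0-4 each, assumed)."""
    return item_a + item_b


def auroc(scores, labels):
    """AUROC via the pairwise (Mann-Whitney) definition; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic responses to the two items and reference-standard labels.
responses = [(0, 1), (1, 0), (3, 4), (2, 2), (4, 3), (0, 0), (2, 2)]
labels = [0, 0, 1, 0, 1, 0, 1]
scores = [two_item_score(a, b) for a, b in responses]
auc = auroc(scores, labels)
```

    The pairwise loop is quadratic in the sample size, which is fine for illustration; a rank-based formula gives the same value more efficiently on large samples.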

  • Article type: Journal Article
    Patients with lupus nephritis experience disease symptoms and side effects from treatment. Although self-management behaviors are important in patients with this disease, there is limited research on the factors influencing these behaviors.
    This study aimed to examine the factors influencing self-management behaviors in patients with lupus nephritis.
    This cross-sectional study was conducted in 240 patients with lupus nephritis at a university hospital in Thailand between August 2019 and December 2020 using a random sampling method. Data were collected using a demographic and clinical characteristic questionnaire, Self-Management Behavior Questionnaire, Self-efficacy for Managing Chronic Disease: A 6-item Scale, Knowledge about Lupus Nephritis Questionnaire, Family Support Scale, Social Networks in Adult Life Questionnaire, and Memorial Symptom Assessment Scale for Lupus Nephritis. Descriptive statistics and multiple linear regression analyses were employed.
    The participants reported a moderate level of self-management behaviors. Multiple regression analyses revealed that disease duration, income, symptoms, self-efficacy, knowledge, family support, social networks, and classes of lupus nephritis together explained 21% of the variance in self-management behaviors (R² = 0.21; F(8,231) = 7.73; p < 0.001). Family support (β = 0.32, p < 0.001) and symptoms (β = -0.23, p < 0.001) were significant determinants of self-management behaviors in patients with lupus nephritis.
    The findings provide valuable insight for nurses to better understand the factors influencing self-management behaviors in patients with lupus nephritis. Patients with low family support and high symptom severity may face difficulty in performing self-management behaviors. Nurses should pay more attention to these patients and provide family-based interventions to optimize self-management behaviors in this population.
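    The reported overall F test can be cross-checked from R², the number of predictors, and the sample size using the standard identity F = (R²/k) / ((1 - R²)/(n - k - 1)). The sketch below applies it to the figures in the abstract (R² = 0.21, k = 8 predictors, n = 240); the helper name is hypothetical. Because the published R² is rounded to two decimals, the result is expected to be close to, not identical to, the reported 7.73.

```python
def f_statistic_from_r2(r2, n_predictors, n_obs):
    """Overall F statistic of a linear regression with an intercept.

    Returns (F, df_model, df_residual).
    """
    df_model = n_predictors
    df_resid = n_obs - n_predictors - 1
    f_value = (r2 / df_model) / ((1.0 - r2) / df_resid)
    return f_value, df_model, df_resid


# Figures taken from the abstract; R^2 is rounded, so F is approximate.
f_value, df1, df2 = f_statistic_from_r2(0.21, 8, 240)
```

    The degrees of freedom come out as (8, 231), matching the abstract, and the F value is about 7.68, consistent with the reported 7.73 once rounding of R² is taken into account.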

  • Article type: Journal Article
    Treatment of severe burn injury generally requires enormous human and material resources, including specialized intensive care, staged surgery, and continued restoration. This contributes to the enormous burden on patients and their families. The cost of burn treatment is influenced by many factors, including the demographic and clinical characteristics of the patient. This study aimed to determine the costs of burn care and its associated predictive factors in Korle-Bu Teaching Hospital, Ghana.
    An analytical cross-sectional study was conducted among 65 consenting adult patients on admission at the Burns Centre of the Korle-Bu Teaching Hospital. Demographic and clinical characteristics of patients as well as the direct cost of burns treatment were obtained. Multiple regression analysis was done to determine the predictors of the direct cost of burn care.
    A total of sixty-five (65) participants were enrolled in the study, with a male-to-female ratio of 1.4:1 and a mean age of 35.9 ± 14.6 years. Nearly 85% sustained burns covering 10-30% of total body surface area, whilst only 6.2% (4) had burns covering more than 30% of total body surface area. The mean total cost of burns treatment was GHS 22,333.15 (USD 3,897.58). Surgical treatment, wound dressing and medication charges accounted for 45.6%, 27.5% and 9.8% of the total cost of burn care, respectively.
    The direct costs of burn treatment were substantially high and were predicted by the percentage of total body surface area burned and the length of hospital stay.

  • Article type: Journal Article
    Gene expression is dynamic and varies across the stages of a biological process. The identification of gene profiles with temporal-specific expression patterns can provide valuable insights into ongoing biological processes, such as the cell cycle, cell development, circadian rhythms, or responses to external stimuli such as drug treatments or viral infections. However, no database currently defines, identifies or archives gene profiles with temporal-specific expression patterns. Here, using a high-throughput regression analysis approach, eight linear and nonlinear parametric models were fitted to gene expression profiles from time-series experiments to identify eight types of gene profiles with temporal-specific expression patterns. We curated 2,684 time-series transcriptome datasets and identified 2,644,370 gene profiles exhibiting temporal-specific expression patterns. The results were stored in the database GeTeSEPdb (gene profiles with temporal-specific expression patterns database, http://www.inbirg.com/GeTeSEPdb/). Moreover, we implemented an online tool to identify gene profiles with temporal-specific expression patterns from user-submitted data. In summary, GeTeSEPdb is a comprehensive web service that can be used to identify and analyse gene profiles with temporal-specific expression patterns. This approach facilitates the exploration of transcriptional changes and temporal patterns of responses. We firmly believe that GeTeSEPdb will become a valuable resource for biologists and bioinformaticians.

  • Article type: Journal Article
    Risk prediction models fitted using maximum likelihood estimation (MLE) are often overfitted, resulting in predictions that are too extreme and a calibration slope (CS) less than 1. Penalized methods, such as Ridge and Lasso, have been suggested as a solution to this problem as they tend to shrink regression coefficients toward zero, resulting in predictions closer to the average. The amount of shrinkage is regulated by a tuning parameter, λ, commonly selected via cross-validation ("standard tuning"). Though penalized methods have been found to improve calibration on average, they often over-shrink and exhibit large variability in the selected λ and hence the CS. This is a problem, particularly for small sample sizes, but also when using sample sizes recommended to control overfitting. We consider whether these problems are partly due to selecting λ using cross-validation with "training" datasets of reduced size compared to the original development sample, resulting in an over-estimation of λ and, hence, excessive shrinkage. We propose a modified cross-validation tuning method ("modified tuning"), which estimates λ from a pseudo-development dataset obtained via bootstrapping from the original dataset, albeit of larger size, such that the resulting cross-validation training datasets are of the same size as the original dataset. Modified tuning can be easily implemented in standard software and is closely related to bootstrap selection of the tuning parameter ("bootstrap tuning"). We evaluated modified and bootstrap tuning for Ridge and Lasso in simulated and real data using recommended sample sizes, and sizes slightly lower and higher. They substantially improved the selection of λ, resulting in improved CS compared to the standard tuning method. They also improved predictions compared to MLE.
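    The sample-size idea behind modified tuning can be made concrete. With K-fold cross-validation on n rows, each training set has only n(K-1)/K rows; drawing a bootstrap pseudo-development sample of about nK/(K-1) rows restores training sets of about n rows. The sketch below only generates the row indices for such splits. It is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the fold assignment is a simple interleaving, and it does not address how rows that are resampled into both the training and validation parts of a split should be handled.

```python
import random


def pseudo_development_size(n, k):
    """Bootstrap sample size m such that m * (k - 1) / k is about n."""
    return round(n * k / (k - 1))


def modified_tuning_splits(n, k, seed=0):
    """Return k (train_idx, val_idx) pairs drawn from a bootstrap sample.

    Indices refer to rows of the original dataset of size n.
    """
    rng = random.Random(seed)
    m = pseudo_development_size(n, k)
    boot = [rng.randrange(n) for _ in range(m)]  # sample with replacement
    folds = [boot[i::k] for i in range(k)]       # interleaved fold assignment
    splits = []
    for i in range(k):
        val_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i
                     for idx in fold]
        splits.append((train_idx, val_idx))
    return splits


# Example: n = 100 rows, 10-fold cross-validation.
splits = modified_tuning_splits(100, 10, seed=1)
train_sizes = [len(train) for train, _ in splits]
```

    For n = 100 and K = 10 the pseudo-development sample has 111 rows, so every training set has 99 or 100 rows instead of the 90 used by standard tuning. The splits could then be passed to whichever penalized regression routine is used to evaluate candidate values of λ.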

  • Article type: Journal Article
    Heat capacity data of many crystalline solids can be described in a physically sound manner by Debye-Einstein integrals in the temperature range from 0 K to 300 K. The parameters of the Debye-Einstein approach are obtained either by a Markov chain Monte Carlo (MCMC) global optimization method or by a Levenberg-Marquardt (LM) local optimization routine. In the MCMC approach, the model parameters and the coefficients of a function describing the residuals of the measurement points are optimized simultaneously. This yields the Bayesian credible interval for the heat capacity function. Although the two regression tools (LM and MCMC) are entirely different approaches, both the values of the Debye-Einstein parameters and their standard errors appear to be similar. The calculated model parameters and their associated standard errors are then used to derive the enthalpy, entropy and Gibbs energy as functions of temperature. By direct insertion of the MCMC parameters of all 4·10⁵ computer runs, the distributions of the integral quantities enthalpy, entropy and Gibbs energy are determined.
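    For orientation, the textbook Debye and Einstein heat capacity functions can be evaluated numerically as below. This is a minimal sketch, not the authors' model: the abstract does not give the exact combination of Debye and Einstein terms, so the weighted sum in `debye_einstein_heat_capacity` (with a hypothetical weight `w`) is an assumption, as are the function names, the Simpson integration, and the cutoff of the Debye integral at x = 50, beyond which the integrand is negligible.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)


def _simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def _debye_integrand(x):
    # x^4 e^x / (e^x - 1)^2, written with e^(-x) to avoid overflow
    if x == 0.0:
        return 0.0
    return x ** 4 * math.exp(-x) / math.expm1(-x) ** 2


def debye_heat_capacity(T, theta_d):
    """Debye molar heat capacity C_V(T) for Debye temperature theta_d."""
    if T <= 0.0:
        return 0.0
    upper = min(theta_d / T, 50.0)  # assumed cutoff, integrand ~0 beyond
    return 9.0 * R * (T / theta_d) ** 3 * _simpson(_debye_integrand, 0.0, upper)


def einstein_heat_capacity(T, theta_e):
    """Einstein molar heat capacity C_V(T) for Einstein temperature theta_e."""
    if T <= 0.0:
        return 0.0
    x = theta_e / T
    return 3.0 * R * x ** 2 * math.exp(-x) / math.expm1(-x) ** 2


def debye_einstein_heat_capacity(T, theta_d, theta_e, w):
    """Hypothetical weighted sum of one Debye and one Einstein term."""
    return (w * debye_heat_capacity(T, theta_d)
            + (1.0 - w) * einstein_heat_capacity(T, theta_e))
```

    Both terms approach the Dulong-Petit value 3R at high temperature, and the Debye term follows the T³ law at low temperature, which gives two quick sanity checks before fitting parameters such as `theta_d`, `theta_e` and `w` with a least-squares or MCMC routine.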