K-means (k-means) - Yiyun Literature

Literature (18 articles)

1 Dietary patterns associated with the incidence of hypertension among adult Japanese males: application of machine learning to a cohort study.

Impact index: 4.865
Published: Jun 2024 25
Journal: Eur J Nutr  PMID: 38403812
DOI: 10.1007/s00394-024-03342-w
Article type: Journal Article

OBJECTIVE: The previous studies that examined the effectiveness of unsupervised machine learning methods versus traditional methods in assessing dietary patterns and their association with incident hypertension showed contradictory results. Consequently, our aim is to explore the correlation between the incidence of hypertension and overall dietary patterns that were extracted using unsupervised machine learning techniques.
METHODS: Data were obtained from Japanese male participants enrolled in a prospective cohort study between August 2008 and August 2010. A final dataset of 447 male participants was used for analysis. Dimension reduction using uniform manifold approximation and projection (UMAP) and subsequent K-means clustering was used to derive dietary patterns. In addition, multivariable logistic regression was used to evaluate the association between dietary patterns and the incidence of hypertension.
RESULTS: We identified four dietary patterns: 'Low-protein/fiber High-sugar,' 'Dairy/vegetable-based,' 'Meat-based,' and 'Seafood and Alcohol.' Compared with 'Seafood and Alcohol' as a reference, the protective dietary patterns for hypertension were 'Dairy/vegetable-based' (OR 0.39, 95% CI 0.19-0.80, P = 0.013) and 'Meat-based' (OR 0.37, 95% CI 0.16-0.86, P = 0.022) after adjusting for potential confounding factors, including age, body mass index, smoking, education, physical activity, dyslipidemia, and diabetes. An age-matched sensitivity analysis confirmed this finding.
CONCLUSIONS: This study finds that relative to the 'Seafood and Alcohol' pattern, the 'Dairy/vegetable-based' and 'Meat-based' dietary patterns are associated with a lower risk of hypertension among men.

2 Multi-Dimensional Validation of the Integration of Syntactic and Semantic Distance Measures for Clustering Fibromyalgia Patients in the Rheumatic Monitor Big Data Study.

Impact index: 5.046
Published: Jan 2024 19
Journal: Bioengineering (Basel)  PMID: 38275577
DOI: 10.3390/bioengineering11010097
Article type: Journal Article

This study primarily aimed to develop a novel multi-dimensional methodology to discover and validate the optimal number of clusters. The secondary objective was to deploy it for the task of clustering fibromyalgia patients. We present a comprehensive methodology that includes the use of several different clustering algorithms, quality assessment using several syntactic distance measures (the Silhouette Index (SI), Calinski-Harabasz index (CHI), and Davies-Bouldin index (DBI)), stability assessment using the adjusted Rand index (ARI), and the validation of the internal semantic consistency of each clustering option via the performance of multiple clustering iterations after the repeated bagging of the data to select multiple partial data sets. Then, we perform a statistical analysis of the (clinical) semantics of the most stable clustering options using the full data set. Finally, the results are validated through a supervised machine learning (ML) model that classifies the patients back into the discovered clusters and is interpreted by calculating the Shapley additive explanations (SHAP) values of the model. Thus, we refer to our methodology as the clustering, distance measures and iterative statistical and semantic validation (CDI-SSV) methodology. We applied our method to the analysis of a comprehensive data set acquired from 1370 fibromyalgia patients. The results demonstrate that K-means was highly robust in the syntactic and the internal consistent semantics analysis phases and was therefore followed by a semantic assessment to determine the optimal number of clusters (k), which suggested k = 3 as a more clinically meaningful solution, representing three distinct severity levels. The random forest model validated the results by classification into the discovered clusters with high accuracy (AUC: 0.994; accuracy: 0.946). SHAP analysis emphasized the clinical relevance of "functional problems" in distinguishing the most severe condition.
In conclusion, the CDI-SSV methodology offers significant potential for improving the classification of complex patients. Our findings suggest a classification system for different profiles of fibromyalgia patients, which has the potential to improve clinical care by providing clinical markers for the evidence-based personalized diagnosis, management, and prognosis of fibromyalgia patients.
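
The abstract above relies on the adjusted Rand index (ARI) to assess how stable a clustering is across resampled data. As a rough illustration only, the sketch below computes the ARI between two labelings of the same items from pair counts. The function name `adjusted_rand_index` and the toy labelings are assumptions made for this example; the study's own implementation is not described in the abstract.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items (hypothetical helper)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        raise ValueError("need at least two items to form a pair")

    # Contingency counts: items in cluster i of A and cluster j of B.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of item pairs that share a cluster in both, in A, and in B.
    sum_cells = sum(comb(c, 2) for c in cell_counts.values())
    sum_a = sum(comb(c, 2) for c in a_counts.values())
    sum_b = sum(comb(c, 2) for c in b_counts.values())

    expected = sum_a * sum_b / comb(n, 2)  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (sum_cells - expected) / (max_index - expected)
```

Because the index only counts pairs, it ignores how clusters are named: `[0, 0, 1, 1]` and `[1, 1, 0, 0]` count as identical partitions. Values near 0 indicate agreement no better than chance.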

3 Machine Learning and Symptom Patterns in Degenerative Cervical Myelopathy: Web-Based Survey Study.

Impact index: not available
Published: Jan 2024 25
Journal: JMIR Form Res  PMID: 38271070
DOI: 10.2196/54747
Article type: Journal Article

BACKGROUND: Degenerative cervical myelopathy (DCM), a progressive spinal cord injury caused by spinal cord compression from degenerative pathology, often presents with neck pain, sensorimotor dysfunction in the upper or lower limbs, gait disturbance, and bladder or bowel dysfunction. Its symptomatology is very heterogeneous, making early detection as well as the measurement or understanding of the underlying factors and their consequences challenging. Increasingly, evidence suggests that DCM may consist of subgroups of the disease, which are yet to be defined.
OBJECTIVE: This study aimed to explore whether machine learning can identify clinically meaningful groups of patients based solely on clinical features.
METHODS: A survey was conducted wherein participants were asked to specify the clinical features they had experienced, their principal presenting complaint, and time to diagnosis as well as demographic information, including disease severity, age, and sex. K-means clustering was used to divide respondents into clusters according to their clinical features using the Euclidean distance measure and the Hartigan-Wong algorithm. The clinical significance of groups was subsequently explored by comparing their time to presentation, time with disease severity, and other demographics.
RESULTS: After a review of both ancillary and cluster data, it was determined by consensus that the optimal number of DCM response groups was 3. In Cluster 1, there were 40 respondents, and the ratio of male to female participants was 13:21. In Cluster 2, there were 92 respondents, with a male to female participant ratio of 27:65. Cluster 3 had 57 respondents, with a male to female participant ratio of 9:48. A total of 6 people did not report biological sex in Cluster 1. The mean age in this cluster was 56.2 (SD 10.5) years; in Cluster 2, it was 54.7 (SD 9.63) years; and in Cluster 3, it was 51.8 (SD 8.4) years. Patients across clusters significantly differed in the total number of clinical features reported, with the most clinical features in Cluster 3 and the fewest in Cluster 1 (Kruskal-Wallis rank sum test: χ² (df=2) = 159.46; P<.001). There was no relationship between the pattern of clinical features and severity. There were also no differences between clusters regarding time since diagnosis and time with DCM.
CONCLUSIONS: Using machine learning and patient-reported experience, 3 groups of patients with DCM were defined, which were different in the number of clinical features but not in the severity of DCM or time with DCM. Although a clearer biological basis for the clusters may have been missed, the findings are consistent with the emerging observation that DCM is a heterogeneous disease, difficult to diagnose or stratify. There is a place for machine learning methods to efficiently assist with pattern recognition. However, the challenge lies in creating quality data sets necessary to derive benefit from such approaches.
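
The METHODS above name K-means with the Euclidean distance measure and the Hartigan-Wong algorithm. As a rough illustration only, the sketch below uses the simpler Lloyd iteration instead of Hartigan-Wong: assign each point to its nearest centroid, then move each centroid to the mean of its members. The function name `kmeans`, its parameters, and the random initialization are assumptions for this example and are not taken from the study.

```python
import math
import random


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's iteration with Euclidean distance; returns (centroids, labels).

    Hypothetical helper: this is NOT the Hartigan-Wong variant named in the abstract.
    """
    rng = random.Random(seed)
    # Initialize centroids with k distinct data points chosen at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so centroids will not move
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels
```

For survey data like that described above, each respondent would be one point whose coordinates encode the reported clinical features. The result can depend on the initial centroids, so in practice several seeds are usually compared.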

4 Unsupervised cluster analysis of clinical and metabolite characteristics in patients with chronic complications of T2DM: an observational study of real data.

Impact index: 6.055
Published: 2023
Journal: Front Endocrinol (Lausanne)  PMID: 37929026
DOI: 10.3389/fendo.2023.1230921
Article type: Observational Study

The aim of this study was to cluster patients with chronic complications of type 2 diabetes mellitus (T2DM) by cluster analysis in Dalian, China, and examine the variance in risk of different chronic complications and metabolic levels among the various subclusters.
2267 hospitalized patients were included in the K-means cluster analysis based on 11 variables [Body Mass Index (BMI), Systolic Blood Pressure (SBP), Diastolic Blood Pressure (DBP), Glucose, Triglycerides (TG), Total Cholesterol (TC), Uric Acid (UA), microalbuminuria (mAlb), Insulin, Insulin Sensitivity Index (ISI) and Homa Insulin-Resistance (Homa-IR)]. The risk of various chronic complications of T2DM in different subclusters was analyzed by multivariate logistic regression, and the Kruskal-Wallis H test and the Nemenyi test examined the differences in metabolites among different subclusters.
Four subclusters were identified by clustering analysis, and each subcluster had significant features and was labeled with a different level of risk. Cluster 1 contained 1112 inpatients (49.05%), labeled "Low-Risk"; cluster 2 included 859 (37.89%) inpatients, labeled "Medium-Low-Risk"; cluster 3 included 134 (5.91%) inpatients, labeled "Medium-Risk"; and cluster 4 included 162 (7.15%) inpatients, labeled "High-Risk". Additionally, the proportion of patients with multiple chronic complications differed among subclusters, and the risk of the same chronic complication also differed significantly. Compared to the "Low-Risk" cluster, the other three clusters exhibited a higher risk of microangiopathy. After additional adjustment for 20 covariates, the odds ratios (ORs) and 95% confidence intervals (95%CI) of the "Medium-Low-Risk" cluster, the "Medium-Risk" cluster, and the "High-Risk" cluster were 1.369 (1.042, 1.799), 2.188 (1.496, 3.201), and 9.644 (5.851, 15.896) (all p<0.05). Representatively, the "High-Risk" cluster had the highest risk of DN [OR (95%CI): 11.510 (7.139, 18.557), p<0.05] and DR [OR (95%CI): 3.917 (2.526, 6.075), p<0.05] after adjustment for 20 variables. Four metabolites showed statistically significant distribution differences when compared with other subclusters [Threonine (Thr), Tyrosine (Tyr), Glutaryl carnitine (C5DC), and Butyryl carnitine (C4)].
Patients with chronic complications of T2DM had significant clustering characteristics, and the risk of target organ damage in different subclusters was significantly different, as were the levels of metabolites. This may offer a new approach to the prevention and treatment of chronic complications of T2DM.

5 Explorative Clustering of the Nitrogen Balance Trajectory in Critically Ill Patients: A Preliminary post hoc Analysis of a Single-Center Prospective Observational Study.

Impact index: 5.923
Published: Oct 2023 9
Journal: Ann Nutr Metab  PMID: 37812913
DOI: 10.1159/000532126
Article type: Observational Study

BACKGROUND: The nitrogen balance estimates a protein net difference. However, since it has a number of limitations, it is important to consider the trajectory of the nitrogen balance in the clinical course of critically ill patients.
OBJECTIVE: We herein exploratively classified the nitrogen balance trajectory using a machine learning method.
METHODS: This is a post hoc analysis of a single-center prospective study of patients admitted to our Emergency and Critical Center ICU. The nitrogen balance was evaluated with 24-h urine collection from ICU days 1-10 with 9 points. K-means clustering was performed to classify the nitrogen balance trajectory. We also evaluated factors associated with the uncovered clusters.
RESULTS: Seventy-six eligible patients were included in the present study. After clustering, the nitrogen balance trajectory was classified into 4 classes. Class 1 showed a negative balance over the 10 days (24 patients). Class 2 had a positive conversion on day 3 or 4 (8 patients). Class 3 had a positive conversion on day 8 or 9 (28 patients). Class 4 initially had a positive balance and then converted to a negative balance (16 patients). Sepsis complication and steroid use were associated with a negative nitrogen balance trajectory. Class 2 was associated with a shorter length of hospital stay and lower femoral muscle volume loss; however, these patients frequently had frailty and sarcopenia on admission. Active nutrition therapy intention was not correlated with a positive trajectory.
CONCLUSIONS: The nitrogen balance trajectory in critically ill patients may be classified into 4 classes for clinical practice. Among patients emergently admitted to the ICU, the positive conversion of the nitrogen balance might be delayed beyond 10 days.

6 Identification of spinal tuberculosis subphenotypes using routine clinical data: a study based on unsupervised machine learning.

Impact index: 5.348
Published: 2023
Journal: Ann Med  PMID: 37611242
DOI: 10.1080/07853890.2023.2249004
Article type: Journal Article

The identification of spinal tuberculosis subphenotypes is an integral component of precision medicine. However, we lack proper study models to identify subphenotypes in patients with spinal tuberculosis. Here we identified possible subphenotypes of spinal tuberculosis and compared their clinical results.
A total of 422 patients with spinal tuberculosis who received surgical treatment were enrolled. Clustering analysis was performed using the K-means clustering algorithm and the routinely available clinical data collected from patients within 24 h after admission. Finally, the differences in clinical characteristics, surgical efficacy, and postoperative complications among the subphenotypes were compared.
Two subphenotypes of spinal tuberculosis were identified. Laboratory examination results revealed that the levels of more than one inflammatory index in cluster 2 were higher than those in cluster 1. In terms of disease severity, cluster 2 showed a higher Oswestry Disability Index (ODI), a higher visual analogue scale (VAS) score, and a lower Japanese Orthopedic Association (JOA) score. In addition, in terms of postoperative outcomes, cluster 2 patients were more prone to complications, especially wound infections, and had a longer hospital stay.
K-means clustering analysis based on conventional available clinical data can rapidly identify two subtypes of spinal tuberculosis with different clinical results. We believe this finding will help clinicians to rapidly and easily identify the subtypes of spinal tuberculosis at the bedside and become the cornerstone of individualized treatment strategies.

7 Assessing resource allocation based on workload: a data envelopment analysis study on clinical departments in a class A tertiary public hospital in China.

Impact index: 2.908
Published: Jul 2023 28
Journal: BMC Health Serv Res  PMID: 37507799
DOI: 10.1186/s12913-023-09803-y
Article type: Journal Article

OBJECTIVE: Today, the development mode of public hospitals in China is turning from expansion to efficiency, and the management mode is turning from extensive to refined. This study aims to evaluate the efficiency of clinical departments in a Chinese class A tertiary public hospital (Hospital M) to analyze the allocation of hospital resources among these departments providing a reference for the hospital management.
METHODS: The hospitalization data of inpatients from 32 clinical departments of Hospital M in 2021 are extracted from the hospital information system (HIS), and a dataset containing 38,147 inpatients is obtained using stratified sampling. Considering the non-homogeneity of clinical departments, the 38,147 patients are clustered using the K-means algorithm based on workload-related data labels including inpatient days, intensive care workload index, nursing workload index, and operation workload index, so that the medical resource consumption of inpatients from non-homogeneous clinical departments can be transformed into the homogeneous workload of medical staff. Taking the numbers of doctors, nurses, and beds as input indicators, and the numbers of inpatients assigned to certain clusters as output indicators, an input-oriented BCC model, named the workload-based DEA model, is built. Meanwhile, a control DEA model with the number of inpatients and medical revenue as output indicators is built, and the outputs of the two models are compared and analyzed.
RESULTS: Clustering the 38,147 patients into 3 categories gives better interpretability. 14 departments are DEA-efficient in the workload-based DEA model, 10 are DEA-efficient in the control DEA model, and 8 are DEA-efficient in both models. The workload-based DEA model gives a relatively rational judgment of the increase in income brought by scale expansion, and evaluates some special departments such as the Critical Care Medicine Dept., Geriatrics Dept., and Rehabilitation Medicine Dept. more properly, which better adapts to the functional orientation of public hospitals in China.
CONCLUSIONS: The design of evaluating the efficiency of non-homogeneous clinical departments with the workload as output proposed in this study is feasible, and provides a new idea to quantify professional medical human resources, which is of practical significance for public hospitals to optimize the layout of resources, to provide real-time guidance on manpower grouping strategies, and to estimate the expected output reasonably.

8 Shannon entropy as a reliable score to diagnose human fibroelastic degenerative mitral chords: A micro-CT ex-vivo study.

Impact index: 2.356
Published: 12 2022
Journal: Med Eng Phys  PMID: 36564142
DOI: 10.1016/j.medengphy.2022.103919
Article type: Journal Article

This paper is aimed at identifying by means of micro-CT the microstructural differences between normal and degenerative mitral marginal chordae tendineae. The control group is composed of 21 normal chords excised from 14 normal mitral valves from heart transplant recipients. The experimental group comprises 22 degenerative fibroelastic chords obtained at surgery from 11 pathological valves after mitral repair or replacement. In the control group the superficial endothelial cells and spongiosa layer remained intact, covering the wavy core collagen. In contrast, in the experimental group the collagen fibers were arranged as straightened thick bundles in a parallel configuration. 100 cross-sections were examined by micro-CT from each chord. Each image was randomized through the K-means machine learning algorithm and then the global and local Shannon entropies were obtained. The optimum number of clusters, K, was estimated to maximize the differences between normal and degenerative chords in global and local Shannon entropy; the p-value after a nested ANOVA test was chosen as the parameter to be minimized. Optimum results were obtained with global Shannon entropy and 2≤K≤7, providing p < 0.01; for K=3, p = 2.86·10⁻³. These findings open the door to novel perioperative diagnostic methods in order to avoid or reduce postoperative mitral valve regurgitation recurrences.
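
The abstract does not spell out how the Shannon entropies were derived from the K-means output. One plausible reading, assumed here purely for illustration, is that each pixel receives a cluster label and the global entropy is that of the label distribution over the whole image, while a local entropy would apply the same formula to a sub-window. The function name `shannon_entropy` and the toy labels below are hypothetical.

```python
from collections import Counter
from math import log2


def shannon_entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of labels (hypothetical helper)."""
    n = len(labels)
    if n == 0:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    # H = sum over labels of p * log2(1 / p), with p = count / n.
    return sum((c / n) * log2(n / c) for c in counts.values())
```

With K clusters the value ranges from 0 (all pixels in one cluster) to log2(K) (pixels spread evenly over the clusters), so entropies obtained with different K are not directly comparable.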

9 Use of Multiple Correspondence Analysis and K-means to Explore Associations Between Risk Factors and Likelihood of Colorectal Cancer: Cross-sectional Study.

Impact index: 7.076
Published: 07 2022 19
Journal: J Med Internet Res  PMID: 35852835
DOI: 10.2196/29056
Article type: Journal Article

Previous works have shown that risk factors are associated with an increased likelihood of colorectal cancer.
The purpose of this study was to detect these associations in the region of Lleida (Catalonia) by using multiple correspondence analysis (MCA) and k-means.
This cross-sectional study was made up of 1083 colorectal cancer episodes between 2012 and 2015, extracted from the population-based cancer registry for the province of Lleida (Spain), the Primary Care Centers database, and the Catalan Health Service Register. The data set included risk factors such as smoking and BMI as well as sociodemographic information and tumor details. The relations between the risk factors and patient characteristics were identified using MCA and k-means.
The combination of these techniques helps to detect clusters of patients with similar risk factors. Risk of death is associated with being elderly and with being obese or overweight. Stage III cancer is associated with people aged ≥65 years and rural/semiurban populations, while younger people were associated with stage 0.
MCA and k-means were significantly useful for detecting associations between risk factors and patient characteristics. These techniques have proven to be effective tools for analyzing the incidence of some factors in colorectal cancer. The outcomes obtained help corroborate suspected trends and stimulate the use of these techniques for finding the association of risk factors with the incidence of other cancers.

10 Big-Data-Mining-Based Improved K-Means Algorithm for Energy Use Analysis of Coal-Fired Power Plant Units: A Case Study.

Impact index: 2.738
Published: Sep 2018 13
Journal: Entropy (Basel)  PMID: 33265791
DOI: 10.3390/e20090702
Article type: Journal Article

The energy use analysis of coal-fired power plant units is of significance for energy conservation and consumption reduction. One of the most serious problems attributed to Chinese coal-fired power plants is coal waste. Several units in one plant may experience a practical rated output situation at the same time, which may increase the coal consumption of the power plant. Here, we propose a new hybrid methodology for plant-level load optimization to minimize coal consumption for coal-fired power plants. The proposed methodology includes two parts. One part determines the reference value of the controllable operating parameters of net coal consumption under typical load conditions, based on an improved K-means algorithm and the Hadoop platform. The other part utilizes a support vector machine to determine the sensitivity coefficients of various operating parameters for the net coal consumption under different load conditions. Additionally, the fuzzy rough set attribute reduction method was employed to obtain the minimalist properties reduction method parameters to reduce the complexity of the dataset. This work is based on continuously-measured information system data from a 600 MW coal-fired power plant in China. The results show that the proposed strategy achieves high energy conservation performance. Taking the 600 MW load optimization value as an example, the optimized power supply coal consumption is 307.95 g/(kW·h) compared to the actual operating value of 313.45 g/(kW·h). It is important for coal-fired power plants to reduce their coal consumption.

