Data - Yiyun Literature, Digital Yiyun Research Cloud: large-scale medical decision data service

Literature (682 articles)

1 Using community-based participatory research methods to build the foundation for an equitable integrated health data system within a Canadian urban context.

Impact index: 4.666
Published: 1 Jul 2024
Journal: Int J Equity Health; PMID: 38951827

DOI: 10.1186/s12939-024-02179-3
Article type: Journal Article

Health inequalities amplified by the COVID-19 pandemic have disproportionately affected racialized and equity-deserving communities across Canada. In the Municipality of Peel, existing data, while limited, illustrates that individuals from racialized and equity-deserving communities continue to suffer, receive delayed care, and die prematurely. In response to these troubling statistics, grassroots community advocacy has called on health systems leaders in Peel to work with community and non-profit organizations to address the critical data and infrastructure gaps that hinder addressing the social determinants of health in the region. To support these advocacy efforts, we used a community-based participatory research approach to understand how we might build a data collection ecosystem across sectors, alongside community residents and service providers, to accurately capture the data about the social determinants of health. This approach involved developing a community engagement council, defining the problem with the community, mapping what data is actively collected and what is excluded, and understanding experiences of sociodemographic data collection from community members and service providers. Guided by community voices, our study focused on sociodemographic data collection in the primary care context and identified which service providers use and collect these data, how data are used in their work, the facilitators and barriers to data use and collection. Additionally, we gained insight into how sociodemographic data collection could be respectful, safe, and properly governed from the perspectives of community members. From this study, we identify a set of eight recommendations for sociodemographic data collection and highlight limitations. This foundational community-based work will inform future research in establishing data governance in partnership with diverse and equity-deserving communities.

2 Health consequences of disasters: Advancing disaster data science.

Impact index: not available
Published: Jun 2024
Journal: PNAS Nexus; PMID: 38911596

DOI: 10.1093/pnasnexus/pgae211
Article type: Journal Article

Understanding the health effects of disasters is critical for effective preparedness, response, recovery, and mitigation. However, research is negatively impacted by both the limited availability of disaster data and the difficulty of identifying and utilizing disaster-specific and health data sources relevant to disaster research and management. In response to numerous requests from disaster researchers, emergency managers, and operational response organizations, 73 distinct data sources at the intersection of disasters and health were compiled and categorized. These data sources generally cover the entire United States, address both disasters and health, and are available to researchers at little or no cost. Data sources are described and characterized to support improved research and guide evidence-based decision making. Current gaps and potential solutions are presented to improve disaster data collection, utilization, and dissemination.

3 Prediction of Chronic Stress and Protective Factors in Adults: Development of an Interpretable Prediction Model Based on XGBoost and SHAP Using National Cross-sectional DEGS1 Data.

Impact index: not available
Published: 16 May 2023
Journal: JMIR AI; PMID: 38875576

DOI: 10.2196/41868
Article type: Journal Article

BACKGROUND: Chronic stress is highly prevalent in the German population. It has known adverse effects on mental health, such as burnout and depression. Known long-term effects of chronic stress are cardiovascular disease, diabetes, and cancer.
OBJECTIVE: This study aims to derive an interpretable multiclass machine learning model for predicting chronic stress levels and factors protecting against chronic stress based on representative nationwide data from the German Health Interview and Examination Survey for Adults, which is part of the national health monitoring program.
METHODS: A data set from the German Health Interview and Examination Survey for Adults study including demographic, clinical, and laboratory data from 5801 participants was analyzed. A multiclass eXtreme Gradient Boosting (XGBoost) model was constructed to classify participants into 3 categories including low, middle, and high chronic stress levels. The model's performance was evaluated using the area under the receiver operating characteristic curve, precision, recall, specificity, and the F1-score. Additionally, SHapley Additive exPlanations was used to interpret the prediction XGBoost model and to identify factors protecting against chronic stress.
RESULTS: The multiclass XGBoost model exhibited the macroaverage scores, with an area under the receiver operating characteristic curve of 81%, precision of 63%, recall of 52%, specificity of 78%, and F1-score of 54%. The most important features for low-level chronic stress were male gender, very good general health, high satisfaction with living space, and strong social support.
CONCLUSIONS: This study presents a multiclass interpretable prediction model for chronic stress in adults in Germany. The explainable artificial intelligence technique SHapley Additive exPlanations identified relevant protective factors for chronic stress, which need to be considered when developing interventions to reduce chronic stress.
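
The macroaveraged scores reported in this abstract can be illustrated with a small sketch. It assumes the usual one-vs-rest definition (each metric is computed per class and then averaged with equal weight); the labels below are hypothetical toy data, not DEGS1 data, and `macro_scores` is an illustrative helper rather than the authors' code.

```python
def macro_scores(y_true, y_pred, labels):
    """Macroaveraged precision, recall, specificity and F1 (one-vs-rest per class)."""
    n = len(y_true)
    totals = {"precision": 0.0, "recall": 0.0, "specificity": 0.0, "f1": 0.0}
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        tn = n - tp - fp - fn
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        totals["precision"] += precision
        totals["recall"] += recall
        totals["specificity"] += specificity
        totals["f1"] += f1
    # Equal weight per class, regardless of class size
    return {name: value / len(labels) for name, value in totals.items()}


# Hypothetical toy labels for the three stress levels
y_true = ["low", "low", "low", "mid", "mid", "high"]
y_pred = ["low", "low", "mid", "mid", "high", "high"]
scores = macro_scores(y_true, y_pred, labels=["low", "mid", "high"])
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because F1 is averaged per class, the macro F1 is generally not the harmonic mean of the macro precision and macro recall, which is consistent with the reported 54% sitting below both 63% and the harmonic mean of 63% and 52%.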

4 Patient Embeddings From Diagnosis Codes for Health Care Prediction Tasks: Pat2Vec Machine Learning Framework.

Impact index: not available
Published: 21 Apr 2023
Journal: JMIR AI; PMID: 38875541

DOI: 10.2196/40755
Article type: Journal Article

BACKGROUND: In health care, diagnosis codes in claims data and electronic health records (EHRs) play an important role in data-driven decision making. Any analysis that uses a patient's diagnosis codes to predict future outcomes or describe morbidity requires a numerical representation of this diagnosis profile made up of string-based diagnosis codes. These numerical representations are especially important for machine learning models. Most commonly, binary-encoded representations have been used, usually for a subset of diagnoses. In real-world health care applications, several issues arise: patient profiles show high variability even when the underlying diseases are the same, they may have gaps and not contain all available information, and a large number of appropriate diagnoses must be considered.
OBJECTIVE: We herein present Pat2Vec, a self-supervised machine learning framework inspired by neural network-based natural language processing that embeds complete diagnosis profiles into a small real-valued numerical vector.
METHODS: Based on German outpatient claims data with diagnosis codes according to the International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10), we discovered an optimal vectorization embedding model for patient diagnosis profiles with Bayesian optimization for the hyperparameters. The calibration process ensured a robust embedding model for health care-relevant tasks by aggregating the metrics of different regression and classification tasks using different machine learning algorithms (linear and logistic regression as well as gradient-boosted trees). The models were tested against a baseline model that binary encodes the most common diagnoses. The study used diagnosis profiles and supplementary data from more than 10 million patients from 2016 to 2019 and was based on the largest German ambulatory claims data set. To describe subpopulations in health care, we identified clusters (via density-based clustering) and visualized patient vectors in 2D (via dimensionality reduction with uniform manifold approximation). Furthermore, we applied our vectorization model to predict prospective drug prescription costs based on patients' diagnoses.
RESULTS: Our final models outperform the baseline model (binary encoding) with equal dimensions. They are more robust to missing data and show large performance gains, particularly in lower dimensions, demonstrating the embedding model's compression of nonlinear information. In the future, other sources of health care data can be integrated into the current diagnosis-based framework. Other researchers can apply our publicly shared embedding model to their own diagnosis data.
CONCLUSIONS: We envision a wide range of applications for Pat2Vec that will improve health care quality, including personalized prevention and signal detection in patient surveillance as well as health care resource planning based on subcohorts identified by our data-driven machine learning framework.
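
The baseline that the abstract compares against, binary encoding of the most common diagnoses, might look roughly like the sketch below. This is an assumption about the general technique (a multi-hot vector over the `top_k` most frequent codes), not the authors' implementation; `fit_vocabulary`, `binary_encode`, and the three toy profiles are hypothetical.

```python
from collections import Counter


def fit_vocabulary(profiles, top_k):
    """Return the top_k most common diagnosis codes, counting each code once per patient."""
    counts = Counter(code for profile in profiles for code in set(profile))
    # Sort by descending frequency, then alphabetically so ties are deterministic
    return sorted(counts, key=lambda code: (-counts[code], code))[:top_k]


def binary_encode(profile, vocabulary):
    """Multi-hot vector: 1 if the patient has the code, else 0, in vocabulary order."""
    present = set(profile)
    return [1 if code in present else 0 for code in vocabulary]


# Hypothetical toy diagnosis profiles using ICD-10 three-character codes
profiles = [
    ["E11", "I10"],
    ["I10", "J45"],
    ["I10", "E11", "M54"],
]
vocabulary = fit_vocabulary(profiles, top_k=3)
vectors = [binary_encode(profile, vocabulary) for profile in profiles]
print(vocabulary)
print(vectors)
```

Codes outside the chosen vocabulary (here `M54`) are simply dropped, which illustrates the information loss that the abstract says a learned embedding of the complete profile is meant to avoid.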

5 The open ontology and information society.

Impact index: 4.772
Published: 2024
Journal: Front Genet; PMID: 38873107

DOI: 10.3389/fgene.2024.1290658
Article type: Journal Article

Information, as the most elusive subject, is central to all forms of thought, governance, economic structure, science, and society. Regulation of information, especially within the healthcare field, is proving to be a difficult task globally, given the lack of a qualitative framework and understanding of the concept and properties of information (or data) itself. The presentation of the overall qualitative framework, comprising a qualitative analysis of information, data, and knowledge, will be valuable and of great assistance in delineating regulatory, ethical, and strategic trajectories. In addition, this framework provides insights (and answers) regarding (1) data privacy and protection; (2) delineations between information, data, and knowledge based on the important notion of trust; (3) a structured approach to establishing the necessary conditions for an open society and system, and the maintenance of said openness, based on the work of Karl Popper and Georg Wilhelm Friedrich Hegel; (4) an active agent approach that promotes autonomy and freedom and protects the open society; and (5) a data governance mechanism based on the work of Friedrich Hayek, which structures the current legal-ethical-financial and social society. This is insightful for questions relating to the extent of rights and duties, the extent of biological bodies and freedom, and the structure of relations in distributed networked systems. There is great value offered in this framework; furthermore, it provides critical insights and thoughts about (and uncovers the interplay between) academic culture, politics, science, society, and societal decay. Note that, in line with the ideas expressed in this manuscript, such as incorporation of personal experience (thereby mending the Kantian and Cartesian gap), a first-person perspective will be used, where relevant.

6 In silico study to explore the mechanism of Toxoplasma-induced inflammation and target therapy based on sero and salivary Toxoplasma.

Impact index: 4.996
Published: 13 Jun 2024
Journal: Sci Rep; PMID: 38866852

DOI: 10.1038/s41598-024-63735-z
Article type: Journal Article

We aimed to assess salivary and seroprevalence of Toxoplasma immunoglobulins in risky populations and evaluate drug docking targeting TgERP. A cross-sectional study was conducted in Alexandria University hospitals' outpatient clinics. 192 participants were enrolled from September 2022 to November 2023. Anti-Toxoplasma IgG and IgM were determined in serum and saliva by ELISA. An in silico study examined TgERP's protein-protein interactions (PPIs) with pro-inflammatory cytokine receptors, an anti-inflammatory cytokine, cell cycle progression regulatory proteins, a proliferation marker, and the nuclear envelope integrity-related protein Lamin B1. Our findings revealed that anti-T. gondii IgG was detected in serum (66.1%) and saliva (54.7%), and 2.1% of both sample types were positive for IgM. Salivary IgG had 75.59% sensitivity, 86.15% specificity, 91.40% PPV, 64.40% NPV, and 79.17% accuracy, with fair agreement with serum IgG. On the other hand, the sensitivity, specificity, PPV, NPV, and accuracy in detecting salivary IgM were 75.0%, 99.47%, 75.0%, 99.47%, and 98.96%. An AUC of 0.859 indicates good discriminatory power. The examined synthetic drugs and natural products can target specific amino acid residues of TgERP that lie at the same binding interface as LB1 and Ki67, thereby hindering their interaction. Hence, salivary samples can be a promising diagnostic approach. The studied drugs can counteract the pro-inflammatory action of TgERP.
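
The salivary IgG accuracy figures in this abstract are consistent with a single 2x2 table against serum IgG as the reference test. The sketch below is a worked check under a stated assumption: the four cell counts are not given in the abstract and were back-calculated here from the reported percentages and the 192 participants, so they should be read as a plausible reconstruction only. `diagnostic_metrics` is an illustrative helper.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Standard accuracy measures of an index test against a reference test."""
    total = tp + fn + fp + tn
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly excluded
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / total,
    }


# Assumed counts (reconstructed from the percentages, not stated in the source):
# 127 serum-positive and 65 serum-negative participants, 192 in total.
igg = diagnostic_metrics(tp=96, fn=31, fp=9, tn=56)
for name, value in igg.items():
    print(f"{name}: {value:.2%}")
```

With these counts, sensitivity, specificity, and accuracy match the reported 75.59%, 86.15%, and 79.17%. The computed predictive values (91.43% and 64.37%) differ from the reported 91.40% and 64.40% only in the second decimal, which may reflect rounding in the source.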

7 Harmonizing data on correlates of sleep in children within and across neurodevelopmental disorders: lessons learned from an Ontario Brain Institute cross-program collaboration.

Impact index: 3.739
Published: 2024
Journal: Front Neuroinform; PMID: 38828185

DOI: 10.3389/fninf.2024.1385526
Article type: Journal Article

There is an increasing desire to study neurodevelopmental disorders (NDDs) together to understand commonalities to develop generic health promotion strategies and improve clinical treatment. Common data elements (CDEs) collected across studies involving children with NDDs afford an opportunity to answer clinically meaningful questions. We undertook a retrospective, secondary analysis of data pertaining to sleep in children with different NDDs collected through various research studies. The objective of this paper is to share lessons learned for data management, collation, and harmonization from a sleep study in children within and across NDDs from large, collaborative research networks in the Ontario Brain Institute (OBI). Three collaborative research networks contributed demographic data and data pertaining to sleep, internalizing symptoms, health-related quality of life, and severity of disorder for children with six different NDDs: autism spectrum disorder; attention deficit/hyperactivity disorder; obsessive compulsive disorder; intellectual disability; cerebral palsy; and epilepsy. Procedures for data harmonization, derivations, and merging were shared and examples pertaining to severity of disorder and sleep disturbances were described in detail. Important lessons emerged from data harmonizing procedures: prioritizing the collection of CDEs to ensure data completeness; ensuring unprocessed data are uploaded for harmonization in order to facilitate timely analytic procedures; the value of maintaining variable naming that is consistent with data dictionaries at time of project validation; and the value of regular meetings with the research networks to discuss and overcome challenges with data harmonization. Buy-in from all research networks involved at study inception and oversight from a centralized infrastructure (OBI) identified the importance of collaboration to collect CDEs and facilitate data harmonization to improve outcomes for children with NDDs.

8 Digital humanities in the era of digital reproducibility: towards a fairest and post-computational framework.

Impact index: not available
Published: 2024
Journal: Int J Digit Humanit; PMID: 38799025

DOI: 10.1007/s42803-023-00079-6
Article type: Journal Article

Reproducibility has become a requirement in the hard sciences, and its adoption is gradually extending to the digital humanities. The FAIR criteria and the publication of data papers are both indicative of this trend. However, the question that arises is whether the strict prerequisites of digital reproducibility serve only to exclude digital humanities from broader humanities scholarship. Instead of adopting a binary approach, an alternative method acknowledges the unique features of the objects, inquiries, and techniques of the humanities, including digital humanities, as well as the social and historical contexts in which the concept of reproducibility has developed in the human sciences. In the first part of this paper, I propose to examine the historical and disciplinary context in which the concept of reproducibility has developed within the human sciences, and the disciplinary struggles involved in this process, especially for art history and literature studies. In the second part, I will explore the question of reproducibility through two art history research projects that utilize various computational methods. I argue that issues of corpus, method, and interpretation cannot be separated, rendering a procedural definition of reproducibility impractical. Consequently, I propose the adoption of 'post-computational reproducibility', which is based on FAIREST criteria as far as digital corpora are concerned (FAIR + Ethics and Expertise, Source mention + Time-Stamp), but extended to include further sources that confirm computational results with other non-computational methodologies.

9 On the Accurate Estimation of Information-Theoretic Quantities from Multi-Dimensional Sample Data.

Impact index: 2.738
Published: 30 Apr 2024
Journal: Entropy (Basel); PMID: 38785636

DOI: 10.3390/e26050387
Article type: Journal Article

Using information-theoretic quantities in practical applications with continuous data is often hindered by the fact that probability density functions need to be estimated in higher dimensions, which can become unreliable or even computationally unfeasible. To make these useful quantities more accessible, alternative approaches such as binned frequencies using histograms and k-nearest neighbors (k-NN) have been proposed. However, a systematic comparison of the applicability of these methods has been lacking. We wish to fill this gap by comparing kernel-density-based estimation (KDE) with these two alternatives in carefully designed synthetic test cases. Specifically, we wish to estimate the information-theoretic quantities: entropy, Kullback-Leibler divergence, and mutual information, from sample data. As a reference, the results are compared to closed-form solutions or numerical integrals. We generate samples from distributions of various shapes in dimensions ranging from one to ten. We evaluate the estimators' performance as a function of sample size, distribution characteristics, and chosen hyperparameters. We further compare the required computation time and specific implementation challenges. Notably, k-NN estimation tends to outperform other methods, considering algorithmic implementation, computational efficiency, and estimation accuracy, especially with sufficient data. This study provides valuable insights into the strengths and limitations of the different estimation methods for information-theoretic quantities. It also highlights the significance of considering the characteristics of the data, as well as the targeted information-theoretic quantity when selecting an appropriate estimation technique. These findings will assist scientists and practitioners in choosing the most suitable method, considering their specific application and available data. We have collected the compared estimation methods in a ready-to-use open-source Python 3 toolbox and, thereby, hope to promote the use of information-theoretic quantities by researchers and practitioners to evaluate the information in data and models in various disciplines.
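
To make the k-NN approach concrete, here is a minimal sketch of the classical Kozachenko-Leonenko entropy estimator, compared against the closed-form entropy of a standard normal. It is an assumption that this is representative of the k-NN estimators the paper evaluates; it is not taken from the authors' toolbox, and `knn_entropy` and `digamma_int` are hypothetical helper names. The brute-force neighbor search is only suitable for small samples, and duplicate points (zero distances) are not handled.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def knn_entropy(samples, k=3):
    """Kozachenko-Leonenko differential entropy estimate in nats.

    samples: list of equal-length tuples (points in d dimensions).
    """
    n = len(samples)
    d = len(samples[0])
    # Log-volume of the d-dimensional unit ball under the Euclidean norm
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)
    log_dist_sum = 0.0
    for i, x in enumerate(samples):
        # Distance from x to its k-th nearest neighbor (brute force)
        dists = sorted(math.dist(x, y) for j, y in enumerate(samples) if j != i)
        log_dist_sum += math.log(dists[k - 1])
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_dist_sum / n


# Synthetic one-dimensional test case: standard normal samples
random.seed(0)
data = [(random.gauss(0.0, 1.0),) for _ in range(500)]
estimate = knn_entropy(data, k=3)
exact = 0.5 * math.log(2 * math.pi * math.e)  # closed-form entropy of N(0, 1)
print(f"k-NN estimate: {estimate:.3f} nats, closed form: {exact:.3f} nats")
```

The number of neighbors `k` is the hyperparameter of this estimator: small values reduce bias but increase variance, which is one of the trade-offs a comparison across sample sizes and dimensions would expose.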

10 The Cooperation Between Nurses and a New Digital Colleague "AI-Driven Lifestyle Monitoring" in Long-Term Care for Older Adults: Viewpoint.

Impact index: not available
Published: 23 May 2024
Journal: JMIR Nurs; PMID: 38781012

DOI: 10.2196/56474
Article type: Journal Article

Technology has a major impact on the way nurses work. Data-driven technologies, such as artificial intelligence (AI), have particularly strong potential to support nurses in their work. However, their use also introduces ambiguities. An example of such a technology is AI-driven lifestyle monitoring in long-term care for older adults, based on data collected from ambient sensors in an older adult's home. Designing and implementing this technology in such an intimate setting requires collaboration with nurses experienced in long-term and older adult care. This viewpoint paper emphasizes the need to incorporate nurses and the nursing perspective into every stage of designing, using, and implementing AI-driven lifestyle monitoring in long-term care settings. It is argued that the technology will not replace nurses, but rather act as a new digital colleague, complementing the humane qualities of nurses and seamlessly integrating into nursing workflows. Several advantages of such a collaboration between nurses and technology are highlighted, as are potential risks such as decreased patient empowerment, depersonalization, lack of transparency, and loss of human contact. Finally, practical suggestions are offered to move forward with integrating the digital colleague.
