data processing

  • Article type: Journal Article
    To date, poly- and perfluoroalkyl substances (PFAS) pose a real threat owing to their environmental persistence, wide physicochemical variability, and potential toxicity. A large portion of these chemicals remains structurally unknown. Their comprehensive detection and monitoring therefore require complex non-targeted analysis workflows based on liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS). This approach, although comprehensive, does not always provide the analytical resolution needed for complex PFAS mixtures such as fire-fighting aqueous film-forming foams (AFFFs). This study consolidates the advantages of the LC×LC technique hyphenated with high-resolution tandem mass spectrometry (HRMS/MS) for the identification of PFAS in AFFF mixtures. A total of 57 PFAS homolog series (HS) were identified in 3M and Orchidee AFFF mixtures thanks to (i) the high chromatographic peak capacity (n'2D,c ~ 300) and (ii) the increased mass domain resolution provided by the "remainder of Kendrick Mass" (RKM) analysis of the HRMS data. We then attempted to annotate the PFAS of each HS by exploiting the available reference standards and the FluoroMatch workflow in combination with the RKM defect calculated for different fluorine repeating units, such as CF2, CF2O, and C2F4O. This approach resulted in 12 identified PFAS HS, including compounds belonging to the HS of perfluoroalkyl carboxylic acids (PFACAs), perfluoroalkyl sulfonic acids (PFASAs), (N-pentafluoro(5)sulfide)-perfluoroalkane sulfonates (SF5-PFASAs), N-sulfopropyldimethylammoniopropyl perfluoroalkane sulfonamides (N-SPAmP-FASA), and N-carboxymethyldimethylammoniopropyl perfluoroalkane sulfonamides (N-CMAmP-FASA). The annotated categories of perfluoroalkyl aldehydes and chlorinated PFASAs represent the first record of these PFAS HS in the investigated AFFF samples.
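    The abstract relies on Kendrick-type mass analysis to group PFAS into homolog series but does not spell out the RKM calculation. As a rough illustration of the underlying idea only, the sketch below computes the classical Kendrick mass defect (KMD) with CF2 as the base unit, which is a simpler relative of RKM and not the study's method. The function names and the two example compositions are hypothetical, and the electron mass is ignored.

```python
# Classical Kendrick mass defect (KMD) with CF2 as the base unit.
# Illustrative only: the study's RKM variant is not reproduced here.

# Monoisotopic atomic masses (u)
MASS = {"C": 12.0, "F": 18.998403163, "O": 15.994914620}

CF2_EXACT = MASS["C"] + 2 * MASS["F"]  # about 49.9968 u
CF2_NOMINAL = 50


def monoisotopic_mass(formula):
    """Sum atomic masses for a composition given as {element: count}."""
    return sum(MASS[el] * n for el, n in formula.items())


def kendrick_mass_defect(mz, exact=CF2_EXACT, nominal=CF2_NOMINAL):
    """Rescale m/z so the base unit weighs exactly `nominal`, then
    return the distance to the nearest integer."""
    km = mz * nominal / exact
    return round(km) - km


# Hypothetical example: two perfluoroalkyl carboxylate compositions
# that differ by two CF2 units (electron mass ignored for simplicity).
c6 = monoisotopic_mass({"C": 6, "F": 11, "O": 2})
c8 = monoisotopic_mass({"C": 8, "F": 15, "O": 2})

kmd_c6 = kendrick_mass_defect(c6)
kmd_c8 = kendrick_mass_defect(c8)

# Members of one CF2 homolog series share the same KMD.
same_series = abs(kmd_c6 - kmd_c8) < 1e-6
```

    Because adding a CF2 unit shifts the rescaled mass by exactly 50, homologs line up at one KMD value. Passing a different base unit (for example CF2O or C2F4O) through the `exact` and `nominal` arguments groups series built on that unit instead.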

  • Article type: Journal Article
    Exposomics aims to measure human exposures throughout the lifespan and the changes they produce in the human body. Exposome-scale studies have significant potential for understanding the interplay between environmental factors and the complex multifactorial diseases that are widespread in our society and whose origins remain unclear. In this framework, the study of the chemical exposome aims to cover all chemical exposures and their effects on human health but, today, this goal still seems unfeasible or at least very challenging, which makes the exposome, for now, only a concept. Furthermore, the study of the chemical exposome faces several methodological challenges, such as moving from specific targeted methodologies towards high-throughput multitargeted and non-targeted approaches, guaranteeing the availability and quality of biological samples to obtain quality analytical data, standardizing the applied analytical methodologies, the statistical assignment of increasingly complex datasets, and the identification of (un)known analytes. This review discusses the various steps involved in applying the exposome concept from an analytical perspective. It provides an overview of the wide variety of existing analytical methods and instruments, highlighting their complementarity for developing combined analytical strategies that advance the characterization of the chemical exposome. In addition, this review focuses on endocrine disrupting chemicals (EDCs) to show how studying even a minor part of the chemical exposome represents a great challenge. Analytical strategies applied in an exposomics context have shown great potential to elucidate the role of EDCs in health outcomes. However, translating innovative methods into etiological research and chemical risk assessment will require a multidisciplinary effort. Unlike other review articles focused on exposomics, this review offers a holistic view from the perspective of analytical chemistry and discusses the entire analytical workflow needed to obtain valuable results.

  • Article type: Journal Article
    To develop a robust algorithm to accurately calculate 'daily complete dose counts' for inhaled medicines, used in percent adherence calculations, from electronically captured nebulizer data within the CFHealthHub Learning Health System.
    A multi-center, cross-sectional study involved participants and clinicians reviewing real-world inhaled medicine usage records and triangulating them with objective nebulizer data to establish a consensus on 'daily complete dose counts.' An algorithm, which used only objective nebulizer data, was then developed using a derivation dataset and evaluated using an internal validation dataset. The agreement and accuracy between the algorithm-derived and consensus-derived 'daily complete dose counts' were examined, with the consensus-derived count as the reference standard.
    Twelve people with CF participated. The algorithm derived a 'daily complete dose count' by screening out 'invalid' doses (those <60 s in duration or run in cleaning mode), combining all doses starting within 120 s of each other, and then screening out all doses with a duration <480 s that were interrupted by power supply failure. The kappa coefficient was 0.85 (0.71-0.91) in the derivation dataset and 0.86 (0.77-0.94) in the validation dataset.
    The algorithm demonstrated strong agreement with the participant-clinician consensus, enhancing confidence in CFHealthHub data. Publishing data processing methods can encourage trust in digital endpoints and serve as an exemplar for other projects.
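    The three screening steps reported in the results can be sketched as follows. This is a minimal illustration under stated assumptions, not the CFHealthHub implementation: the `Dose` record and its field names are hypothetical, "within 120 s of each other" is read as a start-to-start gap between consecutive records, a merged dose sums the durations of its parts, and a merged dose counts as interrupted only if its last part ended in a power failure.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Dose:
    """Hypothetical nebulizer record; field names are assumptions."""
    start: datetime
    duration_s: float
    cleaning_mode: bool = False
    power_failure: bool = False  # run ended because power was lost


def daily_complete_dose_counts(records):
    # Step 1: drop invalid records (< 60 s, or run in cleaning mode).
    valid = sorted(
        (r for r in records if r.duration_s >= 60 and not r.cleaning_mode),
        key=lambda r: r.start,
    )

    # Step 2: merge records that start within 120 s of the previous
    # record's start (assumed reading of "within 120 s of each other").
    merged = []
    last_start = None
    for r in valid:
        if merged and (r.start - last_start).total_seconds() <= 120:
            prev = merged[-1]
            merged[-1] = Dose(prev.start, prev.duration_s + r.duration_s,
                              False, r.power_failure)
        else:
            merged.append(Dose(r.start, r.duration_s, False, r.power_failure))
        last_start = r.start

    # Step 3: drop doses shorter than 480 s that ended in power failure.
    complete = [d for d in merged
                if not (d.duration_s < 480 and d.power_failure)]

    # Count the remaining doses per calendar day.
    return Counter(d.start.date() for d in complete)


# Made-up records for one day.
day = datetime(2024, 1, 1, 8, 0, 0)
records = [
    Dose(day, 30),                                        # too short
    Dose(day.replace(minute=5), 70, power_failure=True),  # first part
    Dose(day.replace(minute=6, second=30), 300),          # 90 s later: merged
    Dose(day.replace(hour=20), 200, power_failure=True),  # interrupted, short
    Dose(day.replace(hour=21), 250, cleaning_mode=True),  # cleaning run
]
counts = daily_complete_dose_counts(records)
```

    In this made-up day only the merged morning dose survives all three screens, so the count for the day is 1. How a merged dose inherits its duration and interruption status is a design choice that the abstract does not specify.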

  • Article type: Journal Article
    With the acceleration of the mining process, the goaf has become one of the main sources of danger in underground mines, seriously threatening safe mine production. To predict the risk level of the goaf quickly and accurately, this paper optimizes the goaf features through correlation analysis and feature importance and constructs a combination of feature parameters for goaf risk level prediction, which solves the problem of redundant evaluation indexes. Multiple machine learning algorithms are each applied to 121 sets of goaf data, and the optimal algorithm and the best combination of feature parameters are obtained by evaluating the results with multiple indicators such as accuracy and the kappa coefficient. The best combination of feature parameters is groundwater, goaf layout, volume of goaf, goaf volume, span-height ratio, and mining disturbance, and the optimal algorithm is Extra Tree (ET), which handles the goaf risk level prediction problem with an accuracy of 94%. This model can be used to predict the risk level of the goaf quickly and accurately.
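    The abstract ranks the candidate classifiers by accuracy and the kappa coefficient. A minimal sketch of how both indicators can be computed from true and predicted risk levels is given below. It uses Cohen's kappa, which is assumed here to be the kappa the authors mean, and the labels are hypothetical.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true one."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e).
    Undefined (division by zero) when chance agreement p_e equals 1."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical risk levels for six goafs.
y_true = ["low", "low", "high", "high", "mid", "mid"]
y_pred = ["low", "low", "high", "mid", "mid", "mid"]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
```

    Kappa discounts the agreement expected by chance, so it is a stricter indicator than accuracy when the risk classes are imbalanced.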

  • Article type: Journal Article
    Breast cancer diagnosis is crucial for timely treatment and improved outcomes. This paper proposes a novel approach for rapid breast cancer diagnosis using optical fiber probe-based attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopy from 750 to 4000 cm⁻¹. The technique enables direct analysis of tissue samples, eliminating the need for microtome sectioning and staining and thus saving time and resources. After capturing molecular fingerprint information, various machine-learning models were used to analyze the spectroscopic data and classify cancerous and non-cancerous tissues. Comparing deparaffinized and paraffinized samples reveals the impact of sample preparation and experimental methods. The study demonstrates a strong correlation between the cancerous nature of a sample and its ATR-FTIR spectrum, suggesting its potential for breast cancer diagnosis (sensitivity of 74.2% and specificity of 78.3%). The proposed approach holds promise for integration into clinical operations, providing a rapid method for preliminary breast cancer diagnosis.

  • Article type: Journal Article
    OBJECTIVE: Optimizing metabolomics data processing parameters is a challenging and fundamental task for obtaining reliable results. Automated tools have been developed to assist this optimization for LC-MS data. GC-MS data require substantial modifications in processing parameters, as the chromatographic profiles are more robust, with more symmetrical and Gaussian peaks. This work compared automated XCMS parameter optimization using the Isotopologue Parameter Optimization (IPO) software with manual optimization for GC-MS metabolomics data. Additionally, the results were compared with those from the online XCMS platform.
    METHODS: GC-MS data from control and test groups of intracellular metabolites from Trypanosoma cruzi trypomastigotes were used. Optimizations were performed on the quality control (QC) samples.
    RESULTS: The results in terms of the number of molecular features extracted, repeatability, missing values, and the search for significant metabolites showed the importance of optimizing the parameters for peak detection, alignment, and grouping, especially those related to peak width (fwhm, bw) and noise ratio (snthresh).
    CONCLUSIONS: This is the first time that a systematic optimization using IPO has been performed on GC-MS data. The results demonstrate that there is no universal approach to optimization, but automated tools are valuable at this stage of the metabolomics workflow. The online XCMS platform proves to be an interesting processing tool, helping above all in the choice of parameters as a starting point for adjustments and optimizations. Although the tools are easy to use, technical knowledge of the analytical methods and instruments used is still needed.

  • Article type: Journal Article
    BACKGROUND: Studies conducted in the United States such as the National Survey of Family Growth (NSFG) and the Pregnancy Risk Assessment Monitoring System (PRAMS) collect data on pregnancy intentions to aid in improving health education, services, and programs. PRAMS collects data from specific sites, and NSFG is a national household-based survey. Like NSFG, the Surveys of Women was designed to survey participants residing in households using an address-based sample and a multimode data collection approach. The Surveys of Women collects data from eligible participants in 9 states within the United States on contraception use, reproductive health, and pregnancy intentions. In this paper, we focus on the baseline data collection protocol, including sample design, data collection procedures, and data processing. We also include a brief discussion of the follow-up and endline survey methodologies. Our goal is to inform other researchers about methods to consider when fielding a household-level reproductive health survey.
    OBJECTIVE: The Surveys of Women was developed to support state-specific research and evaluation projects, with an overall goal of understanding contraceptive health practices among women aged 18-44 years. The project collects data from respondents in 9 different states (Arizona, Alabama, Delaware, Iowa, Maryland, New Jersey, Ohio, South Carolina, and Wisconsin) over multiple rounds.
    METHODS: Households were selected at random using address-based sampling methods. This project includes a cross-sectional baseline survey, 2 or 3 follow-up surveys with an opt-in panel of respondents, and a cross-sectional endline survey. Each round of data collection uses a multimode design consisting of a programmed web survey and a formatted hard copy questionnaire. Participants from the randomly selected households either access their personalized survey on the web or mail in a hard copy questionnaire. To maximize responses, these surveys follow a rigorous schedule of various prompts that bolster the survey implementation design, and participants receive a modest monetary incentive.
    RESULTS: This is an ongoing project with results published separately by the evaluation teams involved with data analysis.
    CONCLUSIONS: The methods used in the first baseline survey informed modifications to the methods used in subsequent statewide surveys. Data collected from this project will provide insight into women's reproductive health, contraceptive use, and abortion attitudes in the 9 selected states. The long-term goal of the project is to use a data collection methodology that collects data from a representative sample of participants to assess changes in reproductive health behaviors over time.
    International Registered Report Identifier (IRRID): DERR1-10.2196/40675.

  • Article type: Journal Article
    Nuclear magnetic resonance (NMR) is a powerful tool for formation evaluation in the oil industry, used to determine parameters such as the pore structure, fluid saturation, and permeability of porous materials, which are critical to reservoir engineering. The inversion of the measured relaxation data is an ill-posed problem and may lead to deviations in the inversion results, which can degrade the accuracy of further data analysis and evaluation. This paper proposes a deep learning method for multi-exponential inversion of NMR relaxation data to improve accuracy. Simulated NMR data are first constructed using a priori knowledge based on the signal parameters and a Gaussian distribution. These data are then used to train a neural network designed to take into account noise characteristics, signal decay characteristics, signal energy variations, and the non-negativity of the T2 spectra. In validation on simulated data, the models that introduce a multi-scale convolutional neural network (CNN) and an attention mechanism outperform other approaches in terms of denoising and T2 inversion. Finally, NMR measurements of rock cores are used to assess the effectiveness of the attention multi-scale convolutional neural network (ATT-CNN) model in practical applications. The results demonstrate that the proposed deep learning method performs better than the regularization method.
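    The abstract does not write out the inversion problem. For orientation, one common textbook form of multi-exponential T2 inversion and of a regularized solution is given below. The symbols and the choice of a Tikhonov penalty are assumptions made for illustration and are not taken from the paper.

```latex
% Forward model: echo amplitudes s(t_i) as a sum of decaying exponentials
% weighted by the T2 spectrum f_j, plus noise \varepsilon_i.
s(t_i) = \sum_{j=1}^{m} f_j \, e^{-t_i / T_{2,j}} + \varepsilon_i,
\qquad i = 1, \dots, n

% One regularized, non-negative least-squares formulation (Tikhonov penalty
% with weight \alpha), where K_{ij} = e^{-t_i / T_{2,j}}.
\hat{f} = \arg\min_{f \ge 0} \; \lVert K f - s \rVert_2^2
          + \alpha \lVert f \rVert_2^2
```

    The kernel K is severely ill-conditioned, which is why small amounts of noise can change the recovered spectrum substantially and why the choice of the weight \alpha matters in the regularized approach.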

  • Article type: Journal Article
    BACKGROUND: Education and training are needed for nursing students using artificial intelligence-based educational programs. However, few studies have assessed the effect of using chatbots in nursing education.
    OBJECTIVE: This study aimed to develop an artificial intelligence chatbot educational program and examine its effect on promoting nursing skills related to electronic fetal monitoring among nursing college students in non-face-to-face classes during the COVID-19 pandemic.
    METHODS: This quasi-experimental study used a nonequivalent control group non-synchronized pretest-posttest design. The participants were 61 junior students from a nursing college located in G province of South Korea. Data were collected between November 3 and 16, 2021, and analyzed using independent t-tests.
    RESULTS: The experimental group (in which the artificial intelligence chatbot program was applied) did not show statistically significant differences in knowledge (t = -0.58, p = .567), clinical reasoning competency (t = 0.75, p = .455), confidence (t = 1.13, p = .264), or feedback satisfaction (t = 1.72, p = .090) compared with the control group; however, its participants' interest in education (t = 2.38, p = .020) and self-directed learning (t = 2.72, p = .006) were significantly higher than those in the control group.
    CONCLUSIONS: The findings of our study highlight the potential of artificial intelligence chatbot programs as an educational assistance tool to promote nursing college students' interest in education and self-directed learning. Moreover, such programs can be effective in enhancing nursing students' skills in non-face-to-face situations caused by the ongoing COVID-19 pandemic.

  • Article type: Journal Article
    In the big data era, patient-based real-time quality control (PBRTQC), an emerging quality control (QC) method, is expanding within the clinical laboratory industry. However, the main issue with current PBRTQC methodology is data stability. Our study aimed to explore a novel protocol for data stability that combines delta data with machine learning (ML) techniques to improve the capacity for QC event detection.
    A data set of 423,290 laboratory results from Beijing Chao-yang Hospital 2019 patient results was used as a training set (n = 380,960, 90%) and an internal validation set (n = 42,330, 10%). A further 22,460 results from Beijing Long-fu Hospital 2019 patient results were used as a test set. Three types of data, (1) single-type data processed by truncation limits, (2) delta-type data processed by truncation limits, and (3) delta-type data processed by the Isolation Forest (IF) algorithm, were evaluated by accuracy, sensitivity, NPed, etc., and compared with previously published statistical methods.
    The optimal model was based on the Random Forest (RF) algorithm and used delta-type data processed by the IF algorithm. The model achieved better accuracy (0.99), sensitivity (0.99), specificity (0.99), and AUC (0.99) on the independent test set, surpassing the critical bias of PBRTQC by over 50%. For LYMPH#, HGB, and PLT, the cumulative MNPed of MLQC was reduced by 95.43%, 97.39%, and 97.97%, respectively, compared with the best of the PBRTQC.
    The final results indicate that integrating an innovative ML algorithm with the overall data processing protocol improves the detection of QC events.
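    The abstract does not define its data types precisely. The sketch below assumes that "delta-type data" means the difference between a patient's current result and the same patient's previous result, and that "truncation limits" discard values outside a fixed interval. The function names, the example values, and the limits are hypothetical, and the Isolation Forest and Random Forest stages are not shown.

```python
def truncate(values, lower, upper):
    """Keep only values inside the truncation limits (inclusive)."""
    return [v for v in values if lower <= v <= upper]


def delta_values(results):
    """results: iterable of (patient_id, value) in chronological order.
    Returns current-minus-previous differences for each patient's
    consecutive results; a patient's first result yields no delta."""
    last_seen = {}
    deltas = []
    for patient_id, value in results:
        if patient_id in last_seen:
            deltas.append(value - last_seen[patient_id])
        last_seen[patient_id] = value
    return deltas


# Made-up hemoglobin-like results for two patients.
results = [("A", 130), ("B", 142), ("A", 128), ("B", 150), ("A", 90)]

deltas = delta_values(results)
stable = truncate(deltas, -20, 20)
```

    Working on within-patient differences removes much of the between-patient variation, which is one plausible reason delta data would give a more stable input for QC event detection than single results.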