data processing

  • Article type: Journal Article
    The attributes of diversity and concealment pose formidable challenges in the accurate detection and efficacious management of distresses within subgrade structures. The onset of subgrade distresses may precipitate structural degradation, thereby amplifying the frequency of traffic incidents and instigating economic ramifications. Accurate and timely detection of subgrade distresses is essential for maintaining and repairing road sections with existing distresses. This helps to prolong the service life of road infrastructure and reduce financial burden. In recent years, the advent of numerous novel technologies and methodologies has propelled significant advancements in subgrade distress detection. Therefore, this review delineates a concentrated examination of subgrade distress detection, methodically consolidating and presenting various techniques while dissecting their respective merits and constraints. By furnishing comprehensive guidance on subgrade distress detection, this review facilitates the expedient identification and targeted treatment of subgrade distresses, thereby fortifying safety and enhancing durability. The pivotal role of this review in bolstering the construction and operational facets of transportation infrastructure is underscored.

  • Article type: Journal Article
    BACKGROUND: Untargeted direct mass spectrometric analysis of volatile organic compounds has many potential applications across fields such as healthcare and food safety. However, robust data processing protocols must be employed to ensure that research is replicable and practical applications can be realised. User-friendly data processing and statistical tools are becoming increasingly available; however, the use of these tools has not been analysed, nor are they necessarily suited to every data type.
    OBJECTIVE: This review aims to analyse data processing and analytic workflows currently in use and examine whether methodological reporting is sufficient to enable replication.
    METHODS: Studies identified from Web of Science and Scopus databases were systematically examined against the inclusion criteria. The experimental, data processing, and data analysis workflows were reviewed for the relevant studies.
    RESULTS: From 459 studies identified from the databases, a total of 110 met the inclusion criteria. Very few papers provided enough detail to allow all aspects of the methodology to be replicated accurately, with only three meeting previous guidelines for reporting experimental methods. A wide range of data processing methods were used, with only eight papers (7.3%) employing a largely similar workflow where direct comparability was achievable.
    CONCLUSIONS: Standardised workflows and reporting systems need to be developed to ensure research in this area is replicable, comparable, and held to a high standard, thereby allowing the wide-ranging potential applications to be realised.

  • Article type: Review
    In recent years, comprehensive two-dimensional gas chromatography (GC × GC) has been gradually gaining prominence as a preferred method for the analysis of complex samples due to its higher peak capacity and resolving power compared to conventional gas chromatography (GC). Nonetheless, to fully benefit from the capabilities of GC × GC, a holistic approach to method development and data processing is essential for a successful and informative analysis. Method development enables the fine-tuning of the chromatographic separation, resulting in high-quality data. While generating such data is pivotal, it does not necessarily guarantee that meaningful information will be extracted from it. To this end, the first part of this manuscript reviews the importance of theoretical modeling in achieving good optimization of the separation conditions, ultimately improving the quality of the chromatographic separation. Multiple theoretical modeling approaches are discussed, with a special focus on thermodynamic-based modeling. The second part of this review highlights the importance of establishing robust data processing workflows, with a special emphasis on the use of advanced data processing tools such as machine learning (ML) algorithms. Three widely used ML algorithms are discussed: Random Forest (RF), Support Vector Machine (SVM), and Partial Least Squares-Discriminant Analysis (PLS-DA), highlighting their role in discovery-based analysis.

  • Article type: Journal Article
    Regular water quality monitoring is becoming desirable due to the increase in water pollution caused by both climate change and the generation of industrial chemicals. Unmanned vehicles have emerged as key technologies for remote data acquisition, providing fast and accurate methods for water quality monitoring. However, current research on unmanned vehicles has not systematically examined their features and limitations, which are crucial for identifying future research directions and applications of unmanned vehicle technologies. Therefore, this study extensively reviews the advancements in remote data acquisition and processing using unmanned vehicle technologies for water quality monitoring to provide valuable insights for future research. First, the types of unmanned vehicles and their application ranges for water quality monitoring are summarized. Among the unmanned vehicle technologies, unmanned aerial vehicles are considered primary platforms for water quality monitoring due to their wide data acquisition range and their ability to accommodate diverse sensors and samplers. Also, the types of samplers and sensors mounted on the unmanned vehicles are analyzed based on their characteristics. It is concluded that spectral sensors offer the most cost-effective approach for acquiring real-time water quality data. Furthermore, algorithms that convert image data into water quality data are examined, focusing on data preprocessing, analysis, and validation. The findings reveal a close relationship between the analysis of spectral characteristics of each water quality parameter and the wavelength ranges of red and red-edge. Lastly, future research directions for unmanned vehicle technologies are further suggested based on the summarized technological limitations.

  • Article type: Systematic Review
    Background: The GDPR was implemented to build an overarching framework for personal data protection across the EU/EEA. Linkage of data directly collected from cohort participants, potentially serving as a prominent tool for health research, must respect data protection rules and privacy rights. Our objective was to investigate the legal possibilities of linking cohort data of minors with routinely collected education and health data, comparing EU/EEA member states. Methods: A legal comparative analysis and scoping review was conducted of openly accessible published laws and regulations in EUR-Lex and national law databases on the GDPR's implementation in Portugal, Finland, Norway, and the Netherlands, and of the connected national regulations aimed at record linkage for health research, as implemented up to April 30, 2021. Results: The GDPR does not ensure total uniformity in data protection legislation across member states, offering flexibility for national legislation. Exceptions for processing personal data, e.g., public interest and scientific research, must be laid down in EU/EEA or national law. Differences in national interpretation caused obstacles in cross-national research and record linkage: Portugal requires written consent and ethical approval; Finland allows linkage mostly without consent through the national Social and Health Data Permit Authority; Norway permits linkage when based on a regional ethics committee's approval and adequate information technology safeguarding confidentiality; the Netherlands mainly bases linkage on the opt-out system and a Data Protection Impact Assessment. Conclusions: Though the GDPR is the most important legal framework, the execution of national legislation matters most when linking cohort data with routinely collected health and education data. As national interpretation varies, legal intervention balancing the individual right to informational self-determination and the public good is urgently needed for health research. More harmonization across the EU/EEA could be helpful but should not be detrimental to those member states that have already opened leeway for registries and research for the public good without explicit consent.

  • Article type: Journal Article
    Non-target screening (NTS) is a powerful environmental and analytical chemistry approach for detecting and identifying unknown compounds in complex samples. High-resolution mass spectrometry has enhanced NTS capabilities but created challenges in data analysis, including data preprocessing, peak detection, and feature extraction. This review provides an in-depth understanding of NTS data processing methods, focusing on centroiding, extracted ion chromatogram (XIC) building, chromatographic peak characterization, alignment, componentization, and prioritization of features. We discuss the strengths and weaknesses of various algorithms, the influence of user input parameters on the results, and the need for automated parameter optimization. We address uncertainty and data quality issues, emphasizing the importance of incorporating confidence intervals and raw data quality assessment in data processing workflows. Furthermore, we highlight the need for cross-study comparability and propose potential solutions, such as utilizing standardized statistics and open-access data exchange platforms. In conclusion, we offer future perspectives and recommendations for developers and users of NTS data processing algorithms and workflows. By addressing these challenges and capitalizing on the opportunities presented, the NTS community can advance the field, improve the reliability of results, and enhance data comparability across different studies.
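Two of the processing steps named above, centroiding and XIC building, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any algorithm from the review: the function names `centroid` and `build_xic`, the intensity-weighted mean used for centroiding, the ppm tolerance window, and the toy scan data are all hypothetical choices made for this example.

```python
def centroid(profile):
    """Collapse one profile-mode peak, given as (mz, intensity) points,
    into a single (mz, intensity) centroid using the intensity-weighted mean m/z.
    Assumption: the points passed in belong to exactly one peak."""
    total = sum(intensity for _, intensity in profile)
    if total <= 0:
        raise ValueError("profile has no signal")
    weighted_mz = sum(mz * intensity for mz, intensity in profile) / total
    return weighted_mz, total


def build_xic(scans, target_mz, tol_ppm):
    """Build an extracted ion chromatogram.
    scans: list of (retention_time, [(mz, intensity), ...]) with centroided peaks.
    Returns [(retention_time, summed_intensity)] for peaks within tol_ppm of target_mz."""
    half_width = target_mz * tol_ppm / 1e6  # ppm tolerance converted to m/z units
    xic = []
    for rt, peaks in scans:
        summed = sum(
            (intensity for mz, intensity in peaks if abs(mz - target_mz) <= half_width),
            0.0,
        )
        xic.append((rt, summed))
    return xic


# Toy data (hypothetical): three scans, target ion at m/z 100.0, 10 ppm window.
scans = [
    (0.0, [(100.0005, 10.0), (150.0, 5.0)]),
    (1.0, [(100.0, 20.0), (100.02, 7.0)]),
    (2.0, [(150.0, 3.0)]),
]
xic = build_xic(scans, target_mz=100.0, tol_ppm=10.0)
```

The tolerance `tol_ppm` is exactly the kind of user input parameter whose influence on results the abstract flags: widening it to 250 ppm here would pull the m/z 100.02 peak into the trace and change the chromatogram.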

  • Article type: Journal Article
    The authenticity and quality of traditional Chinese medicine (TCM) directly impact clinical efficacy and safety. Quality assessment of traditional Chinese medicine (QATCM) is a global concern due to increased demand and shortage of resources. Recently, modern analytical technologies have been extensively investigated and utilized to analyze the chemical composition of TCM. However, a single analytical technique has some limitations, and judging the quality of TCM only from the characteristics of its components is not enough to reflect the holistic view of TCM. Thus, the development of multi-source information fusion technology and machine learning (ML) has further improved QATCM. Data from different analytical instruments can provide a better understanding of the connections between herbal samples from multiple aspects. This review focuses on the use of data fusion (DF) and ML in QATCM, including chromatography, spectroscopy, and other electronic sensors. The common data structures and DF strategies are introduced, followed by ML methods, including fast-growing deep learning. Finally, DF strategies combined with ML methods are discussed and illustrated for research on applications such as source identification, species identification, and content prediction in TCM. This review demonstrates the validity and accuracy of DF- and ML-based QATCM strategies and provides a reference for developing and applying QATCM methods.
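As a rough illustration of what a data fusion strategy can look like, the sketch below shows one common chemometric option, often called low-level fusion: each instrument's data block is autoscaled column by column and the blocks are then concatenated sample by sample. The abstract does not say which DF strategies the review covers, so this choice, the function names `autoscale` and `low_level_fusion`, and the toy chromatographic and spectroscopic values are all assumptions made for this example.

```python
import statistics


def autoscale(block):
    """Mean-centre each column and scale it to unit sample standard deviation.
    block: list of rows (one row per sample), all rows the same length."""
    columns = list(zip(*block))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return [
        [(value - m) / s if s > 0 else 0.0 for value, m, s in zip(row, means, stds)]
        for row in block
    ]


def low_level_fusion(*blocks):
    """Concatenate autoscaled blocks row-wise so each sample gets one fused feature vector.
    Assumption: row i of every block describes the same sample."""
    n_samples = len(blocks[0])
    if any(len(block) != n_samples for block in blocks):
        raise ValueError("all blocks must contain the same number of samples")
    scaled = [autoscale(block) for block in blocks]
    fused = []
    for i in range(n_samples):
        row = []
        for block in scaled:
            row.extend(block[i])
        fused.append(row)
    return fused


# Toy data (hypothetical): 3 herbal samples measured on two instruments.
chromatography = [[1.0, 200.0], [2.0, 220.0], [3.0, 240.0]]  # e.g. two peak areas
spectroscopy = [[0.1], [0.3], [0.5]]                         # e.g. one absorbance
fused = low_level_fusion(chromatography, spectroscopy)
```

Autoscaling before concatenation keeps a block with large raw values (here the second peak area) from dominating the fused matrix, which could then be passed to any ML classifier or regressor.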

  • Article type: Journal Article
    Electroencephalography (EEG) is a diagnostic test that records and measures the electrical activity of the human brain. Research investigating human behaviors and conditions using EEG has increased from year to year. Therefore, an efficient approach to processing EEG datasets is vital to improve the output signal quality. The wavelet transform is one of the well-known approaches for processing the EEG signal in time-frequency domain analysis. It is better suited than the traditional Fourier transform because it offers good time-frequency localization and multi-resolution analysis, so the transient information of an EEG signal can be extracted efficiently. Thus, this review article aims to comprehensively describe the application of the wavelet method in denoising the EEG signal based on recent research. This review begins with a brief overview of the basic theory and characteristics of EEG and the wavelet transform method. Then, several wavelet-based methods commonly applied in EEG dataset denoising are described and a considerable number of the latest published EEG research works with wavelet applications are reviewed. In addition, the challenges that exist in current EEG-based wavelet method research are discussed. Finally, alternative solutions to mitigate the issues are recommended.
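The general idea of wavelet denoising (decompose, shrink the detail coefficients, reconstruct) can be sketched with the simplest wavelet, the Haar wavelet. This is a minimal sketch under stated assumptions, not a method taken from the review: the Haar basis, soft thresholding with a fixed hand-picked threshold, the function names, and the synthetic signal are all illustrative choices, and real EEG pipelines typically use other wavelets and data-driven thresholds.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # normalisation factor of the orthonormal Haar transform


def haar_forward(signal):
    """One decomposition level: pairwise scaled sums (approximation) and differences (detail)."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Exact inverse of haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def denoise(signal, levels, threshold):
    """Multi-level Haar decomposition, soft-threshold every detail band, then reconstruct."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append([soft_threshold(d, threshold) for d in detail])
    for detail in reversed(details):  # rebuild from the coarsest level upward
        approx = haar_inverse(approx, detail)
    return approx


# Synthetic example (not EEG data): a flat signal at 1.0 with alternating +/-0.1 noise.
noisy = [1.1, 0.9, 1.1, 0.9, 1.1, 0.9, 1.1, 0.9]
clean = denoise(noisy, levels=2, threshold=0.2)
```

In this toy case the noise lives entirely in the finest detail band, so thresholding removes it and `clean` comes back as a flat signal at approximately 1.0. With `threshold=0` the function reconstructs its input unchanged, which is a quick sanity check on the transform pair.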

  • Article type: Journal Article
    Significance: Optical neuroimaging has become a well-established clinical and research tool to monitor cortical activations in the human brain. It is notable that outcomes of functional near-infrared spectroscopy (fNIRS) studies depend heavily on the data processing pipeline and classification model employed. Recently, deep learning (DL) methodologies have demonstrated fast and accurate performance in data processing and classification tasks across many biomedical fields. Aim: We aim to review the emerging DL applications in fNIRS studies. Approach: We first introduce some of the commonly used DL techniques. Then, the review summarizes current DL work in some of the most active areas of this field, including brain-computer interface, neuro-impairment diagnosis, and neuroscience discovery. Results: Of the 63 papers considered in this review, 32 report a comparative study of DL techniques against traditional machine learning techniques, of which 26 show DL outperforming the latter in terms of classification accuracy. In addition, eight studies also utilize DL to reduce the amount of preprocessing typically done with fNIRS data or to increase the amount of data via data augmentation. Conclusions: The application of DL techniques to fNIRS studies has been shown to mitigate many of the hurdles present in fNIRS studies, such as lengthy data preprocessing or small sample sizes, while achieving comparable or improved classification accuracy.

  • Article type: Journal Article
    Long-term continuous monitoring (LTCM) of water quality can have a far-reaching influence on water ecosystems by providing spatiotemporal data sets of diverse parameters and enabling operation of water and wastewater treatment processes in an energy-saving and cost-effective manner. However, current water monitoring technologies are deficient in long-term accuracy of data collection and in processing capability. Inadequate LTCM data impede water quality assessment and hinder stakeholders and decision makers from foreseeing emerging problems and executing efficient control methodologies. To tackle this challenge, this review provides a forward-looking roadmap highlighting vital innovations toward LTCM, and elaborates on the impacts of LTCM through a three-hierarchy perspective: data, parameters, and systems. First, we demonstrate the critical needs and challenges of LTCM in natural resource water, drinking water, and wastewater systems, and differentiate LTCM from existing short-term and discrete monitoring techniques. We then elucidate three steps to achieve LTCM in water systems, consisting of data acquisition (water sensors), data processing (machine learning algorithms), and data application (with modeling and process control as two examples). Finally, we explore future opportunities of LTCM in four key domains (water, energy, sensing, and data) and underscore strategies to transfer scientific discoveries to general end-users.