Data mining

  • Article type: Journal Article
    Differential diagnosis is a crucial aspect of medical practice, as it guides clinicians to accurate diagnoses and effective treatment plans. Traditional resources, such as medical books and services like UpToDate, are constrained by manual curation and may miss novel or less common findings. This paper introduces and analyzes two novel methods to mine etiologies from scientific literature. The first method employs a traditional Natural Language Processing (NLP) approach based on syntactic patterns. Using a novel application of human-guided pattern bootstrapping, patterns are derived quickly, and symptom etiologies are extracted with significant coverage. The second method utilizes generative models, specifically GPT-4, coupled with a fact verification pipeline, marking a pioneering application of generative techniques in etiology extraction. Analysis of this second method shows that while it is highly precise, it offers lower coverage than the syntactic approach. Importantly, combining both methodologies yields synergistic outcomes, enhancing the depth and reliability of etiology mining.
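
As a rough illustration of pattern-based etiology extraction, the sketch below applies a single hand-written surface pattern to a sentence. It is only an assumption-laden stand-in: the abstract describes bootstrapped syntactic patterns, whereas this uses a plain regular expression, and the pattern, the function name `extract_etiologies`, and the example sentence are all hypothetical.

```python
import re

# One hypothetical surface pattern:
#   "<symptom> is / are / can be / may be caused by <etiology>"
# A real system would use many patterns over syntactic parses, not a regex.
PATTERN = re.compile(
    r"(?P<symptom>[A-Za-z][A-Za-z ]*?) "
    r"(?:is|are|can be|may be) caused by "
    r"(?P<etiology>[A-Za-z][A-Za-z ]*)"
)

def extract_etiologies(sentence):
    """Return (symptom, etiology) pairs matched by the illustrative pattern."""
    return [
        (m.group("symptom").strip().lower(), m.group("etiology").strip().lower())
        for m in PATTERN.finditer(sentence)
    ]

print(extract_etiologies("Chronic cough can be caused by asthma."))
```

A single pattern like this would be precise but cover very few sentences, which is why bootstrapping additional patterns matters for coverage.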

  • Article type: Journal Article
    In the shipbuilding industry, welding automation using welding robots often relies on arc-sensing techniques due to spatial limitations. However, owing to the control of short-circuit transition, the reliability of the feedback current value, the core sensing data, is reduced when the target workpieces have significant curvature or when there are gaps between curved workpieces, leading to seam tracking failure and subsequent damage to the workpieces. To address these problems, this study proposes a new algorithm, MBSC (median-based spatial clustering), based on the DBSCAN (density-based spatial clustering of applications with noise) clustering algorithm. By clustering on the median value of the data in each weaving area and considering the characteristics of the feedback current data, the proposed technique uses detected outliers to enhance seam tracking accuracy and responsiveness in unstructured and challenging welding environments. The effectiveness of the proposed technique was verified through actual welding experiments in a yard environment.
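
The idea of using a per-region median to detect outliers in feedback current can be sketched as follows. This is not the MBSC algorithm itself, whose details the abstract does not give. It is a minimal median-absolute-deviation (MAD) rule, and the function name, the `threshold` value, and the sample currents are all assumptions made for illustration.

```python
from statistics import median

def flag_outliers_by_median(currents, threshold=3.0):
    """Return indices of samples far from the median of one weaving region.

    Distance is measured in units of the median absolute deviation (MAD).
    Both the MAD rule and `threshold` are illustrative assumptions.
    """
    center = median(currents)
    deviations = [abs(c - center) for c in currents]
    mad = median(deviations)
    if mad == 0:
        # At least half the samples equal the median: flag any that differ.
        return [i for i, d in enumerate(deviations) if d > 0]
    return [i for i, d in enumerate(deviations) if d / mad > threshold]

# Hypothetical feedback-current samples (amperes) from one weaving region.
region = [200.0, 201.0, 199.0, 200.5, 260.0, 199.5]
print(flag_outliers_by_median(region))
```

The median is used instead of the mean because a single spike in current shifts the mean noticeably but leaves the median almost unchanged.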

  • Article type: Journal Article
    Advanced techniques can accelerate the pace of natural product discovery from microbes, which has been lagging behind the drug discovery era. Therefore, the present review article discusses various interdisciplinary and cutting-edge techniques to present a concrete strategy that enables the high-throughput screening of novel natural compounds (NCs) from known microbes. Recent bioinformatics methods revealed that the microbial genome contains a huge untapped reservoir of silent biosynthetic gene clusters (BGCs). This article describes several methods to identify the microbial strains with hidden mines of silent BGCs. Moreover, antiSMASH 5.0, a free, accurate, and highly reliable bioinformatics tool, is discussed in detail for identifying silent BGCs in the microbial genome. Further, the latest microbial culture technique, HiTES (high-throughput elicitor screening), is detailed for the expression of silent BGCs using 500-1000 different growth conditions at a time. Following the expression of silent BGCs, the latest mass spectrometry methods are highlighted to identify the NCs. The recently emerged LAESI-IMS (laser ablation electrospray ionization-imaging mass spectrometry) technique, which enables the rapid identification of novel NCs directly from microtiter plates, is presented in detail. Finally, various trending 'dereplication' strategies are emphasized to increase the effectiveness of NC screening.

  • Article type: Journal Article
    Behavior analysis is a widely used non-invasive tool in the practical production routine, as the animal acts as a biosensor capable of reflecting its degree of adaptation and discomfort to some environmental challenge. Conventional statistics use occurrence data for behavioral evaluation and well-being estimation, disregarding the temporal sequence of events. The Generalized Sequential Pattern (GSP) algorithm is a data mining method that identifies recurrent sequences that exceed a user-specified support threshold, the potential of which has not yet been investigated for broiler chickens in enriched environments. Enrichment aims to increase environmental complexity with promising effects on animal welfare, stimulating priority behaviors and potentially reducing the deleterious effects of heat stress. The objective here was to validate the application of the GSP algorithm to identify temporal correlations between heat stress and the behavior of broiler chickens in enriched environments through a proof of concept. Video image collection was carried out automatically for 48 continuous hours, analyzing a continuous period of seven hours, from 12:00 PM to 6:00 PM, during two consecutive days of tests for chickens housed in enriched and non-enriched environments under comfort and stress temperatures. Chickens at the comfort temperature showed high motivation to perform the behaviors of preening (P), foraging (F), lying down (Ld), eating (E), and walking (W); the sequences <{Ld,P}>; <{Ld,F}>; <{P,F,P}>; <{Ld,P,F}>; and <{E,W,F}> were the only ones observed in both treatments. All other sequential patterns (comfort and stress) were distinct, suggesting that environmental enrichment alters the behavioral pattern of broiler chickens. Heat stress drastically reduced the sequential patterns found at the 20% threshold level in the tested environments. The behavior of lying laterally ("Ll") is a strong indicator of heat stress in broilers and was only frequent in the non-enriched environment, which may suggest that environmental enrichment provides the animal with better opportunities to adapt to stress-inducing challenges, such as heat.
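
The support-threshold step at the heart of sequential pattern mining can be sketched as below. This covers only support counting, not GSP's candidate generation and pruning, and it treats each behavior as a single ordered item with gaps allowed. The observation sequences are invented for illustration. Only the behavior codes and the 20% threshold come from the abstract, and using `>=` rather than a strict comparison is an assumption.

```python
def contains_subsequence(sequence, pattern):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` consumes the iterator up to the first match, so later
    # pattern items can only match later positions in the sequence.
    return all(item in it for item in pattern)

def support(sequences, pattern):
    """Fraction of observed sequences that contain `pattern`."""
    hits = sum(contains_subsequence(s, pattern) for s in sequences)
    return hits / len(sequences)

# Hypothetical behavior sequences, one per observation window.
observations = [
    ["Ld", "P", "F"],
    ["E", "W", "F"],
    ["Ld", "F", "P"],
    ["P", "F", "P"],
    ["Ld", "P", "E"],
]
min_support = 0.20  # the 20% threshold level mentioned in the abstract

for pattern in (["Ld", "P"], ["E", "W", "F"], ["W", "Ld"]):
    s = support(observations, pattern)
    print(pattern, s, "frequent" if s >= min_support else "infrequent")
```

Unlike simple occurrence counts, this keeps the order of events: ["Ld", "P"] and ["P", "Ld"] are different patterns with different support.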

  • Article type: Journal Article
    Infrared thermography has been investigated in recent studies to monitor body surface temperature and correlate it with animal welfare and performance factors. In this context, this study proposes the use of the thermal signature method as a feature extractor from the temperature matrix obtained from regions of the body surface of laying hens (face, eye, wattle, comb, leg, and foot) to enable the construction of a computational model for heat stress level classification. In an experiment conducted in climate-controlled chambers, 192 laying hens, 34 weeks old, from two different strains (Dekalb White and Dekalb Brown) were divided into groups and housed under conditions of heat stress (35 °C and 60% humidity) and thermal comfort (26 °C and 60% humidity). Weekly, individual thermal images of the hens were collected using a thermographic camera, along with their respective rectal temperatures. Surface temperatures of the six featherless image areas of the hens' bodies were cut out. Rectal temperature was used to label each infrared thermography record as "Danger" or "Normal", and five different classifier models (Random Forest, Random Tree, Multilayer Perceptron, K-Nearest Neighbors, and Logistic Regression) were generated to predict the rectal temperature class from the respective thermal signatures. No differences between the strains were observed in the thermal signature of surface temperature and rectal temperature. The rectal temperature and the thermal signature were shown to express heat stress and comfort conditions. The Random Forest model for the face area of the laying hen achieved the highest performance (89.0%). For the wattle area, a Random Forest model also demonstrated high performance (88.3%), indicating the significance of this area in strains where it is more developed. These findings validate the method of extracting characteristics from infrared thermography. When combined with machine learning, this method has proven promising for generating classifier models of thermal stress levels in laying hen production environments.

  • Article type: Journal Article
    BACKGROUND: The use of machine learning in medical diagnosis and treatment has grown significantly in recent years with the development of computer-aided diagnosis systems, often based on annotated medical radiology images. However, the lack of large annotated image datasets remains a major obstacle, as the annotation process is time-consuming and costly. This study aims to overcome this challenge by proposing an automated method for annotating a large database of medical radiology images based on their semantic similarity.
    RESULTS: An automated, unsupervised approach is used to create a large annotated dataset of medical radiology images originating from the Clinical Hospital Centre Rijeka, Croatia. The pipeline is built by data-mining three different types of medical data: images, DICOM metadata and narrative diagnoses. The optimal feature extractors are then integrated into a multimodal representation, which is clustered to create an automated pipeline for labelling a precursor dataset of 1,337,926 medical images into 50 clusters of visually similar images. The quality of the clusters is assessed by examining their homogeneity and mutual information, taking into account the anatomical region and modality representation.
    CONCLUSIONS: The results indicate that fusing the embeddings of all three data sources provides the best results for the task of unsupervised clustering of large-scale medical data and leads to the most concise clusters. Hence, this work marks the initial step towards building a much larger and more fine-grained annotated dataset of medical radiology images.
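
One way to read the homogeneity criterion is the standard entropy-based definition, homogeneity = 1 - H(class | cluster) / H(class), which the sketch below computes from scratch. Whether the study used exactly this definition is an assumption, and the anatomical-region labels and cluster assignments are invented for illustration.

```python
from collections import Counter
from math import log

def entropy(labels):
    """Shannon entropy (natural log) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def conditional_entropy(classes, clusters):
    """H(classes | clusters): remaining class uncertainty inside clusters."""
    n = len(classes)
    joint = Counter(zip(clusters, classes))
    cluster_sizes = Counter(clusters)
    return -sum(
        (c / n) * log(c / cluster_sizes[k]) for (k, _), c in joint.items()
    )

def homogeneity(classes, clusters):
    """1.0 when every cluster holds a single class, 0.0 when clusters are uninformative."""
    h_c = entropy(classes)
    if h_c == 0:
        return 1.0  # only one class present, so any clustering is homogeneous
    return 1.0 - conditional_entropy(classes, clusters) / h_c

# Hypothetical anatomical-region labels and two candidate clusterings.
regions = ["chest", "chest", "head", "head"]
print(homogeneity(regions, [0, 0, 1, 1]))  # each cluster holds one region
print(homogeneity(regions, [0, 1, 0, 1]))  # each cluster mixes both regions
```

The same functions can be applied with imaging modality instead of anatomical region as the reference labels.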

  • Article type: Journal Article
    Co-fractionation mass spectrometry (CF-MS) uses biochemical fractionation to isolate and characterize macromolecular complexes from cellular lysates without the need for affinity tagging or capture. In recent years, this has emerged as a powerful technique for elucidating global protein-protein interaction networks in a wide variety of biospecimens. This review highlights the latest advancements in CF-MS experimental workflows, including machine learning-guided analyses, for uncovering dynamic and high-resolution protein interaction landscapes with enhanced sensitivity, accuracy and throughput, enabling better biophysical characterization of endogenous protein complexes. By addressing challenges and emergent opportunities in the field, this review underscores the transformative potential of CF-MS in advancing our understanding of functional protein interaction networks in health and disease.

  • Article type: Journal Article
    Biomedical relation extraction is an ongoing challenge within the natural language processing community. Its application is important for understanding scientific biomedical literature, with many use cases, such as drug discovery, precision medicine, disease diagnosis, treatment optimization and biomedical knowledge graph construction. Therefore, the development of a tool capable of effectively addressing this task holds the potential to improve knowledge discovery by automating the extraction of relations from research manuscripts. The first track in the BioCreative VIII competition extended the scope of this challenge by introducing the detection of novel relations within the literature. This paper describes our participating system, which initially focused on jointly extracting and classifying novel relations between biomedical entities. We then describe our subsequent advancement to an end-to-end model. Specifically, we enhanced our initial system by incorporating it into a cascading pipeline that includes a tagger and linker module. This integration enables the comprehensive extraction of relations and classification of their novelty directly from raw text. Our experiments yielded promising results: our tagger module attained state-of-the-art named entity recognition performance, with a micro F1-score of 90.24, while our end-to-end system achieved a competitive novelty F1-score of 24.59. The code to run our system is publicly available at https://github.com/ieeta-pt/BioNExt. Database URL: https://github.com/ieeta-pt/BioNExt.
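
For readers unfamiliar with the metric, a micro-averaged F1-score pools true positives, false positives, and false negatives across all entity types before computing precision and recall. The sketch below shows that calculation. The entity types and counts are hypothetical and are not taken from the BioNExt evaluation.

```python
def micro_f1(counts):
    """Micro-averaged F1 from a mapping of class -> (tp, fp, fn)."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0  # no correct predictions, so F1 is zero by convention
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical per-type counts: (true positives, false positives, false negatives).
counts = {
    "Gene": (8, 2, 2),
    "Disease": (1, 0, 3),
}
print(round(100 * micro_f1(counts), 2))
```

Because counts are pooled, frequent entity types dominate the micro average, which is why it can differ from a per-type (macro) average.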

  • Article type: Journal Article
    BACKGROUND: Chronic diabetic wounds pose a significant threat to the health of diabetic patients, representing severe and enduring complications. Globally, an estimated 2.5% to 15% of the annual health budget is associated with diabetes, with diabetic wounds accounting for a substantial share. Exploring new therapeutic agents and approaches to address delayed and impaired wound healing in diabetes has become imperative. Traditional Chinese medicine (TCM) has a long history and remarkable efficacy in treating chronic wounds. In this study, all topically applied proprietary Chinese medicines (pCMs) for wound healing officially approved by the National Medical Products Administration (NMPA) were collected from the NMPA TCM database. Data mining was employed to obtain a high-frequency TCM ingredient pair, Pearl-Borneol (1:1).
    METHODS: This study investigated the effect and molecular mechanism of the Pearl-Borneol pair on the healing of diabetic wounds by animal experiments and metabolomics. The results from animal experiments showed that the Pearl-Borneol pair significantly accelerated diabetic wound healing, exhibiting a more potent effect than Pearl or Borneol treatment alone. Meanwhile, the metabolomics analysis identified significant differences in metabolic profiles in wounds between the model and normal groups, indicating that diabetic wounds had distinct metabolic characteristics from normal wounds. Moreover, Vaseline-treated wounds exhibited similar metabolic profiles to the wounds from the model group, suggesting that Vaseline might have a negligible impact on diabetic wound metabolism. In addition, wounds treated with Pearl, Borneol, and the Pearl-Borneol pair displayed significantly different metabolic profiles from Vaseline-treated wounds, signifying the influence of these treatments on wound metabolism. Subsequent enrichment analysis of the metabolic pathways highlighted the involvement of the arginine metabolic pathway, closely associated with diabetic wounds, in the healing process under Pearl-Borneol pair treatment. Further analysis revealed elevated levels of arginine and citrulline, coupled with reduced nitric oxide (NO), in both the model and Vaseline-treated wounds compared to normal wounds, pointing to impaired arginine utilization in diabetic wounds. Interestingly, treatment with Pearl and the Pearl-Borneol pair lowered arginine and citrulline levels while increasing NO content, suggesting that these treatments may promote the catabolism of arginine to generate NO, thereby facilitating faster wound closure. Additionally, Borneol alone significantly elevated NO content in wounds, potentially due to its ability to directly reduce nitrates/nitrites to NO. Oxidative stress is a defining characteristic of impaired metabolism in diabetic wounds.
    RESULTS: The results showed that both Pearl and the Pearl-Borneol pair decreased the level of the oxidative stress biomarker methionine sulfoxide in diabetic wounds compared to those treated with Vaseline, indicating that Pearl alone or combined with Borneol may improve the oxidative stress microenvironment in diabetic wounds.
    CONCLUSIONS: In summary, the findings validate the effectiveness of the Pearl-Borneol pair in accelerating the healing of diabetic wounds, with effects on reducing oxidative stress, enhancing arginine metabolism, and increasing NO generation, providing a mechanistic basis for this therapeutic approach.

  • Article type: Journal Article
    Laboratory-scale (in vitro) microbial fermentation is based on screening of process parameters (factors) and statistical validation of parameters (responses) using regression analysis. Recent trends have shifted from full factorial design towards more complex response surface methodology designs, such as the Box-Behnken and central composite designs. Apart from the optimisation methodologies, the listed designs are not flexible enough in deducing properties of parameters in terms of class variables. Machine learning algorithms offer unique visualisations for a dataset presented with appropriate learning algorithms. Classification algorithms cannot be applied to all datasets, and the selection of a classifier is essential in this regard. To resolve this issue, the factor-response relationship needs to be evaluated as a dataset, and subsequent preprocessing could lead to appropriate results. The aim of the current study was to investigate, for the first time, the data-mining accuracy on a dataset developed from in vitro pyruvate production using organic sources. The attributes were subjected to comparative classification on various classifiers and, based on accuracy, a multilayer perceptron (neural network algorithm) was selected as the classifier. As per the results, the model showed significant results for prediction of classes and a good fit. The learning curve also showed that the datasets converged and were linearly separable.