Reinforcement Learning

  • Article type: Journal Article
    Reinforcement learning based hyper-heuristics (RL-HH) are a popular trend in the field of optimization. RL-HH combines the global search ability of hyper-heuristics (HH) with the learning ability of reinforcement learning (RL). This synergy allows the agent to dynamically adjust its own strategy, leading to gradual optimization of the solution. Existing research has shown the effectiveness of RL-HH in solving complex real-world problems. However, a comprehensive introduction to and summary of the RL-HH field is still lacking. This work reviews currently existing RL-HHs and presents a general framework for them. The algorithms are divided into two categories: value-based and policy-based reinforcement learning hyper-heuristics. Typical algorithms in each category are summarized and described in detail. Finally, the shortcomings of existing RL-HH research and future research directions are discussed.
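
The value-based branch described above can be illustrated with a minimal sketch: a tabular Q-learning agent that repeatedly picks one of several low-level heuristics to apply to a toy minimization problem. The heuristic names, the objective f(x) = x², and all parameters are illustrative assumptions, not taken from any specific RL-HH paper.

```python
import random

# A minimal sketch of a value-based RL hyper-heuristic: a tabular
# Q-learning agent picks which low-level heuristic to apply next.
# The "heuristics" are hypothetical moves on a toy 1-D minimization
# problem f(x) = x**2; names, rewards, and parameters are illustrative.

HEURISTICS = {
    "small_step": lambda x: x - 0.1 if x > 0 else x + 0.1,
    "large_step": lambda x: x - 1.0 if x > 0 else x + 1.0,
    "random_restart": lambda x: random.uniform(-10, 10),
}

def run_rl_hh(episodes=200, alpha=0.3, gamma=0.9, eps=0.2, seed=0):
    random.seed(seed)
    q = {h: 0.0 for h in HEURISTICS}           # single-state Q-table
    x, best = 8.0, 8.0 ** 2
    for _ in range(episodes):
        # epsilon-greedy selection of a low-level heuristic
        if random.random() < eps:
            h = random.choice(list(HEURISTICS))
        else:
            h = max(q, key=q.get)
        new_x = HEURISTICS[h](x)
        reward = (x ** 2) - (new_x ** 2)       # improvement in objective
        q[h] += alpha * (reward + gamma * max(q.values()) - q[h])
        x = new_x
        best = min(best, x ** 2)
    return best, q

best, q = run_rl_hh()
```

Swapping the Q-table for a learned probability distribution over heuristics would give the policy-based variant of the same selection loop.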

  • Article type: Journal Article
    We present a novel decision-making framework for accelerated degradation tests and predictive maintenance that exploits prior knowledge and experimental data on the system's state. As a framework for sequential decision-making in these areas, dynamic programming and reinforcement learning are considered, along with data-driven degradation learning when necessary. Furthermore, we illustrate both stochastic and machine learning degradation models, which are integrated in the framework using data-driven methods. These methods are presented as a valuable tool for designing life-testing experiments and for maintaining lithium-ion batteries.
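
As a toy illustration of the dynamic-programming component mentioned above (not the paper's actual model), the sketch below runs value iteration on a three-state maintenance MDP; the state names, transition probabilities, rewards, and costs are all assumed.

```python
# A toy maintenance MDP: a battery degrades through discrete health
# states, and each period we either keep operating or pay to maintain.
# All transition probabilities, rewards, and costs are assumptions.

STATES = ["good", "worn", "failed"]
P_DEGRADE = {"good": 0.1, "worn": 0.3}   # chance of dropping one state
OPERATING_REWARD = {"good": 10.0, "worn": 6.0, "failed": 0.0}
MAINTAIN_COST = 4.0
GAMMA = 0.95

def value_iteration(tol=1e-8):
    v = {s: 0.0 for s in STATES}
    while True:
        new_v = {}
        for s in STATES:
            if s == "failed":
                # must repair: pay the cost, return to "good"
                new_v[s] = -MAINTAIN_COST + GAMMA * v["good"]
                continue
            nxt = STATES[STATES.index(s) + 1]
            keep = OPERATING_REWARD[s] + GAMMA * (
                P_DEGRADE[s] * v[nxt] + (1 - P_DEGRADE[s]) * v[s])
            maintain = OPERATING_REWARD["good"] - MAINTAIN_COST + GAMMA * v["good"]
            new_v[s] = max(keep, maintain)
        if max(abs(new_v[s] - v[s]) for s in STATES) < tol:
            return new_v
        v = new_v

values = value_iteration()
```

The resulting state values order as expected (healthier states are worth more), which is the kind of sanity check such a framework would rely on before scaling up.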

  • Article type: Journal Article
    Reinforcement learning (RL) has been applied to various domains in computational chemistry and has found widespread success. In this review, we first motivate the application of RL to chemistry and list some broad application domains, for example, molecule generation, geometry optimization, and retrosynthetic pathway search. We set up some of the formalism associated with reinforcement learning that should help the reader translate their chemistry problems into a form where RL can be used to solve them. We then discuss the solution formulations and algorithms proposed in the recent literature for these problems, the advantages of one over another, together with the necessary details of the RL algorithms they employ. This article should help the reader understand the state of RL applications in chemistry, learn about some relevant actively researched open problems, gain insight into how RL can be used to approach them, and, we hope, inspire innovative RL applications in chemistry.
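
The kind of formalism such a review sets up can be sketched as a minimal MDP wrapper: states are partial molecules, actions append tokens, and a reward is scored at the end. The "SMILES-like string" framing, the vocabulary, and the carbon-counting reward below are purely hypothetical stand-ins, not the review's specific formulation.

```python
from dataclasses import dataclass

# A toy MDP casting molecule generation as sequential decision-making:
# the state is a growing SMILES-like string, actions append one token,
# and a (hypothetical) reward is given only when the episode ends.

@dataclass
class MoleculeMDP:
    vocab: tuple = ("C", "O", "N")   # tokens the agent may append
    max_len: int = 5
    state: str = ""                  # partial molecule string

    def actions(self):
        # no actions remain once the maximum length is reached
        return [] if len(self.state) >= self.max_len else list(self.vocab)

    def step(self, token):
        self.state += token          # deterministic transition
        done = len(self.state) >= self.max_len
        # hypothetical terminal reward: number of carbons in the string
        reward = float(self.state.count("C")) if done else 0.0
        return self.state, reward, done

env = MoleculeMDP()
total, done = 0.0, False
while not done:
    _, r, done = env.step("C")       # a trivial fixed policy for the demo
    total += r
```

Any RL algorithm the review discusses could then be plugged in against this `actions`/`step` interface.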

  • Article type: Journal Article
    Reinforcement learning (RL) has emerged as a dynamic and transformative paradigm in artificial intelligence, offering the promise of intelligent decision-making in complex and dynamic environments. This unique feature enables RL to address sequential decision-making problems with simultaneous sampling, evaluation, and feedback. As a result, RL techniques have become suitable candidates for developing powerful solutions in various domains. In this study, we present a comprehensive and systematic review of RL algorithms and applications. This review commences with an exploration of the foundations of RL and proceeds to examine each algorithm in detail, concluding with a comparative analysis of RL algorithms based on several criteria. This review then extends to two key applications of RL: robotics and healthcare. In robotic manipulation, RL enhances precision and adaptability in tasks such as object grasping and autonomous learning. In healthcare, this review turns its focus to the realm of cell growth problems, clarifying how RL has provided a data-driven approach for optimizing the growth of cell cultures and the development of therapeutic solutions. This review offers a comprehensive overview, shedding light on the evolving landscape of RL and its potential in two diverse yet interconnected fields.
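
To complement the value-based families such comparative reviews cover, here is a minimal policy-based sketch: REINFORCE-style updates on a two-armed bandit. The payout probabilities and hyperparameters are illustrative assumptions.

```python
import math
import random

# A minimal REINFORCE-style policy-gradient sketch on a two-armed
# bandit: action preferences (logits) are pushed toward actions that
# yield reward, in proportion to the gradient of log pi(a).

def reinforce_bandit(p_reward=(0.2, 0.8), lr=0.1, episodes=3000, seed=0):
    random.seed(seed)
    theta = [0.0, 0.0]                         # action preferences
    for _ in range(episodes):
        z = [math.exp(t) for t in theta]       # softmax policy
        total = sum(z)
        probs = [x / total for x in z]
        a = 0 if random.random() < probs[0] else 1
        r = 1.0 if random.random() < p_reward[a] else 0.0
        # grad of log pi(a) w.r.t. theta_i is one_hot(a) - probs[i]
        for i in range(2):
            grad = (1.0 if i == a else 0.0) - probs[i]
            theta[i] += lr * r * grad
    return theta

theta = reinforce_bandit()
```

After training, the preference for the higher-paying arm dominates, which is exactly the contrast with value-based methods (which would instead track per-arm value estimates).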

  • Article type: Journal Article
    Soft robotics is closely related to embodied intelligence in the joint exploration of the means to achieve more natural and effective robotic behaviors via physical forms and intelligent interactions. Embodied intelligence emphasizes that intelligence is shaped by the synergy of the brain, body, and environment, focusing on the interaction between agents and the environment. Under this framework, the design and control strategies of soft robots depend on their physical forms and material properties, as well as on algorithms and data processing, which enable them to interact with the environment in a natural and adaptable manner. At present, embodied intelligence has comprehensively integrated related research results on evolution, learning, perception, and decision making in the field of intelligent algorithms, as well as on behavior and control in the field of robotics. From this perspective, the relevant branches of embodied intelligence in the context of soft robotics were studied, covering the computation of embodied morphology; the evolution of embodied AI; and the perception, control, and decision making of soft robots. Moreover, on this basis, important research progress was summarized and related scientific problems were discussed. This study can provide a reference for research on embodied intelligence in the context of soft robotics.

  • Article type: Journal Article
    This review aimed to systematically identify and comprehensively review the role of the cerebellum in performance monitoring, focusing on learning from, and processing of, external feedback in non-motor learning. Of 1078 articles screened for eligibility, 36 studies were ultimately included in which external feedback was delivered in cognitive tasks and which referenced the cerebellum. These included studies in patient populations with cerebellar damage and neuroimaging studies in healthy subjects. Learning performance in patients with different cerebellar diseases was heterogeneous, with only about half of all patients showing alterations. One patient study using EEG demonstrated that damage to the cerebellum was associated with altered neural processing of external feedback. Studies assessing brain activity with task-based fMRI or PET, and one resting-state functional imaging study that investigated connectivity changes following feedback-based learning in healthy participants, revealed involvement particularly of lateral and posterior cerebellar regions in processing of and learning from external feedback. Cerebellar involvement was found at different stages, e.g., during feedback anticipation and following the onset of the feedback stimuli, substantiating the cerebellum's relevance for different aspects of performance monitoring such as feedback prediction. Future research will need to further elucidate precisely how, where, and when the cerebellum modulates the prediction and processing of external feedback information, which cerebellar subregions are particularly relevant, and to what extent cerebellar diseases alter these processes.

  • Article type: Systematic Review
    Thoughts and actions are often driven by a decision either to explore new avenues with unknown outcomes or to exploit known options with predictable outcomes. Yet the neural mechanisms underlying this exploration-exploitation trade-off in humans remain poorly understood. This is attributable to variability in how exploration and exploitation are operationalized as psychological constructs, as well as to the heterogeneity of experimental protocols and paradigms used to study these choice behaviours. To address this gap, here we present a comprehensive review of the literature to investigate the neural basis of explore-exploit decision-making in humans. We first conducted a systematic review of functional magnetic resonance imaging (fMRI) studies of exploration- versus exploitation-based decision-making in healthy adult humans during foraging, reinforcement learning, and information search. Eleven fMRI studies met the inclusion criteria for this review. Adopting a network neuroscience framework, synthesis of the findings across these studies revealed that exploration-based choice was associated with engagement of attentional, control, and salience networks. In contrast, exploitation-based choice was associated with engagement of default-network brain regions. We interpret these results in the context of a network architecture that supports flexible switching between externally and internally directed cognitive processes, necessary for adaptive, goal-directed behaviour. To further investigate potential neural mechanisms underlying the exploration-exploitation trade-off, we next surveyed studies involving neurodevelopmental, neuropsychological, and neuropsychiatric disorders, as well as lifespan development and neurodegenerative diseases. We observed striking differences in patterns of explore-exploit decision-making across these populations, again suggesting that these two decision-making modes are supported by independent neural circuits. Taken together, our review highlights the need for precision mapping of the neural circuitry and behavioural correlates associated with exploration and exploitation in humans. Characterizing exploration- versus exploitation-based decision biases may offer a novel, trans-diagnostic approach to assessment, surveillance, and intervention for cognitive decline and dysfunction in normal development and in clinical populations.
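
The behavioural trade-off this review examines is commonly operationalized with bandit tasks; a minimal epsilon-greedy sketch (with assumed payout probabilities and epsilon) makes the explore/exploit distinction concrete.

```python
import random

# A minimal epsilon-greedy agent on a two-armed bandit: with
# probability eps it explores a random arm, otherwise it exploits
# the arm with the highest running mean reward. All parameters are
# illustrative assumptions.

def epsilon_greedy_bandit(p_arms=(0.3, 0.7), eps=0.1, trials=5000, seed=1):
    random.seed(seed)
    counts = [0] * len(p_arms)
    values = [0.0] * len(p_arms)     # running mean reward per arm
    for _ in range(trials):
        if random.random() < eps:                          # explore
            arm = random.randrange(len(p_arms))
        else:                                              # exploit
            arm = max(range(len(p_arms)), key=values.__getitem__)
        reward = 1.0 if random.random() < p_arms[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]
    return counts, values

counts, values = epsilon_greedy_bandit()
```

The agent ends up pulling the better arm far more often, while the residual exploration pulls are what keep its estimate of the worse arm calibrated.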

  • Article type: Systematic Review
    Reinforcement learning (RL) refers to the ability to learn stimulus-response or response-outcome associations relevant to the acquisition of behavioral repertoire and adaptation to the environment. Research data from correlational and case-control studies have shown that obesity is associated with impairments in RL. The aim of the present study was to systematically review how obesity and overweight are associated with RL performance. More specifically, the relationship between high body mass index (BMI) and task performance was explored through the analysis of specific RL processes associated with different physiological, computational, and behavioral manifestations. Our systematic analyses indicate that obesity might be associated with impairments in the use of aversive outcomes to change ongoing behavior, as revealed by results involving instrumental negative reinforcement and extinction/reversal learning, but further research needs to be conducted to confirm this association. Hypotheses regarding how obesity might be associated with altered RL were discussed.

  • Article type: Systematic Review
    OBJECTIVE: Mechanistic model simulations (MM) are an effective approach commonly employed, for research and learning purposes, to better investigate and understand the inherent behavior of biological systems. Recent advancements in modern technologies and the large availability of omics data have allowed the application of machine learning (ML) techniques to different research fields, including systems biology. However, the availability of information regarding the analyzed biological context, the sufficiency of experimental data, and the degree of computational complexity represent some of the issues that MMs and ML techniques each present individually. For this reason, several recent studies suggest overcoming or significantly reducing these drawbacks by combining the two methods. In the wake of the growing interest in this hybrid analysis approach, the present review systematically investigates the studies available in the scientific literature in which MMs and ML have been combined to explain biological processes at the genomics, proteomics, and metabolomics levels, or the behavior of entire cellular populations.
    METHODS: The Elsevier Scopus®, Clarivate Web of Science™, and National Library of Medicine PubMed® databases were queried using the queries reported in Table 1, resulting in 350 scientific articles.
    RESULTS: Only 14 of the 350 documents returned by the comprehensive search conducted on the three major online databases met our search criteria, i.e., they present a hybrid approach consisting of the synergistic combination of MMs and ML to treat a particular aspect of systems biology.
    CONCLUSIONS: Despite the recency of interest in this methodology, a careful analysis of the selected papers shows that examples of integration between MMs and ML are already present in systems biology, highlighting the great potential of this hybrid approach at both micro and macro biological scales.
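
The MM+ML combination this review surveys can be caricatured in a few lines: a mechanistic logistic-growth simulation supplies predictions, and a simple least-squares fit "learns" a correction against synthetic observations. This is purely illustrative and not any reviewed study's method; all model parameters are assumed.

```python
# Mechanistic component: Euler integration of logistic growth
# dN/dt = r * N * (1 - N/K) for a toy cell population.
def logistic_growth(n0=10.0, r=0.5, k=1000.0, steps=20, dt=0.5):
    traj, n = [n0], n0
    for _ in range(steps):
        n += dt * r * n * (1 - n / k)
        traj.append(n)
    return traj

# ML-flavoured component: a one-parameter least-squares fit of a
# scale factor c such that obs ≈ c * pred.
def fit_scale(pred, obs):
    num = sum(p * o for p, o in zip(pred, obs))
    den = sum(p * p for p in pred)
    return num / den

pred = logistic_growth()
obs = [1.1 * p for p in pred]          # synthetic "measurements"
c = fit_scale(pred, obs)
```

In the reviewed hybrid approaches the learned component is of course far richer (e.g., a regressor over omics features), but the division of labor — mechanism simulates, data corrects — is the same.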

  • Article type: Journal Article
    The smart factory is at the heart of Industry 4.0 and is the new paradigm for establishing advanced manufacturing systems and realizing modern manufacturing objectives such as mass customization, automation, efficiency, and self-organization all at once. Such manufacturing systems, however, are characterized by dynamic and complex environments in which a large number of decisions must be made for smart components, such as production machines and the material handling system, in a real-time and optimal manner. AI offers key intelligent control approaches for realizing efficiency, agility, and automation simultaneously. One of the most challenging problems in this regard is uncertainty: due to the dynamic nature of smart manufacturing environments, sudden foreseen or unforeseen events occur that must be handled in real time. Because of the complexity and high dimensionality of smart factories, it is not possible to predict all possible events or to prepare appropriate response scenarios in advance. Reinforcement learning is an AI technique that provides the intelligent control processes needed to deal with such uncertainties. Given the distributed nature of smart factories and the presence of multiple decision-making components, multi-agent reinforcement learning (MARL) should be incorporated instead of single-agent reinforcement learning (SARL); MARL, however, has attracted less attention owing to the complexities involved in its development. In this research, we review the literature on applications of MARL to tasks within a smart factory and then demonstrate a mapping connecting smart factory attributes to equivalent MARL features, based on which we argue that MARL is one of the most effective approaches for implementing the control mechanism for smart factories.
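
A minimal MARL sketch in the spirit of the mapping discussed above: two independent Q-learners (a common MARL baseline) each pick one of two jobs, and picking the same job models congestion on a shared resource. The payoff structure and parameters are assumed for illustration only.

```python
import random

# Independent Q-learners on a stateless anti-coordination game:
# two "machines" each choose one of two jobs per round; they earn
# reward 1 only when they split the jobs (no congestion).

def independent_q_learners(rounds=2000, alpha=0.2, eps=0.1, seed=0):
    random.seed(seed)
    q = [[0.0, 0.0], [0.0, 0.0]]     # q[agent][action]
    for _ in range(rounds):
        acts = []
        for a in range(2):
            if random.random() < eps:              # explore
                acts.append(random.randrange(2))
            else:                                  # exploit own Q-values
                acts.append(0 if q[a][0] >= q[a][1] else 1)
        r = 1.0 if acts[0] != acts[1] else 0.0     # collision penalty
        for a in range(2):
            q[a][acts[a]] += alpha * (r - q[a][acts[a]])
    return q

q = independent_q_learners()
```

Each agent learns only from its own reward signal, yet the pair settles into complementary job assignments — the kind of decentralized coordination the review argues MARL brings to smart-factory control.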
