Deep deterministic policy gradient algorithm: A systematic review

Abstract:

Deep Reinforcement Learning (DRL) has gained significant adoption in diverse fields and applications, mainly due to its proficiency in resolving complicated decision-making problems in spaces with high-dimensional states and actions. Deep Deterministic Policy Gradient (DDPG) is a well-known DRL algorithm that adopts an actor-critic approach, synthesizing the advantages of value-based and policy-based reinforcement learning methods. The aim of this study is to provide a thorough examination of the latest developments, patterns, obstacles, and potential opportunities related to DDPG. A systematic search was conducted using relevant academic databases (Scopus, Web of Science, and ScienceDirect) to identify 85 relevant studies published in the last five years (2018-2023). We provide a comprehensive overview of the key concepts and components of DDPG, including its formulation, implementation, and training. Then, we highlight the various applications and domains of DDPG, including Autonomous Driving, Unmanned Aerial Vehicles, Resource Allocation, Communications and the Internet of Things, Robotics, and Finance. Additionally, we provide an in-depth comparison of DDPG with other DRL algorithms and traditional RL methods, highlighting its strengths and weaknesses. We believe that this review will be an essential resource for researchers, offering them valuable insights into the methods and techniques utilized in the field of DRL and DDPG.
