
Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity

Abstract

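Deep Reinforcement Learning (DRL) has proven effective at solving the Local Path Planning (LPP) problem. Its real-world use, however, remains severely limited by two weaknesses of DRL: poor training efficiency and poor generalization. To alleviate both, this paper proposes Color, a solution that consists of an Actor-Sharer-Learner (ASL) training framework and Sparrow, a simulator designed for mobile robots. The ASL framework targets training efficiency. It uses a Vectorized Data Collection (VDC) mode to speed up data acquisition, decouples data collection from model optimization through multithreading, and partially reconnects the two procedures with a Time Feedback Mechanism (TFM) so that collected data is neither underused nor overused. The Sparrow simulator, in turn, achieves a lightweight design through a 2D grid-based world, simplified kinematics, and a conversion-free data flow. This lightness enables vectorized diversity: many parallel copies of the vectorized environment can each run a different simulation setup, which notably improves the generalization of the DRL algorithm being trained. To verify the method's advantages in efficiency and generalization, comprehensive experiments were conducted on 57 DRL benchmark environments, 32 simulated LPP scenarios, and 36 real-world LPP scenarios, and the results support the effectiveness of Color. The code and demonstration video are open source and available at https://github.com/XinJingHao/Color.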
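
The idea of partially decoupling collection from optimization can be illustrated with a toy sketch. It is not the authors' implementation: the function names `feedback_sleep` and `run_demo`, the `target_ratio` parameter, and the throttling rule are all assumptions made for illustration. The sketch assumes the goal is to hold the ratio of transitions consumed by the learner to transitions collected by the actor near a fixed target, by letting whichever thread runs too fast sleep for the difference.

```python
import threading
import time
from collections import deque


def feedback_sleep(actor_step_s, learner_step_s, n_envs, batch_size, target_ratio):
    """Return (actor_sleep_s, learner_sleep_s); at most one value is non-zero.

    target_ratio is the desired (transitions consumed) / (transitions collected).
    This rule is a hypothetical stand-in for a time feedback mechanism.
    """
    # Loop period each thread needs so that consumed / collected == target_ratio.
    actor_period = target_ratio * n_envs * learner_step_s / batch_size
    learner_period = batch_size * actor_step_s / (n_envs * target_ratio)
    return (max(0.0, actor_period - actor_step_s),
            max(0.0, learner_period - learner_step_s))


def run_demo(duration_s=0.2, n_envs=4, batch_size=8, target_ratio=2.0):
    """Run a toy actor thread and learner thread around a shared buffer."""
    buffer = deque(maxlen=10_000)                 # shared replay buffer
    lock = threading.Lock()
    stop = threading.Event()
    step_time = {"actor": 1e-3, "learner": 1e-3}  # latest measured step times
    counts = {"collected": 0, "consumed": 0}

    def actor():
        while not stop.is_set():
            start = time.perf_counter()
            time.sleep(0.001)  # stand-in for one vectorized environment step
            with lock:
                buffer.extend(range(n_envs))  # n_envs dummy transitions
                counts["collected"] += n_envs
            step_time["actor"] = time.perf_counter() - start
            wait, _ = feedback_sleep(step_time["actor"], step_time["learner"],
                                     n_envs, batch_size, target_ratio)
            time.sleep(wait)  # actor waits if the data would be underused

    def learner():
        while not stop.is_set():
            start = time.perf_counter()
            with lock:
                ready = len(buffer) >= batch_size
            time.sleep(0.002)  # stand-in for one gradient update
            if ready:
                counts["consumed"] += batch_size
            step_time["learner"] = time.perf_counter() - start
            _, wait = feedback_sleep(step_time["actor"], step_time["learner"],
                                     n_envs, batch_size, target_ratio)
            time.sleep(wait)  # learner waits if the data would be overused

    threads = [threading.Thread(target=actor), threading.Thread(target=learner)]
    for t in threads:
        t.start()
    time.sleep(duration_s)
    stop.set()
    for t in threads:
        t.join()
    return counts
```

Calling `run_demo()` returns the number of transitions collected and consumed. Because the two threads only exchange measured step times, neither blocks on the other's work, yet the consumed-to-collected ratio should drift toward `target_ratio`; timer jitter makes this approximate rather than exact.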

Code repository

xinjinghao/color (official, mentioned on GitHub)

Benchmarks

| Benchmark | Method | Score |
| --- | --- | --- |
| atari-games-on-atari-2600-alien | ASL DDQN | 6955.2 |
| atari-games-on-atari-2600-amidar | ASL DDQN | 2232.3 |
| atari-games-on-atari-2600-assault | ASL DDQN | 14372.8 |
| atari-games-on-atari-2600-asterix | ASL DDQN | 567640 |
| atari-games-on-atari-2600-asteroids | ASL DDQN | 1984.5 |
| atari-games-on-atari-2600-atlantis | ASL DDQN | 947275 |
| atari-games-on-atari-2600-bank-heist | ASL DDQN | 1340.9 |
| atari-games-on-atari-2600-battle-zone | ASL DDQN | 38986 |
| atari-games-on-atari-2600-beam-rider | ASL DDQN | 26841.6 |
| atari-games-on-atari-2600-berzerk | ASL DDQN | 2597.2 |
| atari-games-on-atari-2600-bowling | ASL DDQN | 62.4 |
| atari-games-on-atari-2600-boxing | ASL DDQN | 99.6 |
| atari-games-on-atari-2600-breakout | ASL DDQN | 621.7 |
| atari-games-on-atari-2600-centipede | ASL DDQN | 3899.8 |
| atari-games-on-atari-2600-chopper-command | ASL DDQN | 15071 |
| atari-games-on-atari-2600-crazy-climber | ASL DDQN | 166019 |
| atari-games-on-atari-2600-defender | ASL DDQN | 37026.5 |
| atari-games-on-atari-2600-demon-attack | ASL DDQN | 119773.9 |
| atari-games-on-atari-2600-double-dunk | ASL DDQN | 0.1 |
| atari-games-on-atari-2600-enduro | ASL DDQN | 2103.1 |
| atari-games-on-atari-2600-fishing-derby | ASL DDQN | 35.1 |
| atari-games-on-atari-2600-freeway | ASL DDQN | 33.9 |
| atari-games-on-atari-2600-frostbite | ASL DDQN | 8616.4 |
| atari-games-on-atari-2600-gopher | ASL DDQN | 103514.4 |
| atari-games-on-atari-2600-gravitar | ASL DDQN | 760 |
| atari-games-on-atari-2600-hero | ASL DDQN | 26578.5 |
| atari-games-on-atari-2600-ice-hockey | ASL DDQN | -3.6 |
| atari-games-on-atari-2600-james-bond | ASL DDQN | 2237 |
| atari-games-on-atari-2600-kangaroo | ASL DDQN | 13027 |
| atari-games-on-atari-2600-krull | ASL DDQN | 10422.5 |
| atari-games-on-atari-2600-kung-fu-master | ASL DDQN | 85182 |
| atari-games-on-atari-2600-montezumas-revenge | ASL DDQN | 0 |
| atari-games-on-atari-2600-ms-pacman | ASL DDQN | 4416 |
| atari-games-on-atari-2600-name-this-game | ASL DDQN | 16535.4 |
| atari-games-on-atari-2600-phoenix | ASL DDQN | 71752.6 |
| atari-games-on-atari-2600-pitfall | ASL DDQN | 0 |
| atari-games-on-atari-2600-pong | ASL DDQN | 21 |
| atari-games-on-atari-2600-private-eye | ASL DDQN | 349.7 |
| atari-games-on-atari-2600-qbert | ASL DDQN | 24548.8 |
| atari-games-on-atari-2600-river-raid | ASL DDQN | 24445 |
| atari-games-on-atari-2600-road-runner | ASL DDQN | 56520 |
| atari-games-on-atari-2600-robotank | ASL DDQN | 65.8 |
| atari-games-on-atari-2600-seaquest | ASL DDQN | 29278.6 |
| atari-games-on-atari-2600-skiing | ASL DDQN | -8295.4 |
| atari-games-on-atari-2600-solaris | ASL DDQN | 3506.8 |
| atari-games-on-atari-2600-space-invaders | ASL DDQN | 21602 |
| atari-games-on-atari-2600-star-gunner | ASL DDQN | 129140 |
| atari-games-on-atari-2600-surround | ASL DDQN | 2.5 |
| atari-games-on-atari-2600-tennis | ASL DDQN | 22.3 |
| atari-games-on-atari-2600-time-pilot | ASL DDQN | 12071 |
| atari-games-on-atari-2600-tutankham | ASL DDQN | 252.9 |
| atari-games-on-atari-2600-up-and-down | ASL DDQN | 25127.4 |
| atari-games-on-atari-2600-venture | ASL DDQN | 291 |
| atari-games-on-atari-2600-video-pinball | ASL DDQN | 626794 |
| atari-games-on-atari-2600-wizard-of-wor | ASL DDQN | 21049 |
| atari-games-on-atari-2600-yars-revenge | ASL DDQN | 29231.9 |
| atari-games-on-atari-2600-zaxxon | ASL DDQN | 16420 |
