4 months ago

Distributed Prioritized Experience Replay

Abstract

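We propose a distributed architecture for deep reinforcement learning at scale that enables agents to learn effectively from orders of magnitude more data than previously possible. The algorithm decouples acting from learning: the actors interact with their own instances of the environment by selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory; the learner replays samples of experience and updates the neural network. The architecture relies on prioritized experience replay to focus on the most significant data generated by the actors. Our architecture substantially improves the state of the art on the Arcade Learning Environment, achieving better final performance in a fraction of the wall-clock training time.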
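The actor/learner split described in the abstract can be sketched in a few lines of Python. This is a minimal single-process illustration, not the authors' implementation: the two-armed bandit environment, the list of two Q-values standing in for the shared neural network, the names `PrioritizedReplay`, `actor_step`, `learner_step` and `run`, the per-actor epsilon schedule, and the choice to let each actor supply an initial priority are all assumptions made for the example. Actors and the learner run in turn inside one loop here, whereas a real deployment would run them concurrently, and the sketch omits importance-sampling corrections.

```python
import random


class PrioritizedReplay:
    """Proportional prioritized replay (list-based, O(n) sampling for clarity)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling
        self.items = []
        self.priorities = []
        self.next_index = 0         # slot to overwrite once the buffer is full

    def _scale(self, priority):
        # Small constant keeps zero-error transitions sampleable.
        return (abs(priority) + 1e-6) ** self.alpha

    def add(self, transition, priority):
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(self._scale(priority))
        else:
            self.items[self.next_index] = transition
            self.priorities[self.next_index] = self._scale(priority)
        self.next_index = (self.next_index + 1) % self.capacity

    def sample(self, batch_size, rng):
        # Draw indices with probability proportional to stored priority.
        indices = rng.choices(range(len(self.items)),
                              weights=self.priorities, k=batch_size)
        return indices, [self.items[i] for i in indices]

    def update(self, indices, new_priorities):
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = self._scale(p)


def actor_step(q_values, epsilon, rng):
    """Act epsilon-greedily on a toy bandit: arm 1 pays 1.0, arm 0 pays 0.0."""
    if rng.random() < epsilon:
        action = rng.randrange(2)
    else:
        action = max(range(2), key=lambda a: q_values[a])
    reward = float(action == 1)
    # The actor's own prediction error serves as the initial priority.
    return (action, reward), abs(reward - q_values[action])


def learner_step(q_values, replay, batch_size, lr, rng):
    """Replay a prioritized batch, update the shared values, refresh priorities."""
    indices, batch = replay.sample(batch_size, rng)
    new_priorities = []
    for action, reward in batch:
        q_values[action] += lr * (reward - q_values[action])
        new_priorities.append(abs(reward - q_values[action]))
    replay.update(indices, new_priorities)


def run(num_actors=4, rounds=200, seed=0):
    rng = random.Random(seed)
    q_values = [0.0, 0.0]                       # stand-in for the shared network
    replay = PrioritizedReplay(capacity=1000)   # shared replay memory
    epsilons = [0.4 ** (1 + i) for i in range(num_actors)]
    for _ in range(rounds):
        for eps in epsilons:
            # Each actor acts on a copy of the current shared parameters.
            transition, priority = actor_step(list(q_values), eps, rng)
            replay.add(transition, priority)
        learner_step(q_values, replay, batch_size=8, lr=0.1, rng=rng)
    return q_values
```

Calling `run()` returns the learned values for the two arms. Because rewarding transitions start with a large prediction error, the learner samples them far more often than the many uninformative zero-error transitions, which is the effect prioritization is meant to provide.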

Code Repositories

neka-nat/distributed_rl (PyTorch; mentioned on GitHub)
dannysdeng/dqn-pytorch (PyTorch; mentioned on GitHub)
Lyusungwon/apex_dqn_pytorch (PyTorch; mentioned on GitHub)
belepi93/Ape-X (PyTorch; mentioned on GitHub)
eladsar/rbi (PyTorch; mentioned on GitHub)
vwxyzjn/cleanrl (PyTorch; mentioned on GitHub)
cindycia/Atari-SAC-Discrete (PyTorch; mentioned on GitHub)
uber-research/ape-x (TensorFlow; mentioned on GitHub)
HussonnoisMaxence/RL_Algorithms (PyTorch; mentioned on GitHub)
mightypirate1/DRL-Tetris (TensorFlow; mentioned on GitHub)
ku2482/rltorch (PyTorch; mentioned on GitHub)
haje01/distper (PyTorch; mentioned on GitHub)

Benchmarks

Benchmark                                     Method  Score
atari-games-on-atari-2600-alien               Ape-X   40804.9
atari-games-on-atari-2600-amidar              Ape-X   8659.2
atari-games-on-atari-2600-assault             Ape-X   24559.4
atari-games-on-atari-2600-asterix             Ape-X   313305
atari-games-on-atari-2600-asteroids           Ape-X   155495.1
atari-games-on-atari-2600-atlantis            Ape-X   944497.5
atari-games-on-atari-2600-bank-heist          Ape-X   1716.4
atari-games-on-atari-2600-battle-zone         Ape-X   98895
atari-games-on-atari-2600-beam-rider          Ape-X   63305.2
atari-games-on-atari-2600-berzerk             Ape-X   57196.7
atari-games-on-atari-2600-bowling             Ape-X   17.6
atari-games-on-atari-2600-boxing              Ape-X   100
atari-games-on-atari-2600-breakout            Ape-X   800.9
atari-games-on-atari-2600-centipede           Ape-X   12974
atari-games-on-atari-2600-chopper-command     Ape-X   721851
atari-games-on-atari-2600-crazy-climber       Ape-X   320426
atari-games-on-atari-2600-defender            Ape-X   411943.5
atari-games-on-atari-2600-demon-attack        Ape-X   133086.4
atari-games-on-atari-2600-double-dunk         Ape-X   23.5
atari-games-on-atari-2600-enduro              Ape-X   2177.4
atari-games-on-atari-2600-fishing-derby       Ape-X   44.4
atari-games-on-atari-2600-freeway             Ape-X   33.7
atari-games-on-atari-2600-frostbite           Ape-X   9328.6
atari-games-on-atari-2600-gopher              Ape-X   120500.9
atari-games-on-atari-2600-gravitar            Ape-X   1598.5
atari-games-on-atari-2600-hero                Ape-X   31655.9
atari-games-on-atari-2600-ice-hockey          Ape-X   33
atari-games-on-atari-2600-james-bond          Ape-X   21322.5
atari-games-on-atari-2600-kangaroo            Ape-X   1416
atari-games-on-atari-2600-krull               Ape-X   11741.4
atari-games-on-atari-2600-kung-fu-master      Ape-X   97829.5
atari-games-on-atari-2600-montezumas-revenge  Ape-X   2500.0
atari-games-on-atari-2600-ms-pacman           Ape-X   11255.2
atari-games-on-atari-2600-name-this-game      Ape-X   25783.3
atari-games-on-atari-2600-phoenix             Ape-X   224491.1
atari-games-on-atari-2600-pitfall             Ape-X   -0.6
atari-games-on-atari-2600-pong                Ape-X   20.9
atari-games-on-atari-2600-private-eye         Ape-X   49.8
atari-games-on-atari-2600-qbert               Ape-X   302391.3
atari-games-on-atari-2600-river-raid          Ape-X   63864.4
atari-games-on-atari-2600-road-runner         Ape-X   222234.5
atari-games-on-atari-2600-robotank            Ape-X   73.8
atari-games-on-atari-2600-seaquest            Ape-X   392952.3
atari-games-on-atari-2600-skiing              Ape-X   -10789.9
atari-games-on-atari-2600-solaris             Ape-X   2892.9
atari-games-on-atari-2600-space-invaders      Ape-X   54681
atari-games-on-atari-2600-star-gunner         Ape-X   434342.5
atari-games-on-atari-2600-surround            Ape-X   7.1
atari-games-on-atari-2600-tennis              Ape-X   23.9
atari-games-on-atari-2600-time-pilot          Ape-X   87085
atari-games-on-atari-2600-tutankham           Ape-X   272.6
atari-games-on-atari-2600-up-and-down         Ape-X   401884.3
atari-games-on-atari-2600-venture             Ape-X   1813
atari-games-on-atari-2600-video-pinball       Ape-X   565163.2
atari-games-on-atari-2600-wizard-of-wor       Ape-X   46204
atari-games-on-atari-2600-yars-revenge        Ape-X   148594.8
atari-games-on-atari-2600-zaxxon              Ape-X   42285.5
