
Implicit Quantile Networks for Distributional Reinforcement Learning

Abstract

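In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN (Deep Q-Network). We achieve this by using quantile regression to approximate the full quantile function of the state-action return distribution. By reparameterizing a distribution over the sample space, this yields an implicitly defined return distribution and gives rise to a large class of risk-sensitive policies. We demonstrate improved performance on the 57 Atari 2600 games in the ALE (Arcade Learning Environment), and use the algorithm's implicitly defined distributions to study the effects of risk-sensitive policies in Atari games.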
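The two ideas in the abstract, quantile regression over sampled quantile fractions and risk-sensitive action selection, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the commonly used quantile Huber loss and a CVaR-style distortion that rescales each sampled fraction tau to eta * tau. All function names are hypothetical, and `toy_quantile_fn` is a hand-written stand-in for a trained network.

```python
import numpy as np


def quantile_huber_loss(td_errors, taus, kappa=1.0):
    """Quantile Huber loss: mean over target samples, summed over taus.

    td_errors: shape (N, N_prime); entry [i, j] is the TD error between
               target sample j and the prediction at fraction taus[i].
    taus:      shape (N,); quantile fractions in (0, 1).
    """
    abs_err = np.abs(td_errors)
    # Huber loss: quadratic near zero, linear beyond kappa.
    huber = np.where(abs_err <= kappa,
                     0.5 * td_errors ** 2,
                     kappa * (abs_err - 0.5 * kappa))
    # Asymmetric weight |tau - 1{error < 0}| makes this a quantile regression.
    weight = np.abs(taus[:, None] - (td_errors < 0).astype(float))
    return (weight * huber / kappa).mean(axis=1).sum()


def cvar_distortion(taus, eta):
    """Assumed risk-averse distortion: squeeze taus into [0, eta]."""
    return eta * taus


def greedy_action(quantile_fn, taus, distortion=lambda t: t):
    """Pick the action with the largest mean quantile value at distorted taus."""
    q_values = quantile_fn(distortion(taus)).mean(axis=0)  # (num_actions,)
    return int(np.argmax(q_values))


def toy_quantile_fn(taus):
    """Hypothetical quantile function for two actions (not a real network).

    Action 0 has Uniform(0, 10) returns, action 1 has Uniform(4, 5) returns.
    Returns an array of shape (K, 2).
    """
    taus = np.asarray(taus, dtype=float)
    return np.stack([10.0 * taus, 4.0 + taus], axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    taus = rng.uniform(size=1024)
    print("risk-neutral action:", greedy_action(toy_quantile_fn, taus))
    print("CVaR(0.25) action:",
          greedy_action(toy_quantile_fn, taus,
                        lambda t: cvar_distortion(t, 0.25)))
```

In this toy setup, action 0 has the higher mean return (5 versus 4.5) but a much worse lower tail, so a risk-neutral policy should prefer action 0 while the CVaR-distorted policy should prefer action 1. Changing the distortion function is the only difference between the two policies, which is the sense in which a learned quantile function induces a whole family of risk-sensitive policies.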

Code Repositories

Repository                      Framework  Source
BY571/IQN                       pytorch    Mentioned on GitHub
pihey1995/DistributionalRL      pytorch    Mentioned on GitHub
ku2482/rljax                    jax        Mentioned on GitHub
sjYoondeltar/myRL_example       tf         Mentioned on GitHub
marload/dist-rl-tf2             tf         Mentioned on GitHub
chainer/chainerrl               pytorch    Mentioned on GitHub
sjYoondeltar/IQN_example        tf         Mentioned on GitHub
ACampero/dopamine               tf         Mentioned on GitHub
KatyNTsachi/Hierarchical-RL     tf         Mentioned on GitHub
ku2482/fqf-iqn-qrdqn.pytorch    pytorch    Mentioned on GitHub
Kchu/DeepRL_CK                  pytorch    Mentioned on GitHub
robinzixuan/IQN_Agent           pytorch    Mentioned on GitHub
marload/DistRL-TensorFlow2      tf         Mentioned on GitHub
V0LsTeR/DQN_heap                tf         Mentioned on GitHub

Benchmarks

Benchmark                                     Method  Score
atari-games-on-atari-2600-alien               IQN     7022
atari-games-on-atari-2600-amidar              IQN     2946
atari-games-on-atari-2600-assault             IQN     29091
atari-games-on-atari-2600-asterix             IQN     342016
atari-games-on-atari-2600-asteroids           IQN     2898
atari-games-on-atari-2600-atlantis            IQN     978200
atari-games-on-atari-2600-bank-heist          IQN     1416
atari-games-on-atari-2600-battle-zone         IQN     42244
atari-games-on-atari-2600-beam-rider          IQN     42776
atari-games-on-atari-2600-berzerk             IQN     1053
atari-games-on-atari-2600-bowling             IQN     86.5
atari-games-on-atari-2600-boxing              IQN     99.8
atari-games-on-atari-2600-breakout            IQN     734
atari-games-on-atari-2600-centipede           IQN     11561
atari-games-on-atari-2600-chopper-command     IQN     16836
atari-games-on-atari-2600-crazy-climber       IQN     179082
atari-games-on-atari-2600-defender            IQN     53537
atari-games-on-atari-2600-demon-attack        IQN     128580
atari-games-on-atari-2600-double-dunk         IQN     5.6
atari-games-on-atari-2600-enduro              IQN     2359
atari-games-on-atari-2600-fishing-derby       IQN     33.8
atari-games-on-atari-2600-freeway             IQN     34
atari-games-on-atari-2600-frostbite           IQN     4324
atari-games-on-atari-2600-gopher              IQN     118365
atari-games-on-atari-2600-gravitar            IQN     911
atari-games-on-atari-2600-hero                IQN     28386
atari-games-on-atari-2600-ice-hockey          IQN     0.2
atari-games-on-atari-2600-james-bond          IQN     35108
atari-games-on-atari-2600-kangaroo            IQN     15487
atari-games-on-atari-2600-krull               IQN     10707
atari-games-on-atari-2600-kung-fu-master      IQN     73512
atari-games-on-atari-2600-montezumas-revenge  IQN     0
atari-games-on-atari-2600-ms-pacman           IQN     6349
atari-games-on-atari-2600-name-this-game      IQN     22682
atari-games-on-atari-2600-phoenix             IQN     56599
atari-games-on-atari-2600-pitfall             IQN     0
atari-games-on-atari-2600-pong                IQN     21
atari-games-on-atari-2600-private-eye         IQN     200
atari-games-on-atari-2600-qbert               IQN     25750
atari-games-on-atari-2600-river-raid          IQN     17765
atari-games-on-atari-2600-road-runner         IQN     57900
atari-games-on-atari-2600-robotank            IQN     62.5
atari-games-on-atari-2600-seaquest            IQN     30140
atari-games-on-atari-2600-skiing              IQN     -9289
atari-games-on-atari-2600-solaris             IQN     8007
atari-games-on-atari-2600-space-invaders      IQN     28888
atari-games-on-atari-2600-star-gunner         IQN     74677
atari-games-on-atari-2600-surround            IQN     9.4
atari-games-on-atari-2600-tennis              IQN     23.6
atari-games-on-atari-2600-time-pilot          IQN     12236
atari-games-on-atari-2600-tutankham           IQN     293
atari-games-on-atari-2600-up-and-down         IQN     88148
atari-games-on-atari-2600-venture             IQN     1318
atari-games-on-atari-2600-video-pinball       IQN     698045
atari-games-on-atari-2600-wizard-of-wor       IQN     31190
atari-games-on-atari-2600-yars-revenge        IQN     28379
atari-games-on-atari-2600-zaxxon              IQN     21772
