Adaptive Rational Activations to Boost Deep Reinforcement Learning

Abstract

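Recent insights from biology show that intelligence emerges not only from the connections between neurons, but also that individual neurons shoulder far more computational responsibility than previously anticipated. This perspective is especially important in diverse, constantly changing reinforcement learning environments, yet current mainstream approaches still rely primarily on static activation functions. In this work, we explain why rational functions (rationals) are suitable as adaptable activation functions and why including them in neural networks is crucial. Inspired by recurrence in residual networks, we derive a condition under which rational units are closed under residual connections, and formulate a naturally regularised variant: the recurrent-rational. Experiments show that equipping popular reinforcement learning algorithms with (recurrent-)rational activations yields consistent improvements on Atari games, and in particular turns simple DQN into a solid approach that is competitive with DDQN and Rainbow.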
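To make the idea of a rational unit and its behaviour under a residual connection concrete, here is a minimal sketch in plain Python. It assumes a rational unit of the form R(x) = P(x) / Q(x), with polynomial P of degree m and polynomial Q of degree n, and uses the identity R(x) + x = (P(x) + x·Q(x)) / Q(x). Basic algebra suggests the result keeps the same degrees whenever m > n; the exact condition stated in the paper may be phrased differently. The helper names (`polyval`, `rational`, `residual_numerator`) and all coefficient values are hypothetical and chosen for illustration only; this is not the API of ml-research/rational_activations.

```python
from typing import List, Sequence


def polyval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate sum(coeffs[k] * x**k) with Horner's scheme (lowest degree first)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def rational(x: float, a: Sequence[float], q: Sequence[float]) -> float:
    """Rational unit R(x) = P(x) / Q(x).

    a holds the coefficients of P, q those of Q. The caller is assumed to
    pick a Q without real roots (illustrative simplification), so there is
    no division by zero.
    """
    return polyval(a, x) / polyval(q, x)


def residual_numerator(a: Sequence[float], q: Sequence[float]) -> List[float]:
    """Coefficients of P(x) + x * Q(x), so that R(x) + x = (P + x*Q)(x) / Q(x)."""
    shifted = [0.0] + list(q)  # multiplying Q by x shifts every coefficient up
    size = max(len(a), len(shifted))
    padded_a = list(a) + [0.0] * (size - len(a))
    padded_s = shifted + [0.0] * (size - len(shifted))
    return [pa + ps for pa, ps in zip(padded_a, padded_s)]


if __name__ == "__main__":
    # Made-up coefficients: P has degree m = 3, Q(x) = 1 + 0.5 x^2 has degree n = 2.
    a = [0.1, 0.5, 0.0, 0.2]
    q = [1.0, 0.0, 0.5]

    a_res = residual_numerator(a, q)
    # Because m > n, the new numerator has the same number of coefficients as P.
    print(len(a_res) == len(a))

    # The residual output R(x) + x matches the rational built from a_res and q.
    for x in (-3.0, -0.5, 0.0, 1.0, 2.0):
        print(x, rational(x, a, q) + x, rational(x, a_res, q))
```

In this sketch the residual connection is absorbed into the numerator coefficients, so a learnable rational with m > n could in principle represent "activation plus skip connection" by itself. Practical implementations usually also guard the denominator against zeros, which is left out here for brevity.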

Code repositories

ml-research/rational_activations (official, PyTorch, mentioned on GitHub)
ml-research/rational_rl (official, PyTorch, mentioned on GitHub)
ml-research/rational_sl (official, PyTorch, mentioned on GitHub)

Benchmarks

Metric: average score per method.

Benchmark                                    Rational DQN    Recurrent Rational DQN
atari-games-on-atari-2600-asterix            18109           12621
atari-games-on-atari-2600-battle-zone        23403           25749
atari-games-on-atari-2600-breakout           316             336
atari-games-on-atari-2600-enduro             1043            957
atari-games-on-atari-2600-james-bond         1122            1137
atari-games-on-atari-2600-kangaroo           2941            5266
atari-games-on-atari-2600-pong               18.04           18.13
atari-games-on-atari-2600-qbert              14436           14080
atari-games-on-atari-2600-seaquest           6603            7460
atari-games-on-atari-2600-skiing             -23487          -23582
atari-games-on-atari-2600-space-invaders     650             1395
atari-games-on-atari-2600-tennis             20.5            20.6
atari-games-on-atari-2600-time-pilot         17632           13261
atari-games-on-atari-2600-tutankham          179             184
atari-games-on-atari-2600-video-pinball      149712          86942
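The per-game scores listed above can be tallied to see which variant has the higher average score in each game. The following is a small sketch; the dictionary layout and the `tally` helper are hypothetical conveniences, and the numbers are copied from the listing. A win count ignores both the size of each margin and any variance across runs, so it is only a rough summary.

```python
from typing import Dict, List, Tuple

# game -> (Rational DQN average score, Recurrent Rational DQN average score)
SCORES: Dict[str, Tuple[float, float]] = {
    "asterix": (18109, 12621),
    "battle-zone": (23403, 25749),
    "breakout": (316, 336),
    "enduro": (1043, 957),
    "james-bond": (1122, 1137),
    "kangaroo": (2941, 5266),
    "pong": (18.04, 18.13),
    "qbert": (14436, 14080),
    "seaquest": (6603, 7460),
    "skiing": (-23487, -23582),
    "space-invaders": (650, 1395),
    "tennis": (20.5, 20.6),
    "time-pilot": (17632, 13261),
    "tutankham": (179, 184),
    "video-pinball": (149712, 86942),
}


def tally(scores: Dict[str, Tuple[float, float]]) -> Tuple[List[str], List[str]]:
    """Return (games where Rational DQN is higher, games where Recurrent is higher)."""
    rational_wins = [g for g, (r, rr) in scores.items() if r > rr]
    recurrent_wins = [g for g, (r, rr) in scores.items() if rr > r]
    return rational_wins, recurrent_wins


if __name__ == "__main__":
    rational_wins, recurrent_wins = tally(SCORES)
    print("Rational DQN higher:", len(rational_wins), rational_wins)
    print("Recurrent Rational DQN higher:", len(recurrent_wins), recurrent_wins)
```

On these numbers, Recurrent Rational DQN has the higher average score in 9 of the 15 games and Rational DQN in the remaining 6. Higher is better for every game here, including Skiing, where scores are negative.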
