
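Abstract
Recent findings in biology show that intelligence arises not only from the connections between neurons but also from individual neurons, which shoulder more computational responsibility than previously assumed. This perspective is especially relevant to reinforcement learning, where environments are diverse and constantly changing, yet current mainstream approaches still rely primarily on static activation functions. In this work, we motivate why rational functions (rationals) are suitable as adaptable activation functions and why including them in neural networks is important. Inspired by recurrence in residual networks, we derive a condition under which rational units are closed under residual connections and formulate a naturally regularised variant: the recurrent-rational. Experiments show that equipping popular reinforcement learning algorithms with (recurrent-)rational activations consistently improves performance on Atari games, in particular turning simple DQN into an approach competitive with DDQN and Rainbow.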
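The closure property mentioned in the abstract can be illustrated with a few lines of plain Python. This is only a sketch of the underlying algebra, not the authors' implementation: the helper names (`polyval`, `rational`, `absorb_residual`) and the coefficients are hypothetical, and the denominator is chosen as 1 + x^2 so that it never reaches zero. For a rational R(x) = P(x)/Q(x), the residual sum x + R(x) equals (P(x) + x·Q(x))/Q(x), which is again a rational with the same denominator.

```python
from typing import List, Sequence


def polyval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate c0 + c1*x + c2*x^2 + ... with Horner's rule (ascending order)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def rational(x: float, p: Sequence[float], q: Sequence[float]) -> float:
    """Plain rational function R(x) = P(x) / Q(x); assumes Q(x) != 0."""
    return polyval(p, x) / polyval(q, x)


def absorb_residual(p: Sequence[float], q: Sequence[float]) -> List[float]:
    """Return the numerator of x + P(x)/Q(x), i.e. P(x) + x*Q(x)."""
    shifted_q = [0.0, *q]  # coefficients of x * Q(x)
    size = max(len(p), len(shifted_q))
    padded_p = list(p) + [0.0] * (size - len(p))
    padded_q = shifted_q + [0.0] * (size - len(shifted_q))
    return [a + b for a, b in zip(padded_p, padded_q)]


# Hypothetical coefficients, chosen only for illustration.
p = [0.0, 1.0, 0.5, 0.25]  # numerator of degree 3
q = [1.0, 0.0, 1.0]        # denominator 1 + x^2 (degree 2, never zero)

p_res = absorb_residual(p, q)

# The residual block x + R(x) and the single rational agree numerically.
for x in (-2.0, 0.0, 1.5):
    assert abs((x + rational(x, p, q)) - rational(x, p_res, q)) < 1e-12
```

Because x·Q(x) has degree n + 1 when Q has degree n, the new numerator keeps the degree of P only when P already has degree at least n + 1. In the example above, the numerator stays at degree 3 after the residual term is absorbed.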
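Code repositories
| Repository | Tags |
|---|---|
| ml-research/rational_activations | Official, PyTorch, mentioned in GitHub |
| ml-research/rational_rl | Official, PyTorch, mentioned in GitHub |
| k4ntz/activation-functions | Official, PyTorch |
| ml-research/rational_sl | Official, PyTorch, mentioned in GitHub |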
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| atari-games-on-atari-2600-asterix | Recurrent Rational DQN Average | Score: 12621 |
| atari-games-on-atari-2600-asterix | Rational DQN Average | Score: 18109 |
| atari-games-on-atari-2600-battle-zone | Rational DQN Average | Score: 23403 |
| atari-games-on-atari-2600-battle-zone | Recurrent Rational DQN Average | Score: 25749 |
| atari-games-on-atari-2600-breakout | Recurrent Rational DQN Average | Score: 336 |
| atari-games-on-atari-2600-breakout | Rational DQN Average | Score: 316 |
| atari-games-on-atari-2600-enduro | Rational DQN Average | Score: 1043 |
| atari-games-on-atari-2600-enduro | Recurrent Rational DQN Average | Score: 957 |
| atari-games-on-atari-2600-james-bond | Rational DQN Average | Score: 1122 |
| atari-games-on-atari-2600-james-bond | Recurrent Rational DQN Average | Score: 1137 |
| atari-games-on-atari-2600-kangaroo | Rational DQN Average | Score: 2941 |
| atari-games-on-atari-2600-kangaroo | Recurrent Rational DQN Average | Score: 5266 |
| atari-games-on-atari-2600-pong | Recurrent Rational DQN Average | Score: 18.13 |
| atari-games-on-atari-2600-pong | Rational DQN Average | Score: 18.04 |
| atari-games-on-atari-2600-qbert | Rational DQN Average | Score: 14436 |
| atari-games-on-atari-2600-qbert | Recurrent Rational DQN Average | Score: 14080 |
| atari-games-on-atari-2600-seaquest | Recurrent Rational DQN Average | Score: 7460 |
| atari-games-on-atari-2600-seaquest | Rational DQN Average | Score: 6603 |
| atari-games-on-atari-2600-skiing | Rational DQN Average | Score: -23487 |
| atari-games-on-atari-2600-skiing | Recurrent Rational DQN Average | Score: -23582 |
| atari-games-on-atari-2600-space-invaders | Recurrent Rational DQN Average | Score: 1395 |
| atari-games-on-atari-2600-space-invaders | Rational DQN Average | Score: 650 |
| atari-games-on-atari-2600-tennis | Recurrent Rational DQN Average | Score: 20.6 |
| atari-games-on-atari-2600-tennis | Rational DQN Average | Score: 20.5 |
| atari-games-on-atari-2600-time-pilot | Rational DQN Average | Score: 17632 |
| atari-games-on-atari-2600-time-pilot | Recurrent Rational DQN Average | Score: 13261 |
| atari-games-on-atari-2600-tutankham | Recurrent Rational DQN Average | Score: 184 |
| atari-games-on-atari-2600-tutankham | Rational DQN Average | Score: 179 |
| atari-games-on-atari-2600-video-pinball | Rational DQN Average | Score: 149712 |
| atari-games-on-atari-2600-video-pinball | Recurrent Rational DQN Average | Score: 86942 |