A Distributional Perspective on Reinforcement Learning

Marc G. Bellemare; Will Dabney; Rémi Munos

Abstract

In this paper we argue for the fundamental importance of the value distribution: the distribution of the random return received by a reinforcement learning agent. This is in contrast to the common approach to reinforcement learning which models the expectation of this return, or value. Although there is an established body of literature studying the value distribution, thus far it has always been used for a specific purpose such as implementing risk-aware behaviour. We begin with theoretical results in both the policy evaluation and control settings, exposing a significant distributional instability in the latter. We then use the distributional perspective to design a new algorithm which applies Bellman's equation to the learning of approximate value distributions. We evaluate our algorithm using the suite of games from the Arcade Learning Environment. We obtain both state-of-the-art results and anecdotal evidence demonstrating the importance of the value distribution in approximate reinforcement learning. Finally, we combine theoretical and empirical evidence to highlight the ways in which the value distribution impacts learning in the approximate setting.
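The abstract does not spell out how Bellman's equation is applied to an approximate value distribution. As a rough illustration only, the sketch below shows one way such an update can look when the distribution is represented as probabilities over a fixed grid of return values ("atoms"), which is the representation the name C51 in the benchmark entries refers to (51 atoms). The function names, the parameters `v_min` and `v_max`, and the linear-interpolation projection are assumptions made for this example, not a transcription of the authors' code.

```python
import math


def project_distribution(probs, reward, gamma, v_min, v_max, done=False):
    """Apply a sample Bellman update to a categorical return distribution.

    Each atom z_j is mapped to reward + gamma * z_j (or just reward on a
    terminal step), clipped to [v_min, v_max], and its probability is split
    between the two nearest atoms of the original fixed grid.
    Illustrative sketch: names and signature are hypothetical.
    """
    n = len(probs)
    delta = (v_max - v_min) / (n - 1)  # spacing between neighbouring atoms
    projected = [0.0] * n
    for j, p in enumerate(probs):
        z = v_min + j * delta
        target = reward if done else reward + gamma * z
        target = min(max(target, v_min), v_max)
        # Fractional grid index of the target, guarded against float drift.
        b = min(max((target - v_min) / delta, 0.0), float(n - 1))
        lower = int(math.floor(b))
        upper = min(lower + 1, n - 1)
        frac = b - lower
        projected[lower] += p * (1.0 - frac)
        projected[upper] += p * frac
    return projected


def expected_value(probs, v_min, v_max):
    """Mean of the categorical distribution, i.e. the usual scalar value."""
    n = len(probs)
    delta = (v_max - v_min) / (n - 1)
    return sum(p * (v_min + j * delta) for j, p in enumerate(probs))
```

For example, with five atoms at -2, -1, 0, 1, 2 and all probability on 0, a reward of 0.5 with `gamma=1.0` moves the mass to 0.5, which falls halfway between the atoms 0 and 1, so each receives probability 0.5. Taking `expected_value` of the result recovers the scalar value that an expectation-based method would track directly.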

Code Repositories

facebookresearch/Horizon (pytorch)
eric-yim/fin_map (tf)
pihey1995/DistributionalRL (pytorch)
marload/dist-rl-tf2 (tf)
guillaumeboniface/bananaland (pytorch)
chainer/chainerrl (pytorch)
Abdelhamid-bouzid/Distributional-RL (pytorch)
shuli0808/DQN (pytorch)
BY571/DQN-Atari-Agents (pytorch)
mindspore-courses/Rainbow-MindSpore (mindspore)
NervanaSystems/coach (tf)
facebookresearch/ReAgent (pytorch)
Kchu/DeepRL_CK (pytorch)
marload/DistRL-TensorFlow2 (tf)

Benchmarks

Benchmark                                   Methodology  Metrics
atari-games-on-atari-2600-alien             C51 noop     Score: 3166.0
atari-games-on-atari-2600-amidar            C51 noop     Score: 1735.0
atari-games-on-atari-2600-assault           C51 noop     Score: 7203.0
atari-games-on-atari-2600-asterix           C51 noop     Score: 406211
atari-games-on-atari-2600-asteroids         C51 noop     Score: 1516.0
atari-games-on-atari-2600-atlantis          C51 noop     Score: 841075.0
atari-games-on-atari-2600-bank-heist        C51 noop     Score: 976.0
atari-games-on-atari-2600-battle-zone       C51 noop     Score: 28742.0
atari-games-on-atari-2600-beam-rider        C51 noop     Score: 14074.0
atari-games-on-atari-2600-berzerk           C51 noop     Score: 1645.0
atari-games-on-atari-2600-bowling           C51 noop     Score: 81.8
atari-games-on-atari-2600-boxing            C51 noop     Score: 97.8
atari-games-on-atari-2600-breakout          C51 noop     Score: 748.0
atari-games-on-atari-2600-centipede         C51 noop     Score: 9646.0
atari-games-on-atari-2600-chopper-command   C51 noop     Score: 15600.0
atari-games-on-atari-2600-crazy-climber     C51 noop     Score: 179877.0
atari-games-on-atari-2600-demon-attack      C51 noop     Score: 130955.0
atari-games-on-atari-2600-double-dunk       C51 noop     Score: 2.5
atari-games-on-atari-2600-enduro            C51 noop     Score: 3454.0
atari-games-on-atari-2600-fishing-derby     C51 noop     Score: 8.9
atari-games-on-atari-2600-freeway           C51 noop     Score: 33.9
atari-games-on-atari-2600-frostbite         C51 noop     Score: 3965.0
atari-games-on-atari-2600-gopher            C51 noop     Score: 33641.0
atari-games-on-atari-2600-gravitar          C51 noop     Score: 440.0
atari-games-on-atari-2600-hero              C51 noop     Score: 38874
atari-games-on-atari-2600-ice-hockey        C51 noop     Score: -3.5
atari-games-on-atari-2600-james-bond        C51 noop     Score: 1909.0
atari-games-on-atari-2600-kangaroo          C51 noop     Score: 12853.0
atari-games-on-atari-2600-krull             C51 noop     Score: 9735.0
atari-games-on-atari-2600-kung-fu-master    C51 noop     Score: 48192.0
atari-games-on-atari-2600-ms-pacman         C51 noop     Score: 3415.0
atari-games-on-atari-2600-name-this-game    C51 noop     Score: 12542.0
atari-games-on-atari-2600-pong              C51 noop     Score: 20.9
atari-games-on-atari-2600-private-eye       C51 noop     Score: 15095.0
atari-games-on-atari-2600-qbert             C51 noop     Score: 23784
atari-games-on-atari-2600-river-raid        C51 noop     Score: 17322.0
atari-games-on-atari-2600-road-runner       C51 noop     Score: 55839.0
atari-games-on-atari-2600-robotank          C51 noop     Score: 52.3
atari-games-on-atari-2600-seaquest          C51 noop     Score: 266434.0
atari-games-on-atari-2600-space-invaders    C51 noop     Score: 5747.0
atari-games-on-atari-2600-star-gunner       C51 noop     Score: 49095.0
atari-games-on-atari-2600-tennis            C51 noop     Score: 23.1
atari-games-on-atari-2600-time-pilot        C51 noop     Score: 8329.0
atari-games-on-atari-2600-tutankham         C51 noop     Score: 280.0
atari-games-on-atari-2600-up-and-down       C51 noop     Score: 15612.0
atari-games-on-atari-2600-venture           C51 noop     Score: 1520.0
atari-games-on-atari-2600-video-pinball     C51 noop     Score: 949604.0
atari-games-on-atari-2600-wizard-of-wor     C51 noop     Score: 9300.0
atari-games-on-atari-2600-zaxxon            C51 noop     Score: 10513.0