DNA: Proximal Policy Optimization with a Dual Network Architecture

Matthew Aitchison, Penny Sweetser

Abstract

This paper explores the problem of simultaneously learning a value function and policy in deep actor-critic reinforcement learning models. We find that the common practice of learning these functions jointly is sub-optimal, due to an order-of-magnitude difference in noise levels between the two tasks. Instead, we show that learning these tasks independently, but with a constrained distillation phase, significantly improves performance. Furthermore, we find that the policy gradient noise level can be decreased by using a lower-variance return estimate, whereas the value learning noise level decreases with a lower-bias estimate. Together, these insights inform an extension to Proximal Policy Optimization that we call Dual Network Architecture (DNA), which significantly outperforms its predecessor. DNA also exceeds the performance of the popular Rainbow DQN algorithm on four of the five environments tested, even under more difficult stochastic control settings.
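To illustrate the bias/variance distinction in the abstract, here is a minimal sketch that computes return estimates with Generalized Advantage Estimation (GAE) under two different λ settings. The abstract does not name the estimator, so the use of GAE is an assumption here, and the function name `gae_advantages`, the toy trajectory, and the values λ = 0.8 and λ = 0.95 are all hypothetical and chosen only for illustration. In GAE, a smaller λ leans more on the learned value estimates (lower variance, higher bias), while a larger λ leans more on observed rewards (lower bias, higher variance).

```python
def gae_advantages(rewards, values, gamma, lam):
    """GAE over one trajectory segment with no episode boundary inside it.

    rewards: list of T rewards
    values:  list of T + 1 value estimates (the last one bootstraps the tail)
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Accumulate discounted TD errors from the end of the segment backwards.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


# Toy data (hypothetical, for illustration only).
rewards = [1.0, 0.0, 2.0]
values = [0.5, 0.4, 0.3, 0.2]
gamma = 0.99

# Lower lambda: lower-variance estimate, here used for the policy gradient.
policy_adv = gae_advantages(rewards, values, gamma, lam=0.8)

# Higher lambda: lower-bias estimate, here used to build value targets.
value_adv = gae_advantages(rewards, values, gamma, lam=0.95)
value_targets = [a + v for a, v in zip(value_adv, values[:-1])]
```

With separate policy and value networks, each network can be trained against its own estimate, which is one way the two observations in the abstract could be combined. The distillation phase is not shown.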

Code Repositories

maitchison/PPO (official, PyTorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| atari-games-on-atari-2600-alien | DNA | Score: 5021 |
| atari-games-on-atari-2600-amidar | DNA | Score: 1025 |
| atari-games-on-atari-2600-assault | DNA | Score: 16293 |
| atari-games-on-atari-2600-asterix | DNA | Score: 323965 |
| atari-games-on-atari-2600-asteroids | DNA | Score: 165973 |
| atari-games-on-atari-2600-atlantis | DNA | Score: 932559 |
| atari-games-on-atari-2600-bank-heist | DNA | Score: 1286 |
| atari-games-on-atari-2600-battle-zone | DNA | Score: 71003 |
| atari-games-on-atari-2600-beam-rider | DNA | Score: 20393 |
| atari-games-on-atari-2600-berzerk | DNA | Score: 19789 |
| atari-games-on-atari-2600-bowling | DNA | Score: 181 |
| atari-games-on-atari-2600-boxing | DNA | Score: 99.9 |
| atari-games-on-atari-2600-breakout | DNA | Score: 626 |
| atari-games-on-atari-2600-centipede | DNA | Score: 100194 |
| atari-games-on-atari-2600-chopper-command | DNA | Score: 31181 |
| atari-games-on-atari-2600-crazy-climber | DNA | Score: 131623 |
| atari-games-on-atari-2600-defender | DNA | Score: 152768 |
| atari-games-on-atari-2600-demon-attack | DNA | Score: 97909 |
| atari-games-on-atari-2600-double-dunk | DNA | Score: -1.3 |
| atari-games-on-atari-2600-enduro | DNA | Score: 2059 |
| atari-games-on-atari-2600-fishing-derby | DNA | Score: 57.4 |
| atari-games-on-atari-2600-freeway | DNA | Score: 33 |
| atari-games-on-atari-2600-frostbite | DNA | Score: 320 |
| atari-games-on-atari-2600-gopher | DNA | Score: 80104 |
| atari-games-on-atari-2600-gravitar | DNA | Score: 2190 |
| atari-games-on-atari-2600-hero | DNA | Score: 24904 |
| atari-games-on-atari-2600-ice-hockey | DNA | Score: 7.2 |
| atari-games-on-atari-2600-james-bond | DNA | Score: 14102 |
| atari-games-on-atari-2600-kangaroo | DNA | Score: 14373 |
| atari-games-on-atari-2600-krull | DNA | Score: 10956 |
| atari-games-on-atari-2600-kung-fu-master | DNA | Score: 110962 |
| atari-games-on-atari-2600-montezumas-revenge | DNA | Score: 0 |
| atari-games-on-atari-2600-ms-pacman | DNA | Score: 5894 |
| atari-games-on-atari-2600-name-this-game | DNA | Score: 20226 |
| atari-games-on-atari-2600-phoenix | DNA | Score: 391085 |
| atari-games-on-atari-2600-pitfall | DNA | Score: 0 |
| atari-games-on-atari-2600-pong | DNA | Score: 19.7 |
| atari-games-on-atari-2600-private-eye | DNA | Score: 100 |
| atari-games-on-atari-2600-qbert | DNA | Score: 52398 |
| atari-games-on-atari-2600-river-raid | DNA | Score: 16789 |
| atari-games-on-atari-2600-road-runner | DNA | Score: 61713 |
| atari-games-on-atari-2600-robotank | DNA | Score: 64.8 |
| atari-games-on-atari-2600-seaquest | DNA | Score: 4146 |
| atari-games-on-atari-2600-skiing | DNA | Score: -29974 |
| atari-games-on-atari-2600-solaris | DNA | Score: 2225 |
| atari-games-on-atari-2600-space-invaders | DNA | Score: 2731 |
| atari-games-on-atari-2600-star-gunner | DNA | Score: 104125 |
| atari-games-on-atari-2600-surround | DNA | Score: 5.3 |
| atari-games-on-atari-2600-tennis | DNA | Score: -10.9 |
| atari-games-on-atari-2600-time-pilot | DNA | Score: 12774 |
| atari-games-on-atari-2600-tutankham | DNA | Score: 127 |
| atari-games-on-atari-2600-up-and-down | DNA | Score: 291934 |
| atari-games-on-atari-2600-venture | DNA | Score: 0 |
| atari-games-on-atari-2600-video-pinball | DNA | Score: 505392 |
| atari-games-on-atari-2600-wizard-of-wor | DNA | Score: 20851 |
| atari-games-on-atari-2600-yars-revenge | DNA | Score: 564513 |
| atari-games-on-atari-2600-zaxxon | DNA | Score: 22588 |
