5 months ago

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

Tuomas Haarnoja; Aurick Zhou; Pieter Abbeel; Sergey Levine

Abstract

Model-free deep reinforcement learning (RL) algorithms have been demonstrated on a range of challenging decision making and control tasks. However, these methods typically suffer from two major challenges: very high sample complexity and brittle convergence properties, which necessitate meticulous hyperparameter tuning. Both of these challenges severely limit the applicability of such methods to complex, real-world domains. In this paper, we propose soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework. In this framework, the actor aims to maximize expected reward while also maximizing entropy. That is, to succeed at the task while acting as randomly as possible. Prior deep RL methods based on this framework have been formulated as Q-learning methods. By combining off-policy updates with a stable stochastic actor-critic formulation, our method achieves state-of-the-art performance on a range of continuous control benchmark tasks, outperforming prior on-policy and off-policy methods. Furthermore, we demonstrate that, in contrast to other off-policy algorithms, our approach is very stable, achieving very similar performance across different random seeds.
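As a rough illustration of the maximum entropy objective described in the abstract, the sketch below scores a trajectory by its discounted reward plus an entropy bonus, where the entropy is estimated from the log-probability of each sampled action. The function name `soft_return`, the temperature `alpha`, and the discount `gamma` are assumptions made for this example and do not come from this page; this is a minimal sketch of the objective, not the authors' implementation.

```python
import math


def soft_return(rewards, log_probs, alpha=0.2, gamma=0.99):
    """Discounted sum of r_t - alpha * log pi(a_t | s_t) over one trajectory.

    rewards   -- per-step rewards r_t
    log_probs -- per-step log pi(a_t | s_t) of the actions actually taken
    alpha     -- assumed temperature weighting the entropy bonus
    gamma     -- assumed discount factor
    """
    if len(rewards) != len(log_probs):
        raise ValueError("rewards and log_probs must have the same length")
    total = 0.0
    for t, (reward, log_prob) in enumerate(zip(rewards, log_probs)):
        # -log pi(a|s) is a single-sample estimate of the policy entropy.
        entropy_bonus = -alpha * log_prob
        total += (gamma ** t) * (reward + entropy_bonus)
    return total


# Hypothetical comparison: same rewards, two policies over 4 discrete actions.
rewards = [1.0, 1.0, 1.0]
deterministic = [0.0] * 3            # log(1.0): the action is always certain
uniform = [math.log(0.25)] * 3       # each action has probability 1/4

print(soft_return(rewards, deterministic))
print(soft_return(rewards, uniform))
```

With equal rewards, the uniform policy scores higher than the deterministic one, which is the sense in which the objective favors succeeding at the task while acting as randomly as possible.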

Code Repositories

baturaysaglam/la3p (pytorch)
quantumiracle/Popular-RL-Algorithms (pytorch)
SaminYeasar/off_policy_ac (pytorch)
kairproject/kair_algorithms_draft (pytorch)
ku2482/rljax (jax)
ShawK91/erl_paper_nips18 (pytorch)
Kaixhin/imitation-learning (pytorch)
dasgringuen/assetto_corsa_gym (pytorch)
kushagra06/SAC (pytorch)
timoklein/car_racer (pytorch)
MrSyee/pg-is-all-you-need
polixir/NeoRL
core-robotics-lab/icct (pytorch)
tmjeong1103/RL_with_RAY (pytorch)
flowersteam/rl_stats
thomashirtz/pytorch-soft-actor-critic (pytorch)
gijskoning/ReproducingCURL (pytorch)
AmmarFayad/Behavioral-Actor-Critic (pytorch)
ajaysub110/rl-pytorch (pytorch)
ac-93/soft-actor-critic (tf)
Steinheilig/Imbiss
araffin/sbx (jax)
fdcl-gwu/gym-rotor (pytorch)
lollcat/Soft-Actor-Critic (tf)
tarod13/SAC (pytorch)
hyunin-lee/ForecasterSAC (pytorch)
ku2482/soft-actor-critic.pytorch (pytorch)
rk1998/robot-sac (tf)
lanqingli1993/focal-iclr (pytorch)
haarnoja/sac (Official, tf)
nagisazj/idaq_public
ikostrikov/jax-rl (jax)
ku2482/gail-airl-ppo.pytorch (pytorch)
X3N4/car_racer (pytorch)
toshikwa/discor.pytorch (pytorch)
RLAgent/state-marginal-matching (pytorch)
donamin/llc (tf)
pranz24/pytorch-soft-actor-critic (pytorch)
cindycia/Atari-SAC-Discrete (pytorch)
sunfex/weighted-sac (pytorch)
andrejorsula/drl_grasping (pytorch)
FOCAL-ICLR/FOCAL-ICLR (pytorch)
roythuly/obac (pytorch)
learn-to-race/l2r
garyzyr001/rethinking-airl (pytorch)
lucadellalib/sac-beta (pytorch)
yining043/SAC-discrete (tf)
ku2482/discor.pytorch (pytorch)
facebookresearch/ReAgent (pytorch)
toshikwa/soft-actor-critic.pytorch (pytorch)
h-aboutalebi/SparceReward (pytorch)
tliu1997/rnac (pytorch)
trackmania-rl/tmrl (pytorch)
mxblr/DeepRLHockey (tf)
moreanp/csro (pytorch)
marload/DeepRL-TensorFlow2 (tf)
ku2482/rltorch (pytorch)
thomashirtz/soft-actor-critic (pytorch)
tilkb/thermoai (tf)
yhisaki/average-reward-drl (pytorch)
ccolas/rl_stats
yimingpeng/sac-master (tf)
MarsEleven/car_racer_RL (pytorch)
susan-amin/SparseBaseline1 (pytorch)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| continuous-control-on-lunar-lander-openai-gym | SAC | Score: 284.59±0.97 |
| omniverse-isaac-gym-on-allegrohand | SAC | Average Return: 296.49 |
| omniverse-isaac-gym-on-ant | SAC | Average Return: 7717.93 |
| omniverse-isaac-gym-on-anymal | SAC | Average Return: 11.87 |
| omniverse-isaac-gym-on-frankacabinet | SAC | Average Return: 1721.98 |
| omniverse-isaac-gym-on-humanoid | SAC | Average Return: 4028.31 |
| omniverse-isaac-gym-on-ingenuity | SAC | Average Return: 5301.99 |
| openai-gym-on-ant-v4 | SAC | Average Return: 5208.09 |
| openai-gym-on-halfcheetah-v4 | SAC | Average Return: 15836.04 |
| openai-gym-on-hopper-v4 | SAC | Average Return: 2882.56 |
| openai-gym-on-humanoid-v4 | SAC | Average Return: 6211.50 |
| openai-gym-on-walker2d-v4 | SAC | Average Return: 5745.27 |
