3 months ago

A Unified Framework for Factorizing Distributional Value Functions for Multi-Agent Reinforcement Learning

Wei-Fang Sun Cheng-Kuang Lee Simon See Chun-Yi Lee

Abstract

In fully cooperative multi-agent reinforcement learning (MARL) settings, environments are highly stochastic due to the partial observability of each agent and the continuously changing policies of other agents. To address these issues, we propose a unified framework, called DFAC, for integrating distributional RL with value function factorization methods. This framework generalizes expected value function factorization methods to enable the factorization of return distributions. To validate DFAC, we first demonstrate its ability to factorize the value functions of a simple matrix game with stochastic rewards. We then perform experiments on all Super Hard maps of the StarCraft Multi-Agent Challenge and on six self-designed Ultra Hard maps, showing that DFAC outperforms a number of baselines.
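
To make the idea of factorizing a return distribution rather than an expected value more concrete, here is a minimal sketch. It is not the authors' implementation: it assumes an additive (VDN-style) factorization and assumes each agent's return distribution is represented by quantile values at the same equally spaced quantile fractions. The function names and the numbers are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def factorized_joint_quantiles(agent_quantiles):
    """Sum per-agent quantile values at each shared quantile fraction.

    Assumption: every agent reports quantile values at the same fractions,
    so position k in each list refers to the same fraction.
    """
    n = len(agent_quantiles[0])
    if any(len(q) != n for q in agent_quantiles):
        raise ValueError("all agents must use the same number of quantiles")
    return [sum(q[k] for q in agent_quantiles) for k in range(n)]


def expected_value(quantiles):
    """Mean of a distribution given by equally weighted quantile values."""
    return fmean(quantiles)


# Hypothetical per-agent return quantiles for one chosen action each.
agent_1 = [0.0, 1.0, 2.0, 5.0]
agent_2 = [-1.0, 0.0, 0.5, 1.5]

joint = factorized_joint_quantiles([agent_1, agent_2])
print(joint)
print(expected_value(joint), expected_value(agent_1) + expected_value(agent_2))
```

Under these assumptions, the mean of the joint quantile values equals the sum of the per-agent means, so the expected-value factorization is recovered as a special case while the spread of the joint return is also represented.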

Code Repositories

j3soon/dfac-extended (official, PyTorch)

Benchmarks

Benchmark                           Methodology  Average Score  Median Win Rate
smac-on-smac-26m-vs-30m             VDN          16.69          23.01
smac-on-smac-26m-vs-30m             QMIX         18.23          62.78
smac-on-smac-26m-vs-30m             QPLEX        18.66          78.12
smac-on-smac-26m-vs-30m             DMIX         19.17          81.82
smac-on-smac-26m-vs-30m             DDN          18.49          67.90
smac-on-smac-26m-vs-30m             DPLEX        18.49          59.38
smac-on-smac-27m-vs-30m             QPLEX        19.33          78.12
smac-on-smac-27m-vs-30m             DPLEX        19.62          90.62
smac-on-smac-3s5z-vs-3s6z-1         DPLEX        20.27          90.62
smac-on-smac-3s5z-vs-3s6z-1         QPLEX        20.42          84.38
smac-on-smac-3s5z-vs-4s6z           QMIX         13.09          -
smac-on-smac-3s5z-vs-4s6z           DPLEX        14.99          -
smac-on-smac-3s5z-vs-4s6z           QPLEX        13.60          -
smac-on-smac-3s5z-vs-4s6z           DDN          19.65          89.77
smac-on-smac-3s5z-vs-4s6z           DMIX         18.61          83.52
smac-on-smac-3s5z-vs-4s6z           VDN          17.16          47.16
smac-on-smac-6h-vs-8z-1             QPLEX        15.95          -
smac-on-smac-6h-vs-8z-1             DPLEX        17.88          43.75
smac-on-smac-6h-vs-9z               DPLEX        14.84          -
smac-on-smac-6h-vs-9z               DMIX         13.73          -
smac-on-smac-6h-vs-9z               VDN          13.57          -
smac-on-smac-6h-vs-9z               QPLEX        13.86          -
smac-on-smac-6h-vs-9z               DDN          16.00          0.28
smac-on-smac-6h-vs-9z               QMIX         12.37          1.14
smac-on-smac-corridor               DPLEX        19.08          81.25
smac-on-smac-corridor               QPLEX        18.73          75.00
smac-on-smac-corridor-2z-vs-24zg    DPLEX        10.71          3.12
smac-on-smac-corridor-2z-vs-24zg    VDN          7.78           0.00
smac-on-smac-corridor-2z-vs-24zg    QPLEX        6.44           -
smac-on-smac-corridor-2z-vs-24zg    DDN          11.10          41.19
smac-on-smac-corridor-2z-vs-24zg    DMIX         7.41           -
smac-on-smac-corridor-2z-vs-24zg    QMIX         4.80           -
smac-on-smac-mmm2-1                 DPLEX        19.93          96.88
smac-on-smac-mmm2-1                 QPLEX        19.60          96.88
smac-on-smac-mmm2-7m2m1m-vs-8m4m1m  QPLEX        15.52          46.88
smac-on-smac-mmm2-7m2m1m-vs-8m4m1m  DPLEX        15.89          50.00
smac-on-smac-mmm2-7m2m1m-vs-8m4m1m  DDN          16.50          56.82
smac-on-smac-mmm2-7m2m1m-vs-8m4m1m  QMIX         14.40          29.55
smac-on-smac-mmm2-7m2m1m-vs-8m4m1m  DMIX         16.24          63.35
smac-on-smac-mmm2-7m2m1m-vs-8m4m1m  VDN          13.13          13.35
smac-on-smac-mmm2-7m2m1m-vs-9m3m1m  QPLEX        19.06          90.62
smac-on-smac-mmm2-7m2m1m-vs-9m3m1m  QMIX         19.01          88.64
smac-on-smac-mmm2-7m2m1m-vs-9m3m1m  DDN          19.45          90.34
smac-on-smac-mmm2-7m2m1m-vs-9m3m1m  DPLEX        19.40          90.62
smac-on-smac-mmm2-7m2m1m-vs-9m3m1m  VDN          17.30          75.00
smac-on-smac-mmm2-7m2m1m-vs-9m3m1m  DMIX         19.33          92.33
