A Unified Framework for Factorizing Distributional Value Functions for Multi-Agent Reinforcement Learning
Wei-Fang Sun Cheng-Kuang Lee Simon See Chun-Yi Lee

Abstract
In fully cooperative multi-agent reinforcement learning (MARL) settings, environments are highly stochastic due to the partial observability of each agent and the continuously changing policies of other agents. To address these issues, we propose a unified framework, called DFAC, for integrating distributional RL with value function factorization methods. This framework generalizes expected value function factorization methods to enable the factorization of return distributions. To validate DFAC, we first demonstrate its ability to factorize the value functions of a simple matrix game with stochastic rewards. Then, we perform experiments on all Super Hard maps of the StarCraft Multi-Agent Challenge and six self-designed Ultra Hard maps, showing that DFAC outperforms a number of baselines.
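To make the idea of factorizing a return distribution more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each agent's return distribution is represented by quantile values at shared quantile fractions, and that the joint distribution is formed by summing those quantile values (an additive, VDN-style combination). The function names `quantile_mixture` and `mean_shape_split` and the toy quantile curves are hypothetical and chosen only for illustration.

```python
import numpy as np


def quantile_mixture(agent_quantiles):
    """Sum per-agent quantile values taken at shared fractions.

    agent_quantiles has shape (n_agents, n_quantiles); the result has
    shape (n_quantiles,) and stands in for the joint return's quantiles.
    """
    q = np.asarray(agent_quantiles, dtype=float)
    return q.sum(axis=0)


def mean_shape_split(quantiles):
    """Split quantile values into their mean and a zero-mean shape part."""
    q = np.asarray(quantiles, dtype=float)
    mean = q.mean()
    return mean, q - mean


# Shared quantile fractions (midpoints of 8 equal-width bins).
taus = (np.arange(8) + 0.5) / 8

# Hypothetical per-agent quantile functions, both non-decreasing in tau.
agent_quantiles = np.stack([
    1.0 + 0.5 * taus,
    -2.0 + 3.0 * taus ** 2,
])

joint = quantile_mixture(agent_quantiles)
joint_mean, joint_shape = mean_shape_split(joint)
agent_means = agent_quantiles.mean(axis=1)
```

Under these assumptions, the mean of the joint quantile values equals the sum of the per-agent means, so the expected-value part reduces to an ordinary additive factorization, while the zero-mean shape part carries the distributional information.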
Benchmarks
| Benchmark | Methodology | Average Score | Median Win Rate |
|---|---|---|---|
| smac-on-smac-26m-vs-30m | VDN | 16.69 | 23.01 |
| smac-on-smac-26m-vs-30m | QMIX | 18.23 | 62.78 |
| smac-on-smac-26m-vs-30m | QPLEX | 18.66 | 78.12 |
| smac-on-smac-26m-vs-30m | DMIX | 19.17 | 81.82 |
| smac-on-smac-26m-vs-30m | DDN | 18.49 | 67.90 |
| smac-on-smac-26m-vs-30m | DPLEX | 18.49 | 59.38 |
| smac-on-smac-27m-vs-30m | QPLEX | 19.33 | 78.12 |
| smac-on-smac-27m-vs-30m | DPLEX | 19.62 | 90.62 |
| smac-on-smac-3s5z-vs-3s6z-1 | DPLEX | 20.27 | 90.62 |
| smac-on-smac-3s5z-vs-3s6z-1 | QPLEX | 20.42 | 84.38 |
| smac-on-smac-3s5z-vs-4s6z | QMIX | 13.09 | not reported |
| smac-on-smac-3s5z-vs-4s6z | DPLEX | 14.99 | not reported |
| smac-on-smac-3s5z-vs-4s6z | QPLEX | 13.60 | not reported |
| smac-on-smac-3s5z-vs-4s6z | DDN | 19.65 | 89.77 |
| smac-on-smac-3s5z-vs-4s6z | DMIX | 18.61 | 83.52 |
| smac-on-smac-3s5z-vs-4s6z | VDN | 17.16 | 47.16 |
| smac-on-smac-6h-vs-8z-1 | QPLEX | 15.95 | not reported |
| smac-on-smac-6h-vs-8z-1 | DPLEX | 17.88 | 43.75 |
| smac-on-smac-6h-vs-9z | DPLEX | 14.84 | not reported |
| smac-on-smac-6h-vs-9z | DMIX | 13.73 | not reported |
| smac-on-smac-6h-vs-9z | VDN | 13.57 | not reported |
| smac-on-smac-6h-vs-9z | QPLEX | 13.86 | not reported |
| smac-on-smac-6h-vs-9z | DDN | 16.00 | 0.28 |
| smac-on-smac-6h-vs-9z | QMIX | 12.37 | 1.14 |
| smac-on-smac-corridor | DPLEX | 19.08 | 81.25 |
| smac-on-smac-corridor | QPLEX | 18.73 | 75.00 |
| smac-on-smac-corridor-2z-vs-24zg | DPLEX | 10.71 | 3.12 |
| smac-on-smac-corridor-2z-vs-24zg | VDN | 7.78 | 0.00 |
| smac-on-smac-corridor-2z-vs-24zg | QPLEX | 6.44 | not reported |
| smac-on-smac-corridor-2z-vs-24zg | DDN | 11.10 | 41.19 |
| smac-on-smac-corridor-2z-vs-24zg | DMIX | 7.41 | not reported |
| smac-on-smac-corridor-2z-vs-24zg | QMIX | 4.80 | not reported |
| smac-on-smac-mmm2-1 | DPLEX | 19.93 | 96.88 |
| smac-on-smac-mmm2-1 | QPLEX | 19.60 | 96.88 |
| smac-on-smac-mmm2-7m2m1m-vs-8m4m1m | QPLEX | 15.52 | 46.88 |
| smac-on-smac-mmm2-7m2m1m-vs-8m4m1m | DPLEX | 15.89 | 50.00 |
| smac-on-smac-mmm2-7m2m1m-vs-8m4m1m | DDN | 16.50 | 56.82 |
| smac-on-smac-mmm2-7m2m1m-vs-8m4m1m | QMIX | 14.40 | 29.55 |
| smac-on-smac-mmm2-7m2m1m-vs-8m4m1m | DMIX | 16.24 | 63.35 |
| smac-on-smac-mmm2-7m2m1m-vs-8m4m1m | VDN | 13.13 | 13.35 |
| smac-on-smac-mmm2-7m2m1m-vs-9m3m1m | QPLEX | 19.06 | 90.62 |
| smac-on-smac-mmm2-7m2m1m-vs-9m3m1m | QMIX | 19.01 | 88.64 |
| smac-on-smac-mmm2-7m2m1m-vs-9m3m1m | DDN | 19.45 | 90.34 |
| smac-on-smac-mmm2-7m2m1m-vs-9m3m1m | DPLEX | 19.40 | 90.62 |
| smac-on-smac-mmm2-7m2m1m-vs-9m3m1m | VDN | 17.30 | 75.00 |
| smac-on-smac-mmm2-7m2m1m-vs-9m3m1m | DMIX | 19.33 | 92.33 |