D4RL on D4RL
Evaluation Metric
Average Reward
Results
Performance of each model on this benchmark
| Model | Average Reward | Paper Title | Repository |
|---|---|---|---|
| PMDB | 88.2 | Model-Based Offline Reinforcement Learning with Pessimism-Modulated Dynamics Belief | |
| KFC | 81.8 | Koopman Q-learning: Offline Reinforcement Learning via Symmetries of Dynamics | - |
| Primal.+DT | 77.5 | Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation | |
| Flowformer | 73.5 | Flowformer: Linearizing Transformers with Conservation Flows | |
| Decision Transformer (DT) | 72.2 | Decision Transformer: Reinforcement Learning via Sequence Modeling | |
| cosFormer | 67.8 | cosFormer: Rethinking Softmax in Attention | |
| Linear Transformer | 64.4 | Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention | |
| Reformer | 63.9 | Reformer: The Efficient Transformer | |
| Performer | 63.8 | Rethinking Attention with Performers | |