
Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning

Moritz Reuss, Jyothish Pari, Pulkit Agrawal, Rudolf Lioutikov


Abstract

Diffusion Policies have become widely used in Imitation Learning, offering several appealing properties, such as generating multimodal and discontinuous behavior. As models become larger to capture more complex capabilities, their computational demands increase, as shown by recent scaling laws. Therefore, continuing with the current architectures will present a computational roadblock. To address this gap, we propose Mixture-of-Denoising Experts (MoDE) as a novel policy for Imitation Learning. MoDE surpasses current state-of-the-art Transformer-based Diffusion Policies while enabling parameter-efficient scaling through sparse experts and noise-conditioned routing, reducing active parameters by 40% and inference costs by 90% via expert caching. Our architecture combines this efficient scaling with a noise-conditioned self-attention mechanism, enabling more effective denoising across different noise levels. MoDE achieves state-of-the-art performance on 134 tasks in four established imitation learning benchmarks (CALVIN and LIBERO). Notably, by pretraining MoDE on diverse robotics data, we achieve 4.01 on CALVIN ABC and 0.95 on LIBERO-90. It surpasses both CNN-based and Transformer Diffusion Policies by an average of 57% across 4 benchmarks, while using 90% fewer FLOPs and fewer active parameters compared to default Diffusion Transformer architectures. Furthermore, we conduct comprehensive ablations on MoDE's components, providing insights for designing efficient and scalable Transformer architectures for Diffusion Policies. Code and demonstrations are available at https://mbreuss.github.io/MoDE_Diffusion_Policy/.
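The efficiency mechanism the abstract describes is routing that depends on the denoising noise level rather than on token content. Below is a minimal sketch of such a noise-conditioned Mixture-of-Experts layer, assuming a PyTorch setting; the class name NoiseConditionedMoE and arguments such as noise_emb and top_k are illustrative and not taken from the paper's released code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NoiseConditionedMoE(nn.Module):
    """Sparse MoE feed-forward layer whose router is conditioned
    only on the noise-level embedding (illustrative sketch)."""

    def __init__(self, d_model: int, n_experts: int = 4, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # One feed-forward expert per slot.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(n_experts)
        )
        # Router sees the noise embedding, not the tokens.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x: torch.Tensor, noise_emb: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); noise_emb: (batch, d_model)
        logits = self.router(noise_emb)                 # (batch, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # sparse expert choice
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for b in range(x.size(0)):
            for k in range(self.top_k):
                expert = self.experts[int(idx[b, k])]
                out[b] += weights[b, k] * expert(x[b])
        return out

Because the router here depends only on the noise level, the expert selection for a given denoising step is independent of the observation tokens, so it can be precomputed once per noise level and reused across a rollout; this caching property is what the abstract credits for the reported inference-cost reduction.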


Benchmarks

Benchmark: zero-shot-generalization-on-calvin
Methodology: MoDE
Metric: Avg. sequence length: 4.01
