Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies (2303.07551v3)
Abstract: Recent work has shown the promise of creating generalist, transformer-based models for language, vision, and sequential decision-making problems. Creating such models generally requires centralized training objectives, data, and compute. It is of interest whether we can more flexibly create generalist policies by merging multiple task-specific, individually trained policies. In this work, we take a preliminary step in this direction by merging, or averaging in parameter space, subsets of Decision Transformers trained on different MuJoCo locomotion problems, forming multi-task models without centralized training. We also demonstrate the importance of various methodological choices when merging policies, such as utilizing common pre-trained initializations, increasing model capacity, and utilizing Fisher information to weight parameter importance. In general, we believe research in this direction could help democratize and distribute the process of forming multi-task robotics policies. Our implementation is available at https://github.com/daniellawson9999/merging-decision-transformers.
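To make the two merging operations in the abstract concrete, below is a minimal PyTorch sketch of uniform weight averaging and Fisher-weighted averaging over models that share an architecture (e.g., Decision Transformers fine-tuned from a common pre-trained initialization). The function names and the diagonal-Fisher estimate are illustrative assumptions, not taken from the paper's repository.

```python
import torch

def average_merge(state_dicts):
    """Uniform parameter-space average of identically shaped models.

    Assumes every state dict has the same keys and tensor shapes.
    """
    merged = {}
    for key in state_dicts[0]:
        # Stack the same parameter from every model, take the elementwise mean.
        merged[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return merged

def fisher_merge(state_dicts, fishers, eps=1e-8):
    """Fisher-weighted average: parameters more important to a task
    (higher diagonal Fisher information) contribute more to the merge.

    `fishers` holds one dict per model, keyed like the state dicts, where each
    entry is an elementwise Fisher estimate (e.g., squared gradients of the
    log-likelihood averaged over that task's data).
    """
    merged = {}
    for key in state_dicts[0]:
        params = torch.stack([sd[key].float() for sd in state_dicts])
        weights = torch.stack([f[key].float() for f in fishers])
        # Convex combination per parameter; eps guards against zero Fisher mass.
        merged[key] = (weights * params).sum(dim=0) / (weights.sum(dim=0) + eps)
    return merged
```

The merged dictionary can then be loaded into a fresh model of the same architecture via `model.load_state_dict(merged)` and evaluated on each task without any further centralized training.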