
Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies (2303.07551v3)

Published 14 Mar 2023 in cs.LG and cs.AI

Abstract: Recent work has shown the promise of creating generalist, transformer-based models for language, vision, and sequential decision-making problems. Creating such models generally requires centralized training objectives, data, and compute. It is of interest whether we can more flexibly create generalist policies by merging multiple task-specific, individually trained policies. In this work, we take a preliminary step in this direction by merging, or averaging, in parameter space subsets of Decision Transformers trained on different MuJoCo locomotion problems, forming multi-task models without centralized training. We also demonstrate the importance of various methodological choices when merging policies, such as utilizing common pre-trained initializations, increasing model capacity, and utilizing Fisher information to weight parameter importance. In general, we believe research in this direction could help democratize and distribute the process that forms multi-task robotics policies. Our implementation is available at https://github.com/daniellawson9999/merging-decision-transformers.
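The two merging schemes the abstract mentions — plain weight averaging and Fisher-weighted averaging — can be sketched minimally as follows, assuming each policy's parameters are stored as a dict of floats and that a diagonal Fisher estimate is available per parameter. Function names and the toy values are illustrative, not taken from the paper's implementation.

```python
def merge_uniform(params_a, params_b):
    """Uniform average of two parameter dicts with matching keys."""
    return {k: 0.5 * (params_a[k] + params_b[k]) for k in params_a}

def merge_fisher(params_a, fisher_a, params_b, fisher_b, eps=1e-8):
    """Fisher-weighted average: each parameter is weighted by its
    (diagonal) Fisher information, so parameters that matter more to a
    task dominate the merged value. `eps` guards against division by zero."""
    merged = {}
    for k in params_a:
        fa, fb = fisher_a[k], fisher_b[k]
        merged[k] = (fa * params_a[k] + fb * params_b[k]) / (fa + fb + eps)
    return merged

# Toy example with scalar "parameters":
pa, pb = {"w": 1.0}, {"w": 3.0}
print(merge_uniform(pa, pb)["w"])                          # 2.0
print(merge_fisher(pa, {"w": 3.0}, pb, {"w": 1.0})["w"])   # ~1.5
```

In practice the same element-wise averaging would be applied across every tensor in the two checkpoints; the paper's point is that this only works well under certain conditions, such as shared pre-trained initializations.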

Citations (12)
