
Simple Hierarchical Planning with Diffusion (2401.02644v1)

Published 5 Jan 2024 in cs.LG and cs.AI

Abstract: Diffusion-based generative methods have proven effective in modeling trajectories from offline datasets. However, they often face computational challenges and can falter in generalization, especially in capturing temporal abstractions for long-horizon tasks. To overcome this, we introduce the Hierarchical Diffuser, a simple, fast, yet surprisingly effective planning method that combines the advantages of hierarchical and diffusion-based planning. At the higher level, our model adopts a "jumpy" planning strategy that yields a larger receptive field at a lower computational cost, a crucial factor for diffusion-based planning methods, as we verify empirically. The jumpy sub-goals then guide our low-level planner, facilitating a fine-tuning stage that further improves our approach's effectiveness. On standard offline reinforcement learning benchmarks, our method outperforms the non-hierarchical Diffuser as well as other hierarchical planning methods, and is more efficient in both training and planning speed. Moreover, we explore our model's generalization capability, particularly on compositional out-of-distribution tasks.
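
To make the two-level scheme concrete, here is a minimal Python sketch of the planning loop. It is an illustration under assumed interfaces, not the authors' implementation: the `sample(cond, length)` method, the names `high_diffuser` and `low_diffuser`, and the jump interval `K` are hypothetical stand-ins for the paper's sparse sub-goal sampler and dense segment planner.

```python
import torch

def hierarchical_plan(high_diffuser, low_diffuser, s0, horizon, K):
    """Two-level plan: sparse sub-goals first, then dense segments.

    `high_diffuser` and `low_diffuser` are assumed to be pretrained
    trajectory diffusion models exposing `sample(cond, length)`, where
    `cond` pins states at the given sequence indices; this interface is
    illustrative, not the paper's actual API.
    """
    # High level: denoise only horizon // K sub-goal states, spaced K
    # environment steps apart, so the receptive field spans the full
    # horizon at a fraction of the cost of a dense plan.
    n_jumps = horizon // K
    subgoals = high_diffuser.sample(cond={0: s0}, length=n_jumps + 1)

    # Low level: fill in each K-step segment, conditioning the dense
    # planner on the segment's start and end sub-goals.
    segments = []
    for g_start, g_end in zip(subgoals[:-1], subgoals[1:]):
        seg = low_diffuser.sample(cond={0: g_start, K: g_end}, length=K + 1)
        segments.append(seg[:-1])   # drop endpoint; next segment starts there
    segments.append(subgoals[-1:])  # keep the final sub-goal state

    return torch.cat(segments, dim=0)  # dense plan of roughly `horizon` states
```

In this sketch the high-level call denoises only `horizon // K + 1` states rather than the full horizon, which is the source of the efficiency gain the abstract claims; the paper additionally uses the sub-goal-conditioned segments for a fine-tuning stage of the low-level planner.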
