Continual Vision-based Reinforcement Learning with Group Symmetries (2210.12301v2)

Published 21 Oct 2022 in cs.LG

Abstract: Continual reinforcement learning aims to sequentially learn a variety of tasks, retaining the ability to perform previously encountered tasks while simultaneously developing new policies for novel tasks. However, current continual RL approaches overlook the fact that certain tasks are identical under basic group operations like rotations or translations, especially with visual inputs. They may unnecessarily learn and maintain a new policy for each similar task, leading to poor sample efficiency and weak generalization capability. To address this, we introduce a unique Continual Vision-based Reinforcement Learning method that recognizes Group Symmetries, called COVERS, cultivating a policy for each group of equivalent tasks rather than individual tasks. COVERS employs a proximal policy optimization-based RL algorithm with an equivariant feature extractor and a novel task grouping mechanism that relies on the extracted invariant features. We evaluate COVERS on sequences of table-top manipulation tasks that incorporate image observations and robot proprioceptive information in both simulations and on real robot platforms. Our results show that COVERS accurately assigns tasks to their respective groups and significantly outperforms existing methods in terms of generalization capability.
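To make the two ideas named in the abstract concrete, below is a minimal, purely illustrative sketch in PyTorch, not the authors' implementation. It shows (a) a feature extractor made invariant to 90-degree image rotations by averaging a shared CNN over the C4 orbit of the observation, and (b) a simple rule that assigns a new task to an existing group when its invariant feature is close to that group's centroid, or opens a new group otherwise. The class and function names, the choice of the C4 rotation group, the group-averaging construction, and the L2-distance threshold are assumptions for illustration; the paper's actual equivariant extractor and grouping criterion may differ, and the per-group PPO policies are omitted here.

```python
# Illustrative sketch only (not the paper's code): C4 rotation-invariant
# encoder via group averaging, plus a nearest-centroid task-grouping rule.
from typing import List

import torch
import torch.nn as nn


class InvariantEncoder(nn.Module):
    """Maps a square image observation to a feature vector that is invariant
    to 90-degree rotations, by averaging backbone features over the C4 orbit."""

    def __init__(self, in_channels: int = 3, feat_dim: int = 64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(32, feat_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (B, C, H, W) with H == W. Rotating the input by a multiple of
        # 90 degrees only permutes the orbit below, so the averaged feature
        # is unchanged, i.e. invariant under C4.
        feats = [
            self.backbone(torch.rot90(obs, k, dims=(-2, -1))) for k in range(4)
        ]
        return torch.stack(feats, dim=0).mean(dim=0)


def assign_group(
    task_feat: torch.Tensor,
    group_centroids: List[torch.Tensor],
    threshold: float = 1.0,
) -> int:
    """Assign a task, summarized by the mean invariant feature of a few
    rollouts, to the nearest existing group; open a new group if no centroid
    is within `threshold`. The L2 distance here is a stand-in for whatever
    distribution-level distance a full method might use."""
    if not group_centroids:
        group_centroids.append(task_feat)
        return 0
    dists = torch.stack([torch.norm(task_feat - c) for c in group_centroids])
    idx = int(torch.argmin(dists))
    if dists[idx] <= threshold:
        return idx
    group_centroids.append(task_feat)
    return len(group_centroids) - 1
```

In a continual-learning loop of this shape, each incoming task would be summarized by invariant features from a handful of exploratory rollouts, routed to a group with `assign_group`, and then used to update only that group's PPO policy, so equivalent tasks (e.g., rotated or translated variants) share one policy instead of each spawning a new one.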
