Quantum Advantage Actor-Critic for Reinforcement Learning (2401.07043v1)

Published 13 Jan 2024 in quant-ph, cs.AI, and cs.LG

Abstract: Quantum computing offers efficient encapsulation of high-dimensional states. In this work, we propose a novel quantum reinforcement learning approach that combines the Advantage Actor-Critic algorithm with variational quantum circuits by substituting parts of the classical components. This approach addresses reinforcement learning's scalability concerns while maintaining high performance. We empirically test multiple quantum Advantage Actor-Critic configurations with the well-known Cart Pole environment to evaluate our approach in control tasks with continuous state spaces. Our results indicate that the hybrid strategy of using either a quantum actor or quantum critic with classical post-processing yields a substantial performance increase compared to purely classical and purely quantum variants with similar parameter counts. They further reveal the limits of current quantum approaches due to the hardware constraints of noisy intermediate-scale quantum computers, suggesting further research to scale hybrid approaches for larger and more complex control tasks.
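
For orientation, the advantage signal used by Advantage Actor-Critic methods can be sketched as below. This is a generic one-step temporal-difference form, not the paper's implementation: the function name `td_advantages`, its parameters, and the discount factor are illustrative assumptions. The value estimates could come from either a classical network or a variational quantum circuit with classical post-processing, since the computation only consumes their outputs.

```python
from typing import List, Sequence


def td_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    next_values: Sequence[float],
    dones: Sequence[bool],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> List[float]:
    """One-step TD advantage: A_t = r_t + gamma * V(s_{t+1}) - V(s_t).

    The bootstrap term gamma * V(s_{t+1}) is dropped when the episode
    terminated at step t, because there is no successor state to value.
    """
    advantages = []
    for reward, value, next_value, done in zip(rewards, values, next_values, dones):
        bootstrap = 0.0 if done else gamma * next_value
        advantages.append(reward + bootstrap - value)
    return advantages


# Hypothetical two-step Cart Pole-style rollout: +1 reward per step,
# with the episode ending on the second step.
example = td_advantages(
    rewards=[1.0, 1.0],
    values=[0.5, 0.4],
    next_values=[0.4, 0.0],
    dones=[False, True],
    gamma=0.9,
)
```

In a typical actor-critic setup, the actor is trained by weighting the log-probability of each chosen action by its advantage, while the critic is trained by regressing its value estimate toward the bootstrapped target.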
