Quantum Mixed-State Self-Attention Network (2403.02871v3)

Published 5 Mar 2024 in quant-ph and cs.LG

Abstract: Attention mechanisms have revolutionized natural language processing, and combining them with quantum computing aims to advance this technology further. This paper introduces a novel Quantum Mixed-State Self-Attention Network (QMSAN) for natural language processing tasks. Our model leverages quantum computing principles to enhance the effectiveness of self-attention mechanisms. QMSAN uses a quantum attention mechanism based on mixed states, allowing direct similarity estimation between queries and keys in the quantum domain. This approach leads to more effective attention coefficient calculations. We also propose an innovative quantum positional encoding scheme, implemented through fixed quantum gates within the circuit, improving the model's ability to capture sequence information without additional qubit resources. In numerical experiments on text classification tasks with public datasets, QMSAN outperforms the Quantum Self-Attention Neural Network (QSANN). Furthermore, we demonstrate QMSAN's robustness under different quantum noise environments, highlighting its potential for near-term quantum devices.
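
To make the two central ideas in the abstract concrete, the NumPy sketch below illustrates how an attention coefficient could be computed as an overlap between two mixed states, and how a fixed, position-dependent gate can inject sequence information without extra qubits. This is a minimal sketch under stated assumptions: the RY-based data encoding, the depolarising noise, the Tr(rho_q rho_k) similarity score, and all function names are illustrative choices, not the paper's exact circuits.

```python
# Minimal mixed-state attention sketch (illustrative assumptions, not QMSAN's exact circuits).
import numpy as np

def ry(theta):
    """Single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def encode_token(feature, position, n_positions):
    """Encode one scalar feature into a single-qubit mixed state.

    A data-dependent RY rotation loads the feature; a second, fixed RY
    rotation whose angle depends only on the token position stands in for
    the positional encoding, so no additional qubits are needed.
    """
    data_gate = ry(np.pi * feature)                    # data-dependent rotation
    pos_gate = ry(2 * np.pi * position / n_positions)  # fixed, position-dependent rotation
    psi = pos_gate @ data_gate @ np.array([1.0, 0.0])  # apply to |0>
    rho = np.outer(psi, psi.conj())                    # pure-state density matrix
    # Toy depolarising noise turns the pure state into a mixed state.
    return 0.9 * rho + 0.1 * np.eye(2) / 2

def attention_scores(features):
    """Pairwise scores s_ij = Tr(rho_i rho_j), softmax-normalised per row."""
    n = len(features)
    states = [encode_token(f, i, n) for i, f in enumerate(features)]
    scores = np.array([[np.real(np.trace(q @ k)) for k in states] for q in states])
    exp = np.exp(scores)
    return exp / exp.sum(axis=1, keepdims=True)

print(attention_scores([0.1, 0.5, 0.9]))
```

In a quantum implementation, the overlap Tr(rho_q rho_k) would be estimated on hardware (for example with a SWAP-test-style routine) rather than by classical matrix multiplication; the classical simulation above only shows how such a similarity could play the role of an attention coefficient.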
