GPT on a Quantum Computer (2403.09418v1)
Abstract: Large language models (LLMs) such as ChatGPT have transformed how we interact with and understand the capabilities of AI. However, the intersection of LLMs with the burgeoning field of Quantum Machine Learning (QML) is still in its nascent stages. This paper explores this niche by detailing a comprehensive framework for implementing the foundational Transformer architecture, integral to ChatGPT, within a quantum computing paradigm. We design quantum circuits that implement adapted versions of the Transformer's core components and the generative pre-training phase. By integrating quantum computing with LLMs, we aspire to open new avenues for research in QML and contribute to the ongoing evolution of AI technologies.
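For context on the classical component the paper adapts, the sketch below shows standard scaled dot-product attention with a causal mask, the core Transformer operation used in generative pre-training (Vaswani et al., 2017). This is a minimal NumPy illustration only, not the paper's quantum circuit construction; the function names, variable names, and toy dimensions are our own assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V, mask=None):
    # Q, K, V: (seq_len, d_k) arrays of query, key, and value vectors.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)            # pairwise similarity scores
    if mask is not None:
        scores = np.where(mask, scores, -1e9)  # causal mask: block attention to future tokens
    weights = softmax(scores, axis=-1)         # attention distribution over positions
    return weights @ V                         # weighted sum of value vectors

# Toy usage: 4 tokens, 8-dimensional embeddings, lower-triangular (causal) mask.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
causal_mask = np.tril(np.ones((4, 4), dtype=bool))
out = scaled_dot_product_attention(Q, K, V, mask=causal_mask)
print(out.shape)  # (4, 8)
```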