Exponential Quantum Communication Advantage in Distributed Inference and Learning (2310.07136v3)
Abstract: Training and inference with large machine learning models that far exceed the memory capacity of individual devices necessitate the design of distributed architectures, forcing one to contend with communication constraints. We present a framework for distributed computation over a quantum network in which data is encoded into specialized quantum states. We prove that for models within this framework, inference and training using gradient descent can be performed with exponentially less communication than their classical analogs, and with modest overhead relative to standard gradient-based methods. We show that certain graph neural networks are particularly amenable to implementation within this framework, and moreover present empirical evidence that they perform well on standard benchmarks. To our knowledge, this is the first example of exponential quantum advantage for a generic class of machine learning problems that holds regardless of the data encoding cost. Moreover, we show that models in this class can encode highly nonlinear features of their inputs, and their expressivity increases exponentially with model depth. We also delineate the space of models for which exponential communication advantages hold by showing that they cannot hold for linear classification. Our results can be combined with natural privacy advantages in the communicated quantum states that limit the amount of information that can be extracted from them about the data and model parameters. Taken as a whole, these findings form a promising foundation for distributed machine learning over quantum networks.