Application of Quantum Annealing to Training of Deep Neural Networks (1510.06356v1)

Published 21 Oct 2015 in quant-ph, cs.LG, and stat.ML

Abstract: In Deep Learning, a well-known approach for training a Deep Neural Network starts by training a generative Deep Belief Network model, typically using Contrastive Divergence (CD), then fine-tuning the weights using backpropagation or other discriminative techniques. However, the generative training can be time-consuming due to the slow mixing of Gibbs sampling. We investigated an alternative approach that estimates model expectations of Restricted Boltzmann Machines using samples from a D-Wave quantum annealing machine. We tested this method on a coarse-grained version of the MNIST data set. In our tests we found that the quantum sampling-based training approach achieves comparable or better accuracy with significantly fewer iterations of generative training than conventional CD-based training. Further investigation is needed to determine whether similar improvements can be achieved for other data sets, and to what extent these improvements can be attributed to quantum effects.

Citations (231)

Summary

  • The paper proposes using quantum sampling from a D-Wave annealer for pre-training Restricted Boltzmann Machines in Deep Belief Networks, aiming for more efficient sampling than classical methods.
  • Experiments on a coarse MNIST dataset show the quantum annealing approach achieves comparable or higher accuracy with significantly fewer training iterations than the classical Contrastive Divergence method.
  • The study suggests future advancements in quantum annealers could make this approach scalable for larger networks, presenting a case for quantum annealing in sampling tasks beyond optimization.

Application of Quantum Annealing to Training Deep Neural Networks

The paper by Adachi and Henderson explores an approach to training Deep Neural Networks (DNNs) that uses quantum annealing, specifically the D-Wave quantum annealing machine. Conventional training of deep generative models such as Deep Belief Networks (DBNs) entails generative pre-training, usually through Contrastive Divergence (CD), followed by fine-tuning via backpropagation. This process is computationally intensive, primarily because the Gibbs sampling underlying CD mixes slowly. The authors instead propose quantum sampling, which, according to their findings, achieves comparable or better accuracy while requiring significantly fewer iterations of generative training, potentially opening avenues for more efficient training strategies in deep learning.
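To make the classical baseline concrete, the following is a minimal sketch of a CD-1 weight update for a binary RBM in NumPy. It is illustrative only: the variable names and hyperparameters are not from the paper, and the paper's experiments use fuller training schedules.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b, c, v0, lr=0.1):
    """One CD-1 step for a binary RBM.
    W: (n_visible, n_hidden) weights; b, c: visible/hidden biases;
    v0: (batch, n_visible) binary data batch."""
    # Positive phase: hidden probabilities driven by the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: a single Gibbs step stands in for the
    # intractable model expectation.
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Gradient estimate: <v h>_data - <v h>_model (CD-1 approximation).
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / v0.shape[0]
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

The slow part at scale is the negative phase: one Gibbs step gives a biased estimate of the model expectation, and running longer chains to reduce that bias is precisely the cost the quantum sampler is meant to avoid.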

Methodology and Quantum Sampling

The approach centers on using quantum sampling from a D-Wave quantum annealer to estimate model expectations during the pre-training phase of DBNs. The method targets Restricted Boltzmann Machines (RBMs), the building blocks of DBNs, with the goal of sampling more efficiently than traditional Gibbs methods allow. Properties of quantum annealing such as superposition and tunneling offer a theoretical basis for faster exploration of the energy landscape and thus quicker convergence toward the equilibrium distribution.
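For context, the maximum-likelihood gradient of an RBM splits into a data-driven term and a model term; it is the model term, which classical methods approximate with Gibbs chains, that the annealer samples are used to estimate:

$$\frac{\partial \log p(v)}{\partial w_{ij}} \;=\; \langle v_i h_j \rangle_{\text{data}} \;-\; \langle v_i h_j \rangle_{\text{model}}$$

CD truncates the Gibbs chain for the second expectation after a few steps, while the quantum approach replaces it with an empirical average over annealer samples.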

Key to this approach is accommodating the limitations of the quantum hardware, such as restricted qubit connectivity and faulty qubits, within the training process. The researchers map the visible and hidden nodes of the RBM onto a network of qubits, translating the bipartite architecture of the RBM into the "Chimera" graph structure of the D-Wave machine and thereby addressing the connectivity constraint. Additionally, gauge transformations and voting threshold strategies are employed to mitigate the effects of hardware noise and intrinsic control errors, improving the reliability of the sample-based model expectations.
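The sketch below illustrates the gauge-averaging idea under stated assumptions: `sample_annealer` is a hypothetical placeholder for a hardware read (it returns random spins here), and the aggregation shown is a simple pooling of un-gauged samples rather than the paper's exact voting scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_annealer(h, J, num_reads):
    """Hypothetical stand-in for an annealer read; returns +/-1 spins.
    A real implementation would submit (h, J) to quantum hardware."""
    return rng.choice([-1, 1], size=(num_reads, len(h)))

def gauge_averaged_samples(h, J, num_gauges=4, num_reads=100):
    """Pool annealer reads over random gauges to wash out control errors.
    h: NumPy array of local fields; J: dict mapping (i, j) to couplings.
    A gauge flips spin i by s_i in {-1, +1}: h_i -> s_i h_i and
    J_ij -> s_i s_j J_ij leave the physics unchanged, so un-gauging
    the returned samples recovers the original problem."""
    pooled = []
    for _ in range(num_gauges):
        s = rng.choice([-1, 1], size=len(h))            # random gauge
        h_g = s * h
        J_g = {(i, j): s[i] * s[j] * val for (i, j), val in J.items()}
        reads = sample_annealer(h_g, J_g, num_reads)
        pooled.append(reads * s)                        # undo the gauge
    return np.concatenate(pooled)
```

Averaging RBM statistics over samples pooled across several gauges makes the expectation estimates less sensitive to any single realization of the hardware's control errors.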

Experimental Evaluation

Experiments on a coarse-grained version of the MNIST dataset demonstrate the practical applicability of the quantum annealing-based training method. The MNIST images are reduced to 32 super-pixels to fit the quantum hardware constraints, which makes the classification task harder than on the original dataset. Across the testing scenarios, the quantum-based approach consistently converges faster than, and reaches accuracy comparable to or higher than, classical CD training. Notably, the quantum-trained networks reach comparable accuracy with considerably fewer iterations of both pre-training and subsequent backpropagation.
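The coarse-graining procedure is not spelled out here, so the following is an assumed block-averaging scheme that yields 32 super-pixels (a 4x8 grid after cropping a 28x28 image to 28x24); the paper's actual partition may differ.

```python
import numpy as np

def coarse_grain(img, threshold=0.5):
    """Reduce a 28x28 MNIST image (values in [0, 1]) to a 4x8 grid of
    binary super-pixels by block averaging. The crop and block sizes
    are illustrative assumptions, not the paper's exact scheme."""
    img = img[:28, :24]                  # crop so 7x3 blocks tile evenly
    blocks = img.reshape(4, 7, 8, 3)     # split into 4x8 blocks of 7x3
    means = blocks.mean(axis=(1, 3))     # average within each block
    return (means > threshold).astype(np.uint8)
```

Each image then becomes a 32-element binary vector, small enough for the visible layer of an RBM to be embedded on the annealer's qubit graph.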

The empirical results show a notable reduction in training time and computational resources, supporting the paper's suggestion that quantum sampling may converge to equilibrium faster than classical techniques. The paper acknowledges, however, that further theoretical and experimental work is needed to establish whether these efficiencies genuinely arise from quantum effects, since isolating a quantum advantage is experimentally difficult.

Implications and Future Directions

The implications of this paper could be broad, especially given anticipated advances in quantum technologies. As quantum annealers like the D-Wave gain higher connectivity and qubit counts, the approach could scale to larger and more complex networks. The research presents a compelling case for deploying quantum annealing not just in optimization but also in the sampling and inference tasks prevalent in probabilistic graphical models, where classical Markov chain methods can converge slowly.

Ultimately, while this work provides promising insights, it opens several research questions, including how to choose optimal configuration parameters for quantum annealers and how quantum effects might benefit machine learning algorithms more broadly. Continued investigation of both the hardware and the theoretical underpinnings will be essential to realizing the full potential of quantum annealing within AI.