Structural Restricted Boltzmann Machine for image denoising and classification (2306.09628v1)

Published 16 Jun 2023 in cs.CV and stat.ML

Abstract: Restricted Boltzmann Machines (RBMs) are generative models consisting of a layer of hidden variables connected to a layer of visible units, and they are used to model the distribution over the visible variables. To gain higher representational power, many hidden units are commonly used, which, combined with a large number of visible units, leads to a high number of trainable parameters. In this work we introduce the Structural Restricted Boltzmann Machine model, which, taking advantage of the structure of the data at hand, constrains the connections of hidden units to subsets of visible units, significantly reducing the number of trainable parameters without compromising performance. As a possible area of application, we focus on image modelling. Based on the nature of images, the structure of the connections is given in terms of spatial neighbourhoods over the pixels of the image, which constitute the visible variables of the model. We conduct extensive experiments on various image domains. Image denoising is evaluated with corrupted images from the MNIST dataset. The generative power of our models is compared to that of vanilla RBMs, as is their classification performance, which is assessed on five different image domains. Results show that our proposed model trains faster and more stably, while also obtaining better results than an RBM with unconstrained connections between its visible and hidden units.
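
The central idea is straightforward to express as a fixed binary mask on the weight matrix. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the names `neighbourhood_mask` and `StructuralRBM` are hypothetical, the square, grid-tiled neighbourhood scheme is an assumption about how the spatial neighbourhoods might be laid out, and training is plain one-step contrastive divergence (CD-1) in NumPy.

```python
import numpy as np

def neighbourhood_mask(img_side, window, stride):
    """Build a binary connectivity mask of shape (n_visible, n_hidden):
    each hidden unit is connected only to one window x window patch
    of pixels in a square img_side x img_side image."""
    positions = range(0, img_side - window + 1, stride)
    columns = []
    for r in positions:
        for c in positions:
            patch = np.zeros((img_side, img_side))
            patch[r:r + window, c:c + window] = 1.0
            columns.append(patch.ravel())
    return np.stack(columns, axis=1)

class StructuralRBM:
    """Bernoulli RBM whose weights are elementwise-masked so that
    absent visible-hidden connections stay exactly zero during training.
    A hypothetical sketch of the structural-connectivity idea."""

    def __init__(self, mask, lr=0.05, seed=0):
        self.rng = np.random.default_rng(seed)
        self.M = mask                                    # fixed structure
        n_vis, n_hid = mask.shape
        self.W = self.rng.normal(0.0, 0.01, (n_vis, n_hid)) * mask
        self.b = np.zeros(n_vis)                         # visible biases
        self.c = np.zeros(n_hid)                         # hidden biases
        self.lr = lr

    @staticmethod
    def _sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_step(self, v0):
        """One CD-1 update on a batch of binary images v0, shape (B, n_vis)."""
        ph0 = self._sigmoid(v0 @ self.W + self.c)        # positive phase
        h0 = (self.rng.random(ph0.shape) < ph0).astype(float)
        pv1 = self._sigmoid(h0 @ self.W.T + self.b)      # reconstruction
        ph1 = self._sigmoid(pv1 @ self.W + self.c)       # negative phase
        grad_W = (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
        self.W += self.lr * grad_W * self.M              # mask the update
        self.b += self.lr * (v0 - pv1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)
```

For 28x28 MNIST images, `neighbourhood_mask(28, 7, 7)` yields 16 hidden units with 49 incoming connections each (784 effective weights), versus 784 x 16 = 12,544 weights for a fully connected RBM with the same number of hidden units; choosing a stride smaller than the window would give overlapping neighbourhoods and more hidden units.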
