LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations (2310.09382v1)

Published 13 Oct 2023 in cs.LG and cs.CV

Abstract: In this paper we introduce learnable lattice vector quantization and demonstrate its effectiveness for learning discrete representations. Our method, termed LL-VQ-VAE, replaces the vector quantization layer in VQ-VAE with lattice-based discretization. The learnable lattice imposes a structure over all discrete embeddings, acting as a deterrent against codebook collapse, leading to high codebook utilization. Compared to VQ-VAE, our method obtains lower reconstruction errors under the same training conditions, trains in a fraction of the time, and with a constant number of parameters (equal to the embedding dimension $D$), making it a very scalable approach. We demonstrate these results on the FFHQ-1024 dataset and include FashionMNIST and Celeb-A.
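To make the idea concrete, below is a minimal, hypothetical sketch of a learnable lattice quantization layer in PyTorch. It assumes a scaled integer lattice (each latent coordinate is rounded to the nearest multiple of a learnable per-dimension scale), which matches the abstract's claim of a constant parameter count equal to the embedding dimension D, and uses the straight-through estimator from VQ-VAE for gradients. The class name, the choice of lattice, and the training details are illustrative assumptions, not the paper's exact construction.

```python
import torch
import torch.nn as nn

class LearnableLatticeQuantizer(nn.Module):
    """Hypothetical sketch of a learnable lattice quantization layer.

    Assumes a scaled integer lattice Z^D: each latent dimension is
    quantized by rounding to the nearest multiple of a learnable
    per-dimension scale, so the layer holds exactly D parameters.
    """

    def __init__(self, embedding_dim: int):
        super().__init__()
        # D learnable parameters: one lattice scale per latent dimension.
        self.scale = nn.Parameter(torch.ones(embedding_dim))

    def forward(self, z_e: torch.Tensor) -> torch.Tensor:
        # z_e: (..., D) continuous encoder output.
        # Snap each coordinate to the nearest lattice point.
        z_q = torch.round(z_e / self.scale) * self.scale
        # Straight-through estimator: pass gradients through the
        # non-differentiable rounding step, as in VQ-VAE.
        return z_e + (z_q - z_e).detach()


# Usage: quantize a batch of encoder outputs.
quantizer = LearnableLatticeQuantizer(embedding_dim=64)
z_e = torch.randn(8, 64)   # continuous latents from an encoder
z_q = quantizer(z_e)       # lattice-quantized latents
```

Because every discrete code is a point of the same shared lattice rather than an independently learned codebook entry, there is no fixed codebook to under-utilize, which is the intuition behind the paper's claim of resistance to codebook collapse.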

Authors (3)
  1. Ahmed Khalil (7 papers)
  2. Robert Piechocki (30 papers)
  3. Raul Santos-Rodriguez (70 papers)
Citations (1)