LL-VQ-VAE: Learnable Lattice Vector-Quantization For Efficient Representations (2310.09382v1)
Abstract: In this paper, we introduce learnable lattice vector quantization and demonstrate its effectiveness for learning discrete representations. Our method, termed LL-VQ-VAE, replaces the vector quantization layer in VQ-VAE with lattice-based discretization. The learnable lattice imposes a structure over all discrete embeddings, deterring codebook collapse and leading to high codebook utilization. Compared to VQ-VAE, our method obtains lower reconstruction errors under the same training conditions, trains in a fraction of the time, and has a constant number of parameters (equal to the embedding dimension $D$), making it a highly scalable approach. We demonstrate these results on the FFHQ-1024 dataset, with additional experiments on FashionMNIST and Celeb-A.
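The abstract describes swapping VQ-VAE's codebook lookup for quantization onto a learnable lattice whose parameter count equals the embedding dimension $D$. The sketch below illustrates one plausible reading of that idea, assuming a scaled integer lattice $\mathbb{Z}^D$ with one learnable scale per dimension, a VQ-VAE-style quantization/commitment loss, and straight-through gradients; the class name `LatticeQuantizer`, the loss weighting, and the choice of lattice are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class LatticeQuantizer(nn.Module):
    """Quantizes latents onto a learnable scaled integer lattice.

    The only parameters are the D per-dimension scales, matching the
    abstract's claim of a constant parameter count equal to the
    embedding dimension D. The lattice (Z^D) and the loss terms are
    assumptions for illustration, not the paper's exact method.
    """

    def __init__(self, dim: int, beta: float = 0.25):
        super().__init__()
        self.scale = nn.Parameter(torch.ones(dim))  # D learnable parameters
        self.beta = beta

    def forward(self, z: torch.Tensor):
        # Nearest lattice point of the scaled Z^D lattice: for this
        # lattice, per-dimension rounding IS the closest-point search.
        q = torch.round(z / self.scale) * self.scale
        # Assumed VQ-VAE-style objective: the first term trains the
        # lattice scales, the second commits the encoder to the lattice.
        loss = ((q - z.detach()) ** 2).mean() \
             + self.beta * ((q.detach() - z) ** 2).mean()
        # Straight-through estimator: forward pass emits the quantized
        # value, backward pass copies gradients to z unchanged.
        z_q = z + (q - z).detach()
        return z_q, loss

# Usage: quantize encoder outputs before decoding.
z = torch.randn(8, 64)                 # batch of 64-dim latents
quantizer = LatticeQuantizer(dim=64)
z_q, vq_loss = quantizer(z)            # lattice-discretized latents + loss
```

Because the lattice is defined analytically, quantization here is a closed-form rounding rather than a nearest-neighbor search over a stored codebook, which is a plausible source of the training-time savings the abstract reports.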
Authors: Ahmed Khalil, Robert Piechocki, Raul Santos-Rodriguez