From Bricks to Bridges: Product of Invariances to Enhance Latent Space Communication (2310.01211v2)
Abstract: It has been observed that representations learned by distinct neural networks exhibit hidden structural similarities when the models are trained under similar inductive biases. From a geometric perspective, identifying the classes of transformations and the related invariances that connect these representations is fundamental to unlocking applications such as merging, stitching, and reusing different neural modules. However, estimating task-specific transformations a priori can be challenging and expensive due to several factors (e.g., weight initialization, training hyperparameters, or data modality). To address this, we introduce a versatile method that directly incorporates a set of invariances into the representations, constructing a product space of invariant components on top of the latent representations without requiring prior knowledge about the optimal invariance to infuse. We validate our solution on classification and reconstruction tasks, observing consistent improvements in latent similarity and downstream performance in a zero-shot stitching setting. The experimental analysis comprises three modalities (vision, text, and graphs), twelve pretrained foundation models, nine benchmarks, and several architectures trained from scratch.
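To make the construction concrete, below is a minimal sketch (not the authors' released code) of how a product of invariant components could be assembled on top of absolute latent vectors, in the spirit of relative representations: each component compares samples to a shared set of anchors under a different invariance class (here, cosine similarity for angle-preserving transformations and Euclidean distance for isometries), and the components are concatenated so that no single invariance must be chosen a priori. The function names, the specific pair of components, and the per-component standardization are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def relative_cosine(x, anchors):
    # Cosine similarity to each anchor: invariant to rotations and
    # rescalings applied jointly to samples and anchors.
    return F.normalize(x, dim=-1) @ F.normalize(anchors, dim=-1).T

def relative_euclidean(x, anchors):
    # Negative Euclidean distance to each anchor: invariant to
    # isometries (joint rotations and translations) of the latent space.
    return -torch.cdist(x, anchors)

def product_of_invariances(x, anchors):
    # Build one invariant component per invariance class, then
    # concatenate them into a single product space.
    components = [relative_cosine(x, anchors), relative_euclidean(x, anchors)]
    # Standardize each component so their scales are comparable
    # before concatenation (an illustrative design choice).
    components = [(c - c.mean()) / (c.std() + 1e-8) for c in components]
    return torch.cat(components, dim=-1)

# Usage: encode data with any frozen encoder, fix a set of anchor samples
# shared across models, and map absolute latents into the product space.
latents = torch.randn(128, 512)              # batch of latent vectors
anchors = latents[:10]                        # anchor subset (shared across models)
z = product_of_invariances(latents, anchors)  # shape: (128, 20)
```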