Exploring new ways: Enforcing representational dissimilarity to learn new features and reduce error consistency (2307.02516v1)

Published 5 Jul 2023 in cs.LG, cs.AI, and cs.CV

Abstract: Independently trained machine learning models tend to learn similar features. Given an ensemble of independently trained models, this results in correlated predictions and common failure modes. Previous attempts that focus on decorrelating output predictions or logits have yielded mixed results, in particular because their conflicting optimization objectives reduce model accuracy. In this paper, we propose the novel idea of using methods from the field of representational similarity to promote dissimilarity during training, rather than merely measuring the similarity of trained models. To this end, we encourage intermediate representations at different depths to be dissimilar across architectures, with the goal of learning robust ensembles with disjoint failure modes. We show that highly dissimilar intermediate representations result in less correlated output predictions and slightly lower error consistency, yielding higher ensemble accuracy. With this, we shed first light on the connection between intermediate representations and their impact on output predictions.
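
The abstract does not specify implementation details, so the following is a minimal, hypothetical PyTorch sketch of one plausible instantiation of the idea: two networks trained jointly with a penalty on the similarity (here, linear CKA) of their intermediate representations. The names SmallNet, linear_cka, and train_step, as well as the weighting alpha, are illustrative assumptions, not the paper's actual architecture or loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def linear_cka(x, y, eps=1e-8):
    """Linear CKA similarity between two (batch, features) activation matrices."""
    x = x - x.mean(0, keepdim=True)  # center each feature dimension
    y = y - y.mean(0, keepdim=True)
    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    num = (y.t() @ x).norm(p="fro") ** 2
    den = (x.t() @ x).norm(p="fro") * (y.t() @ y).norm(p="fro")
    return num / (den + eps)

class SmallNet(nn.Module):
    """Toy classifier that also exposes its intermediate (hidden) representation."""
    def __init__(self, d_in=32, d_hidden=64, n_classes=10):
        super().__init__()
        self.hidden = nn.Linear(d_in, d_hidden)
        self.head = nn.Linear(d_hidden, n_classes)

    def forward(self, x):
        h = F.relu(self.hidden(x))
        return self.head(h), h

def train_step(model_a, model_b, opt, x, y, alpha=0.5):
    """One joint step: task loss for both models plus a representational
    similarity penalty that pushes their hidden features apart."""
    logits_a, h_a = model_a(x)
    logits_b, h_b = model_b(x)
    task_loss = F.cross_entropy(logits_a, y) + F.cross_entropy(logits_b, y)
    sim = linear_cka(h_a, h_b)      # in [0, 1]; high means similar features
    loss = task_loss + alpha * sim  # minimizing sim promotes dissimilarity
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item(), sim.item()

# Example usage on random data.
torch.manual_seed(0)
model_a, model_b = SmallNet(), SmallNet()
opt = torch.optim.Adam(
    list(model_a.parameters()) + list(model_b.parameters()), lr=1e-3
)
x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))
loss, sim = train_step(model_a, model_b, opt, x, y)
```

Minimizing the CKA term pushes the two hidden representations apart while the cross-entropy terms keep both models accurate; how strongly diversity is traded against accuracy is governed by the weighting of the similarity penalty.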
