Understanding Matrix Function Normalizations in Covariance Pooling through the Lens of Riemannian Geometry (2407.10484v2)

Published 15 Jul 2024 in cs.CV and cs.LG

Abstract: Global Covariance Pooling (GCP) has been demonstrated to improve the performance of Deep Neural Networks (DNNs) by exploiting second-order statistics of high-level representations. GCP typically performs classification of the covariance matrices by applying matrix function normalization, such as the matrix logarithm or power, followed by a Euclidean classifier. However, covariance matrices inherently lie on a Riemannian manifold, known as the Symmetric Positive Definite (SPD) manifold. The current literature does not provide a satisfactory explanation of why Euclidean classifiers can be applied directly to Riemannian features after matrix power normalization. To mitigate this gap, this paper provides a comprehensive and unified understanding of the matrix logarithm and power from a Riemannian geometry perspective. The underlying mechanism of matrix functions in GCP is interpreted from two perspectives: one based on tangent classifiers (Euclidean classifiers on the tangent space) and the other based on Riemannian classifiers. Via theoretical analysis and empirical validation through extensive experiments on fine-grained and large-scale visual classification datasets, we conclude that the working mechanism of the matrix functions should be attributed to the Riemannian classifiers they implicitly respect.
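To make the pipeline the abstract describes concrete, the following is a minimal NumPy sketch (not the authors' implementation) of GCP: pooling convolutional features into an SPD covariance matrix, then applying matrix power or matrix logarithm normalization via eigendecomposition before a Euclidean classifier. The function names and the regularization constant `eps` are illustrative choices, not from the paper.

```python
import numpy as np

def covariance_pool(features, eps=1e-5):
    """Global Covariance Pooling.

    features: (n, d) array of n spatial positions with d channels.
    Returns a (d, d) SPD covariance matrix, regularized with eps * I
    so that it is strictly positive definite.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / features.shape[0]
    return cov + eps * np.eye(cov.shape[0])

def matrix_power(spd, p=0.5):
    """Matrix power normalization: raise the eigenvalues to the power p
    while keeping the eigenvectors fixed. p = 0.5 is the common
    matrix square-root normalization."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    return eigvecs @ np.diag(eigvals ** p) @ eigvecs.T

def matrix_log(spd):
    """Matrix logarithm normalization: the Log-Euclidean map, which
    sends the SPD manifold to the (Euclidean) space of symmetric
    matrices, where a linear classifier can then be applied."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    return eigvecs @ np.diag(np.log(eigvals)) @ eigvecs.T
```

In a GCP network, the normalized matrix would be flattened (or its upper triangle vectorized) and fed to a fully connected Euclidean classifier; the paper's argument is that this classifier implicitly respects a Riemannian classifier on the SPD manifold rather than being merely a Euclidean approximation.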

Authors (5)
  1. Ziheng Chen (30 papers)
  2. Yue Song (56 papers)
  3. Xiao-Jun Wu (114 papers)
  4. Gaowen Liu (60 papers)
  5. Nicu Sebe (270 papers)
Citations (1)
