Efficient 3D affinely equivariant CNNs with adaptive fusion of augmented spherical Fourier-Bessel bases (2402.16825v4)

Published 26 Feb 2024 in cs.CV

Abstract: Filter-decomposition-based group equivariant convolutional neural networks (CNNs) have shown promising stability and data efficiency for 3D image feature extraction. However, these networks, which rely on parameter sharing and discrete transformation groups, often underperform within modern deep neural network architectures for processing volumetric images, such as common 3D medical images. To address these limitations, this paper presents an efficient non-parameter-sharing continuous 3D affine group equivariant neural network for volumetric images. This network uses an adaptive aggregation of Monte Carlo augmented spherical Fourier-Bessel filter bases to improve the efficiency and flexibility of 3D group equivariant CNNs for volumetric data. Unlike existing methods that focus only on angular orthogonality in filter bases, the introduced spherical Fourier-Bessel filter basis incorporates both angular and radial orthogonality to improve feature extraction. Experiments on four medical image segmentation datasets show that the proposed methods achieve better affine group equivariance and superior segmentation accuracy than existing 3D group equivariant convolutional neural network layers, and significantly improve the training stability and data efficiency of conventional CNN layers (at the 0.05 significance level). The code is available at https://github.com/ZhaoWenzhao/WMCSFB.
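As a rough illustration of the filter-decomposition idea the abstract describes, the sketch below builds a 3D convolution kernel as a learned-weight combination of spherical Fourier-Bessel basis functions, i.e. a radial spherical Bessel term (orthogonal across radial frequencies) times a spherical harmonic angular term (orthogonal across angular orders). This is not the authors' implementation (see the linked repository for that): the grid size, maximum orders, radial frequencies, and aggregation weights are illustrative assumptions, and the Monte Carlo affine augmentation of the bases is omitted.

```python
# Minimal sketch, assuming a small cubic voxel grid and low-order bases.
# Names and parameter choices here are hypothetical, not the paper's code.
import numpy as np
from scipy.special import spherical_jn, sph_harm

def sfb_basis(size=5, max_l=2, n_radial=2):
    """Return spherical Fourier-Bessel basis filters, shape (n_bases, size, size, size)."""
    coords = np.linspace(-1.0, 1.0, size)
    z, y, x = np.meshgrid(coords, coords, coords, indexing="ij")
    r = np.sqrt(x**2 + y**2 + z**2) + 1e-12            # radius (avoid divide-by-zero)
    theta = np.arccos(np.clip(z / r, -1.0, 1.0))       # polar angle
    phi = np.arctan2(y, x)                             # azimuthal angle

    bases = []
    for l in range(max_l + 1):
        for n in range(1, n_radial + 1):
            radial = spherical_jn(l, n * np.pi * r)    # radial part: spherical Bessel j_l
            for m in range(-l, l + 1):
                # scipy's sph_harm(m, l, azimuth, polar) is complex-valued;
                # take the real part to get a real angular component.
                angular = np.real(sph_harm(m, l, phi, theta))
                bases.append(radial * angular)
    return np.stack(bases)

# A filter is then an adaptive (learned-weight) aggregation of the fixed bases.
B = sfb_basis()                        # (n_bases, 5, 5, 5)
w = np.random.randn(B.shape[0])        # stand-in for learned aggregation weights
kernel = np.tensordot(w, B, axes=1)    # one 5x5x5 convolution kernel
```

In the non-parameter-sharing setting described in the abstract, each output channel would carry its own aggregation weights over (augmented copies of) these bases rather than sharing a single filter across group elements; the snippet only shows the basis construction and a single weighted combination.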
