Spherical Feature Pyramid Networks For Semantic Segmentation (2307.02658v1)
Abstract: Semantic segmentation for spherical data is a challenging problem in machine learning since conventional planar approaches require projecting the spherical image to the Euclidean plane. Representing the signal on a fundamentally different topology introduces edges and distortions which impact network performance. Recently, graph-based approaches have bypassed these challenges to attain significant improvements by representing the signal on a spherical mesh. Current approaches to spherical segmentation exclusively use variants of the UNet architecture, meaning more successful planar architectures remain unexplored. Inspired by the success of feature pyramid networks (FPNs) in planar image segmentation, we leverage the pyramidal hierarchy of graph-based spherical CNNs to design spherical FPNs. Our spherical FPN models show consistent improvements over spherical UNets, whilst using fewer parameters. On the Stanford 2D-3D-S dataset, our models achieve state-of-the-art performance with an mIoU of 48.75, an improvement of 3.75 IoU points over the previous best spherical CNN.