
Adapting CNNs for Fisheye Cameras without Retraining (2404.08187v1)

Published 12 Apr 2024 in cs.CV and cs.RO

Abstract: The majority of image processing approaches assume images are in, or can be rectified to, a perspective projection. However, in many applications it is beneficial to use non-conventional cameras, such as fisheye cameras, that have a larger field of view (FOV). The issue is that these large-FOV images cannot be rectified to a perspective projection without significant cropping of the original image. To address this issue, we propose Rectified Convolutions (RectConv), a new approach for adapting pre-trained convolutional networks to operate on non-perspective images without any retraining. Replacing the convolutional layers of the network with RectConv layers allows the network to see both rectified patches and the entire FOV. We demonstrate RectConv adapting multiple pre-trained networks to perform segmentation and detection on fisheye imagery from two publicly available datasets. Our approach requires no additional data or training, and operates directly on the native image as captured from the camera. We believe this work is a step toward adapting the vast resources available for perspective images to operate across a broad range of camera geometries.
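
The paper itself does not ship code here, but the core idea in the abstract (replace each convolution's fixed sampling grid with one that follows locally rectified patches, while keeping the pretrained weights) can be illustrated with a minimal PyTorch sketch. This is not the authors' RectConv implementation: it stands in for it with a deformable-convolution layer driven by fixed, camera-geometry-derived offsets. `RectConvSketch`, `swap_convs`, and `offsets_for` are hypothetical names, and the computation of offsets from the fisheye camera model is assumed to exist elsewhere.

```python
# Hypothetical sketch (not the authors' code): reuse a pretrained Conv2d's
# weights while bending each kernel's sampling locations with fixed,
# camera-derived offsets, so no retraining is needed.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d


class RectConvSketch(nn.Module):
    """Wraps a pretrained nn.Conv2d. `offsets` are assumed precomputed once
    from the fisheye camera model: for every output pixel they shift the
    k x k taps onto a locally rectified (perspective-like) patch."""

    def __init__(self, conv: nn.Conv2d, offsets: torch.Tensor):
        super().__init__()
        self.weight = conv.weight        # pretrained weights, reused unchanged
        self.bias = conv.bias
        self.stride = conv.stride        # assumes int/tuple padding, not "same"
        self.padding = conv.padding
        self.dilation = conv.dilation
        # offsets: (1, 2*kh*kw, H_out, W_out), fixed -- nothing is trained
        self.register_buffer("offsets", offsets)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return deform_conv2d(
            x,
            self.offsets.expand(x.shape[0], -1, -1, -1),
            self.weight,
            self.bias,
            stride=self.stride,
            padding=self.padding,
            dilation=self.dilation,
        )


def swap_convs(module: nn.Module, offsets_for) -> None:
    """Recursively replace spatial convolutions in a pretrained network.
    `offsets_for` is a hypothetical helper mapping a conv layer to its
    geometry-derived offset tensor; 1x1 convs need no resampling."""
    for name, child in module.named_children():
        if isinstance(child, nn.Conv2d) and child.kernel_size != (1, 1):
            setattr(module, name, RectConvSketch(child, offsets_for(child)))
        else:
            swap_convs(child, offsets_for)
```

Because the offsets depend only on the camera geometry, they can be computed once per camera and registered as buffers, which matches the paper's stated constraint of requiring no additional data or training.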

