
NocPlace: Nocturnal Visual Place Recognition via Generative and Inherited Knowledge Transfer (2402.17159v2)

Published 27 Feb 2024 in cs.CV

Abstract: Visual Place Recognition (VPR) is crucial in computer vision, aiming to retrieve database images similar to a query image from an extensive collection of known images. However, like many vision tasks, VPR performance degrades at night due to the scarcity of nighttime images. Moreover, VPR must address the cross-domain night-to-day problem rather than just a single nighttime domain. In response to these issues, we present NocPlace, which leverages generative and inherited knowledge transfer to embed resilience against dazzling lights and extreme darkness in the global descriptor. First, we establish a day-night urban scene dataset called NightCities, capturing diverse lighting variations and dark scenarios across 60 cities globally. Then, an image generation network is trained on this dataset and processes a large-scale VPR dataset, obtaining its nighttime version. Finally, VPR models are fine-tuned using descriptors inherited from themselves and night-style images, which builds explicit cross-domain contrastive relationships. Comprehensive experiments on various datasets demonstrate our contributions and the superiority of NocPlace. Without adding any real-time computing resources, NocPlace improves the performance of EigenPlaces by 7.6% on Tokyo 24/7 Night and 16.8% on SVOX Night.
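
To make the abstract's two central ideas concrete, here is a minimal NumPy sketch under stated assumptions; it is not the authors' implementation. `retrieve` shows VPR as nearest-neighbor search over L2-normalized global descriptors. `inherited_descriptor_loss` is one plausible reading of "descriptors inherited from themselves": a frozen copy of the model supplies the daytime descriptor, and the descriptor of the night-style version of the same image is pulled toward it. The function names, the cosine-distance form of the loss, and the toy data are all hypothetical.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row vector to unit L2 norm."""
    x = np.asarray(x, dtype=np.float64)
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)


def retrieve(query_desc, db_desc, top_k=1):
    """Return indices of the top_k most similar database descriptors per query.

    Similarity is cosine similarity between global descriptors.
    """
    q = l2_normalize(query_desc)
    d = l2_normalize(db_desc)
    sims = q @ d.T  # shape: (num_queries, num_database)
    return np.argsort(-sims, axis=1)[:, :top_k]


def inherited_descriptor_loss(night_desc, frozen_day_desc):
    """Mean cosine distance between night-style and frozen day descriptors.

    ASSUMPTION: a cosine-distance pull is used here for illustration only;
    the paper's actual loss may differ.
    """
    n = l2_normalize(night_desc)
    t = l2_normalize(frozen_day_desc)
    return float(np.mean(1.0 - np.sum(n * t, axis=-1)))


if __name__ == "__main__":
    # Toy database of three places and two queries (hypothetical data).
    db = np.eye(3)
    queries = np.array([[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]])
    print(retrieve(queries, db))
    print(inherited_descriptor_loss(queries, db[[0, 2]]))
```

Because only the training signal changes in this sketch, retrieval at query time is the same descriptor lookup as before, which is consistent with the abstract's claim that no real-time computing resources are added.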

