ConGeo: Robust Cross-view Geo-localization across Ground View Variations (2403.13965v2)

Published 20 Mar 2024 in cs.CV

Abstract: Cross-view geo-localization aims to localize a ground-level query image by matching it to its corresponding geo-referenced aerial view. In real-world scenarios, the task requires accommodating diverse ground images captured by users with varying orientations and reduced fields of view (FoVs). However, existing learning pipelines are orientation-specific or FoV-specific, demanding separate model training for different ground view variations. Such models depend heavily on the North-aligned spatial correspondence and predefined FoVs in the training data, compromising their robustness across different settings. To tackle this challenge, we propose ConGeo, a single- and cross-view Contrastive method for Geo-localization: it enhances robustness and consistency in feature representations to improve a model's invariance to orientation and its resilience to FoV variations, by enforcing proximity between ground view variations of the same location. As a generic learning objective for cross-view geo-localization, when integrated into state-of-the-art pipelines, ConGeo significantly boosts the performance of three base models on four geo-localization benchmarks for diverse ground view variations and outperforms competing methods that train separate models for each ground view variation.
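
The idea of "enforcing proximity between ground view variations of the same location" can be illustrated with a generic contrastive objective. The sketch below is not the paper's implementation: it assumes a standard InfoNCE loss, a temperature of 0.1, and an unweighted sum of a single-view term (ground vs. ground variation) and a cross-view term (ground variation vs. aerial). The helper names `info_nce`, `shift_and_crop`, and `combined_loss` are hypothetical, and embeddings are plain Python lists standing in for encoder outputs.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i] and no other row.

    Assumption: cosine similarity with a fixed temperature; the paper's exact
    loss and hyperparameters are not given in the abstract.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)


def shift_and_crop(panorama_row, shift, fov_fraction):
    """Hypothetical ground view variation on one row of panorama columns.

    A circular shift mimics an unknown orientation; keeping only the first
    fraction of columns mimics a reduced field of view.
    """
    n = len(panorama_row)
    s = shift % n
    shifted = panorama_row[s:] + panorama_row[:s]
    keep = max(1, int(n * fov_fraction))
    return shifted[:keep]


def combined_loss(ground, ground_var, aerial, temperature=0.1):
    """Single-view term plus cross-view term (assumed unweighted sum)."""
    single_view = info_nce(ground, ground_var, temperature)
    cross_view = info_nce(ground_var, aerial, temperature)
    return single_view + cross_view
```

In this sketch the loss is small when each ground view variation embeds close to both its original ground view and its aerial counterpart, and large when the pairs are mismatched, which is the behavior a single model would need in order to handle several orientations and FoVs at once.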

Authors (7)
  1. Li Mi
  2. Chang Xu
  3. Javiera Castillo-Navarro
  4. Syrielle Montariol
  5. Wen Yang
  6. Antoine Bosselut
  7. Devis Tuia