
BEV-CV: Birds-Eye-View Transform for Cross-View Geo-Localisation (2312.15363v2)

Published 23 Dec 2023 in cs.CV and cs.LG

Abstract: Cross-view image matching for geo-localisation is a challenging problem due to the significant visual difference between aerial and ground-level viewpoints. The method provides localisation capabilities from geo-referenced images, eliminating the need for external devices or costly equipment. This enhances the capacity of agents to autonomously determine their position, navigate, and operate effectively in GNSS-denied environments. Current research employs a variety of techniques to reduce the domain gap, such as applying polar transforms to aerial images or synthesising between perspectives. However, these approaches generally rely on having a 360° field of view, limiting real-world feasibility. We propose BEV-CV, an approach introducing two key novelties with a focus on improving the real-world viability of cross-view geo-localisation. Firstly, we bring ground-level images into a semantic Birds-Eye-View before matching embeddings, allowing for direct comparison with aerial image representations. Secondly, we adapt datasets into an application-realistic format: limited Field-of-View images aligned to vehicle direction. BEV-CV achieves state-of-the-art recall accuracies, improving Top-1 rates of 70° crops of CVUSA and CVACT by 23% and 24% respectively. It also decreases computational requirements by reducing floating point operations to below those of previous works, and decreases embedding dimensionality by 33%, together allowing for faster localisation capabilities.
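
The Top-1 rates quoted in the abstract are retrieval recalls: a ground-level query counts as correct when its true aerial image ranks first by embedding similarity. The sketch below is an illustration only, not the authors' evaluation code. It assumes cosine similarity between embeddings and assumes that row `i` of the ground embeddings is the true match for row `i` of the aerial embeddings; the function name `top_k_recall` is hypothetical.

```python
import numpy as np


def top_k_recall(ground_emb, aerial_emb, k=1):
    """Fraction of ground queries whose true aerial match is in the top k.

    Assumption (not from the paper): row i of ground_emb is the true match
    for row i of aerial_emb, and similarity is cosine similarity.
    """
    ground_emb = np.asarray(ground_emb, dtype=np.float64)
    aerial_emb = np.asarray(aerial_emb, dtype=np.float64)

    # L2-normalise so that the dot product equals cosine similarity.
    g = ground_emb / np.linalg.norm(ground_emb, axis=1, keepdims=True)
    a = aerial_emb / np.linalg.norm(aerial_emb, axis=1, keepdims=True)

    # sim[i, j] = similarity between ground query i and aerial reference j.
    sim = g @ a.T

    # Similarity of each query to its own (true) aerial reference.
    true_sim = np.diag(sim)

    # Rank of the true match = number of references scoring strictly higher.
    rank = (sim > true_sim[:, None]).sum(axis=1)

    return float((rank < k).mean())


if __name__ == "__main__":
    # Toy 2-D embeddings: the third query points away from its true match.
    ground = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.1]]
    aerial = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    for k in (1, 2, 3):
        print(f"Top-{k} recall: {top_k_recall(ground, aerial, k):.3f}")
```

In this toy setup, a smaller embedding dimensionality shrinks the matrices in the `g @ a.T` product, which is one way a reduction in embedding size can translate into faster retrieval.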
