F$^3$Loc: Fusion and Filtering for Floorplan Localization (2403.03370v1)
Abstract: In this paper we propose an efficient data-driven solution to self-localization within a floorplan. Floorplan data is readily available, persists over the long term, and is inherently robust to changes in visual appearance. Our method requires neither retraining per map and location nor a large database of images of the area of interest. We propose a novel probabilistic model consisting of an observation module and a novel temporal filtering module. Operating internally with an efficient ray-based representation, the observation module comprises a single-view and a multi-view module that each predict horizontal depth from images, and it fuses their results to benefit from the advantages of both approaches. Our method runs on conventional consumer hardware and overcomes a common limitation of competing methods, which often demand upright images. Our full system meets real-time requirements while outperforming the state of the art by a significant margin.
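To make the interplay of an observation model and a temporal filter concrete, the following is a minimal sketch of one generic discrete Bayes (histogram) filter step. It is not the paper's implementation: the 1D grid of candidate positions, the function names `predict`, `update`, and `normalize`, and the hand-written likelihood values are all assumptions chosen for illustration. In a floorplan setting the likelihood would presumably come from comparing predicted depth rays against rays cast in the map, and the belief would cover 2D position and orientation.

```python
from typing import List


def normalize(values: List[float]) -> List[float]:
    """Rescale values to sum to 1; fall back to uniform if all mass vanished."""
    total = sum(values)
    if total <= 0.0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def predict(belief: List[float], shift: int, kernel: List[float]) -> List[float]:
    """Motion step: move the belief by `shift` cells and blur it with `kernel`.

    `kernel` is a hypothetical odd-length motion-noise distribution centred
    on the commanded shift.
    """
    n = len(belief)
    half = len(kernel) // 2
    out = [0.0] * n
    for i, p in enumerate(belief):
        for k, w in enumerate(kernel):
            j = i + shift + (k - half)
            if 0 <= j < n:  # probability mass leaving the map is dropped
                out[j] += p * w
    return normalize(out)


def update(belief: List[float], likelihood: List[float]) -> List[float]:
    """Observation step: weight each cell by its likelihood, then renormalize."""
    return normalize([b * l for b, l in zip(belief, likelihood)])
```

A usage sketch under the same assumptions: starting from a uniform belief over five cells, `update(belief, [0.1, 0.1, 0.7, 0.05, 0.05])` concentrates the posterior on cell 2, and a subsequent `predict(posterior, 1, [0.1, 0.8, 0.1])` moves the peak to cell 3 while spreading some mass to its neighbours. Alternating these two steps over a sequence of frames is what lets a filter resolve ambiguities that a single observation cannot.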