Learning Monocular Visual Odometry via Self-Supervised Long-Term Modeling (2007.10983v1)

Published 21 Jul 2020 in cs.CV

Abstract: Monocular visual odometry (VO) suffers severely from error accumulation during frame-to-frame pose estimation. In this paper, we present a self-supervised learning method for VO with special consideration for consistency over longer sequences. To this end, we model the long-term dependency in pose prediction using a pose network that features a two-layer convolutional LSTM module. We train the networks with purely self-supervised losses, including a cycle consistency loss that mimics the loop closure module in geometric VO. Inspired by prior geometric systems, we allow the networks to see beyond a small temporal window during training, through a novel loss that incorporates temporally distant (e.g., O(100)) frames. Given GPU memory constraints, we propose a stage-wise training mechanism, where the first stage operates in a local time window and the second stage refines the poses with a "global" loss given the first stage features. We demonstrate competitive results on several standard VO datasets, including KITTI and TUM RGB-D.
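
The cycle consistency idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so the code below is an assumption for illustration only: it treats poses as 4x4 rigid transforms and measures how far the composition of a forward and a backward relative pose is from the identity. The helper names `se3` and `cycle_consistency_error` are hypothetical and do not come from the paper.

```python
import numpy as np

def se3(yaw_rad, translation):
    """Build a 4x4 rigid transform from a yaw angle and a 3-vector translation."""
    c, s = np.cos(yaw_rad), np.sin(yaw_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def cycle_consistency_error(T_forward, T_backward):
    """Frobenius distance between the composed transform and the identity.

    Illustrative assumption: a consistent forward/backward pose pair
    composes to the identity, so the error is close to zero.
    """
    return float(np.linalg.norm(T_backward @ T_forward - np.eye(4)))

# Forward pose from frame a to frame b (made-up values).
T_ab = se3(0.1, [1.0, 0.0, 0.2])

# An exact inverse gives a near-zero error.
consistent = cycle_consistency_error(T_ab, np.linalg.inv(T_ab))

# A backward estimate with a slightly wrong translation gives a larger error.
inconsistent = cycle_consistency_error(T_ab, se3(-0.1, [-0.9, 0.0, -0.2]))
```

In a learned system the two transforms would be network predictions and the error would be one term in a training loss; here they are fixed values chosen only to show the behavior.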

Authors (5)
  1. Yuliang Zou (11 papers)
  2. Pan Ji (53 papers)
  3. Quoc-Huy Tran (18 papers)
  4. Jia-Bin Huang (106 papers)
  5. Manmohan Chandraker (108 papers)
Citations (63)
