- The paper introduces a warp consistency loss that enables unsupervised learning of dense correspondences without requiring ground-truth matches.
- It constructs image triplets via randomly sampled warps and derives flow consistency constraints from them, remaining robust under significant appearance and viewpoint variations.
- Experimental evaluations show marked improvements, including an 18.2% gain in PCK-5 over the baseline GLU-Net on MegaDepth.
Overview of "Warp Consistency for Unsupervised Learning of Dense Correspondences"
The paper "Warp Consistency for Unsupervised Learning of Dense Correspondences" by Truong et al. introduces an innovative approach to learning dense correspondences between image pairs without relying on ground-truth matches. This is achieved through a warp consistency loss that leverages warp-based transformations to handle large appearance and viewpoint changes typically challenging for unsupervised learning methods.
Key Contributions
The challenge in dense correspondence learning lies in the scarcity of ground-truth data. While photometric consistency losses provide an unsupervised solution, their effectiveness diminishes under substantial appearance variations. Existing methods utilizing synthetic training pairs often fail to generalize to real-world data.
This work proposes a warp consistency loss that produces robust correspondence estimates even under significant appearance and viewpoint changes. The approach constructs an image triplet by applying a random warp to one image of a real pair, then derives flow consistency constraints over the triplet to formulate an unsupervised learning objective.
Methodology
- Warp Triplet Construction: From a real image pair, a warped image is generated to form a triplet. A dense flow field sampled from random transformations such as homographies and thin-plate splines (TPS) is applied to one of the images, producing the third image of the triplet (a minimal construction sketch follows this list).
- Warp Consistency Graph: The warp consistency graph enumerates all flow consistency constraints that can be derived from the image triplet. From this analysis, the W-bipath constraint is selected, which avoids degenerate solutions and yields a usable unsupervised loss.
- Unsupervised Learning Objective: The objective combines the W-bipath loss with a warp-supervision loss. The former provides supervision grounded in the real image pair, while the latter accelerates convergence; visibility masks restrict the loss to non-occluded regions (see the loss sketch below).
- Adaptive Balancing: The two loss terms are balanced adaptively so that they remain on comparable scales without manual tuning (a possible weighting scheme is sketched after the loss example).
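To make the triplet construction concrete, here is a minimal PyTorch sketch. It is an illustrative assumption rather than the authors' implementation: it samples a random affine warp instead of the homography/TPS transforms used in the paper, and the helper names (`sample_affine_flow`, `warp`) and the jitter magnitude are invented for this example. The convention assumed throughout is that the sampled flow W is defined on the grid of the warped image I' and points to the original image I.

```python
import torch
import torch.nn.functional as F

def sample_affine_flow(b, h, w, jitter=0.1, device="cpu"):
    """Sample a random affine warp per batch element and return it as a dense
    flow field of shape (b, 2, h, w), in pixels: W(x) maps pixel x of the
    warped image I' to its correspondence x + W(x) in the original image I."""
    identity = torch.eye(2, 3, device=device).repeat(b, 1, 1)
    theta = identity + jitter * (2 * torch.rand(b, 2, 3, device=device) - 1)
    # Sampling grids in normalized [-1, 1] coordinates.
    grid = F.affine_grid(theta, (b, 1, h, w), align_corners=True)      # (b, h, w, 2)
    base = F.affine_grid(identity, (b, 1, h, w), align_corners=True)   # identity grid
    flow_norm = grid - base                                            # displacement in normalized coords
    scale = torch.tensor([(w - 1) / 2.0, (h - 1) / 2.0], device=device)
    return (flow_norm * scale).permute(0, 3, 1, 2)                     # (b, 2, h, w), (x, y) order

def warp(img, flow):
    """Backward-warp img with a dense pixel flow: out(x) = img(x + flow(x))."""
    b, _, h, w = img.shape
    yy, xx = torch.meshgrid(torch.arange(h, device=img.device),
                            torch.arange(w, device=img.device), indexing="ij")
    coords = torch.stack((xx, yy), dim=0).float() + flow               # (b, 2, h, w)
    gx = 2.0 * coords[:, 0] / (w - 1) - 1.0                            # normalize to [-1, 1]
    gy = 2.0 * coords[:, 1] / (h - 1) - 1.0
    return F.grid_sample(img, torch.stack((gx, gy), dim=-1), align_corners=True)

# Building the triplet (I, I_prime, J) from a real pair (I, J):
#   W = sample_affine_flow(I.size(0), I.size(2), I.size(3), device=I.device)
#   I_prime = warp(I, W)   # the flow from I_prime to I is then W, and is known exactly
```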
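The two loss terms can then be written as a flow composition and a direct comparison against the known warp. The sketch below reuses `warp` from the previous snippet; `predict_flow` stands in for the correspondence network, and the L1 penalty, the flow-direction conventions, and the treatment of the visibility mask are illustrative choices of this example rather than the paper's exact formulation.

```python
def compose(flow_ab, flow_bc):
    """Chain two dense flows: F_ac(x) = F_ab(x) + F_bc(x + F_ab(x))."""
    return flow_ab + warp(flow_bc, flow_ab)

def warp_consistency_losses(predict_flow, I, I_prime, J, W, visibility_mask=None):
    """Compute the W-bipath and warp-supervision terms for one triplet.
    predict_flow(src, tgt) is assumed to return the dense flow from src to tgt."""
    F_I_to_J  = predict_flow(I, J)             # prediction on the real pair
    F_Ip_to_J = predict_flow(I_prime, J)       # prediction from the warped image to J
    F_Ip_to_I = predict_flow(I_prime, I)       # prediction that should recover W

    # W-bipath: the direct prediction I' -> J should match the composition of the
    # known warp W (I' -> I) with the predicted flow I -> J.
    bipath_target = compose(W, F_I_to_J)
    bipath_err = (F_Ip_to_J - bipath_target).abs().sum(dim=1, keepdim=True)
    if visibility_mask is not None:            # keep only non-occluded, in-view pixels
        bipath_err = bipath_err * visibility_mask
    loss_bipath = bipath_err.mean()

    # Warp-supervision: the prediction I' -> I is supervised directly by W.
    loss_warp = (F_Ip_to_I - W).abs().sum(dim=1, keepdim=True).mean()
    return loss_bipath, loss_warp
```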
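The paper balances these two terms without manual tuning; its exact scheme is not reproduced here. One plausible, purely illustrative approach is to track running averages of both losses and scale the warp-supervision term so its magnitude follows the W-bipath term:

```python
import torch

class AdaptiveLossBalance:
    """Illustrative balancing scheme (an assumption, not the paper's exact rule):
    weight the warp-supervision loss by the ratio of running loss magnitudes."""
    def __init__(self, momentum=0.99, eps=1e-8):
        self.momentum = momentum
        self.eps = eps
        self.avg_bipath = None
        self.avg_warp = None

    def __call__(self, loss_bipath, loss_warp):
        with torch.no_grad():
            b, w = float(loss_bipath), float(loss_warp)
            if self.avg_bipath is None:
                self.avg_bipath, self.avg_warp = b, w
            else:
                m = self.momentum
                self.avg_bipath = m * self.avg_bipath + (1 - m) * b
                self.avg_warp = m * self.avg_warp + (1 - m) * w
        lam = self.avg_bipath / (self.avg_warp + self.eps)
        return loss_bipath + lam * loss_warp
```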
Empirical Evaluation
The proposed method significantly enhances performance across several benchmarks:
- Geometric Matching: WarpC-GLU-Net outperforms state-of-the-art methods such as GLU-Net and RANSAC-Flow on datasets including MegaDepth and RobotCar, demonstrating superior robustness to large appearance variations.
- Semantic Matching: WarpC-SemanticGLU-Net shows substantial improvements on the TSS and PF-Pascal datasets, indicating the approach's utility for handling intra-class variations in semantic correspondence tasks.
Numerical Results
The method reports gains such as an 18.2% increase in PCK-5 (the percentage of correct keypoints within a 5-pixel error threshold) for GLU-Net on MegaDepth. It consistently outperforms alternatives that rely on photometric consistency or pure warp-supervision, showing better generalization. A minimal PCK computation is sketched below.
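For reference, PCK-δ counts a sparse ground-truth correspondence as correct when the flow estimated at that point lands within δ pixels of the annotated match; the numbers above use δ = 5. A minimal sketch in PyTorch, with illustrative function and variable names:

```python
import torch

def pck(pred_flow, gt_src_pts, gt_tgt_pts, threshold=5.0):
    """PCK at a pixel threshold for one image pair.
    pred_flow:  (2, H, W) dense flow from source to target, in pixels, (x, y) order.
    gt_src_pts: (N, 2) annotated source keypoints as (x, y) pixel coordinates (float).
    gt_tgt_pts: (N, 2) corresponding target keypoints."""
    x = gt_src_pts[:, 0].round().long().clamp(0, pred_flow.shape[2] - 1)
    y = gt_src_pts[:, 1].round().long().clamp(0, pred_flow.shape[1] - 1)
    pred_tgt = gt_src_pts + pred_flow[:, y, x].t()   # predicted match for each keypoint
    err = torch.linalg.norm(pred_tgt - gt_tgt_pts, dim=1)
    return (err <= threshold).float().mean()         # fraction of correct keypoints
```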
Implications and Future Work
The technique reduces the dependence on purely synthetic training pairs by deriving supervision directly from real image pairs, thereby improving generalization to real-world scenarios. The framework's adaptability to different network architectures and tasks highlights its potential extensibility to other correspondence-related applications in computer vision, such as optical flow.
Future research could explore enhancing the architectural components or integrating additional constraints to further improve accuracy. Additionally, investigating the application of this loss in other problem domains could prove beneficial, broadening the impact of the approach in various AI and computer vision applications.
In summary, this paper advances the field by providing a robust, unsupervised framework for dense correspondence learning, addressing limitations of previous methods and setting new standards for both geometric and semantic matching tasks.