Repeatability Is Not Enough: Learning Affine Regions via Discriminability

Published 17 Nov 2017 in cs.CV and cs.NE | (1711.06704v4)

Abstract: A method for learning local affine-covariant regions is presented. We show that maximizing geometric repeatability does not lead to local regions, a.k.a. features, that are reliably matched and this necessitates descriptor-based learning. We explore factors that influence such learning and registration: the loss function, descriptor type, geometric parametrization and the trade-off between matchability and geometric accuracy, and propose a novel hard negative-constant loss function for learning of affine regions. The affine shape estimator -- AffNet -- trained with the hard negative-constant loss outperforms the state-of-the-art in bag-of-words image retrieval and wide baseline stereo. The proposed training process does not require precisely geometrically aligned patches. The source codes and trained weights are available at https://github.com/ducha-aiki/affnet

Summary

The paper demonstrates that geometric repeatability alone is not sufficient for reliable feature matching; descriptor discriminability must also be taken into account.
It introduces the Hard Negative-Constant loss to effectively train the AffNet estimator for improved image retrieval and wide baseline stereo performance.
Empirical results show that AffNet yields higher repeatability and mean average precision compared to traditional methods under challenging conditions.

Learning Affine Regions via Discriminability: A Detailed Analysis

The paper "Repeatability Is Not Enough: Learning Affine Regions via Discriminability" by Mishkin, Radenović, and Matas presents a novel approach for learning affine-covariant regions. The research critiques the traditional emphasis on geometric repeatability in feature matching, proposing a shift towards descriptor-based learning to achieve more reliable matching.

Key Contributions

The authors highlight several significant contributions in this work:

Descriptor-Based Motivation: Feature detection has traditionally focused on maximizing geometric repeatability. The paper argues that repeatability alone does not guarantee reliable matching, and that the learning objective must also account for how well the resulting descriptors discriminate between regions.
Introduction of Hard Negative-Constant Loss: A new loss function, the hard negative-constant (HardNegC) loss, is introduced. It treats the distance to the closest negative example as a constant, so that the training signal comes from the distance between matching regions while the hardest negative still sets the margin that distance must beat.
Affine Shape Estimator (AffNet): The paper presents AffNet, an affine shape estimator trained using the HardNegC loss function. Empirical results demonstrate that AffNet outperforms existing methods in image retrieval tasks and wide baseline stereo.
Independence from Precise Geometric Alignment: The proposed training method does not require exact geometric alignment of patches, facilitating more robust learning under varying conditions.
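
The hard negative-constant idea described in the list above can be illustrated with a minimal sketch. It assumes that row i of `anchors` and row i of `positives` are descriptors of the same region, that distances are Euclidean, and that the margin is 1.0; the function names and these choices are illustrative assumptions, not the authors' implementation. The sketch returns the loss together with its derivative with respect to each positive distance, and assigns no gradient to the negative distance because that term is treated as a constant.

```python
import math


def l2(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def hardnegc_loss(anchors, positives, margin=1.0):
    """Triplet-margin loss with in-batch hardest negatives (hypothetical helper).

    Returns (mean_loss, grad_wrt_positive_distance). The negative distance
    only sets the threshold; it is treated as a constant, so it receives
    no gradient in this sketch. Requires at least two pairs.
    """
    n = len(anchors)
    if n < 2 or n != len(positives):
        raise ValueError("need at least two matching anchor/positive pairs")

    # dist[i][j] = distance between anchor i and positive j.
    dist = [[l2(anchors[i], positives[j]) for j in range(n)] for i in range(n)]

    total = 0.0
    grad_pos = []
    for i in range(n):
        d_pos = dist[i][i]
        # Hardest negative: closest non-matching descriptor in row i or column i.
        d_neg = min(min(dist[i][j], dist[j][i]) for j in range(n) if j != i)
        violation = margin + d_pos - d_neg
        total += max(0.0, violation)
        # d(loss)/d(d_pos) is 1/n while the margin is violated, else 0.
        grad_pos.append(1.0 / n if violation > 0.0 else 0.0)
    return total / n, grad_pos


if __name__ == "__main__":
    a = [[0.0, 0.0], [1.0, 0.0]]
    p = [[0.0, 0.1], [1.0, 0.2]]
    print(hardnegc_loss(a, p))
```

In an automatic differentiation framework the same effect is usually obtained by applying a stop-gradient operation to the negative distance (for example `detach()` in PyTorch) before forming the margin term.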

Methodological Insights

The experimental section provides a rigorous assessment of the proposed methods, focusing on various factors influencing descriptor learning and matching:

Affine Parameterization: The study examines different parameterizations of affine transformation matrices and their impact on the performance of Convolutional Neural Networks (CNNs) in estimating local geometry. The adopted decomposition isolates the residual shape, that is, the part of the affine transformation that remains once other components such as scale and orientation are factored out.
Descriptor Optimization: The authors explore the interplay between geometric accuracy and descriptor matchability. Through meticulous experiments, they illustrate how descriptors can be optimized for better shape registration even when geometric correspondence is not strictly maintained.
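
To make the notion of a residual shape concrete, the following sketch factors a 2x2 affine matrix with positive determinant into an isotropic scale, a rotation, and a unit-determinant lower-triangular matrix. The function names, the rotation convention, and the lower-triangular form are assumptions chosen for illustration; the exact parameterization used in the paper may differ.

```python
import math


def decompose_affine(a11, a12, a21, a22):
    """Factor A = lam * R(angle) * L (hypothetical helper).

    R = [[c, -s], [s, c]] is a rotation and L = [[l11, 0], [l21, l22]]
    has determinant 1. Returns (lam, angle, (l11, l21, l22)).
    """
    det = a11 * a22 - a12 * a21
    if det <= 0.0:
        raise ValueError("expected a matrix with positive determinant")
    lam = math.sqrt(det)  # isotropic scale, so that det(A / lam) == 1
    b11, b12, b21, b22 = (v / lam for v in (a11, a12, a21, a22))

    # The second column of B = R @ L equals l22 * (-s, c).
    l22 = math.hypot(b12, b22)
    s, c = -b12 / l22, b22 / l22

    # L = R^T @ B; its upper-right entry vanishes by construction.
    l11 = c * b11 + s * b21
    l21 = -s * b11 + c * b21
    return lam, math.atan2(s, c), (l11, l21, l22)


def compose_affine(lam, angle, shape):
    """Inverse of decompose_affine: rebuild (a11, a12, a21, a22)."""
    l11, l21, l22 = shape
    c, s = math.cos(angle), math.sin(angle)
    return (
        lam * (c * l11 - s * l21),
        lam * (-s * l22),
        lam * (s * l11 + c * l21),
        lam * (c * l22),
    )


if __name__ == "__main__":
    lam, angle, shape = decompose_affine(2.0, 0.5, 0.3, 1.5)
    print(lam, angle, shape)
    print(compose_affine(lam, angle, shape))
```

Under this convention the residual shape has three free numbers, one of which is fixed by the unit-determinant constraint, so a shape estimator only needs to describe how a region is stretched and skewed, not its size or orientation.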

Empirical Evaluation

The performance evaluation includes both repeatability assessments and image retrieval benchmarks:

Repeatability: Experiments conducted on the HSequences dataset reveal that AffNet surpasses traditional Baumberg iterations in repeatability. Notably, AffNet maintains a high number of correspondences under challenging conditions including changes in viewpoint and illumination.
Image Retrieval: In the domain of image retrieval, the proposed methods yield superior results on Oxford5k and Paris6k benchmarks. The integration of AffNet in the local feature detection pipeline offers a marked improvement in mean average precision (mAP) over traditional methods.
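
For readers unfamiliar with the retrieval metric mentioned above, mean average precision can be sketched as follows. This is the plain textbook definition, assuming each query yields a ranked list of relevance flags; the official Oxford5k and Paris6k evaluation protocols have additional rules (such as ignoring certain images) that this sketch does not reproduce. The function names are illustrative.

```python
def average_precision(ranked_relevance, num_relevant=None):
    """Average precision for one query.

    ranked_relevance: iterable of booleans, best-ranked result first.
    num_relevant: total number of relevant items for the query; defaults
    to the number of relevant items present in the ranking.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    denom = hits if num_relevant is None else num_relevant
    return precision_sum / denom if denom else 0.0


def mean_average_precision(rankings):
    """Mean of per-query average precision over a list of rankings."""
    if not rankings:
        return 0.0
    return sum(average_precision(r) for r in rankings) / len(rankings)


if __name__ == "__main__":
    print(average_precision([True, False, True]))
    print(mean_average_precision([[True, False, True], [False, True]]))
```

A ranking that places all relevant images first scores 1.0, and each relevant image pushed further down the list lowers the score, which is why better local features can translate into higher mAP.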

Implications and Future Directions

This paper's findings have implications for both practical applications and theory in computer vision. Shifting the focus to descriptor discriminability opens new avenues for developing more robust and reliable feature matching algorithms. Future work could extend these techniques to other domains of computer vision, improve integration with global feature descriptors, or target real-time applications in dynamic environments.

In summary, this research challenges the practice of evaluating feature detectors by repeatability alone and proposes a practical, descriptor-driven alternative. The HardNegC loss function and AffNet together provide a template for future work on learning affine regions.
