Neural Best-Buddies: Sparse Cross-Domain Correspondence (1805.04140v2)

Published 10 May 2018 in cs.CV

Abstract: Correspondence between images is a fundamental problem in computer vision, with a variety of graphics applications. This paper presents a novel method for sparse cross-domain correspondence. Our method is designed for pairs of images where the main objects of interest may belong to different semantic categories and differ drastically in shape and appearance, yet still contain semantically related or geometrically similar parts. Our approach operates on hierarchies of deep features, extracted from the input images by a pre-trained CNN. Specifically, starting from the coarsest layer in both hierarchies, we search for Neural Best Buddies (NBB): pairs of neurons that are mutual nearest neighbors. The key idea is then to percolate NBBs through the hierarchy, while narrowing down the search regions at each level and retaining only NBBs with significant activations. Furthermore, in order to overcome differences in appearance, each pair of search regions is transformed into a common appearance. We evaluate our method via a user study, in addition to comparisons with alternative correspondence approaches. The usefulness of our method is demonstrated using a variety of graphics applications, including cross-domain image alignment, creation of hybrid images, automatic image morphing, and more.

Citations (40)

Summary

The paper introduces a sparse correspondence method using Neural Best-Buddies to align image regions across diverse domains.
It leverages hierarchical deep features from a pre-trained VGG-19 to identify mutual nearest neighbors and refine semantic alignments.
The method outperforms classical descriptors in cross-domain matching, enabling automated applications such as image morphing and semantic hybridization.

Sparse Cross-Domain Correspondence Using Neural Best-Buddies

The paper "Neural Best-Buddies: Sparse Cross-Domain Correspondence" addresses a fundamental problem in computer vision: establishing correspondences between a pair of images, particularly when the main objects differ significantly in shape, appearance, or semantic category. Traditional correspondence techniques assume that the two images depict similar objects or scenes. This method targets the cross-domain case, where that assumption does not hold.

Methodology Overview

The core contribution of this work is the introduction of a sparse correspondence technique based on the concept of Neural Best Buddies (NBB). The technique leverages hierarchies of deep features derived from a pre-trained Convolutional Neural Network (CNN) classifier, exploiting its layers that encode diverse levels of semantic and geometrical information. The correspondence formulation operates across this hierarchy, beginning from a coarse level in feature representation, advancing towards finer levels while refining search regions and focusing on significant neural activations.
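The coarse-to-fine narrowing described above can be illustrated with a small sketch. It maps a matched neuron at a coarse level to a clipped search window at the next finer level. The function name `finer_search_region`, the factor-of-two resolution change between levels, and the fixed window `radius` are assumptions made for illustration; the paper derives its search regions from the network's receptive fields, which this sketch does not reproduce.

```python
import numpy as np

def finer_search_region(i, j, fine_shape, radius=2, scale=2):
    """Return (row_slice, col_slice) of the window in the finer map that
    corresponds to coarse neuron (i, j).

    Assumptions (not from the paper): each level doubles the resolution
    (scale=2) and the window has a fixed half-width `radius`.
    """
    h, w = fine_shape
    ci, cj = i * scale, j * scale  # centre of the window in the finer map
    rows = slice(max(ci - radius, 0), min(ci + radius + 1, h))
    cols = slice(max(cj - radius, 0), min(cj + radius + 1, w))
    return rows, cols

# Usage: crop the search window out of a (H, W, C) feature map.
fine_map = np.zeros((16, 16, 8))
rows, cols = finer_search_region(3, 4, fine_map.shape[:2])
window = fine_map[rows, cols]  # shape (5, 5, 8) away from the border
```

Restricting the search to such a window at each level is what keeps the matching sparse and local as the resolution increases.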

In practical terms, the methodology can be described in a few primary stages:

Feature Hierarchy Construction: Images are processed through a pre-trained VGG-19 network to obtain hierarchical feature maps, capturing various levels of semantic information.
Mutual Nearest Neighbors and NBB Identification: At each level, the method searches the two images' feature maps for pairs of neurons that are each other's nearest neighbor. These Neural Best Buddies form a sparse set of correspondences that reflects deep semantic similarity.
Region Transformation for Appearance Disparities: Each pair of search regions is transformed into a common appearance before matching, so that patches can be compared even when the two images look very different, which is a central difficulty in cross-domain matching.
Hierarchical Percolation: NBBs are percolated through the hierarchy from coarse to fine. At each level the search regions are narrowed and only NBBs with significant activations are retained, which improves localization as the receptive fields shrink.
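The matching stages above can be sketched for a single pair of search regions as follows. This is a simplified illustration and not the authors' implementation: features are flattened to `(N, C)` arrays, similarity is plain cosine similarity between single neurons rather than a patch-based correlation, the common appearance is assumed to be the average of the two regions' per-channel mean and standard deviation, and the activation threshold `gamma` is a hypothetical parameter.

```python
import numpy as np

def common_appearance(fp, fq, eps=1e-5):
    """Re-normalize both regions to shared per-channel statistics.
    Assumption: the shared statistics are the average of the two regions'."""
    mu_p, mu_q = fp.mean(axis=0), fq.mean(axis=0)
    sd_p, sd_q = fp.std(axis=0) + eps, fq.std(axis=0) + eps
    mu_m, sd_m = (mu_p + mu_q) / 2, (sd_p + sd_q) / 2
    cp = (fp - mu_p) / sd_p * sd_m + mu_m
    cq = (fq - mu_q) / sd_q * sd_m + mu_m
    return cp, cq

def mutual_nearest_neighbors(fp, fq):
    """Return index pairs (p, q) where p's nearest neighbour in fq is q
    and q's nearest neighbour in fp is p (cosine similarity)."""
    p = fp / (np.linalg.norm(fp, axis=1, keepdims=True) + 1e-8)
    q = fq / (np.linalg.norm(fq, axis=1, keepdims=True) + 1e-8)
    sim = p @ q.T                      # (N_p, N_q) similarity matrix
    nn_pq = sim.argmax(axis=1)         # best q for every p
    nn_qp = sim.argmax(axis=0)         # best p for every q
    idx_p = np.arange(fp.shape[0])
    mutual = nn_qp[nn_pq] == idx_p     # p -> q -> p round trip
    return np.stack([idx_p[mutual], nn_pq[mutual]], axis=1)

def keep_significant(pairs, fp, fq, gamma=0.05):
    """Keep pairs whose neurons both have a normalized activation above gamma.
    Activation is taken here as the feature norm rescaled to [0, 1]."""
    def normalized_activation(f):
        a = np.linalg.norm(f, axis=1)
        return (a - a.min()) / (a.max() - a.min() + 1e-8)
    ap, aq = normalized_activation(fp), normalized_activation(fq)
    keep = (ap[pairs[:, 0]] > gamma) & (aq[pairs[:, 1]] > gamma)
    return pairs[keep]

# Usage on random stand-in features for two search regions.
rng = np.random.default_rng(0)
fp, fq = rng.normal(size=(25, 64)), rng.normal(loc=3.0, size=(25, 64))
cp, cq = common_appearance(fp, fq)
pairs = keep_significant(mutual_nearest_neighbors(cp, cq), fp, fq)
```

The mutual check is what makes the result sparse: a neuron with no reciprocal match is simply dropped instead of being forced onto its closest candidate.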

Evaluation and Results

The method is evaluated on several fronts. First, classical descriptors such as SIFT and SURF fail under substantial appearance variability, whereas the proposed method still identifies semantically meaningful correspondences. Second, when compared with state-of-the-art dense correspondence methods, Neural Best Buddies perform better in cross-domain scenarios, as illustrated by successful alignment and matching of object parts across different semantic categories.

Furthermore, in a user study, the correspondences produced by the method showed a high degree of agreement with human annotations, indicating its effectiveness in intuitive semantic matching tasks. The method's robustness is also assessed quantitatively on intra-class benchmarks, where it achieves a significant percentage of correct keypoint transfers.
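For reference, a percentage-of-correct-keypoints score is typically computed along the following lines. This is a generic sketch, not the paper's evaluation code: the function name `pck`, the threshold factor `alpha`, and the choice of normalizing by the longer image side are assumptions, and benchmarks differ on these details.

```python
import numpy as np

def pck(pred, gt, img_size, alpha=0.1):
    """Fraction of transferred keypoints that land within
    alpha * max(image height, image width) of the ground truth."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    thresh = alpha * max(img_size)
    dist = np.linalg.norm(pred - gt, axis=1)  # per-keypoint pixel error
    return float((dist <= thresh).mean())

# Usage: two keypoints, one within the 10-pixel tolerance and one far off.
score = pck([[10, 10], [50, 50]], [[12, 10], [90, 50]], img_size=(100, 100))
```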

Implications and Applications

The practicality of this method extends to diverse graphics applications including automatic image morphing and semantic hybridization. Applying the correspondence retrieved by NBBs allows fully automated image morph sequences and facilitates segment extraction for semantic hybridization of image components, potentially transforming the image editing process by minimizing manual intervention.

Theoretically, this research enriches the understanding of multi-level feature utilization in neural networks, emphasizing the significance of high-level features for abstract correspondence beyond photometric constraints. Practically, it introduces an automated, robust solution to a class of problems previously reliant on user input, providing a foundation for future explorations in multi-domain visual computing challenges.

Future Directions

Future work could test whether the framework adapts to networks trained for tasks other than classification, which would generalize the concept of NBBs. Addressing the identified limitations related to geometric dissimilarity, and extending the method to co-analysis across multiple images, are further avenues for improving its applicability in complex visual environments.

In conclusion, this paper presents a robust framework for sparse cross-domain correspondence, facilitating new advancements in the application of deep learning to complex visual matching tasks.