- The paper introduces a novel hybrid framework that integrates feature generation and contrastive embedding to improve discrimination in generalized zero-shot learning.
- It leverages both instance- and class-level contrastive learning to mitigate bias towards seen classes and boost recognition accuracy for unseen classes.
- Empirical evaluations on benchmarks such as AWA1 and CUB show that the approach competes favorably against state-of-the-art methods.
Contrastive Embedding for Generalized Zero-Shot Learning: A Comprehensive Overview
The research paper "Contrastive Embedding for Generalized Zero-Shot Learning" presents a novel framework designed to tackle Generalized Zero-Shot Learning (GZSL) by addressing the limitations of feature generation and embedding models traditionally used in this domain. GZSL is characterized by the challenge of recognizing both seen and unseen classes when only labeled data from seen classes is available during training. The authors propose a hybrid framework that integrates both feature generation models and embedding models to improve the discriminative capability required for effective GZSL.
Background and Motivations
GZSL extends Zero-Shot Learning (ZSL): whereas ZSL evaluates only on classes unseen during training, GZSL requires recognizing both seen and unseen classes at test time. Conventional ZSL approaches map visual features into a semantic embedding space derived from class-descriptive attributes or word vectors, and classify by finding the nearest class embedding. While these methods perform reasonably when the test set is restricted to unseen classes, they exhibit a strong bias towards seen classes in GZSL tasks, where both seen and unseen classes are present.
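The attribute-space classification scheme described above can be sketched as follows. This is a minimal illustration, not the paper's model: the projection `W`, the dimensions, and the random data are all hypothetical placeholders (in practice `W` would be learned from seen-class data).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 5 classes described by 10-dim attribute vectors,
# visual features are 32-dim. W is the visual-to-semantic mapping a ZSL
# model would learn; here it is random purely for illustration.
n_classes, attr_dim, feat_dim = 5, 10, 32
class_attrs = rng.normal(size=(n_classes, attr_dim))  # per-class semantic descriptors
W = rng.normal(size=(feat_dim, attr_dim))             # visual-to-semantic projection

def predict(visual_feat):
    """Project a visual feature into the attribute space and return the
    index of the class whose attribute vector is closest (cosine similarity)."""
    z = visual_feat @ W
    sims = (class_attrs @ z) / (
        np.linalg.norm(class_attrs, axis=1) * np.linalg.norm(z) + 1e-8)
    return int(np.argmax(sims))

x = rng.normal(size=feat_dim)
pred = predict(x)  # an index in [0, n_classes)
```

Because unseen classes only ever appear as rows of `class_attrs`, a mapping trained solely on seen-class pairs tends to score seen classes higher at test time, which is the seen-class bias the paper targets.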
Feature generation strategies have been introduced to mitigate this bias. These methods use generative models to synthesize visual features for unseen classes from their semantic descriptors, yielding a more balanced training set for a classifier that covers both seen and unseen instances. However, the authors argue that synthesizing features in the original visual feature space still leaves the model without adequate discriminative power for GZSL, because that space is not learned to maximize class separability.
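Conditional feature synthesis can be sketched as below. This is a hedged illustration under simplifying assumptions: the weights `W1`/`W2` stand in for a trained generator (in the actual method they would come from adversarial training against a discriminator), and all dimensions are invented.

```python
import numpy as np

rng = np.random.default_rng(1)

attr_dim, noise_dim, feat_dim = 10, 16, 32

# Placeholder "trained" generator weights; a real feature-generating GZSL
# model learns these adversarially on seen-class data.
W1 = rng.normal(scale=0.1, size=(attr_dim + noise_dim, 64))
W2 = rng.normal(scale=0.1, size=(64, feat_dim))

def synthesize(class_attr, n_samples):
    """Generate n_samples synthetic visual features conditioned on a class
    attribute vector: G(z, a) with noise z ~ N(0, I)."""
    z = rng.normal(size=(n_samples, noise_dim))
    a = np.tile(class_attr, (n_samples, 1))
    h = np.maximum(0, np.concatenate([a, z], axis=1) @ W1)  # ReLU hidden layer
    return h @ W2

unseen_attr = rng.normal(size=attr_dim)      # descriptor of an unseen class
fake_feats = synthesize(unseen_attr, 100)    # shape (100, 32)
```

Synthetic features like `fake_feats` are pooled with real seen-class features so the final classifier sees training examples for every class.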
Proposed Framework: Hybrid GZSL with Contrastive Embedding
The authors propose enhancing GZSL classification by integrating a feature generation model and an embedding model into a single hybrid framework. The core innovation is a contrastive embedding that exploits both instance-wise and class-wise supervision:
- Feature Generation Model: The generator synthesizes visual features for unseen classes using learned mappings from semantic descriptors, supplemented by discriminative signals from a paired discriminator.
- Embedding Model: Instead of relying solely on conventional semantic embeddings, a contrastive embedding is introduced. It combines instance-level and class-level contrastive learning to sharpen the discriminative power of the embeddings in the new embedding space.
The contrastive embedding applies a non-linear projection that maps features into a new space in which both real and synthetic instances exhibit stronger class separability. This is achieved through contrastive losses that pull each embedding toward its positives (e.g., embeddings of the same instance, or the semantic descriptor of its class) and push it away from negatives drawn from other classes.
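The instance-level part of such an objective can be sketched with an InfoNCE-style loss. This is a generic contrastive-loss illustration under assumed inputs, not the paper's exact formulation; the embeddings and temperature `tau` are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss on unit-normalized embeddings:
    pull the positive toward the anchor, push the negatives away."""
    def norm(v):
        return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-8)
    a, p, n = norm(anchor), norm(positive), norm(negatives)
    pos = np.exp(a @ p / tau)        # similarity to the positive pair
    neg = np.exp(n @ a / tau).sum()  # similarities to all negatives
    return -np.log(pos / (pos + neg))

emb_dim = 16
anchor = rng.normal(size=emb_dim)
close = anchor + 0.01 * rng.normal(size=emb_dim)  # near-duplicate: easy positive
far = rng.normal(size=(8, emb_dim))               # unrelated embeddings as negatives

loss_good = info_nce(anchor, close, far)                           # aligned positive
loss_bad = info_nce(anchor, far[0], np.vstack([close, far[1:]]))   # mismatched positive
```

A well-aligned positive yields a much smaller loss than a mismatched one (`loss_good < loss_bad`), which is exactly the gradient signal that reshapes the embedding space toward class separability. The class-level variant replaces the instance positive with the class's semantic descriptor embedding.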
Empirical Evaluation and Results
The authors evaluate the proposed CE-GZSL framework on five benchmark datasets: AWA1, AWA2, CUB, FLO, and SUN. The empirical analysis indicates that the approach outperforms or competes favorably with state-of-the-art methods, particularly in challenging settings with diverse seen and unseen classes. Notably, the framework yields substantial improvements on AWA1 and CUB.
Implications and Future Directions
By constructing a hybrid framework that combines feature generation with contrastive embedding, the research provides a new avenue for overcoming the data imbalance and seen-class bias inherent in GZSL. The approach improves class separation in the embedding space and enhances the model's generalization across the evaluated datasets.
Future research may further enhance the embedding space with advanced regularization techniques or explore alternative neural architectures for the embedding functions. Applying similar hybrid methods to few-shot learning, or extending them to broader transfer learning settings, could also yield valuable insights.
In conclusion, integrating contrastive embeddings within a feature-generating hybrid model presents a promising paradigm for strengthening GZSL frameworks. The methodology points toward more general and scalable solutions to zero-shot tasks, reinforcing the value of combining contrastive learning principles with generative feature modeling.