Beyond One-hot Encoding: lower dimensional target embedding (1806.10805v1)

Published 28 Jun 2018 in cs.CV and cs.AI

Abstract: Target encoding plays a central role when learning Convolutional Neural Networks. In this realm, One-hot encoding is the most prevalent strategy due to its simplicity. However, this so widespread encoding schema assumes a flat label space, thus ignoring rich relationships existing among labels that can be exploited during training. In large-scale datasets, data does not span the full label space, but instead lies in a low-dimensional output manifold. Following this observation, we embed the targets into a low-dimensional space, drastically improving convergence speed while preserving accuracy. Our contribution is two fold: (i) We show that random projections of the label space are a valid tool to find such lower dimensional embeddings, boosting dramatically convergence rates at zero computational cost; and (ii) we propose a normalized eigenrepresentation of the class manifold that encodes the targets with minimal information loss, improving the accuracy of random projections encoding while enjoying the same convergence rates. Experiments on CIFAR-100, CUB200-2011, Imagenet, and MIT Places demonstrate that the proposed approach drastically improves convergence speed while reaching very competitive accuracy rates.

Citations (308)

Summary

  • The paper proposes using lower-dimensional embeddings instead of one-hot encoding for CNN targets to leverage latent label relationships and improve training efficiency.
  • The embeddings are obtained through random projections, which speed up convergence at no extra cost, and through normalized eigenrepresentations, which capture the data's class structure to improve accuracy.
  • Experiments show that the methods lead to faster CNN convergence, especially with small mini-batches, and that the data-dependent encoding further improves accuracy, offering practical benefits.

Beyond One-hot Encoding: Lower Dimensional Target Embedding

This paper presents an advanced approach for target encoding in Convolutional Neural Networks (CNNs), challenging the traditionally dominant one-hot encoding method. The authors propose embedding targets into a low-dimensional space, which improves convergence speed while preserving accuracy. Their contributions include the use of random projections of the label space and a normalized eigenrepresentation of the class manifold to encode targets effectively.

In multi-class classification tasks, especially those with large output spaces, one-hot encoding can become inadequate: it ignores inherent label correlations, and the size of the output layer grows linearly with the number of classes. This work explores the potential of low-dimensional embeddings to address these limitations. The authors argue that large-scale datasets tend to lie in a low-dimensional output manifold rather than spanning the full label space; hence, more efficient target embeddings can leverage these latent relationships, improving convergence speed without sacrificing accuracy.
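
To make the idea concrete, here is a minimal PyTorch sketch (not the authors' code) of training against embedded targets instead of one-hot vectors; the codeword matrix E, the toy backbone, and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn as nn

num_classes, embed_dim = 100, 32          # e.g. CIFAR-100 with a 32-d target space
E = torch.randn(num_classes, embed_dim)   # one codeword (embedded target) per class
E = E / E.norm(dim=1, keepdim=True)       # unit-norm codewords

# A toy head standing in for a CNN: the output layer has embed_dim units
# instead of num_classes, so it shrinks with the embedding dimension.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256),
                      nn.ReLU(), nn.Linear(256, embed_dim))
criterion = nn.MSELoss()                  # regress onto the target codewords

images = torch.randn(8, 3, 32, 32)        # dummy batch of images
labels = torch.randint(0, num_classes, (8,))
loss = criterion(model(images), E[labels])  # targets are embedded labels, not one-hot
loss.backward()
```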

Two primary methods for achieving these embeddings are discussed:

  1. Random Projections: Random projections of the label space yield lower dimensional embeddings that significantly boost convergence rates at no additional computational cost.
  2. Normalized Eigenrepresentation: This method uses the spectral properties of a class-similarity graph, encoding targets with minimal information loss for improved accuracy. It harnesses the underlying manifold structure, which is particularly useful in datasets with complex inter-class relationships (both constructions are sketched below).
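
The two constructions could look roughly as follows. This is an illustrative NumPy sketch, not the paper's exact recipe: the class-similarity matrix S is built here from random placeholder prototypes, and the precise normalization is an assumption.

```python
import numpy as np

num_classes, embed_dim = 100, 32

# (1) Random projection: compress the one-hot label space with a random
# Gaussian matrix; the 1/sqrt(embed_dim) scaling roughly preserves pairwise
# distances (Johnson-Lindenstrauss style) at essentially zero cost.
R = np.random.randn(num_classes, embed_dim) / np.sqrt(embed_dim)
random_codes = np.eye(num_classes) @ R        # row k is the codeword for class k

# (2) Normalized eigenrepresentation: start from a symmetric class-similarity
# matrix S (a placeholder here), normalize it, and keep its leading
# eigenvectors as codewords that reflect inter-class structure.
proto = np.random.randn(num_classes, 64)      # placeholder per-class descriptors
S = proto @ proto.T                           # symmetric similarity graph
d = np.abs(S).sum(axis=1)
S_norm = S / np.sqrt(np.outer(d, d))          # symmetric normalization
eigvals, eigvecs = np.linalg.eigh(S_norm)     # eigenvalues in ascending order
eigen_codes = eigvecs[:, -embed_dim:]         # top-embed_dim eigenvectors as codes
```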

The experiments conducted on CIFAR-100, CUB200-2011, ImageNet, and MIT Places validate the proposed methods. The results demonstrate faster convergence than traditional one-hot encoding, particularly in scenarios with small mini-batch sizes. Furthermore, data-dependent encodings, obtained from eigenrepresentations of the class similarity graph, are shown to enhance accuracy, underscoring their ability to capture discriminative class-structure information.

The implications of this work are significant both theoretically and practically. The integration of Error-Correcting Output Codes (ECOC) demonstrates the adaptability of CNN architectures to different encoding strategies, which can generalize across various classification tasks without necessitating architectural adjustments. Practically, this approach can lead to more efficient model training, reduced parameter spaces, and enhanced adaptability to new tasks with minimal retraining, making it particularly attractive for real-time applications and resource-constrained environments.
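
Under this ECOC-style view, inference amounts to decoding the network's output to the nearest class codeword. A hedged sketch follows; the codewords tensor is assumed to be whatever embedding matrix was used during training.

```python
import torch

def decode(outputs, codewords):
    """Assign each output vector to the class with the closest codeword."""
    # outputs: (batch, embed_dim); codewords: (num_classes, embed_dim)
    dists = torch.cdist(outputs, codewords)   # pairwise Euclidean distances
    return dists.argmin(dim=1)                # predicted class index per sample

codewords = torch.randn(100, 32)              # stand-in for the constructed codes
predictions = decode(torch.randn(8, 32), codewords)
```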

In summary, this research advocates for a paradigm shift in target encoding for deep learning models, urging the community to move beyond the dominance of one-hot encoding. By exploiting the intrinsic geometric properties of label spaces, the proposed methods pave the way for more efficient and flexible models. Future work could further explore the integration of such embeddings in more complex network architectures or novel applications in other domains of artificial intelligence.
