Regularization via Structural Label Smoothing (2001.01900v2)

Published 7 Jan 2020 in cs.LG and stat.ML

Abstract: Regularization is an effective way to promote the generalization performance of machine learning models. In this paper, we focus on label smoothing, a form of output distribution regularization that prevents overfitting of a neural network by softening the ground-truth labels in the training data in an attempt to penalize overconfident outputs. Existing approaches typically use cross-validation to impose this smoothing, which is uniform across all training data. In this paper, we show that such label smoothing imposes a quantifiable bias in the Bayes error rate of the training data, with regions of the feature space with high overlap and low marginal likelihood having a lower bias and regions of low overlap and high marginal likelihood having a higher bias. These theoretical results motivate a simple objective function for data-dependent smoothing to mitigate the potential negative consequences of the operation while maintaining its desirable properties as a regularizer. We call this approach Structural Label Smoothing (SLS). We implement SLS and empirically validate on synthetic, Higgs, SVHN, CIFAR-10, and CIFAR-100 datasets. The results confirm our theoretical insights and demonstrate the effectiveness of the proposed method in comparison to traditional label smoothing.
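As a point of reference for the "traditional label smoothing" the abstract compares against, the following is a minimal sketch of uniform smoothing, assuming the common formulation in which a one-hot target is mixed with a uniform distribution over the K classes. The function names and the smoothing strength `alpha` are illustrative assumptions, not taken from the paper, and the sketch does not implement SLS itself, whose data-dependent objective is not given in the abstract.

```python
import numpy as np

def smooth_labels(labels, num_classes, alpha=0.1):
    """Uniform label smoothing (hypothetical helper, not the paper's code).

    Each one-hot target y becomes (1 - alpha) * y + alpha / num_classes,
    with the same alpha applied to every training example.
    """
    one_hot = np.eye(num_classes)[np.asarray(labels)]
    return (1.0 - alpha) * one_hot + alpha / num_classes

def cross_entropy(probs, targets, eps=1e-12):
    """Mean cross-entropy between predicted probabilities and soft targets."""
    return float(-np.mean(np.sum(targets * np.log(probs + eps), axis=1)))

# Two examples, three classes; alpha = 0.1 is an assumed value.
soft_targets = smooth_labels([0, 2], num_classes=3, alpha=0.1)

# A fully confident, correct prediction has a strictly positive loss
# against the smoothed targets, which is how overconfidence is penalized.
confident = np.eye(3)[[0, 2]]
loss_on_confident = cross_entropy(confident, soft_targets)
```

Because `alpha` is identical for every example here, this corresponds to the uniform scheme the abstract describes; a data-dependent variant would instead let the smoothing strength vary across regions of the feature space.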

Authors (3)

Weizhi Li (14 papers)
Gautam Dasarathy (38 papers)
Visar Berisha (34 papers)

Citations (51)
