
Swish-T: Enhancing Swish Activation with Tanh Bias for Improved Neural Network Performance

Published 1 Jul 2024 in cs.LG and cs.CV (arXiv:2407.01012v3)

Abstract: We propose the Swish-T family, an enhancement of the existing non-monotonic activation function Swish. Swish-T is defined by adding a Tanh bias to the original Swish function. This modification creates a family of Swish-T variants, each designed to excel in different tasks, showcasing specific advantages depending on the application context. The Tanh bias allows for broader acceptance of negative values during initial training stages, offering a smoother non-monotonic curve than the original Swish. We ultimately propose the Swish-T$_{\textbf{C}}$ function, while Swish-T and Swish-T$_{\textbf{B}}$, byproducts of Swish-T$_{\textbf{C}}$, also demonstrate satisfactory performance. Furthermore, our ablation study shows that using Swish-T$_{\textbf{C}}$ as a non-parametric function can still achieve high performance. The superiority of the Swish-T family has been empirically demonstrated across various models and benchmark datasets, including MNIST, Fashion MNIST, SVHN, CIFAR-10, and CIFAR-100. The code is publicly available at https://github.com/ictseoyoungmin/Swish-T-pytorch.
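For intuition, here is a minimal PyTorch sketch of the idea the abstract describes: a Swish term with a bounded tanh bias added to it. The specific form f(x) = x·σ(βx) + α·tanh(x), and the parameter names alpha and beta, are illustrative assumptions; the exact definitions of Swish-T and its B/C variants are given in the paper and the official repository linked above.

```python
import torch
import torch.nn as nn


class SwishTSketch(nn.Module):
    """Illustrative Swish-with-tanh-bias activation (not the authors' exact definition).

    Assumed form: f(x) = x * sigmoid(beta * x) + alpha * tanh(x).
    The tanh term shifts the curve so that more negative inputs pass
    through early in training, as the abstract describes.
    """

    def __init__(self, alpha: float = 0.1, beta: float = 1.0, trainable: bool = True):
        super().__init__()
        if trainable:
            # Learnable scalars, mirroring the parametric variants.
            self.alpha = nn.Parameter(torch.tensor(alpha))
            self.beta = nn.Parameter(torch.tensor(beta))
        else:
            # Fixed constants, mirroring the non-parametric ablation.
            self.register_buffer("alpha", torch.tensor(alpha))
            self.register_buffer("beta", torch.tensor(beta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Swish term plus a bounded tanh bias.
        return x * torch.sigmoid(self.beta * x) + self.alpha * torch.tanh(x)


if __name__ == "__main__":
    act = SwishTSketch(trainable=False)
    x = torch.linspace(-3.0, 3.0, steps=7)
    print(act(x))  # note the smooth non-monotonic dip for negative inputs
```

The module is a drop-in replacement for nn.SiLU in any network; consult the paper for the exact parameterizations of Swish-T$_{\textbf{B}}$ and Swish-T$_{\textbf{C}}$.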
