
Stacked Denoising Autoencoder (SDAE)

Updated 10 March 2026
  • Stacked Denoising Autoencoder (SDAE) is a deep learning model that learns noise-resistant representations by reconstructing clean inputs from corrupted ones.
  • It employs two-stage training: greedy layer-wise pre-training with denoising autoencoders, followed by global fine-tuning to refine features.
  • SDAEs are widely applied in fields like vision, speech, biomedical imaging, and financial forecasting to handle noisy or incomplete datasets.

A Stacked Denoising Autoencoder (SDAE) is a deep neural architecture designed for learning robust, hierarchically structured representations from corrupted data. SDAEs leverage layer-wise unsupervised pre-training, where each layer is trained as a denoising autoencoder to reconstruct clean inputs from noise-perturbed versions, followed by global fine-tuning for either supervised or unsupervised objectives. They have been influential in tasks where generalization from noisy, incomplete, or limited labeled data is critical, and have been adapted across vision, speech, natural language, biomedical imaging, and financial forecasting.

1. Mathematical Formulation of the Stacked Denoising Autoencoder

A single denoising autoencoder (DA) consists of an encoder and decoder pair. Let $x \in [0,1]^d$ denote a clean input. The corruption process $q(\tilde{x}|x)$ generates a stochastically corrupted input $\tilde{x}$, often via masking noise (independently setting each $x_i$ to zero with probability $\nu$):

$$q(\tilde{x}|x) = \prod_{i=1}^d \left[ \nu \cdot \delta(\tilde{x}_i=0)+(1-\nu)\cdot\delta(\tilde{x}_i=x_i) \right]$$
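The masking corruption described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `mask_corrupt`, the toy batch, and the value `nu=0.3` are assumptions chosen for the example.

```python
import numpy as np

def mask_corrupt(x, nu, rng):
    """Masking noise: zero each entry of x independently with probability nu."""
    # rng.random draws uniformly from [0, 1), so an entry is kept
    # with probability 1 - nu and zeroed with probability nu.
    keep = rng.random(x.shape) >= nu
    return x * keep

rng = np.random.default_rng(0)
x = rng.random((4, 8))                      # toy batch of clean inputs in [0, 1)
x_tilde = mask_corrupt(x, nu=0.3, rng=rng)  # corrupted copy; x itself is unchanged
```

Each entry of `x_tilde` is either zero or equal to the corresponding entry of `x`, which matches the two terms in the product above.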

The encoder maps the corrupted input to a hidden representation:

$$h = f_\theta(\tilde{x}) = s(W\tilde{x} + b)$$

where $W$, $b$ are layer parameters and $s(\cdot)$ is typically the sigmoid nonlinearity $s(u)=1/(1+e^{-u})$.

The decoder reconstructs the clean input from $h$:

$$z = g_{\theta'}(h) = s(W'h + b')$$

with weights $W'$, $b'$. The reconstruction loss, either squared error or cross-entropy, is minimized over the parameters:

$$L(x, z) = \begin{cases} \|x - z\|_2^2 & \text{(squared error)} \\ -\sum_{i=1}^d \left[ x_i \log z_i + (1-x_i) \log (1-z_i) \right] & \text{(cross-entropy)} \end{cases}$$
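As a rough sketch of the encoder, decoder, and the two reconstruction losses, the following NumPy code evaluates one forward pass on a toy batch. The helper names (`da_forward`, `squared_error`, `cross_entropy`), the layer sizes, the random initialization, and the clipping constant `eps` are all assumptions made for illustration. Inputs are stored as rows, so the code computes `x_tilde @ W.T` rather than `W @ x_tilde`.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def da_forward(x_tilde, W, b, W_prime, b_prime):
    """Encode the corrupted input, then decode to a reconstruction."""
    h = sigmoid(x_tilde @ W.T + b)              # h = s(W x~ + b)
    z = sigmoid(h @ W_prime.T + b_prime)        # z = s(W' h + b')
    return h, z

def squared_error(x, z):
    """Per-example squared L2 distance between clean input and reconstruction."""
    return np.sum((x - z) ** 2, axis=-1)

def cross_entropy(x, z, eps=1e-12):
    """Per-example cross-entropy; z is clipped to avoid log(0)."""
    z = np.clip(z, eps, 1.0 - eps)
    return -np.sum(x * np.log(z) + (1 - x) * np.log(1 - z), axis=-1)

rng = np.random.default_rng(0)
d, k = 8, 4                                     # input and hidden sizes (assumed)
x = rng.random((5, d))                          # clean inputs in [0, 1)
x_tilde = x * (rng.random(x.shape) >= 0.3)      # masking noise with nu = 0.3
W, b = rng.normal(0, 0.1, (k, d)), np.zeros(k)
W_prime, b_prime = rng.normal(0, 0.1, (d, k)), np.zeros(d)

h, z = da_forward(x_tilde, W, b, W_prime, b_prime)
loss_se = squared_error(x, z)                   # compared against the CLEAN x
loss_ce = cross_entropy(x, z)
```

Note that both losses compare the reconstruction with the clean input `x`, not with `x_tilde`; that is what makes the autoencoder denoising.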

Greedily stacking $L$ such DAs forms the SDAE: each layer is trained to reconstruct its clean input from corrupted versions, and the output of each encoder becomes the input to the next layer:

$$h^{(\ell)} = f_{\theta^{(\ell)}}(h^{(\ell-1)}), \quad h^{(0)}=x$$

(Chowdhury et al., 2018)
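The greedy stacking procedure can be sketched as follows. This is a simplified illustration under several assumptions not stated in the source: full-batch gradient descent, cross-entropy reconstruction loss, untied decoder weights, and hypothetical names and hyperparameters (`train_da`, `pretrain_stack`, `nu`, `lr`, `epochs`). The gradient step uses the fact that, for a sigmoid output with cross-entropy loss, the gradient with respect to the decoder pre-activation is `z - x`.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def train_da(x, n_hidden, nu=0.3, lr=0.1, epochs=50, seed=0):
    """Train one denoising autoencoder on x; return its encoder (W, b)."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W, b = rng.normal(0, 0.1, (n_hidden, d)), np.zeros(n_hidden)
    Wp, bp = rng.normal(0, 0.1, (d, n_hidden)), np.zeros(d)
    for _ in range(epochs):
        x_tilde = x * (rng.random(x.shape) >= nu)   # fresh masking noise each epoch
        h = sigmoid(x_tilde @ W.T + b)              # encode corrupted input
        z = sigmoid(h @ Wp.T + bp)                  # reconstruct
        dz = (z - x) / n                            # grad wrt decoder pre-activation
        dh = (dz @ Wp) * h * (1 - h)                # backprop through encoder sigmoid
        Wp -= lr * dz.T @ h
        bp -= lr * dz.sum(axis=0)
        W -= lr * dh.T @ x_tilde
        b -= lr * dh.sum(axis=0)
    return W, b

def pretrain_stack(x, hidden_sizes, **kw):
    """Greedy layer-wise pre-training: h^(l) = f(h^(l-1)), with h^(0) = x."""
    layers, h = [], x
    for k in hidden_sizes:
        W, b = train_da(h, k, **kw)     # this layer's "clean input" is h^(l-1)
        layers.append((W, b))
        h = sigmoid(h @ W.T + b)        # uncorrupted encoding feeds the next layer
    return layers, h

x = np.random.default_rng(1).random((32, 8))    # toy data in [0, 1)
layers, h_top = pretrain_stack(x, hidden_sizes=[6, 3])
```

Only the encoder parameters are returned from each layer here; the decoders are used during pre-training and then dropped, which is one of the options the two-stage protocol allows.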

After unsupervised pre-training, all layers are “unfolded” and optionally fine-tuned together with a supervised loss (e.g., softmax cross-entropy for classification).

2. Pre-training and Training Algorithms

The canonical SDAE training protocol consists of two stages:

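1. Greedy Layer-wise Pre-training:

  • Each layer is trained as a denoising autoencoder to reconstruct its clean input from a corrupted version.
  • Once a layer is trained, the output of its encoder becomes the input to the next layer.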

2. Fine-tuning:

  • After stacking, the encoders are combined (decoders can be discarded or used for autoencoding tasks).
  • For supervised contexts, a classifier (typically softmax for multi-class or logistic regression for binary classification) is appended:

$$\hat{y} = \text{softmax}(U h^{(L)} + c)$$
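A minimal sketch of the classifier head applied to the top-level representation is shown below. The function names, the number of classes, and the random stand-in for $h^{(L)}$ are assumptions for illustration; in practice $h^{(L)}$ would come from the pre-trained encoder stack, and $U$, $c$ would be learned during fine-tuning.

```python
import numpy as np

def softmax(a):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    a = a - a.max(axis=-1, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=-1, keepdims=True)

def classify(h_top, U, c):
    """y_hat = softmax(U h^(L) + c), with examples stored as rows."""
    return softmax(h_top @ U.T + c)

rng = np.random.default_rng(0)
n_classes, k = 3, 4                     # assumed sizes
h_top = rng.random((5, k))              # stand-in for the top encoder output h^(L)
U, c = rng.normal(0, 0.1, (n_classes, k)), np.zeros(n_classes)

y_hat = classify(h_top, U, c)           # one probability vector per example
```

During fine-tuning, the cross-entropy between `y_hat` and the labels would be backpropagated through both the head and all stacked encoders.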
