Stacked Denoising Autoencoder (SDAE)

Updated 10 March 2026

Stacked Denoising Autoencoder (SDAE) is a deep learning model that learns noise-resistant representations by reconstructing clean inputs from corrupted ones.
It uses two-stage training: greedy layer-wise pre-training with denoising autoencoders, followed by global fine-tuning to refine the learned features.
SDAEs are widely applied in fields like vision, speech, biomedical imaging, and financial forecasting to handle noisy or incomplete datasets.

A Stacked Denoising Autoencoder (SDAE) is a deep neural architecture designed for learning robust, hierarchically structured representations from corrupted data. SDAEs leverage layer-wise unsupervised pre-training, where each layer is trained as a denoising autoencoder to reconstruct clean inputs from noise-perturbed versions, followed by global fine-tuning for either supervised or unsupervised objectives. They have been influential in tasks where generalization from noisy, incomplete, or limited labeled data is critical, and are widely adapted across vision, speech, natural language, biomedical imaging, financial forecasting, and more.

1. Mathematical Formulation of the Stacked Denoising Autoencoder

A single denoising autoencoder (DA) consists of an encoder and decoder pair. Let $x \in [0,1]^d$ denote a clean input. The corruption process $q(\tilde{x}|x)$ generates a stochastically corrupted input $\tilde{x}$, often via masking noise (independently setting each $x_i$ to zero with probability $\nu$):

$q(\tilde{x}|x) = \prod_{i=1}^d [\nu \cdot \delta(\tilde{x}_i=0)+(1-\nu)\cdot\delta(\tilde{x}_i=x_i)]$
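The masking corruption above can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the cited papers; the function name `mask_corrupt`, the batch shape, and the value $\nu = 0.3$ are assumptions chosen for the example.

```python
import numpy as np

def mask_corrupt(x, nu, rng):
    """Masking noise: set each entry to zero independently with probability nu."""
    keep = rng.random(x.shape) >= nu      # True with probability 1 - nu
    return x * keep

rng = np.random.default_rng(0)
x = rng.random((1000, 50))                # hypothetical batch of inputs in [0, 1)
x_tilde = mask_corrupt(x, nu=0.3, rng=rng)
zero_fraction = np.mean(x_tilde == 0.0)   # should be close to nu
```

Each entry of `x_tilde` is either zero or identical to the corresponding entry of `x`, matching the two terms of the product in $q(\tilde{x}|x)$.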

The encoder maps the corrupted input to a hidden representation:

$h = f_\theta(\tilde{x}) = s(W\tilde{x} + b)$

where $W$ and $b$ are the layer's weight matrix and bias vector, and $s(\cdot)$ is typically the sigmoid nonlinearity $s(u)=1/(1+e^{-u})$.

The decoder reconstructs the clean input from $h$:

$z = g_{\theta'}(h) = s(W'h + b')$

with weights $W'$ and bias $b'$. The reconstruction loss, either squared error or cross-entropy, compares the reconstruction $z$ with the clean input $x$ (not with $\tilde{x}$) and is minimized over the parameters:

$L(x, z) = \begin{cases} \|x - z\|^2_2 & \text{(squared error)} \\ -\sum_{i=1}^d [ x_i \log z_i + (1-x_i) \log (1-z_i) ] & \text{(cross-entropy)} \end{cases}$
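A single DA forward pass and both reconstruction losses can be sketched as follows. This is an illustrative NumPy sketch under stated assumptions: the function names, the layer sizes ($d = 6$ inputs, 4 hidden units), the random initialization, and the `eps` clipping in the cross-entropy are choices made for the example, not details from the source.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def encode(x_tilde, W, b):
    return sigmoid(W @ x_tilde + b)            # h = s(W x~ + b)

def decode(h, W_dec, b_dec):
    return sigmoid(W_dec @ h + b_dec)          # z = s(W' h + b')

def squared_error(x, z):
    return np.sum((x - z) ** 2)

def cross_entropy(x, z, eps=1e-12):
    z = np.clip(z, eps, 1.0 - eps)             # assumption: clip to avoid log(0)
    return -np.sum(x * np.log(z) + (1.0 - x) * np.log(1.0 - z))

rng = np.random.default_rng(0)
d, k = 6, 4                                    # hypothetical input and hidden sizes
W, b = rng.normal(0.0, 0.1, (k, d)), np.zeros(k)
W_dec, b_dec = rng.normal(0.0, 0.1, (d, k)), np.zeros(d)

x = rng.random(d)                              # clean input in [0, 1)
x_tilde = x * (rng.random(d) >= 0.3)           # masking noise with nu = 0.3
z = decode(encode(x_tilde, W, b), W_dec, b_dec)

# Both losses are evaluated against the clean x, not the corrupted x_tilde.
loss_se, loss_ce = squared_error(x, z), cross_entropy(x, z)
```

The encoder sees only `x_tilde`, while the loss targets `x`; this asymmetry is what distinguishes a denoising autoencoder from a plain one.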

Greedily stacking $L$ such DAs forms the SDAE (here $L$ counts layers and is distinct from the loss $L(x, z)$): each layer is trained to reconstruct its clean input from a corrupted version, and the output of each encoder becomes the input to the next layer:

$h^{(\ell)} = f_{\theta^{(\ell)}}(h^{(\ell-1)}), \quad h^{(0)}=x$

(Chowdhury et al., 2018)
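The stacking recursion amounts to feeding each encoder's output into the next encoder. A minimal NumPy sketch, assuming sigmoid encoders and hypothetical layer widths (8, 5, 3) with random stand-in parameters:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def encode_stack(x, layers):
    """Return [h^(0), h^(1), ..., h^(L)] for a list of (W, b) encoder parameters."""
    hs = [x]                                   # h^(0) = x
    for W, b in layers:
        hs.append(sigmoid(W @ hs[-1] + b))     # h^(l) = s(W^(l) h^(l-1) + b^(l))
    return hs

rng = np.random.default_rng(0)
sizes = [8, 5, 3]                              # hypothetical: d = 8, then 5 and 3 hidden units
layers = [(rng.normal(0.0, 0.1, (m, n)), np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

x = rng.random(8)
hs = encode_stack(x, layers)
print([h.shape for h in hs])                   # [(8,), (5,), (3,)]
```

No corruption is applied in this forward pass; noise is only injected into a layer's input while that layer is being trained.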

After unsupervised pre-training, all layers are “unfolded” and optionally fine-tuned together with a supervised loss (e.g., softmax cross-entropy for classification).

2. Pre-training and Training Algorithms

The canonical SDAE training protocol consists of two stages:

1. Greedy Layer-wise Pre-training:

Each DA layer is trained in sequence while the previously trained layers are kept frozen.
Corruption is applied independently to each layer’s input (commonly $\nu = 0.2$ to $0.3$ for masking noise) (Chowdhury et al., 2018; Liang et al., 2021; Kalmanovich et al., 2014).

2. Fine-tuning:

After stacking, the encoders are combined (decoders can be discarded or used for autoencoding tasks).
For supervised contexts, a classifier (typically softmax for multi-class problems, or logistic regression for binary ones) is appended:

$\hat{y} = \text{softmax}(U h^{(L)} + c)$

The entire network is then fine-tuned using backpropagation and stochastic gradient descent (SGD), minimizing the negative log-likelihood or another task-appropriate loss, potentially with regularization such as L2 weight decay or dropout (Chowdhury et al., 2018; Kalmanovich et al., 2014; Liang et al., 2021).
Fine-tuning is critical for adapting layerwise-learned features to the end task and consistently yields improvements in generalization [1412
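The two-stage protocol can be sketched end to end on toy data. The following NumPy code is an illustrative sketch, not the procedure of any cited paper. Its assumptions: all function names are hypothetical, full-batch gradient descent stands in for SGD, decoder weights are untied, the data are synthetic binary patterns, and the hyperparameters are arbitrary. Weight decay and dropout are omitted.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def softmax(a):
    e = np.exp(a - a.max(axis=1, keepdims=True))     # shift for numerical stability
    return e / e.sum(axis=1, keepdims=True)

def pretrain_layer(data, n_hidden, nu, lr, epochs, rng):
    """Train one denoising autoencoder on `data`; return its encoder (W, b)."""
    n, d = data.shape
    W, b = rng.normal(0.0, 0.1, (n_hidden, d)), np.zeros(n_hidden)
    W_dec, b_dec = rng.normal(0.0, 0.1, (d, n_hidden)), np.zeros(d)
    for _ in range(epochs):
        x_tilde = data * (rng.random(data.shape) >= nu)   # fresh masking noise
        h = sigmoid(x_tilde @ W.T + b)
        z = sigmoid(h @ W_dec.T + b_dec)
        dz = (z - data) / n               # cross-entropy gradient at decoder pre-activation
        dh = (dz @ W_dec) * h * (1.0 - h)
        W_dec -= lr * dz.T @ h
        b_dec -= lr * dz.sum(axis=0)
        W -= lr * dh.T @ x_tilde
        b -= lr * dh.sum(axis=0)
    return W, b

def pretrain_stack(x, layer_sizes, nu=0.25, lr=0.5, epochs=200, seed=0):
    """Stage 1: train layers one at a time; earlier layers stay frozen."""
    rng = np.random.default_rng(seed)
    encoders, inp = [], x
    for n_hidden in layer_sizes:
        W, b = pretrain_layer(inp, n_hidden, nu, lr, epochs, rng)
        encoders.append((W, b))
        inp = sigmoid(inp @ W.T + b)      # uncorrupted code feeds the next layer
    return encoders

def finetune(x, y, encoders, n_classes, lr=0.5, epochs=300, seed=1):
    """Stage 2: append a softmax layer and update every layer by backpropagation."""
    encoders = list(encoders)             # leave the caller's list untouched
    rng = np.random.default_rng(seed)
    U = rng.normal(0.0, 0.1, (n_classes, encoders[-1][0].shape[0]))
    c = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]              # one-hot targets
    n = x.shape[0]
    for _ in range(epochs):
        acts = [x]
        for W, b in encoders:
            acts.append(sigmoid(acts[-1] @ W.T + b))
        p = softmax(acts[-1] @ U.T + c)
        delta = (p - Y) / n               # gradient of mean NLL w.r.t. the logits
        grad_h = delta @ U
        U -= lr * delta.T @ acts[-1]
        c -= lr * delta.sum(axis=0)
        for i in reversed(range(len(encoders))):
            W, b = encoders[i]
            d_pre = grad_h * acts[i + 1] * (1.0 - acts[i + 1])
            grad_h = d_pre @ W            # computed with W before its update
            encoders[i] = (W - lr * d_pre.T @ acts[i], b - lr * d_pre.sum(axis=0))
    return encoders, U, c

def predict(x, encoders, U, c):
    h = x
    for W, b in encoders:
        h = sigmoid(h @ W.T + b)
    return softmax(h @ U.T + c).argmax(axis=1)

# Toy data: two binary prototypes with roughly 10% of bits flipped.
rng = np.random.default_rng(42)
y = rng.integers(0, 2, size=200)
proto = np.array([[1, 1, 1, 1, 0, 0, 0, 0],
                  [0, 0, 0, 0, 1, 1, 1, 1]], dtype=float)
flip = (rng.random((200, 8)) < 0.1).astype(float)
x = np.abs(proto[y] - flip)

pretrained = pretrain_stack(x, layer_sizes=[6, 4])
tuned, U, c = finetune(x, y, pretrained, n_classes=2)
pred = predict(x, tuned, U, c)
```

The decoders are used only inside `pretrain_layer` and then discarded; only the encoder parameters and the appended softmax layer take part in fine-tuning.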


References

Chowdhury et al. (2018). On Stacked Denoising Autoencoder based Pre-training of ANN for Isolated Handwritten Bengali Numerals Dataset Recognition.

Liang et al. (2021). Training Stacked Denoising Autoencoders for Representation Learning.

Kalmanovich et al. (2014). Gradual training of deep denoising auto encoders.
