Relational Sparse Autoencoder (RSAE)
- RSAE is an autoencoder architecture that integrates graph-based and sparsity constraints to preserve relational structure in data.
- It optimizes a composite objective combining a reconstruction loss, a relational graph loss, and a KL-divergence or $\ell_1$ sparsity penalty.
- Empirical studies show that RSAE reduces reconstruction error and improves interpretability on tasks such as image feature extraction, knowledge graph completion, and biological matrix recovery.
A Relational Sparse Autoencoder (RSAE) is an autoencoder architecture that explicitly encodes sample-to-sample or entity-to-entity relationships within its feature learning objective, while enforcing sparsity constraints on the learned representations. This approach extends conventional autoencoders by integrating graph- or similarity-based regularization terms alongside standard reconstruction and sparsity penalties. RSAEs are particularly effective in domains where the preservation of relational or manifold structure is critical for robust feature extraction, knowledge base completion, or matrix completion in sparse settings.
1. Model Structures and Variants
RSAEs generalize the basic sparse autoencoder by augmenting it with relational regularization over sample or entity relationships. There are three principal instantiations documented in the literature:
- Sample-Graph Regularization for Feature Extraction: The architecture consists of an encoder mapping an input matrix $X$ to hidden activations $H$ and a decoder reconstructing $\hat{X}$ from $H$. A fixed similarity matrix $S$ (e.g., from rectified inner products, k-NN, or RBF kernels) encodes which pairs of data samples are similar and should have proximate encodings. Training minimizes three losses: a data reconstruction loss, a relational graph loss that penalizes distant hidden representations for similar samples, and a Kullback-Leibler (KL) divergence sparsity penalty on the mean activation of each code unit (Meng et al., 2018).
- Relation Coding for Knowledge Base Completion: Each relation is parameterized as a full matrix $R$ and encoded via a sparse autoencoder: an encoder projects $\mathrm{vec}(R)$ into a lower-dimensional sparse code (typically via ReLU activations), and a decoder reconstructs $R$. The resulting relational codes reconstruct relations in vectorized form and serve as interpretable, compositional features. The learning objective incorporates autoencoding, an $\ell_1$ sparsity penalty, and optionally compositional constraints such as $R_k \approx R_i R_j$ for composed relations (Takahashi et al., 2018).
- Dual-Regularized Matrix Completion for Sparse Biological Data: In matrix completion settings such as drug–target or drug–disease association matrices, a shallow autoencoder reconstructs the sparse matrix using row- and column-side weight matrices, each regularized to stay close to external similarity matrices (e.g., chemical or sequence similarity). The relational structure is enforced both in the reconstruction step and via graph Laplacian regularization, effectively yielding an RSAE with a linear, closed-form solution (Poleksic, 2024).
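The sample-graph variant above depends on a precomputed similarity matrix. A minimal numpy sketch of one common construction (k-NN neighborhoods with RBF weights; the function name and defaults are illustrative, not taken from the cited papers):

```python
import numpy as np

def rbf_similarity(X, k=5, gamma=1.0):
    """Build a k-NN similarity matrix with RBF weights.

    X: (n_samples, n_features) data matrix.
    Returns a symmetric (n, n) matrix S in which S[i, j] > 0 only when
    j is among the k nearest neighbours of i (or vice versa).
    """
    n = X.shape[0]
    # Pairwise squared Euclidean distances via the expansion
    # ||x_i - x_j||^2 = ||x_i||^2 + ||x_j||^2 - 2 x_i . x_j.
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    np.fill_diagonal(d2, np.inf)          # exclude self-similarity
    S = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d2[i])[:k]      # k nearest neighbours of sample i
        S[i, nbrs] = np.exp(-gamma * d2[i, nbrs])
    return np.maximum(S, S.T)             # symmetrize
```

The symmetrization step matters: the graph Laplacian regularizer assumes an undirected similarity graph, so one-sided neighbor relations are made mutual.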
2. Mathematical Objectives and Regularization
The generic RSAE objective combines three terms:

$$\mathcal{L} = \mathcal{L}_{\mathrm{rec}} + \lambda \mathcal{L}_{\mathrm{rel}} + \beta \mathcal{L}_{\mathrm{sp}}$$

where:
- $\mathcal{L}_{\mathrm{rec}}$: Reconstruction loss ($\|X - \hat{X}\|_F^2$)
- $\mathcal{L}_{\mathrm{rel}}$: Relational loss, typically of graph Laplacian form ($\tfrac{1}{2}\sum_{i,j} S_{ij}\,\|h_i - h_j\|^2 = \operatorname{tr}(H^{\top} L H)$ with $L = D - S$, $D = \operatorname{diag}(S\mathbf{1})$)
- $\mathcal{L}_{\mathrm{sp}}$: Sparsity penalty, such as the sum of KL divergences $\sum_j \operatorname{KL}(\rho \,\|\, \hat{\rho}_j)$ between a target activation level $\rho$ and the mean activation $\hat{\rho}_j$ of each hidden unit, or an $\ell_1$ term on codes

Hyperparameters $\lambda$ and $\beta$ control the relative strength of relational and sparsity regularization.
Alternative relational objectives include compositional constraints between relation matrices ($\|R_k - R_i R_j\|^2$ for composed relations) and dual-side regularization in matrix completion (graph Laplacian penalties on both the row- and column-side weight matrices).
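The three terms can be written out directly. The numpy sketch below assumes sigmoid-style hidden activations $H$ in $(0, 1)$ (so the KL term is well defined) and a symmetric similarity matrix $S$; the function name and default weights are illustrative:

```python
import numpy as np

def rsae_objective(X, X_hat, H, S, rho=0.05, lam=0.1, beta=0.01):
    """Composite RSAE loss: reconstruction + graph Laplacian + KL sparsity.

    H: (n, d) hidden activations, assumed to lie strictly in (0, 1).
    S: (n, n) symmetric similarity matrix with zero diagonal.
    """
    # Reconstruction term: squared Frobenius error.
    rec = np.sum((X - X_hat) ** 2)
    # Relational term: tr(H^T L H) with L = D - S, which equals
    # 0.5 * sum_ij S_ij ||h_i - h_j||^2.
    L = np.diag(S.sum(axis=1)) - S
    rel = np.trace(H.T @ L @ H)
    # Sparsity term: KL(rho || rho_hat_j) summed over hidden units,
    # where rho_hat_j is the mean activation of unit j over the batch.
    rho_hat = H.mean(axis=0)
    kl = np.sum(rho * np.log(rho / rho_hat)
                + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
    return rec + lam * rel + beta * kl
```

The trace form of the relational term is the one typically differentiated in practice, since it avoids the explicit pairwise double sum.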
3. Training Algorithms and Implementation
Training follows standard autoencoder optimization, with additional computation of relational and sparsity terms:
- For graph-regularized feature extraction, the similarity matrix is precomputed from data and used throughout training. Parameters are updated using stochastic gradient descent or Adam, with the relational loss computed over minibatches.
- In knowledge base completion, entities and relations are initialized and trained jointly via SGD, where the sparse autoencoder is trained alongside relational composition and knowledge base completion objectives, utilizing techniques like noise-contrastive estimation and negative sampling (Takahashi et al., 2018).
- For matrix completion with dual regularization, closed-form solutions exist for the weight matrices, alternating between updates for row (user/drug) and column (item/target) similarity, greatly enhancing computational efficiency (Poleksic, 2024).
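To illustrate the closed-form flavor, consider a simplified single-side case: a linear autoencoder $\hat{X} = AX$ with a Laplacian penalty $\lambda\operatorname{tr}(A L A^{\top})$ on the reconstruction matrix. This is a sketch of the general idea under these assumptions, not the exact dual-regularized (DUET) update:

```python
import numpy as np

def closed_form_relational_ae(X, S, lam=1.0):
    """Closed-form linear relational autoencoder, single similarity side.

    Minimizes ||X - A X||_F^2 + lam * tr(A L A^T) over the square
    matrix A, where L = D - S is the graph Laplacian of the row
    similarity matrix S.  Setting the gradient to zero gives the
    linear system  A (X X^T + lam L) = X X^T.
    """
    L = np.diag(S.sum(axis=1)) - S
    G = X @ X.T
    # Solve A (G + lam L) = G by transposing to standard form:
    # (G + lam L)^T A^T = G^T.
    A = np.linalg.solve((G + lam * L).T, G.T).T
    return A
```

Because the update is a single linear solve rather than an iterative descent, alternating such solves over the row and column sides is what makes the dual-regularized variant fast in practice.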
Across all settings, proper selection and tuning of the relational and sparsity hyperparameters ($\lambda$, $\beta$) is crucial for balancing relational structure preservation, sparsity, and reconstruction fidelity.
4. Empirical Performance and Analysis
RSAEs demonstrate improved performance over non-relational (vanilla) or solely sparse autoencoders across multiple domains:
- Feature Extraction: On standard vision datasets, RSAE achieves lower mean squared error and reduced downstream classification error. For example, on MNIST, the reconstruction loss is reduced from 0.312 (SAE) to 0.296 (RSAE), and classification error from 2.2% to 1.8%. On CIFAR-10, loss decreases from 0.331 to 0.292 and error from 14.2% to 13.4% (Meng et al., 2018).
- Knowledge Base Completion: Joint training with a sparse relational autoencoder yields more interpretable relation codes and state-of-the-art mean rank and mean reciprocal rank on standard datasets. E.g., on FB15k-237, mean rank improves from 215 (base) to 197 (joint + AE + comp) (Takahashi et al., 2018).
- Sparse Matrix Completion: The dual-regularized autoencoder (DUET) achieves higher AUPR and faster runtimes than matrix factorization or neighborhood-regularized baselines. On DrugBank, DUET produces an AUPR of 0.580 versus 0.549 for the best MF baseline, while requiring less hyperparameter tuning and running 10-20x faster (Poleksic, 2024).
These results robustly support the inclusion of relational constraints in autoencoder frameworks to improve both reconstruction and task-specific generalization.
5. Interpretability and Structural Insights
RSAEs not only improve performance but also yield more interpretable representations:
- Sparse Codes for Relations: In knowledge base models, most relation codes are sparse (>80% zeros), with distinct code dimensions activating for semantically consistent relation groups. For example, one code dimension may correspond to currency-related relations, another to film-related relations, and combinations reflect composite relationships.
- Preserving Manifold Geometry: In feature extraction, RSAE preserves local geometry in the hidden representation space: similar inputs remain close, unlike in conventional sparse autoencoders, where similar samples may be mapped far apart if doing so reduces reconstruction error.
- Respecting External Similarity: In bioinformatics applications, dual regularizers enforce homophily, keeping the latent profiles of chemicals, targets, or diseases with similar external descriptors close in encoding space.
6. Relation to Standard and Alternative Approaches
RSAE extends the standard sparse autoencoder by incorporating relational regularization terms absent in conventional frameworks. Where a sparse autoencoder minimizes reconstruction plus a sparsity penalty, $\mathcal{L}_{\mathrm{SAE}} = \mathcal{L}_{\mathrm{rec}} + \beta \mathcal{L}_{\mathrm{sp}}$, the RSAE objective augments this with the relational term $\lambda \mathcal{L}_{\mathrm{rel}}$. An SAE may disperse similar samples in hidden space if that reduces reconstruction error, whereas an RSAE enforces local coherence and relational consistency (Meng et al., 2018).
In the context of knowledge base models and matrix completion, relational and compositional autoencoders outperform models omitting either component, confirming that explicit relational signal is integral to robust, generalizable latent representations.
7. Applications and Significance
RSAEs have been documented in domains including:
- High-dimensional feature extraction (MNIST, CIFAR-10)
- Knowledge graph representation and completion (WN18, FB15k, WN18RR, FB15k-237)
- Biological network inference and drug repurposing (DrugBank, Hetionet), where sparse data and complex relationships require relational and sparsity-aware modeling
The inclusion of relational structure consistently yields more robust, interpretable, and computationally efficient solutions, supporting further adoption in scientific and industrial data modeling tasks (Meng et al., 2018, Takahashi et al., 2018, Poleksic, 2024).