
Preferability of redundant encodings for generalization in feed-forward classifiers

Determine whether increasing the average degree of redundancy in the hidden-layer activations of the quantized fully-connected MNIST classification network yields representations that are more robust to small input variations and improves generalization on unseen data, compared to less redundant encodings. Here the average degree of redundancy is the Shannon-invariant ratio of the sum of the individual mutual informations to the joint mutual information.


Background

The paper introduces Shannon-invariant measures—the average degree of redundancy and the average degree of vulnerability—to characterize multivariate information structure without computing full PID atoms. In experiments on a quantized fully-connected MNIST classifier, the authors observe that redundancy increases with depth while vulnerability decreases, and they discuss a potential combinatorial rationale for this pattern.
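The ratio described above can be illustrated on discrete (quantized) activations. The sketch below is an assumption-laden reading of the "average degree of redundancy" as the sum of per-unit mutual informations I(X_i; Y) divided by the joint mutual information I(X_1,…,X_n; Y); the function names and the plug-in empirical estimator are illustrative, not the authors' implementation.

```python
import numpy as np
from collections import Counter

def mutual_information(x, y):
    """Plug-in estimate of I(X; Y) in bits for discrete sequences x, y."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), with counts cancelled to c*n/(cx*cy)
        mi += (c / n) * np.log2(c * n / (px[a] * py[b]))
    return mi

def avg_degree_of_redundancy(X, y):
    """Hypothetical ratio sum_i I(X_i; Y) / I(X_1..n; Y) for quantized
    activations X (n_samples, n_units) and labels y."""
    per_unit = sum(mutual_information(tuple(col), tuple(y)) for col in X.T)
    joint_symbols = [tuple(row) for row in X]  # treat each activation vector as one symbol
    return per_unit / mutual_information(joint_symbols, tuple(y))

# Fully redundant toy case: two identical units, each a copy of the label.
X = np.array([[0, 0], [1, 1], [0, 0], [1, 1]])
y = np.array([0, 1, 0, 1])
print(avg_degree_of_redundancy(X, y))  # → 2.0
```

When every unit carries the same information as the whole layer, the ratio equals the number of units; when the units carry disjoint information, it falls toward 1. Note that plug-in MI estimates are biased on small samples, so a practical analysis would use larger sample sizes or a bias-corrected estimator.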

Based on these observations, the authors articulate a conjecture that more redundant encodings may be preferable because they could yield more robust representations and thereby better generalization. This links the empirical signatures of redundancy to a hypothesized benefit for out-of-sample performance.

References

We conjecture that more redundant encodings may be preferable for the network, as they could lead to representations that are more robust to small input variations and thereby support better generalization.

Shannon invariants: A scalable approach to information decomposition (2504.15779 - Gutknecht et al., 22 Apr 2025) in Subsection "Feedforward MNIST Classification" (Section 4.2)