Adversarial Examples in Random CNNs

Updated 10 February 2026
  • The paper demonstrates that adversarial examples can be generated in random CNNs within an ℓ₂-distance proportional to ‖x‖₂/√d, matching information-theoretic limits.
  • It employs isoperimetric inequalities, Fourier-spectral analysis, and Gaussian process arguments to rigorously characterize the models' susceptibility.
  • The findings reveal that inherent high-dimensional geometric properties, not training specifics, fundamentally limit the robustness of convolutional architectures.

Adversarial examples in random convolutional neural networks (CNNs) are small input perturbations—often imperceptible in the ℓ₂-norm—that can abruptly flip the output of a randomly initialized CNN with high probability. Recent theoretical advances have established that such vulnerabilities are not a byproduct of training or specific architectural choices, but an intrinsic feature of high-dimensional, random convolutional models. The phenomenon is now understood through rigorous analysis involving isoperimetric inequalities, spectral geometry, Fourier representations of convolution operators, and Gaussian process arguments. The resulting robustness thresholds match the information-theoretic limits: for a d-dimensional input x, adversarial examples can be found within ℓ₂-distance O(‖x‖₂/√d) of x, which is essentially the minimal possible for any classifier uniformly over the input space.

1. Mathematical Foundations of Adversarial Vulnerability in Random CNNs

The vulnerability of random CNNs to small-norm adversarial examples is established through several distinct, yet conceptually aligned mathematical frameworks:

  • Isoperimetric Approach: Using the isoperimetric inequality on the special orthogonal group SO(d), one proves that, for a Lipschitz-invariant classifier f defined via random convolutions, the decision boundary must, with overwhelming probability, be close (in ℓ₂) to any given input x₀. Specifically, for random nets constructed from regular/Xavier convolutional layers and standard activations (odd or ReLU, with technical assumptions on network width and depth), there exists x′ with flipped sign and ‖x′ − x₀‖₂ ≲ ‖x₀‖_sp/√d, where ‖x₀‖_sp is the spectral norm (i.e., the operator norm of the matricized input) (Daniely, 14 Jun 2025).
  • Fourier-Spectral Analysis: Random convolutional layers, when represented in the Fourier basis, decompose into (almost) independent low-dimensional blocks. With random Gaussian weights, the spectral norms and minimum singular values of these blocks are tightly controlled, ensuring well-conditioning. This facilitates lower-bounding the input gradient and ensures its robustness within small balls, which directly underpins adversarial construction via gradient-based methods (Daniely et al., 3 Feb 2026).
  • Gaussian Process and Covering Number Arguments: For infinite-width limits, the output of a random deep network converges to a Gaussian process indexed by the input vector. Standard entropy integral and concentration arguments (Borell–TIS inequality, Dudley integral) show that, in any ℓ^p-norm, the minimal adversarial perturbation distance is bounded by O(‖x‖_p/√d) (Palma et al., 2020, Montanari et al., 2022).
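
The √d scaling shared by all three frameworks can already be seen in the simplest isotropic model: a random linear classifier f(x) = ⟨w, x⟩ with Gaussian w, for which the exact ℓ₂ distance from x to the decision boundary is |⟨w, x⟩|/‖w‖₂. A minimal numpy sketch (the dimensions and sample count are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
scaled = {}
for d in (100, 1000, 10000):
    dists = []
    for _ in range(200):
        w = rng.normal(size=d)            # random isotropic classifier f(x) = w @ x
        x = rng.normal(size=d)
        x /= np.linalg.norm(x)            # normalize so that ‖x‖₂ = 1
        dists.append(abs(w @ x) / np.linalg.norm(w))  # exact ℓ₂ distance to {f = 0}
    scaled[d] = float(np.mean(dists)) * np.sqrt(d)
    # rescaled by √d, the mean distance is ≈ E|N(0,1)| = √(2/π) ≈ 0.80 for every d
    print(d, round(scaled[d], 3))
```

The rescaled distance stabilizes near √(2/π) regardless of d, i.e., the raw adversarial distance shrinks as Θ(‖x‖₂/√d), the same rate the deep random-CNN analyses establish.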

2. Network Architectures, Random Initialization, and Theoretical Assumptions

The analytical results are valid across a range of convolutional architectures with the following features:

  • Layer Structure: Multiple convolutional layers of fixed, constant depth t; layers can be either group convolutional (with weight-sharing determined by a finite abelian group G) or conventional spatial convolutions. Minimal depth and width conditions ensure well-posedness of the spectral estimates and covering number arguments (Daniely et al., 3 Feb 2026, Daniely, 14 Jun 2025).
  • Activation Functions: Assumptions typically hold for C² activations (including ReLU, smooth functions, or odd functions for symmetric cases), with non-vanishing average squared derivatives to guarantee meaningful gradients.
  • Random Initialization: Weights are drawn i.i.d. from appropriate Gaussian distributions (Xavier, He, or similar scaling), yielding statistically isotropic or SO(d)-invariant random functions.
  • Output Mapping: The final network output is scalar-valued, constructed as an inner product of the last-layer activations with an independent random vector.

These conditions are broad enough to encompass most conventional untrained CNNs and, in the limit, theoretical random convolutional function classes (Daniely et al., 3 Feb 2026, Daniely, 14 Jun 2025).
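
These assumptions can be instantiated concretely. The sketch below is a hypothetical minimal model—not the papers' exact construction—using circular-convolution layers with i.i.d. Xavier-scaled Gaussian filters, ReLU activations, and a random scalar read-out; it checks that the output scale is dimension-free for inputs with ‖x‖₂ ≈ √d:

```python
import numpy as np

rng = np.random.default_rng(3)

def random_cnn(x, depth=3):
    # circular convolution layers with i.i.d. N(0, 1/d) (Xavier-scaled) filters,
    # ReLU activations, and a random scalar read-out vector
    d = x.shape[0]
    h = x
    for _ in range(depth):
        w = rng.normal(0.0, 1.0 / np.sqrt(d), d)
        h = np.maximum(0.0, np.fft.irfft(np.fft.rfft(h) * np.fft.rfft(w), n=d))
    v = rng.normal(0.0, 1.0 / np.sqrt(d), d)
    return v @ h

stds = {}
for d in (64, 256, 1024):
    # std of the output over random weights and Gaussian inputs
    outs = [random_cnn(rng.normal(size=d)) for _ in range(300)]
    stds[d] = float(np.std(outs))
    print(d, round(stds[d], 3))   # roughly 2^(-depth/2) ≈ 0.35, independent of d
```

Each ReLU layer halves the expected squared norm of the signal, so the output standard deviation is roughly 2^(−depth/2) at every dimension—the well-posed, isotropic scaling the theoretical assumptions are designed to guarantee.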

3. Existence, Construction, and Size of Adversarial Examples

All approaches confirm that, with overwhelming probability over the random weights, for any input x of dimension d, there exists a perturbation δ such that

‖δ‖₂ ≤ O(‖x‖₂/√d)

which flips the sign of the classifier's output. This claim holds for:

  • Random CNNs with any reasonable smooth or ReLU activation (Daniely, 14 Jun 2025, Daniely et al., 3 Feb 2026, Montanari et al., 2022).
  • Networks of constant depth and sufficiently large width per layer.
  • Various ℓ^p-norms, with scaling ‖δ‖_p = O(‖x‖_p/√d) for all 1 ≤ p < ∞ (Palma et al., 2020).

Notably, this scaling is essentially optimal, as no classifier can be more robust in the worst-case scenario.

4. Explicit Construction: Gradient-Based Adversarial Attacks

Explicit adversarial perturbations can be computed through a single gradient-descent or "fast gradient sign" step:

  1. Gradient Computation: The input gradient ∇_x f(x; θ) is guaranteed to have norm bounded away from zero—uniformly over most x of interest.
  2. Perturbation Size and Step: Setting δ = −η ∇_x f(x; θ), with η proportional to f(x; θ)/‖∇_x f(x; θ)‖², ensures that the sign of f at x + δ is flipped. The typical step has ‖δ‖₂ = O(‖x‖₂/√d) (Daniely et al., 3 Feb 2026, Montanari et al., 2022).
  3. Success Probability: With high probability (≥ 95% for moderate d), a single such step suffices, and the output switches class.
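
The steps above can be sketched end-to-end in numpy. The example below is a hypothetical two-layer circular-convolution net with tanh activations (standing in for the general construction): it computes the input gradient by exact backprop, takes a slightly overshooting Newton-style step toward the zero level set, and checks that the sign flips with ‖δ‖₂ ≪ ‖x‖₂:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256

def conv(x, w):    # circular convolution via FFT
    return np.fft.irfft(np.fft.rfft(x) * np.fft.rfft(w), n=d)

def conv_T(g, w):  # adjoint of the circular convolution (used in backprop)
    return np.fft.irfft(np.fft.rfft(g) * np.conj(np.fft.rfft(w)), n=d)

# random two-layer conv net with Xavier-scaled Gaussian filters and random read-out
w1 = rng.normal(0.0, 1.0 / np.sqrt(d), d)
w2 = rng.normal(0.0, 1.0 / np.sqrt(d), d)
v = rng.normal(0.0, 1.0 / np.sqrt(d), d)

def forward(x):
    a1 = np.tanh(conv(x, w1))
    a2 = np.tanh(conv(a1, w2))
    return v @ a2, (a1, a2)

def input_grad(x):  # exact gradient of f w.r.t. the input, by hand-rolled backprop
    f, (a1, a2) = forward(x)
    g2 = v * (1.0 - a2 ** 2)
    g1 = conv_T(g2, w2) * (1.0 - a1 ** 2)
    return f, conv_T(g1, w1)

x = rng.normal(size=d)
f0, g = input_grad(x)
# Newton-style step toward {f = 0}; escalate the overshoot factor in the
# (rare) case curvature keeps the first step from flipping the sign
for factor in (1.1, 1.5, 2.0, 3.0, 5.0):
    delta = -factor * (f0 / (g @ g)) * g
    f1, _ = forward(x + delta)
    if np.sign(f1) != np.sign(f0):
        break
ratio = np.linalg.norm(delta) / np.linalg.norm(x)
print(np.sign(f1) != np.sign(f0), round(float(ratio), 3))  # flipped, with ‖δ‖₂/‖x‖₂ on the order of 1/√d
```

Because the input gradient is bounded away from zero, η = O(f/‖∇f‖²) gives a step of length O(‖x‖₂/√d), and local near-linearity makes a single (slightly overshooting) step sufficient.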

The Gaussian conditioning argument further ensures that the joint distribution of (f(x), f(x + δ)) follows a precisely characterized bivariate normal law, leading to nearly certain attack success as d increases (Montanari et al., 2022).

5. Geometric and Group-Theoretic Underpinnings

The existence of adversarial examples in random CNNs reflects deep properties of high-dimensional geometry:

  • SO(d)-Invariance and Isoperimetry: Random group-convolutional layers confer SO(d)-invariance to the output function on the input orbit, subjecting the network to sharp isoperimetric laws. Concentration of measure implies that for any sizable region (such as the decision region of a binary classifier), nearly all points are within O(1/√d) of the boundary—directly yielding adversarial examples (Daniely, 14 Jun 2025).
  • Fourier Diagonalization: For abelian group CNNs, the convolution operator diagonalizes block-wise in the Fourier basis, so robustness cannot be achieved by increasing width or by randomizing filter arrangements.
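
The diagonalization is easy to verify numerically for a single cyclic-group convolution: the circulant matrix of the convolution is conjugated to a diagonal matrix by the DFT, with the Fourier coefficients of the filter as eigenvalues. A small sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
d = 64
w = rng.normal(size=d)

# circulant matrix of the circular convolution x -> w * x over the cyclic group Z_d
C = np.array([[w[(i - j) % d] for j in range(d)] for i in range(d)])

# the DFT conjugates C to a diagonal matrix: F C F^{-1} = diag(DFT(w))
F = np.fft.fft(np.eye(d))
D = F @ C @ np.linalg.inv(F)
off_diag = np.max(np.abs(D - np.diag(np.diag(D))))
print(off_diag < 1e-8)                         # True: C is fully diagonal in the Fourier basis
print(np.allclose(np.diag(D), np.fft.fft(w)))  # True: eigenvalues are the Fourier coefficients of w
```

For non-cyclic abelian groups the same argument gives a block-diagonal form, which is why widening the layers or shuffling filters cannot remove the well-conditioned low-dimensional blocks the attacks exploit.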

Geometric inseparability thus emerges as the foundational cause, unaffected by training or typical regularization.

6. Empirical and Experimental Findings

Theoretical guarantees are largely validated by empirical studies:

  • Experimental Verification: On random and trained networks (e.g., LeNet, ResNet, shallow and deep convolutional nets), ℓ^p-bounded adversarial attacks (FGSM, PGD, Carlini–Wagner) find perturbations of size O(‖x‖_p/√d) (Palma et al., 2020).
  • Effect of Training: Training on natural data (e.g., MNIST, CIFAR10) does not significantly increase robustness against ℓ₂ or ℓ^∞ attacks; in some settings, the adversarial distance even decreases for out-of-distribution data (Palma et al., 2020).
  • Architectural Variants: Modifications such as instance-wise random masking applied at shallow layers (e.g., "Random Mask" CNNs) increase robustness against black-box attacks, albeit at the cost of reduced expressivity or accuracy. Remaining adversarial examples in such architectures can induce perceptible changes in semantics, sometimes fooling humans as well (Luo et al., 2020).

7. Implications for Robustness, Defenses, and Theoretical Limits

Current theoretical evidence dictates several key implications:

  • Universality of Vulnerability: Robustness to norm-bounded perturbations cannot be fundamentally improved by architecture alone in generic isotropic CNNs unless depth increases with dimension, weight-sharing is broken, or input preprocessing disrupts SO(d)-invariance (Daniely, 14 Jun 2025, Daniely et al., 3 Feb 2026).
  • Limits of Training: Even adversarial training or large-scale pretraining cannot circumvent the high-dimensional isoperimetric barrier for standard data-unaware random initializations.
  • Defensive Strategies: Breaking the invariance structure via architectural randomization (masks, spatial permutations), input-dependent pre-processing, or learned anisotropic filters can offer partial gains, but no parameter-free defense can exceed the O(1/√d) scaling without leveraging data distribution properties (Luo et al., 2020, Daniely et al., 3 Feb 2026).
  • Conceptual Clarity: The formalism suggests a need to refine the definition of adversarial examples, potentially focusing on perceptual metrics or human label invariance rather than strict norm constraints (Luo et al., 2020).

In summary, adversarial examples in random convolutional neural networks are a universal and quantitatively sharp phenomenon dictated by high-dimensional geometry, spectral characteristics of random convolutions, and group-theoretic invariance. The information-theoretic lower bounds on adversarial distance tightly constrain the space of possible defenses and set a baseline for understanding robustness in both theory and practice (Daniely et al., 3 Feb 2026, Daniely, 14 Jun 2025, Palma et al., 2020, Montanari et al., 2022, Luo et al., 2020).
