Fast Gradient Non-sign Methods (FGNM)
- FGNM is defined as a set of algorithms that use full-vector, non-sign gradient updates to preserve directional fidelity and overcome the biases of sign-based methods.
- They encompass fixed-scale and adaptive-scale attack variants, as well as advanced convex optimization and linear system solvers, offering superior convergence and robustness.
- FGNM methods maintain low computational overhead while significantly improving adversarial transferability, convergence rates, and numerical stability in various applications.
Fast Gradient Non-sign Methods (FGNM) encompass a spectrum of optimization and attack algorithms that replace the standard coordinate-wise sign compression of gradient directions with full-vector, non-sign manipulations. These methods are formulated to preserve or enhance directional fidelity to the true gradient, improve convergence, or increase practical efficacy in machine learning, linear system solving, and adversarial attack generation. FGNM approaches are distinct from sign-based updates in both theoretical properties and empirical outcomes, finding application in convex optimization, adversarial robustness, and large-scale numerical linear algebra.
1. Theoretical Foundations and Motivation
Fast Gradient Non-sign Methods are rooted in the observation that quantizing gradients to their coordinate signs, as in SignSGD and the Fast Gradient Sign Method (FGSM), induces a directional bias. For example, in adversarial attacks under an $\ell_\infty$ constraint, the optimal first-order gain is reduced because $\mathrm{sign}(g)$ is not fully aligned with the true gradient $g$. The cosine of the angle between the sign-based perturbation and the original gradient satisfies
$$\cos\langle \mathrm{sign}(g),\, g \rangle = \frac{\|g\|_1}{\sqrt{n}\,\|g\|_2} < 1$$
unless $g$ has entries of equal magnitude (Cheng et al., 2021). In the context of convex optimization, sign-based methods lose curvature information and typically degrade to sublinear convergence, lacking the second-order efficiency present in accelerated or adaptive methods (Cheng et al., 2016).
FGNM strategies instead rescale or otherwise preserve the full gradient vector, thereby maintaining the directional and magnitude information required for higher-order accuracy, effective alignment with low-curvature directions, and improved empirical robustness.
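As a concrete illustration of this bias (a minimal NumPy sketch, not drawn from the cited papers), the cosine between $\mathrm{sign}(g)$ and $g$ falls strictly below 1 for gradients with uneven entry magnitudes, while a rescaled full-gradient step is perfectly aligned by construction:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
g = rng.standard_normal(n) * rng.exponential(1.0, n)  # gradient with uneven entry magnitudes

def cosine(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Sign-based direction: cosine equals ||g||_1 / (sqrt(n) * ||g||_2), strictly below 1
print(cosine(np.sign(g), g))
print(np.linalg.norm(g, 1) / (np.sqrt(n) * np.linalg.norm(g)))  # identical value

# Rescaled full-gradient (non-sign) direction: perfectly aligned
print(cosine(np.sqrt(n) / np.linalg.norm(g) * g, g))  # 1.0
```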
2. Algorithmic Structures and Principal Variants
FGNM frameworks are instantiated in multiple domains:
- Adversarial Attack Generation: FGNM replaces the sign operation in classical $\ell_\infty$-bounded attack methods by projecting or normalizing the gradient to retain its true direction (a code sketch follows this list).
- Fixed-scale variant (N): Set the step as
$$\delta = \alpha \cdot \frac{\sqrt{n}}{\|g\|_2}\, g,$$
achieving perfect alignment (cosine $= 1$) with $g$ and saturating the norm constraint (Cheng et al., 2021).
- Adaptive-scale variant (K): Select the scaling factor from the ranked gradient magnitudes, e.g., the $k$-th largest entry $|g|_{(k)}$, yielding
$$\delta = \frac{\alpha}{|g|_{(k)}}\, g,$$
with subsequent box clipping to the $\ell_\infty$ constraint. This allows a controlled trade-off between alignment and norm magnitude.
- Smooth and Composite Convex Optimization: Methods such as FLAG and FLARE eschew sign-compression, employing the full gradient in conjunction with adaptive diagonal rescaling matrices. FLAG combines a proximal gradient step, AdaGrad-style preconditioning, mirror descent, and Nesterov-style coupling (Cheng et al., 2016).
- Proximal and mirror steps operate in the adaptively rescaled geometry
$$\|v\|_{D_t}^2 = v^\top D_t\, v, \qquad D_t = \mathrm{diag}\!\Big(\sum_{\tau \le t} g_\tau \odot g_\tau\Big)^{1/2},$$
ensuring efficient adaptation to curvature and achieving accelerated convergence.
- Symmetric Linear System Solvers: The family of FGNM algorithms for symmetric positive definite systems $Ax = b$, including AOA (Asymptotically-Optimal with Alignment), MGA (Minimal Gradient with Alignment), and MGC (Minimal Gradient with Constant Step), periodically applies shorter or constant steps in place of the Cauchy step. These variants focus spectral energy on the critical eigenspaces, eliminating the “zig-zag” behavior of classical steepest descent (Zou et al., 2019).
- For instance, MGA computes the minimal gradient step
$$\alpha_k^{\mathrm{MG}} = \frac{g_k^\top A\, g_k}{g_k^\top A^2 g_k}, \qquad g_k = A x_k - b,$$
and interleaves it with harmonic mean steps to enforce alignment.
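To make the attack-step variants above concrete, here is a minimal NumPy sketch of both rescaling rules. The fixed-scale step follows the formula given earlier; the adaptive-scale rule (the choice of the $k$-th ranked magnitude and the clipping) is an illustrative reading of the description and should be checked against Cheng et al. (2021):

```python
import numpy as np

def fgnm_step_fixed(g, alpha):
    """Fixed-scale ("N") step: keep the full gradient direction and rescale
    so the step has the same l2 norm as alpha * sign(g)."""
    n = g.size
    return alpha * np.sqrt(n) / np.linalg.norm(g) * g

def fgnm_step_adaptive(g, alpha, k, eps):
    """Adaptive-scale ("K") step (sketch): scale by the k-th largest gradient
    magnitude, then clip into the l_inf ball of radius eps. The exact
    scale-selection rule here is an assumption, not the paper's definition."""
    kth = np.sort(np.abs(g))[-k]      # k-th largest |g_i|
    return np.clip(alpha / kth * g, -eps, eps)

# Usage: g stands in for the loss gradient w.r.t. the input image
g = np.random.default_rng(1).standard_normal(3 * 32 * 32)
delta_n = fgnm_step_fixed(g, alpha=2/255)
delta_k = fgnm_step_adaptive(g, alpha=2/255, k=100, eps=8/255)
```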
3. Analytical Properties and Convergence Behavior
In supervised learning and adversarial contexts, by restoring the perturbation direction to be parallel with the gradient, FGNM methods maximize the first-order Taylor gain under norm constraints. This correction addresses the inefficiency of sign-based attacks, as demonstrated theoretically and empirically (Cheng et al., 2021).
In convex optimization, FLAG and FLARE attain an objective gap bound of
$$F(x_T) - F(x^\star) \le \mathcal{O}\!\left(\frac{S_T}{T^2}\right),$$
where $S_T$ reflects the adaptive geometry built from accumulated gradient norms (Cheng et al., 2016). By contrast, sign-based and non-adaptive methods lack such guarantees.
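For intuition, the sketch below shows a simplified proximal gradient step in an AdaGrad-style rescaled geometry for an $\ell_1$-regularized objective. It is a reduction for illustration only; the full FLAG/FLARE methods additionally couple such steps with mirror descent and Nesterov-style momentum (Cheng et al., 2016):

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (tau may be a vector)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def rescaled_prox_step(x, grad, s_accum, lam, eta=1.0, delta=1e-8):
    """One proximal step in the diagonal metric ||v||_D^2 = v^T D v,
    with D built AdaGrad-style from accumulated squared gradients.
    Simplified stand-in for the FLAG/FLARE geometry, not the full method."""
    s_accum = s_accum + grad**2          # accumulate squared gradients
    D = np.sqrt(s_accum) + delta         # diagonal of the metric
    z = x - eta * grad / D               # gradient step, rescaled per coordinate
    x_new = soft_threshold(z, lam * eta / D)  # prox of lam*||.||_1 in the D-metric
    return x_new, s_accum
```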
FGNM variants for symmetric linear systems are shown to exhibit R-linear convergence under Dai’s Property A, ensuring that $\|g_k\| \to 0$ at a linear rate, and maintaining bounded step sizes for stability (Zou et al., 2019). Spectral alignment analysis demonstrates that periodic use of constant or harmonic mean steps drives search directions towards dominant eigenspaces, unlike the alternating two-dimensional cycles typical in sign-based, steepest-descent, or minimal-gradient approaches.
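The following sketch illustrates the interleaving idea for SPD systems: mostly Cauchy steps, with a periodic minimal gradient step substituted to break the zig-zag cycle. The schedule here is illustrative; the precise AOA/MGA/MGC rules (including the harmonic mean steps) are given in Zou et al. (2019):

```python
import numpy as np

def interleaved_gradient_solve(A, b, x0, iters=300, period=4):
    """Gradient iteration for Ax = b with A symmetric positive definite.
    Every `period`-th step replaces the Cauchy step with a minimal
    gradient step (illustrative alignment-style schedule)."""
    x = x0.copy()
    for k in range(iters):
        g = A @ x - b                     # gradient of f(x) = 0.5 x^T A x - b^T x
        if np.linalg.norm(g) < 1e-12:     # converged; avoid 0/0 in the step sizes
            break
        Ag = A @ g
        if (k + 1) % period == 0:
            alpha = (g @ Ag) / (Ag @ Ag)  # minimal gradient step: minimizes ||g_{k+1}||_2
        else:
            alpha = (g @ g) / (g @ Ag)    # Cauchy (steepest descent) step
        x -= alpha * g
    return x

# Usage on a random SPD system
rng = np.random.default_rng(2)
M = rng.standard_normal((100, 100))
A = M @ M.T + 100 * np.eye(100)
b = rng.standard_normal(100)
x = interleaved_gradient_solve(A, b, np.zeros(100))
print(np.linalg.norm(A @ x - b))          # residual norm after the run
```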
4. Empirical Performance and Benchmarks
Adversarial Attack Transferability
FGNM outperforms sign-based methods in both untargeted and targeted black-box adversarial settings: replacing I-FGSM with I-FGNM (in both the "N" and "K" variants) and SI-FGSM with SI-FGNM ("K") raises average transfer success rates across standard models, and targeted attacks against defense models also show success-rate gains (Cheng et al., 2021).
Perturbations generated by FGNM align much more closely with the gradient, and their norm can be tuned via the "K" parameter.
Composite Convex Optimization
Across six data sets with $\ell_1$-regularized and box-constrained losses, FLAG and FLARE match or surpass FISTA in both objective reduction and test accuracy. In classification on 20 Newsgroups (box-constrained), after 1000 iterations:
| Method | Test Accuracy (%) |
|---|---|
| FISTA | 82.3 |
| FLAG | 84.1 |
| FLARE | 84.2 |
FLARE typically achieves the wall-clock minimum due to efficient iteration acceptance (Cheng et al., 2016).
Linear System Solving
In numerical experiments with random SPD and finite-difference tridiagonal matrices, MGC consistently outperforms Barzilai-Borwein and other two-point methods, especially for ill-conditioned problems. On large, real-world problems, MGC surpasses conjugate gradients in both iteration count and run time. MGC also exhibits improved numerical stability and maintains performance where Krylov methods degrade (e.g., under system perturbation) (Zou et al., 2019).
5. Computational Cost and Algorithmic Simplicity
FGNM implementations introduce minimal additional computational overhead compared to sign-based analogues:
- In attack generation, the step cost is $O(n)$ for fixed-scale normalization and at most $O(n \log n)$ for the adaptive-scale variant due to sorting (Cheng et al., 2021); see the timing sketch after this list.
- FLAG has a per-iteration cost comparable to FISTA, while FLARE often keeps the number of expensive proximal calls per iteration identical to FISTA (Cheng et al., 2016).
- Linear solver variants incur one or two sparse matrix-vector products and a small number of inner products per iteration (Zou et al., 2019).
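As a rough, machine-dependent check of the attack-step costs (illustrative sizes only), the only extra work in the adaptive-scale variant is a sort:

```python
import numpy as np, timeit

g = np.random.default_rng(3).standard_normal(1_000_000)
n = g.size

steps = {
    "sign (baseline)":   lambda: np.sign(g),                          # O(n)
    "fixed-scale (N)":   lambda: np.sqrt(n) / np.linalg.norm(g) * g,  # O(n)
    "adaptive (K) sort": lambda: np.sort(np.abs(g)),                  # O(n log n) dominates
}
for name, fn in steps.items():
    t = timeit.timeit(fn, number=20) / 20
    print(f"{name:18s} {1e3 * t:6.2f} ms/step")
```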
This efficiency facilitates integration into existing optimization and attack pipelines, justifying their use in high-dimensional or real-time settings.
6. Distinction from Sign-Based and Related Methods
FGNM are characterized by the avoidance of coordinate-wise sign compression:
- They preserve full directional information and adaptive geometry, and enable momentum-style mechanisms (linear coupling, alignment steps).
- Sign-based approaches (SignSGD, FGSM) forgo curvature and restrict updates to axis-aligned directions, reducing both convergence rates and attack transferability.
- Empirically, sign-based methods exhibit degraded performance as curvature variance increases or as model defenses are strengthened; FGNM maintain robust performance via geometry-aware, non-quantized steps (Cheng et al., 2016, Cheng et al., 2021).
FGNM thus represent a distinct methodological class with theoretical, practical, and computational advantages in several domains.