FLoPS-PA: FLOP-Aware Performance Frameworks
- The paper introduces FLoPS-PA as a FLOP-aware framework that extends classical slope stability by incorporating explicit flop-induced test configurations and invariant computations.
- It utilizes FLOP-based discriminants and quantile comparisons to rank algorithms, detect performance anomalies, and validate runtime efficiency.
- In deep learning, FLoPS-PA guides joint FLOP and parameter pruning, achieving significant compression and improved inference speed compared to traditional methods.
FLoPS-PA denotes a class of concepts, methodologies, and frameworks across algebraic geometry, scientific computing, and deep learning, unified by the theme of “FLOP-aware” performance or stability assessment. The term appears with domain-specific meanings in (i) algebraic/Kähler metric stability (“flop-slope” or Flops-PA stability), (ii) FLOP-based discriminants for algorithmic ranking, (iii) neural network pruning under joint FLOP and parameter constraints, and (iv) block-diagonal equivariant network architectures, among others. Each instantiation incorporates a rigorous mathematical or algorithmic approach to measuring, optimizing, or obstructing performance, stability, or metric existence in high-dimensional systems.
1. Flop‐Slope Stability: FLoPS-PA in Algebraic Geometry and Canonical Metrics
The foundational usage of “FLoPS-PA” arises in the context of canonical Kähler metrics and K-stability obstructions on pairs $(X, D)$, where $X$ is a smooth projective variety and $D$ is a simple normal-crossing divisor. The “flop‐slope” (abbreviated FLoPS-PA) criterion, introduced by Cheltsov–Rubinstein, extends Ross–Thomas slope stability by incorporating explicit test configurations derived from flop degenerations of the deformation to the normal cone (Cheltsov et al., 2015).
Given a pair $(X, D)$ with prescribed cone angle parameters along $D$ and a subvariety $Z \subset X$, one first constructs a classical slope test configuration by blowing up $Z$ and forming the deformation to the normal cone as a one-parameter degeneration. The innovation of FLoPS-PA is to permit additional blow-ups of points on the central fiber, creating exceptional divisors whose central fibers contain rational curves whose normal bundles permit a flop. Flopping these curves and pushing forward the line bundle data yields a new, explicit family: the “flopped test configuration.”
The core computational invariant is an explicit shift in a “Futaki-type” intersection formula, encoding the effect of the flop on canonical metric obstructions. The central FLoPS-PA semistability criterion declares $(X, D)$ unstable if some flopped Futaki invariant becomes negative for a test configuration constructed via a sequence of blow-ups and flops. The relevant parameters are Seshadri constants computed before and after the blow-ups; these constrain the range of slope parameters for which instability can be detected.
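In schematic form (illustrative notation, not the paper's exact formula; $F(\mathcal{X})$ stands for the Futaki-type invariant of a test configuration $\mathcal{X}$), the criterion reads:

$$
\exists\, \mathcal{X}_{\mathrm{flop}}\ \text{obtained by blow-ups and flops with}\ F(\mathcal{X}_{\mathrm{flop}}) < 0 \;\Longrightarrow\; (X, D)\ \text{is FLoPS-PA unstable, obstructing canonical metrics.}
$$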
FLoPS-PA stability is strictly stronger than classical slope stability: it detects canonical metric non-existence in circumstances where both the Matsushima reductivity criterion and the standard slope (Futaki) invariant fail to provide an obstruction. The framework remains computationally tractable, as all intersection numbers after flops are explicitly calculable (Cheltsov et al., 2015).
2. FLOPs as Discriminant: Performance Assessment in Linear Algebra (FLoPS-PA Framework)
In algorithmic linear algebra, FLoPS-PA refers to a formal methodology for testing whether floating-point operation counts (FLOPs) suffice to distinguish the fastest among mathematically equivalent algorithms for a given expression (Sankaran et al., 2022). The central goal is to determine when FLOP count alone reliably predicts empirical runtime or when other platform-dependent factors render FLOPs misleading as a performance proxy.
Given a collection of candidate algorithms (typically sequences of BLAS/LAPACK kernels), FLoPS-PA proceeds as follows:
- For each candidate algorithm $A_i$, compute its total FLOP count $f_i$ and the minimum count $f_{\min}$ over all candidates, forming a relative FLOP score $s_i = f_i / f_{\min} \ge 1$.
- Shortlist candidates by relative FLOP score and/or single-run relative time scores.
- Collect repeated timing measurements for each shortlisted algorithm and use quantile-based comparisons (e.g., over the 25th–75th percentile band) to rank algorithms statistically; ties are permitted when timing distributions overlap within a quantile band (a minimal code sketch of this ranking step follows the list).
- Assign mean ranks across multiple quantile bands for robustness, and declare convergence once the mean-rank vector stabilizes across successive measurement rounds.
- Define an “anomaly” as any instance where a non-minimum-FLOP algorithm statistically outperforms all minimum-FLOP variants or when the minimum-FLOP set itself is not collectively optimal.
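A minimal Python sketch of the relative-FLOP scoring and quantile-band ranking steps (the function names, data layout, and tie-handling rule are illustrative assumptions, not the reference implementation of Sankaran et al.):

```python
import numpy as np

def relative_flop_scores(flop_counts):
    """Relative FLOP score for each candidate: its FLOP count divided by the minimum."""
    f_min = min(flop_counts.values())
    return {name: f / f_min for name, f in flop_counts.items()}

def quantile_band_rank(timings, q_lo=0.25, q_hi=0.75):
    """Rank algorithms by repeated-timing distributions.

    `timings` maps algorithm name -> 1-D array of runtime measurements.
    Algorithms whose [q_lo, q_hi] quantile bands overlap are treated as tied.
    Returns a dict name -> rank (1 = fastest; ties share a rank).
    """
    bands = {name: (np.quantile(t, q_lo), np.quantile(t, q_hi))
             for name, t in timings.items()}
    ordered = sorted(bands.items(), key=lambda kv: kv[1][0])  # by lower quantile
    ranks, rank, prev_hi = {}, 0, -np.inf
    for name, (lo, hi) in ordered:
        if lo > prev_hi:          # band does not overlap anything ranked so far
            rank += 1
        ranks[name] = rank
        prev_hi = max(prev_hi, hi)
    return ranks
```

An anomaly in the sense of the last bullet then corresponds to a minimum-FLOP algorithm failing to obtain the best rank produced by this procedure.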
Empirical case studies show that in most cases, minimum-FLOP algorithms indeed occupy the fastest rank. However, rare anomalies arise due to cache locality, BLAS overhead, and other microarchitectural factors, emphasizing the need for empirical performance verification beyond FLOP minimization (Sankaran et al., 2022).
3. FLOP- and Parameter-Aware Pruning (FLoPS-PA in Neural Network Optimization)
Within deep neural network pruning, “FLoPS-PA” (FLOP- and parameter-aware) designates frameworks that seek to minimize model size and inference cost under joint constraints on parameter sparsity (number of nonzeros) and total computational complexity (FLOPs). The FALCON framework formalizes this as a mixed-integer program with a quadratic (local loss-approximation) objective, embedding both a budget on the number of kept weights and a budget on total FLOPs (Meng et al., 2024).
Mathematically, the core problem takes the schematic form

$$
\min_{w,\,z}\; Q(w) \quad \text{s.t.} \quad \sum_j z_j \le K, \qquad \sum_j c_j\, z_j \le F, \qquad w_j (1 - z_j) = 0, \qquad z_j \in \{0,1\},
$$

where $w$ denotes the weights, $Q(w)$ a local quadratic approximation of the training loss, $c_j$ the FLOP cost attributed to parameter $j$, $z_j$ the indicator variables for “kept” weights, and $K$ and $F$ the parameter and FLOP budgets.
Approximate algorithms (LP relaxation, dual golden-section minimization, discrete first-order methods, and block-diagonal or low-rank approximations for scalability) allow the framework to scale to models with millions of parameters. Empirically, FLoPS-PA pruning approaches substantially outperform classical magnitude-based or sparsity-only approaches, especially in extreme compression regimes. For example, at 20% of FLOPs retained on ResNet-50, FALCON++ achieves 67.1% top-1 accuracy versus roughly 13% for earlier combinatorial methods, an absolute gain of more than 50 percentage points (Meng et al., 2024).
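The sketch below gives a toy greedy heuristic for the joint budget (illustrative only; the function name, the magnitude-based saliency, and the greedy rule are assumptions and not FALCON's actual algorithm):

```python
import numpy as np

def prune_under_joint_budget(weights, flop_cost, max_nnz, max_flops):
    """Greedily keep the largest-magnitude weights subject to BOTH budgets:
    at most `max_nnz` nonzeros and at most `max_flops` total attributed FLOPs.
    (A toy heuristic for illustration only; FALCON solves a harder problem.)"""
    w, c = weights.ravel(), np.asarray(flop_cost).ravel()
    keep = np.zeros(w.size, dtype=bool)
    nnz, flops = 0, 0.0
    for j in np.argsort(-np.abs(w)):          # most salient weights first
        if nnz + 1 > max_nnz or flops + c[j] > max_flops:
            continue                          # cheaper weights later may still fit
        keep[j] = True
        nnz += 1
        flops += c[j]
    pruned = np.where(keep, w, 0.0).reshape(weights.shape)
    return pruned, keep.reshape(weights.shape)
```

FALCON's relaxation and discrete first-order methods replace this greedy rule with updates driven by the local quadratic loss model, which is what yields the accuracy gains reported above.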
The joint FLoPS-PA constraint supports both one-shot and gradual (prune-retrain) pipelines, which robustly outperform methods that optimize only a single budget (FLOPs or nonzeros).
4. Block-Diagonal Equivariance and the FLOPs-per-Parameter Metric
Another notable instance arises in equivariant neural network architectures that enforce symmetries (e.g., horizontal mirror-flopping) while controlling computational complexity through the FLOPs-per-parameter ratio (Bökman et al., 7 Feb 2025). For the order-two mirror-flopping group, feature spaces are decomposed into irreducible representations, resulting in a block-diagonal structure for all linear layers.
The effect is that both FLOP count and parameter count halve, but their ratio remains invariant. Empirical results on vision architectures (ResMLP, DeiT-ViT, ConvNeXt) confirm that such equivariant models achieve a 50% reduction in both FLOPs and parameters while matching or slightly exceeding standard (non-equivariant) baseline accuracy. Inference speed is boosted by 30–50% with maintained accuracy on ImageNet-1K.
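A minimal NumPy sketch of the construction (assuming, for illustration, a feature layout in which the mirror flop simply swaps the two halves of the feature vector; the class and variable names are hypothetical, not the paper's code):

```python
import numpy as np

class MirrorFlopEquivariantLinear:
    """Toy order-2 (mirror-flop) equivariant linear layer via irrep blocks.

    Assumes the flop acts on a 2*d feature vector by swapping its two halves.
    In the symmetric/antisymmetric basis, every equivariant linear map is
    block-diagonal: two independent d x d blocks instead of one 2d x 2d matrix,
    halving both parameters and matmul FLOPs."""

    def __init__(self, d, seed=0):
        rng = np.random.default_rng(seed)
        self.W_sym = rng.standard_normal((d, d)) / np.sqrt(d)   # trivial-irrep block
        self.W_anti = rng.standard_normal((d, d)) / np.sqrt(d)  # sign-irrep block

    def __call__(self, x):
        a, b = np.split(x, 2, axis=-1)
        s, t = (a + b) / np.sqrt(2), (a - b) / np.sqrt(2)       # change to irrep basis
        s, t = s @ self.W_sym.T, t @ self.W_anti.T              # two half-size matmuls
        return np.concatenate(((s + t) / np.sqrt(2), (s - t) / np.sqrt(2)), axis=-1)
```

Each dense $2d \times 2d$ layer is replaced by two independent $d \times d$ blocks, so parameters and matmul FLOPs are both halved while the FLOPs-per-parameter ratio is unchanged, consistent with the claim above.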
This explicit FLOPs-per-parameter equivalence demonstrates how symmetry exploitation can lead to architectures with provable efficiency not just in parameter count but also in computational workload, supporting scalable, high-throughput deployment (Bökman et al., 7 Feb 2025).
5. Connections to Derived Categories and Flop-Equivalences
Within algebraic geometry, “FLoPS-PA” terminology also appears in studies of derived categories associated to flops. In Jiang–Leung's formalism, fully faithful Fourier–Mukai functors and semiorthogonal decompositions relate the derived categories of two crepant resolutions connected by a flop (Jiang et al., 2018). The “flop–flop=twist” phenomenon characterizes certain autoequivalences as spherical twists, and the perverse Schober structure reflects the categorification of the geometry of flops. Applications include blowup formulas, projectivizations, and the study of symmetry within moduli spaces of sheaves or curves, encoding deep structural properties of the derived category under flop transitions.
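Schematically (illustrative notation; $\mathsf{F}$ and $\mathsf{F}'$ denote the Fourier–Mukai functors associated to the flop and its inverse between the two crepant resolutions $X_+$ and $X_-$), the “flop–flop = twist” phenomenon states that the round trip is a twist-type autoequivalence rather than the identity:

$$
\mathsf{F}' \circ \mathsf{F} \;\simeq\; T \;\in\; \operatorname{Auteq}\!\left(D^b(X_+)\right), \qquad T\ \text{a spherical-type twist}.
$$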
6. Significance and Cross-Domain Interpretations
The recurring FLoPS-PA concept reflects a rigorous, FLOP-aware approach to quantifying performance, stability, or categorical equivalence under explicit algebraic or computational constraints. Whether in metric geometry (canonical metric existence obstructions (Cheltsov et al., 2015)), algorithm selection (runtime discriminants (Sankaran et al., 2022)), network compression (joint FLOP/parameter budgets (Meng et al., 2024)), or symmetry-aware neural network design (block-diagonal invariance (Bökman et al., 7 Feb 2025)), FLoPS-PA encodes a shift to fully quantitative, tractable, and often explicitly computable methodologies across fields.
A plausible implication is that, as system complexity increases, explicitly FLOP-aware frameworks—augmented by statistical validation, mixed-integer optimization, and symmetry-based reductions—will become increasingly necessary for both theoretical tractability and practical deployment in scientific and machine learning applications.