Size Transferability Bounds

Updated 11 October 2025
  • Size Transferability Bounds are explicit quantitative measures that describe how performance, error rates, or capacity change when inputs or system sizes scale.
  • They are applied across domains such as coding theory, deep learning, graph neural networks, and database query processing to derive rigorous upper and lower bounds.
  • These bounds facilitate robust model generalization, optimal code design, and informed transfer learning by linking structural properties to scalable performance.

Size transferability bounds formalize quantitative relationships between the performance or structural properties of models, codes, or algorithms when transferring from small or moderate-sized instances to larger ones. The term arises across discrete mathematics, mathematical learning theory, coding, deep learning, and statistical inference, and typically refers to explicit, rigorous upper or lower bounds that describe how much error, performance, or capacity changes when increasing the “size” of the input domain (number of atoms, graph nodes, database entries, codeword length, etc.). Modern research has established such bounds in diverse settings, linking them to fundamental structural properties—continuity, operator regularity, spectral norms, model architecture, or combinatorial constraints. These bounds serve key roles in certifying generalization, designing transfer learning systems, and guiding practical model deployment in large-scale settings.

1. Transferability Bounds in Coding Theory

Size transferability for error-correcting codes refers to controlling the maximum size $A_q(n,d)$ of $(n,d)$-codes (possibly non-linear), given constraints or properties that relate codes of different sizes or projections. A central result (Bellini et al., 2012) improves the Litsyn–Laihonen bound via a weight-restriction argument and a puncturing technique. For systematic-embedding codes, the new upper bound is:

$$A_q(n,d) \le \frac{q^t}{|B(r,t)|} \left( A_q(n-t,\, d-2r) - \frac{|B(r,\, n-t)|}{|B(d-2r-1,\, n-t)|} + 1 \right),$$

where $B(l,n) = \sum_{j=0}^{l} \binom{n}{j}(q-1)^j$ is the size of the $q$-ary Hamming ball of radius $l$. This bound strictly strengthens previous results: it allows one to "transfer" size constraints from a length-$n$ code to a punctured code of length $n-t$, effectively bounding the code size via its projection.
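
To make the reading of the bound concrete, here is a minimal Python sketch (not from the paper): it computes the $q$-ary ball size $|B(l,n)|$ and evaluates the transferred upper bound, with illustrative parameters and a placeholder value standing in for $A_q(n-t, d-2r)$.

```python
from math import comb

def ball_size(l: int, n: int, q: int) -> int:
    """|B(l, n)| = sum_{j=0}^{l} C(n, j) * (q - 1)^j, the q-ary Hamming ball of radius l."""
    return sum(comb(n, j) * (q - 1) ** j for j in range(l + 1))

def transferred_upper_bound(n: int, d: int, q: int, t: int, r: int, A_punctured: float) -> float:
    """Upper bound on A_q(n, d) given a value (or bound) A_punctured for A_q(n - t, d - 2r)."""
    return (q ** t / ball_size(r, t, q)) * (
        A_punctured
        - ball_size(r, n - t, q) / ball_size(d - 2 * r - 1, n - t, q)
        + 1
    )

# Illustrative parameters; in practice A_punctured would come from a known bound
# or table for the shorter, punctured code.
print(transferred_upper_bound(n=12, d=6, q=29, t=2, r=1, A_punctured=5_000))
```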

Compared to classical bounds (Griesmer, Johnson, Plotkin), these size transferability bounds are independent and can outperform traditional estimates, especially for codes over large alphabets ($q \geq 9$). Experimental results confirm that, for $q = 29$, the new bound is better than all other theoretical bounds in 91% of tested cases. The technique provides structural insight for the design and search of optimal codes, especially in parameter regimes where prior bounds are loose.

2. Transferability in Deep and Graph Neural Networks

In deep learning and graph-based models, size transferability refers to the generalization capacity when models trained on small or moderately-sized instances (atoms, graphs, etc.) are applied to larger systems. For atomic force prediction (Kuritz et al., 2018), local deep learning models show strong size scaling—models trained on 27-atom Al or Na supercells yield mean absolute errors (MAE) comparable to those on 125-atom systems. Transferability is less robust in Si, where long-range or directional bonding is more complex.

In graph neural networks, rigorous size transferability bounds have been established via graph limit theory. Operator-theoretic approaches introduce the concept of graphops—P-operators capturing aggregation on infinite graphs (Le et al., 2023)—and provide quantitative bounds:

$$d_M(A, A_n) \le 2 \sqrt{\frac{C_A C_v + C_v + 1}{n}},$$

where $A$ is the graphop, $A_n$ its $n$-dimensional discretization, and $C_A$, $C_v$ are regularity constants. For graphop neural networks, analogous bounds hold for output discrepancy across different graph sizes. This enables the transfer of GNN features to vastly larger graphs with a guaranteed vanishing upper bound (decaying as $O(n^{-1/2})$), including for sparse graph classes, where classical graphon-based bounds do not apply.
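
The following short sketch simply evaluates the right-hand side of the discretization bound for growing $n$; the regularity constants $C_A$ and $C_v$ are hypothetical values chosen only to illustrate the $O(n^{-1/2})$ decay.

```python
from math import sqrt

def graphop_discretization_bound(n: int, C_A: float, C_v: float) -> float:
    """Right-hand side of d_M(A, A_n) <= 2 * sqrt((C_A * C_v + C_v + 1) / n)."""
    return 2.0 * sqrt((C_A * C_v + C_v + 1.0) / n)

# Hypothetical regularity constants, for illustration only.
C_A, C_v = 2.0, 1.5
for n in (10, 100, 1_000, 10_000):
    print(f"n = {n:>6}: bound = {graphop_discretization_bound(n, C_A, C_v):.4f}")
```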

Continuous-depth GNDEs and their infinite-node limits (Graphon-NDEs) (Yan et al., 4 Oct 2025) extend this framework, proving trajectory-wise convergence in $C([0,T]; L^2(I))$ with explicit bounds under deterministic graph sampling. For weighted graphs sampled from Hölder-continuous graphons,

$$\| X^{(n)} - X \|_{C([0,T];\,L^2(I))} \le \frac{C}{n^\alpha},$$

and for unweighted graphs (with upper box-counting dimension $b$),

$$\| X^{(n)} - X \|_{C([0,T];\,L^2(I))} \le \frac{\tilde{C}}{n^{1-(b+\epsilon)/2}}.$$

This confirms that models trained on moderate-size graphs can be transferred to much larger graphs without retraining, provided the graphs are structurally similar.
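
A brief sketch comparing the two sampling regimes follows; the constants $C$, $\tilde{C}$ and the values of $\alpha$, $b$, $\epsilon$ are placeholders, not taken from the paper.

```python
def weighted_rate(n: int, C: float = 1.0, alpha: float = 0.5) -> float:
    """C / n^alpha: weighted graphs sampled from an alpha-Hölder-continuous graphon."""
    return C / n ** alpha

def unweighted_rate(n: int, C_tilde: float = 1.0, b: float = 0.5, eps: float = 0.05) -> float:
    """C_tilde / n^(1 - (b + eps)/2): unweighted graphs with upper box-counting dimension b."""
    return C_tilde / n ** (1.0 - (b + eps) / 2.0)

for n in (100, 1_000, 10_000):
    print(n, round(weighted_rate(n), 5), round(unweighted_rate(n), 5))
```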

3. Transferability in Statistical and Learning Frameworks

In transfer learning, size transferability bounds quantify how source task size and similarity impact error rates and generalization. In high-dimensional quantile regression (Huang et al., 2022), explicit non-asymptotic bounds depend on the total transferable source sample size $n_h$ and an $\ell_1$-distance threshold $h$:

$$\| \hat{\beta}_h - \beta_0^* \|_2 \lesssim \sqrt{\frac{s_0 \log p}{n_0 + n_h}} + \sqrt{h} \left( \frac{\log p}{n_0 + n_h} \right)^{1/4}.$$

The detection algorithm ensures that only sources within this similarity threshold are included, tuning transfer benefit directly to the size and closeness of auxiliary data.
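
As a hedged illustration of how the displayed rate trades off sample size against source-target distance, the sketch below sets the constant suppressed by "$\lesssim$" to 1; all parameter values are made up.

```python
from math import log, sqrt

def quantile_transfer_rate(s0: int, p: int, n0: int, n_h: int, h: float) -> float:
    """Rate from the displayed bound with the suppressed constant set to 1."""
    eff_n = n0 + n_h
    return sqrt(s0 * log(p) / eff_n) + sqrt(h) * (log(p) / eff_n) ** 0.25

# Target-only estimation vs. transfer from sources within l1-distance h.
print(quantile_transfer_rate(s0=10, p=1_000, n0=200, n_h=0, h=0.0))      # no transfer
print(quantile_transfer_rate(s0=10, p=1_000, n0=200, n_h=2_000, h=0.5))  # with transfer
```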

In transfer learning risk analysis, Wasserstein Distance-based Joint Estimation (WDJE) (Zhan et al., 2023) provides unified size transferability bounds for classification and regression:

$$R_{D^T}(h, f^T) \le R_{D^S}(h, f^S) + k\lambda\, W[p^S(x), p^T(x)] + W[p^S(y), p^T(y)] + k M \phi(\lambda).$$

The WDJE score compares this bound against target-only risk, guiding the decision to transfer based on empirical Wasserstein distances (domain/task differences) and source task size.
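
A minimal sketch of that decision rule, assuming the risks, Wasserstein distances, and constants $k$, $\lambda$, $M$, $\phi(\lambda)$ have already been estimated (all numbers below are placeholders):

```python
def wdje_bound(source_risk: float, w_x: float, w_y: float,
               k: float, lam: float, M: float, phi_lam: float) -> float:
    """Right-hand side of the displayed bound on the target risk."""
    return source_risk + k * lam * w_x + w_y + k * M * phi_lam

def should_transfer(source_risk: float, target_only_risk: float, w_x: float, w_y: float,
                    k: float, lam: float, M: float, phi_lam: float) -> bool:
    """Transfer when the bounded target risk of the source hypothesis beats the target-only risk."""
    return wdje_bound(source_risk, w_x, w_y, k, lam, M, phi_lam) < target_only_risk

print(should_transfer(source_risk=0.10, target_only_risk=0.35,
                      w_x=0.05, w_y=0.02, k=1.0, lam=1.0, M=0.1, phi_lam=0.01))
```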

In adversarial robustness and attack transferability, explicit bounds relate ensemble size, diversity, and model smoothness to transfer probabilities (Yang et al., 2021). The derived lower and upper bounds incorporate ensemble error rates, cosine similarity of loss gradients, and smoothness constants:

  • Lower bound: the transfer probability is bounded below by a function of gradient similarity, error rates, and perturbation size.
  • Upper bound: the transfer probability is bounded above by a function of empirical risks, minimum loss, and ensemble smoothness.

By controlling model size, diversity, and smoothness, ensembles can achieve predictable and bounded adversarial transferability, confirming that increased structural richness and regularization directly reduce vulnerability.
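
One ingredient of these bounds, the cosine similarity of loss gradients across ensemble members, is easy to compute; the sketch below uses random vectors as stand-ins for per-model gradients and does not evaluate the bounds themselves.

```python
import numpy as np

def cosine_similarity(g1: np.ndarray, g2: np.ndarray) -> float:
    return float(g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2) + 1e-12))

def mean_pairwise_gradient_similarity(gradients) -> float:
    """Average pairwise cosine similarity of per-model loss gradients; lower means more diverse."""
    sims = [cosine_similarity(gradients[i], gradients[j])
            for i in range(len(gradients)) for j in range(i + 1, len(gradients))]
    return float(np.mean(sims))

rng = np.random.default_rng(0)
grads = [rng.normal(size=128) for _ in range(4)]  # placeholder loss gradients, one per model
print(mean_pairwise_gradient_similarity(grads))
```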

4. Transferability under Non-Standard and Limit Conditions

Recent theory demonstrates that transferability is possible even when classic conditions (bounded density ratio between source and target) fail (Kalavasis et al., 18 Mar 2024). For low-degree polynomial estimators over $\mathbb{R}^n$, the crucial transfer bound is

$$E_Q[(f(x) - f^*(x))^2] \leq C_d \cdot \|dP/dQ\|_\infty^{2d}\, E_P[(f(x) - f^*(x))^2],$$

showing that as long as the “inverse” density ratio is controlled, transfer is possible. In the Boolean case, transferability holds under a maximum influence constraint:

$$E_Q[(f(x) - f^*(x))^2] \leq C_d\, Q(S)^{-2d}\, E_P[(f(x) - f^*(x))^2],$$

where $I_{\max}(\widehat{f} - f^*)$ measures the largest variable-specific impact on error. These results extend size transferability theory to cases with severe support mismatch or heavy truncation.
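
Reading the continuous-case inequality as a recipe for propagating error estimates, a minimal sketch (with an illustrative constant $C_d$ and placeholder numbers) is:

```python
def target_error_bound(source_error: float, inv_density_ratio_sup: float,
                       degree: int, C_d: float = 1.0) -> float:
    """E_Q[(f - f*)^2] <= C_d * ||dP/dQ||_inf^(2d) * E_P[(f - f*)^2]."""
    return C_d * inv_density_ratio_sup ** (2 * degree) * source_error

# Illustrative numbers: source error 1e-3, sup of dP/dQ equal to 2, a degree-3 polynomial.
print(target_error_bound(source_error=1e-3, inv_density_ratio_sup=2.0, degree=3))
```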

5. Transferability in Database Query Processing

In database theory, size transferability bounds relate the output size of joins to degree sequences and their $\ell_p$-norms (Khamis et al., 2023). The central theorem generalizes the AGM and PANDA bounds:

$$|Q(D)| \leq \prod_{i \in [s]} \left\| \deg_{R_{j_i}^D}(V_i \mid U_i) \right\|_{p_i}^{w_i}$$

for any join query $Q$ on input database $D$. By tuning $p_i$ and $w_i$, this bound captures instance-specific size effects, offering tighter predictions for both cyclic and acyclic queries, even when traditional bounds trivialize. This transferability across query sizes and structures is achieved by incorporating richer degree statistics.
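
For intuition, with $p_i = 2$ and $w_i = 1$ on a two-relation join the bound specializes to a Cauchy–Schwarz argument over the shared key; the toy sketch below computes degree sequences and their $\ell_2$-norms for illustrative relations (it does not optimize over the $(p_i, w_i)$ choices).

```python
def degree_sequence(relation, u_idx: int, v_idx: int):
    """deg(V | U): for each U-value, the number of distinct V-values paired with it."""
    groups = {}
    for row in relation:
        groups.setdefault(row[u_idx], set()).add(row[v_idx])
    return [len(vals) for vals in groups.values()]

def lp_norm(seq, p: float) -> float:
    return max(seq) if p == float("inf") else sum(x ** p for x in seq) ** (1.0 / p)

# Toy relations R(A, B) and S(B, C); their join Q = R(A,B) ⋈ S(B,C) has 4 tuples.
R = [(1, 1), (1, 2), (2, 1)]
S = [(1, 10), (2, 10), (2, 20)]
bound = lp_norm(degree_sequence(R, 1, 0), 2) * lp_norm(degree_sequence(S, 0, 1), 2)
print("l2-degree bound:", bound)  # about 5, compared with the true output size of 4
```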

6. Continuity, Operator Theory, and Design Principles

Transferability for varying input dimension is underpinned by continuity and operator-theoretic properties. For graph neural architectures and set models, transferability corresponds to continuity (and often Lipschitz continuity) with respect to problem size in a limit space (Levin et al., 29 May 2025). The Fréchet derivative of the model mapping can be bounded uniformly, independently of the input dimension $n$:

$$\| D\rho_n^{(i)}(A, X_0, \ldots, X_S)[H, H_0, \ldots, H_S] \|_\infty \leq L(r)\, \| (H, H_0, \ldots, H_S) \|_\infty$$

with $L(r)$ independent of $n$. By composing such locally Lipschitz layers, entire models (e.g., GGNNs) are locally Lipschitz transferable; their infinite-dimensional extensions "inherit" this property, guaranteeing continuous extension as size increases.
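
A numerical sketch of the idea (not the paper's construction): estimate the sup-norm Lipschitz constant of a degree-normalized aggregation layer by finite differences for increasing graph sizes; the layer, weight matrix, and random graphs are stand-ins, and the estimate only probes, rather than certifies, size independence.

```python
import numpy as np

def mean_aggregation_layer(A: np.ndarray, X: np.ndarray, W: np.ndarray) -> np.ndarray:
    """One message-passing layer: degree-normalized aggregation, linear map, tanh."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1.0)
    return np.tanh((A / deg) @ X @ W)

def lipschitz_estimate(A: np.ndarray, W: np.ndarray, trials: int = 200, seed: int = 0) -> float:
    """Finite-difference lower estimate of the layer's Lipschitz constant in the sup norm."""
    rng = np.random.default_rng(seed)
    n, d = A.shape[0], W.shape[0]
    est = 0.0
    for _ in range(trials):
        X = rng.normal(size=(n, d))
        H = rng.normal(size=(n, d))
        diff = mean_aggregation_layer(A, X + 1e-3 * H, W) - mean_aggregation_layer(A, X, W)
        est = max(est, np.abs(diff).max() / (1e-3 * np.abs(H).max()))
    return est

W = np.random.default_rng(1).normal(size=(4, 4)) * 0.3
for n in (20, 80, 320):
    A = (np.random.default_rng(n).random((n, n)) < 0.3).astype(float)  # random graph on n nodes
    print(n, round(lipschitz_estimate(A, W), 3))
```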

This principle informs the design of architectures that generalize dimension-independently—key for applications in point clouds, sets, or large graphs. Size transferability bounds thus provide actionable criteria for creating dimension-agnostic learning systems.

7. Impact and Applications

Size transferability bounds are critical in theory and practice for:

  • Certifying generalization and robustness when upscaling models.
  • Designing and benchmarking error-correcting codes with tight size limits.
  • Ensuring cross-size (and temperature) transferability in molecular simulations.
  • Predicting output sizes in query processing and optimizing database algorithms.
  • Guiding architecture design for set, graph, or variable-size inputs in machine learning.
  • Enabling secure and controlled adversarial transfer in ensemble models.

By providing explicit, quantitative relationships anchored in structural properties and rigorous analysis, these bounds offer a foundation for both theoretical investigation and large-scale deployment of modern learning and inference systems.
