
Criticality-Derived Weighting

Updated 16 March 2026
  • Criticality-derived weighting is a framework that assigns importance to data or network elements based on critical states like phase transitions and marginal stability.
  • It integrates methodologies from deep learning, data assimilation, and reinforcement learning to enhance training dynamics and overall system performance.
  • Empirical studies demonstrate that this approach improves robustness and prediction accuracy by emphasizing rare or highly influential samples and structures.

Criticality-derived weighting encompasses a set of methodologies where the relative importance of elements—examples, samples, network parameters, or infrastructure nodes—is assigned according to measures of “criticality” drawn from statistical physics, optimization theory, human behavior, or data-driven signals. These schemes are unified by the notion that systems (physical, computational, or sociotechnical) exhibit optimality, robustness, or maximal functional range near points of criticality (phase transitions, marginal stability, peak uncertainty, or maximal influence). Criticality-derived weighting frameworks appear in domains as varied as deep learning, data assimilation, resilience analysis, offline reinforcement learning, and neural network initialization. The following sections review key principles, formal methodologies, canonical models, algorithmic prescriptions, and cross-domain impacts, drawing on recent research across multiple fields.

1. Foundations of Criticality-Derived Weighting

Criticality, in this context, refers to the heightened dynamical, informational, or functional relevance of system components when certain quantitative criteria are met—often associated with phase transitions, marginal stability, or maximal influence over global system behavior. Criticality-derived weighting schemes prescribe that these components be upweighted in objective functions, sampling distributions, or network initialization, either for improved training dynamics, robustness, sampling fidelity, or societal resilience.

In deep learning, the central observation is that the gradient magnitude with respect to a sample's output logits (i.e., $|\partial \ell / \partial z|$) directly quantifies its “pull” on the model; large-magnitude gradients mark “critical” data points for parameter updates (Wang et al., 2019). In ensemble data assimilation, weights attached to critical points are determined by both a data-mismatch functional and a local Jacobian determinant, reflecting local posterior geometry (Ba et al., 2023). Human-centered resilience metrics combine behavioral dependence, structural substitutability, and access patterns to generate per-facility criticality weights that modulate regional vulnerability scores (Ma et al., 18 Dec 2025). Offline RL leverages signals from uncertainty quantification or long-tail events to amplify data from rare, high-risk, or information-dense samples, directly injecting criticality-derived sampling probabilities into objective functions (Guillen-Perez, 25 Aug 2025). Finally, in network theory and statistical field perspectives, criticality appears as a set of initialization and architectural prescriptions ensuring maximal depth-to-width dynamical range and stable signal propagation (Sundberg et al., 1 Aug 2025).

2. Mathematical Schemes and Formal Definitions

Mathematical formalizations of criticality-derived weights vary by domain but share common elements: a criticality signal $c_i$ is computed per element (sample, configuration, facility, etc.), then normalized to produce nonnegative weights $w_i$ for use in reweighting objectives, sampling, or predictions.

Deep Learning Loss Manipulation (DM)

In DM, the training objective for a model with parameters $\theta$ and logits $z_i$ for sample $i$ is altered by prescribing a desired emphasis density $g(p_i)$ over the softmax probability $p_i$:

  • Compute the model output $z_i$ and $p_i = \mathrm{softmax}(z_i)_{y_i}$.
  • Define the target gradient magnitude $w_i^{DM} = g(p_i)$, e.g. polynomial, exponential, or normal forms.
  • Rescale the standard loss gradient to enforce $\|\nabla_{z_i}^{DM}\| = w_i^{DM}$.
  • Optionally normalize weights within a batch: $\hat w_i = w_i^{DM} / \sum_j w_j^{DM} \times B$.
  • Inject the modified gradients in backpropagation (Wang et al., 2019).
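The steps above can be sketched in plain NumPy; this is a minimal illustration, not the authors' reference implementation, and the polynomial emphasis `g_poly` and helper names are assumptions chosen for the example:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def dm_gradients(logits, labels, g, batch_norm=True):
    """Derivative Manipulation sketch: rescale each per-sample
    cross-entropy gradient so its norm equals the emphasis g(p_i)."""
    probs = softmax(logits)                      # (B, C)
    B = logits.shape[0]
    p = probs[np.arange(B), labels]              # p_i = softmax(z_i)_{y_i}
    w = g(p)                                     # target gradient magnitudes
    if batch_norm:
        w = w / w.sum() * B                      # normalize within the batch
    grad = probs.copy()
    grad[np.arange(B), labels] -= 1.0            # standard CE gradient wrt logits
    norms = np.linalg.norm(grad, axis=1, keepdims=True) + 1e-12
    return grad / norms * w[:, None]             # enforce ||grad_i|| = w_i

# example emphasis: polynomial form g(p) = p^alpha (1 - p)^eta
g_poly = lambda p, alpha=0.5, eta=2.0: p**alpha * (1 - p)**eta
```

Choosing a different `g` (exponential, normal) changes only the emphasis density; the rescaling machinery is unchanged, which is how DM subsumes focal-loss-style reweighting.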

Ensemble-Based Data Assimilation

Criticality weights for each critical-point sample $x^*$ are $w(x^*) \propto \exp\left[-\varphi(x^*)\right] J(x^*)^{-1}$, where $\varphi(x^*)$ is a quadratic data-mismatch (combining prior deviation and data misfit) and $J(x^*)$ is the local Jacobian determinant of the mapping from prior samples to critical points. Under linear updates, $J(x^*)$ is constant and effectively cancels; for hybrid nonlinear mappings, $J(x^*)$ varies, often leading to multimodal posteriors and nontrivial sample efficacy (Ba et al., 2023).
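A minimal sketch of these self-normalized importance weights, computed in log space for numerical stability; the function name is illustrative, and the effective-sample-size diagnostic is a standard importance-sampling add-on rather than part of the cited scheme:

```python
import numpy as np

def critical_point_weights(phi, jac_det):
    """Self-normalized weights w(x*) ∝ exp(-phi(x*)) / J(x*).

    phi: data-mismatch values phi(x*) per ensemble member;
    jac_det: local Jacobian determinants J(x*) (all positive)."""
    log_w = -np.asarray(phi, dtype=float) - np.log(np.asarray(jac_det, dtype=float))
    log_w -= log_w.max()            # stabilize before exponentiating
    w = np.exp(log_w)
    w /= w.sum()                    # normalize to a probability vector
    n_eff = 1.0 / np.sum(w**2)      # effective sample size diagnostic
    return w, n_eff
```

When `jac_det` is constant (the linear-update case), it cancels under normalization, matching the observation above; a low `n_eff` flags the sharply multimodal regimes discussed in Section 6.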

Functional Criticality in Infrastructure

The functional criticality score for facility $f$ is

$$C_f = \frac{1}{N_f} \sum_{i\in O_f} \frac{V_{i,f}}{s_i}$$

where $V_{i,f}$ is the visit count from origin $i$ to facility $f$ and $s_i$ is the substitutability of origin $i$. These scores are linearly normalized to $C_f^{\rm norm} \in [0,1]$ and used as multiplicative weights in downstream risk assessment (Ma et al., 18 Dec 2025).
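A sketch of this score over an origin-facility visit matrix; note two assumptions not fixed by the formula above: $N_f$ is taken here as the number of origins with nonzero visits to $f$, and the linear normalization is implemented as min-max scaling:

```python
import numpy as np

def functional_criticality(visits, substitutability):
    """C_f = (1/N_f) * sum_{i in O_f} V_{i,f} / s_i, scaled to [0, 1].

    visits: (origins x facilities) visit-count matrix V_{i,f};
    substitutability: per-origin substitutability s_i > 0."""
    V = np.asarray(visits, dtype=float)
    s = np.asarray(substitutability, dtype=float)
    N_f = (V > 0).sum(axis=0)                       # origins actually using f
    C = (V / s[:, None]).sum(axis=0) / np.maximum(N_f, 1)
    lo, hi = C.min(), C.max()                       # min-max normalization
    return (C - lo) / (hi - lo) if hi > lo else np.zeros_like(C)
```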

Offline RL Long-Tail and Uncertainty Weighting

Timestep or scenario-level criticality signals $c_t \in [0,1]$ (e.g., model uncertainty, action rarity, heuristic risk) are normalized as $w_t = c_t / \sum_{t'} c_{t'}$ and introduced into the weighted loss or sampling distribution of the RL agent (Guillen-Perez, 25 Aug 2025).
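The normalization and its use as a sampling distribution can be sketched as follows; the function names and the uniform fallback for all-zero scores are illustrative assumptions:

```python
import numpy as np

def criticality_sampling_probs(c):
    """Normalize criticality scores c_t in [0, 1] into sampling
    probabilities w_t = c_t / sum_{t'} c_{t'}."""
    c = np.asarray(c, dtype=float)
    total = c.sum()
    if total <= 0:
        return np.full(len(c), 1.0 / len(c))   # fallback: uniform sampling
    return c / total

def sample_batch(data, scores, batch_size, rng=None):
    """Draw a training batch with probability proportional to criticality."""
    rng = rng or np.random.default_rng()
    p = criticality_sampling_probs(scores)
    idx = rng.choice(len(data), size=batch_size, p=p)
    return [data[i] for i in idx]
```

In a PyTorch pipeline the same effect is obtained by passing the normalized scores to a weighted sampler rather than sampling manually.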

Table 1: Examples of Criticality-Weighting Function Forms

| Domain | Criticality Signal | Weighting Formula |
|---|---|---|
| Deep nets | $p_i$ (softmax) | $g(p_i) = p^\alpha (1-p)^\eta$, $e^{\beta(1-p)}$ |
| Data assimilation | Data-mismatch, Jacobian | $\exp[-\varphi(x^*)]\, J(x^*)^{-1}$ |
| Resilience | Visits, substitutability | $C_f = N_f^{-1} \sum_i V_{i,f}/s_i$ |
| RL | Uncertainty, rarity, risk | $w_t = c_t / \sum_{t'} c_{t'}$ |

Across settings, the weighting function is designed to emphasize, suppress, or equilibrate components as a function of their role in critical system behavior.

3. Algorithmic Prescriptions and Implementation

The practical implementation of criticality-derived weighting depends on the recognition of criticality signals, normalization, and their injection into core algorithms.

  • Derivative Manipulation (DM): Compute per-example criticality via the forward model, determine $g(p_i)$, normalize, and rescale backpropagation gradients accordingly. This enables direct control over which regions of $p_i$ (“easy”, “hard”, “intermediate”) are stressed during optimization, subsuming categorical cross-entropy, mean absolute or squared error, focal loss, and other sample-reweighting protocols as special cases (Wang et al., 2019).
  • Weighted RML in Data Assimilation: After assembling perturbed prior ensembles, each sample’s criticality is computed via a joint data-mismatch functional and (potentially sample-dependent) Jacobian, then incorporated into importance-weighted sampling for marginal posterior estimation. When hybrid models are present, local curvature and nonlinearity produce sample-specific weights essential for capturing multimodal posteriors (Ba et al., 2023).
  • Human-Centered Infrastructure Analysis: Populate an origin–facility matrix from behavioral mobility records, compute substitutability-adjusted dependence per facility, normalize within lifeline type, and apply as weights in the aggregation of hazard-exposure or vulnerability indices. The framework aligns infrastructure criticality assessments with real-world use rather than asset-centric proxies (Ma et al., 18 Dec 2025).
  • Offline RL Data Curation: After quantifying criticality via kinematic risk, interaction scores, action rarity, or model ensemble uncertainty, normalize sample scores, and use as sampling weights in batch stochastic optimization. Empirically, per-timestep uncertainty-weighting maximizes reactive safety, while scenario-level weighting improves long-horizon planning. Implementation leverages WeightedRandomSampler or scenario-resampling primitive in data pipelines (Guillen-Perez, 25 Aug 2025).
  • Critical Network Initialization: Statistical field theory and renormalization-group analysis yield hyperparameter formulas for weight/bias variance and depth/width ratios to ensure criticality of signal propagation, stable training, and optimal dynamical range. For ReLU networks, the prescription is $W^{(\ell)}_{ij}\sim N(0, 2/n)$, $b=0$, with learning rates scaled by $1/L$ and $L/n$ ratios kept low (Sundberg et al., 1 Aug 2025). In organizational Ising models, weights are iteratively fitted to reproduce critical long-range correlation structure, yielding maximal mutual information and behavioral transitions in embodied controllers (Aguilera et al., 2017).
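The ReLU prescription in the last bullet ($W \sim N(0, 2/n)$, $b = 0$, the He/critical scaling) can be sketched in a few lines of NumPy; this is a minimal illustration of the initialization itself, not of the full renormalization-group analysis:

```python
import numpy as np

def critical_relu_init(layer_sizes, rng=None):
    """Critical (He) initialization for a ReLU MLP:
    W^(l) ~ N(0, 2/n_in), b = 0, which keeps the preactivation
    variance approximately constant with depth."""
    rng = rng or np.random.default_rng()
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((W, b))
    return params

def forward(x, params):
    """Forward pass with ReLU on hidden layers, linear output layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x
```

The factor 2 compensates for ReLU zeroing half the preactivation distribution; dropping it (variance $1/n$) puts the network in the ordered, signal-attenuating phase.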

4. Empirical Performance and Cross-Domain Impact

Empirical studies report that criticality-derived weighting strategies consistently outperform conventional uniform- or heuristic-based approaches under data imbalance, noise, complex dynamics, or risk propagation settings.

  • Deep Models: DM yields substantial gains on vision and language tasks with severe label noise or class imbalance (e.g., CIFAR-100 at 40% noise: accuracy increases from 53.2% to 61.0%; Clothing1M: 73.3% vs. best prior 72.2%) (Wang et al., 2019).
  • Data Assimilation: Weighted RML with criticality weighting enables accurate posterior estimation in highly non-Gaussian, multimodal settings (e.g., hierarchical Gaussian models, nonlinear permeability transforms); hybrid weighting outperforms standard iterative ensemble smoothers in non-convex regimes (Ba et al., 2023).
  • Resilience Planning: Functional criticality analysis exposes deeply concentrated behavioral dependence, with a small minority of facilities absorbing the majority of functional risk (2.8% of grocery stores, 14.8% of hospitals classified as highly critical); normalized criticality weights drive population-weighted vulnerability indices, revealing that climate-induced flood vulnerability grows disproportionately in critical service nodes (Ma et al., 18 Dec 2025).
  • Offline RL: Non-uniform sampling by model uncertainty reduces collision rates by nearly a factor of three (from 16.0% to 5.5%) in autonomous driving, compared to baseline CQL agents trained with uniform sampling. Scenario-level criticality weighting optimizes planning, while per-timestep weighting directly improves safety and comfort metrics (Guillen-Perez, 25 Aug 2025).
  • Neural Network Training: Critical initialization and architecture scaling (ReLU, $C_W^* = 2/n$) enables stable training with stochastic gradient descent in nuclear binding energy models, achieving few-MeV final errors. Noncritical initialization or suboptimal $L/n$ ratios lead to instability or degraded performance (Sundberg et al., 1 Aug 2025).

5. Methodological Variants and Domain Extensions

Significant methodological diversity exists within criticality-derived weighting, as evidenced by differences in the underlying criticality signals and normalization procedures:

  • Sample criticality: Gradient magnitude or model uncertainty as immediate indicators of “critical” data; directly impacts loss functions in supervised/deep learning and RL.
  • Correlation-driven weighting: Weight matrices $\{J_{ij}\}$ learned to match critical correlations from physical models, as in Ising-based architectures for embodied agents; emphasizes scale-free, maximally informative dynamics (Aguilera et al., 2017).
  • Jacobian-augmented importance: In ensemble data assimilation, the Jacobian determinant modulates sample weights to correct for local curvature and nonlinearity, especially essential for multimodal or ill-posed problems (Ba et al., 2023).
  • Behavioral functional dependence: Facility importance assessed by behavioral (mobility-derived) dependence, substitutability, and catchment size in infrastructure resilience; aligns weighting with real-world systemic impact (Ma et al., 18 Dec 2025).
  • RG-based initialization and dynamical criticality: Explicit field-theory analysis provides layerwise or architecture-dependent prescriptions for hyperparameters, ensuring extended stable signal propagation (Sundberg et al., 1 Aug 2025).

A plausible implication is that as criticality-derived weighting schemes continue to propagate across scientific and engineering domains, further domain-specific signals of criticality (e.g., energy landscapes, network flow, mutual information) could be operationalized for weighting, driving advanced robustness and adaptivity.

6. Theoretical Significance and Limitations

Criticality-derived weighting offers a unifying physical/statistical foundation for a broad class of weighting, sampling, and initialization schemes. By rooting importance in critical behavior—via gradients, correlation structure, uncertainty, or functional necessity—these frameworks typically maximize information flow, robustness to perturbation (e.g., label noise, rare events, multimodal posteriors), and behavioral flexibility.

However, several limitations are noted across the literature:

  • Optimization: For deep networks, many criticality-based analyses are limited to simple SGD optimizers and may be superseded by adaptive schemes empirically; optimality of criticality under all training protocols is not guaranteed (Sundberg et al., 1 Aug 2025).
  • Implementation: Functional criticality metrics in infrastructure require extensive, high-fidelity human mobility records, and may be sensitive to temporal or spatial sampling biases (Ma et al., 18 Dec 2025).
  • Scalability: In ensemble techniques, the effective sample size ($N_{\rm eff}$) may become limiting under sharply multimodal geometries; denoising, regularization, or larger ensembles may be required (Ba et al., 2023).
  • Generalization: The universality of criticality as an organizing principle is supported in several models, but domain-specific adjustments, normalization conventions, and limitations on signal extraction must be empirically validated.

7. Synthesis and Outlook

Criticality-derived weighting constitutes a mathematically tractable, physically motivated, and empirically validated design principle for robustly weighting components in complex systems across machine learning, inference, control, and resilience analysis. Key attributes include:

  • Transparent mapping from criticality signal to sample/facility/parameter weight.
  • Subsumption and generalization of existing loss reweighting, hard-mining, curriculum, or regularization schemes.
  • Universal applicability in domains with identifiable phase transitions, uncertainty concentration, or dynamically marginal regimes.

As research progresses, deeper integration of criticality-derived weighting with adaptive optimization, high-resolution behavioral data, and mechanistic scientific models is expected to drive new advances in system robustness, controllability, and interpretability across domains.
