Min-form Value Functions

Updated 12 April 2026

Min-form value functions are defined as the minimum over a set of evaluator functions, encoding worst-case or bottleneck objectives in various applications.
They appear throughout dynamic programming, reinforcement learning, and control theory, where their nonconvex structure and directional differentiability support sensitivity analysis and robust optimization.
Algorithmic approaches leverage (min,+) algebra and proximal methods to efficiently handle the inherent nonsmoothness and combinatorial challenges of these functions.

A min-form value function is a function defined as the infimum or minimum over a set of evaluator functions or rewards, often subject to constraints or structured over admissible actions, parameter combinations, or trajectories. Min-form value functions arise naturally in mathematical programming, statistical inference, control theory, Markov decision processes, game theory, mean-field games, and reinforcement learning, where they encode worst-case, bottleneck, or extremal objectives. Their structure is central to the analysis of optimization sensitivity, the development of nonconvex and non-differentiable algorithms, and modern credit assignment mechanisms in sequential learning settings.

1. Formal Definitions and Algebraic Structure

A general min-form value function takes the form

$V(f)(x) = \inf_{u \in A(x)} f(u, x),$

where $f: U \times X \to \mathbb{R}$, $A$ is a compact-valued, upper-hemicontinuous correspondence with $A(x) \subset U$, and $x \in X$ is the state parameter (Firpo et al., 2019). This construction underpins classical optimal value functions in mathematical programming, where for a parameterized feasible set $Y(x) = \{ y \mid g(x, y) \leq 0 \}$,

$v(x) = \min_{y \in Y(x)} f(x, y)$

provides the baseline for parametric sensitivity and subdifferential analysis (Zemkoho, 2017).

In dynamic programming and control, the min-form appears as the (min,+) linear combination $J_\theta(s) = \min_{1 \leq i \leq k} \{\phi_i(s) + \theta_i\}$ of basis functions $\phi_i$ with offsets $\theta_i$; the set of all such $J_\theta$ forms a (min,+) subsemimodule (Lakshminarayanan et al., 2014, Lakshminarayanan et al., 2014).
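As a minimal sketch of the (min,+) combination, the snippet below evaluates $J_\theta$ on a small finite state space. The basis values `phi` and offsets `theta` are made-up illustrative numbers, not taken from the cited work.

```python
def min_plus_combination(phi, theta):
    """Evaluate J_theta(s) = min_i (phi[i][s] + theta[i]) for every state s.

    phi[i][s] is the value of basis function i at state s (hypothetical data),
    theta[i] is the (min,+) "coefficient", i.e. an additive offset.
    """
    n_states = len(phi[0])
    return [
        min(phi_i[s] + t for phi_i, t in zip(phi, theta))
        for s in range(n_states)
    ]


# Three illustrative basis functions on four states.
phi = [
    [0.0, 1.0, 2.0, 3.0],
    [3.0, 2.0, 1.0, 0.0],
    [1.5, 1.5, 1.5, 1.5],
]
theta = [0.0, 0.5, 0.0]

J = min_plus_combination(phi, theta)

# Adding a constant to every offset shifts J by the same constant, which is
# the (min,+) analogue of scaling a linear combination by a scalar.
J_shifted = min_plus_combination(phi, [t + 2.0 for t in theta])
```

Here ordinary addition plays the role of multiplication and `min` plays the role of addition, which is why the family of all `J` obtained by varying `theta` behaves like a linear span in the (min,+) sense.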

In reinforcement learning with process reward models, the min-form value function is constructed over reward trajectories $r_i$ as

$V_{\min}(s_t) = \mathbb{E}_t \Big[ \min_{i \geq t} r_i \Big],$

targeting the worst-reward event rather than the canonical cumulative discounted sum (Cheng et al., 21 Apr 2025).
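The difference from a cumulative objective can be illustrated on a single trajectory. The sketch below computes the min-over-future-rewards target at every step and a naive Monte Carlo estimate of $V_{\min}$; the function names and the assumption that all sampled trajectories pass through the same state at step `t` are illustrative only.

```python
def min_to_go(rewards):
    """Suffix minima: out[t] = min(rewards[t:])."""
    out = [0.0] * len(rewards)
    running = float("inf")
    # Scan backwards so that each step sees the minimum of everything after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = min(running, rewards[t])
        out[t] = running
    return out


def estimate_v_min(trajectories, t):
    """Monte Carlo average of min_{i >= t} r_i over sampled reward sequences.

    Assumes (hypothetically) that every trajectory shares the state at step t.
    """
    return sum(min(traj[t:]) for traj in trajectories) / len(trajectories)


process_rewards = [0.9, 0.8, -0.5, 0.7]
targets = min_to_go(process_rewards)
```

A single bad step at index 2 determines the target for every earlier step, whereas later steps are unaffected by it.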

Min-convex functions, as studied in variational analysis, represent the minimum over a finite family of convex functions: $f(x) = \min_{i \in I} f_i(x)$, where $I$ is a finite index set and each $f_i$ is a proper, lower semicontinuous, convex function (Dao et al., 2018).

2. Differentiability and Sensitivity Properties

The primary technical feature of min-form value functions is their general lack of classical differentiability. The mapping $f \mapsto V(f)$ fails to be fully Hadamard differentiable because of non-smoothness and the possible non-uniqueness of minimizers (Firpo et al., 2019). However, the map is Hadamard directionally differentiable. The directional derivative at $f$ in direction $h$ is

$V'_f(h)(x) = \lim_{\varepsilon \downarrow 0} \inf_{u \in A_f(x, \varepsilon)} h(u, x),$

where $A_f(x, \varepsilon) = \{ u \in A(x) : f(u, x) \leq V(f)(x) + \varepsilon \}$ is the set of $\varepsilon$-minimizers of $f(\cdot, x)$ over $A(x)$.
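For a finite action set at a fixed state, the directional derivative of a minimum reduces to the minimum of the direction over the set of minimizers (a Danskin-type formula). The sketch below checks this against a finite difference and shows why the derivative is not linear in the direction; the values `f_vals` and `h_vals` are illustrative.

```python
def directional_derivative(f_vals, h_vals, tol=1e-12):
    """min of h over the minimizers of f (finite action set, fixed state)."""
    v = min(f_vals)
    return min(h for f, h in zip(f_vals, h_vals) if f <= v + tol)


def finite_difference(f_vals, h_vals, t=1e-6):
    """One-sided difference quotient (min(f + t*h) - min(f)) / t."""
    perturbed = [f + t * h for f, h in zip(f_vals, h_vals)]
    return (min(perturbed) - min(f_vals)) / t


# Two actions tie for the minimum, so the minimizer is not unique.
f_vals = [1.0, 1.0, 2.0]
h_vals = [3.0, -1.0, -5.0]

d_plus = directional_derivative(f_vals, h_vals)
d_minus = directional_derivative(f_vals, [-h for h in h_vals])
fd = finite_difference(f_vals, h_vals)
```

With a unique minimizer the two one-sided derivatives would satisfy `d_minus == -d_plus`; the tie between the first two actions breaks this symmetry, which is exactly the failure of full differentiability.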

In parametric optimization, under sufficient constraint qualifications (e.g., Mangasarian–Fromovitz), first- and second-order subdifferentials of $v$ can be computed. For a unique primal-dual solution, generalized Hessian estimates are given in terms of Lagrangian derivatives and coderivatives of the argmin and multiplier mappings (Zemkoho, 2017).

This directional differentiability undergirds the development of uniform inference procedures and resampling techniques that are valid for min-form models, enabling the construction of confidence bands via functional Delta methods and empirical bootstrap (Firpo et al., 2019).

3. Min-Form Value Functions in Dynamic Programming and RL

The (min,+) algebra forms a foundation for constructing min-form value functions on state spaces. The (min,+) linear span of designated basis vectors $\phi_1, \dots, \phi_k$ yields a subsemimodule within which approximate value functions are defined: $\mathcal{V} = \{ J_\theta : J_\theta(s) = \min_{1 \leq i \leq k} \{\phi_i(s) + \theta_i\},\ \theta \in \mathbb{R}^k \}.$ Approximate Dynamic Programming (ADP) with (min,+) function approximation proceeds via a monotone projection operator, $\Pi_M J = \min \{ V \in \mathcal{V} : V \geq J \},$ that finds the smallest dominating element in $\mathcal{V}$. The projected Bellman recursion,

$J_{n+1} = \Pi_M T J_n,$

where $T$ is the Bellman operator, is a contraction in the sup-norm and converges geometrically to a unique fixed point $\tilde{J}$ in $\mathcal{V}$ (Lakshminarayanan et al., 2014, Lakshminarayanan et al., 2014). The associated error bounds and explicit convergence rates are controlled by the projection error in the sup-norm.
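The projected recursion can be sketched on a toy cost-minimizing MDP. The code below assumes a discounted Bellman operator with per-step costs, and it computes the smallest dominating element by taking each offset as the largest gap between the target and the corresponding basis function. The MDP data, the basis `phi`, and all function names are illustrative assumptions, not the cited authors' implementation.

```python
def bellman(J, cost, P, gamma):
    """(T J)(s) = min_a [ cost[s][a] + gamma * sum_s2 P[s][a][s2] * J[s2] ]."""
    n = len(J)
    return [
        min(
            cost[s][a] + gamma * sum(P[s][a][s2] * J[s2] for s2 in range(n))
            for a in range(len(cost[s]))
        )
        for s in range(n)
    ]


def project(J, phi):
    """Smallest element of the (min,+) span of phi that dominates J.

    phi_i + theta_i >= J everywhere iff theta_i >= max_s (J[s] - phi_i[s]),
    so the least feasible offsets give the least dominating element.
    """
    theta = [max(J[s] - phi_i[s] for s in range(len(J))) for phi_i in phi]
    J_proj = [
        min(phi_i[s] + t for phi_i, t in zip(phi, theta)) for s in range(len(J))
    ]
    return theta, J_proj


def projected_value_iteration(cost, P, phi, gamma, iters=300):
    """Iterate J <- project(T J) starting from zero."""
    J = [0.0] * len(cost)
    theta = [0.0] * len(phi)
    for _ in range(iters):
        theta, J = project(bellman(J, cost, P, gamma), phi)
    return theta, J


# Toy problem: 3 states, 2 actions, 2 basis functions (all hypothetical).
cost = [[1.0, 2.0], [0.5, 1.5], [2.0, 0.2]]
P = [
    [[0.8, 0.2, 0.0], [0.1, 0.0, 0.9]],
    [[0.0, 0.9, 0.1], [0.5, 0.5, 0.0]],
    [[0.3, 0.3, 0.4], [0.0, 0.1, 0.9]],
]
phi = [[0.0, 1.0, 2.0], [2.0, 0.0, 1.0]]
gamma = 0.9

theta, J_tilde = projected_value_iteration(cost, P, phi, gamma)
```

Because the projection is monotone and commutes with adding constants, it is nonexpansive in the sup-norm, so composing it with a discounted Bellman operator keeps the contraction factor `gamma`. Each projection scans every state once per basis function.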

In RL process credit assignment, min-form value functions are used to mitigate reward hacking, as only the worst future step propagates gradients. The PURE (Process sUpervised Reinforcement lEarning) framework leverages this property by replacing the canonical sum of discounted rewards with a min-over-future-rewards objective, yielding stable training even in the presence of adversarial process-based rewards (Cheng et al., 21 Apr 2025).

4. Min-Convexity, Proximal Algorithms, and Variational Structure

A function is min-convex if it is a pointwise minimum over a finite set of convex functions. This structure results in key properties:

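The proximal operator of a min-convex function $f = \min_{i \in I} f_i$ decomposes into a union of proximal operators for the constituent convex branches: $\operatorname{prox}_{\gamma f}(x) = \bigcup_{i \in \varphi(x)} \{ \operatorname{prox}_{\gamma f_i}(x) \},$ where $\varphi(x) \subseteq I$ is the set of active indices, i.e. those $i$ for which $\min_y \{ f_i(y) + \tfrac{1}{2\gamma} \|y - x\|^2 \}$ is smallest.
This map is union $\tfrac{1}{2}$-averaged nonexpansive, facilitating analysis of fixed-point iterations and proximal splitting algorithms (Dao et al., 2018).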

Proximal point, forward-backward, and Douglas–Rachford splitting methods for min-convex objectives inherit local convergence guarantees from the union-averaged nonexpansive operator framework. If initialized near strong fixed points (where the active selector is locally constant), these methods enjoy per-iteration complexity equivalent to the convex case. The framework accommodates nonconvex, piecewise-convex objectives, with error metrics and stability analyses carried over from the convex setting.
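A one-dimensional sketch makes the branch structure concrete. It assumes each convex branch is a quadratic $\tfrac{1}{2} a (y - c)^2 + b$, for which the proximal map has a closed form, and it returns the set-valued proximal operator as a list with one point per active branch. The branch parameters and function names are illustrative.

```python
def prox_quadratic(x, a, c, gamma):
    """prox of y -> 0.5*a*(y - c)**2 + b with step gamma (b does not matter)."""
    return (x + gamma * a * c) / (1.0 + gamma * a)


def prox_min_convex(x, branches, gamma, tol=1e-12):
    """Set-valued prox of min_i f_i: one point per active branch.

    A branch is active when its Moreau envelope value at x is the smallest.
    """
    candidates = []
    for a, c, b in branches:
        y = prox_quadratic(x, a, c, gamma)
        envelope = 0.5 * a * (y - c) ** 2 + b + (y - x) ** 2 / (2.0 * gamma)
        candidates.append((envelope, y))
    best = min(e for e, _ in candidates)
    return [y for e, y in candidates if e <= best + tol]


def proximal_point(x0, branches, gamma, iters=50):
    """Proximal point iteration, picking the first active branch on ties."""
    x = x0
    for _ in range(iters):
        x = prox_min_convex(x, branches, gamma)[0]
    return x


# f(x) = min(0.5*(x + 1)**2, 0.5*(x - 2)**2): two convex branches (a, c, b).
branches = [(1.0, -1.0, 0.0), (1.0, 2.0, 0.0)]

tie = prox_min_convex(0.5, branches, 1.0)      # both branches active
single = prox_min_convex(1.0, branches, 1.0)   # only the second branch active
x_star = proximal_point(1.0, branches, 1.0)
```

At the midpoint between the two wells both branches are active and the proximal operator is genuinely multivalued. Away from it the active selector is locally constant and the iteration behaves exactly like the convex proximal point method on the selected branch.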

5. Statistical Inference with Min-Form Value Functions

Uniform inference for min-form value functions requires specialized techniques because the map $f \mapsto V(f)$ is only directionally, not fully, Hadamard differentiable. For a plug-in estimator $V(\hat{f}_n)$, built from an estimator $\hat{f}_n$ of $f$ that converges at rate $r_n$, functional asymptotics are governed by the directional derivative structure. Tests based on Kolmogorov–Smirnov (sup-norm) and Cramér–von Mises ($L_2$-norm) statistics are constructed using the rescaled process

$r_n \big( V(\hat{f}_n) - V(f) \big),$

with the limiting law determined through the functional Delta method for directionally differentiable maps (Firpo et al., 2019).

Bootstrap procedures incorporate estimates of the plug-in directional derivative by locally approximating the set of minimizers and contact sets. This approach yields valid uniform confidence bands, with empirical performance verified via Monte Carlo studies and applications to treatment effect bounds (Firpo et al., 2019).
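The idea of approximating the set of minimizers inside the bootstrap can be sketched for a finite set of evaluators at a single state. The code assumes i.i.d. rows of observations, a sample-mean plug-in estimator, and a user-chosen tolerance `tol` for the near-minimizer set; all of these, including the function names and the tolerance rule in the usage lines, are illustrative assumptions rather than the procedure of the cited paper.

```python
import math
import random


def column_means(data):
    """Sample mean of each evaluator (column) over the observations (rows)."""
    n = len(data)
    return [sum(row[u] for row in data) / n for u in range(len(data[0]))]


def near_minimizers(f_hat, tol):
    """Indices whose estimated value is within tol of the estimated minimum."""
    v = min(f_hat)
    return [u for u, val in enumerate(f_hat) if val <= v + tol]


def bootstrap_derivative_draws(data, tol, n_boot=500, seed=0):
    """Bootstrap draws of the estimated directional derivative.

    Each draw is min over the estimated near-minimizers of
    sqrt(n) * (bootstrap mean - sample mean).
    """
    rng = random.Random(seed)
    n = len(data)
    f_hat = column_means(data)
    active = near_minimizers(f_hat, tol)
    draws = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        f_star = column_means(resample)
        h_star = [math.sqrt(n) * (fs - fh) for fs, fh in zip(f_star, f_hat)]
        draws.append(min(h_star[u] for u in active))
    return draws


# Synthetic data: evaluators 0 and 1 tie in expectation, evaluator 2 is far away.
gen = random.Random(1)
data = [
    [1.0 + gen.gauss(0, 1), 1.0 + gen.gauss(0, 1), 3.0 + gen.gauss(0, 1)]
    for _ in range(200)
]
tol = math.log(len(data)) / math.sqrt(len(data))  # hypothetical tuning rule
draws = bootstrap_derivative_draws(data, tol)
upper_95 = sorted(draws)[int(0.95 * len(draws))]  # simulated critical value
```

Taking the minimum only over the estimated near-minimizers, rather than re-minimizing in every bootstrap sample, is what lets the resampling distribution track the directional derivative when minimizers are not unique.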

6. Min-Form Value Functions in Mean Field Games, Games, and Control

In mean-field games and control, min-form value functions appear as minimal solutions to master equations. For extended mean-field games with multiple equilibria, a partial order on flows induces minimal and maximal equilibria, with the minimal flow supplying the minimal value function (Mou et al., 2023). This function solves a backward Hamilton–Jacobi PDE or master equation with a nonlocal operator, and is characterized as the minimal weak-viscosity solution under monotonicity conditions. Regularity results ensure that the minimal value function is regular in the state variable, while possible discontinuities manifest in the parameter (measure) variable.

Proximal algorithms for min-convex composite objectives in variational problems also rely on the structure of min-form value functions, with convergence properties established under union-averaged nonexpansiveness (Dao et al., 2018).

7. Computational and Algorithmic Aspects

Computing min-form value functions is often dominated by minimization over possibly large or combinatorial index sets. In (min,+) ADP, each projection step requires $O(nk)$ time for $n$ states and $k$ basis functions (Lakshminarayanan et al., 2014). In practice, randomized and sampling-based methods (e.g., “weak” projection, variational projection) can be used to reduce per-iteration cost while preserving monotonicity and contraction.

For min-convex objectives, each iteration reduces to solving a (typically smaller-scale) convex subproblem on the active branch. Algorithms exploiting this structure achieve per-iteration efficiency comparable to the convex setting while still handling the nonconvexity introduced by the min-form structure.

In process RL, soft min-reweightings and trajectory pruning are used to implement min-form credit assignment efficiently, supporting stable PPO and advantage estimation (Cheng et al., 21 Apr 2025).
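One way to picture a soft min-reweighting is to weight each process reward by a softmax of its negated value, so that the weighted sum approaches the minimum as the temperature shrinks and the plain average as it grows. The specific weighting below, the `temperature` parameter, and the function names are assumptions made for illustration and are not claimed to be the exact transform used in PURE.

```python
import math


def softmin_weights(rewards, temperature):
    """Weights proportional to exp(-r_i / temperature), summing to one."""
    m = min(rewards)
    # Subtracting the minimum keeps every exponent <= 0 and avoids overflow.
    exps = [math.exp(-(r - m) / temperature) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def softmin_transformed_rewards(rewards, temperature):
    """Per-step rewards rescaled so that their sum is a soft minimum."""
    weights = softmin_weights(rewards, temperature)
    return [w * r for w, r in zip(weights, rewards)]


process_rewards = [0.9, 0.8, -0.5, 0.7]

sharp = sum(softmin_transformed_rewards(process_rewards, 0.01))   # close to min
smooth = sum(softmin_transformed_rewards(process_rewards, 1e6))   # close to mean
```

Because the transformed rewards are summed in the usual way, a standard return or advantage estimator can be reused unchanged while the objective it sees behaves like a minimum over steps.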

In summary, min-form value functions constitute a broad and technically rich class of functions with roles across optimization, control, reinforcement learning, inference, and game theory. Their nonsmoothness and extremal structure pose challenges for analysis, computation, and inference. The same structure also provides the properties that the methods surveyed above rely on: worst-case control, monotonicity preservation, directional differentiability, and contraction in the sup-norm.