
G-Optimal Design Criteria

Updated 23 January 2026
  • G-optimal design criteria are methods for planning experiments that minimize the worst-case prediction variance, ensuring uniformly robust performance across the design space.
  • They apply to a wide range of models—from linear regression to kriging and graph sampling—leveraging convex optimization, matrix analysis, and equivalence theorems.
  • Advanced algorithms such as Wynn’s multiplicative algorithm, Fedorov’s exchange algorithm, and online mirror descent enable efficient computation even in high-dimensional or robust settings.

G-optimal design criteria form a cornerstone of optimal experiment planning, particularly in regression, response surface, spatial, and modern graph sampling contexts. These criteria systematically target minimization of the largest possible predictive variance over the design or prediction domain, thus providing a direct guarantee on worst-case prediction accuracy. The G-optimality framework generalizes across linear, nonlinear, random-coefficient, Bayesian, and high-dimensional settings, with deep connections to matrix analysis, convex optimization, and information-based design theory.

1. Definition and Theoretical Foundation

G-optimality focuses on minimizing the maximum prediction variance of the best linear unbiased estimator across all points in the design space. In the canonical linear regression setting $y(x) = f(x)^\top \theta + \epsilon$, with $f(x) \in \mathbb{R}^p$, $\theta \in \mathbb{R}^p$, and design measure $\xi$, the Fisher information matrix is

$$M(\xi) = \int f(x)\, f(x)^\top \, d\xi(x).$$

The G-criterion for a design $\xi$ is given by

$$\Phi_G(\xi) = \max_{x \in X} f(x)^\top M(\xi)^{-1} f(x).$$

The G-optimal design $\xi_G$ is a minimizer of $\Phi_G$:

$$\xi_G \in \arg\min_\xi \, \max_{x \in X} f(x)^\top M(\xi)^{-1} f(x).$$

This scalar functional is convex in $\xi$, as the matrix inverse is operator convex and the maximum is a pointwise supremum. G-optimality equivalently minimizes the supremum of the variance of the predicted response, $\operatorname{Var}[\hat y(x; \hat\theta)]$, over $x \in X$ (Huan et al., 2024).

The celebrated Kiefer–Wolfowitz equivalence theorem establishes that, in linear models, the D-optimal design (maximizing $\det M(\xi)$) and the G-optimal design coincide: a design is G-optimal if and only if it is D-optimal, and at such a design the maximal standardized variance equals $p$, the dimension of $\theta$ (Bloom et al., 2011, Huan et al., 2024).
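The theorem is easy to check numerically. The sketch below (an illustration, not taken from the cited papers) evaluates the standardized variance for quadratic regression on $[-1,1]$ under the equal-weight design on $\{-1, 0, 1\}$, whose maximum should equal $p = 3$:

```python
import numpy as np

def f(x):
    # Quadratic regression basis: f(x) = (1, x, x^2)
    return np.array([1.0, x, x * x])

# Candidate D-optimal design: equal weights 1/3 at {-1, 0, 1}
support = [-1.0, 0.0, 1.0]
weights = [1 / 3] * 3

# Information matrix M(xi) = sum_i w_i f(x_i) f(x_i)^T
M = sum(w * np.outer(f(x), f(x)) for w, x in zip(weights, support))
Minv = np.linalg.inv(M)

# Standardized variance d(x, xi) = f(x)^T M^{-1} f(x) over a fine grid
grid = np.linspace(-1, 1, 1001)
d = np.array([f(x) @ Minv @ f(x) for x in grid])

p = 3  # number of model parameters
print(f"max variance = {d.max():.4f} (should equal p = {p})")
```

The variance function dips below $p$ between the support points and touches $p$ exactly at $-1$, $0$, and $1$, as the equivalence theorem predicts.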

2. Formulations Across Statistical Models

Classical Linear Regression

The standard setting directly employs $\Phi_G(\xi)$ as above, applicable to continuous or discrete designs. For polynomial regression on compact domains, G-optimal measures spread support points according to the equilibrium measure of $X$ as the polynomial degree increases, and the Christoffel function becomes constant across the support (Bloom et al., 2011).

Random Coefficients Models

In random-coefficient regression (RCR), the model

$$Y_{ij} = f(x_j)^\top \beta_i + \epsilon_{ij}$$

for $i = 1, \ldots, n$, $j = 1, \ldots, m$, with i.i.d. $\beta_i \sim (\beta, \sigma^2 D)$ and error variance $\operatorname{Var}[\epsilon_{ij}] = \sigma^2$, leads to a modified G-criterion:

$$\Phi_G^{\mathrm{RCR}}(\xi) = \max_{x \in X} f(x)^\top \left( M(\xi)^{-1} + (n-1)\left[ M(\xi) + \Delta^{-1} \right]^{-1} \right) f(x), \qquad \Delta = m D.$$

This incorporates between-individual parameter randomness, adding a term absent in fixed-effects settings. For straight-line models with diagonal $D$ and symmetric design regions, the G- and D-criteria again coincide; in general, this equivalence fails in RCR (Prus, 2018).
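The modified criterion can be evaluated directly; the straight-line model and the particular values of $n$, $m$, and $D$ below are illustrative assumptions:

```python
import numpy as np

def f(x):
    return np.array([1.0, x])  # straight-line basis (1, x)

# Assumed values: n individuals, m observations each,
# diagonal covariance D of the random coefficients
n, m = 10, 4
D = np.diag([0.5, 0.5])
Delta = m * D

# Design: equal weights at the interval endpoints of [-1, 1]
support, weights = [-1.0, 1.0], [0.5, 0.5]
M = sum(w * np.outer(f(x), f(x)) for w, x in zip(weights, support))

# RCR matrix: M^{-1} + (n-1) [M + Delta^{-1}]^{-1}
A = np.linalg.inv(M) + (n - 1) * np.linalg.inv(M + np.linalg.inv(Delta))

grid = np.linspace(-1, 1, 501)
phi_rcr = max(f(x) @ A @ f(x) for x in grid)
phi_fixed = max(f(x) @ np.linalg.inv(M) @ f(x) for x in grid)
print(phi_rcr, phi_fixed)
```

The RCR criterion strictly exceeds the fixed-effects one, reflecting the extra between-individual variance term.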

Nonlinear Regression and Robust Extensions

For nonlinear regression $y_i = \eta(x_i, \theta) + \varepsilon_i$, local linearization at a nominal value $\theta^0$ reduces the problem to standard G-optimality in terms of the sensitivity $g(x, \theta^0) = \partial \eta(x, \theta^0) / \partial \theta$, but global model behavior is not protected. The extended (global) G-criterion, introduced for robustness, is

$$\phi_{eG}(\xi; \theta^0) = \min_{\theta \neq \theta^0} \frac{\| \eta(\cdot, \theta) - \eta(\cdot, \theta^0) \|_\xi^2}{\max_{x \in X} \left[ \eta(x, \theta) - \eta(x, \theta^0) \right]^2},$$

which ensures the design prevents “folding” or near-overlapping model curves far from the nominal value (Pázman et al., 2013).
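For intuition, the extended criterion can be evaluated by brute force over a parameter grid. The sketch below uses a hypothetical one-parameter model $\eta(x, \theta) = e^{-\theta x}$ with a uniform design measure; since the design points lie in $X$, the ratio always falls in $(0, 1]$, and small values flag parameters whose response curves nearly coincide:

```python
import numpy as np

def eta(x, theta):
    # Hypothetical one-parameter nonlinear model
    return np.exp(-theta * x)

theta0 = 1.0
design = np.linspace(0.0, 2.0, 5)     # design points, uniform weights
X = np.linspace(0.0, 2.0, 201)        # prediction region for the max
thetas = [t for t in np.linspace(0.2, 3.0, 57) if abs(t - theta0) > 1e-9]

def ratio(theta):
    diff_design = eta(design, theta) - eta(design, theta0)
    diff_region = eta(X, theta) - eta(X, theta0)
    # ||eta(.,theta) - eta(.,theta0)||_xi^2 / max_x [...]^2
    return np.mean(diff_design ** 2) / np.max(diff_region ** 2)

phi_eG = min(ratio(t) for t in thetas)
print(f"extended G-criterion value: {phi_eG:.4f}")
```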

Kriging and Spatial Prediction

In kriging, the G-optimal criterion minimizes the supremum of the mean-squared prediction error (SMSPE) over the prediction region:

$$\mathrm{SMSPE}(\xi, \theta) = \sup_{(x_0, y_0) \in \mathcal{D}} \mathrm{MSPE}(x_0, y_0; \xi, \theta).$$

The equispaced grid is G-optimal for separable exponential covariance structures, in both frequentist and pseudo-Bayesian analyses (Dasgupta et al., 2021).
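A one-dimensional analogue is easy to probe numerically. The sketch below (an illustration under simple kriging with unit variance, not the two-dimensional separable setting of the cited paper) compares the worst-case MSPE of an equispaced design against a clustered one on $[0,1]$:

```python
import numpy as np

def exp_cov(a, b, theta=1.0):
    # Exponential covariance k(s, t) = exp(-theta * |s - t|)
    return np.exp(-theta * np.abs(a[:, None] - b[None, :]))

def smspe(design, theta=1.0, n_pred=501):
    # Simple-kriging MSPE: 1 - k(x0)^T K^{-1} k(x0), maximized over [0, 1]
    design = np.asarray(design)
    K = exp_cov(design, design, theta)
    x0 = np.linspace(0.0, 1.0, n_pred)
    k = exp_cov(design, x0, theta)              # shape (n, n_pred)
    mspe = 1.0 - np.sum(k * np.linalg.solve(K, k), axis=0)
    return mspe.max()

equispaced = np.linspace(0, 1, 5)
clustered = np.array([0.0, 0.05, 0.1, 0.15, 1.0])
print(smspe(equispaced), smspe(clustered))
```

The equispaced grid attains a much smaller worst-case error, consistent with its G-optimality for exponential covariances.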

Graph Signal Sampling

For graph sampling of $K$-bandlimited signals, the G-criterion selects a size-$M$ subset $\mathcal{S} \subset \mathcal{V}$ of nodes minimizing

$$g(\mathcal{S}) = \max_i \left[ \left( V_{\mathcal{S}K}^\top V_{\mathcal{S}K} + \mu I \right)^{-1} \right]_{ii},$$

where $V_{\mathcal{S}K}$ is the restriction of the first $K$ Laplacian eigenvectors to the sampled nodes. The criterion induces an $\alpha$-supermodular set function, supporting efficient greedy approximation schemes (Li et al., 2021).
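As an illustration (not the filtering-based method of the cited paper), the criterion and a naive greedy minimizer can be coded directly for a small path graph; the graph size, bandwidth $K$, sample budget $M$, and regularizer $\mu$ are assumed values:

```python
import numpy as np

# Path graph on N nodes: combinatorial Laplacian
N, K, M, mu = 12, 3, 3, 0.01
L = np.zeros((N, N))
for i in range(N - 1):
    L[i, i] += 1; L[i + 1, i + 1] += 1
    L[i, i + 1] -= 1; L[i + 1, i] -= 1

# First K Laplacian eigenvectors span the bandlimited signal space
_, V = np.linalg.eigh(L)
V_K = V[:, :K]                                   # N x K

def g(S):
    # G-criterion: largest diagonal entry of (V_SK^T V_SK + mu I)^{-1}
    V_SK = V_K[list(S), :]
    return np.linalg.inv(V_SK.T @ V_SK + mu * np.eye(K)).diagonal().max()

# Greedy: at each step, add the node that most reduces g
S = []
for _ in range(M):
    best = min((v for v in range(N) if v not in S), key=lambda v: g(S + [v]))
    S.append(best)
print(sorted(S), g(S))
```

The greedy rule tends to spread the samples across the path, since clustered rows of $V_K$ give a near-singular Gram matrix and hence a large criterion value.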

3. Algorithms and Computational Techniques

Classical Algorithms

  • Wynn’s Multiplicative Algorithm: Iteratively updates design weights,

$$w_i \leftarrow w_i \, \frac{f(x_i)^\top M(\xi)^{-1} f(x_i)}{p},$$

until convergence, yielding the unique continuous D- and G-optimal design (Huan et al., 2024).

  • Fedorov's Exchange Algorithm: Sequentially swaps support points to decrease the maximum variance.
  • Coordinate Exchange and Greedy Methods: Optimize one design coordinate or point at a time, typically for exact designs or when the candidate set is large (Walsh et al., 2022).
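As a concrete sketch of the multiplicative update, the loop below runs it on an assumed 21-point candidate grid for quadratic regression, where the known optimum places weight $1/3$ at each of $\{-1, 0, 1\}$:

```python
import numpy as np

# Candidate grid and quadratic model f(x) = (1, x, x^2)
xs = np.linspace(-1, 1, 21)
F = np.stack([np.ones_like(xs), xs, xs ** 2], axis=1)    # 21 x 3
p = F.shape[1]

w = np.full(len(xs), 1 / len(xs))                        # uniform start
for _ in range(3000):
    M = F.T @ (w[:, None] * F)                           # M(xi) = sum_i w_i f_i f_i^T
    d = np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F) # d_i = f_i^T M^{-1} f_i
    w *= d / p                                           # multiplicative update
    w /= w.sum()                                         # guard against round-off

print(f"max standardized variance: {d.max():.4f} (G-optimal bound is p = {p})")
```

Because $\sum_i w_i d_i = \operatorname{tr}(M^{-1} M) = p$ at every step, the update preserves normalization up to round-off, and the maximal variance descends toward $p$.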

Continuous and Convex Relaxation

  • Convex Programming: For discrete design spaces, solve

$$\min_{w \ge 0,\ \sum_i w_i = 1} t \quad \text{subject to} \quad f(x_j)^\top M(w)^{-1} f(x_j) \le t, \quad j = 1, \ldots, m,$$

via interior-point or first-order methods, rounding the result to obtain integer designs when needed (Huan et al., 2024).
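One way to sketch this program is the epigraph form solved with a general-purpose nonlinear solver; the tiny example below uses SciPy's SLSQP as a stand-in for the interior-point methods mentioned, and the three-point quadratic candidate set is an illustrative assumption:

```python
import numpy as np
from scipy.optimize import minimize

# Candidate points and quadratic basis f(x) = (1, x, x^2)
xs = np.array([-1.0, 0.0, 1.0])
F = np.stack([np.ones_like(xs), xs, xs ** 2], axis=1)
m = len(xs)

def variances(w):
    # d_j(w) = f(x_j)^T M(w)^{-1} f(x_j) for all candidates
    M = F.T @ (w[:, None] * F)
    return np.einsum("ij,jk,ik->i", F, np.linalg.inv(M), F)

# Epigraph form: variables z = (w_1..w_m, t); minimize t
cons = [
    {"type": "ineq", "fun": lambda z: z[-1] - variances(z[:m])},
    {"type": "eq", "fun": lambda z: z[:m].sum() - 1.0},
]
z0 = np.concatenate([[0.5, 0.2, 0.3], [5.0]])          # non-uniform start
res = minimize(lambda z: z[-1], z0, constraints=cons,
               bounds=[(1e-6, 1.0)] * m + [(0.0, None)], method="SLSQP")
w_opt, t_opt = res.x[:m], res.x[-1]
print(w_opt.round(3), round(t_opt, 3))
```

The solver recovers weights near $1/3$ at each point with $t$ near $p = 3$, matching the known G-optimal design for this model.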

Regret Minimization and Online Mirror Descent

  • Online Mirror Descent (OMD): Applies fractional design updates, achieving a $(1+\varepsilon)$-approximation with $O(p/\varepsilon^2)$ samples. Binarization by pipage rounding or exchange methods yields discrete designs. G-OMD dominates prior approaches in computational efficiency and convergence rate (Allen-Zhu et al., 2017).

Metaheuristics for Exact G-Designs

  • Particle Swarm Optimization (PSO): Adapted to select design matrices minimizing $\max_x f(x)^\top (F^\top F)^{-1} f(x)$ over the region for response surface models, outperforming both classical coordinate-exchange and genetic algorithms in G-efficiency for $K \le 5$ factors (Walsh et al., 2022).
  • Deterministic Retrospective Algorithms (Kriging): Efficient majorization-based grid update schemes identify the best possible subgrid under SMSPE for spatial design problems (Dasgupta et al., 2021).

Efficient Approximate Algorithms (Graph Settings)

  • Low-pass Filtering and Greedy Selection: Use Givens rotation–based filtering to avoid explicit eigendecomposition, then apply fast greedy subset selection with $\alpha$-supermodular guarantees (Li et al., 2021).

4. Properties, Equivalence, and Theoretical Results

G-optimality enjoys several structural and theoretical characteristics:

  • Convexity: $\Phi_G$ is convex in the design measure, due to operator convexity of the matrix inverse and the fact that $\max$ preserves convexity.
  • Equivalence Theorems: For linear regression, G- and D-optimality are equivalent, and at a G-optimal design the maximal variance equals the parameter dimension $p$ (Bloom et al., 2011, Huan et al., 2024). In RCR models, this equivalence fails except in straight-line/diagonal-$D$ settings (Prus, 2018).
  • Support Properties: In polynomial and straight-line regression on intervals, optimal designs place support on interval boundaries or, for degree $s$, distribute according to the equilibrium measure as $s \to \infty$ (Bloom et al., 2011, Prus, 2018).
  • Design Evenness Majorization: In spatial grid design, equispaced grids majorize all others w.r.t. partition-length, minimizing the worst-case MSPE (Dasgupta et al., 2021).
  • Bregman Divergence Structure: $\Phi_G$ (or $-\Phi_G$) induces a strictly concave functional over positive-definite matrices, yielding Bregman-type divergences that strictly distinguish distributions based on their information matrices (Pronzato et al., 2018).

5. Applications and Illustrative Examples

  • Polynomial Regression: On $[-1,1]$, the G-optimal design for quadratic regression is supported at $\{-1, 0, 1\}$ with equal weights (Huan et al., 2024), agreeing with equilibrium-measure predictions for large degree (Bloom et al., 2011).
  • Spatial/kriging Design: For separable exponential models on $[0,1]^2$, the G-optimal design is always the regular equispaced grid. Retrospective addition/removal of grid points is handled efficiently via majorization, with direct application in methane-flux monitoring networks (Dasgupta et al., 2021).
  • Graph Sampling: G-optimal subset selection achieves near-minimax variance for bandlimited graph signals, scaling efficiently to large graphs with thousands of nodes (Li et al., 2021).
  • Response Surface Design: PSO-generated G-optimal designs improve on state-of-the-art for multivariate second-order models, especially in high-factor settings (Walsh et al., 2022).
  • Nonlinear Model Protection: The extended G-criterion yields designs that protect against global “folding” of non-identifiable models, demonstrably superior for pharmacokinetic and other nonlinear applications (Pázman et al., 2013).

6. Generalizations and Open Problems

  • Bayesian and Pseudo-Bayesian G-Optimality: G-criterion extends to Bayesian settings by optimizing expected-variance or SMSPE under parameter priors, remaining tractable under separable covariance and discrete domains (Dasgupta et al., 2021, Huan et al., 2024).
  • Robust and Global G-Criteria: Nonlinear and robust G-optimality require minimax or risk-based criteria, leading to LP or cutting-plane algorithms with empirical superiority in preventing near-unidentifiability (Pázman et al., 2013).
  • High-Dimensional and Implicit Models: For implicit or large-scale models, practical G-optimal algorithms exploit sampling, sketching, first-order methods, or randomized rounding (Allen-Zhu et al., 2017, Li et al., 2021).

Key open challenges include scalable maximization of minimum prediction variance for general nonlinear models, efficient approximation in very high-dimensional or implicit-design settings, and robustification with respect to model misspecification and global parameter identifiability (Huan et al., 2024).

7. Comparative Analysis and Criteria Interplay

| Criterion | Objective | Invariance | Key Use |
| --- | --- | --- | --- |
| D-optimality | Minimize $\det M(\xi)^{-1}$ | Parametric | Global parameter estimation |
| A-optimality | Minimize $\operatorname{tr} M(\xi)^{-1}$ | Parametric | Average parameter variance |
| G-optimality | Minimize $\sup_x f(x)^\top M(\xi)^{-1} f(x)$ | Not fully parametric | Worst-case prediction variance over design region |
  • Advantages: G-optimality directly bounds maximum prediction uncertainty, uniquely addressing uniform accuracy requirements.
  • Limitations: The non-smooth supremum objective can make optimization harder than for D- or A-criteria; in nonlinear or finite-sample settings, G- and D-optimality may diverge, requiring careful distinction in algorithmic application (Huan et al., 2024, Prus, 2018).
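To make the interplay concrete, the three objectives can be evaluated side by side for a given design; the straight-line model and the two designs below are illustrative assumptions:

```python
import numpy as np

def criteria(weights, support):
    # D-, A-, and G-objectives for a design on a straight-line model f(x) = (1, x)
    f = lambda x: np.array([1.0, x])
    M = sum(w * np.outer(f(x), f(x)) for w, x in zip(weights, support))
    Minv = np.linalg.inv(M)
    grid = np.linspace(-1, 1, 401)
    return {
        "D (det M^-1)": np.linalg.det(Minv),
        "A (tr M^-1)": np.trace(Minv),
        "G (max variance)": max(f(x) @ Minv @ f(x) for x in grid),
    }

print(criteria([0.5, 0.5], [-1.0, 1.0]))   # endpoint design
print(criteria([0.5, 0.5], [-0.5, 0.5]))   # shrunk design: all three degrade
```

All three functionals are computed from the same information matrix, but they penalize the shrunk design differently; the G-value directly reports the worst-case prediction variance over the region.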

G-optimal designs are central to experimental design in fields seeking high-fidelity uniform predictions, robust model identification, or spatial/graph inference under tight accuracy requirements. The rapidly expanding methodology and computational toolkit underpin ongoing theoretical and practical advances in this area.
