
Preferential G-Optimal Design

Updated 2 February 2026
  • Preferential G-optimal design is a methodology that selects experimental points to minimize the maximum posterior predictive variance while accounting for sampling biases.
  • It extends classical G-optimal design by integrating Bayesian utility functions and preferential sampling models, such as log-Gaussian Cox processes, to optimize design criteria.
  • Applications include geostatistics, spatial statistics, and graph-signal processing, where tailored weighting and efficient algorithms reduce prediction uncertainty.

Preferential G-optimal Design refers to a class of experimental design methodologies that seek to select sampling points—or, more generally, experimental conditions—so as to minimize the worst-case (maximum) posterior predictive variance of targeted quantity estimates, while explicitly taking into account preferences or biases in the sampling process. Classic G-optimal design is rooted in the theory of optimal experimental design, where the goal is to ensure uniformly low variance for the estimation or prediction task. Preferential G-optimal design generalizes this to settings where the sampling mechanism is itself dependent on the underlying (often latent) process, as in preferential sampling in geostatistics, or when preference weights, costs, or specific utility functions are imposed in domains such as graph-signal processing.

1. Mathematical Foundations of G-optimality

G-optimality is formally defined as the design criterion that seeks to minimize the largest predictive variance in the estimable domain. Given a design matrix $V_{SK}$, the G-optimal criterion in graph subset selection and similar linear models is

$$g_0(S) = \max_{i=1,\dots,K} [Z(S)^{-1}]_{ii}$$

where $Z(S) = V_{SK}^T V_{SK}$ and $S$ denotes the selected set of measurement or sampling indices (Li et al., 2021). For stabilization, a ridge term $\mu > 0$ is frequently added, yielding

$$g(S) = \max \operatorname{diag}\left([Z(S) + \mu I]^{-1}\right)$$
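As a minimal sketch of the ridge-stabilized criterion, the NumPy code below evaluates $g(S)$ directly. It assumes that the rows of a matrix `V` hold the candidate measurement vectors and that `S` lists the selected row indices; the function name `g_criterion` and the random test matrix are illustrative choices, not taken from the cited work.

```python
import numpy as np

def g_criterion(V, S, mu=0.0):
    """Largest diagonal entry of (V_S^T V_S + mu * I)^{-1}."""
    V_S = V[list(S), :]                      # rows of V selected by S
    K = V.shape[1]
    Z = V_S.T @ V_S + mu * np.eye(K)         # ridge-stabilized information matrix
    return float(np.max(np.diag(np.linalg.inv(Z))))

rng = np.random.default_rng(0)
V = rng.standard_normal((20, 3))             # 20 candidate rows, K = 3 (toy data)
print(g_criterion(V, [0, 1, 2, 3], mu=1e-2)) # small subset
print(g_criterion(V, range(20), mu=1e-2))    # all rows
```

Adding rows to `S` can only increase $Z(S)$ in the positive semidefinite order, so the criterion never grows as the selected set is enlarged.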

In geostatistics under preferential sampling, the analogous G-optimal utility is the negative worst-case predictive variance, so that maximizing the utility minimizes that variance:

$$u_G(d,\theta) = -\max_{x \in D} \operatorname{Var}[S(x) \mid Y, X, d, \theta]$$

where $d$ represents the candidate new sampling locations, $\theta$ the model parameters, and $S(x)$ the latent spatial process at location $x$ (Ferreira et al., 2015).

2. Preferential Sampling: Impact and Model Formulation

In preferential sampling, the probability of selecting a location for measurement depends on the unknown underlying process rather than being independent of it. For spatially indexed data, preferential sampling is typically modeled via a log-Gaussian Cox process whose intensity is a function of the latent field $S$:

$$\lambda(x) = \exp\{\alpha + \beta S(x)\}$$

The joint model then comprises

  • A Gaussian prior on $S$
  • A likelihood for the sampling pattern given $S$ and the sampling parameters $(\alpha,\beta)$, where $\beta$ sets the degree of preferential bias
  • A standard observation model for measurements at sampled points

This induces a dependency between sampling pattern and prediction uncertainty, requiring adjusted design strategies (Ferreira et al., 2015).
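The three model components above can be sketched on a one-dimensional grid. The code below is only an illustration under stated assumptions: the exponential covariance, the grid, the parameter values, and the function names `simulate_latent_field` and `sampling_probs` are all hypothetical choices. It uses the fact that, given the latent field and the number of points, the points of a Cox process are independent draws with density proportional to $\lambda(x)$.

```python
import numpy as np

def simulate_latent_field(x, sigma2=1.0, phi=0.2, rng=None):
    """Draw S on grid x from a zero-mean GP with exponential covariance (assumed family)."""
    rng = np.random.default_rng() if rng is None else rng
    dist = np.abs(x[:, None] - x[None, :])
    cov = sigma2 * np.exp(-dist / phi) + 1e-10 * np.eye(len(x))  # jitter for stability
    return np.linalg.cholesky(cov) @ rng.standard_normal(len(x))

def sampling_probs(S, alpha, beta):
    """Intensity exp(alpha + beta * S) normalized over grid cells (alpha cancels here)."""
    log_lam = alpha + beta * S
    lam = np.exp(log_lam - log_lam.max())    # subtract the max to avoid overflow
    return lam / lam.sum()

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 200)
S = simulate_latent_field(x, rng=rng)                # Gaussian prior on S
p = sampling_probs(S, alpha=0.0, beta=2.0)           # preferential sampling pattern
idx = rng.choice(len(x), size=30, p=p)               # sampled grid cells
y = S[idx] + rng.normal(scale=0.1, size=30)          # noisy measurements at sampled sites
print("grid mean of S:", S.mean())
print("mean of S at sampled sites:", S[idx].mean())
```

With $\beta > 0$ the intensity-weighted average of $S$ is never below the plain grid average, which is the bias that the design strategy has to account for; setting `beta=0.0` recovers uniform, non-preferential sampling.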

3. Utility-based Bayesian Design Criteria

The Bayesian approach to G-optimal design within preferential frameworks centers on maximizing expected utility functions that quantify predictive improvement. For $m$ additional design points $d=(d_1,\dots,d_m)$, the standard utility for integrated-variance reduction is

$$u(d, \theta) = \int_D \left[\operatorname{Var}(S(x)\mid Y,X,\theta) - \operatorname{Var}(S(x)\mid Y,X,d,\theta)\right] dx$$

The G-optimal version focuses instead on reduction of the maximum posterior predictive variance over $D$:

$$u_G(d, \theta) = -\max_{x \in D} \operatorname{Var}[S(x)\mid Y, X, d, \theta]$$

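The corresponding expected utility integrates over both the model parameters and the potential new observations. Evaluation and maximization proceed via MCMC over the joint space $(d, \theta, y_d)$ with an augmented-posterior target proportional to $u_G(d, \theta)\, p(\theta \mid X, Y)$ (Ferreira et al., 2015). For this target to be a valid density the utility must be nonnegative, so $u_G$, which is nonpositive as written, first has to be shifted by a constant or otherwise transformed.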
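The augmented-target idea can be illustrated with a toy Metropolis sampler over a finite set of candidate designs. This is a sketch in the spirit of the approach, not the sampler of the cited paper: the function name `augmented_design_sampler`, the proposal scheme, the use of pre-drawn posterior samples `theta_draws`, and the toy utility (a constant minus a made-up worst-case variance, so that it stays positive) are all assumptions.

```python
import numpy as np

def augmented_design_sampler(utility, n_designs, theta_draws, n_iter=5000, rng=None):
    """Metropolis sampler targeting h(d, theta) proportional to utility(d, theta) * p(theta | data).

    `utility` must be strictly positive; `theta_draws` are assumed posterior samples.
    Returns visit counts per design; the marginal of d is proportional to expected utility.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = int(rng.integers(n_designs))
    theta = theta_draws[rng.integers(len(theta_draws))]
    u = utility(d, theta)
    counts = np.zeros(n_designs, dtype=int)
    for _ in range(n_iter):
        d_new = int(rng.integers(n_designs))                     # symmetric uniform proposal
        theta_new = theta_draws[rng.integers(len(theta_draws))]  # proposed from the posterior
        u_new = utility(d_new, theta_new)
        if rng.random() < u_new / u:     # p(theta) cancels, leaving the utility ratio
            d, theta, u = d_new, theta_new, u_new
        counts[d] += 1
    return counts

rng = np.random.default_rng(0)
theta_draws = rng.uniform(0.5, 1.0, size=200)          # stand-in posterior draws
worst_var = np.array([0.9, 0.7, 0.2, 0.6, 0.8])        # made-up worst-case variances

def toy_utility(d, theta):
    return 1.0 - theta * worst_var[d]                  # shifted so the utility is positive

counts = augmented_design_sampler(toy_utility, 5, theta_draws, n_iter=5000, rng=rng)
print(counts / counts.sum())                           # most mass on the best design
```

Because the design with the smallest worst-case variance has the largest expected utility, it is visited most often, and the mode of the visit counts serves as the design estimate.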

4. Computational Algorithms and Complexity

Various algorithms implement G-optimal design under preferential settings. In geostatistics, MCMC samplers (e.g., Metropolis-within-Gibbs) are employed to traverse the design–parameter space, drawing samples with probability proportional to utility-weighted posteriors. For each draw, predictive variances at discretized locations are updated via kriging formulas,

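$$\operatorname{Var}[S(x_i)\mid Y,X,d,\theta] = \sigma^2 - \sigma^4\, r_{i,[o,d]}^T \left[\tau^2 I_{n+m} + \sigma^2 R_{[o,d][o,d]}\right]^{-1} r_{i,[o,d]}$$

Here, in standard kriging notation, $\sigma^2$ is the variance of the latent process, $\tau^2$ the measurement-error (nugget) variance, $R_{[o,d][o,d]}$ the correlation matrix among the $n$ observed and $m$ design locations, and $r_{i,[o,d]}$ the vector of correlations between $x_i$ and those locations.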
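The kriging variance and the resulting G-optimal utility can be computed as follows. This is a sketch under stated assumptions: locations are one-dimensional, the correlation function is exponential with range `phi`, and the names `exp_corr`, `kriging_variance` and `u_G` are illustrative.

```python
import numpy as np

def exp_corr(a, b, phi):
    """Exponential correlation between 1-D location arrays (assumed correlation family)."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / phi)

def kriging_variance(x_pred, x_sites, sigma2, tau2, phi):
    """sigma2 - sigma2^2 * r^T (tau2 I + sigma2 R)^{-1} r for each prediction location."""
    R = exp_corr(x_sites, x_sites, phi)           # correlations among observed + design sites
    r = exp_corr(x_sites, x_pred, phi)            # column i holds r_i for prediction site i
    K = tau2 * np.eye(len(x_sites)) + sigma2 * R
    sol = np.linalg.solve(K, r)
    return sigma2 - sigma2**2 * np.sum(r * sol, axis=0)

def u_G(x_grid, x_obs, d, sigma2, tau2, phi):
    """Negative worst-case predictive variance over the grid after adding design d."""
    sites = np.concatenate([x_obs, d])
    return -float(np.max(kriging_variance(x_grid, sites, sigma2, tau2, phi)))

x_grid = np.linspace(0.0, 1.0, 101)               # discretized domain D
x_obs = np.array([0.1, 0.15, 0.2, 0.3])           # clustered existing sites
for d in (np.array([0.25]), np.array([0.9])):
    print(d, u_G(x_grid, x_obs, d, sigma2=1.0, tau2=0.1, phi=0.2))
```

In this toy setting the candidate at 0.9 fills the unsampled part of the domain and therefore attains the higher utility, whereas the candidate at 0.25 sits inside the existing cluster and leaves the worst-case variance almost unchanged.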

In graph-signal settings, the greedy algorithm (FAGOD) iteratively selects nodes minimizing the G-optimal criterion, leveraging Sherman–Morrison updates for efficient recalculation. The algorithm exploits the approximate $\alpha$-supermodularity of the G-optimal function, enabling polynomial-time execution with provable bounds. The computational complexity is

$$O(N \log N + N K^3)$$

where $N$ is the domain size and $K$ the signal bandwidth (Li et al., 2021).
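The role of the Sherman–Morrison update in greedy selection can be sketched as below. This is a generic greedy routine written for illustration, not the FAGOD algorithm itself: it omits the fast filtering step, and the name `greedy_g_optimal`, the ridge initialization, and the random test matrix are assumptions.

```python
import numpy as np

def greedy_g_optimal(V, k, mu=1e-2):
    """Greedily pick k rows of V to minimize max diag((Z(S) + mu I)^{-1})."""
    N, K = V.shape
    A = np.eye(K) / mu                       # inverse of mu * I, i.e. the empty selection
    chosen = np.zeros(N, dtype=bool)
    selected = []
    for _ in range(k):
        AV = V @ A                           # row i equals (A v_i)^T because A is symmetric
        denom = 1.0 + np.einsum("ij,ij->i", AV, V)        # 1 + v_i^T A v_i
        # Sherman-Morrison: diag of the updated inverse for every candidate at once
        new_diag = np.diag(A)[None, :] - AV**2 / denom[:, None]
        scores = new_diag.max(axis=1)
        scores[chosen] = np.inf              # never pick the same node twice
        best = int(np.argmin(scores))
        A = A - np.outer(AV[best], AV[best]) / denom[best]  # rank-one inverse update
        chosen[best] = True
        selected.append(best)
    return selected, float(np.max(np.diag(A)))

rng = np.random.default_rng(0)
V = rng.standard_normal((50, 4))             # N = 50 candidate nodes, K = 4 (toy data)
selected, g_value = greedy_g_optimal(V, k=6)
print(selected, g_value)
```

Each greedy step scores all candidates without forming a new matrix inverse, which is the saving the rank-one update provides over recomputing $[Z(S) + \mu I]^{-1}$ from scratch for every candidate.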

5. Modeling and Correction of Preferential Sampling Effects

Preferential sampling necessitates correction terms in covariance estimation because the sampling pattern itself carries information about the latent field. Two strategies for adjusting kriging variances are full embedding of the latent field $S$ in the design sampler and Gaussian approximation of the marginal posterior. When the magnitude of the preference parameter, $|\beta|$, is moderate, standard kriging covariances suffice; for larger $|\beta|$, Laplace expansions of the log-Cox likelihood produce corrected variances for design optimization (Ferreira et al., 2015).

6. Incorporating User Preferences and Weights

Weighted versions of the G-optimal criterion allow the experimenter to bias selection toward preferred locations, variables, or nodes. Common schemes include:

  • Weighted maximum variance:

$$g_w(S) = \max_i w_i \left[(Z(S) + \mu I)^{-1}\right]_{ii}$$

where $w_i > 0$ encodes location “importance.”

  • Matrix scaling:

$$Z_w(S) = \sum_{i \in S} w_i v_{i:}^T v_{i:} + \mu I, \qquad g(S) = \max \operatorname{diag}\left(Z_w(S)^{-1}\right)$$

Both maintain monotonicity and $\alpha$-supermodularity, with modified (weight-dependent) performance bounds. The same greedy algorithms persist, either via objective scaling or row/column-weighting in the covariance structure (Li et al., 2021).
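Both weighting schemes can be written in a few lines. The sketch below assumes that the weights of the first scheme attach to the diagonal entries (one per index $i = 1,\dots,K$) and that the weights of the second attach to the rows $v_{i:}$ of a candidate matrix `V` (one per node); the function names are illustrative.

```python
import numpy as np

def weighted_max_variance(V, S, w, mu=1e-2):
    """g_w(S) = max_i w_i [(Z(S) + mu I)^{-1}]_ii, with one weight per diagonal entry."""
    V_S = V[list(S), :]
    A = np.linalg.inv(V_S.T @ V_S + mu * np.eye(V.shape[1]))
    return float(np.max(w * np.diag(A)))

def matrix_scaled_criterion(V, S, w, mu=1e-2):
    """max diag(Z_w(S)^{-1}) with Z_w(S) = sum_{i in S} w_i v_i^T v_i + mu I."""
    S = list(S)
    V_S = V[S, :]
    Z_w = V_S.T @ (w[S, None] * V_S) + mu * np.eye(V.shape[1])  # rows scaled by weights
    return float(np.max(np.diag(np.linalg.inv(Z_w))))

rng = np.random.default_rng(0)
V = rng.standard_normal((20, 3))              # toy candidate matrix, K = 3
S = [0, 3, 7, 11]
w_diag = np.array([1.0, 2.0, 0.5])            # importance of each predicted component
w_rows = np.ones(20)
w_rows[3] = 5.0                               # node 3 counts five times as much
print(weighted_max_variance(V, S, w_diag))
print(matrix_scaled_criterion(V, S, w_rows))
```

With unit weights both functions reduce to the unweighted ridge criterion, so either can be swapped into a greedy loop as the objective without changing its structure.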

7. Practical Implementation Details and Multi-modality

Practical solution strategies must contend with high-dimensional design spaces (e.g., $D^m$). Posterior pseudo-density functions for design may be multi-modal; visualization of their structure can expose trade-off regions and sampling ambiguities. Grid-based approximations for spatial domains (moderate $M$ values, e.g., 200–1000) and reuse of Cholesky factorization accelerate variance computations. Sequential greedy addition of sampling locations is popular when $m$ is large, though full joint optimization may yield superior but costlier solutions (Ferreira et al., 2015). For graph applications, low-pass filter approximations using sparse Givens rotations obviate computationally expensive eigen-decomposition (Li et al., 2021).
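Sequential greedy addition on a grid can be sketched as follows. The code is an illustration under stated assumptions, not the procedure of the cited papers: it fixes the covariance parameters, uses an exponential covariance on a one-dimensional grid of $M = 200$ points, and replaces Cholesky reuse with rank-one updates of the posterior covariance. The names `posterior_cov` and `sequential_design` are hypothetical.

```python
import numpy as np

def posterior_cov(x_grid, obs_idx, sigma2, tau2, phi):
    """Posterior covariance of S on the grid given noisy observations at obs_idx."""
    C = sigma2 * np.exp(-np.abs(x_grid[:, None] - x_grid[None, :]) / phi)
    K = C[np.ix_(obs_idx, obs_idx)] + tau2 * np.eye(len(obs_idx))
    return C - C[:, obs_idx] @ np.linalg.solve(K, C[obs_idx, :])

def sequential_design(C, m, tau2):
    """Greedily add m grid sites, each minimizing the resulting maximum variance."""
    C = C.copy()
    chosen = []
    for _ in range(m):
        var = np.diag(C)
        # Variance at site x (rows) after observing candidate c (columns):
        # Var_new(x) = Var(x) - Cov(x, c)^2 / (Var(c) + tau2)
        new_var = var[:, None] - C**2 / (var[None, :] + tau2)
        c = int(np.argmin(new_var.max(axis=0)))
        C = C - np.outer(C[:, c], C[c, :]) / (C[c, c] + tau2)   # condition on site c
        chosen.append(c)
    return chosen, float(np.max(np.diag(C)))

x_grid = np.linspace(0.0, 1.0, 200)
obs_idx = [20, 30, 40, 60]                    # clustered existing sites
C0 = posterior_cov(x_grid, obs_idx, sigma2=1.0, tau2=0.1, phi=0.2)
chosen, max_var = sequential_design(C0, m=3, tau2=0.1)
print(x_grid[chosen], max_var)
```

Because kriging variances do not depend on the measured values, every candidate can be scored before any new data are collected. The greedy result matches a direct recomputation with all sites included, but it is not guaranteed to equal the jointly optimal set of $m$ locations.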


Preferential G-optimal design thus generalizes classical experimental design by accounting for dependencies between measurement selection and latent processes, as well as explicit sampling preferences. It finds application in geostatistics, spatial statistics, graph-signal processing, and any setting where targeted minimization of the worst-case predictive uncertainty is required under non-uniform, possibly biased, data acquisition schemes. Key references include methodology for spatial fields under Cox-process sampling (Ferreira et al., 2015) and accelerated algorithms for graph subset selection (Li et al., 2021).
