Indirect Cost Functions Overview
- Indirect cost functions are defined as the minimal cumulative cost over sequential experiments that replicate a target distribution of posteriors in information acquisition.
- They satisfy key axioms, monotonicity and sub-additivity, ensuring that less informative experiments cost weakly less and that sequential composition never makes implementation more expensive than the sum of its stages.
- Posterior-separable forms connect these costs to statistical divergences and entropy-based penalties, with significant implications for dynamic learning and cost-sharing frameworks.
An indirect cost function quantifies the minimum expected cost of generating an information structure when the designer is constrained only by the cost assigned to individual direct experiments and may combine such experiments sequentially. This concept is central to the economic theory of information acquisition, where information is modeled as a stochastic reduction in uncertainty over a finite state space, and the costs of different acquisition strategies are analyzed according to their efficiency and compositional properties.
1. Formal Model and Definition
The state space $\Omega$ is finite. An information structure or "experiment" is formalized as a distribution $\pi$ over posterior beliefs $\mu \in \Delta(\Omega)$, with mean (expected value) equal to the prior $\mu_0$. The cost of producing such a structure in one step is given by a direct cost function $c(\pi)$. This function is required to be bounded and (piecewise) continuous, convex under mixture (i.e., mixing experiments does not incur extra cost), and to satisfy $c(\delta_{\mu}) = 0$ for all deterministic (degenerate) distributions of posteriors $\delta_{\mu}$, so that producing no information is free.
The indirect cost of generating a distribution $\pi$ is then defined as the minimal expected cumulative direct cost over all finite-horizon sequential designs replicating $\pi$ in law, allowing for "information disposal" (steps that forget information at no cost). Formally,
$$C(\pi) \;=\; \inf \; \mathbb{E}\!\left[\sum_{t=1}^{T} c(\pi_t)\right],$$
where the infimum is over all finite Markov chains of beliefs $(\mu_t)_{t=0}^{T}$ with stage experiments $\pi_t$, started at the prior, satisfying the martingale property, and whose terminal belief distribution replicates $\pi$. This definition models the cheapest way to implement $\pi$ given a direct cost structure $c$ and is robust to sequential composition and probabilistic mixing of experiments (Zhong, 2018).
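To fix ideas, the sketch below represents an experiment as a finite distribution over posteriors and evaluates one admissible direct cost on it. It assumes a binary state and an illustrative quadratic cost; the function names (`is_bayes_plausible`, `direct_cost_quadratic`) are hypothetical, not from the paper.

```python
import numpy as np

# Minimal sketch (binary state): an "experiment" is a finite distribution over
# posterior beliefs mu (the probability of state 1), given by support points
# `posteriors` and weights `weights`, whose mean equals the prior
# (Bayes plausibility / the martingale condition E_pi[mu] = mu_0).

def is_bayes_plausible(posteriors, weights, prior, tol=1e-9):
    """Check the martingale condition E_pi[mu] = mu_0."""
    return abs(float(np.dot(weights, posteriors)) - prior) < tol

def direct_cost_quadratic(posteriors, weights, prior):
    """One illustrative direct cost c(pi): expected squared deviation of the
    posterior from the prior (the 'quadratic variance' cost discussed later)."""
    posteriors = np.asarray(posteriors, dtype=float)
    return float(np.dot(weights, (posteriors - prior) ** 2))

prior = 0.5
posteriors = np.array([0.9, 0.1])   # a symmetric binary experiment
weights = np.array([0.5, 0.5])

assert is_bayes_plausible(posteriors, weights, prior)
print("c(pi) =", direct_cost_quadratic(posteriors, weights, prior))  # 0.16
```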
2. Fundamental Axioms: Monotonicity and Sub-additivity
All indirect cost functions derived as above satisfy two key properties:
- Monotonicity: If $\pi'$ is a Blackwell garbling of $\pi$ (i.e., $\pi' \preceq \pi$, so $\pi'$ is less informative than $\pi$), then $C(\pi') \le C(\pi)$.
This reflects the principle that less informative structures can be implemented at lower or equal cost.
- Sub-additivity: For any two-stage experiment in which $\pi$ is obtained by first drawing an intermediate posterior $\mu$ from $\pi_1$ and then drawing the final posterior from $\pi_2^{\mu}$ (that is, sequentially implementing $\pi_1$ and then, conditionally on $\mu$, $\pi_2^{\mu}$), $C(\pi) \le C(\pi_1) + \mathbb{E}_{\mu \sim \pi_1}\!\left[C(\pi_2^{\mu})\right]$.
This encodes the cost advantage of breaking an acquisition task into sequential sub-experiments.
These axioms are both necessary and sufficient: a function $C$ is the indirect cost for some valid direct cost $c$ if and only if it satisfies monotonicity and sub-additivity. Furthermore, if the direct cost $c$ already satisfies these properties, it is the indirect cost function induced by itself, i.e., it is "optimized" and further indirectification leaves it unchanged (Zhong, 2018).
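As a concrete check on sub-additivity, the following sketch composes a two-stage strategy and compares its total cost with the one-shot cost of the compound experiment, using the entropy-reduction (mutual information) cost discussed in Section 5. Because that cost is additive, the inequality holds here with equality; the specific numbers and function names are illustrative.

```python
import numpy as np

def binary_entropy(mu):
    """Shannon entropy (nats) of a binary belief mu = P(state 1)."""
    mu = np.clip(mu, 1e-12, 1 - 1e-12)
    return -(mu * np.log(mu) + (1 - mu) * np.log(1 - mu))

def entropy_cost(posteriors, weights, prior):
    """Mutual-information cost: expected reduction in entropy from prior to posterior."""
    return binary_entropy(prior) - float(np.dot(weights, binary_entropy(np.asarray(posteriors))))

prior = 0.5
# Stage 1: split the prior into intermediate beliefs 0.3 and 0.7.
stage1 = (np.array([0.3, 0.7]), np.array([0.5, 0.5]))
# Stage 2 (conditional on each intermediate belief): a further symmetric split.
stage2 = {0.3: (np.array([0.1, 0.5]), np.array([0.5, 0.5])),
          0.7: (np.array([0.5, 0.9]), np.array([0.5, 0.5]))}

# Cost of the two-stage strategy: stage-1 cost plus the expected stage-2 cost.
two_stage = entropy_cost(*stage1, prior) + sum(
    w * entropy_cost(*stage2[float(mu)], float(mu)) for mu, w in zip(*stage1))

# Cost of the compound (one-shot) experiment: terminal posteriors and weights.
terminal_post = np.array([0.1, 0.5, 0.9])
terminal_wts = np.array([0.25, 0.50, 0.25])
one_shot = entropy_cost(terminal_post, terminal_wts, prior)

print(f"one-shot C(pi)  = {one_shot:.6f}")
print(f"two-stage total = {two_stage:.6f}")
assert one_shot <= two_stage + 1e-9  # sub-additivity (equality here: entropy cost is additive)
```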
3. Posterior-separable and Uniformly Posterior-separable Indirect Costs
A cost function $C$ is posterior-separable if there exists a divergence $D(\mu \,\|\, \mu_0)$ (convex in $\mu$ and vanishing at $\mu = \mu_0$) such that
$$C(\pi) = \mathbb{E}_{\mu \sim \pi}\big[D(\mu \,\|\, \mu_0)\big].$$
It is uniformly posterior-separable if there is a convex potential $\varphi$ with
$$C(\pi) = \mathbb{E}_{\mu \sim \pi}\big[\varphi(\mu)\big] - \varphi(\mu_0).$$
Uniform posterior-separability is equivalent to monotonicity plus the stricter additivity property $C(\pi) = C(\pi_1) + \mathbb{E}_{\mu \sim \pi_1}[C(\pi_2^{\mu})]$ for all two-stage compositions. This class captures costs for which the expected acquisition cost depends only on the dispersion of posteriors around the prior, evaluated through a convex potential, directly linking information costs to statistical divergences and entropy-based penalties.
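The link between the divergence form and the potential form can be checked directly: for a Bregman divergence of a convex potential, the gradient term averages out under the martingale condition, so the expected divergence equals the expected potential minus the potential at the prior. The sketch below verifies this with the negative-entropy potential (whose Bregman divergence is the binary KL divergence); the experiment is illustrative.

```python
import numpy as np

def neg_entropy(mu):
    mu = np.clip(mu, 1e-12, 1 - 1e-12)
    return mu * np.log(mu) + (1 - mu) * np.log(1 - mu)

def neg_entropy_grad(mu):
    mu = np.clip(mu, 1e-12, 1 - 1e-12)
    return np.log(mu) - np.log(1 - mu)

def bregman(mu, mu0):
    """Bregman divergence of the negative-entropy potential = binary KL(mu || mu0)."""
    return neg_entropy(mu) - neg_entropy(mu0) - neg_entropy_grad(mu0) * (mu - mu0)

prior = 0.4
posteriors = np.array([0.1, 0.4, 0.7])
weights = np.array([0.3, 0.4, 0.3])
assert abs(float(np.dot(weights, posteriors)) - prior) < 1e-12   # Bayes plausibility

# Posterior-separable form: expected divergence of posteriors from the prior.
ps_form = float(np.dot(weights, bregman(posteriors, prior)))
# Uniformly posterior-separable form: expected potential minus potential at the prior.
ups_form = float(np.dot(weights, neg_entropy(posteriors))) - neg_entropy(prior)

print(ps_form, ups_form)          # identical up to rounding
assert abs(ps_form - ups_form) < 1e-9
```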
If a direct cost "favors incremental evidence"—locally approximating, up to , a quadratic form with positive semi-definite , and admitting a convex such that and —then the induced indirect cost is uniformly posterior-separable. Conversely, uniform posterior-separability of the indirect cost requires that favors incremental evidence in this sense.
4. Prior-independence and Indirect Cost Separability
A direct cost function is prior-independent if it depends only on the conditional signal distributions (the likelihood matrix) and not on the prior over states. For $|\Omega| = 2$ (the binary-state case), any prior-independent direct cost can be written in terms of the joint signal–state law and its marginals. The induced indirect cost is uniformly posterior-separable (i.e., admits a potential representation) if and only if, for some constant $\beta > 0$, the direct cost is $\beta$ times a mutual-information functional built from Kullback–Leibler divergences of the conditional signal laws $P_{S \mid \omega}$ relative to the marginal signal law. In this regime, the indirect cost is the corresponding entropy-based functional, i.e., uniformly posterior-separable with potential proportional to negative Shannon entropy.
For $|\Omega| \ge 3$, there is no nontrivial prior-independent direct cost that generates a posterior-separable indirect cost. This delineates a sharp boundary between the binary and multi-state settings in terms of the admissible cost functionals.
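To make the objects in this statement concrete, the sketch below computes the mutual information of a binary-state experiment in the two equivalent ways referenced above: as a prior-weighted KL divergence of the conditional signal laws $P_{S\mid\omega}$ from the signal marginal, and as the expected reduction in entropy of beliefs. The likelihood matrix and prior are illustrative.

```python
import numpy as np

def entropy(p):
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1.0)
    return -float(np.sum(p * np.log(p)))

def kl(p, q):
    p = np.clip(np.asarray(p, dtype=float), 1e-12, 1.0)
    q = np.clip(np.asarray(q, dtype=float), 1e-12, 1.0)
    return float(np.sum(p * np.log(p / q)))

# Binary state, three signals: rows are P(S | omega).
likelihood = np.array([[0.7, 0.2, 0.1],    # P(S | omega = 0)
                       [0.1, 0.3, 0.6]])   # P(S | omega = 1)
prior = np.array([0.5, 0.5])

signal_marginal = prior @ likelihood            # P(S)
joint = likelihood * prior[:, None]             # P(omega, S)
posteriors = joint / signal_marginal            # P(omega | S), one column per signal

# (1) Mutual information as a prior-weighted KL of conditional signal laws from the marginal.
mi_kl = sum(prior[w] * kl(likelihood[w], signal_marginal) for w in range(2))
# (2) The same quantity as the expected reduction in entropy of beliefs.
mi_entropy = entropy(prior) - sum(signal_marginal[s] * entropy(posteriors[:, s])
                                  for s in range(likelihood.shape[1]))

print(mi_kl, mi_entropy)   # equal up to rounding
assert abs(mi_kl - mi_entropy) < 1e-9
```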
5. Representative Cost Function Examples and Operational Insights
Common choices for direct cost functions and their induced indirect forms include:
| Cost Type | Definition | Indirect Cost Property |
|---|---|---|
| Mutual information | $c(\pi) = H(\mu_0) - \mathbb{E}_{\mu \sim \pi}[H(\mu)]$, with $H$ Shannon entropy | Additive, posterior-separable |
| Quadratic variance | $c(\pi) = \mathbb{E}_{\mu \sim \pi}\big[\lVert \mu - \mu_0 \rVert^2\big]$ | Additive, quadratic potential |
- For mutual information, the corresponding potential is $\varphi(\mu) = -H(\mu)$ (negative Shannon entropy), and direct and indirect costs are equal.
- For quadratic variance, the potential is $\varphi(\mu) = \lVert \mu \rVert^2$, and again the cost is additive, i.e., direct and indirect costs coincide.
A corollary is the dilution property: for a mixture $\pi_\alpha = \alpha \pi + (1-\alpha)\,\delta_{\mu_0}$ (run $\pi$ with probability $\alpha$ and learn nothing otherwise), $C(\pi_\alpha) = \alpha\, C(\pi)$, which means "diluting" an experiment in time or mixtures scales its cost linearly.
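For a uniformly posterior-separable cost the dilution identity can be checked directly: mixing an experiment with "no information" at weight $1-\alpha$ scales its cost by $\alpha$. A small illustrative check with the entropy cost:

```python
import numpy as np

def binary_entropy(mu):
    mu = np.clip(mu, 1e-12, 1 - 1e-12)
    return -(mu * np.log(mu) + (1 - mu) * np.log(1 - mu))

def entropy_cost(posteriors, weights, prior):
    return binary_entropy(prior) - float(np.dot(weights, binary_entropy(np.asarray(posteriors))))

prior = 0.5
posteriors = np.array([0.9, 0.1])
weights = np.array([0.5, 0.5])
full_cost = entropy_cost(posteriors, weights, prior)

for alpha in [0.25, 0.5, 0.75]:
    # Diluted experiment: run pi with probability alpha, otherwise the posterior stays at the prior.
    diluted_post = np.append(posteriors, prior)
    diluted_wts = np.append(alpha * weights, 1 - alpha)
    diluted_cost = entropy_cost(diluted_post, diluted_wts, prior)
    print(alpha, diluted_cost, alpha * full_cost)
    assert abs(diluted_cost - alpha * full_cost) < 1e-9
```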
In the continuous-time limit, the most efficient implementation of a target $\pi$ over a fixed horizon uses a compound Poisson signal that jumps to a posterior drawn from $\pi$ at a constant arrival rate.
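A minimal simulation sketch of this continuous-time implementation, under the simplifying assumption that the belief jumps once, at the first Poisson arrival, to a draw from $\pi$ and then stays put: at horizon $T$ the belief distribution is then a $(1 - e^{-\lambda T})$-dilution of $\pi$, so by the dilution property its entropy cost scales accordingly. Rates, horizon, and targets are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_entropy(mu):
    mu = np.clip(mu, 1e-12, 1 - 1e-12)
    return -(mu * np.log(mu) + (1 - mu) * np.log(1 - mu))

prior, horizon, rate = 0.5, 1.0, 2.0
posteriors = np.array([0.9, 0.1])          # target pi: jump here with equal probability
weights = np.array([0.5, 0.5])

# Simulate many paths: belief stays at the prior until the first Poisson arrival,
# then jumps to a posterior drawn from pi (and, in this simple sketch, stays there).
n = 200_000
arrival = rng.exponential(1.0 / rate, size=n)
jumped = arrival <= horizon
terminal = np.where(jumped, rng.choice(posteriors, size=n, p=weights), prior)

print("empirical jump fraction :", round(jumped.mean(), 4))
print("theoretical 1 - e^{-rT} :", round(1 - np.exp(-rate * horizon), 4))

# By the dilution property, the entropy cost of the horizon-T belief distribution
# should be (1 - e^{-rT}) times the cost of delivering pi outright.
emp_cost = binary_entropy(prior) - binary_entropy(terminal).mean()
full_cost = binary_entropy(prior) - float(np.dot(weights, binary_entropy(posteriors)))
print("empirical cost of time-T beliefs   :", round(emp_cost, 4))
print("dilution prediction (1-e^{-rT})C(pi):", round((1 - np.exp(-rate * horizon)) * full_cost, 4))
```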
An application is dynamic learning: when a decision-maker faces a per-period flow cost derived from the direct cost $c$ and selects an optimal stopping rule, sub-additivity ensures that optimal information delivery can be characterized by a static allocation together with a Poissonian event structure, with equal-cost smoothing across periods.
6. Indirect Cost Functions in Statistical Cost Sharing
In cost-sharing problems where the cost function is not directly accessible but must instead be estimated from empirical data, "statistical cost sharing" approaches assess solution concepts (such as the core and the Shapley value) from observed samples $(S, C(S))$ of coalitions and their costs, together with the distribution from which they are drawn (Balkanski et al., 2017). Here, the cost is indirect in the sense that the cooperative cost functional is estimated from data, with sample-complexity analysis and probabilistic guarantees (e.g., uniform convergence based on VC dimension).
For monotone submodular cost functions with curvature $\kappa$, the statistical Shapley value can be approximated within a factor that depends only on $\kappa$, and this bound is information-theoretically tight. Statistical analogues of the Shapley axioms (balance, symmetry, zero-element, additivity) uniquely specify the data-driven Shapley value in this context.
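The paper's estimator works from passively observed (coalition, cost) samples; as a simpler illustration of the data-driven idea, the sketch below estimates Shapley values by Monte Carlo averaging of marginal contributions over random orderings, assuming query access to a toy monotone submodular cost. Function names and the cost are illustrative, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_submodular_cost(coalition, weights):
    """A toy monotone submodular cost: a concave function of total coalition weight."""
    return float(np.sqrt(sum(weights[i] for i in coalition)))

def shapley_monte_carlo(n_players, cost, n_samples=20_000):
    """Estimate Shapley values by averaging marginal contributions over random orderings."""
    values = np.zeros(n_players)
    for _ in range(n_samples):
        order = rng.permutation(n_players)
        coalition, prev_cost = set(), 0.0
        for player in order:
            coalition.add(int(player))
            new_cost = cost(coalition)
            values[player] += new_cost - prev_cost
            prev_cost = new_cost
    return values / n_samples

weights = [4.0, 1.0, 1.0]
est = shapley_monte_carlo(3, lambda S: toy_submodular_cost(S, weights))
print("estimated Shapley values:", np.round(est, 3))
print("sum (equals grand-coalition cost):", round(float(est.sum()), 3),
      "vs", round(toy_submodular_cost({0, 1, 2}, weights), 3))
```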
7. Implications, Boundaries, and Research Directions
The study of indirect cost functions clarifies which functional forms over distributions of posteriors are compatible with compositional, implementable, and economically meaningful cost measures. The equivalence of indirect cost functions with the monotonicity and sub-additivity axioms provides a rigorous operational characterization, while the theory of posterior-separability builds a bridge to potential-based representations widely used in information theory and learning theory.
The sharp distinction between the binary and multi-state cases for prior-independent, posterior-separable indirect costs restricts the general applicability of certain statistical measures as cost surrogates. The explicit representation of indirect cost functions in dynamic and statistical learning continues to motivate both theoretical and applied research in information acquisition, mechanism design, and data-driven cooperative game theory.