Statistical Jet Bundle: Information Geometry

Updated 21 November 2025

The statistical jet bundle is a geometric structure that organizes higher-order differential information and hierarchies of variance bounds in statistical models.
It employs jet bundles, contact geometry, and Cartan distributions to derive curvature corrections and integrability conditions for estimator efficiency.
The framework finds applications in both information geometry and collider phenomenology, enabling improved background rejection and measurement precision via ensemble jet analysis.

A statistical jet bundle is a geometric structure that systematically organizes higher-order differential information about statistical models, particularly encoding the hierarchy of variance bounds—including the Cramér–Rao and Bhattacharyya-type inequalities—within the framework of jet bundles, contact geometry, and Cartan distributions. Introduced and developed in the context of information geometry, the statistical jet bundle formalism provides a unified, intrinsic, and coordinate-free foundation for analyzing estimator efficiency, curvature corrections to variance lower bounds, and the geometric and differential-algebraic criteria for optimality of statistical estimators (Krishnan, 19 Nov 2025).

1. The Statistical Bundle and Jet Bundles

The foundational object is the statistical bundle $E = \Theta \times H = \mathbb{R} \times L^2(\mu)$ , where $\Theta \cong \mathbb{R}$ is the parameter space with coordinate $\theta$ , and the fibre over each $\theta$ is the Hilbert space $H = L^2(\mu)$ of square-integrable functions on the sample space. A section of $E$ is a mapping $s: \Theta \to H$ , such as the square-root embedding $s_\theta(x) = \sqrt{f(x;\theta)}$ associated with a parametric family of densities.
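As a minimal numerical sketch of the square-root embedding, the snippet below uses a unit-variance Gaussian location family (an illustrative choice, not taken from the source) and checks two standard facts: $s_\theta$ has unit norm in $L^2(\mu)$, and $4\|s'(\theta)\|^2$ equals the Fisher information, which is 1 for this family. The helper names `density`, `section`, and `l2_norm_sq` are hypothetical.

```python
import math

def density(x, theta):
    """Unit-variance Gaussian density f(x; theta) (illustrative family)."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)

def section(x, theta):
    """Square-root embedding s_theta(x) = sqrt(f(x; theta))."""
    return math.sqrt(density(x, theta))

def l2_norm_sq(g, lo=-12.0, hi=12.0, n=4000):
    """Squared L2 norm of g on [lo, hi] by the trapezoidal rule."""
    h = (hi - lo) / n
    total = 0.5 * (g(lo) ** 2 + g(hi) ** 2)
    for i in range(1, n):
        total += g(lo + i * h) ** 2
    return total * h

theta = 0.3
eps = 1e-4

# s_theta is a unit vector in L^2 because the density integrates to one.
norm_s_sq = l2_norm_sq(lambda x: section(x, theta))

def s_prime(x):
    """Central difference in theta, approximating s'(theta) pointwise in x."""
    return (section(x, theta + eps) - section(x, theta - eps)) / (2.0 * eps)

# For s = sqrt(f), the squared norm of s' is one quarter of the Fisher information.
fisher_from_section = 4.0 * l2_norm_sq(s_prime)

print(norm_s_sq, fisher_from_section)  # both should be close to 1
```

The integration window and step sizes are arbitrary tuning choices for this example.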

The $m$ -th order statistical jet bundle $J^m(E)$ over $\Theta$ consists, at each $\theta$ , of all $m$ -jets (equivalence classes of local sections matching up to $m$ derivatives at $\theta$ ). In coordinates, an element of $J^m(E)$ is represented as $(\theta; s_0, s_1, ..., s_m)$ , where each $s_k \in H$ denotes the $k$ -th derivative $s^{(k)}(\theta)$ . Natural projections $\pi_{m,k}: J^m(E) \rightarrow J^k(E)$ forget higher derivatives, forming a tower of bundles.
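The coordinate description can be sketched in code. The example below assumes the Gaussian location family with $s_\theta(x) = (2\pi)^{-1/4} e^{-(x-\theta)^2/4}$ and evaluates the jet coordinates at a single sample point $x$, so each $s_k$ is a number rather than an element of $H$. The functions `next_poly`, `jet`, and `project` are hypothetical names; the recursion uses $s^{(k)} = p_k(u)\, s$ with $u = x - \theta$ and $p_{k+1} = -p_k' + (u/2)\, p_k$.

```python
import math

def next_poly(p):
    """Given coefficients of p_k in u (lowest degree first), return p_{k+1}."""
    out = [0.0] * (len(p) + 1)
    for i, c in enumerate(p):
        out[i + 1] += 0.5 * c      # contribution of (u / 2) * p_k
        if i >= 1:
            out[i - 1] -= i * c    # contribution of -dp_k/du
    return out

def jet(theta, x, m):
    """Return (theta, [s_0, ..., s_m]) evaluated at the sample point x."""
    u = x - theta
    s0 = (2.0 * math.pi) ** -0.25 * math.exp(-0.25 * u * u)
    coords = []
    p = [1.0]
    for _ in range(m + 1):
        coords.append(sum(c * u ** i for i, c in enumerate(p)) * s0)
        p = next_poly(p)
    return (theta, coords)

def project(j, k):
    """Projection pi_{m,k}: forget all derivatives of order above k."""
    theta, coords = j
    return (theta, coords[: k + 1])

j2 = jet(0.0, 2.0, 2)     # a point of J^2, evaluated at x = 2
j1 = project(j2, 1)       # its image in J^1
print(j2)
print(j1)
```

Representing a jet as a tuple makes the tower structure explicit: each projection is a truncation of the coordinate list.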

2. Canonical Contact Forms and Total Derivatives

On the trivial finite-dimensional jet bundle $J^m(\mathbb{R} \times \mathbb{R})$ , the standard contact 1-forms are

$\omega_k = ds_k - s_{k+1} d\theta, \quad 0 \leq k \leq m-1.$

These forms vanish along holonomic prolongations, i.e., lifts of genuine sections and their derivatives.
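A small numerical check of this vanishing property, in the scalar case with $m = 2$: along a curve $\theta \mapsto (s_0, s_1, s_2)$, the pullback of $\omega_k$ is $(ds_k/d\theta - s_{k+1})\, d\theta$. The section $s(\theta) = \sin\theta$ and the function names below are illustrative assumptions.

```python
import math

def holonomic(theta):
    """2-jet prolongation of s(theta) = sin(theta): (s_0, s_1, s_2)."""
    return (math.sin(theta), math.cos(theta), -math.sin(theta))

def non_holonomic(theta):
    """A curve in J^2 whose s_1 slot is not the derivative of its s_0 slot."""
    return (math.sin(theta), 2.0 * math.cos(theta), -math.sin(theta))

def contact_pullbacks(curve, theta, h=1e-5):
    """d(theta)-coefficients of the pulled-back omega_0 and omega_1."""
    plus, minus, here = curve(theta + h), curve(theta - h), curve(theta)
    return [(plus[k] - minus[k]) / (2.0 * h) - here[k + 1] for k in range(2)]

on_prolongation = contact_pullbacks(holonomic, 0.7)
off_prolongation = contact_pullbacks(non_holonomic, 0.7)

print(on_prolongation)    # both entries vanish up to finite-difference error
print(off_prolongation)   # non-zero: the curve is not a prolongation
```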

On the infinite-dimensional statistical jet bundle $J^m(E)$ , $\omega_k$ retains its form, with $s_k$ interpreted as $H$ -valued and $ds_k$ as $H$ -valued 1-forms. For each $x$ in the sample space, the scalar evaluation $\langle \omega_k, \delta_x \rangle$ matches the classical contact form at $x$ .

The total derivative, or Cartan vector field, on $J^m(E)$ is given by

$D = \partial_\theta + s_1 \partial_{s_0} + s_2 \partial_{s_1} + \cdots + s_m \partial_{s_{m-1}}.$

Each contact form $\omega_k$ annihilates $D$ , and $D$ generates the rank-1 Cartan distribution.
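As a quick check, using only the definitions above and reading $d\theta(D) = 1$ and $ds_k(D) = s_{k+1}$ off the coefficients of $D$ , every contact form evaluates to zero on $D$ :

$\omega_k(D) = ds_k(D) - s_{k+1} \, d\theta(D) = s_{k+1} - s_{k+1} = 0, \quad 0 \leq k \leq m-1.$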

3. Cartan Distribution, Ehresmann Connection, Torsion, and Curvature

The Cartan distribution $\mathcal{C}^m = \bigcap_{k=0}^{m-1} \ker \omega_k$ is a rank-1 distribution on $J^m(E)$ , spanned by $D$ . For the bundle projection $\pi_{m,m-1}: J^m(E) \rightarrow J^{m-1}(E)$ , the vertical bundle is $V^m = \ker (\pi_{m,m-1})_* = \operatorname{span}\{ \partial_{s_m} \}$ . The connection is encoded by requiring the contact 1-form $\omega_{m-1}$ to vanish on horizontal vectors, yielding a decomposition $T J^m(E) = H^m \oplus V^m$ , where the horizontal component $H^m$ (not to be confused with the fibre $H = L^2(\mu)$ ) is the kernel of $\omega_{m-1}$ .

The torsion 1-form is calculated as

$T = i_D d\omega_{m-1} = -\omega_m,$

while the curvature 2-form is $R = d\omega_{m-1}$ . Non-zero curvature measures the non-integrability of $H^m$ , representing the geometric source of curvature corrections in estimator variance bounds.
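In the scalar case, and assuming the usual conventions for the exterior derivative and wedge product, the curvature 2-form can be written out explicitly from $\omega_{m-1} = ds_{m-1} - s_m \, d\theta$ :

$R = d\omega_{m-1} = d(ds_{m-1}) - ds_m \wedge d\theta = d\theta \wedge ds_m.$

The first term vanishes because $d^2 = 0$ , so $R$ is built entirely from the highest jet coordinate $s_m$ and the base coordinate $\theta$ .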

4. Efficient Models and ODE Submanifolds

A statistical model is termed “efficient up to order $m$ ” if the estimator residual function lies in the span of the first $m$ derivatives $\eta_k(\theta) = \partial_\theta^k s_\theta$ , $1 \leq k \leq m$ . Equivalently, there exist coefficients $a_0(\theta), ..., a_m(\theta)$ , not all zero, such that for all $x$ in the sample space,

$\sum_{k=0}^m a_k(\theta) \, \partial_\theta^k s_\theta(x) = 0.$

The submanifold $\Sigma \subset J^m(E)$ , defined by the linear constraint $F(\theta, s_0, ..., s_m) = \sum_{k=0}^m a_k(\theta) s_k = 0$ , specifies the locus of $m$ -th order efficient models. The image of the $m$ -jet prolongation $j^m s$ must lie entirely in $\Sigma$ for all $\theta$ .

5. Integrability, Variance Bounds, and Curvature Corrections

Classical information inequalities, such as the Cramér–Rao bound (CRB) and Bhattacharyya-type bounds, are re-expressed geometrically in the jet bundle formalism. An unbiased estimator $T$ achieves $m$ -th order efficiency if and only if the residual $e(x;\theta) \, s_\theta(x) = (T(x) - \theta) s_\theta(x)$ lies in the span of the first $m$ derivatives of $s_\theta$ . This is algebraically equivalent to the assertion that $s_\theta$ satisfies a homogeneous linear ODE of order $m$ .
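A minimal sketch of the $m = 1$ case, assuming the Gaussian location family with $s_\theta(x) = (2\pi)^{-1/4} e^{-(x-\theta)^2/4}$ and the estimator $T(x) = x$ (both illustrative choices, not taken from the source). Differentiating gives $s'_\theta(x) = \tfrac{1}{2}(x - \theta)\, s_\theta(x)$ , so the residual $(T(x) - \theta)\, s_\theta(x)$ equals $2 s'_\theta(x)$ and lies in the span of the first derivative. The code checks this identity on a grid with a finite-difference derivative.

```python
import math

def s(x, theta):
    """Square-root embedding of the unit-variance Gaussian location family."""
    return (2.0 * math.pi) ** -0.25 * math.exp(-0.25 * (x - theta) ** 2)

def ds_dtheta(x, theta, h=1e-5):
    """Central-difference approximation of the theta-derivative of s."""
    return (s(x, theta + h) - s(x, theta - h)) / (2.0 * h)

def residual(x, theta):
    """Residual (T(x) - theta) * s_theta(x) for the estimator T(x) = x."""
    return (x - theta) * s(x, theta)

theta = 0.4
grid = [-5.0 + 0.1 * i for i in range(101)]

# Largest pointwise gap between the residual and 2 * s'(theta) over the grid.
max_gap = max(abs(residual(x, theta) - 2.0 * ds_dtheta(x, theta)) for x in grid)
print(max_gap)  # small, limited only by finite-difference error
```

Read as an equation in $\theta$ for each fixed $x$ , the identity $(x - \theta)\, s - 2\, \partial_\theta s = 0$ is a first-order homogeneous linear ODE for $s_\theta(x)$ .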

Geometrically, $j^m s$ must be both contained in $\Sigma$ and tangent to the Cartan distribution, i.e., $D F|_{\Sigma} = 0$ , implying that $\Sigma$ is an integral submanifold for $D$ . Non-integrability, measured by torsion or curvature, precisely quantifies the amount by which an estimator fails to achieve the refined bound, and the geometric structure supplies the necessary extrinsic corrections to the variance.

For $m=1$ , the second fundamental form of the embedding $s: \Theta \to H$ is

$II(\partial_\theta, \partial_\theta) = P_\perp(s''(\theta)) = s''(\theta) - \frac{\langle s''(\theta), s'(\theta) \rangle}{\|s'(\theta)\|^2} s'(\theta),$

and the variance of an unbiased estimator $T$ satisfies

$\operatorname{Var}_\theta(T) \geq \|s'(\theta)\|^{-2} + \frac{\|II(\partial_\theta, \partial_\theta)\|^2}{\|s'(\theta)\|^4},$

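where the first term plays the role of the inverse Fisher information and the second term arises from curvature. This identification presumes a normalization in which $\|s'(\theta)\|^2$ equals the Fisher information $I(\theta)$ ; for the unscaled embedding $s_\theta = \sqrt{f(\cdot;\theta)}$ one has $\|s'(\theta)\|^2 = I(\theta)/4$ . For $m > 1$ , higher-order “fundamental forms” yield Bhattacharyya-type corrections, and integrability of multiple intersecting efficiency ODEs signals the vanishing of all residual torsion/curvature, corresponding to full $m$ -th order efficiency.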
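The projection defining $II(\partial_\theta, \partial_\theta)$ is a single Gram-Schmidt step and can be sketched numerically. The example assumes the exponential family $f(x;\theta) = \theta e^{-\theta x}$ on $x \geq 0$ with the unscaled embedding $s_\theta = \sqrt{f}$ , for which $s' = (\tfrac{1}{2\theta} - \tfrac{x}{2})\, s$ and $s'' = \left(-\tfrac{1}{2\theta^2} + (\tfrac{1}{2\theta} - \tfrac{x}{2})^2\right) s$ . It illustrates only the projection formula, not the variance inequality; all names are hypothetical.

```python
import math

THETA = 2.0  # rate parameter of the illustrative exponential family

def s(x):
    """s_theta(x) = sqrt(theta * exp(-theta * x))."""
    return math.sqrt(THETA) * math.exp(-0.5 * THETA * x)

def s1(x):
    """First theta-derivative of s, in closed form."""
    return (0.5 / THETA - 0.5 * x) * s(x)

def s2(x):
    """Second theta-derivative of s, in closed form."""
    return (-0.5 / THETA ** 2 + (0.5 / THETA - 0.5 * x) ** 2) * s(x)

def inner(g, h, hi=40.0, n=20000):
    """L2 inner product on [0, hi] by the trapezoidal rule."""
    step = hi / n
    total = 0.5 * (g(0.0) * h(0.0) + g(hi) * h(hi))
    for i in range(1, n):
        x = i * step
        total += g(x) * h(x)
    return total * step

# Coefficient of the component of s'' along s'.
coeff = inner(s2, s1) / inner(s1, s1)

def second_fundamental_form(x):
    """P_perp(s''): s'' with its component along s' removed."""
    return s2(x) - coeff * s1(x)

norm_s1_sq = inner(s1, s1)
overlap = inner(second_fundamental_form, s1)
print(norm_s1_sq, coeff, overlap)
```

For this family $\|s'\|^2 = 1/(4\theta^2)$ , so `norm_s1_sq` should be close to $1/16$ , and `overlap` should vanish up to rounding, confirming that the projected vector is orthogonal to $s'(\theta)$ .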

6. Jet Bundle Formalism in Collider Phenomenology

In collider phenomenology, a “statistical jet bundle” denotes the construction in which each collision event’s jet is represented by an ensemble of clustering trees, obtained by randomizing the clustering sequence according to probabilistic weights (as in the Q-jets formalism) (Ellis et al., 2012). Each jet thus gives rise to a bundle of trees over the same event, enabling one to empirically study the distribution of any observable $O$ across the ensemble.

For a given event, the Q-jets procedure randomizes the recombination of constituent four-vectors according to a parameter $\alpha$ (“rigidity”), constructing multiple trees per jet. Observables are collected over this ensemble, defining the empirical distribution $f_N(O) = \frac{1}{N} \sum_{k=1}^N \delta(O - O_k)$ and associated summary statistics—mean, variance, and higher moments.
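The ensemble summary statistics can be sketched as follows. The clustering step itself is not reproduced: the lists below are made-up illustrative numbers standing in for the observable values $O_k$ from $N$ trees, not simulation results. Volatility is assumed here to be the r.m.s. width divided by the mean, and `ensemble_summary` is a hypothetical helper.

```python
import statistics

def ensemble_summary(values):
    """Mean, variance, and volatility of an observable over one jet's trees."""
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values, mu=mean)  # second central moment of f_N
    volatility = variance ** 0.5 / mean               # assumed: r.m.s. width / mean
    return {"mean": mean, "variance": variance, "volatility": volatility}

# Hypothetical pruned-mass values (GeV) from five trees per jet.
signal_like = [80.1, 79.8, 80.5, 80.0, 79.6]
background_like = [45.0, 72.0, 95.0, 60.0, 110.0]

signal_stats = ensemble_summary(signal_like)
background_stats = ensemble_summary(background_like)

print(signal_stats)
print(background_stats)
```

A volatility cut would then keep jets whose `volatility` falls below a chosen threshold; the threshold is an analysis choice that the source does not specify.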

The width (variance) of these distributions serves as a powerful new discriminant: signal jets (e.g., boosted $W \to q\bar{q}$ ) typically exhibit narrow mass distributions, while QCD jets display broad volatility. Application of volatility cuts yields significant improvements to signal significance and statistical efficiency, reducing required integrated luminosity by up to a factor of two for boosted-object searches, as detailed in (Ellis et al., 2012).

7. Unifying Principles and Significance

The statistical jet bundle provides a rigorous language for encoding estimator efficiency, variance bounds, and curvature corrections in a single geometric hierarchy. In the information geometry context, the data required for the statement “variance $\geq$ 1/Fisher + curvature correction” is summarized as “the $m$ -th prolonged section $j^m s$ is an integral curve of the Cartan distribution restricted to the ODE submanifold $\Sigma_m$ .” The jet bundle formalism links algebraic projection conditions and geometric integrability in a unified framework, offering new conceptual and technical insights into the geometry of statistical estimation (Krishnan, 19 Nov 2025).

In collider analysis, the statistical jet bundle (in the Q-jets sense) enables the exploration of the statistical properties of jet observables at the ensemble level, fostering enhanced stability of background rejection and improved measurement precision (Ellis et al., 2012). This dual usage underscores the unifying power of jet bundle formalism in both information geometry and physical data analysis contexts.


References

Krishnan (2025). Cartan meets Cramér-Rao.

Ellis et al. (2012). Qjets: A Non-Deterministic Approach to Tree-Based Jet Substructure.
