
Brownian Distance Covariance

Updated 6 February 2026
  • Brownian Distance Covariance is a nonparametric measure that detects both linear and nonlinear associations between random vectors across various dimensions.
  • It computes dependence via pairwise Euclidean distances and characteristic functions weighted by Brownian motion, ensuring a value of zero only under independence.
  • Its practical applications span independence testing, model diagnostics, and integration in deep neural networks, backed by rigorous statistical properties and extensions.

Brownian distance covariance (BdCov, also called distance covariance or dCov) is a dependence measure for random vectors that generalizes classical covariance to quantify all types of dependence, including nonlinear and nonmonotone associations. Introduced by Székely and Rizzo, BdCov is defined via characteristic functions with a special weighting derived from Brownian motion, and is zero if and only if the random variables are independent. Its construction, based on pairwise Euclidean distances, is fundamentally nonparametric and applies to multivariate data of arbitrary dimension, with rigorous statistical properties and numerous extensions to metric, Hilbert, and functional spaces (Székely et al., 2010; Lyons, 2011).

1. Formal Definition and Equivalent Forms

Let $X \in \mathbb{R}^p$ and $Y \in \mathbb{R}^q$ be random vectors with joint characteristic function $\phi_{X,Y}(t, s) = E[e^{i(t^T X + s^T Y)}]$ and marginals $\phi_X(t) = E[e^{i t^T X}]$, $\phi_Y(s) = E[e^{i s^T Y}]$. The squared population Brownian distance covariance is (Székely et al., 2010, Xie et al., 2022):
$$V^2(X, Y) = \frac{1}{c_p c_q} \int_{\mathbb{R}^p} \int_{\mathbb{R}^q} \frac{|\phi_{X,Y}(t, s) - \phi_X(t)\,\phi_Y(s)|^2}{|t|^{1+p}\,|s|^{1+q}}\, dt\, ds,$$
where $c_d = \pi^{(1+d)/2} / \Gamma((1+d)/2)$. This construction is the weighted $L^2$-distance between the joint and product characteristic functions, with a weight kernel corresponding to Brownian motion increments.

An equivalent form in terms of Euclidean distances is (Székely et al., 2010, Lyons, 2011):
$$V^2(X, Y) = E[|X-X'|\,|Y-Y'|] + E|X-X'|\,E|Y-Y'| - 2\,E[|X-X'|\,|Y-Y''|],$$
where $(X', Y')$ and $(X'', Y'')$ are i.i.d. copies of $(X, Y)$, independent of each other and of $(X, Y)$.

The sample (empirical) Brownian distance covariance for i.i.d. pairs $\{(X_i, Y_i)\}_{i=1}^n$ is computed by forming the $n \times n$ distance matrices
$$a_{ij} = |X_i - X_j|, \qquad b_{ij} = |Y_i - Y_j|,$$
double-centering each:
$$A_{ij} = a_{ij} - \bar{a}_{i\cdot} - \bar{a}_{\cdot j} + \bar{a}_{\cdot\cdot},$$
and analogously for $B_{ij}$. The empirical squared distance covariance is then

$$V_n^2(X, Y) = \frac{1}{n^2} \sum_{i,j=1}^n A_{ij} B_{ij}.$$

The corresponding sample distance correlation is defined as

$$R_n(X, Y) = \frac{V_n(X, Y)}{\sqrt{V_n(X, X)\, V_n(Y, Y)}},$$

with $R_n = 0$ whenever the denominator vanishes (Székely et al., 2010).
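The double-centering recipe above translates directly into code. The following is an illustrative sketch (function names and conventions are ours, not from the cited papers) of the empirical statistics $V_n^2$ and $R_n$:

```python
import numpy as np

def distance_covariance(X, Y):
    """Squared sample distance covariance V_n^2 for paired samples X: (n, p), Y: (n, q)."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    # Pairwise Euclidean distance matrices a_ij = |X_i - X_j|, b_ij = |Y_i - Y_j|
    a = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    b = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    # Double-center: subtract row and column means, add back the grand mean
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()  # (1/n^2) sum_ij A_ij B_ij

def distance_correlation(X, Y):
    """Sample distance correlation R_n, set to 0 when the denominator vanishes."""
    v_xy = distance_covariance(X, Y)
    denom = np.sqrt(distance_covariance(X, X) * distance_covariance(Y, Y))
    if denom == 0:
        return 0.0
    # R_n = V_n(X, Y) / sqrt(V_n(X, X) V_n(Y, Y)); max() guards float rounding
    return float(np.sqrt(max(v_xy, 0.0) / denom))
```

Note that `distance_covariance` returns the *squared* statistic $V_n^2$, so the correlation takes a square root of the ratio; for identical inputs $R_n$ equals 1 exactly.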

2. Theoretical Properties

Characterization of independence: $V^2(X, Y) = 0$ if and only if $X$ and $Y$ are independent, under mild moment conditions (finite first moments) (Székely et al., 2010, Lyons, 2011). This property holds in general metric spaces of strong negative type.

Scale and orthogonal invariance: $V^2(a_1 + b_1 C_1 X,\ a_2 + b_2 C_2 Y) = |b_1 b_2|\, V^2(X, Y)$ for scalars $b_1, b_2 \neq 0$ and orthonormal matrices $C_1$, $C_2$; $R_n$ is fully invariant under these transformations (Székely et al., 2010, Székely et al., 2010).

Non-negativity: $V^2(X, Y) \ge 0$, with equality if and only if $X$ and $Y$ are independent.

Asymptotics: Under independence, $n V_n^2(X, Y)$ converges in distribution to a non-degenerate quadratic form $\sum_k \lambda_k Z_k^2$, where the $Z_k$ are i.i.d. standard normal and the weights $\{\lambda_k\}$ depend on the underlying distributions (Székely et al., 2010, Lyons, 2011). Under alternatives, $V_n^2 \xrightarrow{a.s.} V^2 > 0$, with $O_P(n^{-1/2})$ convergence rates.

Bias and unbiased estimation: The standard $V_n^2$ estimator is biased upward in small samples. Székely and Rizzo provided an unbiased estimator (Székely et al., 2010):
$$U_n(X, Y) = \frac{n^2}{(n-1)(n-2)} \left[ V_n^2(X, Y) - \frac{T_2}{n-1} \right],$$
where $T_2$ estimates the product of the marginal mean distances. The bias-corrected correlation $C_n$ uses $U_n$ in the same ratio as $R_n$.
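A sketch of the bias correction as stated above. Here $T_2$ is taken to be the product of the grand means of the two distance matrices, which is our reading of "the product of marginal distance means"; treat this, and the function name, as illustrative assumptions rather than the papers' exact estimator:

```python
import numpy as np

def bias_corrected_dcov(X, Y):
    """Bias-corrected squared distance covariance, following the formula
    U_n = n^2/((n-1)(n-2)) * [V_n^2 - T_2/(n-1)], with T_2 assumed to be
    the product of the grand means of the distance matrices."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    n = len(X)
    a = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    b = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    v_n2 = (A * B).mean()           # plain (biased) V_n^2
    t2 = a.mean() * b.mean()        # assumed T_2: product of mean distances
    return n**2 / ((n - 1) * (n - 2)) * (v_n2 - t2 / (n - 1))
```

Unlike $V_n^2$, the corrected statistic can be slightly negative for independent samples, which is typical of unbiased estimators of a non-negative quantity.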

3. Relation to Brownian Motion

The “Brownian” in Brownian distance covariance refers to a stochastic-process interpretation: $V^2(X, Y)$ can be viewed as the squared covariance between $W(X)$ and $W'(Y)$, where $W$ and $W'$ are independent Brownian motions with covariance kernel $E[W(t)W(s)] = |t| + |s| - |t-s|$ (Székely et al., 2010). This viewpoint establishes that BdCov “sees” all deviations from independence, including nonmonotone nonlinearities, since Brownian motion has a full-rank expansion in function space (Székely et al., 2010).

4. Extensions Beyond Euclidean Data

Brownian distance covariance generalizes to any pair of metric spaces of strong negative type, such as separable Hilbert spaces, allowing its application to high-dimensional, functional, and even non-Euclidean data (Lyons, 2011, Székely et al., 2010). For functional data, the method applies to projections or truncated expansions, and with categorical variables, the distance matrices become indicator matrices on the simplex, reducing the method to analogues of squared-deviation statistics for contingency tables.

The BdCov machinery extends naturally to alternative weighting schemes and norms in the distance calculations, allowing emphasis on “signal” directions or downweighting of noise, and retains unbiasedness and consistency even in non-standard spaces (Székely et al., 2010).
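The generalization beyond Euclidean data amounts to swapping the metric in the double-centering recipe. As a minimal sketch (our own illustrative choice, not from the papers), a discrete 0/1 metric on category labels stands in for the indicator-matrix case mentioned above:

```python
import numpy as np

def dcov_with_metric(xs, ys, dx, dy):
    """Empirical V_n^2 computed from arbitrary pairwise metrics dx and dy."""
    a = np.array([[dx(xi, xj) for xj in xs] for xi in xs], dtype=float)
    b = np.array([[dy(yi, yj) for yj in ys] for yi in ys], dtype=float)
    # Same double-centering as in the Euclidean case
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()

def discrete(u, v):
    """0/1 metric on labels: distance matrices become indicator matrices."""
    return float(u != v)
```

For example, `dcov_with_metric(["a","a","b","b"], ["a","a","b","b"], discrete, discrete)` is strictly positive, while any constant label sequence yields exactly zero.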

5. Computational Aspects and Practical Considerations

Computation of the empirical statistic is $O(n^2(p+q))$ due to the pairwise distances, which can be limiting for large $n$. For univariate data, $O(n \log n)$ algorithms exist, and for high-dimensional settings, dimensionality reduction (e.g., via PCA or random projections) is recommended (Khoshgnauz, 2012). The bias in finite $n$ especially affects small-sample, high-dimensional applications such as genomics and motivates use of the unbiased estimator (Székely et al., 2010, Cope, 2010).

Permutation tests are recommended for independence hypotheses, leveraging the exchangeability of labels under the null. Principal components or clustering using the $p \times p$ distance correlation matrix may exhibit artifacts from small-sample bias; application of regularization or thresholding is advised (Cope, 2010).
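A minimal sketch of such a permutation test (helper names are ours): permuting the $Y$ sample simulates the null of independence, and the p-value is the fraction of permuted statistics at least as large as the observed one.

```python
import numpy as np

def dcov2(X, Y):
    """Empirical squared distance covariance V_n^2 (V-statistic form)."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    Y = np.asarray(Y, dtype=float).reshape(len(Y), -1)
    a = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    b = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
    A = a - a.mean(axis=0) - a.mean(axis=1)[:, None] + a.mean()
    B = b - b.mean(axis=0) - b.mean(axis=1)[:, None] + b.mean()
    return (A * B).mean()

def perm_test(X, Y, n_perm=200, seed=0):
    """Permutation p-value for H0: X and Y independent."""
    rng = np.random.default_rng(seed)
    observed = dcov2(X, Y)
    null = [dcov2(X, np.asarray(Y)[rng.permutation(len(Y))])
            for _ in range(n_perm)]
    # +1 correction so the p-value is never exactly zero
    return (1 + sum(s >= observed for s in null)) / (n_perm + 1)
```

With a strongly nonlinear dependence such as $Y = X^2$ plus small noise, the observed statistic typically exceeds every permuted value, giving a p-value near $1/(n_{\text{perm}}+1)$.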

6. Connections with Kernel, Energy, and Other Independence Measures

BdCov is closely related to energy distances and kernel-based dependence statistics. Its weighting kernel is related via Bochner’s theorem to reproducing kernel Hilbert space (RKHS) embeddings, and the Hilbert–Schmidt independence criterion (HSIC) is a special case with suitable kernel choice (Gretton et al., 2010). The form of BdCov enables extension to arbitrary domains (strings, graphs, groups) where a metric is available.

HSIC has some computational and power advantages, particularly at small sample sizes with well-chosen characteristic kernels. Both measures are consistent against all alternatives and have V-statistic-type estimators (Gretton et al., 2010).
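For comparison, a sketch of the standard biased (V-statistic) HSIC estimator with Gaussian kernels; the bandwidths here are arbitrary illustrative defaults, not tuned choices:

```python
import numpy as np

def gaussian_gram(X, sigma=1.0):
    """Gram matrix K_ij = exp(-|x_i - x_j|^2 / (2 sigma^2))."""
    X = np.asarray(X, dtype=float).reshape(len(X), -1)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq / (2 * sigma**2))

def hsic(X, Y, sigma_x=1.0, sigma_y=1.0):
    """Biased HSIC estimator: (1/n^2) tr(K H L H), H the centering matrix."""
    n = len(X)
    K = gaussian_gram(X, sigma_x)
    L = gaussian_gram(Y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n
    return float(np.trace(K @ H @ L @ H)) / n**2
```

Like $V_n^2$, this estimator is non-negative and vanishes asymptotically only under independence (for characteristic kernels); the choice of `sigma` plays the role that the Brownian weight plays for BdCov.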

7. Applications and Recent Developments

Brownian distance covariance has been used in testing independence, model diagnostics, structure learning in Markov networks, and as a pooling layer in deep neural networks for few-shot classification (Xie et al., 2022, Khoshgnauz, 2012). For instance, DeepBDC constructs a layer implementing the empirical BdCov matrix in high-dimensional embedding spaces, enabling plug-and-play nonparametric dependency measures in deep models (Xie et al., 2022).

Applied examples include detection of nonmonotone and nonlinear associations in genomics, ecological, and socio-economic data, with empirical demonstrations showing sensitivity to dependencies missed by linear correlation (Székely et al., 2010, Székely et al., 2010).

Extensions under active research include adaptations to mutual independence among more than two variables, high-dimensional consistency, fast approximations, and relaxation of metric and moment conditions (Lyons, 2011, Székely et al., 2010).


References:

  • (Székely et al., 2010) G. J. Székely and M. L. Rizzo, "Brownian distance covariance," Ann. Appl. Statist. 3(4), 1236–1265 (2009).
  • (Székely et al., 2010) G. J. Székely and M. L. Rizzo, "Rejoinder: Brownian distance covariance."
  • (Gretton et al., 2010) Gretton et al., "Discussion of: Brownian distance covariance."
  • (Cope, 2010) Leslie Cope, "Discussion of: Brownian distance covariance."
  • (Lyons, 2011) R. Lyons, "Distance covariance in metric spaces."
  • (Khoshgnauz, 2012) Y. Luo, "Learning Markov Network Structure using Brownian Distance Covariance."
  • (Xie et al., 2022) J. Xie et al., "Joint Distribution Matters: Deep Brownian Distance Covariance for Few-Shot Classification."
