
Azadkia-Chatterjee Coefficient

Updated 9 December 2025
  • The Azadkia–Chatterjee coefficient is a nonparametric, rank-based measure defined via conditional probability variance that ranges from 0 (independence) to 1 (functional dependence).
  • It employs graph-based estimators using nearest neighbor ranks to achieve strong consistency, parametric rates, and established asymptotic normality in both marginal and conditional settings.
  • Its extensions include multivariate responses and scale-invariant variants, making it central for independence testing, graphical models, and model-free variable selection.

The Azadkia–Chatterjee coefficient is a nonparametric, rank-based measure of directed dependence between a vector-valued predictor and a univariate or multivariate response, defined at the population level via the variance of conditional probabilities and estimated using nearest-neighbor graphs. It features an interpretable scale—zero under independence and one under functional dependence—and a graph-based empirical estimator that admits parametric rates, strong consistency, bandwidth-free implementation, and central limit theorems in both marginal and conditional versions. Multivariate extensions, scale-invariant variants, and connections to broader classes of geometric graph and kernel-based dependence measures position the coefficient as a central object for independence testing, graphical models, and model-free variable selection.

1. Definition and Fundamental Properties

Let $(X,Y)$ be jointly distributed random elements with $X \in \mathbb{R}^d$ and $Y$ either univariate or a vector in $\mathbb{R}^q$. The Azadkia–Chatterjee (AC) coefficient for $Y$ on $X$ is defined by

$$\xi(Y,X) \;=\; \frac{\int_{\mathbb{R}} \operatorname{Var}\big( P(Y \ge y \mid X) \big)\, dP^Y(y)}{\int_{\mathbb{R}} \operatorname{Var}\big( \mathbf{1}\{Y \ge y\} \big)\, dP^Y(y)} \;\in\; [0,1].$$

An equivalent form based on the cumulative distribution of $Y$ yields, for continuous $F_Y$ (for which the denominator above equals $1/6$),
$$\xi(Y,X) \;=\; 6 \int_{\mathbb{R}} \mathbb{E}\big[\, P(Y \ge y \mid X)^2 \,\big]\, dP^Y(y) \;-\; 2.$$
Characterizing properties:
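
Both boundary cases of the definition can be checked numerically. The sketch below (illustrative code, not from the cited papers; the function name `xi_population_mc` is hypothetical) evaluates the population ratio by Monte Carlo for a perfectly functional pair $Y = X$ and an independent pair, recovering $\xi = 1$ and $\xi = 0$ respectively.

```python
import numpy as np

def xi_population_mc(cond_prob, y_draws, n_grid=500, rng=None):
    """Monte Carlo evaluation of the population ratio
        xi = int Var(P(Y>=y|X)) dP^Y(y) / int Var(1{Y>=y}) dP^Y(y).
    `cond_prob(y)` returns a sample of P(Y>=y | X) over draws of X;
    `y_draws` is a sample from the marginal law of Y."""
    if rng is None:
        rng = np.random.default_rng(0)
    grid = rng.choice(y_draws, size=n_grid, replace=False)
    num = sum(cond_prob(y).var() for y in grid)
    den = sum((y_draws >= y).astype(float).var() for y in grid)
    return num / den

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, 2000)

# Functional dependence Y = X:  P(Y>=y | X) = 1{X >= y}  ->  xi = 1
xi_dep = xi_population_mc(lambda y: (x >= y).astype(float), x)

# Independence, Y ~ U(0,1):  P(Y>=y | X) = 1-y is constant  ->  xi = 0
y_ind = rng.uniform(0.0, 1.0, 2000)
xi_ind = xi_population_mc(lambda y: np.full_like(x, 1.0 - y), y_ind)
```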

  • $\xi(Y,X) = 0$ if and only if $X$ and $Y$ are independent.
  • $\xi(Y,X) = 1$ if and only if $Y$ is almost surely a measurable function of $X$.

The definition is directional and scale-invariant: strictly increasing transformations of $Y$ and bijections of $X$ preserve $\xi$ (Ansari et al., 14 Mar 2025; Ansari et al., 2022). For conditional dependence, let $(X,Y,Z)$ be jointly distributed and set

$$\xi \;=\; \frac{\int \operatorname{Var}\!\big( \mathbb{E}[\, \mathbf{1}\{Y \ge y\} \mid X, Z \,] \;\big|\; X \big)\, dF_Y(y)}{\int \operatorname{Var}\!\big( \mathbf{1}\{Y \ge y\} \;\big|\; X \big)\, dF_Y(y)}.$$

Here $\xi = 0$ if and only if $Y \perp Z \mid X$, and $\xi = 1$ if and only if $Y$ is almost surely a function of $(Z,X)$ given $X$ (Shi et al., 2021; Huang et al., 2020).

2. Graph-Based and Rank-Based Estimator Construction

For i.i.d. data $\{(X_i, Y_i)\}_{i=1}^n$, construct the following graph-based estimator:

  • Compute the univariate ranks $R_i = \#\{j : Y_j \le Y_i\}$.
  • Let $N(i) = \arg\min_{j \ne i} \|X_j - X_i\|$ denote the index of the nearest neighbor of $X_i$.
  • The empirical AC coefficient is

$$\xi_n \;=\; \frac{6}{n^2 - 1} \sum_{i=1}^n \min\big(R_i, R_{N(i)}\big) \;-\; \frac{2n+1}{n-1}.$$

This estimator generalizes Chatterjee's original proposal to multivariate covariates $X$ by utilizing nearest-neighbor graphs in $\mathbb{R}^d$ (Lin et al., 2022).
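
A minimal implementation of this estimator, assuming continuous $Y$ (so that ranks are tie-free) and using a kd-tree for the nearest-neighbor search, might look as follows; the function name `ac_xi` is illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import rankdata

def ac_xi(X, Y):
    """Empirical Azadkia-Chatterjee coefficient xi_n of Y on X.
    Minimal sketch: assumes continuous Y (no ties) and distinct rows of X."""
    Y = np.asarray(Y, dtype=float)
    n = len(Y)
    X = np.asarray(X, dtype=float).reshape(n, -1)
    R = rankdata(Y, method="ordinal")        # R_i = #{j : Y_j <= Y_i}
    # N(i): nearest neighbour of X_i among the other points; the first
    # hit of the k=2 query is the point itself (distance zero).
    _, idx = cKDTree(X).query(X, k=2)
    N = idx[:, 1]
    return 6.0 / (n**2 - 1) * np.minimum(R, R[N]).sum() - (2.0 * n + 1) / (n - 1)
```

On a functionally dependent sample the statistic approaches one, while on an independent sample it fluctuates around zero at the $n^{-1/2}$ scale.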

Multivariate response: for $Y \in \mathbb{R}^q$, a "chain rule" or copula-based construction is used (Ansari et al., 2022; Huang et al., 8 Dec 2025):
$$T(Y \mid X) \;=\; \frac{\sum_{i=1}^q \big( \xi(Y_i \mid (X, Y_{<i})) - \xi(Y_i \mid Y_{<i}) \big)}{\sum_{i=1}^q \big( 1 - \xi(Y_i \mid Y_{<i}) \big)}.$$
The quantity $T(Y \mid X)$ lies in $[0,1]$, reduces to $\xi$ for $q = 1$, and can be estimated strongly consistently by applying graph-based estimators to each univariate constituent.
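
The chain rule can be implemented generically on top of any univariate estimator. In the sketch below, `xi_hat(y, Z)` is a placeholder for such an estimator of $\xi(Y_i \mid Z)$ (the function names are hypothetical), and the convention $\xi(Y_i \mid \emptyset) = 0$ is used for the empty conditioning set:

```python
import numpy as np

def chain_rule_T(xi_hat, X, Y):
    """Aggregate a univariate coefficient estimator xi_hat(y, Z) into the
    chain-rule coefficient T(Y|X) for a response Y in R^q.  Sketch only:
    xi_hat is any estimator of xi(y | Z); xi(y | empty set) is taken as 0."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float).reshape(len(X), -1)
    num = den = 0.0
    for i in range(Y.shape[1]):
        prev = Y[:, :i]                                   # Y_{<i}
        xi_prev = xi_hat(Y[:, i], prev) if i > 0 else 0.0
        xi_full = xi_hat(Y[:, i], np.column_stack([X, prev]))
        num += xi_full - xi_prev
        den += 1.0 - xi_prev
    return num / den
```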

Scale invariance: the standard estimator is not invariant to affine changes in $X$; a fully scale-invariant version applies coordinatewise rank transforms to $X$ before constructing the nearest-neighbor graph (Tran et al., 3 Dec 2024).
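
The coordinatewise rank transform itself is a one-liner; the sketch below (the name `rank_transform` is illustrative, and no ties within a column are assumed) produces transformed covariates that can be fed to any nearest-neighbor-graph construction:

```python
import numpy as np

def rank_transform(X):
    """Replace each column of X by its ranks 1..n (assumes no ties),
    making any subsequent nearest-neighbour-graph statistic invariant to
    strictly increasing transformations of the individual covariates."""
    X = np.asarray(X, dtype=float)
    # double argsort yields, per column, the rank of each entry
    return X.argsort(axis=0).argsort(axis=0) + 1.0
```

Because ranks are preserved by monotone maps, `rank_transform(X)` and `rank_transform(np.exp(X))` coincide.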

3. Distributional Properties and Limit Theory

Asymptotic Normality and Variance Bounds

The central limit theorem holds under broad conditions. For i.i.d. draws from a continuous law,
$$\frac{\xi_n - \mathbb{E}[\xi_n]}{\sqrt{\operatorname{Var}[\xi_n]}} \;\xrightarrow{d}\; N(0,1)$$
whenever $Y$ is not a measurable function of $X$ (Lin et al., 2022). The rescaled variance $V_n = n \, \operatorname{Var}[\xi_n]$ satisfies
$$0 \;<\; \liminf_n V_n \;\le\; \limsup_n V_n \;\le\; 36,$$
and, under absolute continuity of $F_X$, a sharper bound with explicit dimension-dependent constants holds.

When $X \perp Y$, $\sqrt{n}\,\xi_n \xrightarrow{d} N\!\big(0,\; \tfrac{2}{5} + \tfrac{2}{5}\kappa_q + \tfrac{4}{5}\kappa_o\big)$, where $\kappa_q$ and $\kappa_o$ are linked to the geometry of the nearest-neighbor graph in $\mathbb{R}^d$ (Lin et al., 2022; Han et al., 2022). Under manifold support, the limiting variance depends solely on the intrinsic dimension.

A consistent explicit estimator of the variance is available, allowing for valid inference (Lin et al., 2022).

Symmetric and Conditional Extensions

A symmetrized version, $\max\{\xi_n(X,Y),\, \xi_n(Y,X)\}$, allows construction of two-sided tests; its limit law under independence is skew-normal with explicit variance (Zhang, 2022).

The conditional AC coefficient admits an empirical estimator with parallel asymptotics; under the null of conditional independence, $\sqrt{n}\,\xi_n$ is asymptotically normal with variance determined by the dimensions of the variables and graph-count statistics (Shi et al., 2021).

Continuity Considerations

Unlike classical measures (Spearman's rho, Kendall's tau), $\xi$ is not weakly continuous under distributional convergence. Instead, it is continuous with respect to convergence of Markov products (pairs of conditionally i.i.d. copies) under additional marginal quantile convergence or specific copula convergence. Common model families (elliptical distributions, Archimedean copulas, noise models) satisfy the required continuity conditions, so stable large-sample inference is possible within these classes (Ansari et al., 14 Mar 2025).

4. Algorithmic and Computational Aspects

  • Nearest-neighbor graph construction can be done in $O(n \log n)$ time (exact kd-tree search for small $d$; approximate nearest-neighbor methods for larger $d$; brute force, at $O(n^2)$, remains viable for small samples).
  • Rank computations for $Y$ (and optionally for $X$ in the scale-invariant version) cost $O(n \log n)$ per coordinate.
  • Multivariate response: efficient merge-sort or divide-and-conquer algorithms exist for blockwise rank counts, with time complexity $O(n (\log n)^q)$ (Huang et al., 8 Dec 2025).
  • For each observation ii, nearest-neighbor search and rank calculations admit nearly linear scaling, enabling use in large datasets.

5. Connections to Broader Dependence Measures

The AC coefficient is a specific instance within the family of graph–RKHS–OT dependency measures (Deb et al., 2020, Deb et al., 20 Nov 2024):

  • Population level: for sufficiently rich kernels (e.g., the min kernel on $[0,1]$, or the indicator-integral kernel), the corresponding normalized conditional MMD directly recovers $\xi$.
  • Sample level: The estimator is a geometric graph functional over empirical OT ranks.
  • Distribution-free: Under the null of independence, the law of the AC coefficient (when computed using empirical OT ranks and graph structure) is exactly permutation invariant, enabling finite-sample calibration for independence tests.
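
Such finite-sample calibration can be sketched generically for any dependence statistic; in the code below (the name `permutation_pvalue` is illustrative), `stat` stands for an arbitrary estimator, for instance an empirical AC coefficient, and the `+1` correction yields exact validity under the null:

```python
import numpy as np

def permutation_pvalue(stat, X, Y, n_perm=999, seed=0):
    """Finite-sample permutation p-value for the null `X independent of Y`:
    permuting Y leaves the null law of stat(X, Y) unchanged, so the rank of
    the observed statistic among permuted replicates calibrates the test."""
    rng = np.random.default_rng(seed)
    t_obs = stat(X, Y)
    t_perm = np.array([stat(X, rng.permutation(Y)) for _ in range(n_perm)])
    # +1 correction: exact level under the null at finite n_perm
    return (1.0 + np.sum(t_perm >= t_obs)) / (n_perm + 1.0)
```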

Multivariate extensions (both in predictors and responses) and conditional variants fit naturally into this graph–kernel framework, relating directly to kernel partial correlation (Huang et al., 2020), distance multivariance, and more general measures indexed by RKHS (Deb et al., 20 Nov 2024).

6. Practical Application Domains

Independence and Conditional Independence Testing

The AC coefficient and its conditional extension are used for:

  • Testing independence in arbitrary dimensions (direct, distribution-free under the null, with consistent critical values).
  • Conditional independence testing, e.g., through graph-based statistics evaluated with (conditional) randomization tests (Shi et al., 2021). However, these are known to exhibit low local power against contiguous local alternatives unless the nearest-neighbor graph is appropriately generalized or replaced with kk-NN approaches.

Graphical Model Structure Learning

Pairwise conditional AC coefficients are used as entries in adjacency matrices for learning undirected graphs representing conditional independence relationships in high dimensions, outperforming standard penalized Gaussian graphical model approaches in various regimes (Furmańczyk, 2023).
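
Schematically, the structure-learning step reduces to thresholding a matrix of pairwise coefficients. In the sketch below (illustrative, not the cited procedure), `cond_xi(i, j)` is a placeholder for any estimator of the conditional coefficient between variables $i$ and $j$ given the remaining ones, and the threshold is a user choice:

```python
import numpy as np

def dependence_graph(cond_xi, p, threshold=0.1):
    """Build an undirected graph whose edge (i, j) is present when the
    estimated conditional dependence between variables i and j given the
    rest exceeds a threshold.  `cond_xi(i, j)` is a placeholder estimator."""
    A = np.zeros((p, p), dtype=bool)
    for i in range(p):
        for j in range(i + 1, p):
            # symmetrize, since the coefficient is directional
            val = max(cond_xi(i, j), cond_xi(j, i))
            A[i, j] = A[j, i] = val > threshold
    return A
```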

Model-Free Feature Selection and Network Analysis

The multivariate $T$ extension and its estimator enable model-free feature selection with vector-valued responses and dependence-based network analysis.

7. Theoretical Limitations and Open Problems

  • Under local parametric or minimax-detection boundary alternatives, the standard 1-NN estimator is asymptotically powerless unless graph construction is strengthened (increasing kk with nn) (Shi et al., 2021).
  • Weak continuity of ξ\xi fails under convergence in law, but holds under stricter Markov-product and copula-derivative types of convergence, implying care is needed in statistical inference (Ansari et al., 14 Mar 2025).
  • In practical high-dimensional settings, the curse of dimensionality in nearest-neighbor search may be partially circumvented due to intrinsic dimension adaptivity, but further analysis on computational–statistical tradeoffs remains ongoing (Han et al., 2022).
