
A direct extension of Azadkia & Chatterjee's rank correlation to multi-response vectors (2212.01621v4)

Published 3 Dec 2022 in math.ST, stat.ME, and stat.TH

Abstract: Recently, Chatterjee (2023) recognized the lack of a direct generalization of his rank correlation $\xi$ in Azadkia and Chatterjee (2021) to a multi-dimensional response vector. As a natural solution to this problem, we here propose an extension of $\xi$ that is applicable to a set of $q \geq 1$ response variables, where our approach builds upon converting the original vector-valued problem into a univariate problem and then applying the rank correlation $\xi$ to it. Our novel measure $T$ quantifies the scale-invariant extent of functional dependence of a response vector $\mathbf{Y} = (Y_1,\dots,Y_q)$ on predictor variables $\mathbf{X} = (X_1, \dots,X_p)$, characterizes independence of $\mathbf{X}$ and $\mathbf{Y}$ as well as perfect dependence of $\mathbf{Y}$ on $\mathbf{X}$ and hence fulfills all the characteristics of a measure of predictability. Aiming at maximum interpretability, we provide various invariance results for $T$ as well as a closed-form expression in multivariate normal models. Building upon the graph-based estimator for $\xi$ in Azadkia and Chatterjee (2021), we obtain a non-parametric, strongly consistent estimator for $T$ and show -- as a main contribution -- its asymptotic normality. Based on this estimator, we develop a model-free and rank-based feature ranking and forward feature selection for multiple-outcome data that works without any tuning parameters. Simulation results and real case studies illustrate $T$'s broad applicability.
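The reduction described in the abstract (turn the vector-valued problem into univariate ones, then apply the rank-based coefficient to each) can be illustrated with a small NumPy sketch. It rests on two assumptions that are not stated in the abstract and should be checked against the paper. First, the single-response building block is taken to be the nearest-neighbour estimator of Azadkia and Chatterjee (2021), $T_n = \sum_i \big(n \min(R_i, R_{N(i)}) - L_i^2\big) / \sum_i L_i (n - L_i)$, where $R_i$ and $L_i$ count the responses at most, respectively at least, as large as $Y_i$, and $N(i)$ is the index of the nearest neighbour of $\mathbf{X}_i$. Second, the univariate terms are assumed to combine as $T(\mathbf{Y}|\mathbf{X}) = 1 - \frac{q - \sum_{i=1}^{q} T(Y_i | (\mathbf{X}, Y_{i-1}, \dots, Y_1))}{q - \sum_{i=2}^{q} T(Y_i | (Y_{i-1}, \dots, Y_1))}$. The function names `ac_coefficient` and `multi_response_T` are hypothetical, and the brute-force neighbour search is chosen for brevity, not efficiency.

```python
import numpy as np


def ac_coefficient(y, x):
    """Nearest-neighbour estimate of T(y | x) for a scalar response y.

    Assumption: continuous data, so ties in y and in the neighbour
    distances are ignored (argmin simply takes the first minimum).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    x = np.asarray(x, dtype=float).reshape(n, -1)

    # Brute-force squared Euclidean distances; a point is never its own neighbour.
    d = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d, np.inf)
    nn = d.argmin(axis=1)  # N(i): index of the nearest neighbour of x_i

    r = (y[None, :] <= y[:, None]).sum(axis=1)  # R_i = #{j : y_j <= y_i}
    l = (y[None, :] >= y[:, None]).sum(axis=1)  # L_i = #{j : y_j >= y_i}

    num = (n * np.minimum(r, r[nn]) - l ** 2).sum()
    den = (l * (n - l)).sum()
    return num / den


def multi_response_T(Y, X):
    """Combine univariate coefficients into one value for a q-dimensional Y.

    Assumption: uses the combination formula quoted in the lead-in.
    """
    n = len(Y)
    Y = np.asarray(Y, dtype=float).reshape(n, -1)
    X = np.asarray(X, dtype=float).reshape(n, -1)
    q = Y.shape[1]

    # Each Y_i is predicted from X together with the earlier responses.
    with_x = sum(
        ac_coefficient(Y[:, i], np.hstack([X, Y[:, :i]])) for i in range(q)
    )
    # Baseline: each Y_i (i >= 2) is predicted from the earlier responses only.
    without_x = sum(ac_coefficient(Y[:, i], Y[:, :i]) for i in range(1, q))

    return 1.0 - (q - with_x) / (q - without_x)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 2))
    Y_dep = np.column_stack([X[:, 0] + X[:, 1], X[:, 0] - X[:, 1]])
    Y_ind = rng.normal(size=(400, 2))
    print("functional dependence:", multi_response_T(Y_dep, X))
    print("independence:         ", multi_response_T(Y_ind, X))
```

For $q = 1$ the baseline sum is empty and the sketch reduces to the single-response coefficient. In finite samples the value for a perfectly dependent response stays somewhat below 1 and the value under independence fluctuates around 0, so the printed numbers only approximate the population values.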
