
Comparison of Spearman's rho and Kendall's tau in Normal and Contaminated Normal Models (1011.2009v1)

Published 9 Nov 2010 in cs.IT and math.IT

Abstract: This paper analyzes the performance of Spearman's rho (SR) and Kendall's tau (KT) with respect to samples drawn from bivariate normal and bivariate contaminated normal populations. The exact analytical formulae of the variance of SR and the covariance between SR and KT are obtained based on Childs's reduction formula for the quadrivariate normal positive orthant probabilities. Closed-form expressions for the expectations of SR and KT are established under the bivariate contaminated normal models. The bias, mean square error (MSE) and asymptotic relative efficiency (ARE) of the three estimators based on SR and KT, relative to Pearson's product moment correlation coefficient (PPMCC), are investigated in both the normal and contaminated normal models. Theoretical and simulation results suggest that, contrary to the opinion of equivalence between SR and KT in some literature, the behaviors of SR and KT are strikingly different in terms of bias, variance, mean square error, and asymptotic relative efficiency. The new findings revealed in this work provide not only deeper insights into the two most widely used rank-based correlation coefficients, but also guidance for choosing which one to use in circumstances where the PPMCC fails to apply.

Citations (20)

Summary

  • The paper derives exact analytical formulas for the variance of Spearman’s rho and the covariance with Kendall’s tau, clarifying their differences.
  • The paper presents closed-form expressions for estimator bias, mean square error, and asymptotic relative efficiency, highlighting KT's lower variability than SR.
  • The paper demonstrates that both SR and KT remain robust in contaminated normal models, offering practical guidelines for non-Gaussian data analysis.

Analysis of "Comparison of Spearman's rho and Kendall's tau in Normal and Contaminated Normal Models"

This paper presents an in-depth comparative analysis of Spearman's rho (SR) and Kendall's tau (KT) under normal and contaminated normal models. It provides exact analytical formulae for the variance of SR and the covariance between SR and KT, obtained with Childs's reduction formula for quadrivariate normal positive orthant probabilities. It also establishes closed-form expressions for the expectations of SR and KT under bivariate contaminated normal models, which are then used to study the bias, mean square error (MSE), and asymptotic relative efficiency (ARE) of SR- and KT-based estimators relative to Pearson's product-moment correlation coefficient (PPMCC).

Analytical Framework and Results

The paper derives exact expressions for the variance of SR and for the covariance between SR and KT, and uses them to show that the two coefficients behave differently in both normal and contaminated normal models. This contradicts earlier literature that treated SR and KT as equivalent: the two differ notably in bias, variance, MSE, and ARE.

For normal models, the paper provides a computationally feasible expression for the variance of SR, improving upon previous imprecise approximations. The derived covariance between SR and KT further elucidates their interrelationship. The results demonstrate that SR exhibits more variability than KT, which aligns with simulation results.
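As a point of reference only, the variance gap between the two raw coefficients is already visible in the simplest case. Under independence (population correlation zero) and no ties, the textbook variances are Var(r_S) = 1/(n - 1) for Spearman's rho and Var(tau) = 2(2n + 5) / (9n(n - 1)) for Kendall's tau. The sketch below evaluates these two expressions; they are not the paper's general formulae, which cover arbitrary correlation, and the function names are hypothetical.

```python
def spearman_null_variance(n):
    """Variance of Spearman's rho under independence, no ties (textbook result)."""
    return 1.0 / (n - 1)


def kendall_null_variance(n):
    """Variance of Kendall's tau under independence, no ties (textbook result)."""
    return 2.0 * (2 * n + 5) / (9.0 * n * (n - 1))


# Compare the two variances for a few sample sizes.
for n in (10, 50, 200):
    ratio = spearman_null_variance(n) / kendall_null_variance(n)
    print(n, spearman_null_variance(n), kendall_null_variance(n), ratio)
```

The ratio of the two variances tends to 9/4 as n grows, so the raw SR value fluctuates more than the raw KT value in this special case. Note that this compares the coefficients themselves, not the correlation estimators built from them.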

Performance Under Contaminated Models

Under the contaminated normal model, which represents data containing outliers, SR and KT remain robust, whereas PPMCC is prone to distortion by the non-Gaussian component. The paper details how the expectations of SR and KT change with the level of contamination, providing guidance on their utility in real-world scenarios where data may not be perfectly Gaussian.
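To make the setting concrete, here is a minimal simulation sketch using only the Python standard library. It assumes a common form of the contaminated normal model: with probability 1 - eps a pair comes from a zero-mean, unit-variance bivariate normal with correlation rho, and with probability eps it comes from a zero-mean bivariate normal with standard deviation scale and correlation rho_c. The parameter names, the default values, and this exact parameterization are illustrative assumptions, not taken from the paper.

```python
import math
import random


def sample_contaminated_normal(n, rho, eps, scale, rho_c, rng):
    """Draw n (x, y) pairs from the assumed two-component mixture."""
    xs, ys = [], []
    for _ in range(n):
        outlier = rng.random() < eps          # pick the mixture component
        r = rho_c if outlier else rho
        s = scale if outlier else 1.0
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(s * z1)
        ys.append(s * (r * z1 + math.sqrt(1.0 - r * r) * z2))
    return xs, ys


def pearson(xs, ys):
    """Pearson's product-moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Ranks 1..n; ties are not handled (continuous data assumed)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


def kendall(xs, ys):
    """Kendall's tau by direct O(n^2) pair counting (no ties assumed)."""
    n, s = len(xs), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (xs[i] - xs[j]) * (ys[i] - ys[j])
            s += (prod > 0) - (prod < 0)      # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))


rng = random.Random(0)
xs, ys = sample_contaminated_normal(300, rho=0.8, eps=0.05, scale=10.0, rho_c=-0.8, rng=rng)
print("Pearson :", pearson(xs, ys))
print("Spearman:", spearman(xs, ys))
print("Kendall :", kendall(xs, ys))
```

Varying eps and scale and repeating the draw gives a rough feel for how each coefficient responds to contamination; the paper's closed-form expectations describe this behavior analytically.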

Estimator Characteristics

The paper evaluates three estimators of the population correlation coefficient derived from SR and KT: $\hat{\rho}_S$, $\hat{\rho}_K$, and $\hat{\rho}_M$ (a mixture estimator). Performance metrics such as bias and variance are analytically established, indicating that while none are perfectly unbiased for small samples, $\hat{\rho}_M$ offers improved bias characteristics. Asymptotically, KT shows superior ARE when compared to SR, which provides a basis for selecting KT in large sample scenarios.
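A short sketch may help show how a rank coefficient is turned into an estimate of the population correlation. Under bivariate normality, the classical relationships are E[tau] = (2/pi) arcsin(rho) and, for large samples, E[r_S] ≈ (6/pi) arcsin(rho/2). Inverting them gives the two transforms below, which are assumed here to correspond to $\hat{\rho}_K$ and $\hat{\rho}_S$; the mixture estimator is not sketched because its definition is not given in this summary. The function names are hypothetical.

```python
import math


def rho_from_kendall(tau):
    """Invert E[tau] = (2/pi) * arcsin(rho), valid under bivariate normality."""
    return math.sin(math.pi * tau / 2.0)


def rho_from_spearman(r_s):
    """Invert the large-sample relation E[r_S] ~ (6/pi) * arcsin(rho / 2)."""
    return 2.0 * math.sin(math.pi * r_s / 6.0)


# Map sample coefficients (illustrative values) to correlation estimates.
print(rho_from_kendall(0.5))
print(rho_from_spearman(0.5))
```

For example, a Kendall's tau of 0.5 maps to sin(pi/4) ≈ 0.707. Because the Spearman relation holds only asymptotically, the corresponding estimator carries a finite-sample bias, which is consistent with the observation that none of the estimators is exactly unbiased for small samples.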

Implications for Practical Use

The findings serve as a significant resource for practitioners choosing between SR and KT in different contexts where PPMCC is not suitable. The paper's insights are particularly valuable for applications in social sciences and fields where data does not meet the assumptions of normality or contains outliers.

Conclusion

This paper offers comprehensive analytical and simulation-based comparisons of SR and KT. The results underscore the distinct characteristics of each coefficient and guide practitioners in selecting the appropriate correlation measure. Future work could extend these findings to other types of contamination and further refine the computational aspects of the derived expressions under different distributional assumptions.
