- The paper derives exact analytical formulas for the variance of Spearman's rho (SR) and its covariance with Kendall's tau (KT), clarifying how the two measures differ.
- The paper presents closed-form expressions for estimator bias, mean square error, and asymptotic relative efficiency, highlighting KT's lower variability than SR.
- The paper demonstrates that both SR and KT remain robust in contaminated normal models, offering practical guidelines for non-Gaussian data analysis.
Analysis of "Comparison of Spearman's rho and Kendall's tau in Normal and Contaminated Normal Models"
This paper presents an in-depth comparative analysis of Spearman's rho (SR) and Kendall's tau (KT) under normal and contaminated normal models. It provides exact analytical formulae for the variance of SR and for the covariance between SR and KT, using Childs's reduction formula for quadrivariate normal positive orthant probabilities. It further establishes closed-form expressions for the expectations of SR and KT under bivariate contaminated normal models, offering insight into their bias, mean square error (MSE), and asymptotic relative efficiency (ARE) relative to Pearson's product-moment correlation coefficient (PPMCC).
Analytical Framework and Results
The paper derives exact expressions for the variance of SR, the covariance between SR and KT, and the expectations of both statistics, highlighting their distinct behaviors under normal and contaminated normal distributions. These results correct a misconception in earlier literature, where SR and KT were treated as equivalent: the two measures differ notably in bias, variance, MSE, and ARE.
For normal models, the paper provides a computationally feasible expression for the variance of SR, improving upon previous imprecise approximations. The derived covariance between SR and KT further elucidates their interrelationship. The results demonstrate that SR exhibits more variability than KT, which aligns with simulation results.
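The variance comparison above can be checked informally by Monte Carlo. The sketch below is not the paper's derivation. It is a standard-library Python simulation whose helper names (`spearman_rho`, `kendall_tau`, `simulate_variances`) and settings (sample size, correlation, number of replications) are assumptions chosen here for illustration. It compares the raw sampling variances of the two coefficients under a standard bivariate normal model and assumes continuous data without ties.

```python
import math
import random
import statistics


def ranks(values):
    """Return 1-based ranks; assumes no ties (continuous data)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho via the rank-difference formula (valid without ties)."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def kendall_tau(x, y):
    """Kendall's tau by direct O(n^2) counting of concordant/discordant pairs."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


def bivariate_normal_sample(n, rho, rng):
    """Draw n pairs from a standard bivariate normal with correlation rho."""
    c = math.sqrt(1.0 - rho * rho)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + c * z2)
    return xs, ys


def simulate_variances(n=20, rho=0.5, reps=2000, seed=0):
    """Return (sample variance of SR, sample variance of KT) over reps draws."""
    rng = random.Random(seed)
    sr, kt = [], []
    for _ in range(reps):
        x, y = bivariate_normal_sample(n, rho, rng)
        sr.append(spearman_rho(x, y))
        kt.append(kendall_tau(x, y))
    return statistics.variance(sr), statistics.variance(kt)
```

Calling `simulate_variances()` returns the two empirical variances as a pair. Because SR and KT estimate different population quantities, a comparison of raw variances is only a first look; a like-for-like comparison would first map both statistics to the scale of the population correlation.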
Under the contaminated normal model, which accounts for data with outliers, SR and KT remain robust, whereas PPMCC is prone to distortion by the contaminating component. The paper details how the expectations of SR and KT change with the level of contamination, providing guidance on their utility in real-world scenarios where data may not be perfectly Gaussian.
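The robustness claim can be illustrated with a small simulation. The sketch below assumes one common form of contamination, a two-component mixture in which a fraction `eps` of the points comes from a bivariate normal with standard deviations inflated by `scale` and correlation `rho_contam`. These parameter names and default values are illustrative assumptions, and the paper's exact parametrization may differ. The rank statistics are mapped to the correlation scale with the classical bivariate-normal transforms 2 sin(π r_S / 6) and sin(π τ / 2).

```python
import math
import random
import statistics


def pearson(x, y):
    """Pearson's product-moment correlation coefficient."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Return 1-based ranks; assumes no ties (continuous data)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall's tau by direct O(n^2) pair counting (no ties assumed)."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


def contaminated_sample(n, rho, eps, scale, rho_contam, rng):
    """Draw n pairs; each is an outlier with probability eps (assumed model)."""
    xs, ys = [], []
    for _ in range(n):
        outlier = rng.random() < eps
        r = rho_contam if outlier else rho
        s = scale if outlier else 1.0
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(s * z1)
        ys.append(s * (r * z1 + math.sqrt(1.0 - r * r) * z2))
    return xs, ys


def mean_estimates(n=100, rho=0.8, eps=0.05, scale=10.0,
                   rho_contam=0.0, reps=200, seed=0):
    """Average each estimator of rho over reps contaminated samples."""
    rng = random.Random(seed)
    pp, sr, kt = [], [], []
    for _ in range(reps):
        x, y = contaminated_sample(n, rho, eps, scale, rho_contam, rng)
        pp.append(pearson(x, y))
        sr.append(2.0 * math.sin(math.pi * spearman_rho(x, y) / 6.0))
        kt.append(math.sin(math.pi * kendall_tau(x, y) / 2.0))
    return {
        "pearson": statistics.fmean(pp),
        "spearman": statistics.fmean(sr),
        "kendall": statistics.fmean(kt),
    }
```

Comparing the three averages returned by `mean_estimates()` with the clean-component correlation `rho` gives a rough picture of how far each estimator is pulled by the contaminating component under these assumed settings.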
Estimator Characteristics
The paper evaluates three estimators of the population correlation coefficient derived from SR and KT: ρ^S, ρ^K, and ρ^M (a mixture estimator). Bias and variance are established analytically, indicating that none of the estimators is exactly unbiased in small samples, although ρ^M offers reduced bias. Asymptotically, KT shows superior ARE compared with SR, which provides a basis for preferring KT in large-sample scenarios.
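To make the estimators concrete, the sketch below codes the classical bivariate-normal relationships E[τ] = (2/π) arcsin ρ and E[r_S] = 6/(π(n+1)) · [arcsin ρ + (n−2) arcsin(ρ/2)], together with the usual inversions that map each statistic back to the ρ scale. The function names are chosen here for illustration. The function `rho_hat_mixture` is an assumption: it is one natural way to combine SR and KT so that the arcsin ρ term cancels in expectation, and it may not match the paper's definition of ρ^M.

```python
import math


def expected_kendall(rho):
    """E[tau] under a bivariate normal with correlation rho."""
    return 2.0 / math.pi * math.asin(rho)


def expected_spearman(rho, n):
    """E[r_S] for sample size n under a bivariate normal with correlation rho."""
    return 6.0 / (math.pi * (n + 1)) * (
        math.asin(rho) + (n - 2) * math.asin(rho / 2.0)
    )


def rho_hat_s(r_s):
    """Estimator of rho from Spearman's rho (large-sample inversion)."""
    return 2.0 * math.sin(math.pi * r_s / 6.0)


def rho_hat_k(tau):
    """Estimator of rho from Kendall's tau."""
    return math.sin(math.pi * tau / 2.0)


def rho_hat_mixture(r_s, tau, n):
    """Hypothetical mixture estimator (an assumption, not taken from the paper).

    The linear combination below has expectation (6/pi) * asin(rho / 2)
    for every n > 2, so the finite-sample arcsin(rho) term in E[r_S] cancels.
    """
    combined = ((n + 1) * r_s - 3.0 * tau) / (n - 2)
    return 2.0 * math.sin(math.pi * combined / 6.0)
```

Plugging the expected values into the inversions shows where the small-sample bias of ρ^S comes from: `rho_hat_s(expected_spearman(rho, n))` does not return `rho` for finite `n`, whereas the combined statistic does. Since the sine transform is nonlinear, none of these estimators is exactly unbiased even so, which is consistent with the findings described above.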
Implications for Practical Use
The findings serve as a significant resource for practitioners choosing between SR and KT in different contexts where PPMCC is not suitable. The paper's insights are particularly valuable for applications in social sciences and fields where data does not meet the assumptions of normality or contains outliers.
Conclusion
This paper offers comprehensive analytical and simulation-based comparisons of SR and KT. The results underscore the distinct characteristics of each measure and guide practitioners in selecting an appropriate one. Future work could extend these findings to other types of contamination and refine the computation of the derived expressions under different distributional assumptions. The methods and results form a useful reference for understanding rank-based correlation measures.