Smoothed Wilcoxon Rank Scores

Updated 19 November 2025

Smoothed Wilcoxon Rank Scores are nonparametric estimators that replace discrete rank indicators with kernel-smoothed functions to yield continuous, tie-robust statistics.
They enhance traditional Wilcoxon procedures by improving efficiency in correlation estimation and hypothesis testing under monotone, non-Gaussian associations.
Practical implementation hinges on appropriate kernel and bandwidth choices, which govern asymptotic normality and the accuracy of p-value approximations in small samples.

Smoothed Wilcoxon rank scores are a family of nonparametric statistics and estimators in which the discrete rank indicators of classical Wilcoxon-type tests are replaced with smooth (kernel-based) functions of the data. The resulting statistics are continuous in the data, have null means and variances that are free of the parent distribution to leading order, and offer practical benefits in handling ties and in efficiency under monotone but non-Gaussian associations. The method has been developed in several directions, including robust correlation estimation, one-sample and two-sample location inference, and hypothesis testing, where it closely approximates the classical signed-rank and rank-sum procedures (Tasdan et al., 12 Nov 2025, Maesono et al., 2016, Moriyama et al., 2017).

1. Smoothed Empirical Cumulative Distribution Functions and Kernelifying Ranks

The core step is the replacement of the empirical cumulative distribution function (ecdf) with a smoothed or “kernelized” ecdf. The classical ecdf for a sample $\{X_j\}_{j=1}^n$ is

$F_n(x)=\frac{1}{n}\sum_{j=1}^n\mathbf{1}\{X_j\leq x\}.$

The smoothed version substitutes the indicator with a continuous cumulative distribution function (CDF) $H$, typically a kernel CDF such as the standard normal: $\widetilde F_n(x)=\frac{1}{n}\sum_{j=1}^n H\left(\frac{x - X_j}{h}\right),$ with bandwidth $h=h_n>0$ satisfying $h_n\to 0$, $n h_n\to\infty$, and $n h_n^4\to 0$ as $n\to\infty$. For each sample point $X_i$, the smoothed rank is then

$\widetilde R_i = n\,\widetilde F_n(X_i) = \sum_{j=1}^n H\left(\frac{X_i - X_j}{h}\right).$

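Letting $h\to 0$ recovers the classical ranks: each term $H\left((X_i - X_j)/h\right)$ with $X_j\neq X_i$ tends to the indicator $\mathbf{1}\{X_j < X_i\}$, while the self-term contributes $H(0)$, which equals $1/2$ for a symmetric kernel, so in the absence of ties $\widetilde R_i$ tends to the classical rank minus $1/2$. Thus, the smoothing operation produces real-valued, tie-robust ranks that approach classical ranks, up to this constant offset, in the limit (Tasdan et al., 12 Nov 2025).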
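The smoothed-rank formula can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the standard normal CDF for $H$, takes the bandwidth `h` as a user-supplied argument, and uses only the standard library; the function name `smoothed_ranks` and the sample values are hypothetical.

```python
from statistics import NormalDist

# Assumption: the kernel CDF H is the standard normal CDF.
_H = NormalDist().cdf


def smoothed_ranks(x, h):
    """Return R~_i = sum_j H((x_i - x_j) / h) for every x_i in the sample x."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    return [sum(_H((xi - xj) / h) for xj in x) for xi in x]


sample = [2.3, -0.7, 1.1, 4.0, 0.2]   # illustrative data, not from the source
print(smoothed_ranks(sample, h=0.5))   # real-valued smoothed ranks
print(smoothed_ranks(sample, h=1e-6))  # near-zero bandwidth: classical rank minus 1/2
```

Because $H(u)+H(-u)=1$ for a symmetric kernel, the smoothed ranks of any sample of size $n$ sum to $n^2/2$, which gives a quick sanity check on an implementation.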

2. Construction of Smoothed Wilcoxon Rank Scores and Correlation Estimators

The Wilcoxon linear score function for rank $r\in\{1,\dots,n\}$ is

$a(r) = \sqrt{12}\left(\frac{r}{n+1} - \frac{1}{2}\right),$

with analogous extension to the smoothed case: $a(\widetilde R_i) = \sqrt{12}\left(\frac{\widetilde R_i}{n+1} - \frac{1}{2}\right).$ These scores are used to build generalized inner-product statistics. For estimating rank correlations, the smoothed Wilcoxon correlation estimator is

$\widehat\rho_{sa} = \frac{1}{s_a} \sum_{i=1}^n a(\widetilde R_i^X)a(\widetilde R_i^Y), \qquad s_a = \frac{n(n-1)}{n+1},$

which, after algebraic manipulation, is equivalent to the classical Spearman correlation but evaluated on smoothed ranks: $\widehat\rho_{sa} = \frac{\sum_{i=1}^n (\widetilde R_i^X - \frac{n+1}{2})(\widetilde R_i^Y - \frac{n+1}{2})}{n(n^2-1)/12}.$ This approach can be interpreted as a "smoothed Spearman-type estimator" or a continuous extension of Wilcoxon’s statistic, handling ties and preserving the nonparametric spirit (Tasdan et al., 12 Nov 2025).
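The equivalence of the two forms of $\widehat\rho_{sa}$ can be checked numerically. The sketch below assumes a standard normal kernel CDF and separate, user-chosen bandwidths `hx` and `hy` for the two margins (the source does not specify how they are chosen); all function names are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist

_H = NormalDist().cdf  # assumption: standard normal kernel CDF


def smoothed_ranks(x, h):
    """R~_i = sum_j H((x_i - x_j) / h)."""
    return [sum(_H((xi - xj) / h) for xj in x) for xi in x]


def wilcoxon_score(r, n):
    """Wilcoxon linear score a(r) = sqrt(12) * (r / (n + 1) - 1/2)."""
    return sqrt(12.0) * (r / (n + 1) - 0.5)


def rho_score_form(x, y, hx, hy):
    """Inner product of smoothed Wilcoxon scores, divided by s_a = n(n-1)/(n+1)."""
    n = len(x)
    rx, ry = smoothed_ranks(x, hx), smoothed_ranks(y, hy)
    s_a = n * (n - 1) / (n + 1)
    return sum(wilcoxon_score(a, n) * wilcoxon_score(b, n)
               for a, b in zip(rx, ry)) / s_a


def rho_spearman_form(x, y, hx, hy):
    """Spearman-type formula evaluated on the smoothed ranks."""
    n = len(x)
    rx, ry = smoothed_ranks(x, hx), smoothed_ranks(y, hy)
    c = (n + 1) / 2
    return sum((a - c) * (b - c) for a, b in zip(rx, ry)) / (n * (n * n - 1) / 12)


x = [0.1, 0.9, 1.7, 2.2, 3.8, 4.1]   # illustrative data
y = [exp(v) for v in x]              # monotone but nonlinear in x
print(rho_score_form(x, y, 0.5, 1.0), rho_spearman_form(x, y, 0.5, 1.0))
```

The two printed values agree up to floating-point rounding, since $\sum_{r=1}^n a(r)^2 = n(n-1)/(n+1)$ and $a(\cdot)$ is linear in the rank.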

3. Smoothed Wilcoxon-Type Tests for One-Sample and Two-Sample Problems

In the one-sample signed-rank scenario, the smoothed Wilcoxon statistic for a sample $\{X_i\}$ symmetric about $0$ is

$W_n^* = \sum_{1\le i<j\le n} K\left(\frac{X_i+X_j}{2h_n}\right) + \frac{1}{2}\sum_{i=1}^n K\left(\frac{2X_i}{2h_n}\right),$

where $K(u)$ is a kernel CDF. Under the null hypothesis, the mean and variance match those of the classical statistic up to $O(n h_n^2)$, and the leading-order terms do not depend on the parent distribution (Maesono et al., 2016).

For two-sample inference, the discrete sum in the Wilcoxon rank-sum statistic

$W_2 = \sum_{i=1}^m\sum_{j=1}^n \mathbf{1}\{Y_j > X_i\}$

is replaced with its smoothed analogue: $\widetilde W_2 = \sum_{i=1}^m\sum_{j=1}^n K\left(\frac{Y_j - X_i}{h}\right).$ The statistic becomes real-valued, its null distribution is well approximated by the normal distribution, and it avoids the lattice-related discreteness artifacts that distort $p$-values in small samples (Moriyama et al., 2017).
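A small sketch makes the relation between $W_2$ and $\widetilde W_2$ concrete. It assumes the standard normal CDF for $K$ and a user-supplied bandwidth; the function names and data are illustrative only.

```python
from statistics import NormalDist

_K = NormalDist().cdf  # assumption: standard normal kernel CDF


def classical_rank_sum(x, y):
    """W_2 = number of pairs (i, j) with y_j > x_i."""
    return sum(1 for xi in x for yj in y if yj > xi)


def smoothed_rank_sum(x, y, h):
    """Smoothed analogue: sum over pairs of K((y_j - x_i) / h)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    return sum(_K((yj - xi) / h) for xi in x for yj in y)


x = [1.2, 3.4, 0.5, 2.8]   # illustrative samples
y = [2.0, 4.1, 3.9]
print(classical_rank_sum(x, y))          # integer-valued count
print(smoothed_rank_sum(x, y, h=0.7))    # real-valued
print(smoothed_rank_sum(x, y, h=1e-6))   # approaches the classical count
```

For a symmetric kernel, swapping the roles of the samples gives `smoothed_rank_sum(x, y, h) + smoothed_rank_sum(y, x, h) == m * n` up to rounding, mirroring the identity satisfied by the classical counts when there are no ties.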

4. Asymptotic Properties and Efficiency

Across all smoothed Wilcoxon variants, asymptotic expectations and variances under the null hypothesis are free of the underlying distribution to first order. For the smoothed Spearman-type rank correlation estimator $\widehat\rho_{sa}$:

Under independence, $E[\widehat\rho_{sa}] = 0$ and $\operatorname{Var}[\widehat\rho_{sa}]\sim 1/(n-1)$.
More generally, for a fixed value of the true association parameter $\rho$, a CLT holds: $\sqrt{n}\left(\widehat\rho_{sa} - \rho\right) \rightsquigarrow N\left(0,\,\sigma_{sa}^2(\rho)\right).$
The asymptotic variance for Wilcoxon linear scores is smaller than that of classical Spearman’s $\rho$ under many monotone but non-Gaussian settings; simulated MSE reductions of $10$–$50\%$ are reported (Tasdan et al., 12 Nov 2025).

For the smoothed Wilcoxon signed-rank and rank-sum tests, the Pitman asymptotic relative efficiency (ARE) with respect to their classical analogues is $1$, and the standardized smoothed and classical statistics are asymptotically equivalent: $Z_n^* - Z_n \stackrel{p}{\longrightarrow} 0.$ Refined Edgeworth expansions with remainder $o(n^{-1})$ are available, leading to highly accurate $p$-value approximations even for moderate sample sizes (Maesono et al., 2016, Moriyama et al., 2017).

5. Handling of Ties and Robustness to Data Discreteness

Smoothed Wilcoxon rank scores handle ties automatically through the kernel function. If $X_i = X_j$, the pair contributes $H(0)=1/2$ to each of the two smoothed ranks, so tied observations receive the same non-integer smoothed rank without resorting to ad hoc averaging or random tie-breaking. This removes a small bias present in classical rank-based methods (Tasdan et al., 12 Nov 2025). In the two-sample context, smoothing removes the gaps in attainable $p$-values caused by the discreteness of the rank-sum statistic, yielding continuous $p$-values with accurate calibration (Moriyama et al., 2017).
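The tie behaviour is easy to see on discrete data. The following sketch assumes a standard normal kernel CDF, so that $H(0)=1/2$, and compares the smoothed ranks with classical midranks; `midranks` is a hypothetical helper written for this comparison.

```python
from statistics import NormalDist

_H = NormalDist().cdf  # assumption: standard normal kernel CDF, so H(0) = 1/2


def smoothed_ranks(x, h):
    """R~_i = sum_j H((x_i - x_j) / h)."""
    return [sum(_H((xi - xj) / h) for xj in x) for xi in x]


def midranks(x):
    """Classical average ranks: #{x_j < x_i} + (#{x_j == x_i} + 1) / 2."""
    return [sum(xj < xi for xj in x) + (sum(xj == xi for xj in x) + 1) / 2
            for xi in x]


data = [3, 1, 3, 2, 3, 1]             # heavily tied, illustrative sample
print(midranks(data))                  # classical tie handling by averaging
print(smoothed_ranks(data, h=0.4))     # tied values share one smoothed rank
print(smoothed_ranks(data, h=1e-6))    # small h: midrank minus 1/2
```

No tie-breaking rule is coded anywhere: equal observations go through identical arithmetic and therefore receive identical smoothed ranks.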

6. Implementation: Kernel and Bandwidth Choices

The choice of kernel and bandwidth is central for the practical performance of smoothed Wilcoxon procedures:

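The kernel $k$ (the density whose integral gives the kernel CDF, written $K$ or $H$ above) should be symmetric and is typically of higher order (e.g., fourth order) to eliminate the $O(n^{-1/2})$ bias term in the Edgeworth expansion for $p$-values.
Bandwidth $h$ must satisfy $h_n\to 0$ and $n h_n\to\infty$ for asymptotic normality; typical choices include $h_n = n^{-1/4}$, $h_n = n^{-1/3}$, or $h_n = n^{-1/3}(\log n)^{-1}$ for refined Edgeworth expansions (Maesono et al., 2016). Note that $h_n = n^{-1/4}$ gives $n h_n^4 = 1$, so it lies on the boundary of the additional condition $n h_n^4\to 0$ stated in Section 1 rather than satisfying it.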
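The bandwidth rates quoted above can be tabulated to see how they behave at realistic sample sizes. This is only a sketch: the rule labels are hypothetical, and the rates are used unscaled, whereas in practice one would presumably multiply by a measure of data spread (an assumption, not something the source specifies).

```python
from math import log


def bandwidth(n, rule="n^-1/3"):
    """Unscaled bandwidth h_n for sample size n under one of the quoted rates."""
    if n < 2:
        raise ValueError("need n >= 2")
    if rule == "n^-1/4":
        return n ** -0.25
    if rule == "n^-1/3":
        return n ** (-1.0 / 3.0)
    if rule == "n^-1/3/log":
        return n ** (-1.0 / 3.0) / log(n)
    raise ValueError(f"unknown rule: {rule}")


for n in (50, 500, 5000):
    for rule in ("n^-1/4", "n^-1/3", "n^-1/3/log"):
        h = bandwidth(n, rule)
        # n*h should grow with n; n*h^4 shows whether the rate also shrinks fast enough
        print(f"n={n:5d}  {rule:11s} h={h:.4f}  n*h={n * h:9.2f}  n*h^4={n * h ** 4:.4f}")
```

Printing $n h_n$ and $n h_n^4$ side by side makes it easy to compare a candidate rate against the asymptotic conditions before using it.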

A practical computation path:

Construct the smoothed ecdf $\widetilde F_n(x)$ with kernel $H$ and bandwidth $h$.
Compute smoothed ranks $\widetilde R_i$ for all data points.
For correlation: apply Wilcoxon linear scores and form the inner-product estimator.
For tests: compute the smoothed sum and studentize according to the limiting variance.
Approximate $p$-values using the normal (or Edgeworth-corrected) distribution (Tasdan et al., 12 Nov 2025, Maesono et al., 2016, Moriyama et al., 2017).
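The testing branch of the steps above can be sketched end to end for the two-sample case. The sketch rests on stated assumptions: a standard normal kernel CDF, a user-supplied bandwidth, and studentization with the classical null moments $mn/2$ and $mn(m+n+1)/12$ of the pairwise count, used here as leading-order stand-ins. It applies a plain normal approximation and does not implement the Edgeworth correction; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def smoothed_rank_sum_test(x, y, h):
    """Two-sided smoothed rank-sum test; returns (statistic, z, p_value)."""
    if h <= 0:
        raise ValueError("bandwidth h must be positive")
    m, n = len(x), len(y)
    # Smoothed analogue of sum_i sum_j 1{y_j > x_i}, with the normal CDF as K.
    w = sum(_STD_NORMAL.cdf((yj - xi) / h) for xi in x for yj in y)
    # Assumption: centre and scale with the classical null moments, which the
    # smoothed statistic is taken to match to leading order.
    mean = m * n / 2
    var = m * n * (m + n + 1) / 12
    z = (w - mean) / sqrt(var)
    p_value = 2 * (1 - _STD_NORMAL.cdf(abs(z)))
    return w, z, p_value


x = [0.3, 1.1, -0.4, 0.8, 0.0, 1.9, -1.2, 0.5]   # illustrative samples
y = [1.4, 2.2, 0.9, 3.1, 1.7, 2.6, 0.6, 2.0]
h = (len(x) + len(y)) ** (-1.0 / 3.0)            # one of the quoted rates, unscaled
print(smoothed_rank_sum_test(x, y, h))
```

The centring at $mn/2$ is exact under the null for a symmetric kernel, because $K(u)+K(-u)=1$ and $Y_j - X_i$ is then symmetric about zero; only the variance is an approximation here.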

7. Applications and Empirical Efficiency Gains

Simulation studies document that under data with strong monotone but non-Gaussian association, the smoothed Wilcoxon correlation estimator outperforms both classical Spearman $\rho$ and Kendall $\tau$, reducing MSE by $10$–$50\%$ while matching their performance under Gaussian data (Tasdan et al., 12 Nov 2025). In testing scenarios, smoothed procedures exhibit empirical size close to nominal significance levels and avoid biases seen with classical Wilcoxon tests. Under heavy-tailed alternatives, the smoothed median test can outperform the smoothed rank-sum test, while under light-tailed alternatives the smoothed Wilcoxon rank scores exhibit optimal power (Maesono et al., 2016, Moriyama et al., 2017).

In summary, the smoothed Wilcoxon rank score constructions offer a principled nonparametric approach yielding continuous, tie-robust, asymptotically normal statistics. They preserve the efficiency of the classical procedures and remain free of the underlying distribution to first order under the null, while resolving issues associated with data discreteness and bias in classical rank-based methods (Tasdan et al., 12 Nov 2025, Maesono et al., 2016, Moriyama et al., 2017).

References (3)

Tasdan et al. (2025). Enhanced Rank-Based Correlation Estimation Using Smoothed Wilcoxon Rank Scores.

Maesono et al. (2016). Smoothed nonparametric tests and their properties.

Moriyama et al. (2017). Smoothed nonparametric two-sample tests.
