Skew-Adjusted Projection Depth (SPD)
- Skew-Adjusted Projection Depth (SPD) is a robust, affine-invariant data depth function that adjusts for skewness using medcouple-based asymmetric fences.
- It employs skew-adaptive projection and order statistics to generate outlyingness scores, significantly reducing misclassification rates in skewed or heavy-tailed populations.
- Its efficient computational framework uses multiple projections and robust statistics, achieving a dominant cost of O(p²n) per query for practical scalability.
Skew-Adjusted Projection Depth (SPD) is a robust, affine-invariant data depth function for multivariate and functional data that generalizes classical projection depth by adapting to skewness in the underlying distribution. SPD delivers asymmetrically scaled outlyingness scores by applying a skew-adjusted univariate outlyingness measure to one-dimensional projections of the data, correcting a key limitation of symmetric scale estimators in the presence of skewed classes. The construction replaces the median absolute deviation (MAD)-based scaling in the Stahel–Donoho outlyingness (SDO) with asymmetric "fences" parameterized by the medcouple, which substantially improves classification performance under heavy-tailed or skewed populations (Hubert et al., 2015).
1. Projection Depth and Skew-Adjustment
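Classical Stahel–Donoho projection depth quantifies the “outlyingness” of a point $x \in \mathbb{R}^p$ with respect to a multivariate distribution (or sample) $Y$ by considering all one-dimensional projections $v^\top x$ for $v \in \mathbb{R}^p$ with $\|v\| = 1$, then normalizing the distance from the projected median by the MAD:

$$\mathrm{SDO}(x; Y) = \sup_{\|v\| = 1} \frac{|v^\top x - \mathrm{med}(v^\top Y)|}{\mathrm{MAD}(v^\top Y)}$$

The associated projection depth is

$$\mathrm{PD}(x; Y) = \frac{1}{1 + \mathrm{SDO}(x; Y)}$$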
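As a rough illustration of the SDO and PD definitions above, the following sketch approximates the supremum by a maximum over random unit directions. The function name `sdo_and_pd`, the default of 500 directions, and the use of the raw MAD (without a consistency factor) are assumptions made for this example, not choices taken from Hubert et al. (2015).

```python
import numpy as np

def sdo_and_pd(x, Y, n_dirs=500, seed=0):
    """Approximate Stahel-Donoho outlyingness and projection depth of x w.r.t. the rows of Y."""
    rng = np.random.default_rng(seed)
    p = Y.shape[1]
    # Random unit directions stand in for the supremum over all directions.
    V = rng.standard_normal((n_dirs, p))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    proj_Y = Y @ V.T                                # shape (n, n_dirs)
    proj_x = V @ x                                  # shape (n_dirs,)
    med = np.median(proj_Y, axis=0)
    mad = np.median(np.abs(proj_Y - med), axis=0)   # raw MAD, no consistency factor
    sdo = float(np.max(np.abs(proj_x - med) / mad))
    return sdo, 1.0 / (1.0 + sdo)

rng = np.random.default_rng(1)
Y = rng.standard_normal((200, 3))
sdo_in, pd_in = sdo_and_pd(np.zeros(3), Y)          # point near the center
sdo_out, pd_out = sdo_and_pd(np.full(3, 6.0), Y)    # point far from the bulk
print(pd_in, pd_out)
```

The central point should receive a depth close to 1 and the distant point a depth close to 0. Because only finitely many directions are used, the returned SDO is a lower bound on the true supremum.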
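When $Y$ is skewed, the symmetric MAD- or IQR-based scale is not optimal. Skew-adjusted projection depth (SPD) remodels outlyingness via

$$\mathrm{AO}(x; Y) = \sup_{\|v\| = 1} \mathrm{AO}_1(v^\top x;\, v^\top Y)$$

where the univariate adjusted outlyingness is

$$\mathrm{AO}_1(z; Z) = \begin{cases} \dfrac{z - \mathrm{med}(Z)}{w_2(Z) - \mathrm{med}(Z)} & \text{if } z > \mathrm{med}(Z), \\[2ex] \dfrac{\mathrm{med}(Z) - z}{\mathrm{med}(Z) - w_1(Z)} & \text{if } z \le \mathrm{med}(Z). \end{cases}$$

The fences $w_1(Z)$ and $w_2(Z)$ are skew-adapted:

$$w_1(Z) = Q_1 - 1.5\, e^{-4\,\mathrm{MC}}\, \mathrm{IQR}$$

$$w_2(Z) = Q_3 + 1.5\, e^{3\,\mathrm{MC}}\, \mathrm{IQR}$$

where $Q_1$ and $Q_3$ are the first and third quartiles of $Z$, $\mathrm{MC} = \mathrm{MC}(Z)$ is the medcouple statistic, serving as a robust skewness measure, and $\mathrm{IQR} = Q_3 - Q_1$. The medcouple is the median of the kernel values $h(z_i, z_j) = \frac{(z_j - m) - (m - z_i)}{z_j - z_i}$ over all pairs $z_i \le m \le z_j$ in $Z$, with $m = \mathrm{med}(Z)$. If $\mathrm{MC}(Z) < 0$ one replaces $z$ by $-z$ and $Z$ by $-Z$ to work in the right-skewed regime. Then

$$\mathrm{SPD}(x; Y) = \frac{1}{1 + \mathrm{AO}(x; Y)}$$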
SPD thus replaces symmetric scaling with one-sided, skew-adaptive scaling, allowing the depth to respond asymmetrically to heavy-tailed data (Hubert et al., 2015).
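The one-sided scaling can be made concrete with a small univariate sketch. It assumes a naive quadratic-time medcouple that skips pairs tied at the median, and quartiles as computed by `np.percentile`. The helper names `medcouple_naive`, `skew_adjusted_fences`, and `adjusted_outlyingness_1d` are illustrative, not taken from any library.

```python
import numpy as np

def medcouple_naive(Z):
    """Naive O(n^2) medcouple. Assumption: pairs tied at the median are skipped."""
    Z = np.asarray(Z, dtype=float)
    m = np.median(Z)
    lower = Z[Z <= m][:, None]      # z_i <= median, as a column
    upper = Z[Z >= m][None, :]      # z_j >= median, as a row
    diff = upper - lower            # z_j - z_i for every pair
    keep = diff > 0                 # drop pairs where both values equal the median
    h = ((upper - m) - (m - lower))[keep] / diff[keep]
    return float(np.median(h))

def skew_adjusted_fences(Z):
    """Return (w1, w2) for a sample whose medcouple is non-negative."""
    mc = medcouple_naive(Z)
    q1, q3 = np.percentile(Z, [25, 75])
    iqr = q3 - q1
    return q1 - 1.5 * np.exp(-4.0 * mc) * iqr, q3 + 1.5 * np.exp(3.0 * mc) * iqr

def adjusted_outlyingness_1d(z, Z):
    """Univariate adjusted outlyingness of the scalar z relative to the sample Z."""
    Z = np.asarray(Z, dtype=float)
    if medcouple_naive(Z) < 0:      # reflect left-skewed samples
        z, Z = -z, -Z
    med = np.median(Z)
    w1, w2 = skew_adjusted_fences(Z)
    if z > med:
        return (z - med) / (w2 - med)
    return (med - z) / (med - w1)

rng = np.random.default_rng(0)
Z = rng.exponential(size=400)       # right-skewed sample
med = np.median(Z)
ao_right = adjusted_outlyingness_1d(med + 2.0, Z)   # 2 units into the long tail
ao_left = adjusted_outlyingness_1d(med - 2.0, Z)    # 2 units into the short tail
print(ao_right, ao_left)
```

For a right-skewed sample, a point two units above the median should score as far less outlying than a point two units below it, which is the asymmetry that a MAD-based scale cannot express.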
2. Efficient Estimation and Computational Complexity
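For finite-sample estimation, consider a sample $Y = \{y_1, \dots, y_n\} \subset \mathbb{R}^p$ and a query point $x \in \mathbb{R}^p$. The affine-invariant procedure is:
- Generate $k$ directions by sampling $p$ points from $Y$ and taking the unit normal to the hyperplane through them, repeated to obtain $v_1, \dots, v_k$.
- For each $j = 1, \dots, k$:
  - Project: $z_j = v_j^\top x$, $Z_j = \{v_j^\top y_i\}_{i=1}^n$.
  - Compute $\mathrm{med}(Z_j)$, $Q_1$, $Q_3$, $\mathrm{IQR}$, $\mathrm{MC}(Z_j)$.
  - Set $w_1(Z_j)$, $w_2(Z_j)$ as above.
  - Compute $\mathrm{AO}_1(z_j; Z_j)$.
- Set $\mathrm{AO}(x; Y) = \max_{1 \le j \le k} \mathrm{AO}_1(z_j; Z_j)$.
- Return $\mathrm{SPD}(x; Y) = 1 / (1 + \mathrm{AO}(x; Y))$.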
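The steps above can be sketched end to end as follows. This is a minimal illustration, not the reference implementation: it assumes $p \ge 2$, a naive quadratic-time medcouple, a default of $250p$ directions when `k` is not given, and the hypothetical helper names `_medcouple`, `_ao_1d`, `hyperplane_normals`, and `spd`.

```python
import numpy as np

def _medcouple(Z):
    # Naive O(n^2) medcouple; ties at the median are skipped (simplifying assumption).
    m = np.median(Z)
    lo, hi = Z[Z <= m][:, None], Z[Z >= m][None, :]
    d = hi - lo
    keep = d > 0
    return float(np.median(((hi - m) - (m - lo))[keep] / d[keep]))

def _ao_1d(z, Z):
    mc = _medcouple(Z)
    if mc < 0:                           # reflect into the right-skewed regime
        z, Z, mc = -z, -Z, -mc
    med = np.median(Z)
    q1, q3 = np.percentile(Z, [25, 75])
    w1 = q1 - 1.5 * np.exp(-4.0 * mc) * (q3 - q1)
    w2 = q3 + 1.5 * np.exp(3.0 * mc) * (q3 - q1)
    return (z - med) / (w2 - med) if z > med else (med - z) / (med - w1)

def hyperplane_normals(Y, k, rng):
    """Unit normals to hyperplanes through p randomly chosen rows of Y (requires p >= 2)."""
    n, p = Y.shape
    V = np.empty((k, p))
    for j in range(k):
        pts = Y[rng.choice(n, size=p, replace=False)]
        # The normal spans the null space of the (p-1) x p matrix of differences,
        # which is the last right singular vector.
        V[j] = np.linalg.svd(pts[1:] - pts[0])[2][-1]
    return V

def spd(x, Y, k=None, seed=0):
    """Approximate skew-adjusted projection depth of x w.r.t. the rows of Y."""
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float)
    k = 250 * Y.shape[1] if k is None else k     # assumed default
    V = hyperplane_normals(Y, k, np.random.default_rng(seed))
    ao = max(_ao_1d(v @ x, Y @ v) for v in V)    # max over sampled directions
    return 1.0 / (1.0 + ao)

rng = np.random.default_rng(42)
Y = rng.exponential(size=(150, 2))               # skewed bivariate sample
center = np.median(Y, axis=0)
depth_center = spd(center, Y, k=200)
depth_far = spd(np.array([10.0, 10.0]), Y, k=200)
print(depth_center, depth_far)
```

In practice the directions and the per-direction statistics (median, fences) depend only on $Y$, so they can be computed once and reused across many query points; the sketch recomputes them per call for brevity.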
The dominant cost is $O(knp)$: with $k = 250p$, the effort per query $x$ is $O(p^2 n)$, accounting for projection and order statistics computation ($O(n)$ for median/quartiles, $O(n \log n)$ for the medcouple using available optimized algorithms) (Hubert et al., 2015).
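To make the cost accounting concrete, the work can be tallied direction by direction. The derivation below is a sketch that assumes the number of directions grows linearly with the dimension, $k = c\,p$ for a constant $c$ (a symbol introduced here), and that the sample is re-projected for every query.

$$
\begin{aligned}
\text{cost per direction} &= \underbrace{O(np)}_{\text{project } Y \text{ and } x} + \underbrace{O(n)}_{\text{median, quartiles}} + \underbrace{O(n \log n)}_{\text{medcouple}} \\
\text{cost for } k = c\,p \text{ directions} &= O\big(p\,(np + n \log n)\big) = O(p^2 n + p\, n \log n)
\end{aligned}
$$

The $p^2 n$ term dominates whenever $p$ grows at least as fast as $\log n$; for small $p$ and very large $n$, the medcouple term $p\, n \log n$ can be the larger of the two.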
3. Theoretical Properties of SPD
SPD possesses the following properties:
- Affine invariance: SPD is unchanged under affine transformations $x \mapsto Ax + b$ with $A$ invertible, applied to both $x$ and $Y$. Each univariate $\mathrm{AO}_1$ is invariant to shifts and rescalings of the projected data, and the supremum over all directions inherits this property.
- Robustness: SPD is bounded within $(0, 1]$, with its denominator $1 + \mathrm{AO}(x; Y)$ growing roughly linearly in $\|x\|$. As $\|x\| \to \infty$, $\mathrm{SPD}(x; Y) \to 0$. A high breakdown point (25%) is inherited from the robust medcouple and quartiles. The bounded influence of the medcouple and order statistics ensures resistance to outliers.
- Sensitivity to skewness: Exponentially adjusted fences (factors $e^{-4\,\mathrm{MC}}$ and $e^{3\,\mathrm{MC}}$) cause the upper and lower fences to expand differently, adapting the depth to asymmetry and long tails. This mechanism prevents heavy-side values in skewed distributions from being spuriously classified as extreme.
- Depth-like behavior: $\mathrm{SPD}(x; Y)$ attains its maximum of 1 at points of zero adjusted outlyingness (typically the multivariate “median”) and decays to zero as $x$ departs from the bulk of $Y$ (Hubert et al., 2015).
4. Algorithmic and Implementation Choices
Several practical choices influence SPD's estimation:
| Parameter | Default/Recommended | Trade-offs and Context |
|---|---|---|
| Number of directions $k$ | $250p$ | A smaller $k$ speeds up computation; more directions yield a closer approximation to the supremum. |
| Direction generation | Normals to hyperplanes through $p$ sampled points; random Gaussian directions also viable | Normals ensure exploration of convex hull facets; random directions often need a larger $k$. |
| Medcouple computation | $O(n \log n)$ algorithm, can subsample for large $n$ | Subsampling provides a scalable proxy for the exact $\mathrm{MC}$. |
| Fence multipliers | $1.5$ (boxplot literature), exponents $-4$, $3$ | Default values robust in experiments; tuning impacts fence aggressiveness. |
These choices underpin the scalability and practical adaptability of SPD to high dimensions and large sample sizes (Hubert et al., 2015).
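The medcouple subsampling option in the table can be illustrated with a short sketch. It assumes a naive quadratic-time medcouple (pairs tied at the median are skipped) and a hypothetical subsample size of 300; neither the helper names nor that size come from Hubert et al. (2015).

```python
import numpy as np

def medcouple_naive(Z):
    """Naive O(n^2) medcouple. Assumption: pairs tied at the median are skipped."""
    Z = np.asarray(Z, dtype=float)
    m = np.median(Z)
    lower, upper = Z[Z <= m][:, None], Z[Z >= m][None, :]
    diff = upper - lower
    keep = diff > 0
    return float(np.median(((upper - m) - (m - lower))[keep] / diff[keep]))

def medcouple_subsampled(Z, size=300, seed=0):
    """Medcouple of a random subsample (identical to the full statistic when len(Z) <= size)."""
    Z = np.asarray(Z, dtype=float)
    if Z.size <= size:
        return medcouple_naive(Z)
    rng = np.random.default_rng(seed)
    return medcouple_naive(rng.choice(Z, size=size, replace=False))

rng = np.random.default_rng(7)
Z = rng.exponential(size=2000)
mc_full = medcouple_naive(Z)         # about 10^6 kernel evaluations
mc_sub = medcouple_subsampled(Z)     # about 2 * 10^4 kernel evaluations
print(mc_full, mc_sub)
```

The subsampled value is a noisier estimate of the same skewness, so the fences it produces vary somewhat from run to run; fixing the seed keeps results reproducible.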
5. Comparative Performance in Classification
SPD is particularly effective in classification tasks where class distributions exhibit skewness. In Simulation Setting 2 of (Hubert et al., 2015), with one Normal and one highly right-skewed exponential class (both 6-variate), the DistSpace transform with 5-nearest neighbor classification was compared across three depth-based outlyingness measures:
| Method | Misclassification Rate (%) |
|---|---|
| DistSpace + kNN (SPD) | 6 |
| DistSpace + kNN (PD) | 7 |
| DistSpace + kNN (SDO) | 8 |
SPD halved the error of PD. This enhancement is attributed to the skew-adaptive AO fences, which avoid excessive contraction of the heavy-tailed region, ensuring effective discrimination under skew and heavy tails. The performance gain demonstrates that SPD’s asymmetric treatment of outlyingness is fundamental for depth-based learning in non-symmetric populations (Hubert et al., 2015).
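The classification pipeline can be sketched as follows, on the understanding that the DistSpace transform maps each observation to its vector of outlyingness values, one per class, before kNN is applied. To keep the example short, a symmetric SDO-style outlyingness over random directions stands in for the skew-adjusted version. The data, function names, and settings are illustrative, and the printed error rate is not a reproduction of the reported results.

```python
import numpy as np

def projection_outlyingness(X, Y_class, V):
    """Max over directions V of |projection - median| / MAD, for every row of X."""
    P = Y_class @ V.T                           # (n_class, k) projected class sample
    med = np.median(P, axis=0)
    mad = np.median(np.abs(P - med), axis=0)
    return np.max(np.abs(X @ V.T - med) / mad, axis=1)

def distspace(X, classes, V):
    """Map each row of X to its outlyingness with respect to every class."""
    return np.column_stack([projection_outlyingness(X, Yg, V) for Yg in classes])

def knn_predict(train_feats, train_labels, test_feats, k=5):
    """Plain majority-vote kNN with Euclidean distance in the transformed space."""
    d = np.linalg.norm(test_feats[:, None, :] - train_feats[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = train_labels[nearest]               # (n_test, k) neighbour labels
    return np.array([np.bincount(v).argmax() for v in votes])

rng = np.random.default_rng(3)
p = 4
class0 = rng.standard_normal((100, p))                  # symmetric class
class1 = rng.exponential(size=(100, p)) + 2.0           # right-skewed, shifted class
V = rng.standard_normal((200, p))
V /= np.linalg.norm(V, axis=1, keepdims=True)

train_X = np.vstack([class0, class1])
train_y = np.repeat([0, 1], 100)
test_X = np.vstack([rng.standard_normal((50, p)),
                    rng.exponential(size=(50, p)) + 2.0])
test_y = np.repeat([0, 1], 50)

train_F = distspace(train_X, [class0, class1], V)       # two features per point
test_F = distspace(test_X, [class0, class1], V)
pred = knn_predict(train_F, train_y, test_F, k=5)
error_rate = float(np.mean(pred != test_y))
print(error_rate)
```

Swapping `projection_outlyingness` for an adjusted-outlyingness function would give the SPD-based variant of the same pipeline; the rest of the code would not need to change.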
6. Summary and Significance
Skew-adjusted projection depth merges projection-based multivariate depth geometry with univariate, medcouple-driven skew adjustment, providing a robust, affine-invariant, and computationally feasible depth measure. SPD is explicitly constructed for scenarios where classical symmetric data depth fails due to skewness or heavy-tailedness. It is straightforward to estimate ($O(p^2 n)$ per query), deploys robust statistics at every stage, and can significantly reduce misclassification error in statistical learning pipelines. SPD exemplifies how robust univariate summaries combined with directional analysis yield substantial practical and theoretical gains in multivariate and functional data analysis (Hubert et al., 2015).