
Skew-Adjusted Projection Depth (SPD)

Updated 1 June 2026
  • Skew-Adjusted Projection Depth (SPD) is a robust, affine-invariant data depth function that adjusts for skewness using medcouple-based asymmetric fences.
  • It employs skew-adaptive projection and order statistics to generate outlyingness scores, significantly reducing misclassification rates in skewed or heavy-tailed populations.
  • Its efficient computational framework uses multiple projections and robust statistics, achieving a dominant cost of O(p²n) per query for practical scalability.

Skew-Adjusted Projection Depth (SPD) is a robust, affine-invariant data depth function for multivariate and functional data that generalizes classical projection depth by adapting to skewness in the underlying distribution. SPD produces asymmetrically scaled outlyingness scores by applying a skew-adjusted univariate scale in each one-dimensional projection, correcting a key limitation of symmetric scale estimators when classes are skewed. The construction replaces the median absolute deviation (MAD)-based scaling in the Stahel–Donoho outlyingness (SDO) with asymmetric "fences" parameterized by the medcouple, which substantially improves classification performance under heavy-tailed or skewed populations (Hubert et al., 2015).

1. Projection Depth and Skew-Adjustment

Classical Stahel–Donoho projection depth quantifies the “outlyingness” of a point $x$ with respect to a multivariate distribution $P_Y$ by considering all one-dimensional projections $v^\top x$ for $v\in\mathbb R^p$ with $\|v\|=1$, then normalizing the distance from the projected median by the MAD:

$$\mathrm{SDO}(x; P_Y) = \sup_{\|v\| = 1} \frac{\left|v^\top x - \mathrm{med}(v^\top Y)\right|}{\mathrm{MAD}(v^\top Y)}$$

The associated projection depth is

$$\mathrm{PD}(x;P_Y) = \frac{1}{1+\mathrm{SDO}(x;P_Y)}$$
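As a minimal sketch of these two formulas (not the authors' implementation), the supremum over unit directions can be approximated by a maximum over randomly drawn directions. The function names, the default of 500 directions, and the use of normalized Gaussian vectors as directions are illustrative assumptions.

```python
import math
import random
import statistics


def random_unit_vector(p, rng):
    """Draw a direction on the unit sphere by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sdo(x, sample, n_dirs=500, seed=0):
    """Approximate SDO(x; sample): max over directions of |v'x - med| / MAD."""
    rng = random.Random(seed)
    p = len(x)
    worst = 0.0
    for _ in range(n_dirs):
        v = random_unit_vector(p, rng)
        proj = [sum(vi * yi for vi, yi in zip(v, y)) for y in sample]
        med = statistics.median(proj)
        mad = statistics.median(abs(z - med) for z in proj)
        if mad == 0.0:
            continue  # degenerate projection; simply skipped in this sketch
        zx = sum(vi * xi for vi, xi in zip(v, x))
        worst = max(worst, abs(zx - med) / mad)
    return worst


def projection_depth(x, sample, **kwargs):
    """PD(x) = 1 / (1 + SDO(x))."""
    return 1.0 / (1.0 + sdo(x, sample, **kwargs))


rng = random.Random(1)
sample = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
center_depth = projection_depth([0.0, 0.0], sample)  # close to the bulk
far_depth = projection_depth([6.0, 6.0], sample)     # far from the bulk
```

Because only finitely many directions are tried, the returned value is a lower bound on the true SDO, so the depth is slightly overestimated.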

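When $P_Y$ is skewed, a symmetric MAD- or IQR-based scale is a poor fit: it measures both tails with the same yardstick, so points in the long tail look more outlying than they are. Skew-adjusted projection depth (SPD) instead defines outlyingness via the adjusted outlyingness (AO)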

$$\mathrm{AO}(x; P_Y) = \sup_{\|v\|=1} \mathrm{AO}_1(v^\top x; P_{v^\top Y})$$

where the univariate adjusted outlyingness is

$$\mathrm{AO}_1(z;P_Z)= \begin{cases} \dfrac{z-\mathrm{med}(Z)}{w_2(Z)-\mathrm{med}(Z)} & z>\mathrm{med}(Z) \\[1ex] \dfrac{\mathrm{med}(Z)-z}{\mathrm{med}(Z)-w_1(Z)} & z\le\mathrm{med}(Z) \end{cases}$$

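The fences $w_1(Z)$ and $w_2(Z)$ are skew-adapted:

$$w_1(Z) = Q_1(Z) - 1.5\, e^{-4\,\mathrm{MC}(Z)}\, \mathrm{IQR}(Z)$$

$$w_2(Z) = Q_3(Z) + 1.5\, e^{3\,\mathrm{MC}(Z)}\, \mathrm{IQR}(Z)$$

where $Q_1(Z)$ and $Q_3(Z)$ are the quartiles, $\mathrm{MC}(Z)$ is the medcouple statistic (the median of the kernel values $h(z_i, z_j) = \frac{(z_j - \mathrm{med}(Z)) - (\mathrm{med}(Z) - z_i)}{z_j - z_i}$ over all pairs $z_i \le \mathrm{med}(Z) \le z_j$), serving as a robust skewness measure, and $\mathrm{IQR}(Z) = Q_3(Z) - Q_1(Z)$. If $\mathrm{MC}(Z) < 0$ one replaces $Z$ by $-Z$ (and $z$ by $-z$) to work in the right-skewed regime. Then

$$\mathrm{SPD}(x; P_Y) = \frac{1}{1+\mathrm{AO}(x; P_Y)}$$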

SPD thus replaces symmetric scaling with one-sided, skew-adaptive scaling, allowing the depth to respond asymmetrically to heavy-tailed data (Hubert et al., 2015).
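To make the asymmetry concrete, here is a small worked example with made-up summary statistics (not taken from the paper), assuming the standard adjusted-boxplot fence constants, i.e. multiplier $1.5$ and exponents $-4$ and $3$. Take $\mathrm{med}(Z) = 2$, $Q_1 = 1$, $Q_3 = 4$ (so $\mathrm{IQR} = 3$) and $\mathrm{MC} = 0.25$:

$$
\begin{aligned}
w_1 &= 1 - 1.5\, e^{-4(0.25)} \cdot 3 \approx 1 - 1.655 = -0.655, \\
w_2 &= 4 + 1.5\, e^{3(0.25)} \cdot 3 \approx 4 + 9.527 = 13.527, \\
\mathrm{AO}_1(8) &= \frac{8 - 2}{13.527 - 2} \approx 0.52, \\
\mathrm{AO}_1(-4) &= \frac{2 - (-4)}{2 - (-0.655)} \approx 2.26.
\end{aligned}
$$

Both points lie 6 units from the median, yet the one on the short (left) side is more than four times as outlying as the one in the long right tail. A symmetric scale would score the two identically.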

2. Efficient Estimation and Computational Complexity

For finite-sample estimation, consider a sample $Y = \{y_1, \dots, y_n\} \subset \mathbb R^p$ and a query point $x \in \mathbb R^p$. The affine-invariant procedure is:

  1. Generate $m$ directions by sampling $p$ points and choosing normals to their affine hulls, repeated to obtain $v_1, \dots, v_m$.
  2. For each $v_j$:
    • Project: $z_i = v_j^\top y_i$ for $i = 1, \dots, n$, and $z = v_j^\top x$.
    • Compute $\mathrm{med}(Z)$, $Q_1$, $Q_3$, $\mathrm{IQR}$, $\mathrm{MC}$.
    • Set $w_1$, $w_2$ as above.
    • Compute $\mathrm{AO}_1(z; Z)$.
  3. Set $\mathrm{AO}(x) = \max_{j} \mathrm{AO}_1(v_j^\top x; Z_j)$.
  4. Return $\mathrm{SPD}(x) = 1/(1+\mathrm{AO}(x))$.
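The steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the reference implementation: the fence constants (1.5, -4, 3) are the usual adjusted-boxplot values, the medcouple is the naive $O(n^2)$ version rather than the fast algorithm, quartiles follow the default convention of `statistics.quantiles`, and random Gaussian directions are swapped in for the hyperplane-normal scheme of step 1. All function names are hypothetical.

```python
import math
import random
import statistics


def medcouple(z):
    """Naive O(n^2) medcouple: median of the kernel h over pairs lo <= med <= hi."""
    med = statistics.median(z)
    lower = sorted(v for v in z if v <= med)
    upper = sorted(v for v in z if v >= med)
    k = sum(1 for v in z if v == med)    # observations tied with the median
    first_tie = len(lower) - k           # position of the first tie inside `lower`
    kernel = []
    for i, lo in enumerate(lower):
        for j, hi in enumerate(upper):
            if hi != lo:
                kernel.append(((hi - med) - (med - lo)) / (hi - lo))
            else:                        # lo == hi == med: sign rule for ties
                s = (i - first_tie + 1) + (j + 1) - 1 - k
                kernel.append((s > 0) - (s < 0))
    return statistics.median(kernel)


def adjusted_outlyingness_1d(x, z):
    """AO_1(x; z) with medcouple-adjusted fences (constants 1.5, -4, 3 assumed)."""
    mc = medcouple(z)
    if mc < 0:                           # reflect to the right-skewed case
        x, z, mc = -x, [-v for v in z], -mc
    q1, med, q3 = statistics.quantiles(z, n=4)
    iqr = q3 - q1                        # assumed non-zero (non-degenerate data)
    w1 = q1 - 1.5 * math.exp(-4.0 * mc) * iqr
    w2 = q3 + 1.5 * math.exp(3.0 * mc) * iqr
    if x > med:
        return (x - med) / (w2 - med)
    return (med - x) / (med - w1)


def spd(x, sample, n_dirs=60, seed=0):
    """Approximate SPD(x) = 1 / (1 + max_j AO_1) over random unit directions."""
    rng = random.Random(seed)
    ao = 0.0
    for _ in range(n_dirs):
        v = [rng.gauss(0.0, 1.0) for _ in range(len(x))]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        z = [sum(a * b for a, b in zip(v, y)) for y in sample]
        zx = sum(a * b for a, b in zip(v, x))
        ao = max(ao, adjusted_outlyingness_1d(zx, z))
    return 1.0 / (1.0 + ao)


rng = random.Random(42)
# Right-skewed bivariate sample: independent exponential coordinates.
sample = [[rng.expovariate(1.0), rng.expovariate(1.0)] for _ in range(120)]
central = spd([0.7, 0.7], sample)     # near the coordinate-wise medians
tail = spd([6.0, 6.0], sample)        # far out in the long right tail
opposite = spd([-4.6, -4.6], sample)  # equally far from (0.7, 0.7), short side
```

In this toy setting the point on the short side of the distribution receives a lower depth than the equally distant point in the long tail, which is the asymmetric behavior the fences are designed to produce.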

The dominant cost is $O(p^2 n)$: with $m = 250p$ directions, the effort per query $x$ is $O(mnp) = O(p^2 n)$ for the projections, plus the order statistics in each direction ($O(n)$ for median/quartiles, $O(n \log n)$ for medcouple using available optimized algorithms); the latter stays within the $O(p^2 n)$ bound when $\log n = O(p)$ (Hubert et al., 2015).

3. Theoretical Properties of SPD

SPD possesses the following properties:

  • Affine invariance: SPD is preserved under all nonsingular affine transformations $x \mapsto Ax + b$. The univariate $\mathrm{AO}_1$ is a ratio of differences of location-scale equivariant order statistics, so it is unchanged by shifting or rescaling a projection, and the supremum over all directions inherits this property.
  • Robustness: SPD is bounded within $(0, 1]$, with its denominator ($1 + \mathrm{AO}$) growing roughly linearly in $\|x\|$. As $\|x\| \to \infty$, $\mathrm{SPD}(x) \to 0$. A high breakdown point ($25\%$) is inherited from the robust medcouple and quartiles. The bounded influence of the medcouple and order statistics ensures resistance to outliers.
  • Sensitivity to skewness: Exponentially adjusted fences (factors $e^{-4\,\mathrm{MC}}$ and $e^{3\,\mathrm{MC}}$) cause the upper and lower fences to expand differently, adapting the depth to asymmetry and long tails. This mechanism prevents heavy-side values in skewed distributions from being spuriously classified as extreme.
  • Depth-like behavior: $\mathrm{SPD}(x)$ attains its maximum 1 at points of zero adjusted outlyingness (typically the multivariate “median”) and decays to zero as $x$ departs from the bulk of $P_Y$ (Hubert et al., 2015).
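The affine-invariance property can be verified in two short steps. The following is a sketch, assuming $A$ is a nonsingular $p \times p$ matrix, $b \in \mathbb R^p$, and that $\mathrm{AO}_1$ is computed with the reflection convention for a negative medcouple. For any unit vector $v$,

$$
v^\top (Ax + b) = c\, u^\top x + d, \qquad u = \frac{A^\top v}{\|A^\top v\|}, \quad c = \|A^\top v\| > 0, \quad d = v^\top b .
$$

The median and quartiles are equivariant under $z \mapsto cz + d$ with $c > 0$, while the medcouple is invariant, so numerator and denominator of $\mathrm{AO}_1$ are both multiplied by $c$ and the shift $d$ cancels:

$$
\mathrm{AO}_1\!\left(cz + d;\, P_{cZ + d}\right) = \mathrm{AO}_1\!\left(z;\, P_Z\right).
$$

Since $A^\top$ is nonsingular, $u$ ranges over the whole unit sphere as $v$ does, hence

$$
\mathrm{AO}(Ax + b;\, P_{AY + b}) = \sup_{\|u\| = 1} \mathrm{AO}_1\!\left(u^\top x;\, P_{u^\top Y}\right) = \mathrm{AO}(x;\, P_Y),
$$

and the same holds for $\mathrm{SPD} = 1/(1 + \mathrm{AO})$. With finitely many directions the invariance is exact only if the directions themselves are generated in an affine-equivariant way.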

4. Algorithmic and Implementation Choices

Several practical choices influence SPD's estimation:

Parameter | Default/Recommended | Trade-offs and Context
Number of directions $m$ | $250p$ | Fewer directions speed up computation; more directions yield a closer approximation to the supremum.
Direction generation | Normals to hyperplanes through $p$ sample points; random Gaussian directions also viable | Normals ensure exploration of convex hull facets; random directions often need a larger $m$.
Medcouple computation | $O(n \log n)$ algorithm; can subsample for large $n$ | Subsampling provides a scalable proxy for the exact $\mathrm{MC}$.
Fence multipliers | $1.5$ (boxplot literature), exponents $-4$, $+3$ | Default values robust in experiments; tuning impacts fence aggressiveness.

These choices underpin the scalability and practical adaptability of SPD to high dimensions and large sample sizes (Hubert et al., 2015).
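The hyperplane-normal direction scheme can be sketched as follows: draw $p$ sample points, form the $p-1$ difference vectors from the first point, and take a unit vector orthogonal to all of them. The code below is an illustrative, dependency-free version using Gauss-Jordan elimination; the function names and the tolerance are assumptions, and a linear-algebra library would normally be used instead.

```python
import math
import random


def hyperplane_normal(points, tol=1e-12):
    """Unit normal to the hyperplane through p points in R^p (None if degenerate)."""
    p = len(points[0])
    base = points[0]
    rows = [[a - b for a, b in zip(pt, base)] for pt in points[1:]]  # (p-1) x p
    pivots = []                      # pivot column of each reduced row
    r = 0
    for c in range(p):
        if r == len(rows):
            break
        best = max(range(r, len(rows)), key=lambda i: abs(rows[i][c]))
        if abs(rows[best][c]) < tol:
            continue                 # no usable pivot in this column
        rows[r], rows[best] = rows[best], rows[r]
        piv = rows[r][c]
        rows[r] = [val / piv for val in rows[r]]
        for i in range(len(rows)):
            if i != r:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    if len(pivots) < p - 1:
        return None                  # points are not in general position
    free = next(c for c in range(p) if c not in pivots)
    v = [0.0] * p
    v[free] = 1.0                    # null-space vector of the reduced system
    for row, c in zip(rows, pivots):
        v[c] = -row[free]
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def sample_directions(sample, m, seed=0):
    """Draw m unit normals to hyperplanes through p randomly chosen sample points.

    Assumes the sample is in general position; otherwise this loop may not end.
    """
    rng = random.Random(seed)
    p = len(sample[0])
    dirs = []
    while len(dirs) < m:
        v = hyperplane_normal(rng.sample(sample, p))
        if v is not None:
            dirs.append(v)
    return dirs


rng = random.Random(7)
sample = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(50)]
directions = sample_directions(sample, m=10)
```

Directions built this way transform consistently with the data under affine maps, which is why this scheme preserves affine invariance of the finite-direction approximation, whereas directions drawn independently of the data do not.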

5. Comparative Performance in Classification

SPD is particularly effective in classification tasks where class distributions exhibit skewness. In Simulation Setting 2 of Hubert et al. (2015), with one Normal and one highly right-skewed exponential class (both 6-variate), the DistSpace transform with $k$-nearest neighbor classification was compared across three depth-based outlyingness measures:

Method | Misclassification Rate (%)
DistSpace + kNN (SPD) | [value missing]
DistSpace + kNN (PD) | [value missing]
DistSpace + kNN (SDO) | [value missing]

SPD halved the error of PD. This improvement is attributed to the skew-adaptive $\mathrm{AO}_1$ fences, which avoid excessive contraction of the heavy-tailed region and so preserve discrimination under skew and heavy tails. The performance gain indicates that SPD’s asymmetric treatment of outlyingness matters for depth-based learning in non-symmetric populations (Hubert et al., 2015).
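The classification pipeline can be illustrated with a toy sketch. It assumes the usual reading of the DistSpace transform, namely that each observation is mapped to its vector of outlyingness values with respect to every class, after which kNN is applied in that transformed space. To stay short, the example is univariate and uses a symmetric stand-in outlyingness, $|x - \mathrm{med}| / \mathrm{MAD}$, rather than the skew-adjusted AO; all names and data are hypothetical.

```python
import statistics
from collections import Counter


def robust_outlyingness(x, sample):
    """Stand-in univariate outlyingness |x - med| / MAD (not the skew-adjusted AO)."""
    med = statistics.median(sample)
    mad = statistics.median(abs(v - med) for v in sample)
    return abs(x - med) / mad


def dist_space(x, classes, outlyingness=robust_outlyingness):
    """Map x to its vector of outlyingness values, one coordinate per class."""
    return [outlyingness(x, classes[label]) for label in sorted(classes)]


def knn_predict(x, classes, k=3, outlyingness=robust_outlyingness):
    """Majority vote among the k nearest training points in the transformed space."""
    target = dist_space(x, classes, outlyingness)
    scored = []
    for label, sample in classes.items():
        for y in sample:
            feat = dist_space(y, classes, outlyingness)
            d2 = sum((a - b) ** 2 for a, b in zip(target, feat))
            scored.append((d2, label))
    scored.sort()
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Made-up training data for two well-separated classes.
classes = {
    "a": [0.0, 0.2, 0.4, 0.5, 0.7, 0.9, 1.1],
    "b": [4.0, 4.3, 4.5, 4.8, 5.0, 5.4, 5.9],
}
label_low = knn_predict(0.6, classes)
label_high = knn_predict(4.9, classes)
```

Swapping `robust_outlyingness` for a skew-adjusted outlyingness changes only the feature map; the kNN step is untouched, which is what makes the three variants in the comparison directly comparable.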

6. Summary and Significance

Skew-adjusted projection depth merges projection-based multivariate depth geometry with univariate, medcouple-driven skew adjustment, providing a robust, affine-invariant, and computationally feasible depth measure. SPD is explicitly constructed for scenarios where classical symmetric data depth fails due to skewness or heavy-tailedness. It is straightforward to estimate ($O(p^2 n)$ per query), deploys robust statistics at every stage, and can significantly reduce misclassification error in statistical learning pipelines. SPD exemplifies how robust univariate summaries combined with directional analysis yield substantial practical and theoretical gains in multivariate and functional data analysis (Hubert et al., 2015).
