Fraction of Borderline Points (N1)

Updated 18 May 2026
  • Fraction of Borderline Points (N₁) is a metric that quantifies the proportion of training points critical for defining class boundaries in nearest-neighbor classification.
  • It measures boundary complexity by identifying the points that define non-empty, class-separating Voronoi facets, reflecting the geometric and combinatorial structure of the dataset.
  • Algorithmic improvements enable efficient discovery of these border points, aiding classifier robustness assessment and dataset reduction strategies.

The Fraction of Borderline Points (N₁) is a metric in nearest-neighbor classification denoting the proportion of training samples that are essential for defining the classifier's decision boundaries. These points, variously termed "border points" or "relevant points", are those whose removal would alter the classifier's output for at least one query in $\mathbb{R}^d$. The N₁ statistic provides a quantitative index of the geometric and combinatorial complexity of the class boundaries in a given dataset (Flores-Velazco, 2022).

1. Formal Definition of Borderline (Relevant) Points

Given a labeled dataset $P \subset \mathbb{R}^d$ of size $n$, with class labels $c(p)$ for each $p \in P$, a point $p \in P$ is deemed a border (or relevant) point if there exists another sample $\hat p \in P$ with $c(\hat p) \neq c(p)$ and a query $q \in \mathbb{R}^d$ for which

$$\|q - p\| = \|q - \hat p\| < \|q - r\| \quad \text{for all} \; r \in P \setminus \{p, \hat p\}.$$

This condition holds exactly when $p$ and $\hat p$ span a non-empty $(d-1)$-dimensional Voronoi face (a "wall") that separates regions associated with different class labels. Alternatively, a point is relevant if its deletion from $P$ would change the classification of some query $q$ under the nearest-neighbor classifier. The relevant points are therefore exactly those lying on class-separating facets of the Voronoi diagram of $P$ (Flores-Velazco, 2022).
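The definition can be checked directly on a small planar example. The following is a minimal brute-force sketch, not the method of the cited paper: it assumes $d = 2$ and integer coordinates (so that exact rational arithmetic applies), and the names `shares_wall` and `border_points` are hypothetical. For each pair of differently labeled points it asks whether some center on their perpendicular bisector is strictly closer to the pair than to every other point.

```python
from fractions import Fraction
from itertools import combinations

def shares_wall(p, q, points):
    """True if some query is equidistant to p and q and strictly
    closer to them than to every other point (planar case)."""
    mx, my = Fraction(p[0] + q[0], 2), Fraction(p[1] + q[1], 2)  # midpoint m
    ux, uy = -(q[1] - p[1]), q[0] - p[0]      # direction u of the bisector
    half_sq = (p[0] - mx) ** 2 + (p[1] - my) ** 2
    lo = hi = None   # open interval of feasible t for the center m + t*u
    for r in points:
        if r == p or r == q:
            continue
        a = (r[0] - mx) * ux + (r[1] - my) * uy
        b = ((r[0] - mx) ** 2 + (r[1] - my) ** 2 - half_sq) / 2
        # r is strictly farther from the center than p iff t * a < b
        if a == 0:
            if b <= 0:
                return False
        elif a > 0:
            hi = b / a if hi is None else min(hi, b / a)
        else:
            lo = b / a if lo is None else max(lo, b / a)
    return lo is None or hi is None or lo < hi

def border_points(points, labels):
    """Indices of points that share a wall with a differently labeled point."""
    border = set()
    for i, j in combinations(range(len(points)), 2):
        if labels[i] != labels[j] and shares_wall(points[i], points[j], points):
            border.update((i, j))
    return sorted(border)

# Illustrative data: two classes, one "interior" point in each.
points = [(0, 0), (0, 2), (-3, 1), (4, 0), (5, 2), (8, 1)]
labels = [0, 0, 0, 1, 1, 1]
print(border_points(points, labels))  # [0, 1, 3, 4]
```

The brute-force check takes cubic time in $n$, so it is useful only as a reference implementation for validating faster methods on small inputs.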

2. Algorithmic Identification of the Border Set

Let $B \subseteq P$ denote the set of all border points, with $k = |B|$. The border set can be found by an output-sensitive search procedure, improving upon prior $O(n^2 + nk^2)$ algorithms with a method that avoids the initial $O(n^2)$ minimum spanning tree computation. The high-level steps are:

  1. Choose an arbitrary seed $s \in P$ (which need not be a border point) and initialize $B = \emptyset$.
  2. Starting from $s$, iterate until no new points enter $B$:
    • For each $p$ in $\{s\} \cup B$ that has not yet been processed:
      • Let $P_p$ be the subset of $P$ with the same class as $p$.
      • Invert the points of $P \setminus P_p$ through a sphere centered at $p$, yielding the set $P'$.
      • Find the extreme points of $P'$, using, e.g., Chan's output-sensitive convex hull algorithm.
      • Map these extreme points back to $P$ and add them to $B$.
  3. Output $B$ as the set of all border points.

The procedure relies on three properties: each inversion reports only actual border points (endpoints of bichromatic Voronoi walls); repeated inversion discovers every connected boundary component completely; and the search can cross regions of a single class to reach disconnected walls, so a single run from one seed finds the entire border set (Flores-Velazco, 2022).
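The search can be sketched for the planar case as follows. This is an illustrative sketch under stated assumptions rather than the paper's implementation: it assumes $d = 2$ and integer coordinates, swaps Chan's output-sensitive hull for Andrew's monotone chain (simpler, but not output-sensitive), and adds the inversion center itself to the hull input so that only images on the far side of the center count as extreme. All function names are hypothetical.

```python
from fractions import Fraction

def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(pts):
    """Strict convex hull vertices via Andrew's monotone chain
    (a stand-in for Chan's algorithm in this sketch)."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts
    def half(seq):
        chain = []
        for x in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], x) <= 0:
                chain.pop()
            chain.append(x)
        return chain[:-1]
    return half(pts) + half(reversed(pts))

def invert(x, center):
    """Inversion through the unit circle centered at `center` (exact)."""
    dx, dy = x[0] - center[0], x[1] - center[1]
    s = Fraction(1, dx * dx + dy * dy)
    return (center[0] + dx * s, center[1] + dy * s)

def find_border(points, labels, seed=0):
    border, todo, done = set(), [seed], set()
    while todo:
        i = todo.pop()
        if i in done:
            continue
        done.add(i)
        p = points[i]
        # Invert every point whose class differs from that of p.
        image = {invert(points[j], p): j
                 for j in range(len(points)) if labels[j] != labels[i]}
        # Assumption of this sketch: p is included in the hull input.
        for v in hull_vertices(list(image) + [p]):
            j = image.get(v)
            if j is not None and j not in border:
                border.add(j)
                todo.append(j)
    return sorted(border)

points = [(0, 0), (0, 2), (-3, 1), (4, 0), (5, 2), (8, 1)]
labels = [0, 0, 0, 1, 1, 1]
print(find_border(points, labels, seed=0))  # [0, 1, 3, 4]
```

On this example the result does not depend on the seed: starting from the non-border point at index 2 yields the same set, and the seed itself is not reported.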

3. Computational Complexity and Implementation Details

The algorithmic bottleneck is finding extreme points in high-dimensional inversion sets. Each inversion operation costs $O(nk)$, due to $n$ points and up to $k$ extreme points per inversion, with at most $k$ border points requiring expansion. Thus, the total runtime is $O(nk^2)$. For $d = 3$, using Chan's randomized hull algorithm gives an expected runtime of $O(nk \log k)$. In general, for fixed $d$, the time complexity becomes

$$O(nk^2).$$
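As a back-of-the-envelope check, the bound can be derived in one line. This sketch assumes a cost of $O(nk)$ per inversion and at most $k + 1$ expanded points (the seed plus the $k$ border points), and compares the result with an $O(n^2 + nk^2)$ bound that includes an initial quadratic-time step:

$$
T(n,k) \;=\; \sum_{i=1}^{k+1} O(nk) \;=\; O(nk^2),
\qquad
\frac{n^2 + nk^2}{nk^2} \;=\; 1 + \frac{n}{k^2}.
$$

The ratio suggests that removing the $O(n^2)$ term matters most when $k \ll \sqrt{n}$, the regime in which the quadratic term would otherwise dominate.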

Key implementation concerns include numerically stable sphere inversion and efficient convex hull or extreme-point routines in $d$ dimensions. Randomized routines, such as Chan's, provide practical improvements, but their running-time bounds hold only in expectation. For large datasets, subsampling or approximate extreme-point queries can yield an approximate border set, as can fast approximate nearest-neighbor structures that speed up inversion-related emptiness checks (Flores-Velazco, 2022).

4. The N₁ Statistic: Definition and Interpretation

Once the border set $B$ is determined, the fraction of borderline points

$$N_1 = \frac{|B|}{n} = \frac{k}{n}$$

is calculated, where $k = |B|$. This metric quantifies the fraction of training data lying precisely on class-separating Voronoi facets and thus actually influencing the classifier's decisions in $\mathbb{R}^d$.

Interpretively, $N_1$ is an index of boundary complexity:

  • $N_1 \approx 0$ indicates simple, well-separated classes with few true border points.
  • $N_1 \approx 1$ implies highly interwoven or noisy class structure, where most samples are critical to correct classification, and nearest-neighbor classifiers may be fragile.
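The two regimes are easiest to see in one dimension, where a Voronoi wall is simply the midpoint between two neighboring points. The following minimal sketch assumes distinct scalar coordinates; the function names are hypothetical. A point is a border point exactly when one of its neighbors in sorted order carries a different label.

```python
def border_points_1d(xs, labels):
    """Indices of 1D border points: sorted neighbors with different labels."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    border = set()
    for i, j in zip(order, order[1:]):
        if labels[i] != labels[j]:
            border.update((i, j))
    return border

def n1(xs, labels):
    """Fraction of borderline points, k / n."""
    return len(border_points_1d(xs, labels)) / len(xs)

xs = [0, 1, 2, 5, 6, 7]
separated = [0, 0, 0, 1, 1, 1]     # two well-separated clusters
interleaved = [0, 1, 0, 1, 0, 1]   # labels alternate along the line

print(n1(xs, separated))    # 2 of 6 points are border points
print(n1(xs, interleaved))  # every point is a border point
```

With well-separated clusters only the two points facing each other across the gap are border points, whereas alternating labels make every point relevant.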

Common practice assumes general position (no $d + 2$ points co-spherical) to ensure Voronoi faces are well-defined (Flores-Velazco, 2022).

5. Empirical Behavior and Practical Significance

The behavior of the number of border points $k$ (and thus $N_1 = k/n$) is dataset-dependent:

  • In pathological worst-case scenarios, such as highly interleaved or noisy class distributions yielding $k = \Theta(n)$, the algorithm's $O(nk^2) = O(n^3)$ complexity is prohibitive.
  • For most real-world datasets where classes cluster or boundaries are low-dimensional, $k \ll n$, making the procedure feasible and $N_1$ a meaningful regularity indicator.
  • In large-scale regimes, approximate algorithms or subsampling provide practical estimates of $N_1$.

Interpreting $N_1$ as a boundary complexity index makes it diagnostically useful for evaluating the suitability of nearest-neighbor classifiers and for informing dataset reduction strategies by identifying the minority of points truly necessary for boundary accuracy (Flores-Velazco, 2022).

6. Summary Table

| Quantity | Definition | Interpretation |
|---|---|---|
| $B$ | Set of all border (relevant) points | Points defining class-separating Voronoi facets |
| $k = \lvert B \rvert$ | Number of border points | Critical size parameter for complexity and runtime |
| $N_1 = k/n$ | Fraction of border points | Index of class boundary complexity |

The fraction of borderline points (N₁), defined as $N_1 = k/n$ with $k$ the number of border points and $n$ the size of the training set, is thus a precisely characterized, geometrically motivated, and computationally tractable metric for assessing and exploiting the structure of training sets in nearest-neighbor classification (Flores-Velazco, 2022).
