Fraction of Borderline Points (N1)

Updated 18 May 2026

Fraction of Borderline Points (N₁) is a metric that quantifies the proportion of training points critical for defining class boundaries in nearest-neighbor classification.
It measures boundary complexity by identifying the points that lie on class-separating Voronoi facets, reflecting the geometric and combinatorial structure of the dataset.
Algorithmic improvements enable efficient discovery of these border points, aiding classifier robustness assessment and dataset reduction strategies.

The Fraction of Borderline Points (N₁) is a metric in nearest-neighbor classification denoting the proportion of training samples that are essential for defining the decision boundaries of a classifier. These key points, variously termed "border points" or "relevant points", are those whose removal would alter the classifier’s output for at least one query in $\mathbb{R}^d$. The N₁ statistic provides a quantitative index of the geometric and combinatorial complexity of the class boundaries in a given dataset (Flores-Velazco, 2022).

1. Formal Definition of Borderline (Relevant) Points

Given a labeled dataset $P \subset \mathbb{R}^d$ of size $n$, with class labels $c(p)$ for each $p \in P$, a point $p \in P$ is deemed a border (or relevant) point if there exists another sample $\hat p \in P$ with $c(\hat p) \neq c(p)$ and a query $q \in \mathbb{R}^d$ for which

$\|q - p\| = \|q - \hat p\| < \|q - r\| \quad \text{for all} \; r \in P \setminus \{p, \hat p\}.$

This condition holds exactly when $p$ and $\hat p$ share a non-empty $(d-1)$-dimensional Voronoi face (a "wall") that separates regions associated with different class labels. Alternatively, a point is relevant if its deletion from $P$ would result in misclassification of some query $q$ under the nearest-neighbor classifier. This equates the concept with those points lying on class-separating facets of the Voronoi diagram of $P$ (Flores-Velazco, 2022).
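
The definition can be checked directly on small planar inputs. The following is a minimal brute-force sketch for $d = 2$, not the method of the source: the function names are hypothetical, the arithmetic is plain floating point, and the running time is cubic in $n$. It parametrizes every candidate query as $q = m + t\,u$ on the bisector of a bichromatic pair and tests whether some $t$ keeps all other points strictly farther away.

```python
from itertools import combinations


def share_wall(i, j, pts):
    """True if some query q is equidistant from pts[i] and pts[j] and strictly
    farther from every other point (the defining condition, for d = 2)."""
    (px, py), (qx, qy) = pts[i], pts[j]
    mx, my = (px + qx) / 2, (py + qy) / 2        # midpoint m of the pair
    ux, uy = -(qy - py), qx - px                 # direction u of the bisector
    base = (mx - px) ** 2 + (my - py) ** 2       # squared half-distance
    lo, hi = float("-inf"), float("inf")         # feasible t for q = m + t*u
    for k, (rx, ry) in enumerate(pts):
        if k in (i, j):
            continue
        # r is strictly farther than p from q(t) exactly when a*t < b
        a = 2 * (ux * (rx - mx) + uy * (ry - my))
        b = (rx - mx) ** 2 + (ry - my) ** 2 - base
        if a > 0:
            hi = min(hi, b / a)
        elif a < 0:
            lo = max(lo, b / a)
        elif b <= 0:
            return False                         # r blocks every q on the bisector
    return lo < hi


def border_points_bruteforce(pts, labels):
    """Indices of points that share a wall with a point of another class."""
    border = set()
    for i, j in combinations(range(len(pts)), 2):
        if labels[i] != labels[j] and share_wall(i, j, pts):
            border.update((i, j))
    return sorted(border)


# One class-0 point ringed by four class-1 points, plus a far class-1 point.
pts = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (5, 0)]
labels = [0, 1, 1, 1, 1, 1]
print(border_points_bruteforce(pts, labels))     # [0, 1, 2, 3, 4]
```

The far point at $(5, 0)$ is not reported because $(1, 0)$ lies inside every circle through $(0, 0)$ and $(5, 0)$, so no query satisfies the strict inequality for that pair.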

2. Algorithmic Identification of the Border Set

Let $R \subseteq P$ denote the set of all border points, with $k = |R|$. The border set can be found by an output-sensitive search procedure that improves on the prior $O(n^2 + nk^2)$ algorithm by avoiding its initial $O(n^2)$ minimum spanning tree computation. The high-level steps are:

Choose an arbitrary seed $s \in P$ and initialize the border set $R = \emptyset$.
Iterate until no new points enter $R$:
- For each point $p$ in $R \cup \{s\}$ that has not yet been processed:
  - Let $P_p$ be the subset of $P$ whose class differs from $c(p)$.
  - Invert the points of $P_p$ through a sphere centered at $p$, yielding the set $P_p'$.
  - Find the extreme points of $P_p'$, using, e.g., Chan’s output-sensitive convex hull algorithm.
  - Map these extreme points back to $P$ and add them to $R$.
Output $R$ as the set of all border points.
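
The steps above can be sketched for planar inputs as follows. This is an illustrative sketch under stated assumptions rather than the implementation of the source: it works only for $d = 2$, inverts the points whose class differs from that of the current point, uses Andrew's monotone chain in place of Chan's algorithm, relies on exact `Fraction` arithmetic, and assumes distinct input points. It also adds the inversion center itself to the hull input, an assumption made here so that a point hidden behind a closer neighbor is not reported. All function names are hypothetical.

```python
from collections import deque
from fractions import Fraction


def invert(q, p):
    """Image of q under inversion through the unit circle centered at p."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    s = dx * dx + dy * dy
    return (p[0] + dx / s, p[1] + dy / s)


def hull_vertices(items):
    """Strict convex-hull vertices of (x, y, tag) triples (monotone chain)."""
    items = sorted(items)
    if len(items) <= 2:
        return items

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def chain(seq):
        out = []
        for it in seq:
            while len(out) >= 2 and cross(out[-2], out[-1], it) <= 0:
                out.pop()
            out.append(it)
        return out

    lower, upper = chain(items), chain(reversed(items))
    return lower[:-1] + upper[:-1]


def border_points(points, labels, seed=0):
    """Search for border points starting from an arbitrary seed index."""
    pts = [(Fraction(x), Fraction(y)) for x, y in points]   # exact arithmetic
    border, scheduled, queue = set(), {seed}, deque([seed])
    while queue:
        i = queue.popleft()
        p = pts[i]
        # Invert every point whose class differs from that of p.
        inverted = [invert(pts[j], p) + (j,)
                    for j in range(len(pts)) if labels[j] != labels[i]]
        # Assumption of this sketch: p (tag -1) joins the hull input.
        for _, _, j in hull_vertices(inverted + [(p[0], p[1], -1)]):
            if j >= 0:
                border.add(j)
                if j not in scheduled:
                    scheduled.add(j)
                    queue.append(j)
    return sorted(border)


points = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1), (5, 0)]
labels = [0, 1, 1, 1, 1, 1]
print(border_points(points, labels, seed=5))   # [0, 1, 2, 3, 4]
```

The seed is only scheduled for processing and enters the output if some other point's inversion reports it, which is why starting from the non-border point at index 5 still returns the same set as starting from index 0.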

The procedure guarantees three things: every point reported by an inversion is an actual border point (an endpoint of a bichromatic Voronoi wall); repeated inversion discovers each connected boundary component completely; and the search can cross a region occupied by a single class to reach walls that are disconnected from those already found. Together these properties allow a single pass from an arbitrary seed to discover the entire border set (Flores-Velazco, 2022).

3. Computational Complexity and Implementation Details

The algorithmic bottleneck is finding the extreme points of the inverted sets in high dimensions. Writing $k$ for the number of border points, each inversion step costs $O(nk)$, since it handles $n$ points and reports up to $k$ extreme points, and at most $k$ border points require expansion. Thus, the total runtime is $O(nk^2)$. For $d = 3$, using Chan’s randomized hull algorithm gives an expected runtime of $O(nk \log k)$. In general, for fixed $d$, the total is governed by $O(k)$ extreme-point computations, each on at most $n$ inverted points with at most $k$ extreme points.
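
As a rough illustration of what dropping the $O(n^2)$ preprocessing step buys, the following back-of-the-envelope comparison of the earlier $O(n^2 + nk^2)$ bound with the improved $O(nk^2)$ bound plugs in hypothetical sizes $n = 10^5$ and $k = 10^2$ (values chosen here for illustration, not taken from the source) and ignores the constants hidden by the $O$-notation:

$n^2 + nk^2 = 10^{10} + 10^{5} \cdot 10^{4} = 1.1 \times 10^{10}, \qquad nk^2 = 10^{5} \cdot 10^{4} = 10^{9}.$

Under these assumptions the quadratic term dominates the earlier bound, so the improved bound is roughly an order of magnitude smaller. The ratio of the two expressions is $n/k^2 + 1$, so the gap widens as $k$ shrinks relative to $\sqrt{n}$.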

Key implementation concerns include numerically stable sphere inversion and efficient convex hull or extreme-point routines in $d$ dimensions. Randomized routines, such as Chan’s, provide practical improvements but depend on random sampling. For large datasets, subsampling or approximate extreme-point queries can yield an approximate border set, as can deployment of fast approximate nearest-neighbor structures to speed up inversion-related emptiness checks (Flores-Velazco, 2022).
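
A minimal sketch of sphere inversion in arbitrary dimension follows; the function name and the `radius` parameter are hypothetical. Changing the radius only rescales the image about the center, so it does not change which inverted points are extreme. Choosing a radius close to the typical distance from the center is one possible way, assumed here rather than taken from the source, to keep floating-point coordinates in a moderate range.

```python
import math


def invert_through_sphere(q, center, radius=1.0):
    """Map q to the point on the ray from center through q whose distance
    from center is radius**2 / |q - center|."""
    d2 = sum((qi - ci) ** 2 for qi, ci in zip(q, center))
    if d2 == 0.0:
        raise ValueError("the center itself has no image under inversion")
    scale = radius * radius / d2
    return tuple(ci + scale * (qi - ci) for qi, ci in zip(q, center))


origin = (0.0, 0.0, 0.0)
q = (3.0, 4.0, 0.0)                        # distance 5 from the origin
image = invert_through_sphere(q, origin)   # distance 1/5 from the origin
back = invert_through_sphere(image, origin)
print(math.dist(image, origin), math.dist(back, origin))
```

Applying the map twice returns the original point up to rounding, which gives a cheap sanity check for an implementation.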

4. The N₁ Statistic: Definition and Interpretation

Once the border set $R$ is determined, the fraction of borderline points

$N_1 = \frac{|R|}{|P|} = \frac{k}{n}$

is calculated, where $k = |R|$ is the number of border points and $n = |P|$ is the size of the training set. This metric quantifies the fraction of training data lying precisely on class-separating Voronoi facets and thus actually influencing the classifier’s decisions in $\mathbb{R}^d$.

Interpretively, $N_1$ is an index of boundary complexity:

$N_1 \ll 1$ indicates simple, well-separated classes with few true border points.
$N_1 \approx 1$ implies highly interwoven or noisy class structure, where most samples are critical to correct classification, and nearest-neighbor classifiers may be fragile.
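
As a worked instance with hypothetical numbers (chosen here for illustration only), a training set of $n = 200$ points of which $k = 30$ are border points gives

$N_1 = \frac{k}{n} = \frac{30}{200} = 0.15,$

which sits toward the simple end of the scale: 85% of the samples could be removed without changing the nearest-neighbor classification of any query.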

Common practice assumes general position (no $d+2$ points co-spherical) to ensure Voronoi faces are well-defined (Flores-Velazco, 2022).
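
For planar data, the general-position assumption can be tested exhaustively on small inputs. The sketch below is an illustration for $d = 2$ only, with hypothetical function names; it assumes integer (or otherwise exact) coordinates and uses the standard in-circle determinant, which vanishes when four points lie on a common circle and also when they lie on a common line. Both cases are treated as degenerate here.

```python
from itertools import combinations


def cocircular(a, b, c, d):
    """Exact test: do four planar points lie on one circle or one line?"""
    m = [(x - d[0], y - d[1]) for x, y in (a, b, c)]     # translate d to origin
    m = [(x, y, x * x + y * y) for x, y in m]            # lift to the paraboloid
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return det == 0


def in_general_position_2d(pts):
    """True if no four of the points are cocircular (checks every 4-subset)."""
    return not any(cocircular(*quad) for quad in combinations(pts, 4))


print(in_general_position_2d([(0, 0), (1, 0), (1, 1), (0, 1)]))   # False
print(in_general_position_2d([(0, 0), (1, 0), (1, 1), (0, 2)]))   # True
```

The check enumerates all 4-subsets, so it is meant for diagnosing small inputs rather than for use inside the search itself.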

5. Empirical Behavior and Practical Significance

The behavior of the number of border points $k$ (and thus $N_1$) is dataset-dependent:

In pathological worst-case scenarios, such as highly interleaved or noisy class distributions yielding $k = \Theta(n)$, the algorithm’s $O(nk^2) = O(n^3)$ complexity is prohibitive.
For most real-world datasets where classes cluster or boundaries are low-dimensional, $k \ll n$, making the procedure feasible and $N_1$ a meaningful regularity indicator.
In large-scale regimes, approximate algorithms or subsampling provide practical estimates of $N_1$.
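
The two extremes are easy to reproduce in one dimension, where the Voronoi neighbors of a point are simply its predecessor and successor in sorted order. The sketch below is a toy illustration with hypothetical function names and synthetic labels, assuming distinct coordinates; it is not an experiment from the source.

```python
def border_points_1d(xs, labels):
    """Border points of a 1-D training set: a point is a border point if an
    adjacent point in sorted order carries a different label."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    border = set()
    for a, b in zip(order, order[1:]):
        if labels[a] != labels[b]:
            border.update((a, b))
    return sorted(border)


def n1(xs, labels):
    """Fraction of border points, k / n."""
    return len(border_points_1d(xs, labels)) / len(xs)


xs = list(range(100))
clustered = [0] * 50 + [1] * 50             # two well-separated blocks
interleaved = [i % 2 for i in range(100)]   # labels alternate point by point
print(n1(xs, clustered))     # 0.02
print(n1(xs, interleaved))   # 1.0
```

With clustered labels only the two points flanking the single class change are border points, while with alternating labels every point is one, which corresponds to the $k = \Theta(n)$ worst case.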

Practical interpretation of $N_1$ as a boundary complexity index makes it diagnostically useful for evaluating the suitability of nearest-neighbor classifiers and for informing dataset reduction strategies by identifying the minority of points truly necessary for boundary accuracy (Flores-Velazco, 2022).

6. Summary Table

Quantity	Definition	Interpretation
$R$	Set of all border (relevant) points	Points defining class-separating Voronoi facets
$k = |R|$	Number of border points	Critical size parameter for complexity and runtime
$N_1 = k/n$	Fraction of border points	Index of class boundary complexity

The fraction of borderline points (N₁), defined as $N_1 = k/n$, is thus a precisely characterized, geometrically motivated, and computationally tractable metric for assessing and exploiting the structure of training sets in nearest-neighbor classification (Flores-Velazco, 2022).

References

Flores-Velazco (2022). Improved Search of Relevant Points for Nearest-Neighbor Classification.
