Condensed Nearest Neighbour (CNN)
- Condensed Nearest Neighbour (CNN) is a prototype selection method that reduces training set size while preserving 1-NN classification accuracy.
- The algorithm iteratively adds misclassified points to form a condensed set, ensuring perfect classification on training data.
- Theoretical bounds link the number of prototypes to data geometry, and variants like WNN and SFCNN offer enhanced efficiency and compression.
Condensed Nearest Neighbour (CNN) is a class of prototype selection techniques designed to reduce the size of training sets for nearest-neighbour (NN) classification. The canonical CNN algorithm operates by iteratively collecting a subset of the original data such that the reduced set (the condensed set) yields perfect classification of all training points under the 1-NN rule. This process is foundational in prototype reduction and memory-constrained learning, and it provides a well-defined testbed for theory at the intersection of learning geometry, combinatorial optimization, and computational efficiency.
1. Formal Description and Algorithmic Workflow
Let $S = \{(x_i, y_i)\}_{i=1}^n \subset \mathcal{X} \times \mathcal{Y}$, where $\mathcal{Y}$ is a finite label set. The CNN algorithm constructs a prototype set $P \subseteq S$ such that the 1-NN classifier induced by $P$ is consistent with the original training set: for every $(x_i, y_i) \in S$, the label assigned by the prototype set’s nearest-neighbour operator equals $y_i$. The standard algorithm (Christiansen, 2013, Bhatia et al., 2010) proceeds as follows:
CNN iterates through $S$ in fixed order: each training point misclassified under the current $P$ is added to $P$. Since every pass either adds at least one point or triggers termination, the process terminates after at most $n$ additions, and at convergence $P$ is consistent with respect to $S$.
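The canonical loop can be sketched as follows; a minimal illustration assuming a Euclidean metric and a linear-scan nearest-neighbour search:

```python
import numpy as np

def cnn_condense(X, y):
    """Hart-style condensed nearest neighbour (sketch).

    Repeatedly scans the training set in fixed order; any point that the
    current prototype set misclassifies under 1-NN is added. Stops when a
    full pass adds nothing, so the result is consistent with (X, y).
    """
    proto_idx = [0]                      # seed with the first point
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            P = X[proto_idx]
            nearest = int(np.argmin(np.linalg.norm(P - X[i], axis=1)))
            if y[proto_idx][nearest] != y[i]:
                proto_idx.append(i)      # misclassified: keep as prototype
                changed = True
    return np.array(proto_idx)

# Two well-separated 1-D clusters: one prototype per cluster suffices.
X = np.array([[0.0], [0.1], [1.0], [1.1]])
y = np.array([0, 0, 1, 1])
idx = cnn_condense(X, y)
```

On this toy set the scan keeps the first point of each cluster, so the condensed set has two prototypes and classifies all four training points correctly.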
2. Theoretical Properties and Upper Bounds
CNN guarantees termination, but its storage complexity is nontrivial to analyze. Christiansen (Christiansen, 2013) establishes a margin-based upper bound via a connection to the multiclass perceptron mistake bound. For each “neighborly” feature map $\phi$ (satisfying that the 1-NN rule is reproduced via a restricted class of linear multiclass perceptron decision functions), define:
- Separability margin $\gamma(\phi)$: the largest $\gamma > 0$ for which there exist unit-norm class weight vectors $\{w_c\}_{c \in \mathcal{Y}}$ such that for all $(x, y) \in S$ and all $y' \in \mathcal{Y} \setminus \{y\}$,
$\langle w_y - w_{y'}, \phi(x) \rangle \geq \gamma.$
- Radius $R(\phi)$: the maximal feature norm,
$R(\phi) = \max_{(x, y) \in S} \|\phi(x)\|.$
The main result is that CNN accumulates at most
$\inf_{\phi} \, O\!\left( \left( R(\phi) / \gamma(\phi) \right)^2 \right)$
prototypes, where the infimum is over all neighborly feature maps $\phi$. This bound is independent of the training set size and depends only on geometric characteristics of the data embedding (Christiansen, 2013).
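To make these quantities concrete, the following sketch evaluates the radius and margin for the identity feature map on a toy binary problem with a hand-chosen unit-norm separator; the exact constant in front of $(R/\gamma)^2$ depends on the perceptron variant and is elided here, so `bound` is indicative only:

```python
import numpy as np

# Toy binary problem under the identity feature map phi(x) = x.
X = np.array([[2.0, 0.0], [3.0, 1.0], [-2.0, 0.0], [-3.0, -1.0]])
y = np.array([1, 1, -1, -1])

w = np.array([1.0, 0.0])                  # unit-norm linear separator
R = np.max(np.linalg.norm(X, axis=1))     # radius: max feature norm
gamma = np.min(y * (X @ w))               # smallest realized margin under w
bound = (R / gamma) ** 2                  # perceptron-style (R/gamma)^2 quantity
```

Here $R = \sqrt{10}$ and $\gamma = 2$, so the ratio $(R/\gamma)^2 = 2.5$: a well-separated embedding keeps the prototype bound small regardless of how many points are drawn from the two clusters.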
3. Algorithmic Complexity and Practical Implementation
Let $m$ denote the final size of the condensed set $P$. Each pass through $S$ examines $n$ points, each requiring a nearest-neighbour search in $P$ (cost $O(m)$ per query for a linear scan, typically $O(\log m)$ for spatial index structures). Typically $m \ll n$, so the total complexity is subquadratic in $n$. Major implementation specifics include:
- Prototype storage in spatial indices (e.g., kd-trees) for efficient NN queries.
- Initialization heuristics: seeding $P$ with one point per class improves boundary coverage.
- Order randomization: different scan orders can yield different condensed sets $P$, reflecting the algorithm’s recognized order-sensitivity.
- Early stopping: practical variants terminate if a pass yields too few additions.
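The listed tweaks can be folded into one routine; a sketch with per-class seeding, a caller-supplied scan order, and early stopping when a pass adds fewer than a threshold of new prototypes (the threshold name `min_additions` is illustrative):

```python
import numpy as np

def cnn_condense(X, y, order, seed_per_class=True, min_additions=1):
    """CNN with practical tweaks: per-class seeding, a caller-supplied
    scan order, and early stopping when a pass adds < min_additions."""
    if seed_per_class:
        proto = [int(np.flatnonzero(y == c)[0]) for c in np.unique(y)]
    else:
        proto = [int(order[0])]
    while True:
        added = 0
        for i in order:
            d = np.linalg.norm(X[proto] - X[i], axis=1)
            if y[proto][int(np.argmin(d))] != y[i]:
                proto.append(int(i))
                added += 1
        if added < min_additions:        # convergence / early stop
            break
    return np.array(proto)

# Different random scan orders can produce different condensed sets.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
results = [cnn_condense(X, y, rng.permutation(60)) for _ in range(3)]
sizes = {len(p) for p in results}
```

With the default `min_additions=1` the loop stops only after a pass adds nothing, so every returned set is consistent with the training data; larger thresholds trade consistency for speed.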
Empirically, CNN can reduce storage by an order-of-magnitude while maintaining high fidelity to the original classifier (Bhatia et al., 2010, Gottlieb et al., 2023).
4. Extensions, Variants, and Related Condensation Schemes
The nearest-neighbor condensation problem is NP-hard; seeking a minimum consistent subset is computationally intractable (Flores-Velazco et al., 2020). Notable variants and connections include:
- Fast CNN (FCNN) (Flores-Velazco, 2020): Adds batches of misclassified representatives per iteration. It can fail to guarantee polynomial-size upper bounds due to geometrically adversarial configurations.
- Social-distanced FCNN (SFCNN) (Flores-Velazco, 2020): Adds only a single new point per iteration, yielding a tight upper bound of $(1/\gamma)^{O(\mathrm{ddim})}$ on the condensed set size, where $\gamma$ is the margin and $\mathrm{ddim}$ is the doubling dimension of the underlying metric space.
- Weighted Distance Nearest-Neighbor Condensing (WNN) (Gottlieb et al., 2023): Each prototype is assigned an individual positive weight, and classification is by weighted distance. WNN can, in some cases, reduce the prototype set to a size exponentially smaller than what standard (unweighted) condensation permits, while retaining generalization guarantees equivalent to the unweighted case.
- Reduced Nearest Neighbour (RNN): Post-processes CNN’s output by greedily removing redundant prototypes, further shrinking the set at extra computational expense (Bhatia et al., 2010).
- Coreset approaches: $\varepsilon$-selective/consistent subsets can yield theoretical coresets for the NN rule, offering refined guarantees on approximation and subquadratic construction in fixed dimension (Flores-Velazco et al., 2020).
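The RNN post-pass described above is simple enough to state directly; a sketch that greedily attempts to drop each prototype and keeps the removal only if the survivors still classify the full training set correctly under 1-NN:

```python
import numpy as np

def rnn_reduce(X, y, proto_idx):
    """Reduced Nearest Neighbour post-pass (sketch): greedily drop each
    prototype if the remaining set stays consistent with all of (X, y)."""
    keep = list(proto_idx)
    for p in list(keep):                 # iterate over a snapshot
        trial = [q for q in keep if q != p]
        if not trial:
            continue
        consistent = all(
            y[trial][int(np.argmin(np.linalg.norm(X[trial] - X[i], axis=1)))] == y[i]
            for i in range(len(X))
        )
        if consistent:
            keep = trial                 # removal was safe
    return np.array(keep)

# A CNN-style consistent set with one redundant prototype (index 0).
X = np.array([[0.0], [0.1], [0.2], [1.0], [1.1]])
y = np.array([0, 0, 0, 1, 1])
reduced = rnn_reduce(X, y, [0, 1, 3])
```

Each removal attempt re-checks consistency against the entire training set, which is the extra computational expense the text refers to.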
5. Statistical and Geometric Insights
CNN’s upper bound and empirical efficacy are tightly linked to geometric data properties. A large separation margin leads directly to small prototype sets. The number of classes and feature dimension impact the bound only indirectly, through their influence on available feature maps and achievable margins. Conversely, densely packed, low-margin, or high-noise datasets force CNN to retain larger subsets. Order-sensitivity can cause relevant boundary points to be missed in a single pass, motivating multiple passes or ensemble practices (e.g., combining outputs from several random seeds) (Bhatia et al., 2010).
The extension to weighted distances (WNN) introduces an extra axis of flexibility that, in pathological constructions, can yield exponentially better compression (Gottlieb et al., 2023). Moreover, for general metric-space training data where the Bayes classifier has a positive margin and the feature space is separable, WNN and its greedy heuristics can achieve Bayes consistency.
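A weighted-distance classification rule can be sketched under one plausible convention, in which each prototype's distance is divided by its weight so that heavier prototypes claim larger regions; the precise convention used by Gottlieb et al. may differ in detail:

```python
import numpy as np

def wnn_predict(x, P, labels, weights):
    """Weighted-distance 1-NN (sketch). Assumes the scaled-distance
    convention d(x, p) / w_p; larger weight means a larger region of
    influence for that prototype."""
    scores = np.linalg.norm(P - x, axis=1) / weights
    return labels[int(np.argmin(scores))]

# Two 1-D prototypes: with enough weight, the farther prototype wins.
P = np.array([[0.0], [3.0]])
labels = np.array([0, 1])
pred = wnn_predict(np.array([1.0]), P, labels, np.array([1.0, 4.0]))
```

With equal weights the query at 1.0 would go to the prototype at 0.0; upweighting the prototype at 3.0 flips the decision, which is exactly the extra degree of freedom that lets weighted condensation cover a region with fewer prototypes.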
6. Practical Performance, Empirical Evidence, and Limitations
Experimental results demonstrate dramatic data reduction: on standard UCI datasets, CNN typically retains only 5–20% of the original set while preserving >95% of the 1-NN test accuracy; WNN and SFCNN can approach the theoretical optimum much more closely (Bhatia et al., 2010, Gottlieb et al., 2023, Flores-Velazco, 2020). For high-dimensional or complex data, performance remains dependent on the choice of distance metric and implementation of the base NN search.
The NP-hardness of optimal condensation means all practical methods are heuristic. For many real-world settings, boundary coverage is the principal challenge: CNN may miss necessary prototypes in poorly separated regions or under adversarial scan orders. This issue can be partly remedied by ensemble condensation schemes or by switching to weighted and relaxed variants.
7. Connections to Broader Literature and Applications
CNN is a central structureless approach to instance selection, distinct from structure-based NN acceleration methods such as kd-trees or cover-trees, which target query-time complexity rather than storage reduction (Bhatia et al., 2010). Prototype selection via CNN serves as a precursor to model compression, memory-efficient inference, and as an ingredient in coreset construction for scalable machine learning (Flores-Velazco et al., 2020). Its theoretical underpinnings, particularly the perceptron connection, clarify conditions for effective sample compression and inform methods in active learning and statistical geometry. The adaptability of condensation to weighted distances, as in WNN, opens further potential for tailored memory-reduction in complex metric spaces (Gottlieb et al., 2023).