Instance Hardness Ensemble Filtering

Updated 6 May 2026
  • Instance Hardness Ensemble Filtering is a method that uses metrics like kDN to measure the difficulty of data points and filter out noisy examples.
  • It integrates probabilistic sample weighting and dynamic ensemble selection to maintain informative boundary instances while reducing the impact of ambiguous samples.
  • Empirical results indicate improved accuracy in noisy datasets, with best practices emphasizing tuning of hardness thresholds and careful combination of multiple metrics.

Instance Hardness Ensemble Filtering (IHEF) is a family of methods in supervised machine learning that systematically exploits the concept of instance hardness to guide data selection, model training, or prediction routing within ensemble frameworks. IHEF techniques leverage quantitative measures of example-wise difficulty—most commonly, the k-Disagreeing Neighbors (kDN) metric—to bias training or inferential processes against noisy or ambiguous samples, thereby improving robustness and generalization in the presence of data irregularities and class-boundary complexity.

1. Formalization of Instance Hardness

Instance hardness quantifies the propensity of a data point $(\mathbf{x}_i, y_i)$ to be misclassified or predicted with high error by a pool of models, often capturing overlap, noise, or ambiguity in local regions of feature space. The most widely adopted family of metrics is the k-Disagreeing Neighbors (kDN) measure, defined for classification as

$$kDN(\mathbf{x}_i) = \frac{1}{k} \sum_{j \in NN_k(\mathbf{x}_i)} \mathbf{1}(y_j \neq y_i), \quad k = 5 \text{ typically}$$

where $NN_k(\mathbf{x}_i)$ denotes the $k$ nearest neighbors of $\mathbf{x}_i$ in input space. Values of $kDN$ near $0$ imply consensus among neighbors (“easy” instances), whereas values close to $1$ suggest boundary points or potential label noise (“hard” instances) (Walmsley et al., 2018; Torquette et al., 2022).
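The kDN definition above can be illustrated with a minimal brute-force sketch. It assumes Euclidean distance and excludes each point from its own neighborhood; the function name `kdn_scores` and the toy data are hypothetical and not taken from the cited papers.

```python
import math

def kdn_scores(X, y, k=5):
    """Return kDN for every instance: the fraction of its k nearest
    neighbours (excluding itself) that carry a different label."""
    n = len(X)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of instances")
    scores = []
    for i in range(n):
        # Brute-force Euclidean distances from instance i to all others.
        neighbours = sorted(
            (math.dist(X[i], X[j]), j) for j in range(n) if j != i
        )[:k]
        disagreeing = sum(1 for _, j in neighbours if y[j] != y[i])
        scores.append(disagreeing / k)
    return scores

# Toy data (made up): two tight clusters plus one point whose label
# contradicts the cluster it sits in.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
     (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
     (0.05, 0.05)]
y = [0, 0, 0, 0, 1, 1, 1, 1, 1]
hardness = kdn_scores(X, y, k=3)
```

In this toy set the contradicting point (index 8) has every neighbor disagreeing with it, while the points of the clean cluster have none. The pairwise search is quadratic in the number of instances, so it is only meant for small examples.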

Further instance hardness meta-features include Disjunct Class Percentage (DCP), Tree Depth (TD), Class Likelihood Difference (CLD), and geometric network statistics such as Ratio of Intra- vs. Extra-Class Distances (N2), Local-Set Cardinality (LSC), and others. In regression settings, analogous metrics assess error post-linear or local regression, distribution rarity, or output discontinuities (Torquette et al., 2022).

2. Instance Hardness in Ensemble Generation: Bagging-IH

The canonical instance hardness ensemble filter is Bagging-IH, an adaptation of bootstrap aggregation (Bagging) that probabilistically biases instance selection for base-model training in favor of lower-hardness points. For a training set $T$ of size $n$, Bagging-IH assigns each sample $\mathbf{x}_i$ a selection score that decreases as $kDN(\mathbf{x}_i)$ grows and includes a small uniform floor, then normalizes the scores to yield a sampling distribution over $T$. The uniform floor guarantees that even instances with $kDN(\mathbf{x}_i) = 1$ (maximal hardness) may still be sampled, although with reduced probability (Walmsley et al., 2018). Each base learner is trained on a bootstrap sample drawn from this distribution.

At inference, the Bagging-IH ensemble aggregates base learner predictions via majority vote. By design, Bagging-IH attenuates the influence of likely noisy points (high $kDN$) while retaining class-boundary instances with intermediate hardness due to the nonzero sampling floor.
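A rough sketch of hardness-biased bootstrap sampling follows. The score used here, $(1 - kDN) + 1/n$, is an assumption chosen only to show a score that falls with hardness and has a nonzero uniform floor; it should not be read as the exact expression from Walmsley et al. (2018). The names `bagging_ih_bootstraps` and `majority_vote` are hypothetical.

```python
import random
from collections import Counter

def bagging_ih_bootstraps(hardness, n_estimators=10, floor=None, seed=0):
    """Draw one hardness-biased bootstrap sample per base learner.

    hardness: per-instance kDN values in [0, 1].
    floor:    uniform additive floor (assumed default: 1/n).
    """
    n = len(hardness)
    if floor is None:
        floor = 1.0 / n
    # Assumed score: easy points (low kDN) score high, hard points keep
    # only the floor, so their probability is small but never zero.
    scores = [(1.0 - h) + floor for h in hardness]
    total = sum(scores)
    probs = [s / total for s in scores]
    rng = random.Random(seed)
    bags = [rng.choices(range(n), weights=probs, k=n)
            for _ in range(n_estimators)]
    return probs, bags

def majority_vote(predictions):
    """Combine the base learners' predictions for one query."""
    return Counter(predictions).most_common(1)[0][0]

# Made-up hardness values for four training instances.
probs, bags = bagging_ih_bootstraps([0.0, 0.2, 1.0, 0.4], n_estimators=3, seed=1)
vote = majority_vote([1, 0, 1])
```

Each list in `bags` holds the training indices for one base learner; fitting the learners themselves is left out because the choice of base model is independent of the sampling step.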

3. Multi-Feature Hardness Filtering and Thresholding

Beyond kDN, diverse meta-feature–based instance hardness signals can be aggregated to guide explicit data filtering prior to training. Key pipeline steps are:

  • Compute per-instance hardness scores for a chosen set of hardness meta-features;
  • Normalize each meta-feature to a common scale;
  • Aggregate via mean or weighted sum (weights proportional to correlation with empirical instance-level error across a pool of learners);
  • Remove all points whose aggregated hardness exceeds a fixed threshold or a chosen quantile;
  • Train downstream model or ensemble on filtered data (Torquette et al., 2022).
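The pipeline steps above can be sketched as follows. This is an illustrative outline under stated assumptions: min-max scaling is used for normalization, every meta-feature is assumed to be oriented so that larger means harder, and the helper names (`min_max`, `filter_by_hardness`) and toy values are hypothetical.

```python
import math

def min_max(values):
    """Scale a list of scores to [0, 1]; constant lists map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def filter_by_hardness(meta_features, weights=None, quantile=0.8):
    """Return (kept_indices, aggregated_hardness).

    meta_features: dict mapping a meta-feature name to per-instance
                   scores (assumed: larger value = harder instance).
    weights:       optional dict of aggregation weights per name.
    quantile:      fraction of instances to keep (easiest first).
    """
    names = list(meta_features)
    n = len(meta_features[names[0]])
    normalised = {name: min_max(meta_features[name]) for name in names}
    if weights is None:
        weights = {name: 1.0 for name in names}  # plain mean
    total_w = sum(weights[name] for name in names)
    aggregated = [
        sum(weights[name] * normalised[name][i] for name in names) / total_w
        for i in range(n)
    ]
    # Threshold = aggregated hardness at the requested quantile.
    ranked = sorted(aggregated)
    threshold = ranked[max(0, math.ceil(quantile * n) - 1)]
    kept = [i for i in range(n) if aggregated[i] <= threshold]
    return kept, aggregated

# Made-up scores for five instances and two meta-features.
meta = {
    "kDN": [0.0, 0.2, 0.4, 1.0, 0.2],
    "CLD": [0.1, 0.3, 0.2, 0.9, 0.1],
}
kept, aggregated = filter_by_hardness(meta, quantile=0.8)
```

The downstream model would then be trained only on the instances in `kept`. Passing correlation-derived values in `weights` gives the weighted-sum variant instead of the plain mean.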

Best practices advise prioritizing continuously varying, high-correlation metrics such as CLD, N2, and LSC for classification, and LE and S2 for regression. The removal threshold can be tuned on validation data or set by quantile selection (Torquette et al., 2022).

4. Instance Hardness in Dynamic Ensemble and Representation Selection

Recent frameworks exploit instance hardness for dynamic, per-example selection of input representation and classifier pool, as in DRES for fake news detection (Farhangian et al., 21 Sep 2025). Here, instance hardness (again kDN-based) is computed for each training sample in multiple feature spaces (e.g., 14 textual embeddings), forming a hardness matrix indexed by sample and representation. At test time, for each representation, the hardness of a query is estimated as the mean hardness of its nearest training neighbors in that space.

  • Dynamic representation selection: Pick the representation in which the query's estimated hardness is lowest.
  • Dynamic ensemble selection: Within the chosen view, use dynamic ensemble selection (DES) algorithms—KNORA-E, DES-P, META-DES—to pick the most competent subset of classifiers based on neighborhood performance.
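A minimal sketch of the representation-selection step is shown below. It assumes Euclidean neighbors and that the view with the lowest estimated hardness is chosen; the view names, helper names, and numbers are hypothetical, and the subsequent DES step (KNORA-E, DES-P, META-DES) is not reproduced here.

```python
import math

def estimate_query_hardness(query_vec, train_vecs, train_hardness, k=3):
    """Mean hardness of the k training points nearest to the query."""
    nearest = sorted(
        range(len(train_vecs)),
        key=lambda j: math.dist(query_vec, train_vecs[j]),
    )[:k]
    return sum(train_hardness[j] for j in nearest) / k

def select_representation(query_views, train_views, hardness_by_view, k=3):
    """Return the view with the lowest estimated hardness, plus all estimates."""
    estimates = {
        name: estimate_query_hardness(
            query_views[name], train_views[name], hardness_by_view[name], k
        )
        for name in query_views
    }
    best = min(estimates, key=estimates.get)
    return best, estimates

# Toy example with two hypothetical views of the same four training samples.
train_views = {
    "view_a": [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)],
    "view_b": [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0), (9.0, 8.0)],
}
hardness_by_view = {          # one column of the hardness matrix per view
    "view_a": [0.0, 0.2, 0.8, 1.0],
    "view_b": [0.6, 0.8, 0.0, 0.2],
}
query_views = {"view_a": (0.1, 0.2), "view_b": (0.2, 0.1)}
best_view, estimates = select_representation(
    query_views, train_views, hardness_by_view, k=2
)
```

Here the query lands among easy training points in `view_a` and among hard ones in `view_b`, so `view_a` is selected; a classifier pool trained on that view would then handle the query.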

Empirical results demonstrate that jointly optimizing representation and classifier ensemble at the instance level via hardness estimation produces substantial accuracy gains compared to static or single-view designs. Notably, more than 50% of instances show a pronounced spread in hardness across views, which motivates per-instance view selection (Farhangian et al., 21 Sep 2025).

5. Instance Hardness Filtering in Algorithm Selection for Combinatorial Optimization

Instance-hardness ensemble filtering extends beyond classic supervised learning to combinatorial algorithms. For instance, in combinatorial auctions, instance hardness is defined via the greedy optimality gap, the shortfall of the greedy heuristic's solution relative to the optimum. A binary hardness label is then assigned by comparing this gap against a threshold calibrated by ROC analysis: instances whose gap exceeds the threshold are labeled hard, and the rest easy.

A lightweight MLP is trained to predict hardness from a 20-dimensional structural feature vector reflecting known failure modes. The resulting “hardness classifier” achieves 94.7% test-set accuracy and is used to route each instance: easy instances go to the greedy heuristic and hard instances to an expensive GNN-based specialist (Kang, 16 Feb 2026). The hybrid pipeline matches greedy speed on easy cases and GNN performance on hard cases, and attains a lower optimality gap than either the greedy heuristic or the GNN alone.
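The routing logic can be sketched in a few lines. Everything here is illustrative: the relative-gap formula is an assumed normalization, the threshold value is made up, and the solvers and the hardness predictor are stand-in callables rather than the greedy heuristic, GNN, or MLP from the cited work.

```python
def greedy_gap(greedy_value, optimal_value):
    """Assumed definition: relative shortfall of greedy from the optimum."""
    return (optimal_value - greedy_value) / optimal_value

def hardness_label(gap, tau):
    """1 = hard (gap above threshold tau), 0 = easy."""
    return 1 if gap > tau else 0

def route(instance, predict_hard, greedy_solver, specialist_solver):
    """Send the instance to the specialist only when it is predicted hard."""
    solver = specialist_solver if predict_hard(instance) else greedy_solver
    return solver(instance)

# Stand-in components for illustration only.
def toy_predictor(instance):
    # Hypothetical rule replacing a trained classifier.
    return instance["conflict_density"] > 0.5

def toy_greedy(instance):
    return "greedy"

def toy_specialist(instance):
    return "specialist"

gap = greedy_gap(greedy_value=90.0, optimal_value=100.0)
label = hardness_label(gap, tau=0.05)  # tau is a made-up threshold
easy_route = route({"conflict_density": 0.1}, toy_predictor, toy_greedy, toy_specialist)
hard_route = route({"conflict_density": 0.9}, toy_predictor, toy_greedy, toy_specialist)
```

Labels produced by `hardness_label` on solved training instances would serve as targets for the classifier, which at deployment replaces `toy_predictor` so that the expensive solver runs only where the cheap one is expected to fall short.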

6. Empirical Outcomes and Practical Guidelines

| Noise % | Perceptron OvA | Random Subspace | Bagging | Bagging-IH |
|---|---|---|---|---|
| 0 | 69.94 | 68.39 | 78.60 | 78.02 (≈) |
| 10 | 64.17 | 62.51 | 77.18 | 77.66 (+) |
| 20 | 58.55 | 56.50 | 75.60 | 76.97 (+) |
| 30 | 52.62 | 50.76 | 73.07 | 75.40 (+) |
| 40 | 46.73 | 44.59 | 67.70 | 71.44 (+) |

(“+” indicates statistical significance over Bagging.)

General recommendations:

  • Use $k = 5$ for kDN, together with a fixed default ensemble size, as robust defaults.
  • For kDN, proper feature scaling is essential; approximate nearest neighbor methods mitigate the quadratic cost of exact neighbor search on large datasets.
  • For regression, replace kDN with residual/error-based hardness metrics.
  • Avoid over-filtering by cross-validating the removal threshold.
  • For tasks with highly complex boundaries or high label imbalance, tune $k$ and the sampling floor in ensemble generation to avoid under-sampling informative points (Walmsley et al., 2018; Torquette et al., 2022).

7. Limitations and Prospects

IHEF approaches rely on the quality and granularity of hardness estimates. Discrete measures (e.g. kDN, F1) may lack discrimination for “easy” regions, while tree-based metrics (TD, DCP) can be unstable in high dimensions. Current implementations often prioritize speed and tractability, sometimes at the cost of optimality (e.g., only one view selected in DRES; MLP-based thresholding in combinatorial problems).

Future directions highlighted include:

  • Combining multiple, complementary hardness measures for finer-grained filtering, particularly in high-noise or multi-view settings.
  • Learning to jointly aggregate softness and hardness signals across metric families and input domains.
  • Extending hardness-guided selection to contexts with imbalanced cost regimes, evolving data, or structured prediction tasks (Torquette et al., 2022, Farhangian et al., 21 Sep 2025, Kang, 16 Feb 2026).

Instance Hardness Ensemble Filtering thus unifies probabilistic sample weighting, data-centric filtering, and instance-dependent ensemble routing, demonstrating robust gains across diverse supervised learning and optimization tasks under label noise, boundary ambiguity, and heterogeneity.
