Iterative Differential Entropy Minimization (IDEM)
- IDEM is an entropy-driven framework that iteratively minimizes differential (Shannon) entropy to extract minimal representations and optimal alignments in high-dimensional domains.
- It leverages confidence-calibrated stopping rules and statistical bounds, ensuring robust variable selection and noise-resilient 3D point cloud registration.
- Empirical studies show IDEM outperforms RMSE-, Chamfer-, and Hausdorff-based metrics, delivering precise, symmetric, and computationally tractable solutions under challenging conditions such as noise, density differences, and partial overlap.
Iterative Differential Entropy Minimization (IDEM) encompasses a family of greedy, entropy-driven optimization techniques for identifying minimal representations or achieving optimal alignment in high-dimensional discrete or continuous domains. The core principle is to iteratively minimize a differential (Shannon) entropy criterion, using confidence-calibrated stopping or symmetry-enforcing discrepancy objectives as dictated by task and data modality. IDEM methods have been applied to variable selection for predictive tasks and to fine rigid 3D point cloud registration, offering computationally tractable, statistically robust, and noise-resilient solutions across these domains (Romero et al., 31 Oct 2025; Barberi et al., 14 Jan 2026).
1. Foundational Principles
IDEM leverages information-theoretic quantities—specifically differential or conditional Shannon entropy—as a selection or alignment metric. In the discrete predictor setting, variable addition proceeds by greedily selecting the feature yielding the largest statistically significant reduction in the conditional entropy of the target variable, with finite-sample effects controlled by Cantelli’s one-sided inequality and empirical entropy variance estimates. In the continuous geometric domain, IDEM evaluates the sum of local differential entropies over the union of two point sets under rigid transformation, defining an objective as the discrepancy between this joint entropy and the sum of the individual marginals.
The unifying strategy is to structure the objective such that, when minimized with iterative or greedy procedures, global or statistically reliable local optima correspond to minimal feature sets or unique geometric alignments.
2. Methodologies and Algorithms
2.1 Discrete Variable Subset Selection
Given a binary class variable $C$ and a set of mutually independent discrete predictors $X_1, \dots, X_n$, IDEM seeks a minimal subset $S \subseteq \{1, \dots, n\}$ minimizing the residual conditional entropy $H(C \mid X_S)$. The NP-complete subset search is sidestepped by a greedy procedure:
- Initialize $S_0 = \emptyset$, and estimate $\hat{H}(C)$ and its empirical variance by sub-sampling.
- At each iteration $t$, for each candidate $X_j \notin S_t$:
  - Estimate $\hat{H}(C \mid X_{S_t \cup \{j\}})$ and its empirical variance.
  - Compute the selection score $\Delta_j = \hat{H}(C \mid X_{S_t}) - \hat{H}(C \mid X_{S_t \cup \{j\}})$.
- Select $j^\star = \arg\max_j \Delta_j$. If $\Delta_{j^\star} \le 0$ or the corresponding confidence falls below the user-defined level $1 - \alpha$, halt. Otherwise, set $S_{t+1} = S_t \cup \{j^\star\}$.
This approach rigorously controls family-wise error by only accepting variable inclusions justified at the prescribed statistical confidence (Romero et al., 31 Oct 2025).
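The greedy loop can be sketched in a few lines of Python. The plug-in conditional-entropy estimator and the fixed `min_gain` stopping threshold below are illustrative simplifications, standing in for the paper's sub-sampling and Cantelli-calibrated acceptance rule:

```python
import numpy as np

def cond_entropy(y, X_cols):
    """Plug-in estimate of H(Y | X_cols) in nats; X_cols has shape (n, d)."""
    n = len(y)
    if X_cols.shape[1] == 0:
        groups = np.zeros(n, dtype=int)  # no conditioning: a single group
    else:
        # group samples by the joint value of the selected predictors
        _, groups = np.unique(X_cols, axis=0, return_inverse=True)
        groups = groups.reshape(-1)
    h = 0.0
    for g in np.unique(groups):
        yg = y[groups == g]
        _, counts = np.unique(yg, return_counts=True)
        p = counts / counts.sum()
        h += (len(yg) / n) * -np.sum(p * np.log(p))
    return h

def greedy_select(X, y, min_gain=1e-3):
    """Greedily add the predictor yielding the largest conditional-entropy drop."""
    selected = []
    current = cond_entropy(y, X[:, selected])
    while len(selected) < X.shape[1]:
        best_j, best_gain = None, min_gain
        for j in range(X.shape[1]):
            if j not in selected:
                gain = current - cond_entropy(y, X[:, selected + [j]])
                if gain > best_gain:
                    best_j, best_gain = j, gain
        if best_j is None:  # no candidate clears the threshold: halt
            break
        selected.append(best_j)
        current -= best_gain
    return selected
```

On synthetic data where the class is fully determined by one predictor, the loop selects that predictor and then halts, since no remaining candidate can further reduce the (already zero) conditional entropy.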
2.2 Fine Rigid 3D Point Cloud Registration
In geometric registration, IDEM minimizes an entropy-discrepancy metric over the space $SE(3)$ of rigid-body transformations $T$:
- For point clouds $A$ and $B$, define the joint cloud $C(T) = A \cup T(B)$.
- Compute the total entropy $H_{\mathrm{tot}}$ as the sum of per-point entropies, each approximated as the differential entropy of a Gaussian whose covariance $\Sigma_i$ is the local covariance matrix of the $k$-nearest neighbors within radius $r$: $h_i = \tfrac{1}{2}\ln\!\big((2\pi e)^3 \det \Sigma_i\big)$.
- The IDEM objective is the symmetric, commutative quantity $q_{\mathrm{tot}}(T) = H_{\mathrm{tot}}(A \cup T(B)) - H_{\mathrm{tot}}(A) - H_{\mathrm{tot}}(B)$.
- Optimization proceeds via steepest descent or quasi-Newton methods (e.g., L-BFGS) from a coarse pre-alignment until the decrease in $q_{\mathrm{tot}}$ falls below a convergence tolerance (Barberi et al., 14 Jan 2026).
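The entropy-discrepancy objective described above can be sketched as follows. The Gaussian per-point entropy approximation, the brute-force neighbor search, and a fixed $k$ (ignoring the radius cutoff) are simplifying assumptions for illustration, not the reference implementation:

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point (brute force)."""
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    return np.argsort(d2, axis=1)[:, :k]

def total_entropy(points, k=10):
    """Sum of per-point Gaussian differential entropies from local k-NN covariances."""
    idx = knn_indices(points, k)
    h = 0.0
    for neighbors in idx:
        cov = np.cov(points[neighbors].T) + 1e-9 * np.eye(3)  # regularize flat patches
        # differential entropy of a 3-D Gaussian: (1/2) ln((2*pi*e)^3 det(Sigma))
        h += 0.5 * np.log((2 * np.pi * np.e) ** 3 * np.linalg.det(cov))
    return h

def q_tot(A, B, R=None, t=None, k=10):
    """Entropy discrepancy between the joint cloud and the two marginals."""
    R = np.eye(3) if R is None else R
    t = np.zeros(3) if t is None else t
    TB = B @ R.T + t  # apply the rigid transform to B
    joint = np.vstack([A, TB])
    return total_entropy(joint, k) - total_entropy(A, k) - total_entropy(TB, k)
```

Because the joint cloud and both marginals enter the objective symmetrically, swapping the roles of $A$ and $B$ leaves the value unchanged, which is the commutativity property exploited by the method.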
3. Statistical and Computational Properties
3.1 Confidence Calibration via Cantelli’s Bound
For finite-sample settings, the uncertainty in estimated conditional or differential entropies is quantified using empirical variances obtained from sub-sampling. Cantelli's one-sided inequality, $P(X - \mu \ge \lambda) \le \sigma^2 / (\sigma^2 + \lambda^2)$, then yields one-sided confidence bounds of half-width $\hat{\sigma}\sqrt{(1-\alpha)/\alpha}$ around each entropy estimate at confidence level $1 - \alpha$.
A candidate variable is accepted only when the upper confidence bound on the augmented subset's entropy falls below the lower confidence bound on the current subset's entropy, ensuring the claimed entropy reduction is statistically significant (Romero et al., 31 Oct 2025).
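A minimal sketch of such a one-sided acceptance rule, assuming the simplest inversion of Cantelli's inequality (the exact calibration constants in the paper may differ):

```python
import math

def cantelli_halfwidth(var, alpha):
    """Smallest lam such that P(X - mu >= lam) <= alpha, by Cantelli's inequality."""
    return math.sqrt(var * (1 - alpha) / alpha)

def accept_candidate(h_curr, var_curr, h_cand, var_cand, alpha=0.05):
    """Accept the augmented subset only if its upper entropy bound lies
    below the current subset's lower entropy bound."""
    lower_curr = h_curr - cantelli_halfwidth(var_curr, alpha)
    upper_cand = h_cand + cantelli_halfwidth(var_cand, alpha)
    return upper_cand < lower_curr
```

The rule is conservative by construction: a candidate whose confidence interval overlaps the current subset's interval is rejected, which is how family-wise error stays controlled across iterations.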
3.2 Computational Complexity
- For variable selection, each iteration requires one entropy estimate per remaining candidate, each computed over the sub-sampled observations; with $n$ predictors this means at most $n$ estimates per iteration and $O(n^2)$ estimates total in the worst case.
- For point cloud registration, each iteration involves a neighborhood search and a local covariance computation for every point in the joint cloud, resulting in a significant per-iteration cost that is compensated by the robustness and symmetry of the objective (Barberi et al., 14 Jan 2026).
4. Practical Implementation and Robustness
4.1 Variable Selection
IDEM reliably identifies truly influential variables with a low frequency of spurious selection, even at small sample sizes. Empirical results demonstrate that, as the sample size increases, IDEM increasingly favors the actual class-influencing predictors while maintaining low error rates on irrelevant variables. Stopping is enforced when no positive-score candidates remain or when the confidence threshold is not met (Romero et al., 31 Oct 2025).
4.2 Point Cloud Registration
Empirical evaluations compare IDEM to RMSE, Chamfer, and Hausdorff distance. Unlike RMSE, which lacks commutativity and can yield suboptimal alignments with density differences, noise, or partial overlap, the IDEM objective remains zero at perfect alignment under these perturbations (see Table 1 below, adapted from (Barberi et al., 14 Jan 2026)):
| Comparison | $q_{\mathrm{tot}}$ error | RMSE error (A→B, B→A) | Chamfer error | Hausdorff error |
|---|---|---|---|---|
| B₀ – B₀ (identical) | 0 | 0, 0 | 0 | 0 |
| B₀ – B₀^0.1 (10% downsampled) | 0 | 0.35, 0 | 0 | 3.04 |
| B₀ – Bₙ^0.25 (25% noise) | 0 | 0, 0.71 | 0 | 8.28 |
| B₀ – Bₕ^25 (holes) | 0 | 0.9, 0 | 0 | 5.27 |
| B₀ₚ₁ – B₀ₚ₂ (partial overlap) | 0 | 8.31, 8.28 | 0 | 23.7 |
This demonstrates IDEM’s resilience to density, noise, holes, and limited overlap. The method’s symmetry eliminates the need for a reference point cloud and enables reliable localization of the optimal alignment.
5. Advantages, Limitations, and Comparative Analysis
IDEM provides explicit control of error rates in variable selection and a fully symmetric, density-agnostic, and noise-robust registration metric for point clouds. In variable selection, the confidence-thresholded score directly limits spurious inclusions, and the method converges to the greedy optimum as the sample size increases. In registration, a good initial placement within the entropy valley (the region of interest, ROI) is required to guarantee convergence to the unique global minimum. The main computational limitation arises from the repeated estimation of conditional entropies (via sub-sampling) or local covariances, which increases the per-iteration cost relative to some alternative methods. Nonetheless, IDEM outperforms RMSE and related metrics under challenging real-world conditions such as partial overlap and differing point densities (Romero et al., 31 Oct 2025, Barberi et al., 14 Jan 2026).
6. Applications and Empirical Evaluations
In feature selection for binary classification with mutually independent predictors, IDEM accurately recovers the truly influential set with a low probability of spurious selections. Table 2 from (Romero et al., 31 Oct 2025) reports per-variable selection frequencies:
| Variable | Select. Freq. (%) |
|---|---|
| X₁ | 87.80 |
| X₂ | 12.15 |
| X₃ | 0.00 |
| X₄ | 0.04 |
| X₅ | 0.01 |
As sample size increases, the algorithm’s selection fidelity improves further.
In 3D point cloud registration, IDEM achieves exactly zero alignment error in all tested scenarios, outperforming RMSE and Hausdorff distance in the presence of density differences, noise, or partial overlap. This robustness is attributable to its commutative entropy-discrepancy metric, which remains stable across a wide range of practical degradations (Barberi et al., 14 Jan 2026).
7. Theoretical Guarantees and Future Perspectives
In the infinite-sample regime, IDEM’s greedy approach to subset selection attains, under submodularity, an approximation within $1-1/e$ of the optimum. For registration problems, the distinct double-Gaussian structure of the landscape ensures a well-defined, unique minimum accessible by standard deterministic optimizers, provided suitable initialization within the ROI. Future work may focus on reducing computational overhead or extending IDEM to multi-class, non-independent, or non-rigid domains.
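Under the stated submodularity assumption, this is the classical Nemhauser–Wolsey–Fisher greedy guarantee. Writing $F(S) = H(C) - H(C \mid X_S)$ for the entropy-reduction objective (assumed monotone submodular), the greedy set $S_k$ after $k$ additions satisfies, against an optimal size-$k$ subset $S^{*}$:

```latex
% Greedy guarantee for a monotone submodular F (Nemhauser–Wolsey–Fisher, 1978)
F(S_k) \;\ge\; \left(1 - \left(1 - \tfrac{1}{k}\right)^{k}\right) F(S^{*})
       \;\ge\; \left(1 - \tfrac{1}{e}\right) F(S^{*}).
```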
IDEM constitutes a statistically rigorous, information-theoretically principled framework for iterative model selection and geometric alignment, with demonstrated utility and robust empirical performance in both discrete and continuous problem settings.