- The paper introduces PowerBin, a novel adaptive binning algorithm that recasts binning as a capacity-constrained optimal transport problem using CPDs.
- It employs a heuristic based on area–radius correspondence and iterative generator updates to achieve convex bins with near O(N log N) scaling.
- Empirical evaluations demonstrate its superior S/N uniformity, robustness to noise, and applicability to both astronomical surveys and artistic imaging.
PowerBin: Fast Adaptive Data Binning with Centroidal Power Diagrams
Introduction and Motivation
Adaptive binning remains essential in the analysis of modern astronomical datasets, especially in integral-field spectroscopy (IFS), where a sufficient signal-to-noise ratio (S/N) is a prerequisite for robust physical inference from spatially resolved spectra. As demonstrated by parameter recovery experiments, low S/N leads to highly non-Gaussian, multimodal posteriors, producing unreliable and biased results even when the estimates are aggregated over an ensemble (Figure 1).
Figure 1: Posterior probability distributions for four kinematic parameters at different S/N illustrating poor inference quality at low S/N and stability at high S/N.
The widely adopted Voronoi-binning algorithms meet spatial compactness and uniform S/N constraints. However, their computational complexity is prohibitive for datasets with ∼10^6 pixels, and the use of multiplicatively weighted Voronoi tessellations can produce non-convex or disconnected bins. The necessity for a scalable, robust, and morphologically constrained binning framework has become acute with the advent of next-generation IFS surveys.
Theoretical Framework: From Voronoi Diagrams to Optimal Transport
The PowerBin algorithm recasts the binning task as a capacity-constrained optimal transport problem. Instead of ordinary or multiplicatively/additively-weighted Voronoi diagrams, PowerBin employs Centroidal Power Diagrams (CPDs), which guarantee convex, morphologically optimal bin shapes and can enforce per-bin capacity constraints efficiently. The power diagram is distinguished from other generalized Voronoi diagrams by its straight cell boundaries, which make every cell convex, and by the direct correspondence between generator weights and bin capacity (Figure 2).

Figure 2: Physical and geometric analogies for adaptive binning; optimal packing of equal-area cells (left), and soap bubble foam as a physical model for capacity-constrained tessellations (right).
The solution to the semi-discrete optimal transport problem with prescribed capacities is a CPD, in which each generator weight effectively controls the volume (pixel count or capacity) of its corresponding cell, while the generator position centers the bin.
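The role of the generator weight can be illustrated with a brute-force power-diagram assignment. This is a minimal sketch, not the paper's implementation: the function name `power_assign` and the toy pixel strip are assumptions made for illustration. Each pixel goes to the generator with the smallest power distance |x - g_i|^2 - w_i, so raising w_i enlarges cell i while the boundary stays a straight line.

```python
def power_assign(pixels, generators, weights):
    """Assign each pixel to the generator with the smallest power distance.

    Power distance to generator i: |x - g_i|^2 - w_i, where w_i is the
    generator weight (a squared radius). A larger w_i enlarges cell i.
    """
    labels = []
    for px, py in pixels:
        best, best_d = -1, float("inf")
        for i, ((gx, gy), w) in enumerate(zip(generators, weights)):
            d = (px - gx) ** 2 + (py - gy) ** 2 - w
            if d < best_d:
                best, best_d = i, d
        labels.append(best)
    return labels


# Toy example (hypothetical data): a strip of 10 pixels and two generators.
pixels = [(float(x), 0.0) for x in range(10)]
generators = [(2.0, 0.0), (7.0, 0.0)]

equal = power_assign(pixels, generators, [0.0, 0.0])   # ordinary Voronoi split
grown = power_assign(pixels, generators, [10.0, 0.0])  # cell 0 gains weight

print(equal.count(0), grown.count(0))  # cell 0 holds more pixels after the weight increase
```

With equal weights the diagram reduces to an ordinary Voronoi diagram. Increasing the first weight by 10 moves the boundary by (w_0 - w_1) / (2 d) = 1 pixel toward the second generator, where d = 5 is the generator separation.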
Algorithmic Design: PowerBin Heuristic and Scalability
Direct optimization of the dual (Lagrangian) energy functional for CPDs is computationally prohibitive and numerically unstable when the capacity measure is discrete, affected by correlated noise, or non-additive, which are precisely the conditions encountered in real astronomical data. PowerBin instead develops a physically motivated heuristic using the area–radius correspondence in the soap bubble analogy. Through iterative updates of generator positions (geometric centroids) and weights (squared radii), the algorithm drives bin capacities toward target values while maintaining convexity and compactness (Figure 2).
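The alternating update described above can be sketched as follows. This is an illustration under stated assumptions, not the published update rule: the function names, the unit-area pixels, and in particular the weight formula (the squared radius of a disc with the cell's area, rescaled by target/capacity) are hypothetical choices that merely follow the area–radius idea.

```python
import math


def power_assign(pixels, generators, weights):
    """Label each pixel with the generator minimizing |x - g|^2 - w."""
    labels = []
    for px, py in pixels:
        dists = [(px - gx) ** 2 + (py - gy) ** 2 - w
                 for (gx, gy), w in zip(generators, weights)]
        labels.append(dists.index(min(dists)))
    return labels


def update_generators(pixels, capacity, labels, generators, weights, target):
    """One heuristic step: centroid positions and squared-radius weights."""
    new_gen, new_w = [], []
    for i, (g, w) in enumerate(zip(generators, weights)):
        members = [k for k, lab in enumerate(labels) if lab == i]
        cap = sum(capacity[k] for k in members)
        if not members or cap <= 0:
            # Empty or non-positive-capacity cell: leave the generator unchanged.
            new_gen.append(g)
            new_w.append(w)
            continue
        area = len(members)  # pixel count, assuming unit-area pixels
        cx = sum(pixels[k][0] for k in members) / area
        cy = sum(pixels[k][1] for k in members) / area
        new_gen.append((cx, cy))
        # Assumed rule: squared radius of a disc with the cell's area,
        # rescaled by target/capacity so that over-full cells shrink.
        new_w.append((area / math.pi) * (target / cap))
    return new_gen, new_w


def regularize(pixels, capacity, generators, target, n_iter=20):
    """Alternate power-diagram assignment and generator updates."""
    weights = [0.0] * len(generators)
    for _ in range(n_iter):
        labels = power_assign(pixels, generators, weights)
        generators, weights = update_generators(
            pixels, capacity, labels, generators, weights, target)
    return generators, weights, power_assign(pixels, generators, weights)


# Hand-checkable single step: two bins of two pixels each, target capacity 3.
pixels = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
capacity = [1.0, 1.0, 2.0, 2.0]
start = [(0.0, 0.0), (3.0, 0.0)]
gens, w = update_generators(pixels, capacity, [0, 0, 1, 1],
                            start, [0.0, 0.0], target=3.0)
# Bin 1 holds capacity 4 > 3, so it receives the smaller weight.
g2, w2, lab2 = regularize(pixels, capacity, start, target=3.0, n_iter=5)
```

The brute-force assignment costs O(N × bins) per iteration and is used here only for clarity; it does not reflect the near O(N log N) scaling reported for PowerBin.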
A crucial advance is the introduction of a bin-accretion algorithm with O(N log N) scaling, leveraging precomputed Delaunay triangulations and heap-prioritized accretion of spaxels by local brightness or S/N. Both initialization and the ensuing CPD-based regularization operate efficiently, yielding dramatic efficiency gains over previous approaches (Figure 3).
Figure 3: Runtime comparison for classic VorBin and PowerBin, showing near O(N log N) scaling for both accretion and regularization in PowerBin.
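A heap-prioritized accretion pass can be sketched in a few lines. This is a simplified illustration, not the paper's algorithm: it assumes a regular grid with 4-connected neighbours in place of a Delaunay adjacency, uses per-pixel signal as an additive capacity, and the name `accrete_bins` is hypothetical. One global sort picks seeds from brightest to faintest, and a per-bin heap always absorbs the brightest frontier pixel until the target capacity is reached.

```python
import heapq


def accrete_bins(signal, target):
    """Greedy bin accretion on a 2-D grid of non-negative capacities.

    Returns (labels, n_bins), where labels[y][x] is the bin index of a pixel.
    """
    ny, nx = len(signal), len(signal[0])
    labels = [[-1] * nx for _ in range(ny)]
    # Seeds are visited from brightest to faintest: one O(N log N) sort.
    order = sorted(((signal[y][x], y, x)
                    for y in range(ny) for x in range(nx)), reverse=True)
    n_bins = 0
    for _, sy, sx in order:
        if labels[sy][sx] != -1:
            continue  # already absorbed by an earlier bin
        total = 0.0
        heap = [(-signal[sy][sx], sy, sx)]  # max-heap via negated signal
        queued = {(sy, sx)}
        while heap and total < target:
            neg_s, y, x = heapq.heappop(heap)
            labels[y][x] = n_bins
            total += -neg_s
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                yy, xx = y + dy, x + dx
                if (0 <= yy < ny and 0 <= xx < nx
                        and labels[yy][xx] == -1 and (yy, xx) not in queued):
                    queued.add((yy, xx))
                    heapq.heappush(heap, (-signal[yy][xx], yy, xx))
        n_bins += 1
    return labels, n_bins


# Toy 3x3 image (hypothetical data) with two bright sources.
signal = [[9, 1, 1],
          [1, 1, 1],
          [1, 1, 5]]
labels, n_bins = accrete_bins(signal, target=10)
for row in labels:
    print(row)
```

Each bin pushes at most four neighbours per absorbed pixel, so the total number of heap operations is linear in the pixel count and the pass costs O(N log N) overall. Bins started late may be left under-filled (the isolated pixel in the toy example), which is the kind of defect a subsequent regularization stage is meant to repair.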
Empirical Evaluation: Robustness, Quality, and Generalization
PowerBin was evaluated on synthetic galaxies, background-limited mosaics, classic exemplar IFS datasets, and large-scale images with non-astronomical content.
On mock galaxy profiles with both additive (Poissonian) and non-additive (correlated noise, e.g., CALIFA-like modulation) capacity functions, PowerBin achieves high S/N uniformity and strict convexity, with bin shapes adapting smoothly to underlying signal gradients (Figure 4).
Figure 4: PowerBin results on mock galaxies with different Sersic indexes and noise models, showing uniform S/N and morphologically desirable bins.
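The distinction between additive and non-additive capacities can be made concrete with a small sketch. It takes the capacity of a bin to be (S/N)^2. For uncorrelated Poissonian noise the variance equals the signal, so the capacity reduces to the summed signal and is additive. The correlated-noise variant below inflates the bin noise by a factor 1 + alpha * log10(n) for a bin of n pixels; this functional form and the value of `alpha` are assumptions chosen for illustration, as are the function names.

```python
import math


def capacity_additive(signal, noise):
    """(S/N)^2 of a bin for uncorrelated noise: (sum S)^2 / sum(sigma^2)."""
    s = sum(signal)
    var = sum(v * v for v in noise)
    return s * s / var


def capacity_correlated(signal, noise, alpha=1.0):
    """(S/N)^2 with an assumed noise inflation of 1 + alpha * log10(n)."""
    beta = 1.0 + alpha * math.log10(len(signal))
    return capacity_additive(signal, noise) / beta ** 2


# Poissonian toy bin (hypothetical data): noise is the square root of the signal.
signal = [4.0, 9.0, 16.0]
noise = [2.0, 3.0, 4.0]

per_pixel = sum(capacity_additive([s], [n]) for s, n in zip(signal, noise))
whole_bin = capacity_additive(signal, noise)
correlated = capacity_correlated(signal, noise)

print(per_pixel, whole_bin)    # equal: the Poissonian capacity is additive
print(correlated < whole_bin)  # correlated noise lowers the bin capacity
```

Because the correlated capacity depends on the number of pixels in the bin, it cannot be written as a sum of per-pixel contributions, which is why a binning method must evaluate it on the bin as a whole.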
Application to mosaics containing multiple galaxies in a noisy field demonstrates PowerBin's ability to initiate bins in multiple distant regions simultaneously while remaining robust to negative or irregular capacities (Figure 5).
Figure 5: PowerBin tessellation for multi-object, background-limited galaxies; large bins in the background, tightly matched to target S/N elsewhere.
On canonical IFS observations, such as SAURON's NGC 2273, PowerBin slightly outperforms previous variants in S/N uniformity and delivers bins strictly adhering to convexity requirements (Figure 6).
Figure 6: PowerBin applied to SAURON IFS data of NGC 2273, reproducing and improving on earlier Voronoi-based approaches in both morphology and S/N dispersion.
PowerBin also generalizes outside astronomy, efficiently handling large-scale binning for artistic stippling, demonstrating the connection to blue-noise sampling distributions (Figure 7).
Figure 7: Distribution of 10^4 PowerBin generators for a 512×512 binary image, illustrating the method's applicability to computer graphics and complex 2D domains.
Practical and Theoretical Implications
PowerBin is both a scalable, robust practical solution and an instantiation of optimal transport theory in scientific binning. Its O(N log N) complexity enables the routine analysis of million-pixel datasets, aligning method performance with future survey requirements. The strict convexity of bins resolves long-standing issues of morphological pathologies in weighted Voronoi binning, which matters for applications where bin shape affects downstream inference or modeling.
The method’s extensibility to non-additive, empirically-determined capacity functions directly addresses the hardest cases in real data reduction, such as correlated-noise modeling, and supports its adoption in current and future workflows across astronomy and other domains.
Moreover, the PowerBin approach, built on a physical-geometric analogy instead of gradient-based optimization, offers a generic template for implementing capacity-constrained tessellations in higher dimensions and with complex constraints, opening paths for use in spatial statistics, image analysis, graphics, and machine learning.
Conclusion
PowerBin introduces an algorithmically efficient, physically principled solution for adaptive binning, guaranteeing bin convexity and delivering superior S/N uniformity and scalability free of the quadratic-scaling bottlenecks of earlier methods. Its efficacy and versatility on both synthetic and real data position it as a new baseline for capacity-constrained spatial partitioning in astronomy and beyond. Ongoing and future adoption will likely focus on integrating PowerBin in automated analysis pipelines, exploring its performance in 3D, and leveraging its framework for advanced capacity constraints in scientific and technical disciplines.
Reference: "PowerBin: Fast Adaptive Data Binning with Centroidal Power Diagrams" (2509.06903).