
PointSSIM: Resolution-Invariant Structural Comparison

Updated 1 July 2025
  • PointSSIM is a resolution-invariant metric that converts binary images into marked point patterns to capture essential geometric structures.
  • It extracts anchor points using distance transforms and skeletonization, forming a four-dimensional summary vector that quantifies structural attributes.
  • Its invariance to resolution and rotation, together with its robust discrimination between structural classes, makes it well suited to applications in geostatistics, computer vision, and medical imaging.

PointSSIM is a resolution-invariant, low-dimensional image comparison metric designed for robust structural analysis of binary images, with foundational elements drawn from the structural similarity index measure (SSIM) and mathematical morphology (2506.23833). Distinct from classical pixel-wise similarity approaches, PointSSIM operates by transforming images into marked point pattern representations, extracting essential geometric and structural features as anchor points. This approach enables cross-resolution and rotation-invariant comparison, making PointSSIM especially suitable for domains where structural fidelity is of paramount importance and images may differ in native resolution or orientation.

1. Motivation and Conceptual Foundations

Existing image comparison metrics such as MSE, SSIM, and their multi-scale variants are well-suited to intensity images of fixed resolution but exhibit pronounced shortcomings when applied to binary images or scenarios involving variable resolutions and rotations. These limitations affect numerous fields including geostatistics, computer vision, and medical imaging, where structural motifs and geometric integrity, rather than raw pixel-wise fidelity, are critical. PointSSIM addresses these challenges by leveraging strategies from mathematical morphology—particularly skeletonization and distance transforms—to derive a feature-centric, low-dimensional summary of image structure.

The essential conceptual leap is the movement from pixel arrays to a spatial point process framework, enabling the metric to focus on core object features and their spatial arrangement, rather than direct intensity correspondence.

2. Methodology: Image-to-Point Pattern Transformation

(a) Marked Point Pattern Representation

The first stage in the PointSSIM workflow is the transformation of each input binary image to a standard coordinate system. This ensures commensurability regardless of original resolution.
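The mapping to a standard coordinate system can be sketched as follows. This is a minimal illustration, not the reference implementation: the function name `to_standard_coords` is hypothetical, and the convention used here (longer image side mapped to length 1, pixel centres as positions) is an assumption, since the text does not specify the standard system.

```python
import numpy as np

def to_standard_coords(rows, cols, shape):
    """Map pixel indices to resolution-independent coordinates.

    Assumption: the longer image side is scaled to length 1 and
    pixel centres are used, so the aspect ratio is preserved.
    """
    n_rows, n_cols = shape
    scale = 1.0 / max(n_rows, n_cols)  # standard units per pixel
    y = (np.asarray(rows, dtype=float) + 0.5) * scale
    x = (np.asarray(cols, dtype=float) + 0.5) * scale
    return y, x, scale
```

Under this convention a length of 10 pixels in a 100-pixel-wide image and a length of 20 pixels in a 200-pixel-wide image both map to 0.1 standard units, which is what makes radii from different resolutions comparable.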

A Euclidean distance transform is then applied, assigning to each non-background pixel its distance to the nearest background pixel. The local maxima of this distance field, which represent geometric centers or skeleton loci of objects, are identified as anchor points. A locally adaptive selection criterion is enforced: anchor points must be separated by at least their distance to the edge, which prevents excessive density in closely packed regions and promotes a scale-adaptive representation.

For each anchor point, a mark is assigned comprising spatial coordinates, local radius (distance to boundary), and the object label (obtained by connected component analysis on the binary image).
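The anchor extraction and marking steps described above might be sketched as follows, using `scipy.ndimage`. Several choices here are assumptions rather than details confirmed by the source: the function name `extract_anchors` is hypothetical, candidates are taken as 3x3 local maxima of the distance field, the separation rule is applied greedily from the largest radius downward, and objects are labeled with SciPy's default 4-connectivity. Coordinates and radii are left in pixel units.

```python
import numpy as np
from scipy import ndimage

def extract_anchors(binary):
    """Return anchors as rows (i, j, r, label) of a float array."""
    binary = np.asarray(binary, dtype=bool)
    # Distance of each foreground pixel to the nearest background pixel.
    dist = ndimage.distance_transform_edt(binary)
    # Connected-component labels (default 4-connectivity in 2D).
    labels, _ = ndimage.label(binary)

    # Candidates: foreground pixels that are 3x3 local maxima of dist.
    is_peak = binary & (dist == ndimage.maximum_filter(dist, size=3))
    cand = np.argwhere(is_peak)
    # Visit candidates from the largest radius to the smallest.
    order = np.argsort(-dist[is_peak], kind="stable")

    anchors = []
    for i, j in cand[order]:
        r = dist[i, j]
        # Accept only if the candidate lies at least r_a away from
        # every accepted anchor a (assumed reading of the rule).
        if all(np.hypot(i - ai, j - aj) >= ar for ai, aj, ar, _ in anchors):
            anchors.append((i, j, r, labels[i, j]))
    return np.array(anchors, dtype=float).reshape(-1, 4)
```

Because accepted anchors always have radii at least as large as later candidates, checking against the accepted anchor's radius is enough to enforce the separation in both directions. The greedy loop is quadratic in the number of candidates, which is acceptable for a sketch but would need a spatial index for large images.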

(b) Construction of the Summary Vector

From the marked point pattern, PointSSIM forms a four-dimensional summary vector that encodes primary structural aspects:

  1. Intensity (Anchor Count): The number of anchor points, reflecting the point-process intensity.
  2. Area Coverage: The total area covered by circles (centered at anchor points, with radii corresponding to distance to boundary) normalized by the image domain.
  3. Connectivity/Complexity (Anchor Points per Object): Average number of anchor points per labeled object, conveying structural heterogeneity.
  4. Spatial Variance Irregularity: A normalized measure quantifying the spatial dispersion of anchor points across subdivisions of the image, distinguishing between regular, random, and clustered patterns.

This summary vector condenses the complex spatial configuration of each image into a concise set of descriptors with minimal redundancy.

3. Mathematical Formulation

Let $x$ denote a binary image. Its set of anchor points is $\mathcal{A} = \{(i, j, r, l)\}$, where $(i, j)$ are coordinates, $r$ is the local maximal distance to boundary, and $l$ is the object label.

The summary measures are as follows:

  1. Anchor Count:

$$V_1(x) = |\mathcal{A}|$$

  2. Area Coverage:

$$V_2(x) = \frac{\sum_{k=1}^{|\mathcal{A}|} r_k^2}{L_x L_y}$$

where $L_x, L_y$ are the image dimensions.

  3. Anchor Points per Object:

$$V_3(x) = \frac{|\mathcal{A}|}{\max(l)}$$

where $\max(l)$ is the number of labeled objects.

  4. Spatial Variance Irregularity: Anchor points are partitioned into subregions $B_i$, and for each, the variance in anchor count is compared with the Poisson expectation:

$$V_4(x) = \frac{1 + \frac{s - \lambda|B|}{s + \lambda|B|}}{2}$$

where $s$ is empirical variance, $\lambda = \frac{|\mathcal{A}|}{|A|}$, and $|B|$ is the area of each subregion.
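The four summary measures can be computed from a list of anchors roughly as follows. This is a sketch under stated assumptions: the function name `summary_vector` is hypothetical, the subregions are taken to be the cells of a regular `grid` x `grid` partition, $|A|$ is read as the area of the image domain (so that $\lambda|B|$ is the expected anchor count per cell), the variance is the population variance of the cell counts, and the first coordinate is assumed to run along the side of length `lx`.

```python
import numpy as np

def summary_vector(anchors, n_objects, lx, ly, grid=4):
    """Compute (V1, V2, V3, V4) from anchors given as rows (i, j, r, label)."""
    anchors = np.asarray(anchors, dtype=float).reshape(-1, 4)
    n = len(anchors)

    v1 = float(n)                                         # anchor count
    v2 = float(np.sum(anchors[:, 2] ** 2) / (lx * ly))    # sum r_k^2 / (Lx Ly)
    v3 = n / n_objects if n_objects else 0.0              # anchors per object

    # Anchor counts per cell of a grid x grid partition (assumed partition).
    counts, _, _ = np.histogram2d(
        anchors[:, 0], anchors[:, 1],
        bins=grid, range=[[0, lx], [0, ly]],
    )
    s = counts.var()            # population variance of the cell counts
    expected = n / grid ** 2    # lambda * |B| for equal-area cells
    denom = s + expected
    # With no anchors the ratio is undefined; 0.5 is an arbitrary neutral value.
    v4 = 0.5 * (1.0 + (s - expected) / denom) if denom > 0 else 0.5
    return v1, v2, v3, float(v4)
```

Under these assumptions a perfectly regular pattern with one anchor per cell has zero count variance and gives $V_4 = 0$, a Poisson-like pattern with variance equal to the mean gives $V_4 = 0.5$, and strongly clustered patterns push $V_4$ toward 1.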

The PointSSIM between two images $x_1$ and $x_2$ is defined as:

$$\text{PointSSIM}(x_1, x_2) = 1 - \frac{1}{4} \sum_{i=1}^3 \frac{|V_i(x_1) - V_i(x_2)|}{\max(V_i(x_1), V_i(x_2))} - \frac{|V_4(x_1) - V_4(x_2)|}{2}$$

yielding a similarity between 0 and 1, with 1 indicating perfect structural equivalence.
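A direct transcription of this expression into code might look as follows. The function name `point_ssim` is hypothetical, and two points are assumptions: the grouping of the final $V_4$ term follows the expression exactly as printed, and a relative-difference term is treated as zero when both values are zero (the printed formula leaves that case undefined).

```python
def point_ssim(v_a, v_b):
    """Combine two summary vectors (V1, V2, V3, V4) into one score."""
    rel = 0.0
    for a, b in zip(v_a[:3], v_b[:3]):
        largest = max(a, b)
        # Assumption: a 0/0 term counts as "no difference".
        if largest > 0:
            rel += abs(a - b) / largest
    # V4 already lies in [0, 1], so it enters as a plain difference.
    return 1.0 - rel / 4.0 - abs(v_a[3] - v_b[3]) / 2.0
```

For example, comparing `(10, 0.2, 2, 0.5)` with `(5, 0.1, 1, 0.5)` gives three relative differences of 0.5 each and no difference in the fourth component, so the score is 1 - 1.5/4 = 0.625. Identical vectors score exactly 1.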

4. Invariance and Discriminative Properties

PointSSIM is inherently resolution invariant due to the anchoring on key geometric points as opposed to fixed pixel locations. Both anchor extraction and all summary measures are defined in terms of geometric relations and normalized quantities, decoupling the metric from underlying grid discretization. The methodology also provides strong rotation invariance, as structural attributes derived from the distribution of anchor points are unaffected by global image orientation.
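The normalization argument can be made concrete for the area coverage measure. The short derivation below assumes that resampling an image by a factor $c > 0$ scales every length by $c$ and leaves the set of detected anchors unchanged, which is an idealization of what happens on a discrete grid.

```latex
% Assumption: resampling by a factor c > 0 scales every length by c.
r_k \mapsto c\, r_k, \qquad L_x \mapsto c\, L_x, \qquad L_y \mapsto c\, L_y
\;\Longrightarrow\;
V_2 \mapsto \frac{\sum_{k=1}^{|\mathcal{A}|} (c\, r_k)^2}{(c\, L_x)(c\, L_y)}
= \frac{c^2 \sum_{k=1}^{|\mathcal{A}|} r_k^2}{c^2 L_x L_y} = V_2(x)
```

Under the same idealization, $V_1$ and $V_3$ are pure counts and are unaffected, and $V_4$ depends only on anchor counts per subregion, so it is unchanged as long as the subregions scale with the image.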

Empirical results demonstrate that PointSSIM discriminates strongly between structurally distinct image classes, with higher within-class similarity and enhanced separation between divergent classes when compared to MSE, SSIM, and MS-SSIM. The four summary measures are observed to be relatively uncorrelated, thus capturing complementary aspects of image structure.

5. Applications and Experimental Results

PointSSIM's principal applications span geostatistics (comparison of multiple-point statistics models), computer vision (structural pattern matching in binary or segmented imagery), and any field requiring invariant structural image comparison across scales and orientations, such as medical imaging and remote sensing.

Benchmarking on synthetic binary datasets—including random fields, structured and unstructured object grids, and shape mixtures—consistently confirms PointSSIM's superiority in both resolution invariance and structural discrimination. The measure robustly distinguishes subtle class differences that remain undetected by conventional pixel-based similarity metrics.

6. Limitations and Future Directions

While PointSSIM delivers high computational efficiency and robust invariance, the restriction to four summary attributes is a deliberate compromise between discriminative power and interpretability. The measure may have limited sensitivity to specific structural nuances, such as fine-grained curvature, not captured by the present feature set. PointSSIM operates most effectively when images contain a sufficient number of distinct features to ensure meaningful anchor extraction.

Planned extensions include the development of analogues for grayscale and RGB images, exploration of curvature and higher-order statistics as additional features, integration as a regularizer in generative modeling workflows, and the development of adaptive, task-driven summary vectors.

7. Significance

PointSSIM constitutes a scalable, interpretable, and application-agnostic tool for structural comparison of binary images. By reframing image similarity in terms of marked point patterns, it provides robust invariance to resolution and orientation while remaining computationally tractable and easily interpretable. Its methodological foundations in mathematical morphology and summary statistics position it as a valuable complement to classical intensity-based metrics in academic and industrial settings.
