
Outer Mask Preserving Score (OMPS)

Updated 20 December 2025
  • OMPS is a task-specific evaluation metric that measures the fidelity of non-edited regions in drag-based image editing by comparing pixel-level differences.
  • It computes the RMSE between the source and edited images over the outer mask, normalized by the RMSE between the source and ground-truth images over the same region, ensuring fair, sample-independent comparisons.
  • The metric aids in benchmarking by penalizing models that alter unintended areas, thereby highlighting the benefits of mask-aware editing approaches.

Outer Mask Preserving Score (OMPS) is a task-specific evaluation metric introduced to rigorously quantify the preservation of non-edited regions in drag-based image editing. In this context, “drag” refers to user-specified manipulations where a localized region of an image is moved or transformed, guided by an explicit mask. OMPS directly measures pixel-level fidelity in the outer (non-edited) regions, penalizing models that introduce undesired changes to the background or to content not targeted by the user’s edit. By normalizing the error in non-edited regions with respect to the actual change present in those regions between the source and ground-truth target images, OMPS enables meaningful, sample-independent comparisons across diverse editing scenarios (Zafarani et al., 13 Dec 2025).

1. Motivation and Context

Drag-based image editing focuses edits on small, specified regions, typically controlled by an inner mask. However, generative models can introduce undesired changes outside the intended mask—such as smearing, blurring, or hallucination—particularly in the background. Standard metrics like SSIM or FID are global and can obscure such localized artifacts. OMPS was developed to specifically address this evaluation gap: it quantifies the extent to which methods preserve the outer (non-masked) areas, directly testing the critical requirement of localized editing—“Don’t touch pixels outside the user’s mask” (Zafarani et al., 13 Dec 2025).

2. Mathematical Definition

Let $I_S : \Omega \to \mathbb{R}^3$ denote the source image, $I_G : \Omega \to \mathbb{R}^3$ the generated (edited) image, $I_T : \Omega \to \mathbb{R}^3$ the real ground-truth target image, and $M : \Omega \to \{0,1\}$ the user-specified inner mask (1 for pixels to edit, 0 otherwise). The outer mask is $\bar M = 1 - M$, and the associated pixel set is $\Omega_{\text{outer}} = \{p \in \Omega \mid \bar M(p) = 1\}$.

Define the root-mean-square error (RMSE) in the outer region between two images $X$ and $Y$ as:

$$\mathrm{RMSE}(X, Y) = \sqrt{\frac{1}{|\Omega_{\text{outer}}|} \sum_{p \in \Omega_{\text{outer}}} \| X(p) - Y(p) \|_2^2}$$

The OMPS for a sample is then:

$$\mathrm{OMPS}(I_G, I_S, I_T, M) = \left| \frac{\mathrm{RMSE}(I_S \odot \bar M,\, I_G \odot \bar M)}{\mathrm{RMSE}(I_S \odot \bar M,\, I_T \odot \bar M)} - 1 \right|$$

where $\odot$ denotes elementwise multiplication by the mask, ensuring only outer mask pixels contribute. If $\mathrm{RMSE}(I_S, I_T) = 0$ (no ground-truth change in the outer region), OMPS is set to $\mathrm{RMSE}(I_S, I_G)$.

3. Computation Protocol

The practical computation of OMPS involves several explicit steps to ensure reproducibility and comparability:

  1. Preprocessing: Align $I_S$, $I_G$, and $I_T$ spatially, ensuring all images are registered and at matching resolution. The inner mask $M$ must be correspondingly registered.
  2. Form Outer Mask: Compute $\bar M = 1 - M$ and extract the set $\Omega_{\text{outer}}$.
  3. Numerator (Source-to-Edited RMSE): For each $p \in \Omega_{\text{outer}}$, calculate the vector difference $I_S(p) - I_G(p)$, square its $\ell_2$ norm, average, and take the square root.
  4. Denominator (Source-to-Target RMSE): Analogously compute $\mathrm{RMSE}(I_S \odot \bar M, I_T \odot \bar M)$.
  5. Normalization and Offset: Compute the ratio of numerator to denominator, subtract 1, and take the absolute value to yield the OMPS.
  6. Channel Handling: All RGB channels are treated equivalently in the $\ell_2$ computation; rescale images to $[0,1]$ if needed.

This protocol enforces strict isolation of non-edited region assessment, ensuring that OMPS is robust to varying amounts of outer-mask content and is not confounded by inner region motion.
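The steps above can be sketched in NumPy as follows. This is a minimal illustration written from the definition in this article, not the benchmark's reference code: the function names `to_unit_float`, `outer_rmse`, and `omps` are hypothetical, images are assumed to be already registered `H×W×3` arrays, and `uint8` inputs are assumed to span 0 to 255.

```python
import numpy as np


def to_unit_float(img):
    """Convert an image to float64; uint8 inputs are assumed to be 0-255 and rescaled to [0, 1]."""
    arr = np.asarray(img)
    if arr.dtype == np.uint8:
        return arr.astype(np.float64) / 255.0
    return arr.astype(np.float64)


def outer_rmse(x, y, outer):
    """RMSE over outer-mask pixels; x and y are HxWx3 arrays, outer is an HxW bool array."""
    diff = x[outer] - y[outer]             # (N, 3) differences on outer pixels only
    sq_norms = np.sum(diff ** 2, axis=-1)  # squared l2 norm per pixel, channels treated equally
    return float(np.sqrt(sq_norms.mean()))


def omps(i_g, i_s, i_t, inner_mask):
    """OMPS for one sample: |RMSE(source, edited) / RMSE(source, target) - 1| on the outer mask."""
    i_g, i_s, i_t = (to_unit_float(a) for a in (i_g, i_s, i_t))
    if not (i_g.shape == i_s.shape == i_t.shape and i_s.shape[:2] == np.shape(inner_mask)):
        raise ValueError("images and mask must share the same spatial size")
    outer = ~np.asarray(inner_mask, dtype=bool)  # outer mask = 1 - inner mask
    if not outer.any():
        raise ValueError("outer mask is empty")
    num = outer_rmse(i_s, i_g, outer)  # source-to-edited RMSE
    den = outer_rmse(i_s, i_t, outer)  # source-to-target RMSE
    if den == 0.0:
        return num  # fallback stated in the definition: no ground-truth change outside the mask
    return abs(num / den - 1.0)


# Toy usage with synthetic data (illustrative values only).
src = np.full((8, 8, 3), 0.5)
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True                 # inner (edited) region
target = src.copy()
target[~mask] += 0.05                 # ground truth shifts the outer region slightly
edited = src.copy()
edited[~mask] += 0.10                 # edited image shifts it twice as much
print(100 * omps(edited, src, target, mask))
```

Indexing with the boolean outer mask restricts the sum to $\Omega_{\text{outer}}$ directly, which has the same effect as multiplying both images by $\bar M$ before differencing.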

4. Interpretation and Comparative Value

OMPS yields low scores (with 0 as perfect preservation) when a method leaves the outer mask unchanged except where warranted by ground-truth motion. Higher OMPS indicates spillage or unwanted alteration in the non-edited area; because the score is the absolute deviation of the RMSE ratio from 1, changing the outer region less than the ground truth does is penalized as well. The metric’s normalization ensures that samples with significant true background motion are not overly penalized, facilitating fair cross-sample comparison. OMPS is particularly sensitive to background artifacts overlooked by global image similarity metrics and highlights advantages of explicit mask-based editing frameworks over latent or text-only editing approaches.
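A short worked example may help when reading scores. The RMSE values below are made up for illustration (they are not taken from the benchmark); the source-to-target RMSE in the outer region is assumed to be 0.05 in each case.

```latex
\begin{align*}
\text{Edited image changes too much:}\quad
  & \left| \frac{0.06}{0.05} - 1 \right| = 0.2
  && \Rightarrow\ 20 \text{ when reported} \times 100 \\
\text{Edited image changes too little:}\quad
  & \left| \frac{0.04}{0.05} - 1 \right| = 0.2
  && \Rightarrow\ 20 \text{ when reported} \times 100 \\
\text{Edited image matches the ground-truth change:}\quad
  & \left| \frac{0.05}{0.05} - 1 \right| = 0
  && \Rightarrow\ 0
\end{align*}
```

Note that the score compares magnitudes of change only: two edited images with the same outer-region RMSE relative to the source receive the same OMPS, even if the changes fall on different pixels.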

5. Variants and Theoretical Extensions

While only the canonical RMSE-based OMPS is introduced in the referenced benchmark, several plausible extensions—none realized in the original work—are noted:

  • Thresholded-OMPS: Introducing an error threshold ε\varepsilon to ignore minor, imperceptible deviations.
  • Weighted-OMPS: Weighting the outer mask pixels by distance from the mask boundary, relaxing preservation constraints near the edit edge.
  • SSIM-OMPS: Substituting RMSE with local Structural Similarity (SSIM) indices and applying the same normalization scheme.

Such variants could adjust stringency or perceptual relevance for application-specific contexts. However, all benchmarking results refer to the standard OMPS definition (Zafarani et al., 13 Dec 2025).
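Since these variants are not specified in the original work, the following is only a speculative sketch of how the thresholded and weighted forms might look. The function names, the `eps` and `weights` parameters, and the choice to apply the threshold to the numerator only are all assumptions made for illustration. A distance-based weight map could, for instance, be obtained with `scipy.ndimage.distance_transform_edt` applied to the outer mask, but here the weights are simply passed in.

```python
import numpy as np


def weighted_outer_rmse(x, y, weights, eps=0.0):
    """Weighted RMSE; weights is HxW and non-negative, eps zeroes out per-pixel l2 errors <= eps."""
    dist = np.linalg.norm(x - y, axis=-1)     # per-pixel l2 distance, shape HxW
    dist = np.where(dist <= eps, 0.0, dist)   # hypothetical threshold: ignore tiny deviations
    total = weights.sum()
    if total == 0:
        raise ValueError("weights sum to zero")
    return float(np.sqrt((weights * dist ** 2).sum() / total))


def omps_variant(i_g, i_s, i_t, inner_mask, weights=None, eps=0.0):
    """Hypothetical thresholded/weighted OMPS; reduces to the canonical form by default."""
    outer = 1.0 - np.asarray(inner_mask, dtype=np.float64)   # outer mask as 0/1 floats
    w = outer if weights is None else np.asarray(weights, dtype=np.float64) * outer
    num = weighted_outer_rmse(i_s, i_g, w, eps)   # threshold applied to the edited image only
    den = weighted_outer_rmse(i_s, i_t, w, 0.0)   # ground-truth change left unthresholded
    return num if den == 0.0 else abs(num / den - 1.0)
```

With `weights=None` and `eps=0.0` this computes the same quantity as the standard definition, so the variants can be compared against the canonical score on the same samples.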

6. Experimental Findings and Benchmark Results

Applying OMPS across 17 state-of-the-art drag editing models on the RealDrag benchmark (415 samples), the following trends were reported (all OMPS values multiplied by 100 for reporting clarity):

| Model Class | OMPS (×100) | Outer Mask Handling |
|---|---|---|
| Mask-aware (e.g., DragNoise) | 16.8–17.9 | Explicit hard outer mask input |
| Latent-space/text-guided editing | >25–30 | Background often altered |
| Edit-minimizing (EEdit) | ~43.3 | Minimal overall change, but also poor object translation |

Key findings:

  • Methods incorporating the outer mask as a hard constraint outperform those relying solely on semantic or latent space guidance in terms of non-edited region preservation.
  • Latent and text-guided models frequently spill edits beyond the intended area, inflating OMPS.
  • Approaches that minimize the overall edit while neglecting the quality of the actual object transition can paradoxically score poorly, because the denominator shrinks when the ground truth exhibits minimal change.
  • Strong mask-aware methods cluster in the 16–25 range; high OMPS correlates with poor background fidelity (Zafarani et al., 13 Dec 2025).

OMPS thus serves as a discriminative metric for ranking and diagnosing the locality and precision of drag-based editing algorithms, exposing limitations that are otherwise hidden by traditional holistic metrics.

7. Significance and Perspective

OMPS establishes a rigorous, reproducible criterion for the core requirement of localized editing: maintaining strict pixel-level fidelity outside the manipulation mask. Its adoption in the RealDrag benchmark provides objective, sample-independent assessment over diverse edit scenarios, and highlights the essential role of precise mask integration in image editing systems. The explicit penalization of outer-mask artifacts promotes development toward high-fidelity, localized editors and enables systematic comparison as the field advances.

References (1)
