Approximability of compressed-length minimization in OPE/OMS

Determine whether the compressed‑length minimization variants of Optimal Pair Encoding (OPE) or Optimal Merge Sequence (OMS) admit polynomial‑time approximation algorithms with nontrivial guarantees (e.g., constant‑factor approximations), and characterize their approximability status (such as APX‑completeness).

Background

APX-hardness extends to the compressed-length objective. The paper also shows that BPE does not achieve a constant-factor approximation for compressed length, and it leaves open whether any polynomial-time algorithm can do so.

Clarifying the approximability of compressed length is essential for applications where minimizing the final encoded size (rather than maximizing utility) is the primary objective.
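The two objectives can be compared on a toy input. The sketch below is a minimal greedy BPE, not the paper's formal definition: it assumes that utility means the reduction in sequence length after k merges, that ties between equally frequent pairs go to the first pair seen, and that merges are applied left to right without overlap. The names `merge_pair`, `greedy_bpe`, and `length_and_utility` are hypothetical.

```python
from collections import Counter


def merge_pair(seq, pair, new_symbol):
    """Replace non-overlapping occurrences of `pair`, scanning left to right."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def greedy_bpe(text, k):
    """Apply up to k greedy merges of the most frequent adjacent pair.

    Assumption: pair counts include overlapping occurrences (e.g. "aaa"
    counts "aa" twice), and ties go to the first pair encountered.
    """
    seq = list(text)
    for _ in range(k):
        counts = Counter(zip(seq, seq[1:]))
        if not counts:
            break  # fewer than two symbols left, nothing to merge
        pair, _freq = max(counts.items(), key=lambda kv: kv[1])
        seq = merge_pair(seq, pair, pair[0] + pair[1])
    return seq


def length_and_utility(text, k):
    """Return (compressed length, utility), with utility = original - compressed."""
    compressed = greedy_bpe(text, k)
    return len(compressed), len(text) - len(compressed)


if __name__ == "__main__":
    for k in (1, 2):
        print(k, length_and_utility("abababab", k))
```

On an input of length n, the two quantities always sum to n, so they share the same optimal solutions. Approximation ratios do not transfer, however: an algorithm that gains 90% of the optimal utility may still produce an output many times longer than the optimum when the optimal compressed length is small relative to n.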

References

The polynomial-time approximability of compression length (for OPE or OMS by any algorithm, with or without restrictions on the alphabet) is left open.

Theoretical Analysis of Byte-Pair Encoding (2411.08671 - Kozma et al., 13 Nov 2024) in Section 6 (Conclusion and open questions)