Generalizations of Length Limited Huffman Coding for Hierarchical Memory Settings (2010.05005v3)
Abstract: In this paper, we study the problem of designing prefix-free encoding schemes with minimum average code length that can be decoded efficiently under a decode cost model capturing cost functions induced by a memory hierarchy. We also study a special case of this problem that is closely related to the length limited Huffman coding (LLHC) problem; we call this the {\em soft-length limited Huffman coding} problem. In this version, a penalty is incurred by each of the $n$ characters of the alphabet whose encoding is longer than a specified bound $D$ ($\leq n$), and the penalty increases linearly with the length of the encoding beyond $D$. The goal is to find a prefix-free encoding that has minimum average code length and whose total penalty is within a pre-specified bound ${\cal P}$. This generalizes the LLHC problem. We present an algorithm that solves this problem in $O(nD)$ time. We study a further generalization in which the penalty function and the objective function can both be arbitrary monotonically non-decreasing functions of the codeword length. We provide dynamic-programming-based exact algorithms and a PTAS for this setting.
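To make the quantities in the soft-length limited problem concrete, the following is a minimal sketch that evaluates a candidate code by its average code length and total penalty. It is not the paper's $O(nD)$ algorithm: it builds an ordinary, unconstrained Huffman code and then measures it. The function names (`huffman_lengths`, `kraft_ok`, `average_length`, `soft_penalty`) are hypothetical, and the penalty form $\max(0, \ell_i - D)$ with unit slope per character is an assumption; the abstract only states that the penalty grows linearly beyond $D$.

```python
import heapq
from fractions import Fraction


def huffman_lengths(weights):
    """Codeword lengths of an ordinary (unconstrained) Huffman code."""
    if len(weights) == 1:
        return [1]
    # Heap entries: (subtree weight, tie-breaker, symbol indices in subtree).
    heap = [(w, i, [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * len(weights)
    counter = len(weights)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (w1 + w2, counter, s1 + s2))
        counter += 1
    return lengths


def kraft_ok(lengths):
    """Kraft inequality: a binary prefix-free code with these lengths exists."""
    return sum(Fraction(1, 2 ** l) for l in lengths) <= 1


def average_length(weights, lengths):
    """Weighted average codeword length, with weights normalized to sum to 1."""
    return Fraction(sum(w * l for w, l in zip(weights, lengths)), sum(weights))


def soft_penalty(lengths, D):
    """Assumed penalty: one unit per bit beyond the soft bound D, per character."""
    return sum(max(0, l - D) for l in lengths)


weights = [1, 1, 2, 4, 8]          # illustrative character frequencies
lengths = huffman_lengths(weights)
print(lengths, average_length(weights, lengths), soft_penalty(lengths, D=3))
```

Under these assumptions, a code is feasible for the soft-length limited problem when `soft_penalty(lengths, D)` is at most ${\cal P}$. Setting ${\cal P} = 0$ forbids any codeword longer than $D$, which recovers the hard LLHC constraint.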