
Asymmetric numeral systems: entropy coding combining speed of Huffman coding with compression rate of arithmetic coding (1311.2540v2)

Published 11 Nov 2013 in cs.IT and math.IT

Abstract: Modern data compression is mainly based on two approaches to entropy coding: Huffman coding (HC) and arithmetic/range coding (AC). The former is much faster, but approximates probabilities with powers of 2, usually leading to relatively low compression rates. The latter uses nearly exact probabilities, easily approaching the theoretical compression rate limit (Shannon entropy), but at the cost of much larger computational effort. Asymmetric numeral systems (ANS) is a new approach to accurate entropy coding that ends this trade-off between speed and rate: the recent implementation [1] provides about $50\%$ faster decoding than HC for a 256-symbol alphabet, with compression rate similar to that provided by AC. This advantage comes from being simpler than AC: using a single natural number as the state, instead of two numbers representing a range. Besides simplifying renormalization, this allows the entire behavior for a given probability distribution to be placed in a relatively small table, defining an entropy coding automaton. The memory cost of such a table for a 256-symbol alphabet is a few kilobytes. There is large freedom in choosing a specific table; using a pseudorandom number generator initialized with a cryptographic key for this purpose allows the data to be simultaneously encrypted. This article also introduces and discusses many other variants of this new entropy coding approach, which can provide direct alternatives to standard AC, to large-alphabet range coding, and to approximated quasi-arithmetic coding.

Citations (164)

Summary

  • The paper introduces ANS as a novel entropy coding method that balances the speed of Huffman coding with the effective compression of arithmetic coding.
  • It uses a single natural number as the coder state, achieving about 50% faster decoding than Huffman coding for a 256-symbol alphabet while maintaining near-optimal compression rates.
  • ANS adapts symbol distributions to varying probabilities, offering dual applications in secure encryption and high-performance data compression.

Asymmetric Numeral Systems: Bridging the Gap in Entropy Coding

The paper authored by Jarek Duda introduces a novel method for entropy coding known as Asymmetric Numeral Systems (ANS). This method seeks to balance the computational efficiency of Huffman coding (HC) with the compression performance of arithmetic coding (AC), providing a robust alternative that addresses the tradeoff between speed and compression rate present in traditional methods.

Key Concepts

The paper delineates the current landscape of the two primary approaches to entropy coding: Huffman coding and arithmetic coding. Huffman coding is acknowledged for its speed but suffers from suboptimal compression rates because it approximates symbol probabilities with powers of two. Arithmetic coding, while capable of achieving near-optimal compression rates, incurs substantial computational overhead because it must maintain and renormalize a probability range.

ANS distinguishes itself by maintaining a single natural number as its state, which streamlines encoding and decoding compared to the two-number range state used in arithmetic coding. This yields faster decoding, roughly 50% faster than Huffman coding for a 256-symbol alphabet, without compromising the compression rate, which remains comparable to that of arithmetic coding.
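The binary case admits simple closed formulas. Below is a minimal sketch of the paper's uABS encode/decode steps, using exact rational arithmetic to avoid floating-point rounding; the function names are ours:

```python
from fractions import Fraction
from math import ceil, floor

def uabs_encode(x: int, s: int, p: Fraction) -> int:
    """Push bit s onto state x, where p = Pr(s = 1)."""
    if s == 1:
        return floor(x / p)
    return ceil((x + 1) / (1 - p)) - 1

def uabs_decode(x: int, p: Fraction):
    """Pop the most recently pushed bit from state x."""
    s = ceil((x + 1) * p) - ceil(x * p)
    if s == 1:
        return s, ceil(x * p)
    return s, x - ceil(x * p)

# Round trip: the state acts as a stack, so bits come back out in reverse.
p, x = Fraction(3, 10), 1
bits = [1, 0, 0, 1, 1, 0]
for s in bits:
    x = uabs_encode(x, s, p)
decoded = []
while x > 1:
    s, x = uabs_decode(x, p)
    decoded.append(s)
assert decoded == bits[::-1]
```

Each encode step grows the state by about -log2(Pr(s)) bits, which is exactly the Shannon cost of the symbol.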

Methodological Advancements

The ANS strategy involves redefining the distribution of symbol appearances on a natural number line, allowing more accurate encoding of general probability distributions. It eschews the conventional uniform distribution of numeral systems in favor of a more adaptive asymmetry that reflects the variability of symbol probabilities.
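Concretely, the uABS rule s(x) = ⌈(x+1)p⌉ - ⌈xp⌉ assigns an owner symbol to every natural number so that 1s appear with density p = Pr(1); a small sketch (our naming):

```python
from fractions import Fraction
from math import ceil

def owner(x: int, p: Fraction) -> int:
    """Symbol that 'owns' natural number x when Pr(1) = p."""
    return ceil((x + 1) * p) - ceil(x * p)

# For p = 1/4 the spread is periodic, like a standard numeral system:
print("".join(str(owner(x, Fraction(1, 4))) for x in range(16)))
# For p = 3/10 the 1s still have density 3/10, but the spread is uneven:
print("".join(str(owner(x, Fraction(3, 10))) for x in range(10)))
```

The count of 1s among {0, ..., x-1} is ⌈xp⌉, so the spread tracks the target probability exactly rather than rounding it to a power of two.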

Several variants of ANS are presented and discussed. These include:

  • Uniform Asymmetric Binary Systems (uABS): closed-form encoding and decoding formulas for the binary alphabet.
  • Range variants (rABS and rANS): alternatives akin to range coding, but with lower per-symbol computational cost.
  • Tabled ANS (tANS): precomputes the entire coding behavior for a large alphabet into a lookup table, turning the coder into a finite-state entropy automaton; for a 256-symbol alphabet the table occupies only a few kilobytes.
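To illustrate the range variant, here is a toy rANS step with quantized symbol counts summing to M (our variable names; arbitrary-precision state, renormalization omitted):

```python
# Toy rANS over alphabet {a, b, c} with quantized counts summing to M = 8.
freq = {'a': 4, 'b': 3, 'c': 1}
cdf  = {'a': 0, 'b': 4, 'c': 7}            # cumulative counts
M = 8
sym_of_slot = 'aaaabbbc'                   # which symbol owns each slot mod M

def rans_encode(msg: str, x: int = 1) -> int:
    for s in reversed(msg):                # rANS encodes back-to-front
        x = (x // freq[s]) * M + (x % freq[s]) + cdf[s]
    return x

def rans_decode(x: int, n: int) -> str:
    out = []
    for _ in range(n):
        slot = x % M
        s = sym_of_slot[slot]
        x = freq[s] * (x // M) + slot - cdf[s]
        out.append(s)
    return ''.join(out)

assert rans_decode(rans_encode('abc'), 3) == 'abc'
```

Each decode step needs one multiplication, one division, and a slot lookup, instead of the two-bound range bookkeeping of arithmetic coding.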

Performance and Practical Implications

The paper reports that ANS compresses data to rates close to the Shannon entropy with very small overhead. For instance, tANS uses precomputed tables to manage large-alphabet entropy coding efficiently, achieving substantial decoding speedups over Huffman coding while retaining near-optimal compression.
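A sketch of such a table-driven coder, assuming a tiny 16-state table and a naive sequential symbol spread (the paper discusses better spreads; all names are ours, and the bitstream is kept as (value, width) chunks instead of packed bits):

```python
R = 4
L = 1 << R                                  # 16 table states
freq = {'a': 8, 'b': 6, 'c': 2}             # quantized counts, summing to L
spread = [s for s, f in freq.items() for _ in range(f)]  # naive spread

# Decoding table: state i -> (symbol, bits to read, base of next state).
nxt = dict(freq)                            # x runs over [freq[s], 2*freq[s])
dec = []
for s in spread:
    x = nxt[s]; nxt[s] += 1
    nb = R - (x.bit_length() - 1)           # chosen so x << nb lands in [L, 2L)
    dec.append((s, nb, (x << nb) - L))

# Encoding table: (symbol, next state) -> (state, bits to write, bit value).
enc = {}
for i, (s, nb, base) in enumerate(dec):
    for t in range(base, base + (1 << nb)):
        enc[(s, t)] = (i, nb, t - base)

def tans_encode(msg, x=0):
    chunks = []
    for s in reversed(msg):                 # tANS also encodes back-to-front
        x, nb, bits = enc[(s, x)]
        chunks.append((bits, nb))
    return x, chunks

def tans_decode(x, chunks, n):
    out = []
    for _ in range(n):
        s, nb, base = dec[x]
        bits, _ = chunks.pop()              # LIFO bit chunks
        x = base + bits
        out.append(s)
    return ''.join(out)

x, chunks = tans_encode('abacabab')
assert tans_decode(x, chunks, 8) == 'abacabab'
```

Once the tables are built, coding involves no arithmetic beyond a table lookup and a small bit read/write per symbol, which is the source of the speed advantage the paper reports.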

A pivotal feature of ANS is its flexibility in symbol distribution for different probability contexts. This adaptability not only supports a wide range of applications but also enables simultaneous data encryption, leveraging the chaotic behavior of the state transitions in ANS.

Moreover, ANS's simplicity allows for efficient renormalization: because the state is a single number, fractional bits of information are carried with less computational burden than in AC, which must handle range adjustments and maintain precision.
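A sketch of byte-wise renormalization grafted onto the toy rANS step, assuming a 16-bit lower bound on the state and byte-sized output (our constants, not the paper's implementation):

```python
# Streaming rANS: keep the state x inside [LOW, 256*LOW) by moving whole bytes.
LOW = 1 << 16
freq = {'a': 4, 'b': 3, 'c': 1}
cdf  = {'a': 0, 'b': 4, 'c': 7}
M = 8
sym_of_slot = 'aaaabbbc'

def encode(msg):
    x, out = LOW, []
    for s in reversed(msg):
        limit = 256 * (LOW // M) * freq[s]   # keeps the new state below 256*LOW
        while x >= limit:
            out.append(x % 256)              # flush low bytes before the step
            x //= 256
        x = (x // freq[s]) * M + (x % freq[s]) + cdf[s]
    return x, out

def decode(x, out, n):
    msg = []
    for _ in range(n):
        slot = x % M
        s = sym_of_slot[slot]
        x = freq[s] * (x // M) + slot - cdf[s]
        while x < LOW and out:               # refill from the byte stream
            x = x * 256 + out.pop()
        msg.append(s)
    return ''.join(msg)

msg = 'abacabacbbaaacab' * 4
x, out = encode(msg)
assert decode(x, out, len(msg)) == msg
```

The renormalization condition is a single comparison per symbol; contrast this with AC, where both range bounds must be rescaled and carry propagation handled.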

Theoretical and Future Directions

The potential of ANS extends beyond traditional data compression. Its inherent properties make it suitable for cryptographic use: by choosing the coding table with a pseudorandom number generator initialized from a cryptographic key, ANS can encrypt data while encoding it, offering a dual-purpose mechanism for both compression and encryption.
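The idea is easy to sketch: the coding table is fully determined by the symbol spread, so deriving the spread from a keyed generator makes the automaton itself the shared secret. A toy illustration (not a vetted cipher; Python's random.Random stands in for a cryptographic PRNG):

```python
import random

def keyed_spread(freq, key):
    """Spread symbols over the table slots in a key-dependent order."""
    spread = [s for s, f in freq.items() for _ in range(f)]
    random.Random(key).shuffle(spread)       # toy PRNG; a CSPRNG in practice
    return spread

freq = {'a': 8, 'b': 6, 'c': 2}
# Same key -> same table; the symbol counts (hence the rate) are unchanged.
assert keyed_spread(freq, 'secret') == keyed_spread(freq, 'secret')
assert sorted(keyed_spread(freq, 'secret')) == sorted(keyed_spread(freq, 'other'))
```

Both parties rebuild identical tANS tables from the shared key; without it, an observer faces a state machine whose transitions they cannot reconstruct, while the compression rate is unaffected because only the ordering of slots changes.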

Future developments could explore further optimization of ANS coding tables, especially for adaptive and context-dependent compression tasks. The research suggests a fertile ground for advancing entropy coding systems, potentially pushing the boundaries of compression efficiency and cryptographic security.

Conclusion

Asymmetric Numeral Systems present a compelling alternative that integrates the best attributes of Huffman and arithmetic coding. The method’s ability to achieve a delicate balance between operational speed and compression efficacy makes it an attractive choice for modern applications requiring efficient and secure data handling. The paper's contributions lay groundwork for further exploration into low-complexity, high-performance entropy coding and its integration into cryptographic frameworks.
