QUICK REVIEW

[Paper Review] Asymmetric numeral systems

Jarek Duda|ArXiv.org|Feb 2, 2009

Algorithms and Data Compression|5 references|81 citations

TL;DR

This paper introduces asymmetric numeral systems (ANS), an entropy coding framework that generalizes standard numeral systems, which are optimal for equiprobable symbols, to arbitrary symbol probability distributions. The coder keeps a single state instead of the two values that define a range in range coding, and it compresses close to the Shannon entropy limit. The family spans an extremely precise variant (ABS) and an extremely fast variant (ANS) whose coding tables can be generated from a key-seeded pseudorandom number generator, which encrypts the data as part of coding. The paper also presents an error correction scheme with expected linear correction time that approaches Shannon’s limit for any noise level.

ABSTRACT

In this paper will be presented new approach to entropy coding: family of generalizations of standard numeral systems which are optimal for encoding sequence of equiprobable symbols, into asymmetric numeral systems - optimal for freely chosen probability distributions of symbols. It has some similarities to Range Coding but instead of encoding symbol in choosing a range, we spread these ranges uniformly over the whole interval. This leads to simpler encoder - instead of using two states to define range, we need only one. This approach is very universal - we can obtain from extremely precise encoding (ABS) to extremely fast with possibility to additionally encrypt the data (ANS). This encryption uses the key to initialize random number generator, which is used to calculate the coding tables. Such preinitialized encryption has additional advantage: is resistant to brute force attack - to check a key we have to make whole initialization. There will be also presented application for new approach to error correction: after an error in each step we have chosen probability to observe that something was wrong. We can get near Shannon's limit for any noise level this way with expected linear time of correction.
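
To make the single-state idea in the abstract concrete, here is a minimal Python sketch of one commonly cited formulation of the uniform binary variant (often called uABS). It is an illustration under assumptions, not necessarily the paper's exact construction: the function names are hypothetical, `p` is the probability of bit 1, exact `Fraction` arithmetic replaces the finite-precision arithmetic a practical coder would use, and the state is allowed to grow without bound instead of being renormalized. Decoding pops bits in last-in, first-out order, so the decoder needs the bit count and reverses its output.

```python
from fractions import Fraction
from math import ceil, floor


def uabs_encode_step(x, s, p):
    """Push bit s onto state x; p is the (exact) probability that s == 1."""
    if s == 1:
        return floor(x / p)
    return ceil((x + 1) / (1 - p)) - 1


def uabs_decode_step(x, p):
    """Pop the most recently encoded bit; return (bit, previous state)."""
    s = ceil((x + 1) * p) - ceil(x * p)
    if s == 1:
        return 1, ceil(x * p)
    return 0, x - ceil(x * p)


def uabs_encode(bits, p, x=1):
    """Encode a bit sequence into one integer, starting from state x >= 1."""
    for s in bits:
        x = uabs_encode_step(x, s, p)
    return x


def uabs_decode(x, n, p):
    """Recover n bits from state x; also return the state that remains."""
    out = []
    for _ in range(n):
        s, x = uabs_decode_step(x, p)
        out.append(s)
    out.reverse()  # bits come out in reverse order of encoding
    return out, x
```

The start state is 1 rather than 0 because, with these formulas, encoding bit 1 from state 0 leaves the state at 0 and would lose information. A likely bit (here 0 when `p` is small) grows the state by a small factor and an unlikely bit by a large one, which is how a single integer tracks the accumulated information.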

Motivation & Objective

To develop a universal entropy coding method that generalizes standard numeral systems to arbitrary symbol probability distributions.
To replace traditional two-state range coding with a one-state encoding mechanism for improved efficiency and simplicity.
To integrate data encryption directly into the coding process using key-initialized pseudorandom number generators.
To design a practical error correction scheme that approaches Shannon’s channel capacity with expected linear correction time.
To unify compression, encryption, and error resilience in a single, efficient framework based on ANS.

Proposed method

Use a single state to represent the current encoding position, replacing the two-state range definition in arithmetic coding.
Distribute symbols uniformly across the state space instead of in contiguous ranges, placing information in the least significant bits.
Initialize the coding table using a pseudorandom number generator seeded with a key, enabling built-in encryption.
Apply a one-way transformation to the state during encoding/decoding, using a precomputed lookup table based on symbol probabilities.
Introduce a probabilistic error detection mechanism: at each step, with probability $ p_d $, an error is detected, enabling correction via path tracking.
Construct generalized block codes by enforcing minimum Hamming distance on the least significant bits of the state, using XOR and permutation operations on codewords.
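
The table-based steps above can be sketched as follows. This is a simplified illustration, not the paper's implementation: the names `build_tables`, `tans_encode`, and `tans_decode` are hypothetical, symbol probabilities are approximated by integer counts whose sum `L` is the table size, and Python's `random.Random` stands in for the key-seeded generator (it is not cryptographically secure, so this shows the mechanism only). The state always stays in the range `[L, 2L)`, and bits shifted out of it form the compressed stream.

```python
import random


def build_tables(counts, key):
    """Spread symbols over L = sum(counts) slots in a key-dependent order."""
    L = sum(counts.values())
    spread = [s for s, c in sorted(counts.items()) for _ in range(c)]
    random.Random(key).shuffle(spread)  # assumption: the key seeds the PRNG
    next_x = dict(counts)  # each symbol s owns sub-states counts[s]..2*counts[s]-1
    decode = []            # slot index (state - L) -> (symbol, sub-state)
    encode = {}            # (symbol, sub-state) -> state
    for i, s in enumerate(spread):
        decode.append((s, next_x[s]))
        encode[(s, next_x[s])] = L + i
        next_x[s] += 1
    return L, encode, decode


def tans_encode(symbols, counts, key):
    """Return (final state, emitted bits) for a symbol sequence."""
    L, enc, _ = build_tables(counts, key)
    x, bits = L, []
    for s in symbols:
        while x >= 2 * counts[s]:  # move low bits to the stream
            bits.append(x & 1)
            x >>= 1
        x = enc[(s, x)]            # table lookup gives the next state
    return x, bits


def tans_decode(x, bits, n, counts, key):
    """Recover n symbols; the tables are rebuilt from the same key."""
    L, _, dec = build_tables(counts, key)
    bits = list(bits)              # copy so the caller's list is untouched
    out = []
    for _ in range(n):
        s, x = dec[x - L]
        while x < L:               # refill from the stream, last bit first
            x = 2 * x + bits.pop()
        out.append(s)
    out.reverse()                  # symbols come out in reverse order
    return out
```

For example, `tans_encode(list("abacab"), {"a": 5, "b": 2, "c": 1}, key=1234)` returns a state and a bit list that `tans_decode` turns back into the original symbols when given the same counts and key. A decoder with a different key builds a different table, and every key guess requires rebuilding the whole table, which is the brute-force cost the review refers to.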

Experimental results

Research questions

RQ1: Can a one-state entropy coder achieve compression efficiency close to Shannon’s entropy limit while being faster than arithmetic coding?
RQ2: How can data encryption be natively integrated into the entropy coding process without additional computational cost?
RQ3: Can a practical error correction mechanism be designed that maintains linear expected correction time and approaches Shannon’s limit for any noise level?
RQ4: What is the impact of using pseudorandomly initialized coding tables on compression performance and security?
RQ5: How can redundancy be connected across blocks to handle localized error concentrations without exponential correction cost?

Key findings

ANS achieves compression rates comparable to arithmetic coding while using only a single state, simplifying implementation and improving speed.
The method supports near-optimal compression with precision approaching the theoretical Shannon limit, especially when symbol probabilities are well-estimated.
Built-in encryption via key-initialized pseudorandom tables provides strong resistance to brute-force attacks, as full initialization is required to test any key.
The error correction mechanism detects errors with probability $ p_d $ per step and achieves expected linear correction time when $ p_d $ exceeds a threshold related to Shannon’s limit.
By connecting redundancy across blocks through the coder’s internal state, the method can handle high local error concentrations and approach the theoretical channel capacity.
Generalized block codes with enforced Hamming distance (e.g., distance 2 or more) allow immediate detection of single-bit errors, reducing the number of decoder table lookups and speeding up correction.
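
Two of the quantities in these findings can be checked with a few lines of Python. The sketch below rests on stated assumptions rather than on the paper's construction: detection events are treated as independent across steps with probability `p_d` each, so the delay until detection is geometric, and an even-parity code is used purely as a familiar example of a code with minimum Hamming distance 2. All function names are hypothetical.

```python
from itertools import combinations


def expected_detection_delay(p_d):
    """Mean number of steps until an error is noticed (geometric model)."""
    return 1.0 / p_d


def prob_detected_within(p_d, k):
    """Probability that an error is noticed within k steps."""
    return 1.0 - (1.0 - p_d) ** k


def hamming_distance(a, b):
    """Number of bit positions in which integers a and b differ."""
    return bin(a ^ b).count("1")


def min_hamming_distance(codewords):
    """Smallest pairwise distance over a set of integer codewords."""
    return min(hamming_distance(a, b) for a, b in combinations(codewords, 2))


def even_parity_codewords(n_bits):
    """All n_bits-wide words with an even number of ones."""
    return [w for w in range(1 << n_bits) if bin(w).count("1") % 2 == 0]


def single_flip_is_detected(codewords, n_bits):
    """True if flipping any one bit of any codeword leaves the code."""
    valid = set(codewords)
    return all((w ^ (1 << i)) not in valid
               for w in codewords for i in range(n_bits))
```

With `p_d = 0.25` the expected delay under this model is 4 steps, and with `p_d = 0.5` an error is noticed within 3 steps with probability 0.875. A larger `p_d` shortens the stretch of data the corrector has to search, at the price of more redundancy.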

This review was created by AI and reviewed by human editors.