Arithmetic Coding Revisited

Alistair Moffat, Dept. of Computer Science, The University of Melbourne
Radford M. Neal, Dept. of Statistics and Dept. of Computer Science, University of Toronto
Ian H. Witten, Dept. of Computer Science, University of Waikato

Over the last decade, arithmetic coding has emerged as an important compression tool. It is now the method of choice for adaptive coding on multi-symbol alphabets because of its speed, low storage requirements, and effectiveness of compression. This paper describes a new implementation of arithmetic coding that incorporates several improvements over a widely-used earlier version (Comm. ACM, 30(6):520--541, June 1987), which has become a de facto standard. These improvements include fewer multiplicative operations, greatly extended range of alphabet sizes and symbol probabilities, and the use of low-precision arithmetic, permitting implementation by fast shift/add operations. We also describe a modular structure that separates the coding, modeling, and probability estimation components of a compression system. To motivate the improved coder, we consider the needs of a word-based text compression program. We report a range of experimental results using this and other models. Complete source code is available.

ACM Transactions on Information Systems, vol. 16, pp. 256-294.

There is software available on-line that implements the method described.

Associated reference: A preliminary and condensed form of this paper appears in the following conference proceedings:
Moffat, A., Neal, R., and Witten, I. H. (1995) ``Arithmetic coding revisited'', in J. A. Storer and M. Cohn (editors) Proceedings of the Fifth IEEE Data Compression Conference, pp. 202-211, Los Alamitos, California: IEEE Computer Society Press.