Radford Neal's Research: Data Compression

Data files can be compressed, on average, if a good probabilistic model of their contents is available. Such models must inevitably be Bayesian, though data compression practitioners are mostly unaware of this fact.

Given a model, near-optimal encoding and decoding can be performed using the method of ``arithmetic coding'', described in the following tutorial:

Witten, I. H., Neal, R. M., and Cleary, J. G. (1987) ``Arithmetic coding for data compression'', Communications of the ACM, vol. 30, pp. 520-540: abstract, associated software.
A newer implementation of arithmetic coding is described in the following paper:
Moffat, A., Neal, R. M., and Witten, I. H. (1998) ``Arithmetic coding revisited'', ACM Transactions on Information Systems, vol. 16, pp. 256-294: abstract, associated reference, associated software.

You can also obtain software implementing arithmetic coding, including the methods described in both the above papers.
Back to Radford Neal's research interests
Back to Radford Neal's home page