Bayesian Learning for Neural Networks

Radford M. Neal, Dept. of Statistics and Dept. of Computer Science, University of Toronto

Artificial "neural networks" are now widely used as flexible models for regression and classification applications, but questions remain regarding what these models mean, and how they can safely be used when training data is limited. Bayesian Learning for Neural Networks shows that Bayesian methods allow complex neural network models to be used without fear of the "overfitting" that can occur with traditional neural network learning methods. Insight into the nature of these complex Bayesian models is provided by a theoretical investigation of the priors over functions that underlie them. Use of these models in practice is made possible by Markov chain Monte Carlo techniques. Both the theoretical and computational aspects of this work are of wider statistical interest, as they contribute to a better understanding of how Bayesian methods can be applied to complex problems.

Presupposing only basic knowledge of probability and statistics, this book should be of interest to many researchers in Statistics, Engineering, and Artificial Intelligence. Software for Unix systems that implements the methods described is freely available over the Internet.

Lecture Notes in Statistics No. 118, Springer-Verlag New York, 1996, ISBN 0-387-94724-8. Available as a free download from the Springer site.

The neural network programs that go with the book are now part of my software for flexible Bayesian modeling.

Associated references: This book is a revision of my thesis of the same title, with new material added:
Neal, R. M. (1994) Bayesian Learning for Neural Networks, Ph.D. Thesis, Dept. of Computer Science, University of Toronto, 195 pages: abstract, postscript, pdf, associated references, associated software.
Chapter 2 of Bayesian Learning for Neural Networks develops ideas from the following technical report:
Neal, R. M. (1994) "Priors for infinite networks", Technical Report CRG-TR-94-1, Dept. of Computer Science, University of Toronto, 22 pages: abstract, postscript, pdf.
Chapter 3 is a further development of ideas in the following papers:
Neal, R. M. (1993) "Bayesian learning via stochastic dynamics", in C. L. Giles, S. J. Hanson, and J. D. Cowan (editors) Advances in Neural Information Processing Systems 5, pp. 475-482, San Mateo, California: Morgan Kaufmann: abstract.

Neal, R. M. (1992) "Bayesian training of backpropagation networks by the hybrid Monte Carlo method", Technical Report CRG-TR-92-1, Dept. of Computer Science, University of Toronto, 21 pages: abstract, postscript, pdf.