## Bayesian Training of Backpropagation Networks by the Hybrid Monte Carlo Method

**Radford M. Neal,
Dept. of Computer Science, University of Toronto**

It is shown that Bayesian training of backpropagation neural
networks can feasibly be performed by the "Hybrid Monte Carlo"
method. This approach allows the true predictive distribution
for a test case given a set of training cases to be approximated
arbitrarily closely, in contrast to previous approaches which
approximate the posterior weight distribution by a Gaussian. In
this work, the Hybrid Monte Carlo method is implemented in
conjunction with simulated annealing, in order to speed relaxation
to a good region of parameter space. The method has been applied
to a test problem, demonstrating that it can produce good
predictions, as well as an indication of the uncertainty of these
predictions. Appropriate weight scaling factors are found
automatically. By applying known techniques for calculation of
"free energy" differences, it should also be possible to compare
the merits of different network architectures. The work described
here should also be applicable to a wide variety of statistical
models other than neural networks.
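As a rough illustration of the Hybrid Monte Carlo idea described above, the sketch below runs leapfrog trajectories followed by a Metropolis accept/reject test, with momenta resampled before each trajectory. This is not the report's implementation: it targets the posterior of a toy Bayesian linear model rather than network weights, omits the simulated annealing component, and the step size `eps`, trajectory length, noise/prior variances, and least-squares initialization are all illustrative choices of mine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny synthetic regression problem standing in for network training data.
X = rng.normal(size=(20, 2))
true_w = np.array([1.0, -2.0])
y = X @ true_w + 0.1 * rng.normal(size=20)

noise_var, prior_var = 0.01, 1.0  # assumed noise and weight-prior variances


def U(w):
    """Potential energy = negative log posterior, up to an additive constant."""
    r = y - X @ w
    return r @ r / (2 * noise_var) + w @ w / (2 * prior_var)


def grad_U(w):
    return -X.T @ (y - X @ w) / noise_var + w / prior_var


def hmc_step(w, eps=0.01, n_leapfrog=20):
    """One Hybrid Monte Carlo update: a leapfrog trajectory plus a
    Metropolis test, which keeps the exact posterior invariant."""
    p = rng.normal(size=w.shape)                 # fresh Gaussian momenta
    h0 = U(w) + 0.5 * p @ p                      # initial total energy
    w_new = w.copy()
    p_new = p - 0.5 * eps * grad_U(w_new)        # initial half momentum step
    for i in range(n_leapfrog):
        w_new = w_new + eps * p_new              # full position step
        if i < n_leapfrog - 1:
            p_new = p_new - eps * grad_U(w_new)  # full momentum step
    p_new = p_new - 0.5 * eps * grad_U(w_new)    # final half momentum step
    h1 = U(w_new) + 0.5 * p_new @ p_new
    # Accept with probability min(1, exp(h0 - h1)); otherwise stay put.
    return w_new if rng.random() < np.exp(min(0.0, h0 - h1)) else w


# Start near the mode (found here by least squares) so burn-in is short.
w = np.linalg.lstsq(X, y, rcond=None)[0]
samples = []
for t in range(1000):
    w = hmc_step(w)
    if t >= 100:  # discard burn-in
        samples.append(w)
samples = np.array(samples)

# Monte Carlo estimate of the posterior mean of the weights; predictions
# for a test case would likewise be averaged over these samples.
posterior_mean = samples.mean(axis=0)
```

Averaging predictions over the retained samples is what approximates the true predictive distribution, in contrast to a single Gaussian fit to the posterior.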

Technical Report CRG-TR-92-1 (April 1992), 21 pages:
postscript, pdf.

**Associated references:**
Work related to that reported in "Bayesian training of backpropagation
networks by the hybrid Monte Carlo method" appears in the following
conference paper:
Neal, R. M. (1993) "Bayesian learning via stochastic dynamics",
in C. L. Giles, S. J. Hanson, and J. D. Cowan (editors)
*Advances in Neural Information Processing Systems 5*, pp. 475-482,
San Mateo, California: Morgan Kaufmann:
abstract.

The technical report is longer and contains material on annealing
not present in the conference paper. The conference paper contains
material on uncorrected stochastic dynamics and on comparisons with
standard network training that is not in the technical report.
Further developments along the same lines are reported in Chapter 3 of
my thesis:

Neal, R. M. (1994) *Bayesian Learning for Neural Networks*, Ph.D.
Thesis, Dept. of Computer Science, University of Toronto, 195 pages:
abstract,
postscript, pdf,
associated references,
associated software.

A revised version of this thesis, with some new material, was
published by Springer-Verlag:
Neal, R. M. (1996) *Bayesian Learning for Neural Networks*,
Lecture Notes in Statistics No. 118, New York: Springer-Verlag:
blurb,
associated references,
associated software.