It is shown that Bayesian inference from data modeled by a mixture distribution can feasibly be performed via Monte Carlo simulation. This method exhibits the true Bayesian predictive distribution, implicitly integrating over the entire underlying parameter space. An infinite number of mixture components can be accommodated without difficulty, using a prior distribution for mixing proportions that selects a reasonable subset of components to explain any finite training set. The need to decide on a "correct" number of components is thereby avoided. The feasibility of the method is shown empirically for a simple classification task.
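The claim that a prior over infinitely many components uses only a modest subset for any finite data set can be illustrated by sampling from the Chinese restaurant process, the prior over partitions induced by a Dirichlet process. The sketch below is not the sampler used in the report: it draws component labels from the prior alone, with no data or likelihood involved. The function name `sample_crp_assignments`, the concentration parameter `alpha`, and the values chosen for it are assumptions made for this illustration.

```python
import random


def sample_crp_assignments(n_items, alpha, rng):
    """Draw component labels for n_items from a Chinese restaurant process.

    alpha is the concentration parameter (an assumed value, not from the report).
    Returns the label of each item and the number of items in each component.
    """
    counts = []  # counts[k] = number of items assigned to component k so far
    labels = []
    for _ in range(n_items):
        # An existing component k is chosen with weight counts[k];
        # a previously unused component is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new component
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


rng = random.Random(0)  # fixed seed so the run is repeatable
labels, counts = sample_crp_assignments(200, alpha=1.0, rng=rng)
print(len(counts), "components used for", len(labels), "items")
```

Under this prior the number of occupied components grows slowly with the number of items, so the count printed is typically far smaller than the number of items even though the set of available components is unbounded.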
Technical Report CRG-TR-91-2 (June 1991), 23 pages: postscript, pdf.
Neal, R. M. (1992) ``Bayesian mixture modeling'', in C. R. Smith, G. J. Erickson, and P. O. Neudorfer (editors) Maximum Entropy and Bayesian Methods: Proceedings of the 11th International Workshop on Maximum Entropy and Bayesian Methods of Statistical Analysis, Seattle, 1991, pp. 197-211, Dordrecht: Kluwer Academic Publishers: abstract.

The models in this paper with a countably infinite number of components are equivalent to what are called Dirichlet process mixture models. Newer work on these is described in the following technical reports:
Neal, R. M. (1998) ``Markov chain sampling methods for Dirichlet process mixture models'', Technical Report No. 9815, Dept. of Statistics, University of Toronto, 17 pages: abstract, postscript, pdf, associated references, associated software.
Jain, S. and Neal, R. M. (2000) ``A Split-Merge Markov Chain Monte Carlo Procedure for the Dirichlet Process Mixture Model'', Technical Report No. 2003, Dept. of Statistics (July 2000), 32 pages: abstract, postscript, pdf, associated references.