Nonlinear Models Using Dirichlet Process Mixtures

Babak Shahbaba, Dept. of Public Health Sciences, University of Toronto
Radford M. Neal, Dept. of Statistics and Dept. of Computer Science, University of Toronto

We introduce a new nonlinear model for classification, in which we model the joint distribution of the response variable, y, and the covariates, x, non-parametrically using Dirichlet process mixtures. We keep the relationship between y and x linear within each component of the mixture. The overall relationship becomes nonlinear if the mixture contains more than one component and the components have different regression coefficients. We use simulated data to compare the performance of this new approach to alternative methods such as multinomial logit (MNL) models, decision trees, and support vector machines. We also evaluate our approach on two classification problems: identifying the folding class of protein sequences and detecting Parkinson's disease. Our model can sometimes improve predictive accuracy. Moreover, by grouping observations into sub-populations (i.e., mixture components), our model can sometimes provide insight into hidden structure in the data.
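To illustrate how components that are each linear can combine into a nonlinear classifier, here is a minimal sketch in Python. It is not the inference procedure from the paper: it assumes a fixed, finite number of components with known parameters, independent normal covariates within each component, and an MNL model for y given x within each component. The function names (`gaussian_logpdf`, `softmax`, `predict_proba`) and all parameter values are hypothetical and chosen only for illustration. Under these assumptions, the predictive probability is a weighted average of the per-component MNL probabilities, with weights that depend on x.

```python
import numpy as np

def gaussian_logpdf(x, mu, sigma):
    # Log density of independent normal covariates; x is (n, p), mu and sigma are (p,).
    return -0.5 * np.sum(((x - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2), axis=-1)

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def predict_proba(x, weights, mus, sigmas, alphas, betas):
    """p(y | x) under a finite mixture whose components are linear (MNL) in x.

    weights: (K,) mixing proportions; mus, sigmas: (K, p);
    alphas: (K, J) intercepts; betas: (K, p, J) regression coefficients.
    """
    x = np.atleast_2d(x)
    K = len(weights)
    # Posterior probability of each component given x alone, shape (n, K).
    log_r = np.stack(
        [np.log(weights[k]) + gaussian_logpdf(x, mus[k], sigmas[k]) for k in range(K)],
        axis=1,
    )
    r = softmax(log_r)
    # Class probabilities within each component, shape (n, K, J).
    comp = np.stack([softmax(alphas[k] + x @ betas[k]) for k in range(K)], axis=1)
    # Mix the per-component class probabilities using the x-dependent weights.
    return np.einsum("nk,nkj->nj", r, comp)

# Hypothetical two-component example with one covariate and two classes.
# The two components have opposite slopes, so class 1 is favoured only for x near 0,
# a pattern that a single MNL model cannot represent.
weights = np.array([0.5, 0.5])
mus = np.array([[-2.0], [2.0]])
sigmas = np.ones((2, 1))
alphas = np.array([[0.0, 6.0], [0.0, 6.0]])
betas = np.array([[[0.0, 3.0]], [[0.0, -3.0]]])

probs = predict_proba(np.array([[-3.0], [0.0], [3.0]]), weights, mus, sigmas, alphas, betas)
```

With a single component, `predict_proba` reduces to an ordinary MNL model. In the Dirichlet process mixture described above, the number of components is not fixed in advance, and the parameters would be sampled from their posterior distribution rather than set by hand as they are here.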

Journal of Machine Learning Research, vol. 10, pp. 1829-1850, August 2009.


Associated references:

This is a substantially revised version of the following technical report:

Shahbaba, B. and Neal, R. M. (2007) "Nonlinear Models Using Dirichlet Process Mixtures", Technical Report No. 0707, Dept. of Statistics, 16 pages.