## Gaussian Process Regression with Heteroscedastic or Non-Gaussian Residuals

Chunyi Wang,
Dept. of Statistical Sciences, University of Toronto

Radford M. Neal,
Dept. of Statistical Sciences and Dept. of Computer Science,
University of Toronto

Gaussian Process (GP) regression models typically assume that residuals are
Gaussian and have the same variance for all observations. However, applications
with input-dependent noise (heteroscedastic residuals) frequently arise in
practice, as do applications in which the residuals do not have a Gaussian
distribution. In this paper, we propose a GP Regression model with a latent
variable that serves as an additional unobserved covariate for the regression.
This model (which we call GPLC) allows for heteroscedasticity since it allows
the function to have a changing partial derivative with respect to this
unobserved covariate. With a suitable covariance function, our GPLC model can
handle (a) Gaussian residuals with input-dependent variance, or (b)
non-Gaussian residuals with input-dependent variance, or (c) Gaussian residuals
with constant variance. We compare our model, using synthetic datasets, with a
model proposed by Goldberg, Williams and Bishop (1998), which we refer to as
GPLV and which handles only case (a), as well as with a standard GP model, which
can handle only case (c). Markov Chain Monte Carlo methods are developed for both
models. Experiments show that when the data is heteroscedastic, both GPLC and
GPLV give better results (smaller mean squared error and negative
log-probability density) than standard GP regression. In addition, when the
residuals are Gaussian, our GPLC model is generally nearly as good as GPLV,
while when the residuals are non-Gaussian, our GPLC model is better than GPLV.
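
The latent-covariate idea can be illustrated with a small simulation. The sketch below is not the authors' implementation: it assumes a standard normal distribution for the latent variable `w`, a squared-exponential covariance over the augmented input `(x, w)`, and hypothetical hyperparameter names (`eta`, `rho_x`, `rho_w`), none of which are specified in the abstract. It draws one function from the GP prior and evaluates it at the observed inputs paired with their unobserved latent values.

```python
import numpy as np

def sq_exp_cov(x, w, eta=1.0, rho_x=1.0, rho_w=1.0, jitter=1e-6):
    """Squared-exponential covariance over augmented inputs (x, w).

    The kernel form and the hyperparameter names are assumptions
    made for illustration only.
    """
    dx = x[:, None] - x[None, :]   # pairwise differences in the observed input
    dw = w[:, None] - w[None, :]   # pairwise differences in the latent input
    K = eta**2 * np.exp(-(dx / rho_x) ** 2 - (dw / rho_w) ** 2)
    # A small diagonal jitter keeps the Cholesky factorisation stable.
    return K + jitter * np.eye(len(x))

def sample_latent_covariate_gp(x, rng, **kernel_args):
    """Draw responses y_i = f(x_i, w_i) with f sampled from the GP prior."""
    w = rng.standard_normal(len(x))          # assumed: w_i ~ N(0, 1), unobserved
    K = sq_exp_cov(x, w, **kernel_args)
    L = np.linalg.cholesky(K)
    y = L @ rng.standard_normal(len(x))      # one joint draw from N(0, K)
    return w, y

rng = np.random.default_rng(0)
x = np.linspace(-2.0, 2.0, 50)
w, y = sample_latent_covariate_gp(x, rng, rho_x=1.0, rho_w=2.0)
```

Because only `x` is observed, the variation in `y` caused by `w` appears as residual noise. Its spread can differ across `x` wherever the sampled function's sensitivity to `w` changes, which is the mechanism the abstract describes for heteroscedasticity.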

26 December 2012, 19 pages: pdf.

Also available at arxiv.org.