GP-PRED: Make predictions for test cases using Gaussian process model. GP-pred prints guesses at the target values for a set of test cases, as obtained using a Gaussian process, or set of Gaussian processes. If the true targets are known, performance of the guesses can also be evaluated. Inputs can be printed as well. Usage: gp-pred options { log-file range } [ / test-inputs [ test-targets ] ] The final optional arguments give the source of inputs and targets for the cases to look at; they default to the test data specification in the first log file given. The Gaussian processes to use in making guesses are taken from the records with the given ranges of indexes in the given log files. The outputs of all these are combined to give a single guess for each case. The Gaussian process models should all have the same specification and use the same data specifications. An index range can have one of the forms "[low][:[high]][%mod]" or "[low][:[high]]+num", or one of these forms preceded by "@". When "@" is present, "low" and "high" are given in terms of cpu time, otherwise they are iteration numbers. When just "low" is given, only that index is used. If the colon is included, but "high" is not, the range extends to the highest index in the log file. The "mod" form allows iterations to be selected whose numbers are multiples of "mod", with the default being "mod" of one. The "num" form allows the total number of iterations used to be specified; they are distributed as evenly as possible within the specified range. Note that it is possible that the number of Gaussian processes used in the end may not equal this number, if records with some indexes are missing. The 'options' argument consists of one or more of the following letters: i Display the input values for each case t Display the target values for each case r Use the raw form of the target values, before transformation p Display the log probability of the true targets (to base e) m Display the guess based on the mode, and whether it is in error n Display the guess based on the mean, and its squared error d Display the guess based on the median, and its absolute error D Display the guess based on the mean of the median for each iteration. Not really useful for gp-pred. q Display the 10% and 90% quantiles of the predictive distributions for the targets. Note that these distributions include the noise. Q Display the 1% and 99% quantiles of the predictive distributions. l Use stored latent values for predictions with a regression model (if available). Normally, predictions for regression models (except log-sum-exp models) are done as if no latent values were stored, since this is better, but suppressing this behaviour might be of interest for testing purposes. (Stored latent values are essential with other models, for which this option is ignored.) b Suppress headings and averages - just bare numbers for each case. The numbers are printed in exponential format, to high precision. B Bare numbers, but with blank lines whenever first input changes. a Display only average log probabilities and errors, suppressing the results for individual cases (makes sense only in combination with one or more of 'p', 'm', 'n', and 'd', and not with 'i' or 't') E Display the expected error for test cases, based on the predictive probabilities, as well as the actual error (if targets are known). Currently implemented only for "binary" data. W-Z If one or more of W,X,Y,Z are present, only the corresponding terms in a log-sum-exp model are used. This allows one to see the parts of such a model. (But see the notes below.) 0-9 If one or more of these characters are present, only the corresponding linear 0) or non-linear (1-9) parts of the covariance are used when computing covariances involving test cases. This results in predictions for just these components of the function, allowing one to see the components of an additive model. (But see the notes below.) x Add ten to the number of a part specified with 0-9. The effect is cumulative, so that, for instance, options of 1x2x3 specify that only terms 1, 12, and 23 of the covariance will be used. z Pretend that all the Gausian conditional distributions for targets or latent values in test cases have zero variance. Useful with 0-9 and W-Z. (See notes below.) Some of these options are illegal for some data models. The illegal combinations are marked with an 'X' in the following table: binary count class real-valued no-model r X X p X m X X X n d/D X X X q/Q X X X Furthermore, the 'a' option is incompatible with 'b', 'i', 't', or 'q', and the 't', 'p', and 'a' options may be used only if the true targets are given. The errors for individual cases are also displayed only if the true targets are known. The 'n' option for class models displays the mean probabilities for each class, and computes a single figure for squared error that is the sum of the squares of the differences between these probabilities and the indicator variables that are one for the right class and zero for the wrong classes. The 'r' option for count data causes the 'n' option to output the mean prediction for the log of the Poisson mean (and no squared errors are computed, regardless of whether targets are available). Making predictions requires inverting the covariance matrix of the training points for each Gaussian process used, which can take a substantial amount of time if the number of training points is large. This could be avoided if the inverse covariance matrix were saved in the log file for each iteration, but this could take up a very large amount of disk space. When the number of test cases is large, it will take much longer to compute the median or log probability predictions (options "d" or "p") for a regression model than to compute only the mean predictions (option "n"). For binary, count, and class models, all sorts of predictions take about the same amount of time. The median and quantiles are calculated by Monte Carlo, using a sample consisting of 101 points from the predictive distribution for each Gaussian process. A sample of 100 points for each Gaussian process is used to calculate predictive probabilities for binary and class models. If the model has case-by-case noise variances, a single test-case variance is chosen randomly for each Gaussian process (this is a bit sub-optimal, as it would be better to pick a new variance for each of the 101 points used to compute the median, and to integrate the variance away to produce a t-distribution when computing the log predictive probability - but the programming was easier this way). These Monte Carlo estimates are found using a random number stream initialized by setting the seed to one at the start of the program. Accuracy can be increased by repeating the same log-file/range combination several times, effectively increasing the sample size use. When using the 0-9 and W-Z options, note that if the identity of a term or component can change during a run, these options will be meaningful only when predictions are based on a single iteration. Note also that predictions are done for each test case individually, for in terms of the joint distribution of all test cases. Consequently, the components of an additive model or terms in a log-sum-exp model may appear "noisy", because any jitter or ambiguity in the additive decomposition is resolved andomly for each case. This can be avoided by using the 'z' option, but one should interpret the results with care. The 'D' option is implemented, but is of no real use, since models for survival analysis aren't supported, and individual medians for other models are the same as means. It's here only because of net-pred. Each average performance figure is accompanied by +- its standard error (as long as there is more than one test case). If only inputs and targets are to be displayed (no predictions), one may give just a single log file with no range. Otherwise, at least one Gaussian process must be specified. An additional option of -use-inverse or -alt-mean may appear before the "options" argument above. These control the way the matrix operations are done, and are primarily meant for testing purposes. However, users might have a use for -alt-mean, which forces a possibly more accurate computation of the mean prediction (which is, however, slower, if only the mean is being computed). Copyright (c) 1995-2004 by Radford M. Neal