NOTES ON THE VERSION OF 2007-03-31

Major additions in this version are support for checking correctness of MCMC methods (see item (1) below), a new program for computing weighted and batch means with standard errors (see item (18) below), a new syntax for network specifications (see item (5) below), support for "log-sum-exp" Gaussian process models (see item (10) below), a new "drop" option for Gaussian process models (see item (11) below), and support for a "last binary" target and for missing targets in Dirichlet diffusion tree models (items (6) and (7) below). The use of these last facilities for semi-supervised learning is described in a new example (see item (8)).

A new module for inferring the locations of sources of atmospheric contaminants has been added.

A new module for doing molecular modeling with the Lennard-Jones potential has also been added. This will be of no interest to people interested only in Bayesian modeling.

Changes have been made to get the software to work properly under the Cygwin Unix emulation environment for MS Windows. In particular, the nrand48 routine from the C library doesn't work in the version of Cygwin I am using, so I incorporated this routine (from the GNU C library) into the source for the rand.c module.

Note that change (9) below is incompatible with the previous version of count models. To fix old command files using count models, you can add an int-target option of "-" to the data-spec commands.

Log files for this version are not compatible with previous versions.

Changes in this version:

0) Change to 'rand' module for use under Cygwin, as described above.

1) A new facility for checking the correctness of Markov chain sampling methods has been added, based on the joint distribution test technique of John Geweke. It is currently implemented only for the 'dist' module and the 'gp' module. See xxx-mc-test.doc. Several of the extensions below are motivated primarily by their use in supporting this feature.

2) A program has been added for generating from the distribution of data given parameter values, for Bayesian models specified with the 'dist' module. See dist-dgen.doc for details.

3) The gp-gen program has been extended to allow (optionally) the generation of latent values and/or case-by-case noise variances from the prior, as well as hyperparameter values. See gp-gen.doc for details.

4) A program has been added for generating from the distribution of targets given latent values and noise variances for Gaussian process models. See gp-dgen.doc for details.

5) A new, more user-friendly syntax has been introduced for specifying network architectures. See net-spec.doc for details. The examples in the documentation have been changed to use the new syntax, as have the command files that go with these examples. The old syntax is still accepted. The old command files in ex-netgp that used the old syntax have been retained with the suffix ".old-style".

6) A model in which all targets are real valued except for the last, which is binary, can now be specified. See model-spec.doc for details. This is implemented only for Dirichlet diffusion tree models, for the purpose of allowing experimentation with classification using joint probability models.

7) Dirichlet diffusion tree models and neural network models can now handle missing target values (represented by "?" in the data file).

8) An example (Ex-jclass.doc) of how a joint probability model can be used for classification has been added. The commands used can be found in ex-mixdft/rbcmds.dft.

9) A data specification can now restrict target values to being any integer (positive, zero, or negative) or to being a non-negative integer. See data-spec.doc for details. Models for count data now require that the data be specified to be an integer.

10) Gaussian process models in which the model likelihood is based on the log of the sum of the exponentials of several outputs (rather than on just a single output) are now supported. See gp.doc for details.

11) Gaussian process covariance functions can now have terms that apply only to some outputs, using the new "drop" option. See gp-spec.doc for details.

12) The 'mix' module has been changed so that records written to the log file contain space for only the actual number of targets, not the maximum. Accordingly, there is no longer any need to keep Max_targets small in order to save disk space.

13) The command files in ex-netgp, corresponding to the examples in the documentation, have been changed to include a prediction command at the end, after the MCMC run. The command used is similar to the prediction commands mentioned in the documentation.

14) The maximum number of exponential terms in a covariance function has been increased from 10 to 15.

15) A new "set-value" Markov chain operation has been added, which is mostly useful for testing and ad hoc explorations of convergence. See mc-spec.doc for details.

16) The sample-values operation for Gaussian process regression models has been reimplemented to speed it up. See gp-mc.doc for details.

17) The implementation of gp-eval has been changed to speed it up. It also now has an option to force it to ignore latent values. See gp-eval.doc for details.

18) A new program for computing weighted and batch means has been written. See mean.doc for details, and Ex-dist-n.doc for an example of its use.

19) A new module for molecular modeling has been added. See mol.doc and other 'mol' documentation files for details. There are no tutorial examples as yet.

20) The "#" quantity, which just evaluates to an array of indexes, can now take a modifier, n, which causes the values to be the indexes modulo n.
Also, the "t" quantity, which just evaluates to the index of the iteration, can now take a modifier, n, which causes the values to be one plus the indexes minus one modulo n, except that this usage is masked by another definition for all but the 'bvg' and 'mol' modules. See quantities.doc for details.

21) The computation of network functions and derivatives in the 'net' module has been changed so that it goes faster when many of the network inputs are zero.

22) A "frac" option is now possible for trajectory specifications, but is implemented only for the 'gp' module. See mc-spec.doc for details.

23) A file containing details of how gradient approximations are done may now be specified with a leapfrog trajectory specification. See mc-spec.doc for details.

24) The maximum number of energy gradient approximations is now 1000.

25) Metropolis updates now bypass components for which the stepsize is zero. (Saves computation, and avoids division by zero for random grid Metropolis updates.)

26) An application can now define ranges for coordinates, represented by upper-case letters, for use in operations such as met-1 and slice-1. This is currently done only by the 'src' module.

27) The documentation in mc-spec.doc was reorganized a bit, to avoid duplication.

Bug fixes.

1) There are a few minor fixes regarding error checking and some minor documentation fixes.

2) The documentation for dist-initial said it appended a record with index one greater than the last. It has been changed to say it appends a record with index 0, which is what it does.

Known bugs and other deficiencies.

1) The facility for plotting quantities using "plot" operations in xxx-mc doesn't always work for the first run of xxx-mc (before any iterations exist in the log file). A work-around is to do a run of xxx-mc to produce just one iteration before attempting a run of xxx-mc that does any "plot" operations.

2) The CPU time features (eg, the "k" quantity) will not work correctly if a single iteration takes more than about 71 minutes.

3) The latent value update operations for Gaussian processes may recompute the inverse covariance matrix even when an up-to-date version was computed for the previous Monte Carlo operation.

4) Covariance matrices are stored in full, even though they are symmetric, which sometimes costs a factor of two in memory usage.

5) Giving net-pred several log files that have different network architectures doesn't work, but an error message is not always produced (the results may just be nonsense).

6) Some Markov chain updates for Dirichlet diffusion tree models in which there is no data model (ie, no noise) are not implemented when some of the data is missing. An error message is produced in such cases.
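
A note on the 71-minute limit in item (2) of the known deficiencies: the figure is consistent with a CPU time counter that ticks once per microsecond and is held in 32 bits, since 2^32 microseconds is about 71.6 minutes. This explanation is an assumption, not something stated in these notes, and the sketch below is not taken from the software; all names in it are made up for illustration. Under that assumption, it shows why differencing successive counter readings gives the right time for an iteration shorter than the wrap period and the wrong time for a longer one.

```python
# Illustrative sketch only. ASSUMPTION: the CPU time counter ticks once per
# microsecond and is stored in an unsigned 32-bit integer. Neither fact is
# stated in these notes; all names here are hypothetical.

TICKS_PER_SECOND = 1_000_000   # assumed tick rate
COUNTER_MODULUS = 2 ** 32      # assumed counter width (32 bits)


def wrap_period_minutes():
    """Minutes of CPU time after which the assumed counter wraps around."""
    return COUNTER_MODULUS / TICKS_PER_SECOND / 60


def read_counter(true_seconds):
    """Counter reading after the given whole number of seconds of CPU time."""
    return (true_seconds * TICKS_PER_SECOND) % COUNTER_MODULUS


def measured_seconds(start_seconds, iteration_seconds):
    """CPU time for one iteration as seen by differencing two readings."""
    before = read_counter(start_seconds)
    after = read_counter(start_seconds + iteration_seconds)
    # Modular subtraction recovers the true difference only if fewer than
    # COUNTER_MODULUS ticks elapsed between the two readings.
    return ((after - before) % COUNTER_MODULUS) / TICKS_PER_SECOND


if __name__ == "__main__":
    print("wrap period (minutes):", wrap_period_minutes())
    for minutes in (10, 60, 80):
        seen = measured_seconds(start_seconds=4000,
                                iteration_seconds=minutes * 60)
        print(minutes, "minute iteration measured as", seen / 60, "minutes")
```

With these assumed values, iterations of 10 and 60 minutes are measured correctly, while an 80-minute iteration is recorded as about 8.4 minutes, since one full wrap of the counter goes unnoticed.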