NOTES ON THE VERSION OF 2007-03-31

Major additions in this version are support for checking correctness
of MCMC methods (see item (1) below), a new program for computing
weighted and batch means with standard errors (see item (18) below), a
new syntax for network specifications (see item (5) below), support
for "log-sum-exp" Gaussian process models (see item (10) below), a new
"drop" option for Gaussian process models (see item (11) below), and
support for a "last binary" target and for missing targets in
Dirichlet diffusion tree models (items (6) and (7) below).  The use of
these last facilities for semi-supervised learning is described in a
new example (see item (8)).
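
As a rough illustration of the kind of check referred to in item (1)
below, here is a minimal Python sketch of a joint distribution test on
a toy model (theta ~ N(0,1), y given theta ~ N(theta,1)).  It is not
the code behind xxx-mc-test.  The model, the function names, and the
use of thinning in place of a proper standard error for correlated
draws are all assumptions made for the example.  The idea is that
draws of theta from the prior should match draws from a chain that
alternates generating data given theta with a Markov update of theta
given the data.

```python
import math
import random

def marginal_conditional(n, rng):
    """Draw theta directly from the prior N(0, 1), n times."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def successive_conditional(n, rng, post_var=0.5, thin=10):
    """Alternate y ~ p(y | theta) with an update of theta given y.

    The update is an exact draw from N(y/2, post_var).  The correct
    posterior variance for this toy model is 0.5; passing another value
    simulates a coding error.  Every thin-th theta is kept so that the
    retained draws are close to independent.
    """
    theta = rng.gauss(0.0, 1.0)            # start from the prior
    kept = []
    for i in range(n * thin):
        y = rng.gauss(theta, 1.0)          # data given parameter
        theta = rng.gauss(y / 2.0, math.sqrt(post_var))  # parameter given data
        if (i + 1) % thin == 0:
            kept.append(theta)
    return kept

def z_statistic(a, b, g):
    """Two-sample z statistic for the mean of g under the two simulators."""
    ga, gb = [g(x) for x in a], [g(x) for x in b]
    ma, mb = sum(ga) / len(ga), sum(gb) / len(gb)
    va = sum((x - ma) ** 2 for x in ga) / (len(ga) - 1)
    vb = sum((x - mb) ** 2 for x in gb) / (len(gb) - 1)
    return (ma - mb) / math.sqrt(va / len(ga) + vb / len(gb))

def geweke_z(post_var, n=2000, seed=1):
    """Compare E[theta^2] from the prior and from the alternating chain."""
    rng = random.Random(seed)
    prior_draws = marginal_conditional(n, rng)
    chain_draws = successive_conditional(n, rng, post_var)
    return z_statistic(prior_draws, chain_draws, lambda t: t * t)
```

With the correct posterior variance, geweke_z(0.5) should look like a
draw from a standard normal.  With the deliberately wrong value 1.0,
the stationary variance of theta becomes 5/3 rather than 1, and the
statistic should be far from zero.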

A new module for inferring the locations of sources of atmospheric
contaminants has been added.

A new module for doing molecular modeling with the Lennard-Jones
potential has also been added.  This will be of no interest to people
concerned only with Bayesian modeling.

Changes have been made to get the software to work properly under the
Cygwin Unix emulation environment for MS Windows.  In particular, the
nrand48 routine from the C library doesn't work in the version of
Cygwin I am using, so I incorporated this routine (from the GNU C
library) into the source for the rand.c module.
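
For reference, the recurrence that POSIX documents for the drand48
family, of which nrand48 is a member, can be sketched in Python as
below.  This is only an illustration of the generator's arithmetic,
not the code that was copied into rand.c.  The function name
nrand48_step and the use of a single integer for the 48-bit state are
choices made for the example.

```python
A = 0x5DEECE66D          # multiplier used by the drand48 family
C = 0xB                  # additive constant
MASK48 = (1 << 48) - 1   # arithmetic is modulo 2**48

def nrand48_step(state):
    """Advance a 48-bit state once.

    Returns (new_state, value), where value is a non-negative integer
    below 2**31 taken from the high-order bits of the new state.
    """
    state = (A * state + C) & MASK48
    return state, state >> 17   # 48 - 31 = 17 low-order bits discarded

# Example: generate a few values from an arbitrary starting state.
state = 12345
values = []
for _ in range(5):
    state, v = nrand48_step(state)
    values.append(v)
```

Starting from a state of zero, the first step gives a new state of 11
and a returned value of 0, since only the low-order bits are set.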

Note that change (9) below is incompatible with the previous version
of count models.  To fix old command files using count models, you can
add an int-target option of "-" to the data-spec commands.

Log files for this version are not compatible with previous versions.


Changes in this version:

0)  Change to 'rand' module for use under Cygwin, as described above.

1)  A new facility for checking the correctness of Markov chain sampling 
    methods has been added, based on the joint distribution test technique
    of John Geweke.  It is currently implemented only for the 'dist' module
    and the 'gp' module.  See xxx-mc-test.doc.  Several of the extensions
    below are motivated primarily by their use in supporting this feature.

2)  A program has been added for generating from the distribution of data
    given parameter values, for Bayesian models specified with the 'dist'
    module.  See dist-dgen.doc for details.

3)  The gp-gen program has been extended to allow (optionally) the generation
    of latent values and/or case-by-case noise variances from the prior, as
    well as hyperparameter values.  See gp-gen.doc for details.

4)  A program has been added for generating from the distribution of targets
    given latent values and noise variances for Gaussian process models.
    See gp-dgen.doc for details.

5)  A new, more user-friendly syntax has been introduced for specifying
    network architectures.  See net-spec.doc for details.  The examples
    in the documentation have been changed to use the new syntax, as have
    the command files that go with these examples.  The old syntax is still 
    accepted.  The old command files in ex-netgp that used the old syntax
    have been retained with the suffix ".old-style".

6)  A model in which all targets are real valued except for the last, which
    is binary, can now be specified.  See model-spec.doc for details.  This 
    is implemented only for Dirichlet diffusion tree models, for the purpose
    of allowing experimentation with classification using joint probability
    models.

7)  Dirichlet diffusion tree models and neural network models can now handle 
    missing target values (represented by "?" in the data file).

8)  An example (Ex-jclass.doc) of how a joint probability model can be used 
    for classification has been added.  The commands used can be found in
    ex-mixdft/rbcmds.dft.

9)  A data specification can now restrict target values to being any integer
    (positive, zero, or negative) or to being a non-negative integer.  See
    data-spec.doc for details.  Models for count data now require that the
    data be specified to be an integer.

10) Gaussian process models in which the model likelihood is based on the 
    log of the sum of the exponentials of several outputs (rather than
    on just a single output) are now supported.  See gp.doc for details.

11) Gaussian process covariance functions can now have terms that apply
    only to some outputs, using the new "drop" option.  See gp-spec.doc
    for details.

12) The 'mix' module has been changed so that records written to the log
    file contain space for only the actual number of targets, not the 
    maximum.  Accordingly, there is no longer any need to keep Max_targets
    small in order to save disk space.

13) The command files in ex-netgp, corresponding to the examples in the 
    documentation, have been changed to include a prediction command at 
    the end, after the MCMC run.  The command used is similar to the 
    prediction commands mentioned in the documentation.

14) The maximum number of exponential terms in a covariance function
    has been increased from 10 to 15. 

15) A new "set-value" Markov chain operation has been added, which is
    mostly useful for testing and ad hoc explorations of convergence.
    See mc-spec.doc for details.

16) The sample-values operation for Gaussian process regression models 
    has been reimplemented to speed it up.  See gp-mc.doc for details.

17) The implementation of gp-eval has been changed to speed it up.  It
    also now has an option to force it to ignore latent values.  See 
    gp-eval.doc for details.

18) A new program for computing weighted and batch means has been 
    written.  See mean.doc for details, and Ex-dist-n.doc for an
    example of its use.

19) A new module for molecular modeling has been added.  See mol.doc
    and other 'mol' documentation files for details.  There are no 
    tutorial examples as yet.

20) The "#" quantity, which just evaluates to an array of indexes, can
    now take a modifier, n, which causes the values to be the indexes
    modulo n.  Also, the "t" quantity, which just evaluates to the index 
    of the iteration, can now take a modifier, n, which causes the value
    to be one plus the value of (index minus one) modulo n.  This usage
    is, however, masked by another definition for all but the 'bvg' and
    'mol' modules.  See quantities.doc for details.

21) The computation of network functions and derivatives in the 'net'
    module has been changed so that it goes faster when many of the
    network inputs are zero.

22) A "frac" option is now possible for trajectory specifications, but
    is implemented only for the 'gp' module.  See mc-spec.doc for details.

23) A file containing details of how gradient approximations are done 
    may now be specified with a leapfrog trajectory specification.  See
    mc-spec.doc for details. 

24) The maximum number of energy gradient approximations is now 1000.  

25) Metropolis updates now bypass components for which the stepsize is 
    zero.  (Saves computation, and avoids division by zero for random
    grid Metropolis updates.)

26) An application can now define ranges for coordinates, represented
    by upper-case letters, for use in operations such as met-1 and
    slice-1.  This is currently done only by the 'src' module.

27) The documentation in mc-spec.doc was reorganized a bit, to avoid
    duplication.
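
To make concrete what a batch means estimate with a standard error
(item (18) above) involves, here is a minimal Python sketch.  It is
not the new program itself and does not follow its options.  The
function name, the handling of a trailing incomplete batch (dropped
here), and the omission of weights are assumptions made to keep the
example short.

```python
import math

def batch_means(values, batch_size):
    """Estimate a mean and its standard error by the batch means method.

    The sequence is cut into consecutive batches of batch_size values,
    and any incomplete batch at the end is dropped.  The standard error
    is computed from the spread of the batch means, which allows for
    correlation between nearby values provided the batches are long
    compared with the correlation time.
    """
    n_batches = len(values) // batch_size
    if n_batches < 2:
        raise ValueError("need at least two complete batches")
    means = []
    for b in range(n_batches):
        batch = values[b * batch_size:(b + 1) * batch_size]
        means.append(sum(batch) / batch_size)
    overall = sum(means) / n_batches
    var = sum((m - overall) ** 2 for m in means) / (n_batches - 1)
    return overall, math.sqrt(var / n_batches)

# Example: six values in batches of two give batch means 1.5, 3.5, 5.5.
estimate, std_err = batch_means([1, 2, 3, 4, 5, 6], 2)
```

In the example, the estimate is 3.5 and the standard error is the
square root of 4/3, since the sample variance of the three batch means
is 4.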


Bug fixes.

1) There are a few minor fixes regarding error checking and some minor 
   documentation fixes. 

2) The documentation for dist-initial said it appended a record with
   index one greater than the last.  It has been changed to say it
   appends a record with index 0, which is what it does.


Known bugs and other deficiencies.

1) The facility for plotting quantities using "plot" operations in xxx-mc
   doesn't always work for the first run of xxx-mc (before any
   iterations exist in the log file).  A work-around is to do a run of
   xxx-mc to produce just one iteration before attempting a run of
   xxx-mc that does any "plot" operations.

2) The CPU time features (e.g., the "k" quantity) will not work correctly
   if a single iteration takes more than about 71 minutes.

3) The latent value update operations for Gaussian processes may recompute 
   the inverse covariance matrix even when an up-to-date version was 
   computed for the previous Monte Carlo operation.

4) Covariance matrices are stored in full, even though they are symmetric,
   which sometimes costs a factor of two in memory usage.

5) Giving net-pred several log files that have different network architectures
   doesn't work, but an error message is not always produced (the results may
   just be nonsense).

6) Some Markov chain updates for Dirichlet diffusion tree models in which 
   there is no data model (i.e., no noise) are not implemented when some of 
   the data is missing.  An error message is produced in such cases.
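
The figure of about 71 minutes in item (2) above is what one would get
if CPU time were held in an unsigned 32-bit counter of microseconds.
These notes do not say that this is the cause, so the tick rate and
counter width below are assumptions, and the Python fragment only
checks the arithmetic.

```python
WRAP_TICKS = 2 ** 32           # distinct values of an unsigned 32-bit counter
TICKS_PER_SECOND = 1_000_000   # assumed: one tick per microsecond

wrap_seconds = WRAP_TICKS / TICKS_PER_SECOND   # about 4295 seconds
wrap_minutes = wrap_seconds / 60               # between 71 and 72 minutes
```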