EXAMPLES OF CIRCULARLY-COUPLED MARKOV CHAIN SAMPLING Circular coupling is a method for diagnosing covergence and discarding "burn-in" iterations, described in my technical report on "Circularly-coupled Markov chain sampling". The examples here demonstrate this software's facilities for simple circular coupling, using "xxx-wrap" (see xxx-wrap.doc), and for circular coupling with multiple starting points (possibly done in parallel), using "xxx-circ" (see xxx-circ.doc). Note that although these facilities are implemented in a way that is designed for general use, not all aspects of the software have been adapted to this scheme. Also, the coupling techniques used are still being improved. Presently, circular coupling cannot be used at all with mixture models, and it is probably not yet useful for most neural network models. It can be used for regression models using Gaussian processes (if appropriate Markov chain operations are used), but not for classification models, or other models requiring latent variables. The example below is pretty simple. A more interesting example is in the latest version of the technical report, but isn't included here. There are also some command files demonstrating circular coupling in the ex-bayes directory. The command files for the example here and for two other examples are in the ex-circ directory. Circularly-coupled sampling for a Cauchy model. To start, we can look at a simple example of a Bayesian model for data that is Cauchy distributed. There is just one parameter, u, for this model, which is the location of the Cauchy distribution (the scale is fixed at one). The prior for u will be Normal(0,20^2), and two data points will be assumed to have been observed, with values of 18 and 25. Here is a specification for the resulting posterior distribution: > dist-spec clog "u~Normal(0,20^2)" "Log[1+(u-18)^2] + Log[1+(u-25)^2]" Here, the two data values have been put into the likelihood expression as constants. There will therefore be no data file for this example. The log file is written as "clog" above, but other names will be used for the examples below. One way of using circular coupling is to first simulate a chain in the usual way, and then "wrap the chain around" in order to discard an appropriate "burn-in" period, which is not from the stationary distribution of the chain. For this to work, however, we have to use Markov chain operations that couple appropriately. For one dimensional problems, "random grid" Metropolis updates work well. We can specify such updates (with stepsize of 5) with the following command, issued after the dist-spec command: > dist-spec clog.forw "u~Normal(0,20^2)" "Log[1+(u-18)^2] + Log[1+(u-25)^2]" > mc-spec clog.forw rgrid-met 5 The log file "clog.forw" will be the normal, "forward" version of the chain. We can run the chain for 500 iterations as follows: > dist-mc clog.forw 500 You can now plot the progress of the chain over time, starting from its default initial state of u=0, using the following command: > dist-plt t u clog.forw | plot Clearly, the initial portion of the chain is not representative of the stationary distribution. But it's not clear exactly how many iterations should be discarded from the beginning. With traditional Markov chain sampling methods, we'd just have to make a somewhat ad hoc decision about this. As an alternative, we can wrap the chain around - running it again with log file clog.wrap, using the last state of clog.forw as the start state, with the same random numbers as before. This is done with the following command: > dist-wrap clog.forw clog.wrap The dist-wrap procedure will recognize when (if ever) the new chain coalesces with the old chain (ie, reaches the same state). Once that occurs, there is no need to perform any further computation - the remaining iterations in clog.wrap can simply be copied from clog.forw. For this example, coalescence will probably occur within a few dozen iterations. The wrapped around chain can be plotted as follows: > dist-plt t u clog.wrap | plot Provided certain assumptions about rapid coalescence are satisfied, all the iterations in clog.wrap will come from approximately the right distribution. One can see how the wrapped around chain coalesces with the original chain by plotting both: > dist-plt t u clog.forw clog.wrap | plot For some problems, however, it is possible that the wrapped-around chain will not coalese with the original chain. In this case, the procedure must be repeated with a larger number of iterations. And even if coalescence does occur, it is possible that it takes too long, on average - violating the assumptions needed for the answer to be guaranteed to be (approximately) correct. These assumptions can be tested by running chains from many starting points, which this software draws from the prior distribution. This is done as follows: > dist-spec clog.circ "u~Normal(0,20^2)" "Log[1+(u-18)^2] + Log[1+(u-25)^2]" > mc-spec clog.circ rgrid-met 5 > dist-circ clog.circ 10 50 5 The dist-spec and mc-spec commands are as before. The dist-circ command runs a circularly-coupled simulation from 10 starting points (the first numerical argument), equally spaced along a 500-iteration span, so that the sections between starting points consist of 50 iterations (the second numerical argument). Each section is simulated and re-simulated up to 5 times (the third numerical argument), with the starting point the first time (called stage 0) being drawn from the prior, and the starting points for later stages being set to the final point from the last simulation of the previous section (with wrap-around from the last section to the first). A consistent circular chain is obtained if this process reaches a state where every section starts at the final state of the previous section, and ends at the initial state of the next section. If this happens after a number of stages that's no more than half the total number, there is fairly good reason to think that the necessary assumptions are met, and the states of the circular chain (which will be in clog.circ) all come from a good approximation to the correct posterior distribution. If the -p option is given to dist-circ, the 10 simulations in each stage will be done in parallel, in so far as this is possible with the number of processor cores your machine has.