User's Manual
BLOG can run on all operating systems that support Java. The minimal requirement is Java 1.6 or higher.
This manual assumes you have already downloaded the latest version of BLOG and correctly unzipped or installed it. If you have not, please download and install it first.
To run BLOG, use the blog command on Linux / Mac, or blog.bat on Windows.
If your model is dynamic (i.e. uses Timestep), then use dblog on Linux / Mac, or dblog.bat on Windows. This manual assumes a Linux environment. If you're on Windows, just replace blog with blog.bat.
Basic Usage
The BLOG package contains a library of examples.
One example is the burglary-earthquake network,
first described in "Artificial Intelligence: A Modern Approach", 2nd ed., p. 494.
The model file is example/burglary.blog, reproduced below.
random Boolean Burglary ~ BooleanDistrib(0.001);
random Boolean Earthquake ~ BooleanDistrib(0.002);
random Boolean Alarm ~
  if Burglary then
    if Earthquake then BooleanDistrib(0.95)
    else BooleanDistrib(0.94)
  else
    if Earthquake then BooleanDistrib(0.29)
    else BooleanDistrib(0.001);
random Boolean JohnCalls ~
  if Alarm then BooleanDistrib(0.9)
  else BooleanDistrib(0.05);
random Boolean MaryCalls ~
  if Alarm then BooleanDistrib(0.7)
  else BooleanDistrib(0.01);
/* Evidence for the burglary model saying that both
* John and Mary called. Given this evidence, the posterior probability
* of Burglary is 0.284 (see p. 505 of "AI: A Modern Approach", 2nd ed.).
*/
obs JohnCalls = true;
obs MaryCalls = true;
/* Query for the burglary model asking whether Burglary
* is true.
*/
query Burglary;

The model contains a single query, which asks whether a burglary occurred. Running the model yields the posterior probability of Burglary given the evidence.
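As a cross-check on what the model should report, the exact posterior of this small network can be computed by enumerating the hidden variables. The following Python sketch is not part of BLOG; the function names are illustrative, and the probabilities are copied from the model above.

```python
from itertools import product

def bernoulli(p_true, value):
    """Probability that a Boolean variable with P(true) = p_true takes `value`."""
    return p_true if value else 1.0 - p_true

def p_alarm(burglary, earthquake):
    """P(Alarm = true | Burglary, Earthquake), copied from the model."""
    if burglary:
        return 0.95 if earthquake else 0.94
    return 0.29 if earthquake else 0.001

def joint(burglary, earthquake, alarm, john, mary):
    """Joint probability of one complete assignment of the five variables."""
    return (bernoulli(0.001, burglary)
            * bernoulli(0.002, earthquake)
            * bernoulli(p_alarm(burglary, earthquake), alarm)
            * bernoulli(0.9 if alarm else 0.05, john)
            * bernoulli(0.7 if alarm else 0.01, mary))

def posterior_burglary():
    """P(Burglary = true | JohnCalls = true, MaryCalls = true) by enumeration."""
    weight = {True: 0.0, False: 0.0}
    # Sum out Earthquake and Alarm for each value of Burglary.
    for b, e, a in product([True, False], repeat=3):
        weight[b] += joint(b, e, a, True, True)
    return weight[True] / (weight[True] + weight[False])

if __name__ == "__main__":
    print(round(posterior_burglary(), 3))  # 0.284
```

The value agrees with the 0.284 quoted in the model's evidence comment, so it serves as a reference point for the sampled estimates.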
Use the following command to run the model.
blog example/burglary.blog
If blog is not installed, you can run it from the unzipped universal package:
bin/blog example/burglary.blog
By default, BLOG uses the likelihood-weighting algorithm to infer the posterior probability. It draws 10,000 samples and prints the estimated distribution for each query. The following is a typical output.
Running BLOG
Using fixed random seed for repeatability.
............................................
Constructing inference engine of class blog.engine.SamplingEngine
Constructing sampler of class blog.sample.LWSampler
Evidence: [JohnCalls = true, MaryCalls = true]
Query: [Burglary]
Running for 10000 samples...
Query Reporting interval is 10000
Samples done: 1000. Time elapsed: 0.437 s.
Samples done: 2000. Time elapsed: 0.625 s.
Samples done: 3000. Time elapsed: 0.707 s.
Samples done: 4000. Time elapsed: 0.775 s.
Samples done: 5000. Time elapsed: 0.825 s.
Samples done: 6000. Time elapsed: 0.887 s.
Samples done: 7000. Time elapsed: 0.957 s.
Samples done: 8000. Time elapsed: 0.997 s.
Samples done: 9000. Time elapsed: 1.024 s.
Samples done: 10000. Time elapsed: 1.05 s.
======== LW Trial Stats =========
Log of average likelihood weight (this trial): -6.307847922891953
Average likelihood weight (this trial): 0.0018219499999999767
Fraction of consistent worlds (this trial): 1.0
Fraction of consistent worlds (running avg, all trials): 1.0
======== Query Results =========
Number of samples: 10000
Distribution of values for Burglary
false 0.7233733088174801
true 0.2766266911825274
======== Done ========
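The run above can be mimicked in a few lines. The following Python sketch of likelihood weighting for the burglary model is an illustration only, not BLOG's LWSampler: the function name, the seed, and the return values are assumptions. Non-evidence variables are sampled from their priors, and each sample is weighted by the probability of the evidence given that sample.

```python
import random

def lw_burglary(num_samples, seed=0):
    """Estimate P(Burglary | JohnCalls = true, MaryCalls = true).

    Returns (posterior estimate, average likelihood weight).
    """
    rng = random.Random(seed)
    weight_true = 0.0
    weight_total = 0.0
    for _ in range(num_samples):
        # Sample the non-evidence variables in topological order.
        burglary = rng.random() < 0.001
        earthquake = rng.random() < 0.002
        if burglary:
            p_alarm = 0.95 if earthquake else 0.94
        else:
            p_alarm = 0.29 if earthquake else 0.001
        alarm = rng.random() < p_alarm
        # Weight = P(JohnCalls = true | Alarm) * P(MaryCalls = true | Alarm).
        weight = (0.9 if alarm else 0.05) * (0.7 if alarm else 0.01)
        weight_total += weight
        if burglary:
            weight_true += weight
    return weight_true / weight_total, weight_total / num_samples

if __name__ == "__main__":
    estimate, average_weight = lw_burglary(10000)
    print(estimate, average_weight)
```

Because a burglary is rare under the prior, only a handful of the 10,000 samples contribute to the "true" mass, so estimates at this sample size can differ noticeably from the exact 0.284.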
To request 1 million samples, issue the following command.
blog -n 1000000 example/burglary.blog
Alternative algorithms are available. To use the Metropolis-Hastings algorithm (as described in Milch et al. 2006):
blog -s blog.sample.MHSampler example/burglary.blog
Command-line options
The general form of the blog command is
blog [options] <blog file1> [<blog file2> <blog file3> ...]
The [options] are optional, and their order does not matter. If no option is provided, BLOG uses LWSampler (the parental likelihood-weighting algorithm) with 10,000 samples.
The following options are provided. Most options have a short form and a long form; either is acceptable.

Print out help message:
--help

Setting random seed.
-r or --randomize
Initialize the random seed based on the clock time. If this flag is not given, the program uses a fixed random seed so its behavior is reproducible. Default: false.
For example:
blog -r example/burglary.blog
Use inference engine.
-e classname or --engine=classname
Use classname as the inference engine. Default: blog.engine.SamplingEngine. For dynamic models, two additional engines are provided:
Bootstrap particle filter (applicable to general dynamic models)
-e blog.engine.ParticleFilter
Liu-West filter (Liu & West 2001), only applicable to Real static parameters
-e blog.engine.LiuWestFilter
For example, the following command uses the particle filter to run a hidden Markov model.
blog -e blog.engine.ParticleFilter example/hmm.dblog
Run the sampling engine for a given number of samples.
-n [number] or --num_samples [number]
This controls the accuracy of the inference. Default: 10,000.
For example, to run the burglary model with 1 million samples:
blog -n 1000000 example/burglary.blog

Choose a sampling algorithm.
-s [string] or --sampler [string]
BLOG provides three sampling algorithms:
rejection sampling
-s blog.sample.RejectionSampler
(default) likelihood-weighting
-s blog.sample.LWSampler
Metropolis-Hastings algorithm (Milch et al. 2006)
-s blog.sample.MHSampler

Skip the first few samples.
-b num or --burn_in=num
Treat the first num samples as a burn-in period (don't use them to compute query results). Default: 0.
Use a customized proposal for the Metropolis-Hastings sampler.
-p classname or --proposer=classname
It should be used together with the Metropolis-Hastings sampler (-s blog.sample.MHSampler). Default: blog.GenericProposer (samples each var given its parents). The proposer should be implemented in Java and extend blog.sample.AbstractProposer.
Output
-o file or --output=file
Output query results in JSON format to this file. This is a machine-readable output format. For every query, the file contains a list of (value, log_probability) pairs.
Print detailed information during inference.
-v or --verbose
Print information about the world generated at each iteration. Off by default (for performance reasons, consider leaving this option off).
Monitor the progress of inference.
--interval=num
Report query results to stdout every num queries. Default: 1000.
Generate possible worlds.
--generate
Rather than answering queries, just sample possible worlds from the prior distribution defined by the model, and print them out. Default: false.
Note that this option cannot be used on dynamic models or on any model with functions on Integers.
Setting classpath for resolving the names of Distributions.
--package=package
Look in package (e.g., "blog.distrib") when resolving the names of CondProbDistrib and NonRandomFunction classes in the model file. This option can be included several times with different packages; the packages are searched in the order given. The last places searched are the top-level package ("") and finally the default package blog.distrib. Note that you still need to set the Java classpath so that it includes all these packages.
Print debugging information.
--debug
Print model, evidence, and queries for debugging. Default: false.
Setting extra options for inference engine.
-P key=value
Include the entry key=value in the properties table that is passed to the inference engine. This feature can be used to set configuration parameters for various inference engines (and the components they use, such as samplers). See the individual inference classes for documentation. Note: The -P option cannot be used to specify values for properties for which there exist special-purpose options, such as --engine or --num_samples.
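The output option described above stores (value, log_probability) pairs for every query. The exact JSON layout is not specified in this manual, so the following Python sketch only covers the conversion step once the pairs for one query have been loaded. The function name and the sample values are hypothetical, and natural logarithms are assumed.

```python
import math

def to_probabilities(pairs):
    """Convert (value, log_probability) pairs into a normalized dict."""
    # Subtract the largest log-probability before exponentiating
    # so that very small probabilities do not underflow to zero.
    top = max(logp for _, logp in pairs)
    unnormalized = {value: math.exp(logp - top) for value, logp in pairs}
    total = sum(unnormalized.values())
    return {value: weight / total for value, weight in unnormalized.items()}

if __name__ == "__main__":
    # Hypothetical pairs for one Boolean query.
    example = [("true", math.log(0.25)), ("false", math.log(0.75))]
    print(to_probabilities(example))
```

Renormalizing is harmless when the stored values already sum to one, and it keeps the result a proper distribution if they do not.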
Setting extra classpath
The blog command accepts an additional environment variable CLASSPATH to set the classpath of user-provided distributions and library functions. For example,
CLASSPATH=userdir blog example/burglary.blog
Setting extra memory
You may set additional options for Java through JAVA_OPTS="...". For example, to allow 4096 MB of memory,
JAVA_OPTS="-Xmx4096M" blog example/burglary.blog
You may replace 4096 with another integer to request a different amount of memory in MB.
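When BLOG is driven from a script, the options and environment variables described in this section can be assembled programmatically. The following Python sketch builds such an invocation. The helper names build_blog_command and run_blog and their parameters are hypothetical; the sketch assumes the blog launcher is on the PATH and uses the short option forms -e, -s, -n and -P.

```python
import os
import subprocess

def build_blog_command(model_files, num_samples=None, sampler=None,
                       engine=None, properties=None):
    """Return the argument list for a blog invocation."""
    command = ["blog"]
    if engine is not None:
        command += ["-e", engine]
    if sampler is not None:
        command += ["-s", sampler]
    if num_samples is not None:
        command += ["-n", str(num_samples)]
    # Each extra engine property becomes its own -P key=value pair.
    for key, value in (properties or {}).items():
        command += ["-P", f"{key}={value}"]
    return command + list(model_files)

def run_blog(command, java_opts=None, classpath=None):
    """Run the command, passing JAVA_OPTS and CLASSPATH via the environment."""
    env = dict(os.environ)
    if java_opts is not None:
        env["JAVA_OPTS"] = java_opts
    if classpath is not None:
        env["CLASSPATH"] = classpath
    return subprocess.run(command, env=env, check=True)

if __name__ == "__main__":
    print(build_blog_command(["example/burglary.blog"], num_samples=1000000))
    # ['blog', '-n', '1000000', 'example/burglary.blog']
```

Building the argument list separately from running it makes the command easy to log or inspect before anything is executed.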
Checking and validating BLOG syntax
It is easy to make a small typo in a BLOG program. bloglint is a tool provided in the package to validate the syntax of a BLOG program.
It points out syntax errors and outputs an abstract syntax tree for the portion it can understand.
bloglint <blog file>
For example, the following command checks the syntax of example/burglary.blog.
bloglint example/burglary.blog
Running dynamic models
For dynamic models (models with Timestep), one can use the bootstrap particle filter.
The bootstrap particle filter is an approximate algorithm for inference in dynamic probabilistic models with general distributions. The following command runs a particle filter for a hidden Markov model.
blog -e blog.engine.ParticleFilter example/hmm.dblog
The hidden Markov model describes the generative process of genetic sequences.
type State;
distinct State A, C, G, T;
type Output;
distinct Output ResultA, ResultC, ResultG, ResultT;
random State S(Timestep t) ~
  if t == @0 then
    Categorical({A -> 0.3, C -> 0.2, G -> 0.1, T -> 0.4})
  else case S(prev(t)) in {
    A -> Categorical({A -> 0.1, C -> 0.3, G -> 0.3, T -> 0.3}),
    C -> Categorical({A -> 0.3, C -> 0.1, G -> 0.3, T -> 0.3}),
    G -> Categorical({A -> 0.3, C -> 0.3, G -> 0.1, T -> 0.3}),
    T -> Categorical({A -> 0.3, C -> 0.3, G -> 0.3, T -> 0.1})
  };
random Output O(Timestep t) ~
  case S(t) in {
    A -> Categorical({
      ResultA -> 0.85, ResultC -> 0.05,
      ResultG -> 0.05, ResultT -> 0.05}),
    C -> Categorical({
      ResultA -> 0.05, ResultC -> 0.85,
      ResultG -> 0.05, ResultT -> 0.05}),
    G -> Categorical({
      ResultA -> 0.05, ResultC -> 0.05,
      ResultG -> 0.85, ResultT -> 0.05}),
    T -> Categorical({
      ResultA -> 0.05, ResultC -> 0.05,
      ResultG -> 0.05, ResultT -> 0.85})
  };
/* Evidence for the Hidden Markov Model.
*/
obs O(@0) = ResultC;
obs O(@1) = ResultA;
obs O(@2) = ResultA;
obs O(@3) = ResultA;
obs O(@4) = ResultG;
/* Queries for the Hidden Markov Model, given the evidence.
* Note that we can query S(5) even though our observations only
* went up to time 4.
*/
query S(@0);
query S(@1);
query S(@2);
query S(@3);
query S(@4);
query S(@5);

Note that when using the particle filter or the Liu-West filter, BLOG answers each query at its query time.
For example, query S(@2) is answered once all evidence at Timestep 2 has been processed. It gives the probability of the state at Timestep 2 given all evidence at Timesteps 0, 1, and 2.
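To make the query-time semantics concrete, the following Python sketch runs a bootstrap particle filter on the hidden Markov model above and records the distribution of S(t) right after the evidence at Timestep t has been absorbed. It is an illustration under assumptions, not BLOG's ParticleFilter: the function names, the seed, and the multinomial resampling scheme are choices made here, and observations ResultA..ResultT are abbreviated to the letters A..T.

```python
import random
from collections import Counter

STATES = ["A", "C", "G", "T"]
INITIAL = {"A": 0.3, "C": 0.2, "G": 0.1, "T": 0.4}

def transition(prev):
    """P(S(t) | S(t-1) = prev): stay with 0.1, move to each other state with 0.3."""
    return {s: (0.1 if s == prev else 0.3) for s in STATES}

def emission(state, observation):
    """P(O(t) = observation | S(t) = state): 0.85 on a match, 0.05 otherwise."""
    return 0.85 if observation == state else 0.05

def draw(rng, dist):
    """Sample one key from a {value: probability} dict."""
    return rng.choices(list(dist), weights=list(dist.values()))[0]

def particle_filter(observations, num_queries, num_particles=10000, seed=0):
    """Return one filtering distribution per timestep 0..num_queries-1."""
    rng = random.Random(seed)
    particles = [draw(rng, INITIAL) for _ in range(num_particles)]
    results = []
    for t in range(num_queries):
        if t > 0:
            # Propagate every particle through the transition model.
            particles = [draw(rng, transition(p)) for p in particles]
        if t < len(observations):
            # Weight by the evidence at time t, then resample.
            weights = [emission(p, observations[t]) for p in particles]
            particles = rng.choices(particles, weights=weights, k=num_particles)
        # Answer the query for S(t) now, before moving to the next timestep.
        counts = Counter(particles)
        results.append({s: counts[s] / num_particles for s in STATES})
    return results

if __name__ == "__main__":
    # Evidence O(@0)..O(@4); queries S(@0)..S(@5).
    for t, dist in enumerate(particle_filter(["C", "A", "A", "A", "G"], 6)):
        print(t, dist)
```

The last entry corresponds to S(@5): no evidence exists at that timestep, so the particles are only propagated and the answer is a one-step prediction.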
To specify the number of particles, use -n. By default, BLOG uses 10,000 particles. The following command runs a particle filter with 100,000 particles.
blog -e blog.engine.ParticleFilter -n 100000 example/hmm.dblog
Tuning the Liu-West filter
If your BLOG model contains static variables (random functions defined on types other than Timestep), you may consider using the Liu-West filter. The current implementation of the Liu-West filter only works on scalar continuous static variables. To switch to the Liu-West filter, use the option -e blog.engine.LiuWestFilter.
The filter requires a parameter rho for the degree of perturbation. The default is 0.95. It can be set using -P rho=[number]. The number should be in (0, 1]; 1.0 means no perturbation, i.e. plain particle filtering.
The following command runs the Liu-West filter on a simple autoregressive model.
blog -e blog.engine.LiuWestFilter -P rho=0.98 example/ar1.dblog
Please refer to example/ar1.dblog for the full model.
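As a rough picture of what rho controls, the following Python sketch applies a Liu-West style shrink-and-jitter step to a set of scalar parameter particles. It assumes that rho plays the role of the shrinkage coefficient in Liu & West (2001), with kernel variance (1 - rho^2) times the particle variance. How BLOG uses rho internally is not stated in this manual, and the function name is hypothetical.

```python
import random
import statistics

def liu_west_perturb(particles, rho, rng):
    """Shrink particles toward their mean and add Gaussian jitter.

    With rho = 1.0 the shrinkage and the jitter both vanish,
    so the particles come back unchanged.
    """
    mean = statistics.fmean(particles)
    variance = statistics.pvariance(particles, mu=mean)
    jitter_sd = ((1.0 - rho ** 2) * variance) ** 0.5
    return [rng.gauss(rho * x + (1.0 - rho) * mean, jitter_sd)
            for x in particles]

if __name__ == "__main__":
    rng = random.Random(0)
    thetas = [rng.uniform(0.0, 1.0) for _ in range(1000)]
    unchanged = liu_west_perturb(thetas, 1.0, rng)   # no perturbation
    perturbed = liu_west_perturb(thetas, 0.95, rng)  # mild perturbation
    print(statistics.fmean(thetas), statistics.fmean(perturbed))
```

Under this assumption the shrinkage compensates for the added jitter, so the mean and variance of the particle cloud are preserved in expectation while individual static parameters can still move between timesteps.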