User's Manual

BLOG runs on any operating system that supports Java. Java 1.6 or higher is required.

This manual assumes you have already downloaded the latest version of BLOG and correctly unzipped or installed it. If you have not, please refer to this.

To run BLOG, use the blog command on Linux / Mac, or blog.bat on Windows. If your model is dynamic (i.e. uses Timestep), then use dblog on Linux / Mac, or dblog.bat on Windows. This manual assumes a Linux environment. If you're on Windows, just replace blog with blog.bat.

Basic Usage

The BLOG package contains a library of examples. One of them is the burglary-earthquake network, first described in "Artificial Intelligence: A Modern Approach", 2nd ed., p. 494. The model file is example/burglary.blog, reproduced below.

random Boolean Burglary ~ BooleanDistrib(0.001);

random Boolean Earthquake ~ BooleanDistrib(0.002);

random Boolean Alarm ~
  if Burglary then
    if Earthquake then BooleanDistrib(0.95)
    else  BooleanDistrib(0.94)
  else
    if Earthquake then BooleanDistrib(0.29)
    else BooleanDistrib(0.001);

random Boolean JohnCalls ~
  if Alarm then BooleanDistrib(0.9)
  else BooleanDistrib(0.05);

random Boolean MaryCalls ~
  if Alarm then BooleanDistrib(0.7)
  else BooleanDistrib(0.01);

/* Evidence for the burglary model saying that both 
 * John and Mary called.  Given this evidence, the posterior probability 
 * of Burglary is 0.284 (see p. 505 of "AI: A Modern Approach", 2nd ed.).
 */

obs JohnCalls = true;
obs MaryCalls = true;

/* Query for the burglary model asking whether Burglary 
 * is true.
 */

query Burglary;

The model contains one query, which asks whether a burglary occurred. Running the model yields the posterior probability of Burglary given the evidence.
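Because this network is tiny, the posterior can also be computed exactly, which gives a reference value to compare against sampled output. The following Python sketch is independent of BLOG and is not part of the package. It copies the probabilities from the model above and sums the joint distribution over all assignments; the function names are illustrative only.

```python
from itertools import product

# Conditional probabilities copied from example/burglary.blog.
P_BURGLARY = 0.001
P_EARTHQUAKE = 0.002
P_ALARM = {  # P(Alarm = true | Burglary, Earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}
P_JOHN = {True: 0.9, False: 0.05}  # P(JohnCalls = true | Alarm)
P_MARY = {True: 0.7, False: 0.01}  # P(MaryCalls = true | Alarm)


def bernoulli(p_true, value):
    """Probability that a Boolean variable with P(true) = p_true equals value."""
    return p_true if value else 1.0 - p_true


def joint_with_evidence(burglary, earthquake, alarm):
    """P(Burglary, Earthquake, Alarm, JohnCalls = true, MaryCalls = true)."""
    return (
        bernoulli(P_BURGLARY, burglary)
        * bernoulli(P_EARTHQUAKE, earthquake)
        * bernoulli(P_ALARM[(burglary, earthquake)], alarm)
        * P_JOHN[alarm]
        * P_MARY[alarm]
    )


def posterior_burglary():
    """Return (P(Burglary = true | evidence), P(evidence)) by enumeration."""
    mass = {True: 0.0, False: 0.0}
    for b, e, a in product([True, False], repeat=3):
        mass[b] += joint_with_evidence(b, e, a)
    evidence = mass[True] + mass[False]
    return mass[True] / evidence, evidence


if __name__ == "__main__":
    p, evidence = posterior_burglary()
    print(f"P(Burglary | JohnCalls, MaryCalls) = {p:.3f}")
    print(f"P(JohnCalls, MaryCalls) = {evidence:.6f}")
```

The posterior comes out at about 0.284, the value quoted in the model's comment. Sampling-based inference should approach this number as the number of samples grows.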

Use the following command to run the model.

blog example/burglary.blog

If blog is not installed on your path, you can run it from the directory where you unzipped the universal package:

bin/blog example/burglary.blog

By default, BLOG uses the likelihood-weighting algorithm to infer the posterior probability. It draws 10,000 samples and outputs a probability. The following is a typical output.

Running BLOG
Using fixed random seed for repeatability.
............................................
Constructing inference engine of class blog.engine.SamplingEngine
Constructing sampler of class blog.sample.LWSampler
Evidence: [JohnCalls = true, MaryCalls = true]
Query: [Burglary]
Running for 10000 samples...
Query Reporting interval is 10000
Samples done: 1000.    Time elapsed: 0.437 s.
Samples done: 2000.    Time elapsed: 0.625 s.
Samples done: 3000.    Time elapsed: 0.707 s.
Samples done: 4000.    Time elapsed: 0.775 s.
Samples done: 5000.    Time elapsed: 0.825 s.
Samples done: 6000.    Time elapsed: 0.887 s.
Samples done: 7000.    Time elapsed: 0.957 s.
Samples done: 8000.    Time elapsed: 0.997 s.
Samples done: 9000.    Time elapsed: 1.024 s.
Samples done: 10000.    Time elapsed: 1.05 s.
========  LW Trial Stats =========
Log of average likelihood weight (this trial): -6.307847922891953
Average likelihood weight (this trial): 0.0018219499999999767
Fraction of consistent worlds (this trial): 1.0
Fraction of consistent worlds (running avg, all trials): 1.0
======== Query Results =========
Number of samples: 10000
Distribution of values for Burglary
  false  0.7233733088174801
  true  0.2766266911825274
======== Done ========
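To make the numbers in this output easier to interpret, here is a minimal Python sketch of likelihood weighting on the same model. It is an illustration of the general technique under simple assumptions, not BLOG's LWSampler: non-evidence variables are sampled from their priors, and each sample is weighted by the probability of the evidence (JohnCalls = true, MaryCalls = true) given the sampled Alarm value. The function names and the seed handling are illustrative only.

```python
import random


def sample_world(rng):
    """Sample the non-evidence variables in order; return (burglary, alarm)."""
    burglary = rng.random() < 0.001
    earthquake = rng.random() < 0.002
    if burglary:
        p_alarm = 0.95 if earthquake else 0.94
    else:
        p_alarm = 0.29 if earthquake else 0.001
    alarm = rng.random() < p_alarm
    return burglary, alarm


def evidence_weight(alarm):
    """P(JohnCalls = true | Alarm) * P(MaryCalls = true | Alarm)."""
    p_john = 0.9 if alarm else 0.05
    p_mary = 0.7 if alarm else 0.01
    return p_john * p_mary


def likelihood_weighting(num_samples, seed=0):
    """Return (estimate of P(Burglary | evidence), average sample weight)."""
    rng = random.Random(seed)
    weight_true = 0.0
    weight_total = 0.0
    for _ in range(num_samples):
        burglary, alarm = sample_world(rng)
        w = evidence_weight(alarm)
        weight_total += w
        if burglary:
            weight_true += w
    return weight_true / weight_total, weight_total / num_samples


if __name__ == "__main__":
    estimate, avg_weight = likelihood_weighting(200000)
    print("P(Burglary | evidence) ~", estimate)
    print("average likelihood weight ~", avg_weight)
```

In this sketch the average weight is an estimate of the probability of the evidence, which is the same kind of quantity reported as "Average likelihood weight" above. Because burglaries are rare under the prior, only a handful of samples carry most of the weight, so estimates from 10,000 samples can differ noticeably from 0.284.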

It is possible to request 1 million samples by issuing the following command.

blog -n 1000000 example/burglary.blog

Alternative algorithms are available. To use the Metropolis-Hastings algorithm (as described in Milch et al., 2006):

blog -s blog.sample.MHSampler example/burglary.blog

Command-line options

The general form of the blog command is

blog [options] <blog file1> [<blog file2> <blog file3> ...]

The [options] are optional, and their order does not matter. If no option is provided, BLOG uses LWSampler (the parental likelihood-weighting algorithm) with 10,000 samples.

The following options are provided. For every option, there is a short form and a long form. Either is acceptable.

blog -r example/burglary.blog
blog -e blog.engine.ParticleFilter example/hmm.dblog
blog -n 1000000 example/burglary.blog
CLASSPATH=userdir blog example/burglary.blog
JAVA_OPTS="-Xmx4096M" blog example/burglary.blog

You may replace 4096 with another integer to request a different amount of memory, in MB.

Checking and validating BLOG syntax

Sometimes one might make a small typo in a BLOG program. bloglint is a tool provided in the package to validate the syntax of a BLOG program. It points out syntax errors and outputs an abstract syntax tree for the portion it can parse.

bloglint <blog file>

For example, the following command will check the syntax of example/burglary.blog.

bloglint example/burglary.blog

Running dynamic models

For dynamic models (models with Timestep), one can use a bootstrap particle filter, an approximate algorithm for inference in dynamic probabilistic models with general distributions. The following command runs a particle filter on a hidden Markov model.

blog -e blog.engine.ParticleFilter example/hmm.dblog

The hidden Markov model describes the generative process of genetic sequences.

type State;
distinct State A, C, G, T;

type Output;
distinct Output ResultA, ResultC, ResultG, ResultT;

random State S(Timestep t) ~
  if t == @0 then 
    Categorical({A -> 0.3, C -> 0.2, G -> 0.1, T -> 0.4})
  else case S(prev(t)) in {
    A -> Categorical({A -> 0.1, C -> 0.3, G -> 0.3, T -> 0.3}),
    C -> Categorical({A -> 0.3, C -> 0.1, G -> 0.3, T -> 0.3}),
    G -> Categorical({A -> 0.3, C -> 0.3, G -> 0.1, T -> 0.3}),
    T -> Categorical({A -> 0.3, C -> 0.3, G -> 0.3, T -> 0.1})
  };

random Output O(Timestep t) ~ 
  case S(t) in {
    A -> Categorical({
      ResultA -> 0.85, ResultC -> 0.05, 
      ResultG -> 0.05, ResultT -> 0.05}),
    C -> Categorical({
      ResultA -> 0.05, ResultC -> 0.85, 
      ResultG -> 0.05, ResultT -> 0.05}),
    G -> Categorical({
      ResultA -> 0.05, ResultC -> 0.05, 
      ResultG -> 0.85, ResultT -> 0.05}),
    T -> Categorical({
      ResultA -> 0.05, ResultC -> 0.05, 
      ResultG -> 0.05, ResultT -> 0.85})
  };

/* Evidence for the Hidden Markov Model.
 */

obs O(@0) = ResultC;
obs O(@1) = ResultA;
obs O(@2) = ResultA;
obs O(@3) = ResultA;
obs O(@4) = ResultG;

/* Queries for the Hidden Markov Model, given the evidence.
 * Note that we can query S(@5) even though our observations only
 * went up to time 4.
 */

query S(@0);
query S(@1);
query S(@2);
query S(@3);
query S(@4);
query S(@5);

Note that when using particle filtering or the Liu-West filter, BLOG answers each query at its query time. For example, query S(@2) is answered after all evidence at Timestep 2 has been processed. The result is the probability of the state at Timestep 2 given all evidence at Timesteps 0, 1, and 2.
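For a model this small, the filtered distributions can be computed exactly with the forward algorithm, which makes the meaning of "answered at the query time" concrete. The Python sketch below is independent of BLOG; it encodes the transition and emission tables of the model above (0.1 to stay in the same state and 0.3 otherwise; 0.85 for the matching output and 0.05 otherwise). The function names are illustrative only.

```python
STATES = ["A", "C", "G", "T"]
PRIOR = {"A": 0.3, "C": 0.2, "G": 0.1, "T": 0.4}  # distribution of S(@0)


def transition(prev, cur):
    """P(S(t) = cur | S(t-1) = prev), as in the model's case expression."""
    return 0.1 if prev == cur else 0.3


def emission(state, output):
    """P(O(t) = output | S(t) = state)."""
    return 0.85 if output == "Result" + state else 0.05


def normalize(dist):
    total = sum(dist.values())
    return {s: p / total for s, p in dist.items()}


def predict(belief):
    """Push a belief over S(t) one step forward to a belief over S(t+1)."""
    return {
        cur: sum(belief[prev] * transition(prev, cur) for prev in STATES)
        for cur in STATES
    }


def forward_filter(observations):
    """Return P(S(t) | O(0..t)) for each observed t, then one prediction.

    The last entry is the belief for the first timestep after the final
    observation, which corresponds to query S(@5) in the model above.
    """
    beliefs = []
    predicted = dict(PRIOR)
    for obs in observations:
        belief = normalize({s: predicted[s] * emission(s, obs) for s in STATES})
        beliefs.append(belief)
        predicted = predict(belief)
    beliefs.append(predicted)
    return beliefs


if __name__ == "__main__":
    evidence = ["ResultC", "ResultA", "ResultA", "ResultA", "ResultG"]
    for t, belief in enumerate(forward_filter(evidence)):
        print(f"S(@{t}):", {s: round(p, 4) for s, p in belief.items()})
```

Each entry conditions only on evidence up to its own timestep. The entry for S(@0), for example, does not take the later observations into account, which is what distinguishes filtering from smoothing.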

To specify the number of particles, use -n. By default, BLOG uses 10,000 particles. The following command runs a particle filter with 100,000 particles.

blog -e blog.engine.ParticleFilter -n 100000 example/hmm.dblog
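To show what the particles are doing, here is a minimal Python sketch of a bootstrap particle filter for the hidden Markov model above. It illustrates the general technique (propagate, weight by the observation, resample) and makes no claim about how blog.engine.ParticleFilter is implemented; in particular, the multinomial resampling at every step and all function names are assumptions made for this example.

```python
import random

STATES = ["A", "C", "G", "T"]
PRIOR = [0.3, 0.2, 0.1, 0.4]  # distribution of S(@0), in STATES order


def sample_initial(rng):
    return rng.choices(STATES, weights=PRIOR)[0]


def sample_next(rng, prev):
    """Sample S(t) given S(t-1): 0.1 to stay, 0.3 for each other state."""
    weights = [0.1 if s == prev else 0.3 for s in STATES]
    return rng.choices(STATES, weights=weights)[0]


def emission(state, output):
    """P(O(t) = output | S(t) = state)."""
    return 0.85 if output == "Result" + state else 0.05


def particle_filter(observations, num_particles=10000, seed=0):
    """Return one estimated distribution of S(t) per observation."""
    rng = random.Random(seed)
    particles = [sample_initial(rng) for _ in range(num_particles)]
    estimates = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Propagate every particle through the transition model.
            particles = [sample_next(rng, p) for p in particles]
        # Weight each particle by the likelihood of the observation.
        weights = [emission(p, obs) for p in particles]
        mass = {s: 0.0 for s in STATES}
        for p, w in zip(particles, weights):
            mass[p] += w
        total = sum(mass.values())
        estimates.append({s: m / total for s, m in mass.items()})
        # Multinomial resampling: draw a new, equally weighted population.
        particles = rng.choices(particles, weights=weights, k=num_particles)
    return estimates


if __name__ == "__main__":
    evidence = ["ResultC", "ResultA", "ResultA", "ResultA", "ResultG"]
    for t, est in enumerate(particle_filter(evidence)):
        print(f"S(@{t}):", {s: round(p, 3) for s, p in est.items()})
```

More particles reduce the sampling error of each estimate at the cost of proportionally more work per timestep, which is the trade-off controlled by -n.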

Tuning the Liu-West filter

If your BLOG model contains static variables (random functions defined on types other than Timestep), you may consider using the Liu-West filter. The current implementation of the Liu-West filter only works on scalar continuous static variables. To switch to the Liu-West filter, use the option -e LiuWestFilter.

The Liu-West filter takes a parameter rho that controls the degree of perturbation. The default is 0.95. It can be set using -P rho=[number], where the number should be in (0, 1]. A value of 1.0 means no perturbation, i.e. plain particle filtering.
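The effect of rho can be illustrated with a small Python sketch of the kernel-shrinkage step from Liu and West's method, applied to equally weighted particles of one scalar static variable. This manual does not state how BLOG maps rho onto that method, so the sketch assumes that rho plays the role of the shrinkage coefficient: each particle is pulled toward the particle mean by a factor rho and then jittered with variance (1 - rho^2) times the particle variance. The function name is hypothetical.

```python
import math
import random


def liu_west_perturb(particles, rho, rng):
    """Shrink scalar particles toward their mean and add Gaussian jitter.

    Assumption for this sketch: rho is the shrinkage coefficient, so the
    jitter variance is (1 - rho**2) * var(particles). With rho = 1.0 the
    particles are returned unchanged.
    """
    n = len(particles)
    mean = sum(particles) / n
    variance = sum((x - mean) ** 2 for x in particles) / n
    spread = math.sqrt((1.0 - rho * rho) * variance)
    return [
        rho * theta + (1.0 - rho) * mean + spread * rng.gauss(0.0, 1.0)
        for theta in particles
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    particles = [rng.gauss(0.5, 0.2) for _ in range(5)]
    print("before:    ", particles)
    print("rho = 1.0: ", liu_west_perturb(particles, 1.0, rng))
    print("rho = 0.95:", liu_west_perturb(particles, 0.95, rng))
```

Under this assumption the shrinkage and the jitter balance, so the perturbed particles keep the same mean and variance in expectation. Values of rho closer to 1 move each particle less, and rho = 1.0 leaves them untouched, matching the description of plain particle filtering.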

The following command runs Liu-West filter on a simple auto-regressive model.

blog -e blog.engine.LiuWestFilter -P rho=0.98 example/ar1.dblog

Please refer to example/ar1.dblog for the full model.