Posted in Algorithm, Statitics

Gibbs Sampling: about Parallelization

About BN
Belief Network, or directed acyclic graphical model (DAG).
When BN is huge:
Exact Inference(variable elimination)
Stochastic Inference(MCMC)

Dependence

Samples are dependent, form Markov Chain.
Converge to real distribution, when all p > 0.
Methods to improve convergence:
– Blocking
– Rao-Blackwellised
In each sample, variables are all dependent.
9 variables, binary states.
Parallel Computing the same colour ones. [1]
7187.png
Sequentially:
850970.png
MRF: divided into collections of variables (???)
Problem of Synchronous Gibbs:
Strong positive correlation at the start.
——Sequential Execution shows a strong positive correlation
——Parallel Execution shows a strong negative correlation
Conclusion is that: Adjacent variables cannot be sampled simultaneously.
DADG (Directed Acyclic Dependency Graph)
Chromatic Sampler (色度采样)
For models with weak dependencies.
k-coloring of the graphical model
183839.png
9 variables (means a part form a huge image), each vertex only depends on the peers (left, right, up, down)
So two threads start simultanously: from red and blue.
sequential consistency (顺序一致性)
Synchronous Gibbs Sampler
Fortunately, the parallel computing community has developed methods to directly transform sequential graph algorithms into equivalent parallel graph algorithms using graph colorings. [3]
The algo provided by [1]:
11576.png
436624.png
So in the MRF, all of the vertexes(variables) can be divided into two groups, as marked as two colors.
From this point of view, given k colors (states), we believe that in the same group, simutaiously we can flip a coin at each vertex.
I think it is like a pre-processing, which guarentees all variables within a color are conditionally independent given the variables in the remaining colors and can therefore be sampled independently and in parallel.
Ergodic Markov Chain[4]
Markov Chain discretes in time and states. If we say at time u, the MC is in state i, but when it is in time t+u, the state is in j, and the transition probability and process are not depend on time u, then we say it is an ergodic MC.
p ij represents the prob of i to j, then pij = P { xn+1 = j  |  xn = i } , i,j=0,1,2,…
So we can have a transition matrix:
687195.png
Splash Gibbs Sampler
An asynchronous Gibbs Sampler that adaptively addresses strong dependencies.
Adaptive asynchronous tree construction.
Conditaionally independent: Markov Blanket.
The set of all variables XNi adjacent to variable Xi is called the Markov Blanket of Xi.
When given its MB, a variable is independent with other variables.
186556.png
where X-i refers to the set of all variables excluding the variable Xi.
Reference:

Author:

Keep calm and update blog.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s