**This post contains very basic knowledge of few sampling methods. As I am going to implement a java version of Gibbs Sampling, I went through some materials on the internet, and kept a learning journal here.
Will learn more about Gibbs Sampling in few days, and I will focus on how it involves with Graph Processing.**

PS. I found in a time series, RNN (Recurrent Neural Networks) has something in common with Gibbs Sampling.

**Background**

Stochastic Simulation: Monte Carlo Simulation. 20th 40s.

Idea: given a probability distribution p(x), how to generate the samples?

Uniform(0,1) is easier to generate, to some extend, fake random numbers.

**1. Monte Carlo Method**

If we want to get derivative of a f(x):

But when f(x) is complicated and not that easy to get derivative, we use MC Method.

q(x): a prob dist. in a range about x.

f(x) : a function.

get n samples from q(x), when n is big enough, we can use avg to approximately get:

So the main thing is how to sampling when given a prob. dist.

**2.Uniform Dist, Box-Muller**

The fake random numbers of [0,1]…Uniform Dist.

Box-Muller: if U1 and U2 are independent, and U1,U2 ~ Uniform[0,1] :

Then Z0,Z1 ～N(0,1). >> Standard Normal Dist.

**3. Monte Carlo Principle**

MC sampling calculate Exceptions.

x ~ p(x), so what’s E(x)?

Let’s get samples xi from p(x):

**4. Acceptance-Rejection Sampling**

In the real world, sometimes it’s hard to sample in P(x).

Use an approximate way to reject some of the sampling, set a proposal diet q(x), maybe a Gaussian Dist.

The steps are as follows:

set a function q(x), and a constant k, which allow p(x) is below kq(x):

– x dimension: samples from q(x) to get a.

– y dimension: samples from Uniform Dist (0,kq(a) ) to get u.

– If the point is in the gray field: u > p(a), then reject; unless accept

– repeat.

In high dimensions, RS is in face of two problems: it is hard to find a suitable q distribution, and it is not easy to find a k.

**5. Markov Chain**

We have a state matrix:

if the init is: u(x) = (0.5, 0.2, 0.3)

so the next state should be :

u(x)T = (0.18, 0.64, 0.18)

if keep going, then >>> (0.22, 0.41, 0.37)

No matter the starting state, after long way of Markov Chain, all going to the nealy same result.

Python code:

import numpy as np

a = [0.5,0.2,0.3]

b = [(0,1,0),(0,0.1,0.9),(0.6,0.4,0)]

print a,b

c = np.dot(a,b)

for i in range(1,100):

c = np.dot(c,b);

print c

The result is:

(100 times)

You can change the starting of a.

In the end, u(x) will be -> p(x).

马氏链的收敛性质主要由转移矩阵决定, 所以基于马氏链做采样的关键问题是如何构造转移矩阵,使得平稳分布恰好是我们要的分布p(x)

**6. Gibbs Sampling**

Gibbs sampling is a widely-used MCMC method.

In RBM of Deep Learning, Gibbs Sampling is that when the weight attributes and a variable v are certain, through sampling we can get h. Then through sampling of h, we can get another variable v. Iterate v and h, then it’s going to converges to a joint probability. **(It seems like x and y, the 2D example ??? Need to double check!)**

*In Machine Learning, Gibbs Sampling is used in the inference of training: to calculate derivatives, expectations (average).*

MCMC:Monte Carlo Markov Chain

2-D x, y, Gibbs Sampling

For multiple dimensions :

The algorithm is as follows.

An example of 4 D:

(A,B,C,D):

It seems that G.S is used for sampling only.

I am now trying to find out how it work in the inference of Graph Models.

References:

http://www.cs.ubc.ca/~arnaud/andrieu_defreitas_doucet_jordan_intromontecarlomachinelearning.pdf

http://www.tuicool.com/articles/fqEf6f

http://www.cnblogs.com/tornadomeet/p/3511054.html

http://www.cnblogs.com/daniel-D/p/3388724.html