Previously, we introduced Autoencoders and Hierarchical Variational Autoencoders (HVAEs). In this post, we will cover the details of Denoising Diffusion Probabilistic Models (DDPM).

#### Diffusion Models

We can treat DDPM as a restricted HVAE: each latent $x_t$ depends only on the previous one, $x_{t-1}$. Moreover, in DDPM the noising step has no learnable parameters; it is a predefined linear Gaussian model. This restriction brings computational convenience: we can obtain any intermediate $x_t$ quickly, in closed form.

As shown in the following image, DDPM has two phases: 1) forward diffusion: noise is added to an input image step by step, so that after $T$ steps the image becomes pure noise; 2) reverse process: we try to regenerate the original input image from the noised version at step $T$.

##### Forward Diffusion:

Each forward step uses a linear Gaussian to add noise to the output of the previous step. So the forward process from time step $0$ to $T$ is:

$$q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1}),$$

and for each step $t$ we have:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\!\left(x_t;\; \sqrt{1-\beta_t}\, x_{t-1},\; \beta_t I\right).$$

Note that $\sqrt{1-\beta_t}\,x_{t-1}$ is the mean and $\beta_t I$ is the variance at step $t$, where each $\beta_t$ ranges from 0 to 1. Let's first look at the variance value $\beta_t$. Because at time step $T$ the sample $x_T$ should be (approximately) pure Gaussian noise, one common choice is to start with a small value and increase it: $\beta_1 < \beta_2 < \dots < \beta_T$. The mean coefficient $\sqrt{1-\beta_t}$ then follows the reversed trend.
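As a concrete sketch, an increasing variance schedule can be built with a simple `linspace`; the endpoint values $10^{-4}$ and $0.02$ with $T = 1000$ follow the choices in the original DDPM paper, but any increasing schedule shows the same trend:

```python
import numpy as np

# A minimal sketch of an increasing (linear) variance schedule.
# The endpoints 1e-4 and 0.02 with T = 1000 follow the original DDPM paper.
T = 1000
betas = np.linspace(1e-4, 0.02, T)

# The variances beta_t increase with t ...
assert np.all(np.diff(betas) > 0)

# ... so the mean coefficients sqrt(1 - beta_t) decrease (the reversed trend).
mean_coeffs = np.sqrt(1.0 - betas)
assert np.all(np.diff(mean_coeffs) < 0)
```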

As I mentioned before, such a definition makes it possible to sample $x_t$ at any arbitrary forward step $t$ directly, given $x_0$. Because the sum of independent Gaussians is still a Gaussian:

$$q(x_t \mid x_0) = \mathcal{N}\!\left(x_t;\; \sqrt{\bar\alpha_t}\, x_0,\; (1-\bar\alpha_t) I\right),$$

where $\alpha_t = 1-\beta_t$, and $\bar\alpha_t = \prod_{s=1}^{t} \alpha_s$.
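This closed form can be checked numerically: running the step-by-step forward chain on a toy scalar input gives the same mean and variance that the one-shot formula predicts. The schedule below is illustrative, not the one from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50
betas = np.linspace(1e-4, 0.05, T)  # illustrative schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

x0 = 1.0       # toy scalar "image"
n = 200_000    # number of Monte Carlo chains

# Iterative forward process: apply q(x_t | x_{t-1}) step by step.
x = np.full(n, x0)
for t in range(T):
    x = np.sqrt(1.0 - betas[t]) * x + np.sqrt(betas[t]) * rng.normal(size=n)

# Closed form predicts x_T ~ N(sqrt(alpha_bar_T) * x0, 1 - alpha_bar_T).
print(x.mean(), np.sqrt(alpha_bars[-1]) * x0)  # empirical vs. predicted mean
print(x.var(), 1.0 - alpha_bars[-1])           # empirical vs. predicted variance
```

The two printed pairs should agree up to Monte Carlo error, confirming that one closed-form draw replaces $t$ sequential noising steps.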

##### Reverse Process:

While the forward diffusion is unparameterized, the reverse process is parameterized with $\theta$ (omitted in the image). So we define the following:

$$p_\theta(x_{0:T}) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t).$$

Given that $x_T$ is pure noise, $p(x_T) = \mathcal{N}(x_T;\, 0,\, I)$, so we do not have $\theta$ on this term.
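The reverse process corresponds to a simple ancestral sampling loop: start from pure noise and repeatedly apply the learned Gaussian step. The sketch below uses a hypothetical placeholder for the learned mean `mu_theta`; in a real DDPM it would be a trained neural network:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50
betas = np.linspace(1e-4, 0.05, T)  # illustrative schedule

def mu_theta(x_t, t):
    # Hypothetical placeholder for the learned mean; a real model
    # would predict this from (x_t, t) with a neural network.
    return np.sqrt(1.0 - betas[t]) * x_t

x = rng.normal(size=(4,))  # start from pure noise: x_T ~ N(0, I)
for t in reversed(range(T)):
    z = rng.normal(size=x.shape) if t > 0 else 0.0  # no noise at the final step
    sigma_t = np.sqrt(betas[t])                      # fixed variance
    x = mu_theta(x, t) + sigma_t * z
# x is now a sample from (this toy stand-in for) p_theta(x_0)
```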

The issue is how to define the objective function. We would like to maximize $\log p_\theta(x_0)$, but it is not possible to marginalize over all possible trajectories $x_{1:T}$. Instead, as with VAEs, we maximize an evidence lower bound (ELBO):

$$\log p_\theta(x_0) \ge \mathbb{E}_{q(x_1 \mid x_0)}\!\left[\log p_\theta(x_0 \mid x_1)\right] - D_{\mathrm{KL}}\!\left(q(x_T \mid x_0) \,\|\, p(x_T)\right) - \sum_{t=2}^{T} \mathbb{E}_{q(x_t \mid x_0)}\!\left[D_{\mathrm{KL}}\!\left(q(x_{t-1} \mid x_t, x_0) \,\|\, p_\theta(x_{t-1} \mid x_t)\right)\right].$$

Similar to VAEs, the first term here is a reconstruction loss and the last term is a consistency loss. The middle term is the loss at step $T$; since both distributions there are known and carry no parameters, we simply drop it. Since the reverse steps are also Gaussians, we have:

$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\!\left(x_{t-1};\; \mu_\theta(x_t, t),\; \sigma_t^2 I\right).$$
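Because the forward posterior $q(x_{t-1} \mid x_t, x_0)$ and the learned reverse step are both Gaussians with matching variance, each consistency KL term reduces to a squared difference of means (a standard Gaussian KL identity):

```latex
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu_q,\, \sigma_t^2 I) \,\|\, \mathcal{N}(\mu_\theta,\, \sigma_t^2 I)\right)
  = \frac{1}{2\sigma_t^2}\,\lVert \mu_q - \mu_\theta \rVert^2 .
```

This is what turns the objective into a simple regression on means, and ultimately on noise.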

Unlike in VAEs, we learn only the mean $\mu_\theta$ and fix the variance $\sigma_t^2$. After reparameterization ($x_t = \sqrt{\bar\alpha_t}\, x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$), the loss turns into predicting this noise ($w_t$ is a weight at step $t$):

$$L_t = w_t\, \mathbb{E}_{x_0,\, \epsilon}\!\left[\lVert \epsilon - \epsilon_\theta(x_t, t) \rVert^2\right].$$
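One training step under this objective is short to write down. The sketch below sets $w_t = 1$ (the simplified loss used in the DDPM paper) and uses a hypothetical placeholder `eps_model` where a trained noise-prediction network would go:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def eps_model(x_t, t):
    # Hypothetical placeholder: a real network predicts the noise from (x_t, t).
    return np.zeros_like(x_t)

# One training step: sample t, form x_t in closed form, regress on the noise.
x0 = rng.normal(size=(8,))            # toy data point
t = rng.integers(0, T)                # uniformly sampled step
eps = rng.normal(size=x0.shape)       # the true noise to be predicted
x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

# Simplified loss with w_t = 1; a real loop would backpropagate through eps_model.
loss = np.mean((eps - eps_model(x_t, t)) ** 2)
```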

There are some small tricks needed to obtain this simplification; we will include the details in a slide file, stay tuned!

#### References

[1] https://youtu.be/fbLgFrlTnGU

[2] Understanding Diffusion Models: A Unified Perspective (Calvin Luo)