Previously, we introduced Autoencoders and Hierarchical Variational Autoencoders (HVAEs). In this post, we will cover the details of Denoising Diffusion Probabilistic Models (DDPM).
We can treat DDPM as a restricted HVAE. Here, each $x_t$ only depends on $x_{t-1}$. In DDPM, the noising step has no learnable parameters: it is a predefined Gaussian model. This brings some computational convenience, since we can obtain any arbitrary $x_t$ quickly.
As shown in the following image, DDPM has two phases: 1) forward diffusion: we add noise to an input image $x_0$ step by step, and after $T$ steps the image becomes pure noise; 2) reverse process: we try to recover the original input image starting from the noised version $x_T$.
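As in an HVAE, each step uses a linear Gaussian to add noise to the output of the previous step. So the forward process from time step 0 to $T$ is:

$$q(x_{1:T} \mid x_0) = \prod_{t=1}^{T} q(x_t \mid x_{t-1})$$

And for each $t$ we have:

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{1-\beta_t}\,x_{t-1},\ \beta_t I\right)$$

Note that $\sqrt{1-\beta_t}\,x_{t-1}$ is the mean and $\beta_t I$ is the variance at step $t$, where $\beta_t$ ranges from 0 to 1. Let’s first look at this variance value $\beta_t$. Because at time step $T$ we want $x_T$ to be (almost exactly) distributed as $\mathcal{N}(0, I)$, one way is to start with a small value and increase it: $\beta_1 < \beta_2 < \dots < \beta_T$. The mean coefficient $\sqrt{1-\beta_t}$ then follows the reverse trend: it shrinks as $t$ grows.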
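To make the forward step concrete, here is a minimal NumPy sketch. It assumes a linearly increasing variance schedule; the endpoints `1e-4` and `0.02`, the number of steps, and the function names are illustrative choices, not something fixed by this post.

```python
import numpy as np

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Variances beta_1..beta_T, increasing linearly (assumed endpoints)."""
    return np.linspace(beta_start, beta_end, T)

def forward_step(x_prev, beta_t, rng):
    """Sample x_t ~ N(sqrt(1 - beta_t) * x_prev, beta_t * I)."""
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

rng = np.random.default_rng(0)
betas = linear_beta_schedule(T=1000)
x = rng.uniform(-1.0, 1.0, size=(8, 8))  # stand-in for an input image x_0
for beta_t in betas:                     # run the whole chain x_0 -> x_T
    x = forward_step(x, beta_t, rng)
```

After the loop, `x` should look like a draw from a standard Gaussian, since almost none of the original signal survives 1000 steps under this schedule.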
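As I mentioned before, such a definition makes it possible to sample $x_t$ at any arbitrary forward step $t$ directly from $x_0$, because the sum of independent Gaussians is still a Gaussian:

$$q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar{\alpha}_t}\,x_0,\ (1-\bar{\alpha}_t) I\right)$$

where $\alpha_t = 1 - \beta_t$ (with $\beta_t$ the noise variance at step $t$), and $\bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s$.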
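The closed form means we never have to loop over the chain to get a noised sample. Below is a small sketch of that one-shot sampling, again assuming a linear variance schedule; `q_sample` is a name I picked for illustration, and `alpha_bars[t - 1]` holds the cumulative product of $(1 - \beta_s)$ up to step $t$.

```python
import numpy as np

def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0) in one shot; t is 1-indexed."""
    eps = rng.standard_normal(x0.shape)
    a_bar = alpha_bars[t - 1]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    return x_t, eps

betas = np.linspace(1e-4, 0.02, 1000)  # assumed linear schedule
alpha_bars = np.cumprod(1.0 - betas)   # cumulative product of (1 - beta_s)

rng = np.random.default_rng(0)
x0 = np.ones((4, 4))                   # toy "image"
x_t, eps = q_sample(x0, t=500, alpha_bars=alpha_bars, rng=rng)
```

Returning `eps` alongside `x_t` is convenient later, because the training loss compares the network output against exactly this noise.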
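While the forward diffusion is unparameterized, the reverse process is parameterized with $\theta$ (omitted in the image). So we define the following:

$$p_\theta(x_{0:T}) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t)$$

Given that $x_T$ is pure noise, $p(x_T) = \mathcal{N}(x_T; 0, I)$, so we do not have $\theta$ here.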
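Generation simply walks this chain backwards. The sketch below shows the loop structure only: `mu_theta` is a hypothetical stand-in for a trained network, and I assume each reverse step is a Gaussian with a fixed standard deviation $\sqrt{\beta_t}$ and that no noise is added at the very last step, which is one common convention rather than the only option.

```python
import numpy as np

def mu_theta(x_t, t):
    """Hypothetical stand-in for a learned mean network; it just shrinks x_t."""
    return 0.99 * x_t

def sample(shape, betas, rng):
    """Ancestral sampling: x_T ~ N(0, I), then x_{t-1} ~ p_theta(x_{t-1} | x_t)."""
    T = len(betas)
    x = rng.standard_normal(shape)       # p(x_T) has no parameters
    for t in range(T, 0, -1):            # t = T, T-1, ..., 1
        sigma_t = np.sqrt(betas[t - 1])  # assumed fixed std, sigma_t^2 = beta_t
        # Assumed convention: the final step (t = 1) is noise-free.
        z = rng.standard_normal(shape) if t > 1 else np.zeros(shape)
        x = mu_theta(x, t) + sigma_t * z
    return x

rng = np.random.default_rng(0)
betas = np.linspace(1e-4, 0.02, 50)      # short toy schedule
x0_hat = sample((8, 8), betas, rng)
```

With a real model, `mu_theta` would be replaced by the network's prediction; everything else in the loop stays the same.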
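The issue is how to define the objective function. Ideally we would maximize $\log p_\theta(x_0)$ directly, but that is not possible: it would require integrating over all possible trajectories $x_{1:T}$. As with VAEs, we maximize the evidence lower bound (ELBO) instead:

$$\log p_\theta(x_0) \ge \mathbb{E}_{q(x_{1:T} \mid x_0)}\left[\log \frac{p_\theta(x_{0:T})}{q(x_{1:T} \mid x_0)}\right]$$

Expanding this bound gives, similar to VAEs, a reconstruction loss and a consistency loss over the intermediate steps, plus a loss at step $T$. We drop the loss at step $T$: it compares against $p(x_T)$, which is a known distribution with nothing to train, so we simply ignore it. Since the reverse steps are also Gaussians, we have:

$$p_\theta(x_{t-1} \mid x_t) = \mathcal{N}\left(x_{t-1};\ \mu_\theta(x_t, t),\ \sigma_t^2 I\right)$$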
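The difference is that we only learn the mean $\mu_\theta$ and fix the variance $\sigma_t^2$. After reparameterization, the loss becomes one of predicting the noise $\epsilon$ that was added ($\lambda_t$ is a weight at step $t$):

$$L(\theta) = \mathbb{E}_{t,\,x_0,\,\epsilon}\left[\lambda_t \left\| \epsilon - \epsilon_\theta(x_t, t) \right\|^2\right], \quad x_t = \sqrt{\bar{\alpha}_t}\,x_0 + \sqrt{1-\bar{\alpha}_t}\,\epsilon, \quad \epsilon \sim \mathcal{N}(0, I)$$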
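To see what one training step computes, here is a rough sketch of a single Monte Carlo estimate of this loss. It assumes the weight is set to 1 for every step and uses a hypothetical placeholder `eps_model` that always predicts zero noise; a real implementation would use a neural network and a deep learning framework for the gradients.

```python
import numpy as np

def eps_model(x_t, t):
    """Hypothetical placeholder for the noise-prediction network."""
    return np.zeros_like(x_t)

def simple_loss(x0, alpha_bars, rng):
    """One Monte Carlo estimate of the noise-prediction loss (weight = 1)."""
    T = len(alpha_bars)
    t = int(rng.integers(1, T + 1))      # uniform t in {1, ..., T}
    eps = rng.standard_normal(x0.shape)  # the noise the model must recover
    a_bar = alpha_bars[t - 1]
    x_t = np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * eps
    # Mean over pixels, i.e. the squared norm up to a constant factor.
    return float(np.mean((eps - eps_model(x_t, t)) ** 2))

rng = np.random.default_rng(0)
alpha_bars = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
loss = simple_loss(np.zeros((16, 16)), alpha_bars, rng)
```

Because the placeholder predicts zero, the loss here is just the average squared noise, which sits close to 1; a trained model should push it well below that.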
There are some small tricks behind this simplification; we will include the details in a slide file, so stay tuned!
Reference: Understanding Diffusion Models: A Unified Perspective (Calvin Luo)